00:00 Introduction to the program (Jon Ippolito)
01:04 Introducing Anna Perricci
Varieties of web archiving
05:01 Introduction to web archiving
06:55 Automated versus human-scale archiving
"Automation still requires quality assurance"
09:35 Community collecting with Occupy Wall Street, Internet Archive, and Documenting the Now
Perricci's work archiving Occupy Wall Street
10:50 Introduction to Webrecorder
14:19 What is high-fidelity web archiving?
20:27 Built-in emulation of vintage browsers
You can choose a legacy browser for a site that has obsolete content (e.g., Flash and Java)
Captures content for you, but depends on structure of social media sites.
21:26 How to get technical help
23:17 Archiving representative samples (Matthew Revitt)
25:40 More on Autopilot (Meagan Doyle)
27:36 Browsertrix (Alex Kaelin)
28:25 How to patch missing content (Colin Windhorst)
29:41 Demo of Webrecorder acting on a site
37:04 "Capture URL again" v. "Patch this URL" (Sarah Danser)
37:50 Using emulated browsers for both capture and playback (John Bell)
39:11 Time frame for archiving a social media site like Facebook (Renee DesRoberts)
40:22 Capturing beyond images, e.g., iframes and hidden web structures
42:07 Capturing data-driven websites (Cynde Moya)
43:08 Capturing outgoing requests, like a query in a search box (John)
Carnegie Hall case study
44:02 Ease of patching compared to other tools (Sean Crawford)
44:18 Editing options (Kim Sawtelle)
"Trying to pull a thread out of a tapestry"
46:02 Capturing live content in real-time, like streaming radio (Alex)
47:43 Saving local backups (Colin)
48:44 Capturing dead links, e.g., in Scalar books (Colin)
50:24 Case study of media-rich journalism (Rhonda Carpenter)
The Snow Fall project
51:45 Archive-It and capturing database content (Matthew)
54:17 Top-down harvesters (such as OAIS) versus bottom-up, human-scale solutions.
This teleconference is a project of the University of Maine's Digital Curation program. For more information, contact jippolito@maine.edu.
Timecodes are in minutes:seconds.
In this interactive discussion hosted by the University of Maine's Digital Curation program, Anna Perricci presents new tools and techniques for saving Internet culture for posterity.
As more of our work and entertainment moves online, the challenge of preserving websites and social media becomes increasingly urgent. Studies peg the average time before a website or mobile app is rewritten or lost at 50-100 days. And unlike the static HTML pages of the 1990s, modern websites are assembled dynamically by JavaScript that calls out to external servers as the page loads in the browser; without those external calls, saving a Twitter or Facebook page can leave you with a handful of floating text snippets flanked by gray rectangles.
Perricci has been working at the leading edge of web archiving for the better part of a decade. After positions at the New York Public Library and ArtStor, she served as a web archiving librarian at Columbia University and a digital archivist for Occupy Wall Street. She has taught web archiving for the Society of American Archivists and served as an associate director for Webrecorder, where she helped secure a million-dollar grant to produce and promote this innovative, browser-based tool for creating high-fidelity, interactive captures of websites.
Watch the entire video or choose an excerpt from the menu on this page.