High Fidelity Web Archiving of News Sites and New Media with Browsertrix
dc.audience | Audience::News Media Section | en_US |
dc.contributor.author | Walsh, Tessa | |
dc.contributor.author | Wilkinson, Henry | |
dc.contributor.author | Kreymer, Ilya | |
dc.coverage.spatial | Location::Canada | en_US |
dc.coverage.spatial | Location::United States of America | en_US |
dc.date.accessioned | 2024-06-25T09:21:09Z | |
dc.date.available | 2024-06-20 | |
dc.date.available | 2024-06-25T09:21:09Z | |
dc.date.issued | 2024-05-30 | |
dc.description.abstract | This paper discusses how Webrecorder’s free and open source browser-based web archiving tools, such as Browsertrix, can be and have been used by libraries and archives to create and provide access to high fidelity web archives of online news sites, social media, digital publications, digital humanities projects, and other forms of online news media that have historically been difficult to preserve. Emphasis is placed on recently developed assistive quality assurance (QA) tools in Browsertrix, which let users assess the quality of captured content using automatically calculated metrics, such as screenshot and text comparisons between the site as visited by a browser during crawling and its replay from the captured archive. This exciting new development builds on existing features that differentiate Webrecorder’s browser-based crawling from alternative web archiving methods: browser profiles for archiving material behind log-ins and on personalized social media feeds, ad and cookie blocking, and a suite of extendable behaviors that drive the browser during capture, allowing for autoscroll as well as automated navigation of certain social media sites. The paper discusses how these features enable librarians to easily and effectively preserve and provide access to news media, referencing several recent collaborations between Webrecorder, libraries, journalists, and others invested in high fidelity archiving of important and often complex online content. | en_US |
dc.identifier.uri | https://repository.ifla.org/handle/20.500.14598/3399 | |
dc.language.iso | en | en_US |
dc.publisher | International Federation of Library Associations and Institutions (IFLA) | en_US |
dc.relation.ispartofseries | IFLA International News Media Conference 2024;Aarhus, Denmark, 29 - 31 May 2024 | |
dc.rights.holder | International Federation of Library Associations and Institutions (IFLA) | en_US |
dc.rights.license | CC BY 4.0 | en_US |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | en_US |
dc.subject | Subject::Digital preservation | en_US |
dc.subject | Subject::Online news | en_US |
dc.subject | Subject::News media | en_US |
dc.title | High Fidelity Web Archiving of News Sites and New Media with Browsertrix | en_US |
dc.type | Events Materials | en_US |
ifla.Unit | Units::Section::News Media Section | en_US |
ifla.Unit | Units::Special Interest Group::Digital Humanities – Digital Scholarship Special Interest Group | en_US |
ifla.Unit | Units::Section::Information Literacy Section | en_US |
ifla.oPubId | 0 | en_US |
Files
Original bundle
- Name: Walsh_IFLA2024.pdf
- Size: 198.25 KB
- Format: Adobe Portable Document Format
- Description: High Fidelity Web Archiving of News Sites and New Media with Browsertrix