High Fidelity Web Archiving of News Sites and New Media with Browsertrix

dc.audience: Audience::News Media Section [en_US]
dc.contributor.author: Walsh, Tessa
dc.contributor.author: Wilkinson, Henry
dc.contributor.author: Kreymer, Ilya
dc.coverage.spatial: Location::Canada [en_US]
dc.coverage.spatial: Location::United States of America [en_US]
dc.date.accessioned: 2024-06-25T09:21:09Z
dc.date.available: 2024-06-20
dc.date.available: 2024-06-25T09:21:09Z
dc.date.issued: 2024-05-30
dc.description.abstract: This paper discusses how Webrecorder’s free and open source browser-based web archiving tools, such as Browsertrix, can be and have been used by libraries and archives to create and provide access to high fidelity web archives of online news sites, social media, digital publications, digital humanities projects, and other historically difficult-to-preserve forms of online news media. Emphasis is placed on recently developed assistive quality assurance (QA) tools implemented in Browsertrix that allow users to assess the quality of captured content with the assistance of automatically calculated metrics, such as screenshot and text comparison between the site as visited by a browser during crawling and its replay from the captured archive. This new development builds on existing features that differentiate Webrecorder’s browser-based crawling from alternative web archiving methods, such as the use of browser profiles to archive material behind log-ins and on personalized social media feeds, ad and cookie blocking features, and a suite of extendable behaviors that drive the browser during capture, allowing for autoscroll as well as automated navigation of certain social media sites. The paper discusses how these features enable librarians to easily and effectively preserve and provide access to news media, referencing several recent collaborations between Webrecorder, libraries, journalists, and others invested in high fidelity archiving of important and often complex online content. [en_US]
dc.identifier.uri: https://repository.ifla.org/handle/20.500.14598/3399
dc.language.iso: en [en_US]
dc.publisher: International Federation of Library Associations and Institutions (IFLA) [en_US]
dc.relation.ispartofseries: IFLA International News Media Conference 2024; Aarhus, Denmark, 29 - 31 May 2024
dc.rights.holder: International Federation of Library Associations and Institutions (IFLA) [en_US]
dc.rights.license: CC BY 4.0 [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_US]
dc.subject: Subject::Digital preservation [en_US]
dc.subject: Subject::Online news [en_US]
dc.subject: Subject::News media [en_US]
dc.title: High Fidelity Web Archiving of News Sites and New Media with Browsertrix [en_US]
dc.type: Events Materials [en_US]
ifla.Unit: Units::Section::News Media Section [en_US]
ifla.Unit: Units::Special Interest Group::Digital Humanities – Digital Scholarship Special Interest Group [en_US]
ifla.Unit: Units::Section::Information Literacy Section [en_US]
ifla.oPubId: 0 [en_US]

Files

Original bundle
Name: Walsh_IFLA2024.pdf
Size: 198.25 KB
Format: Adobe Portable Document Format
Description: High Fidelity Web Archiving of News Sites and New Media with Browsertrix
