Archiving and Accessing HTML-Based Newspapers Using XML and CDATA Strings
| dc.audience | Audience::Newspapers Section | |
| dc.conference.date | 10-12 August 2016 | |
| dc.conference.place | Lexington, KY, USA | |
| dc.conference.sessionType | Satellite Meeting: News Media | |
| dc.conference.title | News, new roles & preservation advocacy: moving libraries into action | |
| dc.conference.venue | The Hilton | |
| dc.contributor.author | Weig, Eric | |
| dc.date.accessioned | 2025-09-24T08:48:26Z | |
| dc.date.available | 2025-09-24T08:48:26Z | |
| dc.date.issued | 2017 | |
| dc.description.abstract | This article outlines one in-house model for archiving and providing access to HTML-based news in the Kentucky Digital Newspaper Program (KDNP) at the University of Kentucky (UK). To allow for search and retrieval of HTML-based news in the KDNP, which already contains news content digitized from analog sources, the article describes encapsulating HTML content in XML-encoded CDATA strings that are read by a prototype open-source PHP viewer. | en |
| dc.identifier.citation | 1. The World Wide Web Consortium (W3C). What Is Hypertext? [Internet]. The World Wide Web Consortium (W3C); [cited 2016 Feb 10]. Available from: http://www.w3.org/WhatIs.html 2. Herborth C. 2010 Jan 12. Dealing with Data in XML [Internet]. IBM DeveloperWorks; [cited 2016 Feb 10]. Available from: http://www.ibm.com/developerworks/library/x-cdata/ 3. Nielsen J. 1995 Feb 01. History of Hypertext [Internet]. Nielsen Norman Group; [cited 2016 Feb 10]. Available from: https://www.nngroup.com/articles/hypertext-history/ 4. Newspaper Digitization Interest Group (NDIG). 2014. Metadata Application Profile - Digital Newspapers [Internet]. Newspaper Digitization Interest Group (NDIG); [cited 2016 Feb 10]. Available from: https://sites.google.com/site/digitalnewspaperspractices/technical-specifications/metadata-specfication 5. Geiger B. 2016 Jan 20. Fate of Your Archives Is ... Uncertain [Internet]. California Newspaper Publishers Association; [cited 2016 Feb 10]. Available from: http://www.cnpa.com/california_publisher/features/fate-of-your-archives-is-uncertain/article_8f3e2cda-bfd4-11e5-86f8-9797680b24ed.html 6. Pierce V. 2012 Feb 09. Finding That Needle in the Haystack: The Power of Full Text Searching in Chronicling America [Internet]. South Carolina Digital Newspaper Program; [cited 2016 Feb 10]. Available from: http://library.sc.edu/blogs/newspaper/2012/02/09/finding-that-needle-in-the-haystack-the-power-of-full-text-searching-in-chronicling-america/ 7. Lepore J. 2015. What the Web Said Yesterday. The New Yorker [Internet]. [cited 2016 Feb 10]. Available from: http://www.newyorker.com/magazine/2015/01/26/cobweb 8. Granger S. 2000. Emulation as a Digital Preservation Strategy. D-Lib Magazine [Internet]. [cited 2016 Feb 10]. Available from: http://www.dlib.org/dlib/october00/granger/10granger.html 9. Johnston L. 2014 Feb 11. Considering Emulation for Digital Preservation [Internet]. The Signal Digital Preservation: Library of Congress; [cited 2016 Feb 10]. Available from: https://blogs.loc.gov/digitalpreservation/2014/02/considering-emulation-for-digital-preservation/ 10. Sawers P. 2015 Oct 22. The Internet Archive Is Rebuilding the Wayback Machine to Make Web History Easier to Search [Internet]. VentureBeat; [cited 2016 Feb 10]. Available from: http://venturebeat.com/2015/10/22/the-internet-archive-is-rebuilding-the-wayback-machine-to-make-the-webs-history-easier-to-search/ 11. University of Kentucky Libraries. 2016. Newz Viewer [Internet]. GitHub Code Repository; [cited 2016 Feb 10]. Available from: https://github.com/uklibraries/newz-viewer 12. Project Blacklight. 2016. Blacklight Discovery Platform Framework [Internet]. [cited 2016 Mar 31]. Available from: http://projectblacklight.org/ | |
| dc.identifier.relatedurl | https://2016.ifla.org/programme/satellite-meetings | |
| dc.identifier.uri | https://repository.ifla.org/handle/20.500.14598/6246 | |
| dc.language.iso | eng | |
| dc.rights | Attribution 4.0 International | |
| dc.rights.accessRights | open access | |
| dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | |
| dc.subject.keyword | News | |
| dc.subject.keyword | newspapers | |
| dc.subject.keyword | born digital | |
| dc.subject.keyword | libraries | |
| dc.subject.keyword | collections | |
| dc.subject.keyword | HTML | |
| dc.subject.keyword | web harvesting | |
| dc.subject.keyword | digital preservation | |
| dc.title | Archiving and Accessing HTML-Based Newspapers Using XML and CDATA Strings | en |
| dc.type | Article | |
| ifla.Unit | Section:Newspapers Section | |
| ifla.oPubId | https://library.ifla.org/id/eprint/2096/ |