Semantic enrichment on Large Corpora: a case study for Patrologia Graeca
Abstract
This paper presents a case study of the Patrologia Graeca (PG) that reveals the difficulties arising when semantic enrichment through interconnections is pursued, with the specific aim of discussing an alternative cooperative architecture between libraries. The presented method interconnects the PG with a range of satellite texts, scanned or editable, to provide a better navigation experience for users, easy citation of specific PG pages, and interconnections between documents. Although semantic enrichment can be achieved with semi-manual methods, this is not recommended because such tasks are time-consuming and costly. A new architecture for better cooperation between libraries and cultural institutions is therefore proposed, providing a framework for applying Artificial Intelligence (AI) and Pattern Recognition (PR) techniques to the mass enrichment of large corpora. Under the proposed framework, the participating libraries cooperate to provide a high-quality, commonly accepted version of a specific corpus, which independent research teams then use to apply AI and PR methods and automatically create the structure and the interconnections between documents.