Using Graph Visualization to enhance representation and evaluation of work clusters

Abstract

The German National Library (DNB) uses the Culturegraph platform (www.culturegraph.org) to aggregate metadata on the holdings of German and Austrian library networks. On this data pool of more than 171 million items we perform work clustering. An aggregated view of all publications that represent a work, e.g. its different editions and translations, makes the collection appear much more structured and makes searching easier. In this paper we show how graph visualization can be used to display, analyse and evaluate work clusters. Graph visualization gives users a more transparent view of the connections underlying the structure of a work cluster. Work clustering is achieved by creating and matching keys that combine different metadata elements of a bibliographic record. Publications with identical matchkeys are grouped together by means of a breadth-first search. We use different keys to represent a publication, so each publication can have more than one matchkey. Each shared matchkey thus establishes a connection between publications. If one of these publications shares a different matchkey with further publications, the network representing the work grows. Moving beyond visualization, we can also derive statistical indicators from the graph to evaluate a work cluster. Degree, average path length and centrality measures offer information about the internal structure of a work cluster. Particularly for large clusters, information about the degree of connection between the members and about the existence of more closely related subclusters is important for evaluation. Graph visualization thus not only helps in grasping the internal structure of a work cluster more clearly, it also helps in managing and evaluating large datasets, ultimately leading to better clustering results that support data representation and findability.
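The clustering idea described in the abstract can be illustrated with a minimal sketch. It is not the DNB's implementation: the record identifiers, the matchkey strings and all function names are hypothetical, and only the Python standard library is used. Publications become nodes, a shared matchkey becomes an edge, a breadth-first search collects the connected publications into a work cluster, and degree and average path length are then read off the resulting graph.

```python
from collections import defaultdict, deque

# Hypothetical records: publication id -> set of matchkeys (illustrative only).
records = {
    "pub1": {"goethe|faust", "goethe|faust|1808"},
    "pub2": {"goethe|faust", "goethe|faust|tragedy"},
    "pub3": {"goethe|faust|tragedy"},
    "pub4": {"kafka|prozess"},
    "pub5": {"kafka|prozess"},
}

def build_graph(records):
    """Link every pair of publications that share at least one matchkey."""
    by_key = defaultdict(set)
    for pub, keys in records.items():
        for key in keys:
            by_key[key].add(pub)
    graph = {pub: set() for pub in records}
    for pubs in by_key.values():
        for pub in pubs:
            graph[pub] |= pubs - {pub}
    return graph

def clusters(graph):
    """Group publications into connected components via breadth-first search."""
    seen, result = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbour in sorted(graph[node]):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        result.append(sorted(component))
    return result

def average_path_length(graph, members):
    """Mean shortest-path length over all ordered pairs within one cluster."""
    total, pairs = 0, 0
    for source in members:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

graph = build_graph(records)
work_clusters = clusters(graph)
degrees = {pub: len(neighbours) for pub, neighbours in graph.items()}
```

In this toy data, pub1 and pub3 share no matchkey, yet they end up in the same cluster because each shares a key with pub2. This is the growth effect the abstract describes, and it is also why indicators such as degree or average path length can help flag large clusters whose members are only loosely connected.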
