Improving Performance in AI-based Automatic Classification through Feature Augmentation: A Case Study of KDC

dc.audience: Audience::Bibliography Section
dc.audience: Audience::Information Technology Section
dc.audience: Audience::Artificial Intelligence Special Interest Group
dc.contributor.author: Chul, Jung
dc.contributor.author: Soo-Sang, Lee
dc.contributor.author: Jee-Hyun, Rho
dc.date.accessioned: 2025-10-20T08:44:24Z
dc.date.available: 2025-10-20T08:44:24Z
dc.date.issued: 2025-10
dc.description.abstract: This study empirically examines how augmenting classification features changes the performance of an AI-based automatic classification model for the Korean Decimal Classification (KDC), with the aim of identifying strategies that improve the consistency and accuracy of automated classification number assignment in subject cataloguing. Experiments were conducted on 5,882 bibliographic records in which library-domain metadata were supplemented with publishing metadata by integrating independent attributes from both sources. Core features (title, author) and KDC numbers extracted from the National Library of Korea's database were enriched with external features (keywords, book summary, table of contents) collected from the Korea Publication Industry Promotion Agency's BNK database. The features were organized into three sets: Feature Set A (title, author), Feature Set B (title, author, keywords), and Feature Set C (title, author, keywords, book summary, table of contents). A multi-class classification model based on KLUE-BERT was developed for each set, and the resulting performance variations were systematically analyzed. The findings show that feature enrichment produced progressive improvements across all KDC main classes. The Arts (6XX) class exhibited the most substantial improvement, with a 124.24% increase in F1-score from Feature Set A to Feature Set C. Significant gains were also observed in several other classes, including Science and Technology (57.14%), Social Sciences (40.00%), History (34.04%), and Literature (25.37%). Further analysis across the 61 divisions revealed that 28 divisions improved continuously, 20 showed limited improvement, 7 exhibited performance degradation, and 6 showed no significant change.
These findings underscore the critical importance of feature augmentation in enhancing the performance of the KDC automatic classification model, while indicating that its effectiveness may vary depending on the interaction between classification divisions and feature attributes. To improve classification performance further, feature enrichment should be combined with more advanced strategies, including hierarchical classification structures, data refinement techniques, and sophisticated data augmentation methods. (Presented on 15 August 2025 at the session "Pushing Boundaries to Next Generation Cataloguing: Experiments at the Edge of AI and Metadata".)
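The feature sets and the percentage gains described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the field names (`summary`, `toc`), the ` [SEP] ` join, the sample record, and the F1 values are all hypothetical, and the gain is assumed to be the relative change in F1 against the Feature Set A baseline.

```python
# Hypothetical field names; the abstract names the features but not the schema.
FEATURE_SETS = {
    "A": ["title", "author"],
    "B": ["title", "author", "keywords"],
    "C": ["title", "author", "keywords", "summary", "toc"],
}


def build_input(record, feature_set, sep=" [SEP] "):
    """Join the non-empty fields of one feature set into a single input string."""
    parts = [record.get(field, "").strip() for field in FEATURE_SETS[feature_set]]
    return sep.join(part for part in parts if part)


def relative_gain(f1_base, f1_new):
    """Percentage change in F1 relative to the baseline, rounded to 2 places."""
    return round((f1_new - f1_base) / f1_base * 100, 2)


# Placeholder record, not taken from the study's data.
record = {
    "title": "Example title",
    "author": "Example author",
    "keywords": "keyword1, keyword2",
}

text_a = build_input(record, "A")
text_c = build_input(record, "C")  # missing summary/toc fields are skipped

# Illustrative F1 values only; they are not figures reported in the paper.
gain = relative_gain(0.40, 0.60)
```

Each resulting string would then be tokenized and passed to a sequence classifier; how the authors actually combined the fields is not stated in this record.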
dc.identifier.uri: https://www.ifla.org/events/artificial-intelligence-bibliographic-control-and-legal-matters-navigating-new-horizons/
dc.identifier.uri: https://2025.ifla.org/bibliography-section-with-the-information-technology-section-and-the-ifla-artificial-intelligence-special-interest-group/
dc.identifier.uri: https://wlic2025.astanait.edu.kz/
dc.identifier.uri: https://repository.ifla.org/handle/20.500.14598/6863
dc.language.iso: eng
dc.publisher: International Federation of Library Associations and Institutions (IFLA)
dc.relation.ispartofseries: 89th IFLA World Library and Information Congress (WLIC), 2025 Astana
dc.relation.ispartofseries: WLIC 2025, Astana, Satellite Meeting: Artificial Intelligence, Bibliographic Control and Legal Matters: Navigating New Horizons
dc.rights.holder: International Federation of Library Associations and Institutions (IFLA)
dc.rights.license: CC BY 4.0
dc.rights.uri: https://creativecommons.org/licenses/by/4.0
dc.subject: Artificial intelligence
dc.subject: Automation
dc.subject: Classification and indexing
dc.subject: Metadata
dc.title: Improving Performance in AI-based Automatic Classification through Feature Augmentation: A Case Study of KDC
dc.type: Events Material
ifla.Unit: Section::Bibliography Section
ifla.Unit: Section::Information Technology Section
ifla.Unit: Special Interest Group::Artificial Intelligence Special Interest Group
ifla.oPubId: 0

Files

Original bundle

Name: Performance Changes in Automatic KCD Classifications_Chul_WLIC2025.pdf
Size: 2.39 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.28 KB
Format: Item-specific license agreed to upon submission