Improving Performance in AI-based Automatic Classification through Feature Augmentation: A Case Study of KDC
dc.audience | Audience::Bibliography Section | |
dc.audience | Audience::Information Technology Section | |
dc.audience | Audience::Artificial Intelligence Special Interest Group | |
dc.contributor.author | Chul, Jung | |
dc.contributor.author | Soo-Sang, Lee | |
dc.contributor.author | Jee-Hyun, Rho | |
dc.date.accessioned | 2025-10-20T08:44:24Z | |
dc.date.available | 2025-10-20T08:44:24Z | |
dc.date.issued | 2025-10 | |
dc.description.abstract | This study empirically examines how the performance of an AI-based Korean Decimal Classification (KDC) automatic classification model changes as classification features are augmented, with the aim of identifying strategies that improve the consistency and accuracy of automated classification number assignment in subject cataloguing. Experiments were conducted on 5,882 bibliographic records in which library-domain metadata were supplemented with publishing metadata by integrating independent attributes from both sources. Core features (title, author) and KDC numbers extracted from the National Library of Korea's database were enriched with external features (keywords, book summary, table of contents) collected from the Korea Publication Industry Promotion Agency's BNK database. The features were organized into three sets: Feature Set A (title, author), Feature Set B (title, author, keywords), and Feature Set C (title, author, keywords, book summary, table of contents). A multi-class classification model based on KLUE-BERT was developed for each set, and the performance variations were systematically analyzed. The findings show that feature enrichment produced progressive improvements across all KDC main classes. The Arts (6XX) class exhibited the most substantial improvement, with a 124.24% increase in F1-score from Feature Set A to Feature Set C. Significant gains were also observed in several other classes, including Science and Technology (57.14%), Social Sciences (40.00%), History (34.04%), and Literature (25.37%). Further analysis across the 61 divisions revealed that 28 divisions demonstrated continuous improvement, 20 showed limited improvement, 7 exhibited performance degradation, and 6 showed no significant change. 
These findings underscore the importance of feature augmentation in enhancing the performance of the KDC automatic classification model, while indicating that its effectiveness may vary depending on the interaction between classification divisions and feature attributes. Improving classification performance further requires not only feature enrichment but also more advanced strategies, including hierarchical classification structures, data refinement techniques, and sophisticated data augmentation methods. (Presented on 15 August 2025 at the "Pushing Boundaries to Next Generation Cataloguing: Experiments at the Edge of AI and Metadata" session.) | |
dc.identifier.uri | https://www.ifla.org/events/artificial-intelligence-bibliographic-control-and-legal-matters-navigating-new-horizons/ | |
dc.identifier.uri | https://2025.ifla.org/bibliography-section-with-the-information-technology-section-and-the-ifla-artificial-intelligence-special-interest-group/ | |
dc.identifier.uri | https://wlic2025.astanait.edu.kz/ | |
dc.identifier.uri | https://repository.ifla.org/handle/20.500.14598/6863 | |
dc.language.iso | eng | |
dc.publisher | International Federation of Library Associations and Institutions (IFLA) | |
dc.relation.ispartofseries | 89th IFLA World Library and Information Congress (WLIC), 2025 Astana | |
dc.relation.ispartofseries | WLIC 2025, Astana, Satellite Meeting: Artificial Intelligence, Bibliographic Control and Legal Matters: Navigating New Horizons | |
dc.rights.holder | International Federation of Library Associations and Institutions (IFLA) | |
dc.rights.license | CC BY 4.0 | |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0 | |
dc.subject | Artificial intelligence | |
dc.subject | Automation | |
dc.subject | Classification and indexing | |
dc.subject | Metadata | |
dc.title | Improving Performance in AI-based Automatic Classification through Feature Augmentation: A Case Study of KDC | |
dc.type | Events Material | |
ifla.Unit | Section::Bibliography Section | |
ifla.Unit | Section::Information Technology Section | |
ifla.Unit | Special Interest Group::Artificial Intelligence Special Interest Group | |
ifla.oPubId | 0 |
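The three feature sets and the percentage gains reported in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the field names (`summary`, `toc`), the `[SEP]` separator, the helper names, and the sample record and scores in the usage note are all hypothetical, and the abstract does not state how the fields were combined before being passed to KLUE-BERT.

```python
from typing import Dict, List

# Fields per feature set, following the composition given in the abstract.
# The dictionary keys "summary" and "toc" are assumed names for the
# book summary and table of contents fields.
FEATURE_SETS: Dict[str, List[str]] = {
    "A": ["title", "author"],
    "B": ["title", "author", "keywords"],
    "C": ["title", "author", "keywords", "summary", "toc"],
}


def build_input(record: Dict[str, str], feature_set: str, sep: str = " [SEP] ") -> str:
    """Join the non-empty fields of one record for the chosen feature set.

    Joining with a separator token is an assumption made for illustration.
    """
    fields = FEATURE_SETS[feature_set]
    return sep.join(record[f].strip() for f in fields if record.get(f, "").strip())


def relative_change_pct(baseline: float, enriched: float) -> float:
    """Percentage change of `enriched` relative to `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (enriched - baseline) / baseline * 100.0
```

With a made-up record such as `{"title": "T", "author": "Au", "keywords": "k1 k2"}`, `build_input(record, "B")` yields one string containing all three fields, while `build_input(record, "C")` yields the same string because the missing summary and table of contents are skipped. Likewise, an illustrative F1-score of 0.50 under Feature Set A and 0.75 under Feature Set C corresponds to `relative_change_pct(0.50, 0.75)`, a 50% increase; these scores are placeholders, not values from the study.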
Files
Original bundle
- Name:
- Performance Changes in Automatic KCD Classifications_Chul_WLIC2025.pdf
- Size:
- 2.39 MB
- Format:
- Adobe Portable Document Format
License bundle
- Name:
- license.txt
- Size:
- 2.28 KB
- Format:
- Item-specific license agreed upon to submission