Service-Oriented Architecture for automatic markup of documents. An use case for legal documents

Cifuentes-Silva, Francisco Adolfo

dc.audience	Audience::Law Libraries Section
dc.audience	Audience::Library and Research Services for Parliaments Section
dc.audience	Audience::Information Technology Section
dc.audience	Audience::Advisory Committee on Freedom of Access to Information and Freedom of Expression
dc.conference.date	16-22 August 2014
dc.conference.place	Lyon, France
dc.conference.sessionType	Law Libraries with Parliamentary Libraries, Information Technology and Committee on Freedom of Access to Information and Freedom of Expression (FAIFE)
dc.conference.title	IFLA WLIC 2014
dc.conference.venue	Lyon Convention Centre
dc.congressWLIC	IFLA WLIC 2014 - Lyon, France
dc.contributor.author	Cifuentes-Silva, Francisco Adolfo
dc.date.accessioned	2025-09-24T08:22:18Z
dc.date.available	2025-09-24T08:22:18Z
dc.date.issued	2014
dc.description.abstract	The problem of information extraction and automatic markup of plain text into XML has been partially solved for the specific domain of legal documents. Techniques such as named entity recognition and hierarchy detection of text sections, among others, make it possible to partially identify and retrieve different kinds of information inside unstructured documents. In this paper we introduce a set of interconnected components, the NLP techniques used in each component, and the workflow needed to process a plain-text document and generate a new, fully marked-up XML version of it. The generated XML complies with the Akoma-Ntoso legal standard schema and is highly enriched with named entities, semantic URIs, structural sections, lists and element sequences, among others. As a use case we analyze the experience of the Library of Congress of Chile in the context of the 'History of Law project' and Parliamentary Labor, where this architecture played a key role in delivering the final product and in processing and marking up the different types or models of documents used in the legislative process.	en
dc.identifier.citation	[1] Cifuentes-Silva F., Sifaqui C. and Labra-Gayo J. Towards an architecture and adoption process for linked data technologies in open government contexts: a case study for the Library of Congress of Chile. I-Semantics 2011. [2] Hyland B., Atemezing G. and Villazón-Terrazas B. Best Practices for Publishing Linked Data. January 2014. [3] Palmirani M. XML Legislativo: Principios e instrumentos técnicos. October 2012.
dc.identifier.relatedurl	http://conference.ifla.org/ifla80/
dc.identifier.uri	https://repository.ifla.org/handle/20.500.14598/5419
dc.language.iso	spa
dc.rights	Attribution 3.0 Unported
dc.rights.accessRights	open access
dc.rights.uri	https://creativecommons.org/licenses/by/3.0/
dc.subject.keyword	Linked Open Data
dc.subject.keyword	Semantic Web
dc.subject.keyword	Akoma-Ntoso
dc.subject.keyword	Machine Learning
dc.subject.keyword	e-parliament
dc.title	Service-Oriented Architecture for automatic markup of documents. An use case for legal documents	en
dc.type	Article
ifla.Unit	Section::Law Libraries Section
ifla.Unit	Section::Library and Research Services for Parliaments Section
ifla.Unit	Section::Information Technology Section
ifla.Unit	Section::Advisory Committee on Freedom of Access to Information and Freedom of Expression
ifla.oPubId	https://library.ifla.org/id/eprint/1048/

Files

Original bundle

Name: 121-cifuentes-es.pdf
Size: 210.77 KB
Format: Adobe Portable Document Format

Collections

World Library and Information Congress (WLIC) Papers and Presentations