Building an Ontology of Historical Newspapers Based on METHONTOLOGY and Natural Language Processing Techniques
Dr Fatihah Ramli

Data science niche: Foundations of Data Science

INTRODUCTION
In recent years, there has been growing interest in ontologies because they can describe data semantics in a standard way, independent of the characteristics of the data source, and provide a schema for exchanging data between heterogeneous information systems and users. In some domains, however, and particularly in history, ontology development remains limited, because the large amount of information makes capturing its semantics very difficult. Several works have sought to improve the technological aspects of ontologies, such as the discovery and representation of concepts for historical newspapers.

This project aims to develop an ontology of historical newspapers by combining METHONTOLOGY with a pipeline of natural language processing (NLP) techniques built using GATE, a language engineering platform. METHONTOLOGY was proposed by Fernández-López, Gómez-Pérez, and Juristo (1997) and is a complete ontology development process, as defined by the IEEE 1074-1995 standard. It is also a suitable methodology for building an ontology from scratch.

The expected outcome of this project is a historical newspapers ontology, which can later be integrated into a digital archive to allow historical researchers to extract information and derive new knowledge from historical documents.

OBJECTIVES
The aim of this study is to build an ontology of historical newspapers based on the combination of METHONTOLOGY and a pipeline of natural language processing techniques.

To support this aim, the following objectives have been outlined:

1. To generate and cluster historical newspaper terms using a pipeline of natural language processing techniques.
2. To transform the historical newspaper terms into a new historical newspaper ontology by integrating existing ontologies.
3. To validate the quality of the ontology of historical newspapers.
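
As a rough illustration of the first objective, the sketch below extracts candidate terms from text and groups them by their head word. It is not the project's GATE pipeline: it uses only the Python standard library, the sample sentence and stopword list are invented for the example, and because there is no part-of-speech tagging, verbs can appear among the candidates.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (an assumption, not the project's resource).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "was", "were", "by", "for", "at"}

# Invented sample text; real input would be historical newspaper articles.
SAMPLE = (
    "The colonial governor opened the new railway station. "
    "The governor praised the railway company and the shipping company."
)


def tokenize(text):
    """Lowercase the text and return word tokens plus sentence punctuation."""
    return re.findall(r"[a-z]+|[.,;:!?]", text.lower())


def candidate_terms(text, max_len=2):
    """Count n-grams (1..max_len words) that contain no stopword or punctuation."""
    counts = Counter()
    run = []  # current run of consecutive content words
    for tok in tokenize(text) + [None]:  # None flushes the final run
        if tok is not None and tok.isalpha() and tok not in STOPWORDS:
            run.append(tok)
            continue
        # A stopword, punctuation mark, or end of text closes the run.
        for n in range(1, max_len + 1):
            for i in range(len(run) - n + 1):
                counts[" ".join(run[i:i + n])] += 1
        run = []
    return counts


def cluster_by_head(terms):
    """Group terms by their last word, a crude proxy for the head noun."""
    clusters = defaultdict(list)
    for term in sorted(terms):
        clusters[term.split()[-1]].append(term)
    return dict(clusters)


counts = candidate_terms(SAMPLE)
clusters = cluster_by_head(counts)
print(clusters["company"])
```

Grouping by head word is only one simple clustering choice; it tends to place a general term such as "company" alongside its more specific variants, which gives a first hint of possible class and subclass candidates.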

METHODOLOGY

This project applied METHONTOLOGY (Fernández-López, Gómez-Pérez, & Juristo, 1997), a complete ontology development process, as defined by the IEEE 1074-1995 standard. It is a suitable methodology for building an ontology from scratch. The ontology building process has seven phases:

Phase 1: Specification

Phase 2: Knowledge acquisition

Phase 3: Conceptualisation

Phase 4: Integration

Phase 5: Implementation

Phase 6: Evaluation

Phase 7: Documentation

RESULTS


Figure (images/Project/Project_Fatihah1.png): Part of the historical news taxonomic domain, built using TopBraid Composer.
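
To make the idea of a taxonomic domain concrete, the sketch below encodes a small class hierarchy and serialises it as Turtle, a text format for RDF and OWL data. The class names and the namespace are hypothetical placeholders, not taken from the project's ontology, and the code is a minimal illustration rather than the way the taxonomy was actually built in TopBraid Composer.

```python
# Hypothetical hierarchy mapping each class to its parent (None marks a root).
# These names are illustrative only and do not come from the project's ontology.
TAXONOMY = {
    "Event": None,
    "PoliticalEvent": "Event",
    "Election": "PoliticalEvent",
    "Person": None,
}

# Placeholder namespace (an assumption for the example).
BASE = "http://example.org/historical-news#"


def to_turtle(taxonomy, base=BASE):
    """Serialise the hierarchy as OWL classes linked by rdfs:subClassOf."""
    lines = [
        f"@prefix : <{base}> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
    ]
    for child, parent in taxonomy.items():
        if parent is None:
            lines.append(f":{child} a owl:Class .")
        else:
            lines.append(f":{child} a owl:Class ;")
            lines.append(f"    rdfs:subClassOf :{parent} .")
    return "\n".join(lines) + "\n"


def ancestors(taxonomy, name):
    """Return the superclasses of a class, from nearest parent up to the root."""
    chain = []
    parent = taxonomy[name]
    while parent is not None:
        chain.append(parent)
        parent = taxonomy[parent]
    return chain


turtle_text = to_turtle(TAXONOMY)
print(turtle_text)
print(ancestors(TAXONOMY, "Election"))
```

The `ancestors` helper shows the kind of question a taxonomy answers directly: every instance of a class is also an instance of each class above it in the hierarchy.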


Figure (images/Project/Project_Fatihah3.png): Evaluation method.



This work is a research project funded under F08/SpSTG/1363/16/5.