Abstract—The semantic gap between human language and machine language (the logical view of text used by a machine) is the most important challenge of content management. Natural language exchanges knowledge by focusing on the specific message that a text is intended to convey, while assuming a large body of shared background knowledge. Thus, a considerable amount of the knowledge in a text is implicit and remains “under the surface” [1]. This study aims to reveal this “under the surface” knowledge of a text, named the “Latent Semantic Domain”, so that a machine can recognize and process it. Given a text and an ontology that models the domain knowledge, the specific message of the text consists of those concepts that explicitly appear in the text. The remaining part of the ontology constitutes the background knowledge. However, not all concepts of the ontology contribute equally to the Latent Semantic Domain of the text, so it is important to specify how each concept is involved in it. To do so, the semantic relatedness between the concepts mentioned in the text, taken as a whole unit, and the other concepts of the domain should be measured. This measure determines how closely a domain concept is related to the specific message of the text. To implement this idea, each concept of the ontology is represented by a vector that semantically describes the concept in the semantic space of the domain. When such a vector is restricted to the concepts of a text, it describes its concept in the semantic domain of the text. This representation of concepts provides the formal basis for comparing concepts and determining their role in the Latent Semantic Domain of a text.
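The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the toy ontology, the distance-decay weighting, the summing of concept vectors to represent the text as a whole unit, and the use of cosine similarity as the relatedness measure are all assumptions made for illustration, as are the names `EDGES`, `concept_vector`, and `latent_domain`.

```python
from collections import deque
from math import sqrt

# Hypothetical toy ontology: undirected links between concept names.
EDGES = [
    ("vehicle", "car"), ("vehicle", "bicycle"),
    ("car", "engine"), ("car", "wheel"), ("bicycle", "wheel"),
    ("engine", "fuel"),
]


def build_graph(edges):
    """Build an adjacency map from a list of undirected edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def concept_vector(graph, concept, concepts, decay=0.5):
    """Describe a concept by its closeness to every concept in the domain.

    Assumed weighting: decay ** (shortest-path distance), found by BFS.
    Unreachable concepts get weight 0.
    """
    dist = {concept: 0}
    queue = deque([concept])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return [decay ** dist[c] if c in dist else 0.0 for c in concepts]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def latent_domain(edges, text_concepts):
    """Rank concepts absent from the text by relatedness to the text."""
    graph = build_graph(edges)
    concepts = sorted(graph)
    vectors = {c: concept_vector(graph, c, concepts) for c in concepts}
    # Assumption: the text "as a whole unit" is the sum of its concept vectors.
    text_vec = [sum(col) for col in zip(*(vectors[c] for c in text_concepts))]
    scores = {
        c: cosine(text_vec, vectors[c])
        for c in concepts
        if c not in text_concepts
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    for concept, score in latent_domain(EDGES, ["car", "fuel"]):
        print(f"{concept}: {score:.3f}")
```

For a text that mentions only "car" and "fuel", this sketch ranks "engine" as the most related background concept and "bicycle" as the least, which matches the intuition that background concepts contribute unequally to the latent domain of a text.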
Index Terms—Ontology, latent semantic analysis, text mining, natural language processing.
The authors are with the Computer Engineering Department of Engineering Faculty, University of Isfahan, Isfahan, Iran (e-mail: khounsiavash@eng.ui.ac.ir, ahmadb@eng.ui.ac.ir).
Cite: Ehsan KhounSaivash and Ahmad Baraani-Dastjerdi, "An Ontology-based Approach to Text Latent Semantic Domain Recognition,"
International Journal of Computer Theory and Engineering vol. 5, no. 1, pp. 178-182, 2013.