Abstract—In information retrieval, the documents retrieved for a query are often clustered to help users find the information they need more effectively. Some frequent queries seek information on one issue from the aspect of another issue. However, current clustering methods do not take into account the aspect contained in such queries once the retrieval process is complete. In this paper we introduce aspect-oriented document clustering, which groups documents more meaningfully according to a particular point of view. In our approach, text documents are represented with respect to a particular aspect, and the similarity between them is computed on the basis of the features of that aspect. We use Wikipedia as background knowledge to emphasize and enrich the concept of the aspect. We then evaluate the proposed approach on selected documents from two popular datasets, 20 Newsgroups and Reuters-21578. The results demonstrate that aspect-oriented clustering significantly enhances clustering performance on these documents, which can be regarded as equivalent to documents retrieved by aspect-based queries.
Index Terms—Aspect based representation, aspect-oriented clustering, clustering, Wikipedia.
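The idea of representing documents with respect to an aspect before computing similarity can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the representation, so the term-frequency vectors, the `boost` factor, and the hand-picked `aspect_terms` set (standing in for concepts that the paper derives from Wikipedia) are all assumptions made for illustration.

```python
import math
from collections import Counter


def aspect_weighted_vector(tokens, aspect_terms, boost=2.0):
    """Term-frequency vector in which aspect-related terms are up-weighted.

    `boost` is an assumed illustrative factor, not a value from the paper.
    """
    counts = Counter(tokens)
    return {t: c * (boost if t in aspect_terms else 1.0) for t, c in counts.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical aspect vocabulary (assumption): a stand-in for Wikipedia-derived concepts.
aspect_terms = {"economy", "market", "trade"}

doc_a = "oil market prices rise as trade slows".split()
doc_b = "trade talks affect the oil market".split()
doc_c = "oil spill damages the coast".split()

# Similarity without any aspect (empty aspect set) versus with the aspect applied.
plain_ab = cosine(aspect_weighted_vector(doc_a, set()), aspect_weighted_vector(doc_b, set()))
aspect_ab = cosine(aspect_weighted_vector(doc_a, aspect_terms), aspect_weighted_vector(doc_b, aspect_terms))
plain_ac = cosine(aspect_weighted_vector(doc_a, set()), aspect_weighted_vector(doc_c, set()))
aspect_ac = cosine(aspect_weighted_vector(doc_a, aspect_terms), aspect_weighted_vector(doc_c, aspect_terms))
```

In this toy setting, the two documents that share aspect terms become more similar under the aspect-weighted representation, while the document that shares only the topic word "oil" becomes less similar. A clustering algorithm run on such similarities would therefore tend to group documents by the chosen point of view rather than by topic alone.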
M. Hosseinia was with the University of Tehran, Tehran, Iran (e-mail: mhosseinia@ut.ac.ir).
K. Badie is with the Research Institute for ICT (ITRC), Tehran, Iran (e-mail: k_badie@itrc.ac.ir).
A. Moeini is with the University of Tehran, Tehran, Iran (e-mail: moeini@ut.ac.ir).
Cite: M. Hosseinia, K. Badie, and A. Moeini, "Aspect-Oriented Document Clustering for Facilitating Retrieval Process," International Journal of Computer Theory and Engineering, vol. 4, no. 5, pp. 707-711, 2012.