A New Multi-Phase Algorithm for Stemming in Farsi Language Based on Morphology

Volume 3, Number 5 (Oct. 2011)

IJCTE 2011 Vol.3(5): 623-627 ISSN: 1793-8201
DOI: 10.7763/IJCTE.2011.V3.381

Somayyeh Estahbanati, Reza Javidan, and Mehdi Nikkhah

Abstract—The main goal of stemming is to standardize words by reducing each word to its stem. This paper presents a new stemming algorithm for the Farsi (Persian) language. The stemmer removes suffixes and prefixes, and uses a database of exceptions to reduce the error rate. The proposed method improves both the speed of the stemmer and its error rate. Evaluation on prototype document collections shows a significant improvement in precision and recall compared with other well-known methods.
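The approach described in the abstract (an exception lookup followed by suffix and prefix removal) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the affix lists, the single exception entry, the minimum stem length, the order of the phases, and the names `stem`, `PREFIXES`, `SUFFIXES`, `EXCEPTIONS`, and `MIN_STEM_LEN` are hypothetical and are not taken from the paper, whose actual phases and rule tables are not given on this page.

```python
# A sketch of affix-stripping stemming with an exception table.
# All lists below are small illustrative samples (assumptions), not the paper's rules.

# Hypothetical prefix and suffix samples for Persian.
PREFIXES = ["نمی", "می"]
SUFFIXES = ["ترین", "های", "ها", "تر", "ان"]

# Hypothetical exception table: words that must not be stripped, mapped to their stems.
# Here a proper noun ends in a string that looks like a plural suffix.
EXCEPTIONS = {"تهران": "تهران"}

# Assumed guard so that stripping never leaves a stem shorter than this.
MIN_STEM_LEN = 2


def stem(word: str) -> str:
    """Return a stem for `word` using exception lookup, then affix removal."""
    # Phase 1: exceptions override every rule.
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]

    # Phase 2: remove at most one suffix, trying the longest candidates first.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            word = word[: -len(suffix)]
            break

    # Phase 3: remove at most one prefix, again longest first.
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LEN:
            word = word[len(prefix):]
            break

    return word


if __name__ == "__main__":
    for w in ["کتابها", "بزرگترین", "میرود", "تهران"]:
        print(w, "->", stem(w))
```

In this sketch the exception table is consulted first, so a word such as "تهران" is returned unchanged instead of losing its final "ان". A real stemmer would need far larger affix and exception lists and would have to handle spelling variants such as the zero-width non-joiner, which this example ignores.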

Index Terms—Farsi, Persian, language, stemming.

Somayyeh Estahbanati is with the Department of Computer Engineering, Islamic Azad University, Science and Research Branch, Ahvaz, Iran (Email: s.estahbanati@gmail.com).
Reza Javidan and Mehdi Nikkhah are with the Department of Computer Engineering, Islamic Azad University, Beyza Branch, Beyza, Iran (Email: reza.javidan@gmail.com; Nikkhah@biau.ac.ir).


Cite: Somayyeh Estahbanati, Reza Javidan, and Mehdi Nikkhah, "A New Multi-Phase Algorithm for Stemming in Farsi Language Based on Morphology," International Journal of Computer Theory and Engineering, vol. 3, no. 5, pp. 623-627, 2011.
