Abstract
Wordnet development is an active research area in NLP. Because the manual construction of the English wordnet was very costly in both time and human expertise, automatic approaches have become popular for developing wordnets in languages other than English. Automatic methods usually rely on an existing wordnet of a high-resource language and use it as the backbone of their work. In this article, we present an unsupervised approach to automatic wordnet construction that combines the Expectation–Maximization and personalized PageRank algorithms. Our method requires only commonly available language resources, namely a bilingual dictionary and a monolingual corpus, so it is applicable to many languages, including under-resourced ones. To evaluate the proposed method, we apply it to Persian, which is considered an under-resourced language in NLP. The evaluation results indicate that the proposed method can construct a high-quality, large-scale wordnet for low-resource languages. In our experiments, we achieve a precision above 93% with a recall of 50%.