Review Article
Workineh Tesema
Abstract
This paper presents sense clustering of multi-sense words in Afan Oromo. The main idea is to cluster the contexts in which a target word occurs, which provides a useful way to discover semantically related senses. Similar contexts of a given sense of a target word are clustered using three hierarchical algorithms (single, complete, and average linkage) and two partitional algorithms (EM and K-Means). All contexts of related senses are included, so clustering is performed over all the contexts in the corpus. The underlying hypothesis is that clustering captures the unity reflected among the contexts and that each cluster reveals possible relationships among them. In the experiments, of the five clustering algorithms, EM and K-Means yielded significantly higher accuracy for Afan Oromo than the hierarchical algorithms. Each cluster represents a unique sense, and some words have from two to five senses. The average accuracy on the test set was 85.5%, which is encouraging for an unsupervised machine learning approach. With this approach, finding the right number of clusters is equivalent to finding the number of senses. The result is encouraging given the approach's low resource requirements.
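The context-clustering idea can be illustrated with a minimal sketch. It assumes bag-of-words context vectors, squared Euclidean distance, and a plain K-Means loop with a deterministic farthest-first initialization; none of these choices is confirmed by the paper. The toy English contexts, and the helper names `vectorize`, `sq_dist`, and `kmeans`, are hypothetical and are not drawn from the Afan Oromo corpus.

```python
from collections import Counter

def vectorize(contexts):
    """Turn each context string into a bag-of-words count vector."""
    vocab = sorted({w for c in contexts for w in c.split()})
    vectors = []
    for c in contexts:
        counts = Counter(c.split())
        vectors.append([float(counts[w]) for w in vocab])
    return vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iters=20):
    """Plain K-Means; returns one cluster label per input vector."""
    # Assumed initialization: start from the first vector, then repeatedly
    # add the vector farthest from the centroids chosen so far.
    centroids = [vectors[0][:]]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(far[:])
    labels = None
    for _ in range(iters):
        # Assignment step: each context goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j]))
                      for v in vectors]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, l in zip(vectors, labels) if l == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical contexts of one ambiguous target word (target word removed).
contexts = [
    "river water shore",
    "river shore fishing",
    "water shore fishing",
    "money loan deposit",
    "money deposit account",
    "loan account deposit",
]

labels = kmeans(vectorize(contexts), k=2)
print(labels)
```

In this sketch each resulting cluster stands for one candidate sense, so choosing `k` plays the role of choosing the number of senses. The EM and hierarchical variants compared in the paper would replace the `kmeans` step while keeping the same context vectors.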