Distributed Representations of Mongolian Words and Its Efficient Estimation

WUYUNTANA, SIRI GULENG WANG

Abstract


Word vectors have good semantic properties that can be used to improve and simplify many natural language processing applications. In this paper, we use two model architectures, Continuous Bag-of-Words (CBOW) and Skip-gram, to compute Mongolian word vectors. On this basis, we design a comprehensive test set, based on features of the Mongolian language, to measure the syntactic and semantic similarity of Mongolian words, and we use this test set to estimate the quality of the Mongolian word vectors. Experiments show that the Skip-gram architecture works better than CBOW on the Mongolian syntactic and semantic tasks, and that the word vectors computed by this model are of good quality. Taking Mongolian verb vectors as an example, we also find that there are multiple similarities between the computed Mongolian word vectors.
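
As an illustration of how a syntactic and semantic test set of this kind is commonly scored, the following is a minimal sketch of the vector-offset analogy test, assuming questions of the form "a is to b as c is to d". It is not the procedure or data used in this paper: the function names, the toy two-dimensional vectors, and the English placeholder words are all hypothetical and stand in for trained Mongolian word vectors.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def solve_analogy(vectors, a, b, c):
    """Return the word whose vector is closest to v_b - v_a + v_c,
    excluding the three question words themselves."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

def analogy_accuracy(vectors, questions):
    """Fraction of (a, b, c, d) questions answered exactly."""
    correct = sum(solve_analogy(vectors, a, b, c) == d
                  for a, b, c, d in questions)
    return correct / len(questions)

# Hypothetical toy vectors (assumption): placeholders for trained embeddings.
toy_vectors = {
    "walk": [1.0, 0.0],
    "walked": [1.0, 1.0],
    "jump": [2.0, 0.0],
    "jumped": [2.0, 1.0],
    "tree": [-1.0, 0.5],
}
# Hypothetical single-question test set (assumption).
toy_questions = [("walk", "walked", "jump", "jumped")]

accuracy = analogy_accuracy(toy_vectors, toy_questions)
```

In such a setup, the same scoring function could be applied to vectors trained with CBOW and with Skip-gram, so that the two architectures are compared on identical questions.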

Keywords


Word vectors, Continuous Bag-of-Words model, Skip-gram model, semantic syntactic evaluation.


DOI
10.12783/dtcse/iceit2017/19851
