SGNS-WNE: The word-like n-gram embedding version of the skip-gram model with negative sampling
SGNS-WNE is an open-source implementation of our framework for learning distributed representations of words by embedding word-like character n-grams, as described in the papers listed below.
Most of the C++ code used to compute n-gram representations with SGNS is taken from word2vec (Google Code) [1] and w2v-sembei [2].
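For readers new to the underlying method, the sketch below illustrates the two ingredients the framework combines: enumerating character n-grams directly from unsegmented text, and the skip-gram negative-sampling update. It is a minimal illustration, not this repository's actual code; the identifiers (`char_ngrams`, `sgns_update`) are hypothetical, and the n-gram loop treats each byte as one character for simplicity, whereas real Japanese text would require UTF-8 code-point handling.

```cpp
// Minimal sketch of character n-gram enumeration and one SGNS update.
// All names are illustrative; this is not the repository's implementation.
#include <cmath>
#include <cstddef>
#include <iostream>
#include <random>
#include <string>
#include <vector>

using Vec = std::vector<float>;

float sigmoid(float x) { return 1.0f / (1.0f + std::exp(-x)); }

float dot(const Vec& a, const Vec& b) {
    float s = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Collect every character n-gram of length 1..max_n from raw text,
// with no word segmentation (bytes stand in for characters here).
std::vector<std::string> char_ngrams(const std::string& text, std::size_t max_n) {
    std::vector<std::string> grams;
    for (std::size_t n = 1; n <= max_n; ++n)
        for (std::size_t i = 0; i + n <= text.size(); ++i)
            grams.push_back(text.substr(i, n));
    return grams;
}

// One SGNS step: move the target n-gram vector toward the observed
// context vector (label 1) and away from sampled negatives (label 0).
void sgns_update(Vec& target, Vec& context,
                 std::vector<Vec*>& negatives, float lr) {
    Vec grad(target.size(), 0.0f);  // buffered gradient for the target
    float g = lr * (1.0f - sigmoid(dot(target, context)));  // positive pair
    for (std::size_t i = 0; i < target.size(); ++i) {
        grad[i] += g * context[i];
        context[i] += g * target[i];
    }
    for (Vec* neg : negatives) {  // push away from each negative sample
        float gn = -lr * sigmoid(dot(target, *neg));
        for (std::size_t i = 0; i < target.size(); ++i) {
            grad[i] += gn * (*neg)[i];
            (*neg)[i] += gn * target[i];
        }
    }
    for (std::size_t i = 0; i < target.size(); ++i) target[i] += grad[i];
}

int main() {
    // All 1- to 3-grams of an unsegmented string.
    for (const auto& g : char_ngrams("embed", 3)) std::cout << g << ' ';
    std::cout << '\n';

    // One toy update on randomly initialized 8-dimensional vectors.
    std::mt19937 rng(0);
    std::uniform_real_distribution<float> u(-0.5f, 0.5f);
    auto rand_vec = [&] { Vec v(8); for (float& x : v) x = u(rng); return v; };
    Vec target = rand_vec(), context = rand_vec();
    Vec n1 = rand_vec(), n2 = rand_vec();
    std::vector<Vec*> negatives{&n1, &n2};
    sgns_update(target, context, negatives, 0.025f);
    std::cout << "positive-pair score after update: "
              << sigmoid(dot(target, context)) << '\n';
    return 0;
}
```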
References
[1] Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient Estimation of Word Representations in Vector Space. In Proceedings of ICLR 2013.
[2] Oshikiri, T. (2017). Segmentation-Free Word Embedding for Unsegmented Languages. In Proceedings of EMNLP 2017.
[3] Kudo, T., Yamamoto, K., & Matsumoto, Y. (2004). Applying Conditional Random Fields to Japanese Morphological Analysis. In Proceedings of EMNLP 2004.
[4] MeCab: Yet Another Part-of-Speech and Morphological Analyzer.
About
C++ implementation of the paper "Word-like n-gram embedding" (EMNLP 2018 Workshop on Noisy User-generated Text).