Project author: meghdadFar

Project description:
Python implementation of Substitution-driven Measures of Association
Language: Python
Repository: git://github.com/meghdadFar/SDMA.git
Created: 2018-03-28T22:04:17Z
Project page: https://github.com/meghdadFar/SDMA


Substitution-driven Measures of Association (SDMAs) for extracting collocations

SDMAs can be used as an alternative to measures such as PMI and chi-squared to identify collocations
in a corpus of text. Unlike PMI and other purely statistical measures, which are blind to the meaning of words,
SDMAs measure statistical association by taking into account the degree of
semantic non-substitutability of word sequences. Non-substitutability is
a linguistic test that measures the fixedness of a phrase: for example, "strong tea" is a collocation
because substituting the near-synonym in "powerful tea" yields a far less natural phrase.
SDMAs have been shown to considerably outperform association measures such as
Pointwise Mutual Information (PMI) at identifying collocations.
You can read more about the theory behind these measures in this Jupyter notebook.
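
For reference, the PMI baseline mentioned above can be sketched as follows. This is a minimal illustration and not part of the SDMA code: the function name `bigram_pmi` and the toy sentence are assumptions made for the example, and probabilities are plain maximum-likelihood estimates from raw counts. It computes PMI(x, y) = log2(p(x, y) / (p(x) p(y))) for adjacent word pairs.

```python
import math
from collections import Counter


def bigram_pmi(tokens):
    """Return {(w1, w2): PMI} for every adjacent word pair in `tokens`.

    Hypothetical helper for illustration only; it is not the SDMA API.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (w1, w2), count in bigrams.items():
        p_xy = count / n_bi           # joint probability of the pair
        p_x = unigrams[w1] / n_uni    # marginal probability of the first word
        p_y = unigrams[w2] / n_uni    # marginal probability of the second word
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Toy corpus (an assumption for this example).
tokens = "new york is big and new york is busy".split()
scores = bigram_pmi(tokens)
```

On this toy input the pair ("big", "and") scores higher than ("new", "york") simply because both of its words are rare. The score depends only on counts and knows nothing about word meaning, which is the limitation that SDMAs are described as addressing.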

Applications

Like PMI, SDMAs can be used to identify collocations or multiword expressions.

Usage