Author: rhythmcao

Description: Source code and data for the journal article "Dual Learning for Semi-Supervised Natural Language Understanding" (TASLP 2020).

Language: Python

Repository: git://github.com/rhythmcao/slu-dual-learning.git

Created: 2021-03-29T10:35:30Z

Community: https://github.com/rhythmcao/slu-dual-learning



Dual Learning for Semi-Supervised Natural Language Understanding


This project contains the source code and data for the journal article *Dual Learning for Semi-Supervised Natural Language Understanding*, published in IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2020. If you find it useful, please cite our work.

```
@article{Zhu_2020,
  title={Dual Learning for Semi-Supervised Natural Language Understanding},
  ISSN={2329-9304},
  url={http://dx.doi.org/10.1109/TASLP.2020.3001684},
  DOI={10.1109/taslp.2020.3001684},
  journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
  publisher={Institute of Electrical and Electronics Engineers (IEEE)},
  author={Zhu, Su and Cao, Ruisheng and Yu, Kai},
  year={2020},
  pages={11}
}
```

Preparations

1. Create the conda environment `slu` and download dependencies such as char/word vectors and the pretrained language model `bert-base-uncased`:

   ```
   ./environment.sh
   ```

2. Construct the vocabulary, slot-value database, and intent-slot co-occurrence matrix (a rough sketch of this step follows at the end of this section):

   ```
   python utils/preparations.py --dataset atis snips
   ```

All outputs are saved in the directory `data`.
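As a rough illustration of the last preparation step, the sketch below builds an intent-slot co-occurrence matrix from (intent, slots) annotations. The data layout and all names here are hypothetical; `utils/preparations.py` is the authoritative implementation.

```python
# Minimal sketch of building an intent-slot co-occurrence matrix.
# The example format is made up; see utils/preparations.py for the
# project's actual preprocessing.
import numpy as np

# Toy annotations: each training example pairs an intent with its slots.
examples = [
    {"intent": "atis_flight", "slots": ["fromloc.city_name", "toloc.city_name"]},
    {"intent": "atis_flight", "slots": ["depart_date.day_name"]},
    {"intent": "atis_airfare", "slots": ["fromloc.city_name", "toloc.city_name"]},
]

intents = sorted({ex["intent"] for ex in examples})
slots = sorted({s for ex in examples for s in ex["slots"]})
intent_idx = {i: k for k, i in enumerate(intents)}
slot_idx = {s: k for k, s in enumerate(slots)}

# Count how often each slot appears under each intent.
cooc = np.zeros((len(intents), len(slots)), dtype=np.int64)
for ex in examples:
    for s in ex["slots"]:
        cooc[intent_idx[ex["intent"]], slot_idx[s]] += 1

print(cooc)
```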


Supervised experiments

All experimental outputs are saved in the directory `exp` by default; see utils/hyperparam.py.

SLU task

Running script (`labeled` is the ratio of labeled examples kept from the full training set; a sketch of this split follows below):

```
./run/run_slu.sh [atis|snips] labeled [birnn|birnn+crf|focus]
```

Or with BERT:

```
./run/run_slu_bert.sh [atis|snips] labeled [birnn|birnn+crf|focus]
```
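For reference, the `labeled` argument corresponds to a split like the one sketched below. This is a hedged illustration, not the project's actual data loader; the function name, shuffling policy, and seed are assumptions.

```python
# Sketch of splitting a training set by a labeled ratio, e.g. 0.1 keeps
# 10% of the examples as labeled and treats the rest as unlabeled.
# Hypothetical helper; the real split logic lives in the project's data utilities.
import random

def split_by_labeled_ratio(train_set, labeled_ratio, seed=999):
    data = list(train_set)
    random.Random(seed).shuffle(data)  # fixed seed for reproducibility
    cut = int(len(data) * labeled_ratio)
    return data[:cut], data[cut:]      # (labeled, unlabeled)

labeled, unlabeled = split_by_labeled_ratio(range(100), 0.1)
print(len(labeled), len(unlabeled))    # 10 90
```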

NLG task

Running script:

```
./run/run_nlg.sh [atis|snips] labeled [sclstm|sclstm+copy]
```
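The NLG model maps a semantic frame (intent plus slot-value pairs) back to an utterance; the `+copy` variant presumably adds a copy mechanism for slot values. A hypothetical ATIS-style training pair, purely for illustration (field names are not the project's actual data format):

```python
# Hypothetical ATIS-style NLG pair: the input is a semantic frame,
# the target is the utterance. Illustrative only.
nlg_example = {
    "intent": "atis_flight",
    "slots": {"fromloc.city_name": "boston", "toloc.city_name": "denver"},
    "text": "show me flights from boston to denver",
}
```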

Language Model task

Running script:

```
./run/run_lm.sh [atis|snips] [surface|sentence]
```

`surface` trains the LM on delexicalized utterances in which each slot value is replaced by its slot name, while `sentence` trains the LM on the original natural-language utterances.
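Concretely, the surface-level LM sees delexicalized utterances. Below is a minimal sketch of that substitution, assuming BIO-style slot tags; the helper name and tag scheme are illustrative, not the project's exact preprocessing.

```python
# Sketch of delexicalization for the surface-level LM: tokens tagged as
# slot values are replaced by their slot names. BIO tags are assumed.
def delexicalize(tokens, bio_tags):
    out = []
    for tok, tag in zip(tokens, bio_tags):
        if tag.startswith("B-"):
            out.append(tag[2:])   # keep the slot name once
        elif tag.startswith("I-"):
            continue              # drop continuation tokens of the value
        else:
            out.append(tok)       # ordinary word, kept as-is
    return out

tokens = ["flights", "from", "boston", "to", "new", "york"]
tags = ["O", "O", "B-fromloc.city_name", "O", "B-toloc.city_name", "I-toloc.city_name"]
print(delexicalize(tokens, tags))
# ['flights', 'from', 'fromloc.city_name', 'to', 'toloc.city_name']
```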


Semi-supervised experiments

Note: any model path such as `read_slu_model_path` in the running scripts below can be replaced with the path to another supervised model.

Dual pseudo labeling

Running script:

```
./run/run_dual_pseudo_labeling.sh [atis|snips] labeled [focus|bert]
```
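At a high level, dual pseudo labeling lets the pretrained SLU and NLG models annotate the unlabeled data for each other and retrains both on the union of real and pseudo-labeled pairs. The schematic below shows that loop with made-up model interfaces (`predict`, `generate`, `train_step`); it is not the script's actual code.

```python
# Schematic of dual pseudo labeling; every model method used here is a
# placeholder, named only for illustration.
def dual_pseudo_labeling(slu, nlg, labeled_pairs, utterances, frames, epochs=10):
    for _ in range(epochs):
        # SLU labels raw utterances -> pseudo (utterance, frame) pairs.
        pseudo_from_text = [(u, slu.predict(u)) for u in utterances]
        # NLG realizes bare frames -> pseudo (utterance, frame) pairs.
        pseudo_from_frame = [(nlg.generate(f), f) for f in frames]
        # Retrain both models on real plus pseudo-labeled data.
        data = labeled_pairs + pseudo_from_text + pseudo_from_frame
        for utterance, frame in data:
            slu.train_step(utterance, frame)
            nlg.train_step(frame, utterance)
    return slu, nlg
```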

Dual learning

Running script:

```
./run/run_dual_learning.sh [atis|snips] labeled [focus|bert]
```
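In the dual learning loop, an unlabeled sample is pushed through one model, the intermediate output is scored with a validity reward plus the dual model's reconstruction likelihood, and both directions are updated. The schematic below shows one SLU-to-NLG cycle; all interfaces and the mixing weight `alpha` are placeholders, not the paper's exact formulation.

```python
# Schematic of one SLU -> NLG dual-learning cycle. Every method here
# (sample, score, logprob, reinforce_update, mle_update) is a placeholder.
def slu_to_nlg_cycle(slu, nlg, frame_scorer, utterance, alpha=0.5):
    # Primal step: sample a semantic frame from the SLU model,
    # keeping its log-probability for the policy-gradient update.
    frame, logprob = slu.sample(utterance)
    # Validity reward: how plausible the sampled frame looks on its own.
    validity = frame_scorer.score(frame)
    # Reconstruction reward: the dual NLG model's likelihood of
    # recovering the original utterance from the sampled frame.
    reconstruction = nlg.logprob(utterance, given=frame)
    reward = alpha * validity + (1 - alpha) * reconstruction
    # Update the primal model by policy gradient and the dual model
    # on the reconstruction objective.
    slu.reinforce_update(logprob, reward)
    nlg.mle_update(frame, utterance)
```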

Dual pseudo labeling + Dual learning

Running script:

```
./run/run_dual_plus_pseudo.sh [atis|snips] labeled [focus|bert]
```