Paper List: Deep Learning for NLP -- Neural Network Language Models (NNLM)

Neural Language Models

  • [neural LM] Bengio et al., “A Neural Probabilistic Language Model.” pdf Journal of Machine Learning Research 2003
  • [log-bilinear LM] Andriy Mnih and Geoffrey Hinton. “Three New Graphical Models for Statistical Language Modelling.” ICML 2007
  • [discriminative LM] Brian Roark, Murat Saraclar, and Michael Collins. “Discriminative n-gram language modeling.” pdf Computer Speech and Language, 21(2):373-392. 2007
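
To make the first entry concrete, here is a minimal sketch of the forward pass of Bengio et al.'s neural probabilistic LM: embed the n previous words with a shared table, concatenate, pass through a tanh hidden layer, and softmax over the vocabulary. All sizes and weights below are illustrative, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

V, d, n, h = 10, 4, 3, 8   # vocab size, embedding dim, context length, hidden units
C = rng.normal(size=(V, d))          # shared word-embedding table
H = rng.normal(size=(n * d, h))      # context -> hidden weights
U = rng.normal(size=(h, V))          # hidden -> output scores

def next_word_probs(context_ids):
    """P(w_t | w_{t-n}, ..., w_{t-1}) for every word in the vocabulary."""
    x = C[context_ids].reshape(-1)       # concatenate the n context embeddings
    a = np.tanh(x @ H)                   # hidden layer
    logits = a @ U
    e = np.exp(logits - logits.max())    # numerically stable softmax
    return e / e.sum()

p = next_word_probs([1, 5, 7])           # a proper distribution over V words
```

The key idea the later papers build on is that the embedding table C is learned jointly with the LM objective.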

Long short-term memory (LSTM) networks

  • [parsing] Oriol Vinyals, Lukasz Kaiser, Terry Koo, Slav Petrov, Ilya Sutskever, Geoffrey Hinton, “Grammar as a Foreign Language” pdf arXiv 2014
  • [program] Wojciech Zaremba, Ilya Sutskever, “Learning to Execute” pdf arXiv 2014
  • [translation] Ilya Sutskever, Oriol Vinyals, Quoc Le, “Sequence to Sequence Learning with Neural Networks” pdf NIPS 2014
  • [attention-based LSTM, summarization] Alexander M. Rush, Sumit Chopra and Jason Weston, “A Neural Attention Model for Abstractive Sentence Summarization” pdf EMNLP 2015
  • [bi-LSTM, character] Wang Ling, Tiago Luis, Luis Marujo, Ramon Fernandez Astudillo, Silvio Amir, Chris Dyer, Alan W Black, Isabel Trancoso, “Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation” pdf EMNLP 2015
  • [reading gate, dialogue cell] Tsung-Hsien Wen, Milica Gasic, Nikola Mrksic, Pei-Hao Su, David Vandyke, Steve Young, “Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems” pdf EMNLP 2015 Best Paper
  • [state embedding, character] Miguel Ballesteros, Chris Dyer and Noah A. Smith, “Improved Transition-Based Parsing by Modeling Characters instead of Words with LSTMs” pdf EMNLP 2015
  • [no stacked, highway networks, character, CNN with LSTM] Yoon Kim, Yacine Jernite, David Sontag, Alexander M. Rush, “Character-Aware Neural Language Models” pdf arXiv preprint
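
Since every entry above relies on the same gated recurrence, here is one LSTM cell step in plain numpy to make the gating explicit. Weight shapes and the fused-matrix layout are illustrative conventions, not tied to any paper above.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

d_in, d_h = 5, 4
# One fused weight matrix for the input, forget, output, and candidate gates.
W = rng.normal(size=(d_in + d_h, 4 * d_h)) * 0.1
b = np.zeros(4 * d_h)

def lstm_step(x, h, c):
    z = np.concatenate([x, h]) @ W + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    c_new = f * c + i * np.tanh(g)       # cell state: gated memory update
    h_new = o * np.tanh(c_new)           # hidden state exposed to the next layer
    return h_new, c_new

h = c = np.zeros(d_h)
for x in rng.normal(size=(3, d_in)):     # run three timesteps
    h, c = lstm_step(x, h, c)
```

The additive cell update `f * c + i * tanh(g)` is what lets gradients survive long sequences, which is why the seq2seq and parsing papers above can consume whole sentences.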

CNNs: convolutional neural networks for language

  • [convolving from character level to document level] Xiang Zhang, Yann LeCun. “Text Understanding from Scratch” pdf arXiv 2015
  • [character LM for doc-level] Peng, F., Schuurmans, D., Keselj, V. and Wang, S. “Language independent authorship attribution using character level language models.” pdf EACL 2004.
  • [convnet for sentences, dynamic, k-max pooling, stacked] Nal Kalchbrenner, Edward Grefenstette and Phil Blunsom. “A Convolutional Neural Network for Modelling Sentences” pdf ACL 2014.
  • [unsupervised pretraining for CNN] Wenpeng Yin and Hinrich Schutze. “Convolutional Neural Network for Paraphrase Identification.” pdf NAACL 2015
  • [convolving with word order, parallel CNN, different regions] Rie Johnson and Tong Zhang. “Effective Use of Word Order for Text Categorization with Convolutional Neural Networks” pdf NAACL 2015
  • [character, ConvNet, data augmentation] Xiang Zhang, Junbo Zhao, Yann LeCun, “Character-level Convolutional Networks for Text Classification” pdf NIPS 2015
  • [no stacked, highway networks, character, CNN with LSTM] Yoon Kim, Yacine Jernite, David Sontag, Alexander M. Rush, “Character-Aware Neural Language Models” pdf arXiv preprint
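
The k-max pooling tag from the Kalchbrenner et al. entry is worth unpacking: for each feature row, keep the k largest activations in their original order, so a variable-length sentence maps to a fixed-size matrix. A minimal sketch (toy inputs, not the paper's dynamic variant):

```python
import numpy as np

def k_max_pool(features, k):
    """features: (n_filters, sentence_length) -> (n_filters, k)."""
    # indices of the k largest values per row, re-sorted by position
    # so that word order within the selection is preserved
    idx = np.argpartition(features, -k, axis=1)[:, -k:]
    idx = np.sort(idx, axis=1)
    return np.take_along_axis(features, idx, axis=1)

feats = np.array([[1.0, 5.0, 2.0, 4.0, 3.0],
                  [0.0, 0.1, 0.9, 0.2, 0.8]])
pooled = k_max_pool(feats, k=2)
# row 0 keeps 5.0 then 4.0 (order preserved); row 1 keeps 0.9 then 0.8
```

Because the output width is k regardless of sentence length, stacked convolutional layers can follow without any padding tricks.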

QA with commonsense reasoning

  • [nlp for AI] Jason Weston, Antoine Bordes, Sumit Chopra, Tomas Mikolov. “Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks” pdf arXiv 2015
  • [memory networks] Jason Weston, Sumit Chopra, Antoine Bordes “Memory Networks” pdf ICLR 2015
  • [winograd schema] Hector J. Levesque. “The Winograd Schema Challenge” pdf AAAI Spring Symposium: Logical Formalizations of Commonsense Reasoning 2011
  • [textual entailment] Ion Androutsopoulos, Prodromos Malakasiotis, “A Survey of Paraphrasing and Textual Entailment Methods” pdf Journal of Artificial Intelligence Research, 38:135-187. 2010
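
The core read operation behind the memory-networks entry can be sketched in a few lines: score a query against stored memory vectors and read out a softmax-weighted sum. The dimensions and random embeddings below are illustrative only, not the paper's full architecture (which also learns the embeddings and writes to memory).

```python
import numpy as np

rng = np.random.default_rng(2)

memories = rng.normal(size=(6, 8))    # 6 stored facts, 8-dim embeddings
query = rng.normal(size=8)

scores = memories @ query             # relevance of each memory to the query
e = np.exp(scores - scores.max())
attn = e / e.sum()                    # soft address over the memory slots
readout = attn @ memories             # retrieved evidence for answering
```

Chaining several such reads ("hops") is what lets these models answer the multi-step bAbI toy tasks from the first entry.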

Compositional

  • Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeff Dean, “Distributed Representations of Words and Phrases and their Compositionality,” pdf NIPS 2013
  • [Socher’s recursive neural networks]
  • [cutting RNN trees] Christian Scheible, Hinrich Schutze. “Cutting Recursive Autoencoder Trees” pdf CoRR abs/1301.2811 (2013)
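
A toy illustration of the additive compositionality idea in the Mikolov et al. entry: approximate a phrase's meaning by summing its word vectors and compare neighbors by cosine similarity. The tiny hand-built vectors below are invented purely for illustration.

```python
import numpy as np

emb = {
    "river": np.array([1.0, 0.0, 0.0]),
    "bank":  np.array([0.5, 0.5, 0.0]),
    "money": np.array([0.0, 1.0, 0.0]),
    "water": np.array([0.9, 0.1, 0.0]),
}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

phrase = emb["river"] + emb["bank"]     # compose by vector addition
# with these toy vectors, "water" sits closer to "river bank" than "money"
closer = cosine(phrase, emb["water"]) > cosine(phrase, emb["money"])
```

Recursive models (the Socher line of work) replace this flat sum with a learned composition function applied along a parse tree.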