Lecture 2: NLP Tools
- Chapters 1, 2, 3, 5, and 7 of the NLTK book. Also read Chapter 4 if you are not familiar with Python.
- The Penn Treebank Tagset.
- Part-of-Speech Tagging Guidelines for the Penn Treebank Project, Beatrice Santorini, 1990.
- Penn Treebank Constituent Tags.
- Bracketing Guidelines for Treebank II Style. Penn Treebank Project, Ann Bies, Mark Ferguson, Karen Katz, Robert MacIntyre, 1995.
- Tregex and Tsurgeon: tools for querying and manipulating tree data structures, Roger Levy and Galen Andrew, LREC 2006.
- Slides and Javadoc for Semgrex, Chloe Kiddon, ~2006.
- Stanford typed dependencies manual, Marie-Catherine de Marneffe and Christopher D. Manning, revised 2012.
- Section 3, especially Table 4, in The CoNLL-2008 Shared Task on Joint Parsing of Syntactic and Semantic Dependencies, Mihai Surdeanu, Richard Johansson, Adam Meyers, Lluis Marquez, and Joakim Nivre, CoNLL 2008.
- The CoNLL-2000 Shared Task on Text Chunking.
- Deterministic Coreference Resolution Based on Entity-Centric, Precision-Ranked Rules, Heeyoung Lee, Angel Chang, Yves Peirsman, Nathanael Chambers, Mihai Surdeanu, Dan Jurafsky, Computational Linguistics 39(4), 2013.
- Coreference for Learning to Extract Relations: Yes, Virginia, Coreference Matters, Ryan Gabbard, Marjorie Freedman, Ralph Weischedel, ACL 2011.
- Rich Caruana and Alexandru Niculescu-Mizil. An Empirical Comparison of Supervised Learning Algorithms, ICML, 2006.
Lecture 3: ML Tools
Lecture 4: Sentiment Analysis
Lecture 5: Information Extraction
- Andrew McCallum, Dayne Freitag and Fernando Pereira. 2000. Maximum Entropy Markov Models for Information Extraction and Segmentation, Proceedings of ICML 2000.
- On Hidden Markov Models, read one (or both) of these: Hidden Markov Models Tutorial by Andrew Moore; or Chapter 6 in Jurafsky and Martin's book.
- Mike Mintz, Steven Bills, Rion Snow, Dan Jurafsky. 2009. Distant supervision for relation extraction without labeled data, Proceedings of ACL 2009.
- Bonan Min, Xiang Li, Ralph Grishman, Ang Sun. 2012. New York University 2012 System for KBP Slot Filling, Proceedings of TAC-KBP 2012.
- Raphael Hoffmann, Congle Zhang, Xiao Ling, Luke Zettlemoyer, Daniel S. Weld. 2011. Knowledge-Based Weak Supervision for Information Extraction of Overlapping Relations, Proceedings of ACL 2011.
- Wei Xu, Raphael Hoffmann, Le Zhao, Ralph Grishman. 2013. Filling Knowledge Base Gaps for Distant Supervision of Relation Extraction, Proceedings of ACL 2013.
- Douglas E. Appelt, Jerry R. Hobbs, John Bear, David Israel and Mabry Tyson. 1993. FASTUS: A Finite-state Processor for Information Extraction from Real-world Text, Proceedings of IJCAI 1993.
- Mihai Surdeanu, Sanda Harabagiu, John Williams, and Paul Aarseth. 2003. Using Predicate-Argument Structures for Information Extraction, Proceedings of ACL 2003.
- Truc-Vien T. Nguyen, Alessandro Moschitti and Giuseppe Riccardi. 2009. Convolution Kernels on Constituent, Dependency and Sequential Structures for Relation Extraction, Proceedings of EMNLP 2009.
- Ruihong Huang and Ellen Riloff. 2011. Peeling Back the Layers: Detecting Event Role Fillers in Secondary Contexts, Proceedings of ACL 2011.
- Ruihong Huang and Ellen Riloff. 2012. Bootstrapped Training of Event Extraction Classifiers, Proceedings of EACL 2012.
- Nathanael Chambers and Dan Jurafsky. 2011. Template-Based Information Extraction without the Templates, Proceedings of ACL 2011.
- Zornitsa Kozareva and Eduard Hovy. 2010. A Semi-Supervised Method to Learn and Construct Taxonomies using the Web, Proceedings of EMNLP 2010.
- Tara McIntosh. 2010. Unsupervised Discovery of Negative Categories in Lexicon Bootstrapping, Proceedings of EMNLP 2010.
Lecture 6: Question Answering
- Marius Pasca and Sanda Harabagiu. 2001. High Performance Question/Answering, Proceedings of SIGIR.
- Xin Li and Dan Roth. 2004. Learning Question Classifiers: The Role of Semantic Information, Natural Language Engineering.
- Susan Dumais, Michele Banko, Eric Brill, Jimmy Lin, and Andrew Ng. 2002. Web Question Answering: Is More Always Better?, Proceedings of SIGIR.
- E. Saquete, P. Martinez-Barco, R. Munoz, and J.L. Vicedo. 2004. Splitting Complex Temporal Questions for Question Answering systems, Proceedings of ACL.