Date | Topic | Reading | Assignment | FAQs | Video |
---|---|---|---|---|---|
Finite-state/tagging models | |||||
1/4 | Introduction to NLP; overview of class structure; homework and term project options; tutorial options | ||||
1/6 | Dynamic programming for finite-state models: Viterbi search | sections 6.3 & 6.3.1 of Roark & Sproat | HW1 | FAQ1 | |
1/11 | More dynamic programming: Forward-Backward search; other finite-state tagging tasks | sections 6.3.3 & 6.4 of Roark & Sproat | |||
1/13 | Language modeling | CG98 | |||
1/18 | No class -- MLK Jr Day | ||||
1/20 | Dependency parsing | | HW2 | | |
1/25 | Ambiguity on input and output; n-best lists and lattices; pipeline systems & intro to our text processing "pipeline" | HR07; Norm01 | |||
1/27 | Alignment in Machine Translation | ||||
2/1 | Graph-based methods in NLP | OER05; Mil05 | |||
Context-free/parsing models | |||||
2/3 | Context-free grammars (CFGs) | ||||
2/8 | Context-free parsing: CYK | | HW3 | FAQ3 | |
2/10 | More context-free parsing | ||||
2/15 | No class -- Presidents' Day | ||||
2/17 | Bi-text parsing | Chi05; CKtut06 | |||
2/22 | Advanced context-free parsing: grammar transformations | | HW4 | | |
Beyond context-free; semantics and meaning | |||||
2/24 | Context-sensitive grammars: Unification, TAG, CCG | ||||
3/1 | Machine learning in NLP (for sequence processing) | Col02; DFL07; SP03; LMP01; SMtut06 | |||
3/3 | HW4 student presentations; Semantics: WSD, NP coreference, semantic role labeling | | HW5 | FAQ5 | |
3/8 | Information Retrieval (IR) and Question Answering (QA) | CH08; RH02 | |||
3/10 | Co-reference resolution, Information Extraction (IE), and Automatic Summarization | GKM05 | |||
3/15 | Class summary; surveying the state-of-the-art: existing systems and system competitions, likely future research directions | ||||
3/17 | Final Exam | | Final Projects (due) | | |

Key | Reference | |
---|---|---|
CG98 | Stanley Chen and Joshua Goodman. An empirical study of smoothing techniques for language modeling. Technical report TR-10-98, Harvard University, August 1998. | |
Chi05 | David Chiang. A hierarchical phrase-based model for statistical machine translation. Proceedings of the Annual Meeting of the ACL, pp. 263-270, 2005. | |
CKtut06 | David Chiang and Kevin Knight. An introduction to synchronous grammars: part of a tutorial given at ACL 2006. | |
CH08 | K. Bretonnel Cohen and Lawrence Hunter. Getting started in text mining. PLoS Computational Biology, 4(1), 2008. | |
Col02 | Michael Collins. Discriminative training methods for Hidden Markov Models: theory and experiments with perceptron algorithms. Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1-8, 2002. | |
DFL07 | Dina Demner-Fushman and Jimmy Lin. Answering Clinical Questions with Knowledge-Based and Statistical Techniques. Computational Linguistics, 33(1):63-103, 2007. | |
GKM05 | Trond Grenager, Dan Klein and Chris Manning. Unsupervised Learning of Field Segmentation Models for Information Extraction. Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL), pp. 371-378, 2005. | |
HR07 | Kristy Hollingshead and Brian Roark. Pipeline Iteration. Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 952-959, 2007. | |
LMP01 | John Lafferty, Andrew McCallum, and Fernando Pereira. Conditional Random Fields: probabilistic models for segmenting and labeling sequence data. Proceedings of the International Conference on Machine Learning (ICML), 2001. | |
Mil05 | Rada Mihalcea. Unsupervised Large-Vocabulary Word Sense Disambiguation with Graph-based Algorithms for Sequence Data Labeling. Proceedings of HLT-EMNLP, 2005. | |
OER05 | Jahna Otterbacher, Gunes Erkan and Dragomir R. Radev. Using Random Walks for Question-focused Sentence Retrieval. Proceedings of HLT-EMNLP, 2005. | |
RH02 | Deepak Ravichandran and Eduard Hovy. Learning Surface Text Patterns for a Question Answering System. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 41-47, 2002. | |
SP03 | Fei Sha and Fernando Pereira. Shallow parsing with conditional random fields. Proceedings of the HLT-NAACL Annual Meeting, pp. 134-141, 2003. | |
Norm01 | Richard Sproat, Alan Black, Stanley Chen, Shankar Kumar, Mari Ostendorf, and Christopher Richards. Normalization of non-standard words. Computer Speech and Language, 15(3):287-333, 2001. | |
SMtut06 | Charles Sutton and Andrew McCallum. An introduction to Conditional Random Fields for relational learning. Book chapter in Introduction to Statistical Relational Learning. Edited by Lise Getoor and Ben Taskar. MIT Press, 2006. |