Machine Learning in Natural Language Processing

Fernando Pereira

Computer and Information Science
University of Pennsylvania

Prerequisites: elementary discrete probability (events, probability, conditional probability, Bayes' rule, entropy, mutual information) and general mathematical fluency at the undergraduate level (linear algebra and a tad of calculus).

Summary: Machine learning has been taking over in natural language processing. Any interesting language processor has to choose among competing alternatives: document classes, search query answers, word senses, parse trees, translations. Many factors conspire to decide the choice. People are notoriously incompetent at coming up with explicit rules that formalize those decisions accurately, especially when they involve relatively rare items or configurations. On the other hand, machines (and, presumably, human brains) are able to learn effective rules that combine those many factors to achieve accurate decisions. I will survey the main current applications and techniques of machine learning in natural language processing, with a strong bias towards the research problems that I care about and have worked on. Topics may include (depending on class interest):

Introduction:
Techniques:
Course Notes:
Course Slides