COURSE INFORMATION

| | |
|---|---|
| Instructor | Dan Jurafsky, jurafsky@stanford.edu. Office: Margaret Jacks Hall (Bldg 460), Room 117. Office Hours: M 12:00-12:30, Tu 4:30-5:30 |
| TAs | |
| Time | Tues/Thur, 9:30-10:45am |
| Staff Email | cs124-win0809-staff@lists.stanford.edu for any questions about the homework (or anything else) |
| Location | 200-030 |
| Textbooks | |
| Description | Automated processing of less structured information: human language text and speech, web pages, social networks, and genome sequences, with the goal of automatically extracting meaning and structure. Methods include: string algorithms, automata and transducers, hidden Markov models, graph algorithms, XML processing. Applications include: information retrieval, text classification, social network models, machine translation, genomic sequence alignment, word meaning extraction, and speech recognition. Prerequisites: CS 106B, CS 103 or 103B, and CS 107 (or familiarity with Linux shell scripts). |
| Required Work | |

SCHEDULE

| Wk | Date | HW | Lec | Topic and Readings |
|---|---|---|---|---|
| 1 | Jan 6 | | Lec 1 (ppt) Lec 1 (pdf) | Strings, Formal languages, and Automata |
| 1 | Jan 8 | | Lec 2 (ppt) Lec 2 (pdf) | Edit Distance (in Text and Genes) and start of Tokenization |
| 2 | Jan 13 | HW 1: Harvesting email addresses and phone numbers | Lec 3 (ppt) Lec 3 (pdf) | Language Modeling (and Probability Theory Background) |
| 2 | Jan 15 | | Lec 4 (ppt) Lec 4 (pdf) | Naive Bayes and Text Classification |
| 3 | Jan 20 | HW 2: Language Identification | Lec 5 (ppt) Lec 5 (pdf) | Text Classification for Sentiment Analysis |
| 3 | Jan 22 | | Lec 6 (ppt) Lec 6 (pdf) | Hidden Markov Models |
| 4 | Jan 27 | HW 3: Sentiment analysis of movie reviews | Lec 7 (ppt) Lec 7 (pdf) | Named Entity Tagging |
| 4 | Jan 29 | | Lec 8 (ppt) Lec 8 (pdf) | Information Retrieval (I) |
| 5 | Feb 3 | | Lec 9 (ppt) Lec 9 (pdf) | Information Retrieval (II) |
| 5 | Feb 5 | HW 4: Person name extraction | Lec 10 (ppt) Lec 10 (pdf) | Information Retrieval (III) |
| 6 | Feb 10 | | Lec 11 (ppt) Lec 11 (pdf) | XML: accessing structured information (I) |
| 6 | Feb 12 | | Lec 12 (ppt) Lec 12 (pdf) | XML: accessing structured information (II) |
| 7 | Feb 17 | HW 5: Exercises and Search Engine analysis | Lec 13 (ppt) Lec 13 (pdf) | Computational Lexical Semantics |
| 7 | Feb 19 | | Lec 14 (ppt) Lec 14 (pdf) | Relation and Information Extraction |
| 8 | Feb 24 | | Lec 15 (ppt) Lec 15 (pdf) | Machine Translation |
| 8 | Feb 26 | HW 6: Relation Extraction | Lec 16 (ppt) Lec 16 (pdf) | Machine Translation |
| 9 | Mar 3 | | Lec 17 (ppt) Lec 17 (pdf) | Web graphs, Links, and PageRank |
| 9 | Mar 5 | HW 7: Machine Translation | Lec 18 (ppt) Lec 18 (pdf) | Understanding Social and Technological Networks: Small Worlds, Fat Tails, and Whatnot |
| 10 | Mar 10 | | | No Class Today |
| 10 | Mar 12 | | Lec 19 (ppt) Lec 19 (pdf) | Speech Recognition |