CS276B / SYMBSYS 239J / LING 239J
Text Information Retrieval, Mining, and Exploitation
Christopher Manning, Prabhakar Raghavan, and Hinrich Schütze
Lecture: 3 units, TuTh 4:15-5:30 Gates B08
[NB: Different room this quarter!]
TA: Teg Grenager
Staff e-mail: firstname.lastname@example.org
- 3/14/2003 The final exam will be held on Friday, March 21, from 12:15-3:15pm in Gates B08. An alternate final will be held on Tuesday, March 18, from 8:30-11:30 in Gates 459. I have posted practice questions on the syllabus.
- 3/10/2003 Project part two is due in class on Thursday, March 13.
- 3/10/2003 An answer key to the midterm has been posted, to aid in studying for the final exam.
- 2/11/2003 Project part 1B is due tonight at 11:59 pm! The grading policy is posted on the grading page.
- 1/31/2003 The midterm will be held on Thursday, February 6 during class. Information on the midterm is available on the grading page.
- 1/26/2003 Project part 1A is due on Monday at 11:59 pm! We have posted the grading policy on the grading page.
- 1/20/2003 The project page contains important new information, including a new schema.
- 1/14/2003 The TA, Teg, has posted office hours below.
- 1/14/2003 We have posted the project and project tools tutorial handouts at the bottom of this page. Please read them carefully before you begin your projects.
- 1/14/2003 Note that the course
staff email list as originally listed on this page was
incorrect. To reach the course staff please send mail to
- 1/07/2003 The syllabus is online.
Document clustering, classification, routing, and recommendation
systems. Machine learning methods. Information extraction methods:
terminologies and ontology acquisition, named entity recognition,
coreference resolution, web wrappers and web agents. Natural language
processing techniques: summarization, cross-lingual retrieval, event
tracking, question answering and text mining.
Biomedical text: special constraints, knowledge discovery,
improved performance from integrating textual information.
Prerequisites: either CS276A or a reasonable
background in text and statistical
machine learning techniques, such as from CS224N, CS229, or
Stat315. (You're not required to have done CS276A to do this
course, and the focus is rather different. On the other
hand, we will only very briefly review material covered
there, and so unless you already know appropriate topics
from CS276A, you will need to do additional outside
reading.) The project will require extensive programming in Java,
so previous object-oriented programming experience will be very helpful.
There is no required or recommended text. We will distribute readings
for each topic. Books which contain considerable material
of relevance to the course that you may wish to look at include:
- Soumen Chakrabarti. 2003. Mining the Web: Discovering Knowledge from
Hypertext Data. Amsterdam: Morgan Kaufmann.
- Christopher Manning and Hinrich Schütze. 1999. Foundations of
Statistical Natural Language Processing. Cambridge,
MA: MIT Press.
- Tom Mitchell. 1997. Machine Learning. McGraw Hill.
- Ian Witten and Eibe Frank. 2000. Data Mining: Practical Machine
Learning Tools and Techniques with Java
Implementations. San Francisco, CA: Morgan Kaufmann.
40% (divided as follows)
Staff Contact Information:
We request that you send questions to the
staff mailing list
when appropriate. There is also a course newsgroup at
su.class.cs276b where students can
help one another.
Office: Gates Bldg., Rm 418
Office Hours: F 10:00-12:00
Office Hours: By Appt.
Office Hours: By Appt.
Office: Gates Bldg., Rm 454
Office Hours: Mon 2:00-3:00, Thurs 10:00-11:00
Last modified: Fri Mar 14 15:18:36 PST 2003