CS173

Course Description
Introduction to computational biology through an informatic exploration of the human genome. Topics include: genome sequencing; functional landscape of the human genome (genes, gene regulation, repeats, RNA genes, epigenetics); genome evolution (comparative genomics, ultraconservation, co-option). Additional topics may include population genetics, personalized genomics, and ancient DNA. Course includes lectures on molecular biology, the UCSC Genome Browser, and text processing languages. Guest lectures on current genomic research topics.

Prerequisites
CS107 or equivalent background in programming (familiar with the command line; proficient in one programming language).

Introductory lectures on molecular biology, text processing in UNIX, and the UCSC Genome Browser will be given early in the quarter (see schedule below).

Class Schedule
Mon, Wed 11:00am-12:15pm in Beckman B302. Once inside Beckman, take the elevator to the third floor and make a left at the lobby; B302 is the second door on the right.

Bibliography
The course is mostly based on current or very recent literature. As such, it does not follow any textbook. Please use the papers mentioned at each lecture as pointers into the relevant literature (for more material, you can look at the papers' references, or at more recent publications that cite those papers). The easiest way to find a paper is to search for its title and/or authors on Google Scholar or vanilla Google. You are also encouraged to consult online resources such as Wikipedia.

As a Stanford student, you also have free access to many biomedical journals. To reach them from off campus, append ".laneproxy.stanford.edu" to the host name in the URL and enter your Stanford credentials when prompted (for example, http://www.somejournal.com/other/stuff becomes http://www.somejournal.com.laneproxy.stanford.edu/other/stuff). There is also a bookmarklet that does this for you at the push of a button.
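
The URL rewrite described above can be sketched in a few lines of Python. This is only an illustration of the rule, not an official tool: the function name to_proxy_url is made up for this example, and keeping an explicit port after the proxy suffix is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit

# Suffix quoted in the text above.
PROXY_SUFFIX = ".laneproxy.stanford.edu"


def to_proxy_url(url: str) -> str:
    """Append the proxy suffix to the host name of a journal URL.

    Hypothetical helper: URLs without a host, or already proxied,
    are returned unchanged.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host or host.endswith(PROXY_SUFFIX):
        return url
    netloc = host + PROXY_SUFFIX
    if parts.port is not None:
        # Assumption: an explicit port stays after the rewritten host.
        netloc += f":{parts.port}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


print(to_proxy_url("http://www.somejournal.com/other/stuff"))
# http://www.somejournal.com.laneproxy.stanford.edu/other/stuff
```

Only the host name changes; the path, query string, and fragment are passed through untouched.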

The following book can be used as a general reference to the biological topics discussed in class: Genomes, 2nd edition. You can also read the NCBI Primer to Genomics. The course may also use material from Genomes, Browsers and Databases: Data-Mining Tools for Integrated Genomic Databases.

Instructors
Gill Bejerano
Office: Beckman Center B321
Office hours: Email for appointment
Phone: (650) 723-7666

Teaching Assistants
Jim Notwell
Office: B383 (Walking out of the elevators, B383 is on the right side of the lounge.)
Office hours: Tuesday 3:00 - 5:00 PM

Harendra Guturu
Office: B383 (Walking out of the elevators, B383 is on the right side of the lounge.)
Office hours: Thursday 10:00 AM - 12:00 PM

Communication
This year all course communication will be handled via Piazza. Find our class page at Piazza and enroll to receive announcements and access to other private course resources.

Course Requirements
There are three course requirements:
  1. Homeworks. Throughout the class there will be two homework assignments, due at the beginning of class on their due dates. Three late days are awarded for the quarter. Once these late days are used up, homework turned in late will be penalized 20% per late day. The number of late days used is rounded up to the nearest day, so assignments turned in one hour late use one full late day. Late days cannot be applied to the project milestone or final project presentation.

    A link to frequently asked questions about each homework will be created on the schedule and updated as questions come in, so refresh and check the FAQ to see if your question has been addressed already.

    Because we reuse some problem set questions from previous years' homeworks, looking at previous years' solution sets is not permitted and is an honor code violation.

    Students may discuss homework problems in groups. However, each student must write down the solutions independently, and without referring to written notes from the joint session. In other words, each student must understand the solution well enough to reconstruct it by him/herself. In addition, each student should list on the problem set the people with whom s/he collaborated.

  2. Project. Students will form groups of several people, and each group will be assigned an individual project. Instead of a final exam, at the end of the class there will be a poster session where the groups will present their work.

  3. Attendance. For this class, attendance is mandatory. You may miss up to 2 lectures without affecting your grade, with consideration given if you are not feeling well.
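
The late-day arithmetic from item 1 can be illustrated with a short Python sketch. The numbers (three free late days, 20% per late day, rounding up to a full day) come from the policy above; everything else is an assumption made for the example: that free late days are applied before any penalty, that the 20% is taken off the earned score for each penalized day, that the score cannot drop below zero, and the function names themselves.

```python
import math

FREE_LATE_DAYS = 3            # awarded for the whole quarter
PENALTY_PERCENT_PER_DAY = 20  # applied once free late days run out


def late_days_used(hours_late: float) -> int:
    """Round lateness up to whole days: one hour late counts as one day."""
    if hours_late <= 0:
        return 0
    return math.ceil(hours_late / 24)


def apply_late_policy(raw_score: float, hours_late: float, free_days_left: int):
    """Return (adjusted score, free late days remaining).

    Assumption: free late days are spent first; each remaining late day
    removes 20% of the raw score, and the result is floored at zero.
    """
    used = late_days_used(hours_late)
    covered = min(used, free_days_left)
    penalized = used - covered
    percent_kept = max(0, 100 - PENALTY_PERCENT_PER_DAY * penalized)
    return raw_score * percent_kept / 100, free_days_left - covered


# One hour late with all free days available: no penalty, one day spent.
print(apply_late_policy(100, 1, FREE_LATE_DAYS))  # (100.0, 2)
# 49 hours late (3 days) with one free day left: two penalized days.
print(apply_late_policy(100, 49, 1))              # (60.0, 0)
```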

Grades will be determined by roughly the following breakdown: 20% HW1, 25% HW2, 5% Attendance, 50% Final Project.
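
As a rough illustration of this breakdown, the weighted total can be computed as below. The weights are the ones listed above; the assumption that each component is scored from 0 to 100, the function name final_grade, and the sample scores are all made up for the example, and the actual grading is only "roughly" this formula.

```python
# Weights in percent, taken from the breakdown above.
WEIGHTS = {"hw1": 20, "hw2": 25, "attendance": 5, "project": 50}


def final_grade(scores: dict) -> float:
    """Weighted total on a 0-100 scale.

    Assumption: every component score is itself on a 0-100 scale.
    """
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must have exactly the keys in WEIGHTS")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example = {"hw1": 90, "hw2": 80, "attendance": 100, "project": 85}
print(final_grade(example))  # 85.5
```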

Course Tools
The base course directory is located at /afs/ir.stanford.edu/class/cs273a, and is reachable from the cardinal and elaine machines. Source tree executables are available within the bin directory, and are machine-dependent. If you add "/afs/ir.stanford.edu/class/cs273a/bin/@sys" to your PATH variable, the correct version of the executable will be executed.

Previous CS273A Materials
This is the first year that CS173 has been taught, but it is similar in spirit to CS273A.

Auditing
Homeworks and lecture materials are accessible only to registered students, who log in with their SUNet ID and password. If you are auditing the class and want access to these materials, please send the staff a message on Piazza with your SUNet ID.

Schedule
As the quarter progresses, the following schedule will be updated accordingly. Please check back often for the latest material.

#   Date  Title                                                          HW
1   1/7   Introduction/Overview
2   1/9   Introductory Biology Tutorial
3   1/14  Protein Coding Genes
4   1/16  UCSC Genome Browser Tutorial                                   HW1; HW1 Solutions
5   1/23  Introduction to Text Processing Tutorial
6   1/28  Non-protein Coding Genes
7   1/30  Transcriptional Activation I
8   2/4   Transcriptional Regulation II                                  HW2; HW2 Solutions; Halfway feedback
9   2/6   Transcriptional Regulation III
10  2/11  Genome Evolution I: Repeats
11  2/13  Genome Evolution II
12  2/20  Chains & Nets, Conservation & Function                         Project out; HW2 due
13  2/25  Sequencing, Human Variation, and Disease
14  2/27  Personal Genomics, GSEA/GREAT
15  3/4   Transcription factor binding sites - Functions and Complexes  Project milestone due
16  3/6   Population Genetics & Evo-Devo
17  3/11  Ancestral genome-phenotype mapping
18  3/13  Project Presentations                                          Project presentations (2.5 hours, lunch served)