CS 101

Servers and Backend

Announcements

  • WiCS Dinner Tonight at Gates 403

Plan for Today

  • Recall: requests go to a server, which returns a response
  • Today: how do servers figure out what information to return?
    • Google
    • Facebook
    • Displaying Ads

Information Storage: Databases

  • One strength of computers is their ability to store and process information
  • Information is stored in databases
  • Like a giant Excel sheet: each row is a record, each column is a field, and there can be far more rows than a spreadsheet could hold
  • Usually can't "see" all the data at once: select certain columns at a time, or keep only the rows with certain features
    • Example: I want to send an email to all users in North America who last logged in between 5 and 7 days ago and who have an outstanding friend request
  • Basic Idea: companies store a lot of information, then responses involve searching the saved information based on the request
  • Come back Thursday!
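
A minimal sketch of the email example above, using Python's built-in sqlite3 module. The table name, column names, and sample rows are all hypothetical, invented for illustration; a real company's schema would look different.

```python
import sqlite3

# Hypothetical schema and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, email TEXT, region TEXT, "
    "days_since_login INTEGER, pending_friend_requests INTEGER)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?, ?)",
    [
        (1, "ana@example.com", "North America", 6, 1),
        (2, "bo@example.com", "Europe", 6, 2),
        (3, "cy@example.com", "North America", 2, 1),
        (4, "di@example.com", "North America", 5, 0),
        (5, "ed@example.com", "North America", 7, 3),
    ],
)

# Select one column (email) and keep only rows matching all three conditions.
rows = conn.execute(
    "SELECT email FROM users "
    "WHERE region = 'North America' "
    "AND days_since_login BETWEEN 5 AND 7 "
    "AND pending_friend_requests > 0 "
    "ORDER BY id"
).fetchall()
emails = [email for (email,) in rows]
print(emails)  # ['ana@example.com', 'ed@example.com']
```

The query never loads the whole table into view: it names the column it wants and the conditions rows must meet, and the database does the searching.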

Power of Data

  • Talk with a neighbor: what sorts of data do you think your favorite websites store?

Power of Data

  • A good rule of thumb: every click, view, and even mouse hover is recorded and stored
  • Tracking this kind of data can be very powerful
    • Are people getting frustrated because a video takes too long to load?
    • Which articles are the most popular?
    • Which genres of movies/TV are most popular? (Netflix is phenomenal at this)
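
As a rough sketch of how recorded events can answer "which articles are the most popular?", the snippet below counts views in a small event log. The event fields and values are hypothetical; real logs are far larger and stored in a database.

```python
from collections import Counter

# Hypothetical event log: one entry per recorded user action.
EVENTS = [
    {"user": 1, "action": "view", "article": "election"},
    {"user": 2, "action": "view", "article": "election"},
    {"user": 2, "action": "hover", "article": "sports"},
    {"user": 3, "action": "view", "article": "sports"},
    {"user": 3, "action": "view", "article": "election"},
    {"user": 1, "action": "click", "article": "weather"},
]

# Count only the "view" events, grouped by article.
views = Counter(e["article"] for e in EVENTS if e["action"] == "view")
most_popular, view_count = views.most_common(1)[0]
print(most_popular, view_count)  # election 3
```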

Google: Getting Information

  • Indexes the internet
  • "Spiders" "crawl" the internet (Google's crawler is called "Googlebot")
  • Start on a page, index that page, follow all outbound links
  • Store all the information in a database
    • Contains info about the words pages contain
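
The crawl-and-index steps above can be sketched on a toy "web" held in memory. The page names, text, and links below are hypothetical, and this is only an illustration of the idea, not how Google's crawler is actually built.

```python
from collections import deque

# Hypothetical miniature web: page name -> (page text, outbound links).
PAGES = {
    "a.example": ("cats and dogs", ["b.example", "c.example"]),
    "b.example": ("dogs chase cats", ["a.example"]),
    "c.example": ("birds sing", ["d.example"]),
    "d.example": ("cats sleep", []),
}

def crawl(start):
    """Visit every page reachable from start; return {word: set of pages}."""
    index = {}
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        text, links = PAGES[url]
        # Index this page: record which words appear on it.
        for word in text.split():
            index.setdefault(word, set()).add(url)
        # Follow all outbound links that have not been visited yet.
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index

index = crawl("a.example")
print(sorted(index["cats"]))  # ['a.example', 'b.example', 'd.example']
```

Storing words as keys means a search for "cats" is a single lookup rather than a scan of every page.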

Google: Evaluating Relevance

[Figure. Source: Wikipedia]
  • Request includes search terms
  • Need to derive meaning from order of terms (Natural Language Processing)
  • Search all the indexed websites
  • Look for terms and their synonyms; terms in the title are better
  • PageRank: a measure of how "important" a website is
    • Sort of like how academic papers work: being cited by lots of papers is better, and being cited by other important papers is better
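
A simplified sketch of the PageRank idea on a four-page example. The link graph is hypothetical, and the damping factor of 0.85 and the fixed number of iterations are assumptions chosen for illustration; Google's production ranking is not described in these slides.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Repeatedly pass each page's rank along its outbound links."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            # A page with no links spreads its rank over all pages.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical link graph: page -> pages it links to.
LINKS = {"A": ["C"], "B": ["C"], "C": ["D"], "D": ["C"]}

ranks = pagerank(LINKS)
ordering = sorted(ranks, key=lambda p: (-ranks[p], p))
print(ordering)  # ['C', 'D', 'A', 'B']
```

C ranks first because three pages link to it. D has only one inbound link, but that link comes from the important page C, so D still ranks above A and B.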

Google Search: Recap

Facebook: Storing Friends

[Figure. Source: Wikipedia]
  • Social Network: people are connected to each other through friendships
  • Called a graph in CS
    • nodes = people
    • edges = friendships
  • Other uses of graphs: the Internet, road networks, disease outbreaks, company hierarchies
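
One common way to store a graph in code is an adjacency list: for each node, the set of nodes it shares an edge with. The names and friendships below are hypothetical.

```python
# Hypothetical friendships (edges); each name is a node.
FRIENDSHIPS = [("Ana", "Bo"), ("Ana", "Cy"), ("Bo", "Cy"), ("Cy", "Di")]

def build_graph(edges):
    """Return {person: set of friends}; friendship goes both ways."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

graph = build_graph(FRIENDSHIPS)
print(sorted(graph["Cy"]))   # ['Ana', 'Bo', 'Di']
print("Ana" in graph["Di"])  # False
```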

Facebook as a Network

[Figure. Source: Facebook]
  • Friendship Paradox: your friends have more friends on average than you do
  • Triadic Closure: how People You May Know works
  • Degrees of Separation and (Kevin) Bacon number
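
The three ideas above can each be sketched in a few lines on a small hypothetical friendship graph. Ranking suggestions by number of mutual friends is an assumption used to illustrate triadic closure; it is not a description of Facebook's actual People You May Know system.

```python
from collections import deque

# Hypothetical friendship graph: person -> set of friends.
GRAPH = {
    "Ana": {"Bo", "Cy"},
    "Bo": {"Ana", "Cy"},
    "Cy": {"Ana", "Bo", "Di"},
    "Di": {"Cy", "Ed"},
    "Ed": {"Di"},
}

def degrees_of_separation(graph, start, goal):
    """Breadth-first search: fewest friendships linking start to goal."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if person == goal:
            return dist[person]
        for friend in graph[person]:
            if friend not in dist:
                dist[friend] = dist[person] + 1
                queue.append(friend)
    return None  # not connected

def people_you_may_know(graph, person):
    """Triadic closure: suggest non-friends, most mutual friends first."""
    mutual = {}
    for friend in graph[person]:
        for candidate in graph[friend]:
            if candidate != person and candidate not in graph[person]:
                mutual[candidate] = mutual.get(candidate, 0) + 1
    return sorted(mutual, key=lambda c: (-mutual[c], c))

def mean_friends_of_friends(graph, person):
    """Average friend count among this person's friends."""
    return sum(len(graph[f]) for f in graph[person]) / len(graph[person])

avg_friends = sum(len(f) for f in GRAPH.values()) / len(GRAPH)
avg_of_friends = sum(mean_friends_of_friends(GRAPH, p) for p in GRAPH) / len(GRAPH)

print(degrees_of_separation(GRAPH, "Ana", "Ed"))  # 3
print(people_you_may_know(GRAPH, "Di"))           # ['Ana', 'Bo']
print(avg_friends < avg_of_friends)               # True (friendship paradox)
```

A Bacon number is the same breadth-first search with actors as nodes, shared films as edges, and Kevin Bacon as the goal.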

Storing Other Information

  • Stores likes, comments, posts, live videos, messages, etc.
  • Big idea: give IDs to users and each type of interaction
  • Tables in a database for each of these, linked by IDs
  • News Feed algorithm: get the content from each of your friends, then attempt to rank it by relevance, popularity, and recency
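
A toy sketch of ID-linked tables and feed ranking. The tables, field names, and the scoring formula are hypothetical; the formula combines only popularity (likes) and recency (age in hours) and leaves out relevance, which a real feed would also need to estimate.

```python
# Hypothetical tables, linked to each other by user and post IDs.
USERS = {1: "Ana", 2: "Bo", 3: "Cy"}
FRIENDS = {1: {2, 3}, 2: {1}, 3: {1}}
POSTS = [
    {"post_id": 10, "author_id": 2, "likes": 50, "hours_old": 30},
    {"post_id": 11, "author_id": 3, "likes": 5, "hours_old": 1},
    {"post_id": 12, "author_id": 1, "likes": 99, "hours_old": 2},
    {"post_id": 13, "author_id": 3, "likes": 40, "hours_old": 3},
]

def score(post):
    # Assumed formula: more likes raise the score, age lowers it.
    return (post["likes"] + 1) / (post["hours_old"] + 2)

def news_feed(user_id):
    """Collect posts written by the user's friends, best score first."""
    candidates = [p for p in POSTS if p["author_id"] in FRIENDS[user_id]]
    ranked = sorted(candidates, key=score, reverse=True)
    # Look up each author's name through the ID that links the tables.
    return [(p["post_id"], USERS[p["author_id"]]) for p in ranked]

feed = news_feed(1)
print(feed)  # [(13, 'Cy'), (11, 'Cy'), (10, 'Bo')]
```

Post 10 has the most likes among friends' posts but lands last because it is over a day old.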

Internet Advertisements

  • Some target certain types of individuals: Facebook
  • Some target certain search terms: Google
  • Some track your browsing history across websites, using third-party cookies that are shared between sites
  • In general, ad campaigns have a budget and a bid per view or bid per click; ordering on a page and which ads are visible are based on bids
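
A simplified sketch of bid-based ad ordering. The campaigns, bids, and budgets are hypothetical, and ranking purely by bid per click is an assumption for illustration; real ad systems may weigh additional factors not covered here.

```python
# Hypothetical ad campaigns competing for slots on one page.
ADS = [
    {"name": "shoes", "bid_per_click": 0.50, "budget_left": 10.00},
    {"name": "phones", "bid_per_click": 1.20, "budget_left": 0.60},
    {"name": "travel", "bid_per_click": 0.90, "budget_left": 25.00},
    {"name": "books", "bid_per_click": 0.30, "budget_left": 5.00},
]

def choose_ads(ads, slots):
    """Show the highest bidders that can still afford one more click."""
    eligible = [ad for ad in ads if ad["budget_left"] >= ad["bid_per_click"]]
    ranked = sorted(eligible, key=lambda ad: ad["bid_per_click"], reverse=True)
    return [ad["name"] for ad in ranked[:slots]]

shown = choose_ads(ADS, slots=2)
print(shown)  # ['travel', 'shoes']
```

The "phones" campaign bids the most but is skipped because its remaining budget cannot cover another click.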

Recap

  • Companies decide how to store information
  • How companies prioritize information and interpret requests has a huge impact on society

Exam

  • Great job on the exam!
  • I will curve the class similarly to past courses (as viewable on Carta)