Reading list
Required reading list
Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger,
M. Frans Kaashoek, Frank Dabek, Hari Balakrishnan. Chord: A Scalable
Peer-to-peer Lookup Protocol for Internet Applications. IEEE/ACM
Transactions on Networking (TON), 2003.
PDF
Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified Data
Processing on Large Clusters. 6th Symposium on Operating Systems
Design and Implementation (OSDI), 2004.
PDF
Christopher Olston, Benjamin Reed, Utkarsh Srivastava, Ravi
Kumar, Andrew Tomkins. Pig Latin: A Not-So-Foreign Language for Data
Processing. SIGMOD, 2008.
PDF
Leslie Lamport. Time, Clocks, and the Ordering of Events in a
Distributed System. Communications of the ACM, Volume 21, Issue 7
(July 1978), pages 558-565.
PDF
Optional reading list
Distributed database design
Vertical fragmentation
S. Navathe, S. Ceri, G. Wiederhold, J. Dou. Vertical Partitioning
Algorithms for Database Design. ACM Transactions on Database Systems
(TODS), volume 9, issue 4, 1984.
PDF
B. F. Cooper, R. Ramakrishnan, U. Srivastava, A. Silberstein,
P. Bohannon, H. Jacobsen, N. Puz, D. Weaver, R. Yerneni. PNUTS:
Yahoo!’s Hosted Data Serving Platform. PVLDB, 2008.
PDF
Query processing and optimization in distributed databases
Privacy-preserving join
R. Agrawal, A. Evfimievski, R. Srikant. Information Sharing
Across Private Databases. ACM SIGMOD International Conference on
Management of Data, San Diego, California, 2003.
PDF
Data replication
P2P
Distributed information retrieval
Crawling
P. Boldi, B. Codenotti, M. Santini, S. Vigna. UbiCrawler: A
Scalable Fully Distributed Web Crawler. Software: Practice &
Experience 34(8): 711-726, 2004.
PDF
A. Arasu, J. Cho, H. Garcia-Molina, A. Paepcke, S. Raghavan. Searching the Web. ACM Transactions on Internet Technology, Vol. 1, No. 1, August 2001, pages 2–43.
PDF
Caching
R. Baeza-Yates, A. Gionis, F. Junqueira, V. Murdock,
V. Plachouras, F. Silvestri. The Impact of Caching on Search
Engines. ACM SIGIR International Conference on Information Retrieval,
Amsterdam, The Netherlands, 2007.
PDF
Open Source Systems
S4
Leonardo Neumeyer, Bruce Robbins, Anish Nair, Anand Kesari.
S4: Distributed Stream Computing Platform.
2010 IEEE International Conference on Data Mining Workshops.
PDF
Hyracks
Vinayak Borkar, Michael Carey, Raman Grover, Nicola Onose, Rares Vernica.
Hyracks: A Flexible and Extensible Foundation for Data-Intensive Computing.
ICDE, 2011.
PDF
BigTable
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber.
Bigtable: A Distributed Storage System for Structured Data.
OSDI, 2006.
PDF
Pregel
Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski.
Pregel: A System for Large-Scale Graph Processing.
SIGMOD, 2010.
PDF