Current Projects
In my research projects we don't just build prototype systems and write papers; we build production-quality software systems that are used by other people for their daily work. In most cases we release the software in open-source form. This approach is unusual in academia, but it allows us to do much better research: designing for production use forces us to think about important issues that could be ignored otherwise, and measurements of usage allow us to evaluate our ideas more thoroughly. Furthermore, this approach allows students to develop a higher level of system-building maturity than would be possible otherwise.

RAMCloud (2009- )
The RAMCloud project is creating a new class of storage, based entirely in DRAM, that is 2-3 orders of magnitude faster than existing storage systems. If successful, it will enable new applications that manipulate large-scale datasets much more intensively than has ever been possible before. In addition, we think RAMCloud, or something like it, will become the primary storage system for cloud computing environments such as Amazon's AWS and Microsoft's Azure.

The role of DRAM in storage systems has been increasing rapidly in recent years, driven by the needs of large-scale Web applications. These applications manipulate very large datasets with an intensity that cannot be satisfied by disks alone. As a result, applications are keeping more and more of their data in DRAM. For example, large-scale caching systems such as memcached are being widely used (in 2009 Facebook used a total of 150 TB of DRAM in memcached and other caches for a database containing 200 TB of disk storage), and the major Web search engines now keep their search indexes entirely in DRAM.

Although DRAM's role is increasing, it still tends to be used in limited or specialized ways. In most cases DRAM is just a cache for some other storage system such as a database; in other cases (such as search indexes) DRAM is managed in an application-specific fashion. It is difficult for developers to use DRAM effectively in their applications; for example, the application must manage consistency between caches and the backing storage. In addition, cache misses and backing store overheads make it difficult to capture DRAM's full performance potential.

Our goal for RAMCloud is to create a general-purpose storage system that makes it easy for developers to harness the full performance potential of large-scale DRAM storage. It keeps all data in DRAM all the time, so there are no cache misses. RAMCloud storage is durable and available, so developers need not manage a separate backing store. RAMCloud is designed to scale to thousands of servers and hundreds of terabytes of data while providing uniform low-latency access to all machines within a large datacenter.

As of Fall 2011 we have initial implementations of many of the components of RAMCloud and the system runs well enough to use it for simple tests. On our 60-node test cluster we are able to perform remote reads of 100-byte objects in about 5 microseconds, and an individual server can process more than 800,000 small read requests per second. The basic crash recovery mechanism is running, and RAMCloud can recover 35 GB of memory from a failed server in about 1.6 seconds.

The RAMCloud project is still young, so there are many interesting research issues still to explore, such as the following:
Previous Projects
The following sections describe some projects on which I have worked in the past. I am no longer involved with these projects, though several of them are still active as open-source projects or commercial products. The years listed for each project represent the period when I was involved.

Fiz (2008-2011)
The Fiz project explored new frameworks for highly interactive Web applications. The goal of the project was to develop higher-level components that encourage reuse and simplify application development by hiding inside the components many of the complexities that bedevil Web developers (such as security issues or using Ajax to enhance interactivity).

ElectricCommander (2005-2007, Electric Cloud)
ElectricCommander is the second major product for Electric Cloud. It addresses the problem of managing software development processes such as nightly builds and automated test suites. Most organizations have home-grown software for these tasks, which is hard to manage and scales poorly as the organization grows. ElectricCommander provides a general-purpose Web-based platform for managing these processes. Developers use a Web interface to describe each job as a collection of shell commands that run serially or in parallel on one or more machines. The ElectricCommander server manages the execution of these commands according to schedules defined by the developers. It also provides a variety of reporting tools and manages a collection of server machines to allow concurrent execution of multiple jobs.

ElectricAccelerator (2002-2005, Electric Cloud)
ElectricAccelerator is Electric Cloud's first product; it accelerates software builds based on the

Tcl/Tk (1988-2000, U.C. Berkeley, Sun, Scriptics)
Tcl is an embeddable scripting language: it is a simple interpreted language implemented as a library package that can be incorporated into a variety of applications. Furthermore, Tcl is extensible, so additional functions (such as those provided by the enclosing application) can easily be added to those already provided by the Tcl interpreter. Tk is a GUI toolkit that is implemented as a Tcl extension. I initially built Tcl and Tk as hobby projects and didn't think that anyone besides me would care about them. However, they were widely adopted because they solved two problems. First, the combination of Tcl and Tk made it much easier to create graphical user interfaces than previous frameworks such as Motif (and Tcl and Tk were eventually ported from Unix to the Macintosh and Windows, making Tcl/Tk applications highly portable). Second, Tcl made it easy to include powerful command languages in a variety of applications ranging from design tools to embedded systems.

Log-Structured File Systems (1988-1994, U.C. Berkeley)
A log-structured file system (LFS) is one where all information is written to disk sequentially in a log-like structure, thereby speeding up both file writing and crash recovery. The log is the only structure on disk; it contains index information so files can be read back from the log efficiently. LFS was initially motivated by the RAID project, because random writes are expensive in RAID disk arrays. However, the LFS approach has also found use in other settings, such as flash memory where wear-leveling is an important problem. Mendel Rosenblum created the first LFS implementation as part of the Sprite project; Ken Shirriff added RAID support to LFS in the Sawmill project, and John Hartman extended the LFS ideas into the world of cluster file systems with Zebra.
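
The core idea of a log-structured layout can be illustrated with a toy, in-memory sketch. This is not the Sprite LFS design: the class name `TinyLog`, the record layout, and the full-log scan in `recover` are all assumptions made for illustration. A real LFS stores index structures in the log itself so that recovery does not need to read everything. The sketch only shows that every write is a sequential append, that an index maps each file to its most recent data, and that the index can be rebuilt from the log alone.

```python
import struct

# Record header: 2-byte name length, 4-byte data length.
# This layout is an assumption made for the sketch.
HEADER = struct.Struct(">HI")


class TinyLog:
    """Toy log-structured store: the log is the only persistent structure."""

    def __init__(self, log=None):
        self.log = bytearray(log or b"")  # stands in for the disk
        self.index = {}                   # file name -> (data offset, length)
        self.recover()

    def write(self, name, data):
        # Every write, including an overwrite, appends to the end of the log.
        name_bytes = name.encode("utf-8")
        record_start = len(self.log)
        self.log += HEADER.pack(len(name_bytes), len(data)) + name_bytes + data
        data_offset = record_start + HEADER.size + len(name_bytes)
        self.index[name] = (data_offset, len(data))

    def read(self, name):
        # The index points straight at the newest copy of the file's data.
        offset, length = self.index[name]
        return bytes(self.log[offset:offset + length])

    def recover(self):
        # Rebuild the index by scanning the log; later records win.
        self.index.clear()
        pos = 0
        while pos < len(self.log):
            name_len, data_len = HEADER.unpack_from(self.log, pos)
            name_start = pos + HEADER.size
            data_offset = name_start + name_len
            name = self.log[name_start:data_offset].decode("utf-8")
            self.index[name] = (data_offset, data_len)
            pos = data_offset + data_len


fs = TinyLog()
fs.write("a.txt", b"hello")
fs.write("a.txt", b"hello again")   # the old copy stays in the log as garbage
restored = TinyLog(bytes(fs.log))   # simulate a crash: only the log survives
print(restored.read("a.txt"))
```

Because overwritten data remains in the log, a practical system also needs some way to reclaim dead space; the sketch omits that entirely.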
Sprite (1984-1994, U.C. Berkeley)
Sprite was a Unix-like network operating system built at U.C. Berkeley
and used for day-to-day computing by about 100 students and staff for
more than five years. The overall goal of the project was to create a
single system image, meaning that a collection of workstations
would appear to users the same as a single time-shared system, except
with much better performance. Among its more notable features were
support for diskless workstations, large client-level file caches
with guaranteed consistency, and a transparent process migration
mechanism. We built a parallel version of

VLSI Design Tools (1980-1986, U.C. Berkeley)
When I arrived at Berkeley in 1980 the Mead-Conway VLSI revolution was in full bloom, enabling university researchers to create large-scale integrated circuits such as the Berkeley RISC chips. However, there were few tools for chip designers to use. In this project my students and I created a series of design tools, starting with a simple layout editor called Caesar. We quickly replaced Caesar with a more powerful layout editor called Magic. Magic made two contributions. First, it implemented a number of novel interactive features for developers, such as incremental design-rule checking, which notified developers of design rule violations immediately while they edited, rather than waiting for a slow offline batch run. Second, Magic incorporated algorithms for operating on hierarchical layout structures, which were much more efficient than previous approaches that "flattened" the layout into a non-hierarchical form. Magic took advantage of a new data structure called corner stitching, which permitted efficient implementation of a variety of geometric operations. We also created a switch-level timing analysis program called Crystal. We made free source-level releases of these tools and many others in what became known as the Berkeley VLSI Tools Distributions; these represented one of the earliest open-source software releases. Magic continued to evolve after the end of the Berkeley project; as of 2005 it was still widely used.

Cm* and Medusa (1975-1980, Carnegie Mellon University)
As a graduate student at Carnegie Mellon University I worked on the Cm* project, which created the first large-scale NUMA multiprocessor. Cm* consisted of 50 LSI-11 processors connected by a network of special-purpose processors called Kmaps that provided shared virtual memory. I worked initially on the design and implementation of the Kmap hardware, then shifted to the software side and led the Medusa project, which was one of two operating systems created for Cm*. The goal of the Medusa project was to understand how to structure an operating system to run on the NUMA hardware; among other things, Medusa was the first system to implement coscheduling (now called "gang scheduling").