About Me
I am currently the Co-Founder
of Apture, an online media startup
trying to radically improve communication on the web.
I was previously a Ph.D. student in the Computer
Science Department at Stanford
University, where I am on a Leave of Absence. My PhD research focuses
mainly on developing tools and
techniques for finding bugs in large, complex software systems. My
advisor is Dawson Engler.
I graduated with a joint B.S. with Honors and M.S. in Computer Science from Stanford University in 2006.
I maintain an active blog on International Development, Economics, Foreign Policy, and
Finance.
Papers
EXPLODE: a Lightweight, General System for Finding Serious Storage System Errors
(
ps)
(
pdf)
Junfeng Yang, Can Sar, and Dawson Engler
in OSDI 2006
Storage systems such as filesystems, databases, and RAID systems have a
simple, basic contract: you give them data, they do not lose or corrupt it.
Often they store the only copy, making its irrevocable loss almost arbitrarily
bad. Unfortunately, their code is exceptionally hard to get right, since it must
correctly recover from any crash at any program point, no matter how
their state was smeared across volatile and persistent memory.
This paper describes EXPLODE, a system that makes it
easy to systematically check real storage systems for errors.
It takes user-written, potentially system-specific checkers and
uses them to drive a storage system into tricky corner cases,
including crash recovery errors. EXPLODE uses a novel adaptation
of ideas from model checking, a comprehensive, heavyweight
formal verification technique, that makes its checking
more systematic (and hopefully more effective) than a pure testing
approach while being just as lightweight. EXPLODE is effective.
It found serious bugs in a broad range of real storage systems
(without requiring source code): three version
control systems, BerkeleyDB, an NFS implementation,
ten filesystems, a RAID system, and the popular VMwareGSX
virtual machine. We found bugs in every system we checked,
36 bugs in total, typically with little effort.
Automatically Generating Malicious Disks using Symbolic Execution
(
ps)
(
pdf)
(
bib)
Junfeng Yang, Can Sar, Paul Twohey,
Cristian Cadar, and Dawson Engler
in IEEE Security and Privacy 2006
Many current systems allow data produced by potentially malicious
sources to be mounted as a file system. File system code must check
this data for dangerous values or invariant violations before using
it. Because file system code typically runs inside the operating
system kernel, even a single unchecked value can crash the machine or
lead to an exploit. Unfortunately, validating file system images is
complex: they form DAGs with
complex dependency relationships across massive amounts of data bound
together with intricate, undocumented assumptions. This paper shows
how to automatically find bugs in such code using symbolic execution.
Rather than running the code on manually-constructed concrete input,
we instead run it on symbolic input that is initially allowed to be
``anything.'' As the code runs, it observes (tests) this input and
thus constrains its possible values. We generate test cases by
solving these constraints for concrete values. The approach works
well in practice: we checked the disk mounting code of three
widely-used Linux file systems: ext2, ext3, and JFS and found bugs in
all of them where malicious data could either cause a kernel panic or
form the basis of a buffer overflow attack.
EXPLODE: A Lightweight, General Approach to Finding
Serious Errors in Storage Systems
(
ps)
(
pdf)
(
bib)
Junfeng Yang, Paul Twohey, Ben Pfaff, Can Sar,
Dawson Engler
in BUGS 2005
File systems, RAID systems, and applications that require data
consistency, among others, assuredata integrity by carefully forcing
valuable data to stable storage.Unfortunately, verifying that a system
can recover from a crash to avalid state at any program counter is
very difficult. Previoustechniques for finding data integrity bugs
have been heavyweight,requiring extensive effort for each OS and file
system to be checked.We demonstrate a lightweight, flexible,
easy-to-apply technique by developing a tool called EXPLODE and show
how we used it to find 25 serious bugs in eight Linux filesystems,
Linux software RAID 5, Linux NFS, and three version controlsystems.