Cloud9: Scalable Cluster-based Parallel Symbolic Execution

Cloud9 is the first parallel symbolic execution engine that scales linearly on large clusters of commodity hardware. Cloud9 offers a testing engine whose power is directly proportional to the number of cluster nodes, enabling the analysis of significantly larger systems than before. Unlike existing symbolic execution engines, Cloud9 handles not only single-threaded programs but also multithreaded ones, and it provides a symbolic environment that is the first to support all major elements of the POSIX interface, such as processes, threads, synchronization, networking, IPC, and file I/O.
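The core mechanism behind any such engine is forking one execution path per feasible branch outcome; Cloud9's contribution is that the resulting path subtrees are independent and can be farmed out to cluster workers. The following toy sketch illustrates both ideas. The program under test, the branch predicates, and the small input domain are all hypothetical, and enumerating the domain stands in for a real constraint solver.

```python
from collections import deque

# Hypothetical "program under test", reduced to a list of branch predicates
# applied in order; each branch splits the current path in two.
BRANCHES = [
    lambda x: x > 100,       # branch 1
    lambda x: x % 2 == 0,    # branch 2
    lambda x: x < 150,       # branch 3
]

DOMAIN = range(0, 256)  # stand-in for an SMT solver: enumerate a tiny domain

def explore(num_workers=4):
    """Fork one path per branch outcome, keeping only feasible paths (those
    some input in DOMAIN can actually follow). The completed paths form
    independent units of work that a coordinator can shard across workers."""
    frontier = deque([[]])          # each item: list of (branch_idx, taken)
    complete = []
    while frontier:
        path = frontier.popleft()
        depth = len(path)
        if depth == len(BRANCHES):
            complete.append(path)
            continue
        for taken in (True, False):
            new_path = path + [(depth, taken)]
            # Feasibility check: does any input follow exactly this path?
            if any(all(BRANCHES[i](x) == t for i, t in new_path)
                   for x in DOMAIN):
                frontier.append(new_path)
    # Shard the finished paths round-robin to illustrate distribution.
    shards = [complete[w::num_workers] for w in range(num_workers)]
    return complete, shards
```

Of the eight possible branch combinations, two are infeasible (an input cannot be both at most 100 and at least 150), so the sketch explores six paths; a real engine distributes the live frontier rather than completed paths.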

Find more details on the project's web page.

S2E: Multi-path In-vivo Analysis of Complex Software Systems

Selective symbolic execution, embodied in the S2E system, automatically selects the minimal set of paths that need to be explored for a given analysis, thus reducing resource needs by orders of magnitude compared to classic symbolic execution. S2E weaves the execution back and forth between the symbolic and the concrete domain by automatically and transparently converting symbolic to concrete data and vice versa. These conversions are governed by execution consistency models, a unified way of reasoning about families of paths through programs. S2E directly runs x86 machine code and requires no abstract models of the code and its environment; it can run and analyze entire software stacks, such as Microsoft Windows with all its libraries and applications. We used S2E to build, among others, a tool that tests proprietary Windows drivers and a comprehensive performance profiler, which constitutes the first use of symbolic execution in performance analysis.
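The symbolic-to-concrete handoff can be pictured with a toy model in which a symbolic value is just the set of concrete values still possible on the current path. All names here (`Sym`, `concretize`, `external_strlen`) are invented for illustration; S2E itself operates on x86 machine state and SMT constraints, not Python sets.

```python
class Sym:
    """Toy symbolic value: the set of concrete values possible on this path."""
    def __init__(self, candidates):
        self.candidates = set(candidates)

def concretize(sym):
    """Crossing from symbolic code into concrete code (e.g., a real library
    or the OS) forces a single concrete value; the chosen value is recorded
    as a path constraint so later symbolic execution stays consistent."""
    value = min(sym.candidates)      # any satisfying assignment works
    sym.candidates = {value}         # remember the choice as a constraint
    return value

def external_strlen(s: str) -> int:  # stands in for un-instrumented code
    return len(s)

def run_path(length_sym):
    # Symbolic code: branching on the symbolic length prunes candidates.
    length_sym.candidates = {c for c in length_sym.candidates if c > 2}
    # Environment call: concretize, call out, then re-enter symbolic mode.
    concrete = concretize(length_sym)
    return external_strlen("x" * concrete)
```

Which constraints must be remembered, and when a concretized value may later be made symbolic again, is exactly what S2E's execution consistency models govern.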

Find more details on the project's web page.

ESD: Automated Debugging Through Execution Synthesis

Debugging is time-consuming: bug reports rarely provide sufficient information to reproduce the reported failure, so developers turn into detectives searching for an explanation of how the bug occurred. We introduced execution synthesis, a technique that combines reverse static analysis with forward symbolic execution to automate this process and find program inputs and thread schedules that cause a program to exhibit the symptoms shown in a bug report. We leverage the insight that, to fix a bug, developers need not reproduce the exact execution that failed in the field: any execution that leads to the same failure state suffices. Execution synthesis requires no runtime tracing or program modifications, thus incurring no runtime overhead.
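The "any failing execution suffices" insight can be shown with a toy sketch. The bug report names only a symptom (a `ZeroDivisionError`), and the goal is to find some input that reaches it. The `buggy` function and the brute-force scan are hypothetical stand-ins: a real implementation uses static analysis to narrow the search and symbolic execution to solve for inputs.

```python
def buggy(n: int) -> int:
    """Hypothetical buggy function: crashes for some inputs, but the bug
    report carries no input, only the exception type."""
    d = (n * 7) % 13 - 5        # divisor happens to be 0 for some inputs
    return 100 // d

def synthesize_failing_input(target_exc=ZeroDivisionError,
                             domain=range(0, 1000)):
    """Sketch of execution synthesis: search for *any* input that reaches
    the reported failure state; brute force over a small domain stands in
    for static analysis plus symbolic execution."""
    for candidate in domain:
        try:
            buggy(candidate)
        except target_exc:
            return candidate    # any input reaching the same failure suffices
    return None
```

The synthesized input need not be the one the user actually supplied; it merely reproduces the same failure state, which is all a developer needs to diagnose and fix the bug.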

Find more details on the project's web page.

Reverse Execution Synthesis: Automated Debugging for Arbitrarily Long Executions

One of the most energy-draining and frustrating parts of software development is playing detective with elusive bugs. We argue that automated post-mortem debugging of failures is feasible for real, in-production systems with no runtime recording. To this end, we are developing reverse execution synthesis (RES), a technique that takes a coredump obtained after a failure and automatically computes the suffix of an execution that leads to that coredump. RES provides a way to then play back this suffix in a debugger deterministically, over and over again. We argue that the RES approach could be used to (1) automatically classify bug reports based on their root cause, (2) automatically identify coredumps for which hardware errors (e.g., bad memory), not software bugs, are to blame, and (3) ultimately help developers reproduce the root cause of the failure in order to debug it.

ADDA: Automating the Debugging of Datacenter Applications

Debugging data-intensive distributed applications running in datacenters is complex and time-consuming because developers do not have practical ways of deterministically replaying failed executions. Building such tools is hard because the non-determinism that may be tolerable on a single node is exacerbated in large clusters of interacting nodes, and datacenter applications produce terabytes of intermediate data exchanged by nodes, thus making full input recording infeasible. We introduce ADDA, a replay-debugging system for datacenters that has lower recording and storage overhead than existing systems. ADDA provides "control plane" determinism, leveraging the observation that typical datacenter applications consist of a separate "control plane" and "data plane", and that most bugs reside in the former. ADDA does not record "data plane" inputs; instead, it synthesizes them during replay, starting from the application’s external inputs.

Portend: Automated Classification of Data Races

Even though most data races are harmless, the harmful ones are at the heart of some of the worst concurrency bugs. Alas, spotting just the harmful data races in programs is like finding a needle in a haystack: the vast majority of the true data races reported by state-of-the-art race detectors turn out to be harmless. Portend is a tool that not only detects races but also automatically classifies them based on their potential consequences: Could they lead to crashes or hangs? Alter system state? Could their effects be visible outside the program? Are they harmless?
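The essence of consequence-based classification can be sketched in miniature: execute the two racing accesses in both possible orders (standing in for exploring alternative thread interleavings) and compare the outcomes. Everything here is a hypothetical reduction; Portend itself explores many more schedules via symbolic execution and also tracks externally visible effects.

```python
import copy

def classify_race(state, access_a, access_b):
    """Toy consequence-based classification: run the racing accesses in
    both orders on copies of the shared state. Same outcome in both orders
    suggests the race is harmless; a divergent outcome or a crash flags it
    as potentially harmful."""
    outcomes = []
    for first, second in ((access_a, access_b), (access_b, access_a)):
        s = copy.deepcopy(state)
        try:
            first(s)
            second(s)
            outcomes.append(("ok", s))
        except Exception as e:
            outcomes.append(("crash", type(e).__name__))
    return ("potentially harmful" if outcomes[0] != outcomes[1]
            else "likely harmless")

# Hypothetical races: two threads redundantly setting a "done" flag is
# order-independent; a doubling racing with an addition is not.
harmless = classify_race({"done": False},
                         lambda s: s.update(done=True),
                         lambda s: s.update(done=True))
harmful = classify_race({"n": 1},
                        lambda s: s.update(n=s["n"] * 2),
                        lambda s: s.update(n=s["n"] + 3))
```

In the second example the two orders yield n = 5 and n = 8 respectively, so the race is flagged as potentially harmful, while the redundant flag writes converge to the same state in either order.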

RaceMob: Crowdsourced Data Race Detection

RaceMob is a novel data race detector that combines low overhead with good accuracy. RaceMob first detects potential races statically (hence it has few false negatives) and then dynamically validates whether they are true races (hence it has few false positives). It achieves low runtime overhead and a high degree of realism by combining real-user crowdsourcing with a new on-demand dynamic data race validation technique. RaceMob is the first data race detector that can be kept always-on in production while providing better accuracy than today’s race detectors.

-Overify: Speeding Up Verification Through Compiler Optimizations

Developers today rely on automated tools to test and verify complex software systems. Such off-the-shelf tools are often too slow and hard to use because they work on programs that were not built with verification in mind. The -Overify project investigates how compilers can make programs more verification-friendly. Our prototypes show that verification speed can be improved by up to two orders of magnitude with minimal changes to the build process.

Klein: A Platform for Modular Program Analysis

Today's software systems are large and complex. No single technique suffices to analyze and test them. The Klein project aims to build a platform that breaks down this complexity into manageable parts. It allows multiple automated verification techniques to collaborate on each part of the problem. Analysis results can be exchanged between individual program modules and between different analysis techniques. Currently, we are using Klein to apply a combination of bounded model checking and static analysis techniques to guarantee the memory safety of systems software.