New York University
370 Jay St, #1055
(646) 997-3489
(617) 913-9060

Brendan Dolan-Gavitt's Research Home Page


I am currently an Associate Professor in the Computer Science and Engineering Department at the NYU Tandon School of Engineering. My research interests include systems and software security, machine learning, and embedded and cyber-physical systems. My current work focuses on techniques that ease or automate the understanding of large, real-world software systems, typically by subjecting those systems to static and dynamic analyses that reveal hidden and undocumented assumptions about their design and behavior. The goal is to use that understanding to develop novel defenses against attacks.

My research lab is the MESS Lab (Machine Learning, Embedded, and Software/Systems Security). You can find the most up-to-date information on what my research group is doing on our lab home page. I also serve as faculty advisor for the OSIRIS Lab, which does independent research in cybersecurity, plays in CTFs as NYUSec, and organizes NYU's yearly CSAW Cybersecurity Games and Conference.

I received my PhD from Georgia Tech in August 2014, and my B.A. in Mathematics and Computer Science from Wesleyan University in 2006. I also spent two years working as an information security analyst and researcher for the MITRE Corporation.

If you need to get in touch privately, you can find my PGP key on KeyBase.


LAVA Synthetic Bug Corpora

Work on automating vulnerability discovery has long been hampered by a shortage of ground-truth corpora with which to evaluate tools and techniques. This lack of ground truth prevents authors and users of tools alike from measuring such fundamental quantities as miss and false-alarm rates. To solve this problem, we created LAVA, a system that automatically adds serious vulnerabilities to software to create realistic ground-truth datasets at scale.

The LAVA corpora have helped spur innovation by providing a benchmark for evaluating work in vulnerability discovery, including:

LAVA is the result of a collaboration between NYU, Northeastern University, and MIT Lincoln Laboratory.
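
As an illustration of the miss and false-alarm rates mentioned above, here is a minimal Python sketch of how they could be computed once a corpus supplies ground truth. The function name, the bug identifiers, and the definition of false-alarm rate as the fraction of reports that match no known bug are assumptions made for this example; they are not part of LAVA itself.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionRates:
    miss_rate: float         # fraction of known bugs the tool did not report
    false_alarm_rate: float  # fraction of reports that match no known bug


def detection_rates(ground_truth: set[str], reported: set[str]) -> DetectionRates:
    """Compare a tool's reports against a set of known (e.g. injected) bugs.

    Both arguments are sets of hypothetical bug identifiers. Empty inputs
    yield a rate of 0.0 rather than dividing by zero.
    """
    found = ground_truth & reported
    false_alarms = reported - ground_truth
    miss_rate = 1 - len(found) / len(ground_truth) if ground_truth else 0.0
    false_alarm_rate = len(false_alarms) / len(reported) if reported else 0.0
    return DetectionRates(miss_rate, false_alarm_rate)


if __name__ == "__main__":
    # Hypothetical example: four known bugs, three reports from a tool.
    known = {"bug1", "bug2", "bug3", "bug4"}
    reports = {"bug1", "bug2", "crash9"}
    rates = detection_rates(known, reports)
    print(f"miss rate: {rates.miss_rate:.2f}")                # 0.50
    print(f"false-alarm rate: {rates.false_alarm_rate:.2f}")  # 0.33
```

Without ground truth, the set of known bugs is unavailable and neither quantity can be computed, which is the gap a synthetic corpus is meant to close.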

The Malrec Dataset

Current malware sandbox systems face two limitations. First, for performance reasons, the amount of data they can collect is limited (typically to system call traces and memory snapshots). Second, they lack the ability to perform retrospective analysis, that is, to later extract features of the malware's execution that were not considered relevant when the sample was originally executed. We have created a new malware sandbox system, Malrec, which uses PANDA's whole-system deterministic record and replay to capture high-fidelity, whole-system traces of malware executions with low time and space overheads.

We used this system to create a curated dataset of 66,301 full-system traces, collected over a two-year period. It has been used for several deep analyses of malware:





PANDA: Platform for Architecture-Neutral Dynamic Analysis
PANDA is the Platform for Architecture-Neutral Dynamic Analysis. It is a platform based on QEMU and LLVM for performing dynamic software analysis, abstracting architecture-level details away with a clean plugin interface. Particularly notable features include support for deterministic record and replay. It is currently being developed in collaboration with MIT Lincoln Laboratory, Georgia Tech, and Northeastern University.
Virtuoso
Virtuoso is a system for automatically generating tools that can be used to introspect into virtual machines or extract information from memory images. It consists of a dynamic tracing system that records the execution of an in-guest program, and an analysis and translation component that converts the traces into a compact, out-of-guest program that computes the same result. More details can be found in our 2011 IEEE Security and Privacy paper.
Virtual Address Descriptor Tools
The VAD tools are a set of scripts for working with Virtual Address Descriptor structures in dumps of Windows physical memory to provide detailed information about a process's memory allocations to a forensic investigator. (Note: the functionality of these tools has now been implemented in Volatility, and their use is no longer recommended.)
PDBparse
PDBparse is a GPL-licensed library for parsing Microsoft PDB files. Support for these files is already available on Windows through the Debug Interface Access API; however, that interface is not usable on other operating systems. PDB files provide a way to access debugging information about programs compiled with Microsoft Visual Studio, and can enable interesting applications such as extracting the Windows kernel data structures or finding non-exported kernel global variables, all without access to the source.
Volatility
Along with AAron Walters and several others, I help develop and maintain Volatility, an open-source (GPL-licensed) memory forensics framework. Volatility can do a lot of really cool things with memory images, from listing processes and threads, to viewing open network connections, to reconstructing executable files out of memory. I have also written some small extensions that allow it to interpret the memory of live virtual machines under Xen, using the XenAccess library.