New York University
370 Jay St, #1055
(646) 997-3489
(617) 913-9060
brendandg@nyu.edu

Brendan Dolan-Gavitt's Research Home Page

Biography

I am currently an Associate Professor in the Computer Science and Engineering Department at the NYU Tandon School of Engineering. My research interests include systems and software security, machine learning, and embedded and cyber-physical systems. Currently, my research focuses on techniques that ease or automate the understanding of large, real-world software systems, typically by subjecting those systems to static and dynamic analyses that reveal hidden and undocumented assumptions about their design and behavior. The goal is to use that understanding to develop novel defenses against attacks.

My research lab is the MESS Lab (Machine Learning, Embedded, and Software/Systems Security). You can find the most up-to-date information on what my research group is doing on our lab home page. I also serve as faculty advisor for the OSIRIS Lab, which does independent research in cybersecurity, plays in CTFs as NYUSec, and organizes NYU's yearly CSAW Cybersecurity Games and Conference.

I received my PhD from Georgia Tech in August 2014, and my B.A. in Mathematics and Computer Science from Wesleyan University in 2006. I also spent two years working as an information security analyst and researcher for the MITRE Corporation.

If you need to get in touch privately, you can find my PGP key on KeyBase.

Datasets

LAVA Synthetic Bug Corpora

Work on automating vulnerability discovery has long been hampered by a shortage of ground-truth corpora with which to evaluate tools and techniques. This lack of ground truth prevents authors and users of tools alike from measuring such fundamental quantities as miss and false alarm rates. To solve this problem, we created LAVA, a system that automatically adds serious vulnerabilities to software to create realistic ground-truth datasets at scale.
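As a rough illustration of why ground truth matters, the sketch below computes a miss rate and a false alarm rate for a single tool run. It is not part of LAVA: the function name, the bug identifiers, and the choice to define the false alarm rate as the fraction of reports that match no known bug are all assumptions made for this example.

```python
def detection_rates(ground_truth, reported):
    """Return (miss_rate, false_alarm_rate) for one tool run.

    ground_truth: IDs of bugs known to be present in the program.
    reported:     IDs of bugs the tool claims to have found.
    Both names and the ID scheme are hypothetical.
    """
    ground_truth, reported = set(ground_truth), set(reported)
    missed = ground_truth - reported        # real bugs the tool did not report
    false_alarms = reported - ground_truth  # reports that match no known bug
    # Guard against empty sets so the function never divides by zero.
    miss_rate = len(missed) / len(ground_truth) if ground_truth else 0.0
    false_alarm_rate = len(false_alarms) / len(reported) if reported else 0.0
    return miss_rate, false_alarm_rate


# Hypothetical corpus with four injected bugs and a tool that reports three findings.
injected = {"bug1", "bug2", "bug3", "bug4"}
tool_reports = {"bug1", "bug3", "spurious1"}
miss_rate, false_alarm_rate = detection_rates(injected, tool_reports)
print(f"miss rate: {miss_rate:.2f}, false alarm rate: {false_alarm_rate:.2f}")
```

Without a known set of injected bugs, the `missed` set cannot be computed at all, which is the gap a ground-truth corpus is meant to fill.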

The LAVA corpora have helped spur innovation by providing a benchmark for evaluating work in vulnerability discovery, including:

T-Fuzz: fuzzing by program transformation. Peng et al. (Oakland 2018).
VUzzer: Application-aware Evolutionary Fuzzing. Rawat et al. (NDSS 2017).
Steelix: program-state based binary fuzzing. Li et al. (FSE 2017).
Search Based Fuzzing. Laszlo Szekeres (PhD Thesis).
Angora: Efficient Fuzzing by Principled Search. Chen and Chen (Oakland 2018).

LAVA is the result of a collaboration between NYU, Northeastern University, and MIT Lincoln Laboratory.

The Malrec Dataset

Current malware sandbox systems face two limitations. First, for performance reasons, the amount of data they can collect is limited (typically to system call traces and memory snapshots). Second, they cannot perform retrospective analysis, that is, later extraction of features of the malware's execution that were not considered relevant when the sample was originally executed. We have created a new malware sandbox system, Malrec, which uses PANDA's whole-system deterministic record and replay to capture high-fidelity, whole-system traces of malware executions with low time and space overheads.

We used this system to create a curated dataset of 66,301 full-system traces, collected over a two-year period. It has been used for several deep analyses of malware:

Malware Lineage in the Wild. Ul Haq et al. (arXiv preprint).
Malrec: Compact Full-Trace Malware Recording for Retrospective Deep Analysis. Severi et al. (DIMVA 2018).

Publications

Refereed

Joshua Bundt, Andrew Fasano, Brendan Dolan-Gavitt, William Robertson, and Tim Leek. Evaluating Synthetic Bugs. 16th ACM ASIA Conference on Computer and Communications Security (ACM ASIACCS 2021), June 2021. [preprint]
Andrew Fasano, Tiemoko Ballo, Marius Muench, Tim Leek, Alexander Oleinik, Brendan Dolan-Gavitt, Manuel Egele, Aurélien Francillon, Long Lu, Nick Gregory, Davide Balzarotti, and William Robertson. SoK: Enabling Security Analyses of Embedded Systems via Rehosting. 16th ACM ASIA Conference on Computer and Communications Security (ACM ASIACCS 2021), June 2021. [preprint]
Luke Craig, Andrew Fasano, Tiemoko Ballo, Tim Leek, Brendan Dolan-Gavitt, and William Robertson. PyPANDA: Taming the PANDAmonium of Whole System Dynamic Analysis. Proceedings of the 4th NDSS Workshop on Binary Analysis Research (BAR), February 2021. [preprint]
Zekun Shen and Brendan Dolan-Gavitt. HeapExpo: Pinpointing Promoted Pointers to Prevent Use-After-Free Vulnerabilities. Proceedings of the Annual Computer Security Applications Conference (ACSAC), December 2020. [pdf]
Qingchuan Zhao, Chaoshun Zuo, Brendan Dolan-Gavitt, Giancarlo Pellegrino, and Zhiqiang Lin. Automatic Uncovering of Hidden Behaviors from Input Validation in Mobile Apps. Proceedings of the IEEE Symposium on Security and Privacy (Oakland), May 2020. [pdf] Press coverage: ZDNet
Awanish Pandey, Yu Hu, Brendan Dolan-Gavitt, and Subhajit Roy. Realistic Bug Synthesis for Testing Bug-Finding Tools. ESEC/FSE 2018. [pdf]
Kang Liu, Brendan Dolan-Gavitt, and Siddharth Garg. Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks. RAID 2018. [pdf]
Giorgio Severi, Tim Leek, and Brendan Dolan-Gavitt. Malrec: Compact Full-Trace Malware Recording for Retrospective Deep Analysis. DIMVA 2018. [pdf]
Tianyu Gu, Siddharth Garg, and Brendan Dolan-Gavitt. BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain. Machine Learning and Computer Security Workshop (Colocated with NIPS), December 2017. [pdf] (Best Attack Paper) Press coverage: WIRED, The Register, The Independent, Quartz.
Yiwen Li, Brendan Dolan-Gavitt, Sam Weber, and Justin Cappos. Lock-in-Pop: Securing Privileged Operating System Kernels by Keeping on the Beaten Path. USENIX Annual Technical Conference (ATC), July 2017. [pdf]
Patrick Hulin, Andy Davis, Rahul Sridhar, Andrew Fasano, Cody Gallagher, Aaron Sedlacek, Tim Leek, and Brendan Dolan-Gavitt. AutoCTF: Creating Diverse Pwnables via Automated Bug Injection. USENIX Workshop on Offensive Technologies (WOOT), August 2017. [pdf]
Brendan Dolan-Gavitt, Patrick Hulin, Engin Kirda, Tim Leek, Andrea Mambretti, Wil Robertson, Frederick Ulrich, and Ryan Whelan. LAVA: Large-scale Automated Vulnerability Addition. Proceedings of the IEEE Symposium on Security and Privacy (Oakland), May 2016. [pdf] [slides] Press coverage: Engineering.com, Military Embedded Systems, Network World, PC World.
Brendan Dolan-Gavitt, Josh Hodosh, Patrick Hulin, Tim Leek, and Ryan Whelan. Repeatable Reverse Engineering with PANDA. 5th Program Protection and Reverse Engineering Workshop, Los Angeles, California, December 2015. [pdf]
Brendan Dolan-Gavitt, Tim Leek, Josh Hodosh, and Wenke Lee. Tappan Zee (North) Bridge: Mining Memory Accesses for Introspection. Proceedings of the ACM Conference on Computer and Communications Security (CCS), November 2013. [pdf] [Slides: Keynote | PDF] [Presentation Audio] [BibTex]
Brendan Dolan-Gavitt, Tim Leek, Michael Zhivich, Jonathon Giffin, and Wenke Lee. Virtuoso: Narrowing the Semantic Gap in Virtual Machine Introspection. Proceedings of the IEEE Symposium on Security and Privacy (Oakland), May 2011. [pdf] [Slides: Keynote | PDF | PDF with Notes] [BibTex]
Brendan Dolan-Gavitt, Abhinav Srivastava, Patrick Traynor, and Jonathon Giffin. Robust Signatures for Kernel Data Structures. Proceedings of the ACM Conference on Computer and Communications Security (CCS), November 2009. [pdf] [slides] [BibTex]
Brendan Dolan-Gavitt. The VAD tree: A process-eye view of physical memory. Digital Investigation, Volume 4, Supplement 1, September 2007, Pages 62-64. [pdf] [slides] [BibTex]
Brendan Dolan-Gavitt. Forensic analysis of the Windows registry in memory. Digital Investigation, Volume 5, Supplement 1, September 2008, Pages S26-S32. [pdf] [slides] [BibTex]

Unrefereed

Akshaj Kumar Veldanda, Kang Liu, Benjamin Tan, Prashanth Krishnamurthy, Farshad Khorrami, Ramesh Karri, Brendan Dolan-Gavitt, and Siddharth Garg. NNoculation: Broad Spectrum and Targeted Treatment of Backdoored DNNs. [pdf]
Zhenghao Hu, Yu Hu, and Brendan Dolan-Gavitt. Chaff Bugs: Deterring Attackers by Making Software Buggier. [pdf] Press coverage: VICE Motherboard, CSO Online, MIT Technology Review, The Register, TechXplore
Brendan Dolan-Gavitt, Josh Hodosh, Patrick Hulin, Tim Leek, and Ryan Whelan. Repeatable Reverse Engineering for the Greater Good with PANDA. Technical Report: CUCS-023-14, October 2014. [pdf]
Brendan Dolan-Gavitt, Bryan Payne, and Wenke Lee. Leveraging Forensic Tools for Virtual Machine Introspection. Technical Report: GT-CS-11-05, May 2011. [pdf]
Brendan Dolan-Gavitt and Yacin Nadji. See No Evil: Evasions in Honeymonkey Systems. May 2010. [pdf]
Brendan Dolan-Gavitt and Patrick Traynor. Using Kernel Type Graphs to Detect Dummy Structures. December 2008. [pdf]

Software

PANDA: Platform for Architecture-Neutral Dynamic Analysis: PANDA is a platform based on QEMU and LLVM for performing dynamic software analysis. It abstracts architecture-level details away behind a clean plugin interface. A particularly notable feature is its support for deterministic record and replay. It is currently being developed in collaboration with MIT Lincoln Laboratory, Georgia Tech, and Northeastern University.
Virtuoso: Virtuoso is a system for automatically generating tools that can be used to introspect into virtual machines or extract information from memory images. It consists of a dynamic tracing system that records the execution of an in-guest program, and an analysis and translation component that converts the traces into a compact, out-of-guest program that computes the same result. More details can be found in our 2011 IEEE Security and Privacy paper.
Virtual Address Descriptor Tools: The VAD tools are a set of scripts for working with Virtual Address Descriptor structures in dumps of Windows physical memory to provide detailed information about a process's memory allocations to a forensic investigator. (Note: the functionality of these tools has now been implemented in Volatility, and their use is no longer recommended.)
PDBparse: PDBparse is a GPL-licensed library for parsing Microsoft PDB files. Support for these files is already available within Windows through the Debug Interface Access API; however, this interface is not usable on other operating systems. PDB files provide a way to access debugging information about programs compiled with Microsoft Visual Studio, and can enable interesting applications such as extracting the Windows kernel data structures or finding non-exported kernel global variables, all without access to the source.
Volatility: Along with AAron Walters and several others, I help develop and maintain Volatility, an open-source (GPL-licensed) memory forensics framework. Volatility can do a lot of really cool things with memory images, from listing processes and threads, to viewing open network connections, to reconstructing executable files out of memory. I have also written some small extensions that allow it to interpret the memory of live virtual machines under Xen, using the XenAccess library.