EVEREST

Data Systems Group · MIT CSAIL

Our mission is to establish the engineering principles for AI-driven data systems. We move beyond ad-hoc experimentation to build a foundation of new abstractions, reusable software components, and rigorous development tools.

Research Directions

Code Generation

Developing AI systems that automatically generate code from specifications, design documents, and natural language descriptions, advancing program synthesis and software development automation.

Data Analytics

Building AI-powered systems for data analysis, transformation, and insight extraction, applying optimization techniques to make data processing more efficient and accessible.

Large Scale Retrieval

Developing efficient systems for retrieving relevant information from massive datasets, optimizing both accuracy and performance for real-world applications.

Agent Operations

Investigating operational frameworks for deploying and managing AI agents in production environments, addressing challenges in monitoring, debugging, and maintaining autonomous systems.

Agent Planning

Developing methods for AI agents to autonomously plan and execute complex multi-step tasks, combining reasoning, decision-making, and adaptive execution strategies.

Projects

D4: Design Doc Driven Development

D4 is an LLM-based compiler that takes design documents as input and automatically generates implementation code. It achieves a 54.8% average pass rate on the commit-0 benchmark, outperforming the next-best system by 14 percentage points.

Palimpzest

Palimpzest achieves a 90.3x speedup and 9.1x cost reduction on document processing benchmarks. The system applies cost-based query optimization techniques to AI-powered data processing, automatically selecting optimal combinations of models, prompts, and execution strategies for data transformation tasks.
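As a rough illustration of what cost-based selection over models and prompts can look like, the sketch below enumerates candidate plans and picks the cheapest one that meets a quality target. It is not Palimpzest's API or its optimizer: the `Plan` class, the model and prompt names, and every cost and quality number are hypothetical values chosen only for the example.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Plan:
    model: str
    prompt: str
    est_cost: float     # estimated dollars per document (hypothetical)
    est_quality: float  # estimated output quality in [0, 1] (hypothetical)


# Hypothetical estimates: model -> (base cost per document, base quality).
MODELS = {"small-model": (0.001, 0.70), "large-model": (0.020, 0.90)}
# Hypothetical estimates: prompt style -> (cost multiplier, quality gain).
PROMPTS = {"zero-shot": (1.0, 0.00), "few-shot": (1.5, 0.05)}


def enumerate_plans():
    """Build one candidate plan per (model, prompt) combination."""
    plans = []
    for (model, (cost, quality)), (prompt, (mult, gain)) in product(
        MODELS.items(), PROMPTS.items()
    ):
        plans.append(Plan(model, prompt, cost * mult, min(1.0, quality + gain)))
    return plans


def choose_plan(plans, min_quality):
    """Return the cheapest plan whose estimated quality meets the target."""
    feasible = [p for p in plans if p.est_quality >= min_quality]
    if not feasible:
        raise ValueError("no plan meets the quality target")
    return min(feasible, key=lambda p: p.est_cost)


if __name__ == "__main__":
    best = choose_plan(enumerate_plans(), min_quality=0.8)
    print(best.model, best.prompt)
```

A real optimizer would estimate costs and quality from sampled executions rather than fixed tables, and would also search over execution strategies. The structure stays the same: enumerate the alternatives, estimate each one, and choose the best under a constraint.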

A2rchi

A2rchi is an open-source RAG framework for AI support systems. The system enables retrieval-augmented generation for technical assistance and has been deployed at MIT (SubMIT system and multiple courses) and CERN for particle physics data processing support.

BRAD

BRAD (Blueprint for Relational Adaptive Databases) virtualizes cloud data infrastructures, automatically optimizing cloud database deployments across different storage engines and compute resources to balance performance and cost.

KramaBench

KramaBench is a comprehensive benchmark for evaluating AI agents on knowledge-intensive tasks. The benchmark provides a systematic framework for assessing agent performance across diverse scenarios requiring deep domain knowledge and reasoning capabilities.

Team

Faculty

Samuel Madden

Tim Kraska

Michael Cafarella

Omar Khattab

Affiliate Faculty

Christoph Paus

Postdocs

Tianyu Li

Postdoctoral Researcher

Jason Mohoney

Postdoctoral Researcher

Gerardo Vitagliano

Postdoctoral Researcher

Research Engineers

James Moore

Research Engineer

Students

Sushrut Borkar

PhD Student

Peter Baile Chen

PhD Student

Zhuohan (Joshua) Gu

PhD Student

Darryl Ho

PhD Student

Ferdinand Kossmann

PhD Student

Eugenie Lai

PhD Student

Markos Markakis

PhD Student

Amadou Latyr Ngom

PhD Student

Matthew Russo

PhD Student

Ziniu Wu

PhD Student

Geoffrey Yu

PhD Student

Alex Zhang

PhD Student

Anna Zeng

PhD Student

Sylvia Zhang

PhD Student

Xinjing Zhou

PhD Student

Events

November 14, 2025

Everest Lab Annual Meeting 2025

MIT Building 45, 8th Floor · 51 Vassar Street, Cambridge, MA

Join us for our annual meeting where lab members will present their latest research findings, discuss ongoing projects, and share insights on the future directions of AI-driven data systems. The event will feature technical talks, poster sessions, and networking opportunities.

8:30 AM - 6:00 PM EST · In-Person Event

Publications

For a complete list of publications from Everest Lab members, please visit the DSG Publications page.

Contact

Lab Contact

everest-info@csail.mit.edu

Location

MIT Computer Science & Artificial Intelligence Laboratory

32 Vassar Street

Cambridge, MA 02139

Prospective Students

We welcome inquiries from prospective PhD students interested in AI-powered data systems. Please reach out to faculty members directly regarding research opportunities.
