UC San Diego Database Lab

The Database Lab at UC San Diego is one of the leading academic research groups in the field of data management, spanning the major themes of theory, systems, languages, interfaces, and applications, as well as intersections with other data-oriented fields. Areas of particular strength include database theory, data integration, semistructured data, heterogeneous data, hardware-conscious data processing, query processing and optimization, data exploration, data analytics, and machine learning systems. Application areas of particular interest have included healthcare, Internet of Things, and social media.

We are a part of the Computer Science and Engineering department. DB Lab faculty are also affiliated with other research groups, including the Theory Group, Artificial Intelligence Group, and Center for Networked Systems, as well as with the Halicioglu Data Science Institute.


Faculty


Alin Deutsch [ DBLP|Google Scholar ]

Alin's interests include data publishing and integration, specification and verification of database-powered business processes, and semistructured and XML data. His research is currently sponsored by the NSF.

Arun Kumar [ Google Scholar ]

Arun's research interests are in data management and systems for machine learning-based data analytics. His research focuses on designing abstractions, algorithms, and systems that make it easier and faster to analyze large and complex datasets using machine learning.

Yannis Papakonstantinou [ DBLP|Google Scholar ]

Yannis' research extends the capabilities of data platforms and query processors. He has published over 100 research articles with more than 14,000 citations.

Victor Vianu [ DBLP|Google Scholar ]

Victor's research interests are in database systems and theory. He has been focusing on verification of database-driven systems, an area at the boundary of databases and computer-aided verification. His current research focuses on automatic verification of interactive data-driven Web services and business processes. He is also interested in the theory of query languages and computational logic.

Affiliated Faculty

Kamalika Chaudhuri [Google Scholar]

Kamalika's research interests lie in the area of machine learning. Specifically, much of her work is on privacy-preserving machine learning and unsupervised learning, but she's also broadly interested in a number of topics in learning theory, such as confidence-rated prediction, online learning, and active learning.

Babak Salimi [Google Scholar]

Babak's research interests are in data management, causal inference, responsible data science and data ethics. His research seeks to unify techniques from theoretical data management, causal inference and machine learning to develop the necessary conceptual foundations for decision-making and policy evaluation from complex relational data, algorithmic fairness, explainability and accountability.

PhD Students and Postdocs

Rana Alotaibi

Rana's research interests lie in hybrid-store databases, large-scale data integration, and scalable self-tuning database systems.

Side Li

Side's research interests lie in systems for machine learning. He also loves unsweetened boba milk tea.

Tara Mirmira

Tara's research interests lie in systems for data analytics and machine learning.

Supun Nakandala

Supun's research interests lie broadly at the intersection of machine learning and systems, an emerging area also known as machine learning systems. His current work involves implementing efficient, scalable, and reliable algorithms, systems, and abstractions for deep learning-powered machine learning workloads.

Ainur Smagulova

Ainur's research interests lie at the intersection of the theory and applications of databases and big data management systems. Currently, she is focused on data processing for massively parallel systems, more specifically, parallel computation of queries over large graphs and their complexity.

Vraj Shah

Vraj's research interests lie in accelerating the advanced data analytics life cycle by simplifying data sourcing for machine learning (ML) and making it easier and cheaper to deploy ML-powered data analytics applications, improving both the efficiency and the usability of ML for data science.

Marysia Tran

Marysia's research interests lie in logic and database theory.

Yuhao Zhang

Yuhao's research interests lie within the field of machine learning systems, including systems powered by applied ML that enable novel applications, and systems designed for ML to make data science easier and faster.

Xiuwen Zheng

Xiuwen's research interests lie in heterogeneous data management, polystore systems, query optimization, and AI in databases.

SDSC Affiliated Members

Ilkay Altintas

Ilkay Altintas is the Director for the Center of Excellence in Workflows for Data Science at the San Diego Supercomputer Center (SDSC), UCSD.

Chaitanya K. Baru

Director, Center for Large-scale Data Systems Research (CLDS), clds.sdsc.edu; Director, Advanced Cyberinfrastructure Development Group (ACID).

Amarnath Gupta

Amarnath Gupta is a Research Scientist at the San Diego Supercomputer Center (SDSC) of the University of California San Diego.


Alumni

Nikos Koulouris
PhD; 2020; Advised by Yannis Papakonstantinou
Vicky Papavasileiou
PhD; 2019; Advised by Alin Deutsch
Konstantinos (Costas) Zarifis
PhD; 2019; Advised by Yannis Papakonstantinou
Yuliang Li
PhD; 2018; Advised by Alin Deutsch and Victor Vianu
Chunbin Lin
PhD; 2018; Advised by Yannis Papakonstantinou
Jianguo Wang
PhD; 2018; Advised by Yannis Papakonstantinou and Steve Swanson
Yannis Katsis
PostDoc; 2017; Advised by Yannis Papakonstantinou
Wojtek Kazana
PostDoc; 2014; Advised by Victor Vianu and Alin Deutsch
Elio Damaggio
PhD; 2012; Advised by Victor Vianu and Alin Deutsch
Alan Nash
PhD; 2011; Advised by Alin Deutsch and Victor Vianu
Andrey Balmin
PhD; 2010; Advised by Alin Deutsch
Liying Sui
PhD; 2006; Advised by Victor Vianu
Frank Neven
PostDoc; 1999; Advised by Victor Vianu
Karl Denninghoff
PhD; 1995; Advised by Victor Vianu
Stéphane Grumbach
PostDoc; 1990-91; Advised by Victor Vianu
Gottfried Vossen
PostDoc; 1987; Advised by Victor Vianu
Avinash Vyas
Bertram Ludascher
Dayou Zhou
Emiran Curtmola
Guilian Wang
Heasoo Hwang
Kevin Zhao
Kian Ong
Lian Chen
Liying Sui
Michalis Petropoulos
Min Qian
Nathan Bales
Nicola Onose
Nikolaos Trogkanis
Pavel Velikhov
Pratik Mukhopadhyay
Sergio Lifschitz
Vagelis Hristidis
Yu Xu
Yupeng Fu


News

December 2020
Kabir receives an Honorable Mention for the CRA Outstanding Undergraduate Researcher Award; link to announcement.

August 2020
Supun, Arun, and Yannis receive an ACM SIGMOD Research Highlight Award; link to SIGMOD Rec. special edition.

July 2020
Arun receives a VMware Early Career Faculty Grant award.

June 2020
Arun receives an NSF CAREER Award; link to award abstract.

March 2020
Arun receives a Google Faculty Research Award; link to announcement.


Research Projects

Current Projects


An end-to-end data system for sourcing data and features, as well as specifying, optimizing, and managing the ML model selection process.


Deep learning-powered database perception to enable data systems to see and hear unstructured data for unified type-agnostic analytics.


A declarative, rapid development framework for data-driven Ajax reports and applications. Rich visualizations and collaborative workflows require only a few lines of SQL-based code and visualization/interaction markup.

Past Projects

  • Delphi: DELPHI is a platform that enables integrated access and analysis of all data relevant to health. This platform promotes a more rapid development of empowering, data-driven health apps and tools by a broad community of health-related software developers.
  • SQL++ and Middleware: SQL++ is a highly expressive semi-structured query language that encompasses both the SQL and the JSON data models. SQL++ is backwards-compatible with SQL. The Configurable version of SQL++ includes configuration options that formally itemize the semantic variations that language designers may choose from. We use SQL++ in FORWARD's middleware query processor.
  • Other projects: Links coming soon.

Recent Publications

Cerebro: A Layered Data Platform for Scalable Deep Learning
Arun Kumar, Supun Nakandala, Yuhao Zhang, Side Li, Advitya Gemawat, and Kabir Nagrecha
2021; CIDR

Cerebro: A Data System for Optimized Deep Learning Model Selection
Supun Nakandala, Yuhao Zhang, and Arun Kumar
2020; VLDB

Panorama: A Data System for Unbounded Vocabulary Querying over Video
Yuhao Zhang and Arun Kumar
2020; VLDB

Understanding and Benchmarking the Impact of GDPR on Database Systems
Supreeth Shastri, Vinay Banakar, Melissa Wasserman, Arun Kumar, and Vijay Chidambaram
2020; VLDB

Query Optimization for Faster Deep CNN Explanations
Supun Nakandala, Arun Kumar, and Yannis Papakonstantinou
2020; ACM SIGMOD Record

Incremental and Approximate Computations for Accelerating Deep CNN Inference
Supun Nakandala, Kabir Nagrecha, Arun Kumar, and Yannis Papakonstantinou
2020; ACM TODS

Vista: Optimized System for Declarative Feature Transfer from Deep CNNs at Scale
Supun Nakandala and Arun Kumar

SpeakQL: Towards Speech-driven Multimodal Querying of Structured Data
Vraj Shah, Side Li, Arun Kumar, and Lawrence Saul

Aggregation Support for Modern Graph Analytics in TigerGraph
Alin Deutsch, Yu Xu, Mingxi Wu, and Victor E. Lee

Towards Scalable Hybrid Stores: Constraint-Based Rewriting to the Rescue
Rana Alotaibi, Damian Bursztyn, Alin Deutsch, Ioana Manolescu, and Stamatis Zampetakis


Courses Offered by Database Lab Faculty

Graduate Courses

Undergraduate Courses