UC San Diego Database Lab

The Database Lab at UC San Diego is one of the leading academic research groups in the field of data management, spanning the major themes of theory, systems, languages, interfaces, and applications, as well as intersections with other data-oriented fields. Areas of particular strength include database theory, data integration, semistructured data, heterogeneous data, query processing and optimization, data exploration, data analytics, machine learning systems, causal inference, and responsible data science. Application areas of particular interest have included healthcare, Internet of Things, and social media.

Our members span the Department of Computer Science and Engineering and the Halıcıoğlu Data Science Institute. DB Lab faculty are also affiliated with other research groups, including the CSE Theory Group, CSE AI Group, HDSI AI and ML Group, HDSI Data Infrastructure and Systems Group, and Center for Networked Systems.


People


Faculty

Alin Deutsch [ DBLP|Google Scholar ]

Alin's research interests include data publishing and integration, specification and verification of DB-powered business processes, and semistructured and XML data.

Arun Kumar [ Google Scholar ]

Arun's research interests are in data management and systems for ML/AI-based data analytics. His work focuses on designing abstractions, algorithms, and systems to make it easier and faster to analyze large and complex datasets using ML/AI.

Babak Salimi [ DBLP|Google Scholar ]

Babak's research interests are in data management, causal inference, responsible data science, and data ethics. His work unifies techniques from DB theory, causal inference, and ML to lay the foundations for decision-making and policy evaluation from complex relational data, algorithmic fairness, explainability, and accountability.

Victor Vianu [ DBLP|Google Scholar ]

Victor's research interests are in DB systems and theory. His current work focuses on verification of DB-driven systems, at the intersection of DB and computer-aided verification, and on automatic verification of interactive data-driven Web services and business processes. He is also interested in the theory of query languages and computational logic.

Affiliated and Adjunct Faculty

Kamalika Chaudhuri [Google Scholar]

Kamalika's research interests lie in the area of ML. Much of her work is on privacy-preserving ML and unsupervised learning, but she's also broadly interested in learning theory, including confidence-rated prediction, online learning, and active learning.

Yannis Papakonstantinou [Google Scholar]

Yannis' research extends the capabilities of data platforms and query processors. He has published over 100 research articles with more than 14,000 citations.

PhD Students and Postdocs

Baharan Khatami

Baharan's research interests lie in fair and explainable machine learning, data debiasing, data cleaning for ML, and causal inference in relational data using techniques such as graph representation learning.

Kyle Luoma

Kyle's research interests lie in the use of data science and data engineering methods within the government domain, specifically making data-based solution engineering within the Department of Defense more accessible to data analysts and researchers via novel technologies. His current work is on speech-based querying of relational databases.

Kabir Nagrecha

Kabir's research interests are in the area of ML systems, with a focus on applying data management techniques to improve the scalability and performance of deep learning training. His most recent work aims to improve the efficiency and trainability of deep learning workloads that are too large to fit into processor memory.

Xiuwen Zheng

Xiuwen's research interests lie in heterogeneous data management, polystore systems, query optimization, and AI in DB.

Jiongli Zhu

Jiongli's research interests lie in data cleaning and debiasing for machine learning applications.

SDSC Affiliated Members

Ilkay Altintas

Ilkay Altintas is the Director for the Center of Excellence in Workflows for Data Science at the San Diego Supercomputer Center (SDSC), UCSD.

Amarnath Gupta

Amarnath Gupta is a Research Scientist at the San Diego Supercomputer Center (SDSC) of the University of California San Diego.

Alumni

Yuhao Zhang
PhD; 2023; Advised by Arun Kumar
Romila Pradhan
PostDoc; 2022; Advised by Babak Salimi
Rana Alotaibi
PhD; 2022; Advised by Alin Deutsch
Supun Nakandala
PhD; 2022; Advised by Arun Kumar
Vraj Shah
PhD; 2022; Advised by Arun Kumar
Ainur Smagulova
PhD; 2021; Advised by Alin Deutsch
Nikos Koulouris
PhD; 2020; Advised by Yannis Papakonstantinou
Vicky Papavasileiou
PhD; 2019; Advised by Alin Deutsch
Konstantinos (Costas) Zarifis
PhD; 2019; Advised by Yannis Papakonstantinou
Yuliang Li
PhD; 2018; Advised by Alin Deutsch and Victor Vianu
Chunbin Lin
PhD; 2018; Advised by Yannis Papakonstantinou
Jianguo Wang
PhD; 2018; Advised by Yannis Papakonstantinou and Steve Swanson
Yannis Katsis
PostDoc; 2017; Advised by Yannis Papakonstantinou
Wojtek Kazana
PostDoc; 2014; Advised by Victor Vianu and Alin Deutsch
Elio Damaggio
PhD; 2012; Advised by Victor Vianu and Alin Deutsch
Alan Nash
PhD; 2011; Advised by Alin Deutsch and Victor Vianu
Andrey Balmin
PhD; 2010; Advised by Alin Deutsch
Liying Sui
PhD; 2006; Advised by Victor Vianu
Frank Neven
PostDoc; 1999; Advised by Victor Vianu
Karl Denninghoff
PhD; 1995; Advised by Victor Vianu
Stéphane Grumbach
PostDoc; 1990-91; Advised by Victor Vianu
Gottfried Vossen
PostDoc; 1987; Advised by Victor Vianu
Avinash Vyas
Bertram Ludascher
Dayou Zhou
Emiran Curtmola
Guilian Wang
Heasoo Hwang
Kevin Zhao
Kian Ong
Lian Chen
Liying Sui
Michalis Petropoulos
Min Qian
Nathan Bales
Nicola Onose
Nikolaos Trogkanis
Pavel Velikhov
Pratik Mukhopadhyay
Sergio Lifschitz
Vagelis Hristidis
Yu Xu
Yupeng Fu

News

December 2023
Yuhao graduates with his PhD. Congrats and best wishes, Yuhao!

August 2023
Yannis Papakonstantinou joins Google full time and switches to adjunct faculty at UC San Diego. Best wishes, Yannis!

April 2023
Supun receives the coveted ACM SIGMOD Jim Gray Doctoral Dissertation Award! He is the first recipient in the UCSD DB Lab's history, and this is the first time this award goes to work in the fast-growing arena of ML Systems. Congrats, Supun!

June 2022
Rana, Supun, and Vraj walk at the PhD commencement. Congrats and best wishes to them for their careers!

February 2022
Kabir receives a highly competitive Meta PhD Fellowship, becoming the first UCSD student to receive one in the program's 10-year history.

See all News...


Recent Publications

OTClean: Data Cleaning for Conditional Independence Violations using Optimal Transport
Alireza Pirhadi, Mohammad Hossein Moslemi, Alexander Cloninger, Mostafa Milani, and Babak Salimi
2024 ; SIGMOD
Link to paper


How do Categorical Duplicates Affect ML? A New Benchmark and Empirical Analyses
Vraj Shah, Thomas Parashos, and Arun Kumar
2024 ; VLDB
Link to paper


Saturn: An Optimized Data System for Multi-Large-Model Deep Learning Workloads
Kabir Nagrecha and Arun Kumar
2024 ; VLDB
Link to paper


Consistent Range Approximation for Fair Predictive Modeling
Jiongli Zhu, Sainyam Galhotra, Nazanin Sabri, and Babak Salimi
2023 ; VLDB
Link to paper


Causal Data Integration
Brit Youngmann, Michael Cafarella, Babak Salimi, and Anna Zeng
2023 ; VLDB
Link to paper


Lotan: Bridging the Gap between GNNs and Scalable Graph Analytics Engines
Yuhao Zhang and Arun Kumar
2023 ; VLDB
Link to paper


NEXUS: On Explaining Confounding Bias
Brit Youngmann, Michael Cafarella, Yuval Moskovitch, and Babak Salimi
2023 ; SIGMOD
Link to paper


On Explaining Confounding Bias
Brit Youngmann, Michael Cafarella, Babak Salimi, and Yuval Moskovitch
2023 ; ICDE
Link to paper


Causal What-If and How-To Analysis Using Hyper
Fangzhu Shen, Kayvon Heravi, Oscar Gomez, Sainyam Galhotra, Amir Gilad, Sudeepa Roy, and Babak Salimi
2023 ; ICDE
Link to paper


Database-Aware ASR Error Correction for Speech-to-SQL Parsing
Yutong Shao, Arun Kumar, and Ndapandula Nakashole
2023 ; IEEE ICASSP
Link to paper


See all publications...


Courses Offered by Database Lab Faculty

Graduate Courses

  • CSE 232A: Graduate Database Systems
  • CSE 232B: Database Systems: Advanced Topics and Implementation
  • CSE 233: Database Theory
  • CSE 234: Data Systems for Machine Learning

Undergraduate Courses

  • CSE 132A: Database System Principles
  • CSE 132B: Database System Applications
  • CSE 132C: Database System Implementation
  • DSC 100: Introduction to Data Management
  • DSC 102: Systems for Scalable Analytics

Seminars