The Database Lab at UC San Diego is one of the leading academic research groups in the field of
data management, spanning the major themes of theory, systems, languages, interfaces, and applications,
as well as intersections with other data-oriented fields.
Areas of particular strength include database theory, data integration, semistructured data, heterogeneous
data, hardware-conscious data processing, query processing and optimization, data exploration, data analytics,
and machine learning systems.
Application areas of particular interest have included healthcare, Internet of Things, and social media.
We are a part of the Computer Science and Engineering department.
DB Lab faculty are also affiliated with other research groups, including the Theory Group, Artificial Intelligence Group, and Center for Networked Systems, as well as with the Halicioglu Data Science Institute.
Alin's interests include data publishing and integration, specification and verification of database-powered business processes, and semistructured and XML data. His research is currently sponsored by the NSF.
Arun's research interests are in data management and systems for machine learning-based data analytics. His research focuses on designing abstractions, algorithms, and systems that make it easier and faster to analyze large and complex datasets using machine learning.
Yannis' research extends the capabilities of data platforms and query processors. He has published over 100 research articles with more than 14,000 citations.
Victor's research interests are in database systems and theory. He has been focusing on verification of database-driven systems, an area at the boundary of databases and computer-aided verification. His current research focuses on automatic verification of interactive data-driven Web services and business processes. He is also interested in the theory of query languages and computational logic.
Kamalika's research interests lie in the area of machine learning. Specifically, much of her work is on privacy-preserving machine learning and unsupervised learning, but she's also broadly interested in a number of topics in learning theory, such as confidence-rated prediction, online learning, and active learning.
Babak's research interests are in data management, causal inference, responsible data science and data ethics. His research seeks to unify techniques from theoretical data management, causal inference and machine learning to develop the necessary conceptual foundations for decision-making and policy evaluation from complex relational data, algorithmic fairness, explainability and accountability.
Rana's research interests lie in hybrid-store databases, large-scale data integration, and scalable self-tuning database systems.
Side's research interests lie in systems for machine learning. He also loves unsweetened boba milk tea.
Supun's research interest lies broadly at the intersection of machine learning and systems, an emerging area also known as machine learning systems. His current work involves implementing efficient, scalable, and reliable algorithms, systems, and abstractions for deep learning-powered machine learning workloads.
Ainur's research interests lie at the intersection of the theory and applications of databases and big data management systems. Currently, she is focused on data processing for massively parallel systems, more specifically, parallel computation of queries over large graphs and their complexity.
Vraj's research interest lies in accelerating the advanced data analytics life cycle by simplifying data sourcing for machine learning (ML) and making it easier and cheaper to deploy ML-powered data analytics applications, improving both the efficiency and usability of ML for data science.
Yuhao's research interests lie within the field of machine learning systems, including systems powered by applied ML that enable novel applications, and systems designed for ML to make data science easier and faster.
Xiuwen's research interests lie in heterogeneous data management, polystore systems, query optimization, and AI in databases.
Ilkay Altintas is the Director for the Center of Excellence in Workflows for Data Science at the San Diego Supercomputer Center (SDSC), UCSD.
Director, Center for Large-scale Data Systems research (CLDS), clds.sdsc.edu; Director, Advanced Cyberinfrastructure Development Group (ACID).
Amarnath Gupta is a Research Scientist at the San Diego Supercomputer Center (SDSC) of the University of California San Diego.
December 2020
Kabir receives an Honorable Mention for the CRA Outstanding Undergraduate Researcher Award; link to announcement.
August 2020
Supun, Arun, and Yannis receive an ACM SIGMOD Research Highlight Award; link to SIGMOD Rec. special edition.
July 2020
Arun receives a VMware Early Career Faculty Grant award.
June 2020
Arun receives an NSF CAREER Award; link to award abstract.
March 2020
Arun receives a Google Faculty Research Award; link to announcement.
An end-to-end data system for sourcing data and features, as well as specifying, optimizing, and managing the ML model selection process.
Deep learning-powered database perception to enable data systems to see and hear unstructured data for unified type-agnostic analytics.
A declarative, rapid development framework for data-driven Ajax reports and applications. Rich visualizations and collaborative workflows require only a few lines of SQL-based code and visualization/interaction markup.
Cerebro: A Layered Data Platform for Scalable Deep Learning
Arun Kumar, Supun Nakandala, Yuhao Zhang, Side Li, Advitya Gemawat, and Kabir Nagrecha
2021 ; CIDR
Cerebro: A Data System for Optimized Deep Learning Model Selection
Supun Nakandala, Yuhao Zhang, and Arun Kumar
2020 ; VLDB
Panorama: A Data System for Unbounded Vocabulary Querying over Video
Yuhao Zhang and Arun Kumar
2020 ; VLDB
Understanding and Benchmarking the Impact of GDPR on Database Systems
Supreeth Shastri, Vinay Banakar, Melissa Wasserman, Arun Kumar, and Vijay Chidambaram
2020 ; VLDB
Query Optimization for Faster Deep CNN Explanations
Supun Nakandala, Arun Kumar, and Yannis Papakonstantinou
2020 ; ACM SIGMOD Record
Incremental and Approximate Computations for Accelerating Deep CNN Inference
Supun Nakandala, Kabir Nagrecha, Arun Kumar, and Yannis Papakonstantinou
2020 ; ACM TODS
Vista: Optimized System for Declarative Feature Transfer from Deep CNNs at Scale
Supun Nakandala and Arun Kumar
2020 ; ACM SIGMOD
SpeakQL: Towards Speech-driven Multimodal Querying of Structured Data
Vraj Shah, Side Li, Arun Kumar, and Lawrence Saul
2020 ; ACM SIGMOD
Aggregation Support for Modern Graph Analytics in TigerGraph
Alin Deutsch, Yu Xu, Mingxi Wu, and Victor E. Lee
2020 ; ACM SIGMOD
Towards Scalable Hybrid Stores: Constraint-Based Rewriting to the Rescue
Rana Alotaibi, Damian Bursztyn, Alin Deutsch, Ioana Manolescu, and Stamatis Zampetakis
2019 ; ACM SIGMOD