Database Lab Research Seminar


If you are interested in database research, you are welcome to join us in our weekly database seminar. Every week a local or visiting researcher gives a talk on their research topic.

Location/Time: In Fall 2020, the seminar is held on Fridays 3:00-4:00pm PT on Zoom and hosted by Dr. Arun Kumar. The Zoom link will be emailed to the DBTalks mailing list.

Course Number: If you are a student and plan to attend, please enroll in CSE 239A for 1 credit.

Mailing List: Announcements about the seminar are sent to the DBTalks mailing list. To subscribe, please submit this Google Form.

DB Seminars

  • Programmatically Building & Managing Training Data with Snorkel

    Dr. Alex Ratner (UW-Seattle and Snorkel AI)
    One of the key bottlenecks in building machine learning systems is creating and managing the massive training datasets that today’s models require. In this talk,...
  • Responsible Data Management

    Dr. Julia Stoyanovich (NYU)
    The need for responsible data management intensifies with the growing impact of data on society. One central locus of the societal impact of data are...
  • Systems for Human Data Interaction

    Dr. Eugene Wu (Columbia University)
    The rapid democratization of data has placed its access and analysis in the hands of the entire population. While the advances in rapid and large-scale...
  • Elements of Learning Systems

    Dr. Tianqi Chen (CMU and OctoML)
    Data, models, and computing are the three pillars that enable machine learning to solve real-world problems at scale. Making progress on these three domains requires...
  • Interpretable Data Analysis with Explanations and Causality

    Dr. Sudeepa Roy (Duke University)
    In current times, data is considered synonymous with knowledge, profit, power, and entertainment, requiring development of new techniques to extract useful information and insights from...
  • The Socio-Technical Phenomena of Data Integration and Knowledge Graph

    Dr. Juan Sequeda (Data.World)
    Data Integration has been an active area of computer science research for over two decades. A modern manifestation is as Knowledge Graphs which integrates not...
  • An Exabyte scale global data infrastructure for CMS@LHC

    Prof. Frank Wuerthwein (UCSD Physics and HDSI)
    The science program at the Large Hadron Collider at CERN is preparing to produce, distribute, and access an Exabyte of new data per year starting...
  • Grouped Learning: Group-By Machine Learning Model Selection Workloads

    Side Li (UCSD CSE)
    ML practitioners routinely build separate models for data subsets based on some specified attribute(s), e.g., one model per state. We call this practice “ML over...
  • Vista: An End-to-end Declarative Transfer Learning System for Multimodal Analytics with Deep Neural Networks

    Advitya Gemawat (UCSD HDSI)
    Scalable systems for ML are largely siloed into dataflow systems for structured data and DL systems for unstructured data. This gap has left workloads that...
  • Cerebro: A Layered Data Platform for Scalable Deep Learning

    Yuhao Zhang and Supun Nakandala (UCSD CSE)
    Deep learning (DL) is gaining popularity across myriad domains due to the new ubiquity of unstructured data, tools such as TensorFlow, and easier access to...
  • Panorama: A Data System for Unbounded Vocabulary Querying over Video

    Yuhao Zhang
    Deep convolutional neural networks (CNNs) achieve state-of-the-art accuracy for many computer vision tasks. But using them for video monitoring applications incurs high computational cost and...
  • SpeakQL: Towards Speech-driven Multimodal Querying of Structured Data

    Vraj Shah
    Speech-driven querying is becoming popular in new device environments such as smartphones, tablets, and even conversational assistants. However, such querying is largely restricted to natural...
  • Towards Scalable Hybrid Stores: Constraint-Based Rewriting to the Rescue

    Rana Alotaibi
    Big data applications increasingly involve diverse datasets, conforming to different data models. Such datasets are routinely hosted in heterogeneous stores, each capable of handling one...
  • Vista: Declarative Feature Transfer from Deep CNNs at Scale

    Supun Nakandala
    Scalable systems for machine learning (ML) are largely siloed into dataflow systems for structured data and deep learning systems for unstructured data. This gap has...
  • Knowledge Graph Use Cases at Intuit

    Jay Yu (Intuit)
    Intuit, the leading financial software/service company behind TurboTax, Mint and Quickbooks, is embarking on a multi-year transformational journey into an AI-driven Expert Platform to help...
  • From Data to Models and Back: Experiences from Google's Production ML Pipelines

    Alkis Polyzotis (Google Research)
    Building a good ML model requires good input data. Conversely, debugging a model inevitably involves data debugging and understanding. In this talk, I will present...
  • Cloudy with high chance of DBMS: A 10-year prediction for Enterprise-Grade ML

    Carlo Curino and Kostas Karanasos (Microsoft Jim Gray Systems Lab)
    Machine learning (ML) has proven itself in high-value web applications such as search ranking and is emerging as a powerful tool in a much broader...
  • Analysis of data-driven workflows

    Prof. Victor Vianu
    Software systems centered around databases have become pervasive in a wide variety of applications, including health-care management, e-commerce, business processes, scientific workflows, and e-government. Such...
  • Automatic Verification of Database-powered Workflows

    Alessandro Gianola (Visiting PhD student from Free University of Bolzano)
    During the last two decades, a huge body of research has been dedicated to the challenging problem of reconciling data and process management within contemporary...