Today, data is considered synonymous with knowledge, profit, power, and entertainment, which calls for new techniques to extract useful information and insights from it. In this talk, I will describe some concepts and techniques in interpretable data analysis from the viewpoint of a database researcher. First, I will talk about our work on explaining query answers in terms of “intervention,” or how changes in the input data change the output of a query, and “context,” or how input data not contributing to the answers of interest can help explain them. Then I will talk about true causal inference from observational data without randomized controlled experiments, and how database techniques can help with causal inference for large, complex data.
Speaker bio:
Sudeepa Roy is an Assistant Professor in Computer Science at Duke University. She works in the area of databases, with a focus on foundational aspects of big data analysis, which includes causality and explanations for big data, data provenance, probabilistic databases, and applications of database techniques in other domains. Prior to Duke, she did a postdoc at the University of Washington, and obtained her Ph.D. from the University of Pennsylvania. She has served on the program committees of a number of premier conferences and workshops including SIGMOD, VLDB, PODS, and ICDT. She is a recipient of an NSF CAREER Award and a Google PhD Fellowship in Structured Data.