Presenters

Shichang Zhang (Harvard University) is a postdoctoral fellow at Harvard University. He holds a Ph.D. in Computer Science from the University of California, Los Angeles. His research focuses on AI interpretability, with particular expertise in explainable AI, data attribution, and mechanistic interpretability for large language models (LLMs). His work aims to understand complex feature interactions and model internal mechanisms to develop more explainable and efficient AI systems, with applications spanning science, healthcare, and public policy. His recent contributions include developing a unified attribution framework and advancing cross-domain methodologies. His work has appeared in leading venues including NeurIPS, ICML, and ICLR, and has been highlighted in a Nature News Feature for its educational impact. He also brings valuable experience organizing workshops at NeurIPS. Website: https://shichangzh.github.io/

Himabindu Lakkaraju (Harvard University) is an Assistant Professor at Harvard University with appointments in the Business School and the Department of Computer Science. Her research interests lie in the broad area of the algorithmic foundations and societal implications of trustworthy AI. Specifically, she develops machine learning and optimization techniques as well as evaluation frameworks to improve the safety, interpretability, fairness, privacy, and reasoning capabilities of predictive and generative models, including large language models (LLMs). Her recent research focuses on exposing the vulnerabilities of various explanation methods and making them more robust. She has also been working with domain experts to understand the real-world consequences of misleading explanations. She has given tutorials at NeurIPS 2020 and invited talks at various workshops at ICML, NeurIPS, CVPR, and other top venues. Website: https://himalakkaraju.github.io/

Julius Adebayo (Guide Labs) is a cofounder of Guide Labs, where he builds interpretable AI systems that humans and domain experts can easily audit, steer, and understand. He received his Ph.D. in Computer Science from MIT, where he worked on developing and understanding approaches that seek to make machine learning-based systems reliable when deployed. More broadly, he is interested in rigorous approaches to developing models that are robust to spurious associations and distribution shifts, and that align with 'human' values. His recent work examines and assesses popular neural network feature-relevance techniques to evaluate how faithful these methods are to the model being interpreted. Website: https://juliusadebayo.com/