Interpretable AI: Past, Present and Future

NeurIPS 2024 Workshop


About

Interpretability in machine learning revolves around constructing models that are inherently transparent and insightful for human end users. As the scale of machine learning models increases and the range of applications expands across diverse fields, the need for interpretable models is more crucial than ever. The significance of interpretability becomes particularly evident in scenarios where decisions carry substantial real-world consequences and influence human lives, such as healthcare, criminal justice, and lending, where understanding the machine learning process is essential. Interpretability can aid in auditing, verification, debugging, and bias detection, help ensure safety, and align models more effectively with human intentions. Post-hoc explanations may be unfaithful and therefore unreliable in some applications, which is why it is essential to design inherently interpretable models that provide truthful and complete explanations by default. Motivated by this, researchers have studied interpretability from many angles, resulting in a spectrum of distinct approaches.

On one end of the spectrum, classical interpretability methods designed for small-scale and tabular datasets often use rule-based models (e.g., decision trees, risk scores) and linear models (e.g., sparse linear models, generalized linear models) that are deemed inherently transparent. On the other end, modern interpretability methods for large-scale foundation models incorporate interpretable components into deep neural networks that are not themselves fully interpretable, spawning novel research areas such as mechanistic interpretability.

In this workshop, we aim to connect researchers working on different sub-fields of interpretability, such as rule-based interpretability, attribution-based interpretability, mechanistic interpretability, applied interpretable ML in various domains (e.g., healthcare, earth and material sciences, physics), and AI regulation. We will pose several key questions to foster discussion and insights:

  • What interpretability approaches are best suited for large-scale models and foundation models?
  • How can we incorporate domain knowledge and expertise when designing interpretable models?
  • How can we assess the quality and reliability of interpretable models?
  • How should we choose between different interpretable models?
  • When is it appropriate to use interpretable models versus post-hoc explainability methods?
  • What are the inherent limitations of interpretability, and how can we address them?
  • What are the diverse applications of interpretability across different domains?
  • What will the future landscape of interpretability entail?
  • Is there a legal need for interpretable models, and when should they be enforced?

Dates

Note: All deadlines are 11:59 PM UTC-12:00, Anywhere on Earth (AoE).

Paper Submission

  • Submissions open on OpenReview: August 9, 2024
  • Submission Deadline: August 30, 2024
  • Notification of Acceptance: October 9, 2024
  • Camera-ready Deadline: November 15, 2024

Workshop Event

Date: December 15, 2024

Schedule

To be announced

Speakers

Organizers

Suraj Srinivas

Postdoctoral research fellow at Harvard University, his research focuses on developing the foundations for interpretable machine learning

Michal Moshkovitz

Machine Learning Research Scientist at Bosch Research, her research focuses on developing the foundations of explainable machine learning

Chhavi Yadav

PhD student at UCSD, her interests lie in XAI, secure verification, auditing, and the societal impacts of deep generative models

Lesia Semenova

Postdoctoral researcher at Microsoft Research, her research focuses mainly on interpretable machine learning and AI in healthcare.

Nave Frost

Research Scientist at eBay Research, his research focuses on providing explanations for data science applications

Valentyn Boreiko

PhD student at the University of Tübingen, his research focuses on the development of interpretability techniques for vision classifiers

Vinayak Abrol

Assistant Professor at IIIT Delhi, his research focuses on the design and analysis of numerical algorithms for information-inspired applications

Bitya Neuhof

PhD student at the Hebrew University of Jerusalem, exploring the stability and reliability of explainable AI methods

Dotan Di Castro

Research scientist and lab manager at Bosch Research, his research focuses on reinforcement learning and computer vision

Kamalika Chaudhuri

Associate Professor at UCSD and a Research Scientist at Meta AI, her research interests lie in the foundations of trustworthy machine learning

Hima Lakkaraju

Assistant Professor at Harvard University, her research focuses on the algorithmic and applied aspects of explainability, fairness, robustness, and privacy of machine learning models

Contact information

  • Email: interpretable.ai.neurips.workshop [AT] gmail.com

Sponsors

Organizers’ Institutions