About
Interpretability in machine learning revolves around constructing models that are inherently transparent and insightful for human end users. As machine learning models grow in scale and their applications expand across diverse fields, the need for interpretable models is more crucial than ever. Interpretability matters most where decisions carry substantial real-world consequences for human lives, as in healthcare, criminal justice, and lending, where understanding the machine learning process is essential. Interpretability can aid in auditing, verification, debugging, and bias detection, help ensure safety, and align models more effectively with human intentions. Post-hoc explanations may be unfaithful, and therefore unreliable, in some applications, which is why it is essential to design inherently interpretable models that provide truthful and complete explanations by default. Motivated by this, researchers have studied interpretability, resulting in a spectrum of distinct approaches.
On one end of the spectrum, classical interpretability methods designed for small-scale and tabular datasets often use rule-based models (e.g., decision trees, risk scores) and linear models (e.g., sparse linear models, generalized linear models) that are deemed inherently transparent. On the other end, modern interpretability methods for large-scale foundation models incorporate interpretable components into deep neural networks that are not themselves fully interpretable, spawning novel research areas such as mechanistic interpretability.
In this workshop, we aim to connect researchers working on different sub-fields of interpretability, such as rule-based interpretability, attribution-based interpretability, mechanistic interpretability, applied interpretable ML for various domains (e.g., healthcare, earth and materials sciences, physics), and AI regulation. We will pose several key questions to foster discussion and insights:
- What interpretability approaches are best suited for large-scale models and foundation models?
- How can we incorporate domain knowledge and expertise when designing interpretable models?
- How can we assess the quality and reliability of interpretable models?
- How should we choose between different interpretable models?
- When is it appropriate to use interpretable models or post-hoc explainability methods?
- What are the inherent limitations of interpretability, and how can we address them?
- What are the diverse applications of interpretability across different domains?
- What will the future landscape of interpretability entail?
- Is there a legal need for interpretable models, and when should they be enforced?
Dates
Note: All deadlines are 11:59PM UTC-12:00 Anywhere on Earth (AoE).
Paper Submission
- Submissions Open on OpenReview: August 9, 2024
- Submission Deadline: August 30, 2024
- Notification of Acceptance: October 9, 2024
- Camera-ready Deadline: November 15, 2024
Workshop Event
Date: December 15, 2024
Schedule
To be announced
Speakers
Cynthia Rudin
Distinguished Professor
Duke University
Rich Caruana
Senior Principal Researcher
Microsoft Research
Tong Wang
Assistant Professor of Marketing
Yale University
Neel Nanda
Lead of Mechanistic Interpretability
Google DeepMind
Organisers
Suraj Srinivas
Postdoctoral research fellow at Harvard University; his research focuses on developing the foundations for interpretable machine learning
Michal Moshkovitz
Machine Learning Research Scientist at Bosch Research; her work focuses on developing the foundations of explainable machine learning
Chhavi Yadav
PhD student at UCSD; her interests lie in XAI, secure verification, auditing, and the societal impacts of deep generative models
Lesia Semenova
Postdoctoral researcher at Microsoft Research; her research focuses mainly on interpretable machine learning and AI in healthcare
Nave Frost
Research Scientist at eBay Research; his research interests focus on supplying explanations for data science applications
Valentyn Boreiko
PhD student at the University of Tübingen; his research focuses on developing interpretability techniques for vision classifiers
Vinayak Abrol
Assistant Professor at IIIT Delhi; his research focuses on the design and analysis of numerical algorithms for information-inspired applications
Bitya Neuhof
PhD student at the Hebrew University of Jerusalem, exploring the stability and reliability of explainable AI methods
Dotan Di Castro
Research scientist and lab manager at Bosch Research; his research focuses on reinforcement learning and computer vision
Kamalika Chaudhuri
Associate Professor at UCSD and Research Scientist at Meta AI; her research interests lie in the foundations of trustworthy machine learning
Hima Lakkaraju
Assistant Professor at Harvard University who focuses on the algorithmic and applied aspects of explainability, fairness, robustness, and privacy of machine learning models
Contact information
- Email: interpretable.ai.neurips.workshop [AT] gmail.com