Championing Open-source DEvelopment in Machine Learning
Open-source software development is a cornerstone of modern machine learning research. However, issues around the long-term sustainability of projects, the reliability of software, and proper academic acknowledgment of maintenance and contributions are often overlooked. This workshop aims to identify and discuss strategies for successful and sustainable open-source development in machine learning and to propose solutions to these challenges. It will also serve as a platform for academic and community recognition of open-source contributors in the field. We will bring together machine learning researchers, engineers, industrial practitioners, and software development experts. The workshop will feature invited talks, panel discussions with experts, and extended abstract submissions from open-source contributors in machine learning.
Important Dates
Event | Date |
---|---|
Submission Deadline | May 19th, 2025, 11:59 PM AoE |
Acceptance Notification | June 9th, 2025, 11:59 PM AoE |
Workshop | July 18th or 19th, 2025 |
Call for Papers
We invite high-quality extended abstract submissions on open-source software development in machine learning and adjacent fields. Example topics include (non-exhaustive):
- Submissions that describe a new open-source machine learning software library
- Submissions that explain a new addition, significant bug fixes, or changes to an established library
- Submissions that explore a scientific result across different versions of an (established) library
- Submissions on the technical setup (e.g., CI, testing) and best practices for reproducibility (see the sketch after this list)
- Proposals for better workflows or incentives for open-source development and maintenance in ML
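As an illustration of the CI/testing item above, here is a minimal sketch of one common reproducibility practice: a seed-pinned determinism test that CI can run on every commit. The `train_step` function is a hypothetical stand-in, not the API of any particular library.

```python
# Minimal sketch of a determinism check; train_step is a hypothetical
# stand-in for a training step from the library under test.
import numpy as np


def train_step(rng: np.random.Generator) -> float:
    """Stand-in training step; returns a scalar loss."""
    weights = rng.normal(size=10)
    return float(np.sum(weights**2))


def test_train_step_is_deterministic():
    # Re-seeding must reproduce the result bit-for-bit; running this
    # in CI on every commit catches accidental nondeterminism.
    loss_a = train_step(np.random.default_rng(0))
    loss_b = train_step(np.random.default_rng(0))
    assert loss_a == loss_b
```

In practice, projects pair such tests with pinned dependency versions so that a CI failure points to a code change rather than environment drift.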
Submissions
Accepted submissions will be presented during joint poster sessions and made publicly available as non-archival reports, allowing later submission to archival conferences or journals.
Submissions must be anonymous, in ICML format, and no longer than four pages, excluding references, acknowledgments, and supplementary material. Long appendices are permitted but strongly discouraged, and reviewers are not required to read them. The review process is double-blind.
Authors may be asked to review other workshop submissions.
Schedule
Time | Session |
---|---|
09:00 - 09:15 | Opening remarks |
09:15 - 09:45 | Invited talk: |
09:45 - 10:15 | Invited talk: |
10:15 - 11:00 | Coffee and breakout sessions |
11:00 - 11:30 | Invited talk: |
11:30 - 12:00 | Contributed talks/demos |
12:00 - 13:15 | Lunch and discussion |
13:15 - 13:45 | Invited talk: |
13:45 - 14:15 | Lightning talks/demos |
14:15 - 15:00 | Coffee and breakout sessions |
15:00 - 15:30 | Invited talk: |
15:30 - 16:30 | Poster session |
16:30 - 17:00 | Invited talk: |
17:00 - 17:55 | Panel discussion |
17:55 - 18:00 | Closing remarks |
Invited Speakers

Sara Hooker is a renowned ML researcher and leader in AI fairness and interpretability, and currently VP of Research at Cohere. She was previously a research scientist at Google Brain, focusing on training models that are not only accurate but also interpretable, fair, and robust. Sara is the founder and a current advisor of Delta Analytics, a nonprofit organization dedicated to bringing data science expertise to underserved communities. She uses her expertise and outreach to advocate for trustworthy, accessible, and equitable ML practices and to promote open research and collaboration.

Tri Dao is Chief Scientist at Together AI and an incoming assistant professor at Princeton University. He recently completed his Ph.D. at Stanford University, working with Christopher Ré and Stefano Ermon. Tri is a leading expert in machine learning and systems, with a focus on efficient training and long-range context. He has made significant contributions to open-source tools and frameworks, including Mamba and FlashAttention.

Stella Biderman is a researcher at Booz Allen Hamilton and executive director of EleutherAI who specializes in natural language processing, ML interpretability, and AI ethics. She has contributed to the release of several open-source generative models, such as GPT-NeoX, BLOOM, VQGAN-CLIP, and OpenFold. Her current research focuses on mechanistic interpretability and the learning dynamics of large language models. Stella is a lead contributor to the Pythia project for transformer interpretability.

Matt Johnson is a senior research scientist at Google, where he works on open-source tools and frameworks for machine learning. He has made numerous contributions to numerical libraries used in machine learning. Matt was a founder and lead contributor of Autograd, a precursor to the widely used JAX library, and is now a key contributor to JAX.

Evan Shelhamer is an incoming assistant professor of computer science at UBC, a faculty member at the Vector Institute, and a senior research scientist at Google DeepMind. He has over ten years of experience in research and development for computer vision and machine learning and is an advocate for DIY science and open-source software. Most notably, Evan served as the lead developer of the Caffe deep learning framework from version 0.1 to 1.0.

Chris Rackauckas is the Director of Modeling and Simulation at Julia Computing, the lead developer of the SciML Open Source Software Organization, Co-PI of the Julia Lab at MIT and Director of Scientific Research at Pumas-AI. He is the lead developer of several major open-source packages within the Julia ecosystem, perhaps most notably DifferentialEquations.jl. Chris's research focuses on scientific machine learning, which aims to integrate domain-specific scientific models with data-driven approaches from machine learning to accelerate simulations.
Organizers

Columbia University

Zurich University of Applied Sciences

Recursion

University of Tübingen