Championing Open-source Development in Machine Learning
Open-source software development is a cornerstone of modern machine learning research. However, issues around the sustainability of long-lived projects, the reliability of research software, and proper academic acknowledgment of long-term maintenance and contributions are often overlooked. The CODEML workshop at ICML aims to identify and discuss strategies for successful and sustainable open-source development in machine learning, and to propose solutions to the challenges above. The workshop will also serve as a platform for academic and community recognition of the efforts of open-source contributors in the field. We will bring together machine learning researchers, engineers, industry practitioners, and software development experts. The program will feature invited talks, panel discussions with experts, and extended abstract submissions from open-source contributors in machine learning.
Important Dates
Milestone | Date |
---|---|
Submission Deadline | |
Acceptance Notification | June 9, 2025 |
Workshop Date | July 18, 2025 |
Location and Venue
Vancouver Convention Center
1055 Canada Pl, Vancouver, BC V6C 0C3, Canada
Call for Papers
We welcome submissions on open-source software within the context of machine learning research. We encourage all types of contributions, including research papers, position papers, technical reports, and retrospectives.
Suggested topics include:
- Submissions that describe a new open-source machine learning software library
- Submissions that explain new additions, significant bug fixes, or changes to an established library
- Submissions that explore a scientific result across different versions of an established library
- Submissions on the technical setup (e.g. CI, testing) and best practices for reproducibility (a minimal sketch of such a check appears below)
- Proposals for better workflows or incentives for open-source development and maintenance in ML
- Retrospectives on the development and maintenance of mature ML OSS packages
We especially encourage submissions on development practices, mature libraries, and other topics that have received little recognition from traditional academic venues.
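To make the CI/testing bullet above concrete, here is a minimal sketch of the kind of reproducibility check such a setup might run in CI. The toy least-squares model, function names, and tolerance are illustrative choices of ours, not drawn from any particular library or submission:

```python
# Reproducibility checks a CI pipeline might run (e.g. via pytest).
import numpy as np


def train_tiny_model(seed: int) -> np.ndarray:
    """Fit a least-squares line to synthetic data; a stand-in for real training."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=100)
    y = 3.0 * x + rng.normal(scale=0.1, size=100)
    # Closed-form least squares on [x, 1] recovers slope and intercept.
    X = np.stack([x, np.ones_like(x)], axis=1)
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights


def test_training_is_deterministic():
    """The same seed must give bit-identical results across runs."""
    assert np.array_equal(train_tiny_model(seed=0), train_tiny_model(seed=0))


def test_slope_is_recovered():
    """Guard against silent numerical drift, e.g. across library versions."""
    weights = train_tiny_model(seed=0)
    np.testing.assert_allclose(weights[0], 3.0, atol=0.05)
```

Running such tests on every commit catches both nondeterminism and silent numerical drift, the same concerns raised by the topic on scientific results across library versions.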
As our proceedings are non-archival, we will accept work that is under submission to, or has recently been accepted at, other publication venues (e.g. NeurIPS, JMLR OSS Track).
Submission Guidelines
We invite submissions of 4-page workshop papers (excluding references and appendices) that address any of the workshop themes. Submissions should use the ICML style file. We encourage authors to include relevant links to their projects wherever applicable (we recommend using Anonymous4OpenScience to hide identifying details). We discourage lengthy appendices, as reviewers are not required to read them.
Submissions will undergo a double-blind review process for relevance and adherence to ICML’s academic integrity standards. We recognize that it may be impossible for some submissions to be truly anonymous (e.g., a retrospective on a widely used library), so we ask authors to use their best judgment regarding potentially identifying details.
We aim to be inclusive while ensuring high-quality discussions that align with the workshop’s objectives. To that end, papers will be reviewed under the TMLR criteria:
- Correctness: Are the claims made in the submission supported by accurate, convincing and clear evidence?
- Audience: Would some individuals in the workshop’s audience be interested in the findings of this submission?
Accepted submissions will be presented during joint poster sessions and made publicly available as non-archival reports, allowing future submissions to archival conferences or journals.
Camera-Ready Submission
For the camera-ready version of accepted papers, please use this style file, which serves as a drop-in replacement for the ICML style file. Camera-ready papers may have five pages of main content (one page more than the submission limit).
Schedule
Time | Session |
---|---|
09:00 - 09:15 | Opening remarks |
09:15 - 09:45 | Invited talk: |
09:45 - 10:15 | Coffee break |
10:15 - 10:45 | Invited talk: |
10:45 - 11:15 | Invited talk: |
11:15 - 12:00 | Contributed talks / demos |
12:00 - 13:00 | Lunch and discussion |
13:00 - 13:30 | Invited talk: |
13:30 - 14:00 | Invited talk: |
14:00 - 14:15 | Contributed talks / demos |
14:15 - 15:00 | Poster session |
15:00 - 15:30 | Coffee break |
15:30 - 16:00 | Invited talk: |
16:00 - 16:55 | Panel discussion |
16:55 - 17:00 | Closing remarks |
Invited Speakers

Sara Hooker is a renowned ML researcher and a leader in AI fairness and interpretability, currently VP of Research at Cohere. She was previously a research scientist at Google Brain, focusing on training models that are not only accurate but also interpretable, fair, and robust. Sara is the founder and a current advisor of Delta Analytics, a nonprofit organization dedicated to bringing data science expertise to underserved communities. She uses her expertise and outreach to advocate for trustworthy, accessible, and equitable ML practices and to promote open research and collaboration.

Tri Dao is the chief scientist at Together.AI and an incoming assistant professor at Princeton University. He recently completed his Ph.D. at Stanford University, working with Christopher Ré and Stefano Ermon. Tri is a leading expert in machine learning and systems, with a focus on efficient training and long-range context. He has made significant contributions to the development of open-source tools and frameworks, including Mamba and FlashAttention.

Stella Biderman is a researcher at Booz Allen Hamilton and executive director at EleutherAI who specializes in natural language processing, ML interpretability, and AI ethics. She has contributed to the release of several open-source models such as GPT-NeoX, BLOOM, VQGAN-CLIP, and OpenFold. Her current research focuses on mechanistic interpretability and the learning dynamics of large language models. Stella is currently a lead contributor to the Pythia project for transformer interpretability.

Matt Johnson is a senior research scientist at Google, where he works on the development of open-source tools and frameworks for machine learning. He has made numerous contributions to numerical libraries used in machine learning. Matt was a founder and lead contributor of Autograd, a precursor to the widely used JAX library, and is now a key contributor to JAX.

Evan Shelhamer is an incoming assistant professor of computer science at UBC, a faculty member at the Vector Institute, and a senior research scientist at Google DeepMind. He has over ten years' experience in research and development for computer vision and machine learning and is an advocate for DIY science and open-source software. Most notably, Evan served as the lead developer of the Caffe deep learning framework from version 0.1 to 1.0.

Chris Rackauckas is the Director of Modeling and Simulation at Julia Computing, the lead developer of the SciML Open Source Software Organization, Co-PI of the Julia Lab at MIT and Director of Scientific Research at Pumas-AI. He is the lead developer of several major open-source packages within the Julia ecosystem, perhaps most notably DifferentialEquations.jl. Chris's research focuses on scientific machine learning, which aims to integrate domain-specific scientific models with data-driven approaches from machine learning to accelerate simulations.
Organizers

- Columbia University
- Zurich University of Applied Sciences
- Recursion
