C4AI Newsletter - September 2024

🎉 At ACL 2024 in Bangkok, Thailand, our work Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model was recognized with a Best Paper Award! A huge congratulations to all the authors and Aya collaborators worldwide on this well-earned recognition of their hard work.

🚀 Congrats to everyone who participated in Expedition Aya, our global open-build challenge. Over the past 6 weeks, 180 researchers came together in 28 teams to push the boundaries of multilingual AI. Watch the recording of the Closing Ceremony here, which includes lightning talks from all teams and judging by Sebastian Ruder, Marzieh Fadaee, and Alice Schoenauer Sebag.

📣 Announcing C4AI Research Office Hours - a new avenue to support ML researchers at every stage of their journey. Join us for help with a blocker in your research, brainstorming ideas, implementing C4AI’s research models and datasets, or staying accountable as you work on an independent project. These biweekly sessions, hosted by C4AI Open Science MLE Alejandro Salamanca, begin Sept 11 and are available to members of our open-science community. If you haven’t yet, join us!

🎤 On Friday, September 6th, join Aidan Peppin, C4AI’s Policy & Responsible AI Lead, as he sits down with Dr. Sasha Luccioni, David Kanter, Deval Pandya, and Sara Hooker for our AI & Technical Governance series on “Efficient AI Models.” Register here!

Synthetic data plays a crucial role in training models. Traditionally, a single teacher model imparts knowledge to a student model, on the assumption that the teacher excels across all tasks. However, this assumption often falls short in multilingual contexts. In our latest work, Multilingual Arbitrage: Optimizing Data Pools to Accelerate Multilingual Progress, we introduce "multilingual arbitrage," a method that strategically samples from multiple teacher models to enhance multilingual capabilities. Work led by Ayomide Odumakinde, Daniel D’souza, Pat Verga, Beyza Ermis, and Sara Hooker.

We introduce BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts. Training MoEs from scratch is hard and expensive, so many have instead explored “upcycling” dense expert models into MoEs. In this work, we argue that upcycling attention experts is crucial for MoEs, and we introduce BAM as a new MoE upcycling method. Work by Qizhen (Irene) Zhang, Nikolas Gritsch, Dwaraknath Gnaneshwar, Simon Guo, David Cairuz, Bharat Venkitesh, Jakob Foerster, Phil Blunsom, Sebastian Ruder, Ahmet Üstün, and Acyr Locatelli.

With Nexus, we introduce adaptive routing in which the model learns to project expert embeddings from domain representations. This allows for a high degree of specialization and makes it possible to flexibly add new experts, in the form of newly trained dense models, after the initial upcycling. Work by Nikolas Gritsch, Qizhen Zhang, Acyr Locatelli, Sara Hooker, and Ahmet Üstün.

We introduce “How Does Quantization Affect Multilingual LLMs?”, in which we conduct a thorough analysis of quantized multilingual LLMs, focusing on their performance across languages and at varying scales. Work led by Kelly Marchisio, with Saurabh Dash, Hongyu Chen, Dennis Aumiller, Ahmet Üstün, Sara Hooker, and Sebastian Ruder.

In this work, we address the rising energy and environmental cost of the artificial-intelligence boom, which is fuelling growing concern. We argue that green policy mechanisms that already exist offer a path towards a solution. Work by Sasha Luccioni, Boris Gamazaychikov, Sara Hooker, Régis Pierrard, Emma Strubell, Yacine Jernite, and Carole-Jean Wu.

This month we are spotlighting Sunitha Selvan, one of the co-leads of our Open Science Community’s AI Safety & Alignment group. Sunitha is a Research Engineer at Patronus AI. You can catch up with her on Twitter at @sunitha_selvan.

Guest Speaker Events

In addition to our guest speaker events, our community-led sub-field groups have several sessions scheduled this month. Be sure to check out all of our open science programs.

Join our open science community to see a full list of all upcoming events.

Cohere For AI