Introducing llamafile - Mozilla Hacks

A special thanks to Justine Tunney of the Mozilla Internet Ecosystem (MIECO) program, who co-authored this blog post.

Today we’re announcing the first release of llamafile and inviting the open source community to participate in this new project.

llamafile lets you turn large language model (LLM) weights into executables.

Say you have a set of LLM weights in the form of a 4GB file (in the commonly-used GGUF format). With llamafile you can transform that 4GB file into a binary that runs on six OSes without needing to be installed.
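Before packaging a weights file, it can be handy to confirm that it really is GGUF. The snippet below is a minimal sketch, not part of llamafile: it relies only on the GGUF convention that a file begins with the four ASCII bytes `GGUF` followed by a little-endian 32-bit version number. The function name `read_gguf_version` is hypothetical and chosen for illustration.

```python
import struct

# GGUF files start with this 4-byte magic, followed by a uint32 version.
GGUF_MAGIC = b"GGUF"


def read_gguf_version(path):
    """Return the GGUF version number, or None if the file is not GGUF.

    Hypothetical helper for illustration; it inspects only the first
    eight bytes and does not validate the rest of the file.
    """
    with open(path, "rb") as f:
        header = f.read(8)
    # Too short, or the magic bytes do not match: not a GGUF file.
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return None
    # "<I" = little-endian unsigned 32-bit integer.
    (version,) = struct.unpack("<I", header[4:8])
    return version
```

Calling `read_gguf_version("model.gguf")` on a valid file returns its format version as an integer; any other file yields `None`.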

This makes it dramatically easier to distribute and run LLMs. It also means that as models and their weight formats continue to evolve over time, llamafile gives you a way to ensure that a given set of weights will remain usable and perform consistently and reproducibly, forever.

We achieved all this by combining two projects that we love: llama.cpp (a leading open source LLM chatbot framework) and Cosmopolitan Libc (an open source project that enables C programs to be compiled and run on a large number of platforms and architectures). It also required solving several interesting and juicy problems along the way, such as adding GPU and dlopen() support to Cosmopolitan; you can read more about them in the project’s README.

This first release of llamafile is a product of Mozilla’s innovation group and was developed by Justine Tunney, the creator of Cosmopolitan. Justine has recently been collaborating with Mozilla via MIECO, and through that program Mozilla funded her work on the 3.0 release (Hacker News discussion) of Cosmopolitan. With llamafile, Justine is excited to be contributing more directly to Mozilla projects, and we’re happy to have her involved.

llamafile is licensed under Apache 2.0, and we encourage contributions. Our changes to llama.cpp are licensed under MIT (the same license llama.cpp itself uses) to facilitate any potential future upstreaming. We’re all big fans of llama.cpp around here; llamafile wouldn’t have been possible without it and Cosmopolitan.

We hope llamafile is useful to you and look forward to your feedback.

Stephen leads open source AI projects (including llamafile) in Mozilla Builders. He previously managed social bookmarking pioneer del.icio.us; co-founded Storium, Blockboard, and FairSpin; and worked on Yahoo Search and BEA WebLogic.
