Elon Musk’s xAI released its Grok large language model as “open source” over the weekend. The billionaire clearly hopes to pit his company against its rival OpenAI, which despite its name is not particularly open. But does publishing the code of something like Grok actually contribute to the AI development community? Yes and no.
Grok is a chatbot trained by xAI to play the same loosely defined role as something like ChatGPT or Claude: you ask it, it answers. This LLM, however, was given a bold tone and additional access to Twitter data as a way to differentiate it from the rest.
As always, these systems are nearly impossible to evaluate, but the general consensus seems to be that it is competitive with state-of-the-art midsize models like GPT-3.5. (Whether you find this impressive given the short development timeline, or disappointing given the budget and bombast surrounding xAI, is entirely up to you.)
In any case, Grok is a modern, functional LLM of significant size and capacity, and the more access the developer community has to the innards of such things, the better. The problem is defining “open” in a way that does more than allow a company (or a billionaire) to claim the moral high ground.
This is not the first time the terms “open” and “open source” have been questioned or abused in the world of AI. And we’re not just talking about a technical objection, like choosing a usage license that’s not as open as another (Grok is Apache 2.0, if you’re wondering).
To start with, AI models differ from other software when it comes to making them “open source.”
If you’re building, say, a word processor, making it open source is relatively simple: you publish all your code publicly and let the community propose improvements or make their own version. Part of what makes open source valuable is that every aspect of the application is original or attributed to its original creator; this transparency and adherence to correct attribution are not just byproducts, but are fundamental to the very concept of openness.
With AI, this is arguably not possible at all, because the way machine learning models are created involves a largely unknowable process in which an enormous amount of training data is distilled into a complex statistical representation whose structure no human really directed, or could even fully explain. This process cannot be inspected, audited, and improved the way traditional code can, so while it still has immense value in one sense, it can never really be open-sourced. (The standards community has not even defined what “open” will mean in this context, but is actively discussing it.)
That hasn’t stopped AI developers and companies from designating their models as “open,” a term that has lost much of its meaning in this context. Some call their model “open” if there is a public-facing interface or API. Some call it “open” if they publish a paper describing the development process.
Arguably, the closest an AI model can get to “open source” is when its developers release its weights, that is, the exact attributes of the countless nodes of its neural networks, which perform vector math operations in a precise order to complete the pattern started by a user’s input. But even “open weights” models like LLaMa-2 exclude other important data, such as the training dataset and process, which would be necessary to recreate the model from scratch. (Some projects go further, of course.)
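To make that distinction concrete, here is a minimal sketch in Python of what an open-weights release boils down to: a bag of numeric arrays that parameterize the forward pass, with nothing about how they were produced. (The file name and array names are hypothetical, purely for illustration.)

```python
import numpy as np

# An "open weights" release is essentially a bag of numeric arrays like
# these. Loading them tells you nothing about the training data or the
# process that produced the numbers. (File name is hypothetical.)
weights = np.load("model_weights.npz")

def forward(x: np.ndarray) -> np.ndarray:
    """One toy feed-forward layer: the 'vector math in precise order'
    that the published weights parameterize."""
    h = x @ weights["w1"] + weights["b1"]   # linear projection
    h = np.maximum(h, 0.0)                  # ReLU nonlinearity
    return h @ weights["w2"] + weights["b2"]

# Having the arrays lets you run, fine-tune, or distill the model, but
# you cannot audit how w1, b1, w2, b2 came to hold the values they do.
```

Possession of the weights is enough to run or build on the model; auditing how they got their values would require the training data and code, which open-weights releases typically withhold.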
All of this is before even mentioning the fact that it takes millions of dollars in computing and engineering resources to create or replicate these models, effectively restricting who can do so to companies with considerable resources.
So where does xAI’s release of Grok sit on this spectrum?
As an open-weights model, it’s ready for anyone to download, use, modify, fine-tune, or distill. That’s good! It appears to be among the largest models anyone can freely access this way, in terms of parameter count (314 billion), giving curious engineers plenty to work with if they want to test how it performs after various modifications.
However, the model’s size comes with serious drawbacks: you’ll need hundreds of gigabytes of high-speed RAM to use it in this raw form. If you don’t already have, say, a dozen Nvidia H100s in a six-figure AI inference rig, don’t bother clicking that download link.
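To see why, here is a back-of-the-envelope calculation using the parameter count above; the precision options are assumptions on my part, since the actual footprint depends on how the checkpoint is stored and served.

```python
# Rough lower bounds on the memory needed just to hold 314B parameters.
# Real inference also needs room for activations and KV caches.
PARAMS = 314e9  # parameter count reported for Grok

for precision, bytes_per_param in [("float32", 4), ("float16", 2), ("int8", 1)]:
    gigabytes = PARAMS * bytes_per_param / 1e9
    print(f"{precision}: ~{gigabytes:,.0f} GB for the weights alone")

# Output:
# float32: ~1,256 GB for the weights alone
# float16: ~628 GB for the weights alone
# int8: ~314 GB for the weights alone
```

Even the most aggressively quantized of those figures sits far beyond consumer hardware, which is why multi-GPU setups like the H100 rigs mentioned above are the realistic floor.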
And while Grok is arguably competitive with some other modern models, it’s also much, much larger than them, meaning it requires more resources to accomplish the same thing. There’s always a trade-off among size, efficiency, and other metrics, so it’s still valuable, but it’s more raw material than finished product. It’s also unclear whether this is the latest and best version of Grok, as opposed to the clearly tuned version some have access to through X.
Overall, it’s good that this data is being released, but it’s not the game-changer some hoped it would be.
It’s also hard not to wonder why Musk is doing this. Is his nascent AI company really dedicated to open source development? Or is it just mud in the eye of OpenAI, with which Musk is currently pursuing a billionaire-scale legal beef?
If they are serious about open source development, this will be the first of many releases, and hopefully they will take community feedback into account, release other crucial information, characterize the training data process, and explain their approach further. If not, and this was only done so Musk can point to it in online arguments, it’s still valuable; it’s just not something anyone in the AI world will rely on or pay much attention to after the next few months, once they’ve played with the model.