Just a week after launching the latest version of its Gemini models, Google today announced the launch of Gemma, a new family of lightweight, open-weight models. Beginning with Gemma 2B and Gemma 7B, these new models are “Gemini-inspired” and available for both commercial and research use.
Google didn't provide a detailed paper on how these models perform against similar models from Meta and Mistral, for example, and only noted that they are “state of the art.” The company did note, however, that these are dense decoder-only models, the same architecture it used for its Gemini models (and its earlier PaLM models), and that benchmarks will appear later today on Hugging Face's leaderboard.
To get started with Gemma, developers can get access to ready-to-use Colab and Kaggle notebooks, as well as integrations with Hugging Face, MaxText, and Nvidia's NeMo. Once pre-trained and fine-tuned, these models can run everywhere.
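For a sense of what the Hugging Face route looks like, here is a minimal inference sketch using the transformers library. It assumes the weights are published under the `google/gemma-2b` repo ID and that you have accepted the model's terms of use on Hugging Face; neither detail is confirmed in Google's announcement.

```python
# Minimal sketch: running Gemma 2B locally with Hugging Face transformers.
# The "google/gemma-2b" repo ID is an assumption about where the weights live.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Explain what an open-weight model is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that `device_map="auto"` requires the accelerate package and places the model on whatever GPU or CPU is available.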
While Google highlights that these are open models, it should be noted that they are not open source. In fact, at a press conference ahead of today's announcement, Google's Janine Banks highlighted the company's commitment to open source, but also noted that Google is very intentional in the way it refers to the Gemma models.
“[Open models] has become pretty ubiquitous now in the industry,” Banks said. “And it often refers to open-weight models, where there is broad access for developers and researchers to customize and fine-tune the models but, at the same time, the terms of use (things like redistribution, as well as ownership of the variants that are developed) vary based on the model's own specific terms of use. And so we see some difference between what we would traditionally call open source, and we decided that it made the most sense to refer to our Gemma models as open models.”
In practice, that means developers can use the models for inference and fine-tune them at will, and the Google team argues that these model sizes are a good fit for many use cases.
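On the fine-tuning side, a common way to adapt a model of this size on a single consumer GPU is parameter-efficient tuning such as LoRA. The sketch below uses the peft library; the repo ID and the attention projection names are assumptions, not a Google-documented recipe.

```python
# Sketch: attaching LoRA adapters to Gemma for parameter-efficient
# fine-tuning on a single GPU. Model ID and target module names are
# assumptions; adjust them to the published checkpoint's layer names.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")  # assumed repo ID
lora_config = LoraConfig(
    r=8,                                   # low-rank adapter dimension
    lora_alpha=16,                         # adapter scaling factor
    target_modules=["q_proj", "v_proj"],   # assumed attention projection names
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights train
# From here, train with the standard transformers Trainer on your dataset.
```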
“The quality of generation has increased significantly in the last year,” said Tris Warkentin, director of product management at Google DeepMind. “Things that were previously the province of extremely large models are now possible with smaller, more modern models. This unlocks completely new ways of developing AI applications that we're really excited about, including the ability to run inference and fine-tune on your local developer desktop or laptop with your RTX GPU, or on a single host on GCP with Cloud TPUs, too.”
This also applies to open models from Google's competitors in this space, so we'll have to see how the Gemma models perform in real-world scenarios.
In addition to the new models, Google is also launching a new Responsible Generative AI toolkit to provide “essential guidance and tools for building safer AI applications with Gemma,” as well as a debugging tool.