Although the most well-liked language models can be accessible through APIs, open models — to the extent that the word can be taken seriously — are gaining popularity. The first model from Mistral, a French AI firm that raised a sizable seed round in June, has just been unveiled. The company claims it outperforms models of a similar size and is entirely free to use without any limitations.
A 13.4-gigabyte torrent (with a few hundred seeders already) and other download options are currently available for the Mistral 7B model. Additionally, the business launched a GitHub repository and a Discord channel for cooperation and problem-solving.
The Apache 2.0 license, which is highly permissive and has no limitations on usage or replication other than attribution, was used to release the model, which is crucial. As long as they have a system that can run it locally or are prepared to pay for the necessary cloud resources, the model can be employed by a hobbyist, a multibillion-dollar enterprise, or the Pentagon.
Other “small” big language models like Llama 2 have been improved upon by Mistral 7B, which provides comparable functionality (as measured by some industry benchmarks) at a far lower compute cost. Although foundation models like GPT-4 are significantly more capable and challenging to use, they are only made accessible through APIs or remote access.
In a blog post that coincides with the model’s release, the Mistral team stated that their goal was to “become the leading supporter of the open generative AI community and bring open models to state-of-the-art performance.” “Mistral 7B’s performance shows what little models can do when they have enough conviction. This is the outcome of three months of arduous effort, during which we put together the Mistral AI team, established a top-performing MLops stack, and created a highly complex data processing pipeline from scratch.
The founders had a head start because they had worked on related models at Meta and Google DeepMind. For some people (perhaps the majority), that list may sound like more than three months’ labor. It doesn’t simplify matters, but at least they were knowledgeable about their actions.
This differs significantly from being “open source” or a similar concept, as discussed during our discussion at Disrupt last week, even if it’s available for download and use.
Although the model was created privately with private funds and the license is relatively permissive, the datasets and weights were also created privately.
This looks to be the business model of Mistral: The premium product is what you’ll need if you want to dive more deeply than the free model allows. “Weights and code sources will be available, and [our commercial offering] will be distributed as white-box solutions. The blog post states, “We are actively working on hosted solutions and dedicated deployment for businesses.
Mistral has been contacted for clarification on some of the openness and their future release plans. If they respond, I’ll update this piece.