Mistral, the Microsoft-backed French AI startup valued at $6 billion, has released its first generative AI model for coding, called Codestral.
Like other code-generating models, Codestral is designed to help developers write and interact with code. It was trained on more than 80 programming languages, including Python, Java, C++ and JavaScript, Mistral explains in a blog post. Codestral can complete coding functions, write tests and “fill in” partial code, as well as answer questions about a codebase in English.
Mistral describes the model as “open,” but that’s debatable. The startup’s license prohibits the use of Codestral and its outputs for commercial purposes. There is a carve-out for “development,” but even that comes with caveats: the license explicitly bans any internal use by employees in the course of their company’s business.
The reason could be that Codestral was trained in part on copyrighted content. Mistral neither confirmed nor denied this in its blog post, but it wouldn’t be surprising; there is evidence that the startup’s previous training data sets contained copyrighted data.
Codestral might not be worth the trouble anyway. At 22 billion parameters, the model requires a powerful PC to run. (Parameters essentially define an AI model’s skill at a task, such as analyzing and generating text.) And while it beats the competition according to some benchmarks (which, as we all know, are unreliable), it’s hardly a breakthrough.
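To put the hardware requirement in rough numbers, here is a back-of-the-envelope sketch of the memory needed just to hold 22 billion parameters at a few common numeric precisions. The precisions listed and the function name are illustrative assumptions, not details from Mistral, and a real deployment needs additional memory beyond the weights.

```python
# Rough estimate of the memory needed only to store model weights.
# Assumption: memory = parameter count * bytes per parameter. Activations,
# caches and runtime overhead are ignored, so real usage will be higher.

PARAMS = 22_000_000_000  # 22 billion parameters, as reported for Codestral

# Illustrative precisions (assumed); bytes used to store one parameter.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: int, bytes_per_param: float) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


estimates = {
    name: weight_memory_gb(PARAMS, size) for name, size in BYTES_PER_PARAM.items()
}

for name, gb in estimates.items():
    print(f"{name}: ~{gb:.0f} GB")
```

At 16-bit precision the weights alone come to roughly 44 GB, more than the memory on typical consumer graphics cards.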
Although Codestral is impractical for many developers and offers only incremental performance improvements, it is sure to fuel the debate over whether it is wise to rely on code-generating models when programming.
Developers are certainly using generative AI tools for at least some programming tasks. In a Stack Overflow poll from June 2023, 44% of developers said they currently use AI tools in their development process, while 26% plan to do so soon. However, these tools have obvious shortcomings.
An analysis by GitClear of more than 150 million lines of code committed to project repos over the past few years found that generative AI development tools are resulting in more faulty code being pushed into codebases. Elsewhere, security researchers have warned that such tools can amplify existing bugs and security issues in software projects; more than half of the answers OpenAI’s ChatGPT gives to programming questions are wrong, according to a study from Purdue.
That won’t stop companies like Mistral and others from trying to make money (and gain attention) from their models. This morning, Mistral launched a hosted version of Codestral on its conversational AI platform Le Chat, as well as through its paid API. Mistral says it has also been working on integrating Codestral into app frameworks and development environments like LlamaIndex, LangChain, Continue.dev and Tabnine.