The AI startup Cohere is demonstrating its intent to support a wide range of enterprise use cases, including those that don't require expensive, resource-intensive large language models (LLMs). The company has released Command R7B, the smallest and fastest model in its R series.
Command R7B supports rapid prototyping and iteration and uses retrieval-augmented generation (RAG) to improve its accuracy. The model has a context length of 128K tokens and supports 23 languages. Cohere says it outperforms others in its class of open-weights models, including Google's Gemma, Meta's Llama and Mistral's Ministral, on tasks like math and coding.
“The model is designed for developers and enterprises that must optimize the speed, cost-performance and computing resources of their use cases,” wrote Aidan Gomez, co-founder and CEO of Cohere, in a blog post announcing the new model.
Outperforming the competition in math, coding and RAG
Cohere has strategically focused on enterprises and their unique use cases. The company introduced Command-R in March and the more powerful Command R+ in April, and has made upgrades throughout the year to improve speed and efficiency. Cohere touted Command R7B as the “final” model in its R series and announced that it will release the model weights to the AI research community.
Cohere noted that a critical focus in the development of Command R7B was improving performance in math, reasoning, code and translation. The company appears to have succeeded in these areas, with the new smaller model taking the lead on the HuggingFace Open LLM Leaderboard against similarly sized open-weights models, including Gemma 2 9B, Ministral 8B and Llama 3.1 8B.
Additionally, the smallest R-series model outperforms competing models in areas such as AI agents, tool use and RAG, which helps improve accuracy by grounding model outputs in external data. According to Cohere, Command R7B excels at conversational tasks, including tech workplace and enterprise risk management (ERM) assistance; technical facts; media workplace and customer service support; HR FAQs; and summarization. Cohere also notes that the model is “exceptionally good” at retrieving and manipulating numerical information in financial settings.
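To make the grounding idea concrete, the snippet below is a minimal sketch of how a request grounded in a couple of internal documents might look with Cohere's Python SDK. The model identifier, the document format and the response fields are assumptions for illustration and may differ from the current SDK; check Cohere's documentation before relying on them.

```python
# Minimal RAG sketch (illustrative only). Assumes Cohere's Python SDK v2
# and the "command-r7b-12-2024" model identifier; exact shapes may differ.
import cohere

co = cohere.ClientV2(api_key="YOUR_API_KEY")  # placeholder key

# Documents the model should ground its answer in (e.g. fetched from a vector DB).
documents = [
    {"data": {"title": "HR FAQ", "snippet": "Parental leave is 16 weeks, fully paid."}},
    {"data": {"title": "Q3 report", "snippet": "Revenue grew 12% quarter over quarter."}},
]

response = co.chat(
    model="command-r7b-12-2024",  # assumed identifier
    messages=[{"role": "user", "content": "How long is parental leave?"}],
    documents=documents,  # grounds the output in the supplied snippets
)
print(response.message.content[0].text)  # response shape may vary by SDK version
```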
Overall, Command R7B ranked first on average across key benchmarks, including instruction-following evaluation (IFEval); BIG-bench Hard (BBH); graduate-level Google-proof question answering (GPQA); multistep soft reasoning (MuSR); and massive multitask language understanding (MMLU).
Removing unnecessary function calls
Command R7B can use tools such as search engines, APIs and vector databases to extend its functionality. Cohere reports that the model's tool use compares favorably to competitors on the Berkeley Function-Calling Leaderboard, which evaluates a model's accuracy in function calling (connecting to external data and systems).
Gomez says this demonstrates its effectiveness in “real-world, diverse and dynamic environments” and removes the need for unnecessary call functions. This can make it a good choice for building “fast and capable” AI agents. For example, Cohere points out that when Command R7B functions as an internet-augmented search agent, it can break complex questions down into sub-goals while performing well at advanced reasoning and information retrieval.
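As a rough illustration of what function calling looks like in practice, the sketch below defines a single tool and lets the model decide whether to call it. The tool name, model identifier and response fields are assumptions based on common chat-API conventions, not confirmed details of Cohere's API.

```python
# Tool-use / function-calling sketch (illustrative only).
# Assumes Cohere's v2 chat endpoint and a JSON-schema tool format; verify
# the exact shapes against Cohere's current documentation.
import cohere

co = cohere.ClientV2(api_key="YOUR_API_KEY")  # placeholder key

tools = [
    {
        "type": "function",
        "function": {
            "name": "search_web",  # hypothetical tool name
            "description": "Search the web and return short result snippets.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    }
]

response = co.chat(
    model="command-r7b-12-2024",  # assumed identifier
    messages=[{"role": "user", "content": "What changed in the latest LLVM release?"}],
    tools=tools,
)

# If the model decides a tool call is needed, it returns the call(s) instead of
# a final answer; the application executes them and sends the results back.
for call in response.message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```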
Due to its small size, Command R7B can run on low-end and consumer CPUs, GPUs and MacBooks, enabling on-device inference. The model is now available on the Cohere platform and on HuggingFace. Pricing is $0.0375 per 1 million input tokens and $0.15 per 1 million output tokens.
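To put that pricing in concrete terms, here is a small back-of-the-envelope calculation using the listed rates; the workload figures are made up for illustration.

```python
# Rough cost estimate at the listed rates (illustrative workload figures).
INPUT_RATE = 0.0375 / 1_000_000   # dollars per input token
OUTPUT_RATE = 0.15 / 1_000_000    # dollars per output token

# Hypothetical daily workload: 10,000 requests, ~2,000 input and ~300 output tokens each.
requests_per_day = 10_000
input_tokens = requests_per_day * 2_000
output_tokens = requests_per_day * 300

daily_cost = input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE
print(f"Estimated daily cost: ${daily_cost:.2f}")  # -> Estimated daily cost: $1.20
```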
“It is a great choice for companies looking for a cost-effective model grounded in their internal documents and data,” Gomez writes.