Transformers are the cornerstone of the modern era of generative AI, but they are not the only way to build a model.
AI21 today released new versions of its Jamba model, which combines transformers with a structured state space model (SSM) approach. The new versions, Jamba 1.5 Mini and Jamba 1.5 Large, build on the core innovations the company introduced with the release of Jamba 1.0 in March. Jamba uses an SSM approach called Mamba. Jamba's goal is to bring together the best features of transformers and SSMs. The name Jamba is actually an acronym that stands for Joint Attention and Mamba (Jamba) Architecture. The combined SSM-transformer architecture promises better performance and accuracy than either approach can offer on its own.
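The hybrid idea can be sketched as a layer schedule that interleaves Mamba mixers with occasional attention layers and MoE feed-forward blocks. The 1:7 attention-to-Mamba ratio and MoE placement below follow AI21's published description of the Jamba architecture, but treat the exact numbers as assumptions, not a definitive spec:

```python
# Toy sketch of how a hybrid SSM-transformer stack interleaves layer types.
# Ratios are assumptions based on AI21's Jamba paper, not verified internals.

def jamba_layer_schedule(n_layers: int, attn_every: int = 8, moe_every: int = 2):
    """Return (mixer, mlp) labels for each layer in the stack."""
    schedule = []
    for i in range(n_layers):
        # One attention layer per block of `attn_every` layers; Mamba elsewhere.
        mixer = "attention" if i % attn_every == attn_every - 1 else "mamba"
        # A Mixture-of-Experts feed-forward block every other layer.
        mlp = "moe" if i % moe_every == 1 else "dense"
        schedule.append((mixer, mlp))
    return schedule

layers = jamba_layer_schedule(32)
print(sum(1 for mixer, _ in layers if mixer == "attention"))  # 4
```

The point of the schedule is the trade-off: Mamba layers keep memory use linear in sequence length, while the sparse attention layers preserve the in-context recall that pure SSMs struggle with.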
“We got great feedback from the community because this was basically the first and is still one of the only Mamba-based production models out there,” Or Dagan, VP of Product at AI21, told VentureBeat. “It's a novel architecture that I think has sparked some debate about the future of LLM architectures and whether transformers are here to stay or whether we need something else.”
With the Jamba 1.5 series, AI21 adds more features to the models, including function calling, JSON mode, structured document objects, and citation mode. The company hopes the new additions make the two models ideal for building agentic AI systems. Both models also feature a large 256K context window and are Mixture of Experts (MoE) models. Jamba 1.5 Mini has 52 billion total parameters and 12 billion active parameters. Jamba 1.5 Large has 398 billion total parameters and 94 billion active parameters.
Both Jamba 1.5 models are available under an open license. AI21 also offers commercial support and services for the models. The company has partnerships with AWS, Google Cloud, Microsoft Azure, Snowflake, Databricks and Nvidia.
What's new in Jamba 1.5, and how will it accelerate agentic AI?
Jamba 1.5 Mini and Large introduce a number of new features designed to meet the growing needs of AI developers:
- JSON mode for structured data processing
- Citations for improved accountability
- Document API for improved context management
- Function calling for tool use
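Function calling lets the model emit a structured request that the host application executes on its behalf. A minimal dispatch loop might look like the sketch below; the tool name, arguments, and JSON shape are illustrative, not AI21's actual API surface:

```python
import json

# Hypothetical tool registry; in a real agent this would wrap live APIs.
TOOLS = {
    "get_weather": lambda city: f"22C and sunny in {city}",
}

def dispatch(model_output: str) -> str:
    """Parse a model's function-call JSON and run the matching tool."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# A model in function-calling mode would emit something like:
result = dispatch('{"name": "get_weather", "arguments": {"city": "Tel Aviv"}}')
print(result)  # 22C and sunny in Tel Aviv
```

The tool's return value would then be fed back into the conversation so the model can continue with grounded information.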
According to Dagan, these additions are especially important for developers working on agentic AI systems. Developers often use JSON (JavaScript Object Notation) to wire together application workflows.
Dagan explained that JSON support will allow developers to more easily create structured input/output relationships between different parts of a workflow. He noted that JSON support is critical for more complex AI systems that go beyond using the language model alone. The citation feature, meanwhile, works together with the new Document API.
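The value of JSON mode is that the consuming workflow step can parse the model's output directly instead of scraping free text. A minimal consumer-side check, using a hypothetical response and an invented schema, might be:

```python
import json

# Keys this (hypothetical) downstream workflow step expects the model to emit.
REQUIRED_KEYS = {"ticket_id", "category", "priority"}

def parse_step_output(raw: str) -> dict:
    """Parse JSON-mode output and verify the keys the next step needs."""
    data = json.loads(raw)  # raises ValueError if the output is not valid JSON
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"model output missing keys: {missing}")
    return data

# Hypothetical JSON-mode response from the model:
raw = '{"ticket_id": "T-1042", "category": "billing", "priority": "high"}'
record = parse_step_output(raw)
print(record["priority"])  # high
```

Without JSON mode, this parse step is where brittle regex extraction usually lives; with it, the model's output becomes a reliable interface between workflow stages.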
“We can teach the model that when you generate something and your input contains documents, please map the relevant parts to the documents,” Dagan said.
How citation mode differs from RAG and provides an integrated approach to agentic AI
Users shouldn't confuse citation mode with Retrieval-Augmented Generation (RAG), although both approaches ground answers in data to improve accuracy.
Dagan explained that citation mode in Jamba 1.5 is designed to work together with the model's Document API, providing a more integrated approach compared to traditional RAG workflows. In a typical RAG setup, developers connect the language model to a vector database that retrieves relevant documents for a given query or task. The model then has to learn to effectively incorporate the retrieved information into its generation.
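The retrieval half of that typical setup can be sketched with a toy lexical scorer standing in for the vector database; a real pipeline would use embeddings and a vector store instead:

```python
# Toy stand-in for the retrieval step of a RAG pipeline: rank documents by
# word overlap with the query and prepend the best hit to the prompt.

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents sharing the most words with the query."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

docs = [
    "jamba combines attention layers with mamba state-space layers",
    "the models are offered under an open license",
]
context = retrieve("which layers does jamba combine", docs)
prompt = f"Context: {context[0]}\n\nQuestion: which layers does Jamba combine?"
```

The point Dagan makes is that everything after this step, deciding which retrieved passage actually supports which claim, is left to the model and the prompt in a plain RAG setup.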
In contrast, citation mode in Jamba 1.5 is more tightly integrated into the model itself. This means the model is trained not only to use relevant documents, but also to explicitly cite the sources of the information it uses in its output. This provides more transparency and traceability compared to a conventional LLM workflow, where the model's reasoning may be more opaque.
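The behavior Dagan describes, mapping generated spans back to the documents supplied with the request, can be illustrated with a toy resolver. The bracketed-index marker format and document shape below are invented for the sketch, not AI21's actual output format:

```python
import re

# Toy illustration of citation-style output: the model emits bracketed
# document indices, and we resolve them against the documents passed in
# alongside the request (invented format, not AI21's actual one).

def resolve_citations(answer: str, documents: list[dict]) -> list[dict]:
    """Map [n] markers in the answer back to the source documents."""
    cited = sorted({int(m) for m in re.findall(r"\[(\d+)\]", answer)})
    return [documents[i] for i in cited]

docs = [
    {"id": "doc-0", "content": "Jamba 1.5 Mini has 12B active parameters."},
    {"id": "doc-1", "content": "Jamba 1.5 Large has 94B active parameters."},
]
answer = "The Mini model uses 12B active parameters [0], the Large 94B [1]."
sources = resolve_citations(answer, docs)
print([d["id"] for d in sources])  # ['doc-0', 'doc-1']
```

Because the citations arrive as part of the model's output rather than being reconstructed afterwards, each claim in the answer can be audited against a specific input document.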
AI21 also supports RAG. Dagan noted that the company offers its own end-to-end RAG solution as a managed service, including document retrieval, indexing and other required components.
Looking ahead, Dagan said AI21 will continue to evolve its models to meet customer needs, with another focus on enabling agentic AI.
“We also know that we need to work with and push the boundaries of agentic AI systems and the way planning and execution are handled in this space,” Dagan said.