
AI21’s Jamba-Instruct lands on Snowflake Cortex to help businesses decode long documents




Cloud data giant Snowflake today announced that it will add the Jamba-Instruct LLM, built by Israeli enterprise-focused AI startup AI21 Labs, to its Cortex AI service.

This model, available today, will enable Snowflake enterprise customers to build generative AI-powered applications (such as chatbots and summarization tools) that can handle long documents without compromising quality and accuracy.

Given how heavily enterprises rely on large files and documents, Jamba-Instruct could be a great asset for teams. It’s worth noting, however, that AI21 isn’t Snowflake’s only large language model (LLM) partner. The company, led by CEO Sridhar Ramaswamy, has a laser focus on the generative AI category and has already launched several efforts to create an entire ecosystem for developing high-performance, data-driven AI applications.

Just a few days ago, the company announced a partnership with Meta to bring an all-new Llama 3.1 LLM family to Cortex. It previously debuted a proprietary enterprise model called “Arctic.” The approach was quite similar to that of rival Databricks, which acquired MosaicML last year and has been aggressively building out its own DBRX model and adding new LLMs and tools for customers to build on.

What does Jamba-Instruct offer Snowflake users?

In March, AI21 made headlines with Jamba, an open generative AI model that combined the proven transformer architecture with a new, memory-efficient Structured State Space (SSM) model. The hybrid model gave users access to a massive 256K-token context window (the amount of data an LLM can process at once) while activating just 12B of its 52B parameters—ensuring not only a powerful solution but also an efficient one.
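To put the efficiency claim in concrete terms, the back-of-envelope arithmetic below works out what fraction of Jamba's weights are actually exercised per token under its Mixture of Experts routing (a rough sketch based only on the 12B/52B figures above, not on AI21's published benchmarks):

```python
# Jamba routes each token through only a subset of its weights via
# Mixture of Experts (MoE) layers: 12B active out of 52B total.
total_params = 52e9
active_params = 12e9

active_fraction = active_params / total_params
print(f"{active_fraction:.0%}")  # ~23% of weights are used per token
```

Activating under a quarter of the parameters per token is what lets the model serve a very long context at a lower compute cost than a dense model of the same size.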

According to AI21, Jamba delivered 3x the throughput on long contexts compared to Mixtral 8x7B (another model in its size class), which was a tempting offer for enterprises. This led to the debut of Jamba-Instruct, an instruction-tuned version of the model with additional training, chat capabilities, and safety work to make it suitable for enterprise use.

The commercial model launched on the AI21 platform in May and is now rolling out to Cortex AI, Snowflake’s fully managed, no-code service for building advanced, next-generation AI applications based on data stored on the platform.

“With its large context window capacity, Jamba-Instruct has a strong processing capability. It can handle up to 256K tokens, which is equivalent to about 800 pages of text. This makes Jamba-Instruct an extremely efficient model for a variety of use cases involving extensive document processing, such as corporate financial histories, earnings call transcripts, or long clinical trial interviews,” Baris Gultekin, head of AI at Snowflake, told VentureBeat.
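The 800-page figure can be sanity-checked with some rough arithmetic. The conversion factors below are assumptions for illustration (roughly 0.75 English words per token and 240 words per printed page), not numbers from Snowflake or AI21:

```python
# Back-of-envelope: how many printed pages fit in a 256K-token window?
# Assumed conversion factors (not from the article):
WORDS_PER_TOKEN = 0.75   # typical rough ratio for English text
WORDS_PER_PAGE = 240     # a moderately dense printed page

def pages_for_context(tokens: int) -> float:
    """Approximate page capacity of a context window of `tokens` tokens."""
    return tokens * WORDS_PER_TOKEN / WORDS_PER_PAGE

print(pages_for_context(256_000))  # 800.0 -- matches the ~800-page claim
print(pages_for_context(32_000))   # 100.0 -- roughly one long 10-K filing
```

Under these assumptions a 256K-token window lands exactly on the quoted ~800 pages, and a typical 100-plus-page 10-K fits with plenty of room to spare.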

For example, financial analysts at investment banks or hedge funds could use an LLM-powered Q&A or summarization tool to extract quick and accurate insights from 10-K filings that often run to more than 100 pages. Similarly, doctors could review lengthy patient reports in a short time to extract relevant information, and retailers could build chatbots capable of consistently maintaining long, reference-based conversations with customers.

Gultekin noted that the model’s long context window can simplify the entire RAG pipeline creation process, allowing a single large chunk of information to be passed in at once, and even supporting multiple prompts to steer the model toward a specific tone during generation.

Major cost benefits

In addition to the ability to handle long documents, Snowflake customers can also expect significant cost benefits with Jamba-Instruct.

The model’s hybrid nature, combined with Mixture of Experts (MoE) layers that activate only a subset of parameters, makes its 256K-token context window more economically accessible than other instruction-tuned transformer models of the same size. Additionally, Cortex AI’s serverless inference with a consumption-based pricing model ensures that enterprises only pay for the resources they use, rather than maintaining dedicated infrastructure on their own.

“Organizations can balance performance, cost, and latency by leveraging the scalability of Snowflake and the efficiency of Jamba-Instruct. The Cortex AI framework enables easy scaling of compute resources for optimal performance and cost benefits. Meanwhile, the Jamba-Instruct architecture minimizes latency,” Pankaj Dugar, SVP & GM, North America, AI21 Labs, told VentureBeat.

Cortex AI, a fully managed service, now includes a range of LLMs alongside Jamba-Instruct: Snowflake’s own Arctic, as well as models from Google, Meta, Mistral AI and Reka AI.

“Our goal is to give our customers the flexibility to choose between open source or commercial models, allowing them to select the best model that meets their specific application, cost, and performance requirements—without having to set up complex integrations or move data from where it is already managed in the AI data cloud,” Gultekin explained.

The list is expected to grow, with more major models coming to the platform in the coming months, including from AI21. However, the Snowflake AI head noted that the company constantly monitors customer feedback as it evaluates and integrates LLMs, to ensure it only includes models that meet specific requirements and use cases.

“We have very rigorous guidelines and processes when it comes to bringing LLMs to Cortex AI… We want to make sure that the models we offer cover a wide range of use cases, from automated BI to conversational assistants, text processing, and summarization. And the model should have unique capabilities—for example, Jamba-Instruct has the largest context window of any model we offer to date,” he added.

Snowflake also acquired TruEra a few months ago to save companies from being overwhelmed by the growing choice of models. Gultekin said they can use the startup’s TruLens offering to run LLM experiments and assess what works best for them.

Today, over 5,000 enterprises are using Snowflake’s AI capabilities (Cortex and other related features), with the most popular use cases being automated business analytics, conversational assistants, and text summarization.