A graph shows benchmark test results for Kanana-2's updated Instruct model. Courtesy of Kakao
Kakao has updated its proprietary large language model (LLM), Kanana-2, releasing four more variants of the model as open source.
After demonstrating strong performance and efficiency optimized for agentic artificial intelligence (AI) with Kanana-2, which was open-sourced on Hugging Face in December, the company rolled out a major update a month later, adding four significantly improved models to its open-source lineup.
The newly released models, comprising Base, Instruct, a reasoning-focused Thinking model and a research-optimized Mid-training model, emphasize high efficiency and cost-effectiveness while significantly strengthening the tool-calling capabilities essential for agentic AI.
“The updated Kanana-2 is the result of our deep focus on how to build practical agentic AI without relying on expensive infrastructure,” said Kim Byung-hak, performance lead for Kakao’s Kanana project. “By open-sourcing models that deliver high efficiency even on general-purpose infrastructure, we hope to offer a new alternative for AI adoption and help advance Korea’s AI R&D ecosystem.”
The models are optimized to run smoothly on general-purpose graphics processing units (GPUs) at the level of Nvidia’s A100, making the AI accessible to small businesses and academic researchers without heavy cost burdens.
Kanana-2, which has 32 billion parameters in total, uses a mixture-of-experts (MoE) architecture, activating only 3 billion parameters during inference to dramatically improve compute efficiency. Parameters are the internal variables that an AI model learns from data during training in order to make predictions.
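The efficiency idea behind a mixture-of-experts design can be illustrated with a toy sketch: a router scores a set of expert networks and only the top few actually run, so the active parameter count per token is a small fraction of the total. This is a generic MoE illustration with made-up dimensions, not Kakao's implementation; all names and sizes here are hypothetical.

```python
import numpy as np

def moe_forward(x, experts, router_w, k=2):
    """Route input x to the top-k experts and combine their outputs.

    Only the k selected experts execute, so the parameters touched per
    token are a fraction of the total -- the same principle as a
    32B-total / 3B-active model, at toy scale (illustrative only).
    """
    logits = x @ router_w                     # (num_experts,) routing scores
    top_k = np.argsort(logits)[-k:]           # pick the k highest-scoring experts
    weights = np.exp(logits[top_k])
    weights /= weights.sum()                  # softmax over the chosen experts only
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top_k))

rng = np.random.default_rng(0)
dim, num_experts = 8, 16
experts = [rng.standard_normal((dim, dim)) for _ in range(num_experts)]
router_w = rng.standard_normal((dim, num_experts))
x = rng.standard_normal(dim)
y = moe_forward(x, experts, router_w, k=2)    # only 2 of 16 experts ran
print(y.shape)  # (8,)
```

Here 14 of the 16 expert weight matrices are never multiplied for this input, which is why MoE models can hold a large total parameter count while keeping per-token inference cost low.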
Beyond architectural and data improvements, Kakao also refined the training pipeline for the updated version. It introduced a new mid-training stage between pretraining and post-training, and adopted a replay mechanism to prevent catastrophic forgetting when models learn new information. This allows the model to retain existing language and reasoning skills while acquiring new ones.
Unlike typical conversational AI models, the updated Kanana-2 series focuses on agentic AI capable of executing real-world tasks. The models were fine-tuned on extensive multi-turn tool-calling datasets, enabling them to interpret complex user instructions and autonomously select and execute the appropriate tools.
In benchmark testing, the models outperformed a peer model, Qwen-30B-A3B-Instruct-2507, in instruction-following accuracy, multi-turn tool-calling performance and Korean-language capability.