A Finnish startup today launched a multilingual AI model that’s a “significant milestone” on the path to LLMs for every EU language, the company says.
Helsinki-based Silo AI calls the new large language model Viking 7B. It covers Danish, Finnish, Icelandic, Norwegian, and Swedish, as well as English and programming languages. Evaluations indicate best-in-class performance in all the Nordic languages — without compromising the English outputs.
Peter Sarlin, Silo AI’s CEO, told TNW that his company is now “on the right track” towards its ultimate goal.
“This release marks an important step in our ongoing efforts to develop performant language models for all official EU languages,” he said.
“With the Viking model family, we reaffirm our commitment to Europe’s digital sovereignty.”
Silo AI’s LLM family
Silo specialises in low-resource languages, which lack the linguistic data that’s typically needed to train AI models.
Without LLMs in these languages, entire communities will miss out on countless services, from machine translation to personalised healthcare.
To fill the data gap, Silo applies a variety of techniques. One is optimising model architectures for pre-training. Another incorporates translated pairs of high- and low-resource languages.
Several of the techniques use a cross-lingual signal, which enhances the connections between languages.
“It allows the model to generalise and apply learned patterns across different languages — even those with limited training data,” Sarlin said.
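As a rough illustration of the translated-pairs idea, the sketch below joins a sentence and its translation into a single training text and mixes those pairs into a monolingual corpus. The language-tag format and the names `format_pair` and `build_training_texts` are hypothetical choices made for this example; the article does not describe how Silo's actual data pipeline works.

```python
import random

def format_pair(src_lang, src_text, tgt_lang, tgt_text):
    # Hypothetical format: one "<lang> text" line per language, so the
    # model sees both sides of the translation in the same context.
    return f"<{src_lang}> {src_text}\n<{tgt_lang}> {tgt_text}"

def build_training_texts(monolingual_docs, translated_pairs, seed=0):
    # Start from the monolingual documents, append the formatted
    # translation pairs, then shuffle so pairs are spread through the corpus.
    texts = list(monolingual_docs)
    for src_lang, src_text, tgt_lang, tgt_text in translated_pairs:
        texts.append(format_pair(src_lang, src_text, tgt_lang, tgt_text))
    random.Random(seed).shuffle(texts)
    return texts

# Toy data for illustration only.
docs = ["The weather is nice today.", "Sää on tänään kaunis."]
pairs = [("en", "Good morning", "fi", "Hyvää huomenta")]
corpus = build_training_texts(docs, pairs)
```

Placing both languages in one text is one simple way to give a model a cross-lingual signal of the kind Sarlin describes; real pipelines would also control how large a share of the corpus the pairs make up.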
New parameters
The 7-billion-parameter Viking is the first release from a model family announced last month. Silo also plans to launch 13B and 33B versions. Checkpoints for both of these larger models were released today.
As the parameter counts grow, the models will improve their understanding of prompts and their capacity for nuanced outputs. But they will also need greater computational resources, which lead to higher costs and energy consumption.
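For a sense of scale, here is a back-of-the-envelope estimate of the memory needed just to hold each model's weights. It assumes the nominal parameter counts in the model names and 16-bit (2-byte) weights; both are assumptions made for illustration, and the figures exclude the activations, optimizer state and other overheads that training or serving would add.

```python
def weight_memory_gb(n_params, bytes_per_param=2):
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return n_params * bytes_per_param / 1e9

# Nominal sizes taken from the model names; exact counts may differ.
sizes = {"7B": 7e9, "13B": 13e9, "33B": 33e9}
estimates = {name: weight_memory_gb(n) for name, n in sizes.items()}

for name, gb in estimates.items():
    print(f"Viking {name}: about {gb:.0f} GB of weights")
```

Under these assumptions the weights alone come to roughly 14 GB, 26 GB and 66 GB.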
To conserve these resources, Silo trained Viking on LUMI — Europe’s most powerful supercomputer and the world’s third greenest on the Top500 list.
With resources under control and performance proven, Silo now plans to integrate every EU language.
“We consider multilingual LLMs to constitute a part of Europe’s digital infrastructure,” Sarlin said.