A guide to NorwAI’s work on Large Language Models in Norwegian

NorLLM

SFI NorwAI – the Norwegian Research Center for AI Innovation – is working intensively to develop generative language models that can benefit Norwegian society. Generative language models gained widespread attention in 2023 after the international breakthrough of the large language models from OpenAI. In this section, we summarize the current status of NorwAI’s work with NorLLM (Norwegian Large Language Models).

Our key messages are as follows:

Norway needs control over its own generative language models, built on Norwegian data and values.
We have a well-functioning system for collecting and managing published content for use in large language models.
A lack of computational resources hinders both the training and the operation of large language models in Norway.
There is a need for structures and mechanisms to ensure that training data, fine-tuning data, and alignment methods reflect Norwegian values and support open models.
NorwAI and its partners have the necessary expertise and experience, and aim to develop Norwegian language models for the benefit of Norwegian society.

NorLLM models

Access the NorLLM models

The models are available for testing to representatives of organizations based in the Nordic countries and to students at Nordic universities. Apply for access to the NorLLM models on Hugging Face: https://huggingface.co/NorwAI

Contact

Technical inquiries

Inquiries about language models in general

Terje Brasethvik Associate Professor

Other inquiries

News

A presentation of the NorLLM project, which focuses on large language models based on Norwegian data and values, at a conference in Trondheim. The screen displays information about the model launch, along with the logos of NorwAI and NTNU

National launch of the next-generation NorLLM models

On May 15th NorwAI will present and launch the next generation of its NorLLM models. In addition, a group of partners, cooperating companies and organizations will present projects and plans for their use of the models.

A meeting on artificial intelligence with the NorwAI logo visible.

Language models are taking off

NorwAI's language model activities received national attention in 2023. The interest has continued into 2024, with visits from high-profile politicians.

Three people in front of server racks.

Four models built - four new ones in the pipeline

NorwAI has built four distinct Norwegian generative language models. During the winter of 2023/2024, four additional models are being developed, to be made available in spring 2024. Collectively, these eight models are steps toward NorwAI’s ambition to build a comprehensive generative base model for general use, with approximately 40 billion parameters, by the end of 2024.

Discussion at a whiteboard in an office setting.

Lessons learned about Language Models

Work with language models is bringing to light interesting issues related to transparency, copyright, sustainability, values and norms, and language variants.

Four people in a meeting room

Requirements for Large Language Models

Building an environment for training and operating commercially available Norwegian language models requires access to resources:

Portrait of Karl Aksel Festø

The demand for Norwegian Models

NorwAI has been approached by several public organizations and private enterprises seeking an alternative to international models. These organizations have primarily raised two concerns about existing commercial models: (i) the handling of sensitive and copyrighted data, and (ii) the lack of quality in Norwegian language generation.

A person presenting in front of a blurred screen.

The Language Council of Norway on domain-competent generative language technology

Åse Wetås, Director of the Language Council of Norway, discusses the importance of developing language models that can handle specialized terminology and language for professional use across various societal sectors.

Portrait of Anders Løland

How can (Norw)AI protect personal data?

Protecting personal information is challenging with complex AI models that are hungry for data. NorwAI’s pledge to provide an individualized AI experience that provably respects privacy concerns is therefore more important than ever. By Anders Løland, Research director, Norwegian Computing Center (NR)

An illustration of a person chatting with an AI chatbot on their computer.

Harmful behavior in language models

The responses from a language model reflect the data that goes into its training set. If the training data is incomplete, the model will combine words based on statistical probabilities and construct sentences that may be both plausible and grammatically correct but have little to do with reality.

Portrait of Aslak Sira Myhre

The project “MIMIR” on copyrighted content

At the end of 2023, an initiative emerged that brought together the three most active research environments in Norway with expertise in language models for closer collaboration. The “Mimir” project united the National Library of Norway, the University of Oslo, and NorwAI in a joint effort.

A person presenting TrustLLM objectives at a conference.

TrustLLM - An EU project as an answer to generative AI hallucinations

The last two years have seen the rise of generative AI. Many models provide useful functions but tend to make up facts and respond with unwarranted confidence. How can that risk be mitigated? In November 2023, a consortium with partners from Norway, Germany, Sweden, Iceland, Denmark, and the Netherlands kicked off a Horizon Europe-funded project to develop open, trustworthy, and sustainable Large Language Models (LLMs).