LAP - Language and Personalization
The purpose of this work package is to develop personalization techniques and Scandinavian language processing capabilities for personalized content generation. Specifically, the work package aims to:
- Develop truly explainable, fair and transparent personalization techniques
- Enable proactivity in customer relations
- Provide an individualized experience that provably respects privacy concerns
- Develop individualized content
- Develop large-scale Scandinavian language models
- Enable human-like content creation and conversations
Personalization and contextualization have been successfully employed in diverse applications over the past decade and are now seeing wider use, for instance in proactive interaction with customers and individualization of news stories. LAP will contribute to developing such systems while ensuring that their use is ethical and respects users’ requirements for privacy, fairness and accountability.
Building Scandinavian language models requires the compilation of large-scale reusable language resources, including general-purpose corpora from public sources (e.g., news and social media) as well as industry- and domain-specific text collections. We will address the scarcity of the latter by pre-training on the former and developing transfer learning methods. These large-scale language models will then be used in real-life scenarios by formulating a number of specific summarization, explanation, and conversational tasks based on our partners’ use cases. LAP will develop appropriate evaluation methodology with user-oriented evaluation measures and objectives. It will thus help quantify how much domain-specific training material is needed to provide a language service of sufficiently high quality.
LAP projects
NorLLM: Norwegian Large Language Models (Store Norske Språkmodeller)
The models are available for testing by representatives of organizations based in the Nordic countries and by students at Nordic universities. Apply for access to the NorLLM models on Hugging Face: https://huggingface.co/NorwAI
Associated projects
TrustLLM - Democratize Trustworthy and Efficient Large Language Model Technology for Europe.
TrustLLM brings together leading European research institutions to drive European development in NLP and AI, and to lay the foundation for a broader European collaboration on LLMs and large-scale AI. The project aims to build a series of LLMs that represent specific language families, in particular the Germanic language family.
The Aifal project is a research initiative that focuses on integrating generative AI models into the primary healthcare service to improve general practitioners' everyday work. The goal is to develop AI services that free up time for patient meetings, improve decision-making processes, and speed up diagnosis. The project tests prototypes for summarizing patient records, knowledge support, transcribing consultations, and medical coding. Collaboration partners include the Antibiotic Center at UiO and NorwAI at NTNU. For more information, visit the Aifal project.
The Mímir project, led by the National Library of Norway, aims to assess the impact of copyrighted material on the performance of large generative language models for Norwegian languages. It involves collaboration with the University of Oslo, NTNU/NorwAI, and Sigma2, focusing on training models with both copyrighted and non-copyrighted content. The findings suggest that models trained with a mix of these materials generally perform better, emphasizing the importance of high-quality, curated content.
Press release from the National Library of Norway: "Research project shows: Copyrighted content gives Norwegian language models high quality" (Nasjonalbiblioteket, nb.no)
WP Leader
Researchers
- Nolwenn Bernard, PhD candidate, UiS
- Terje Brasethvik, Associate Professor, NTNU
- Jon Atle Gulla, Professor, NTNU
- Jon Espen Ingvaldsen, Associate Professor, NTNU
- Peng Liu, Researcher, NTNU
- Egil Rønningstad, PhD candidate, UiO
- Bjørnar Vassøy, PhD candidate, NTNU
- Vandana Yadav, PhD candidate, NTNU
- Lemei Zhang, Postdoc, NTNU
- Weronika Łajewska, PhD candidate, UiS
A call for Nordic collaboration
Leading linguistics researchers and data engineers from the greater Nordic region gathered to share expertise and challenges. To keep pace with the technology, even more cross-border cooperation is needed.
2024-11-29
Where does the road ahead lead for NorwAI and language models? A small roadmap for NorLLM
Demand for and curiosity about Norwegian generative language models have been notable. The six models NorwAI published this summer have been downloaded more than 10 000 times. Plans for what comes next are taking shape.
2024-11-26
Nordic Language Technology Get-together to Include Minority Languages
As society becomes increasingly digitized, professionals in the Nordics want to ensure that the new solutions offered by language technology and artificial intelligence are available for all languages in the Nordics.
The organizers of the Get-together in Trondheim on November 5th and 6th hope to foster cooperation across national borders and languages. They will also launch a new language technology platform for small languages and host a poster session.
The conference is a collaboration between ASTIN (Arbetsgruppen för språkteknologi i Norden), whose members include UiT The Arctic University of Norway, Språkrådet vid Institutet för språk och folkminnen in Sweden, Dansk Sprognævn in Denmark and Språkrådet in Norway, and NorwAI, the Norwegian Research Center for AI Innovation.
2024-10-29
Medbric to assist doctors in the primary healthcare service
Jon Espen Ingvaldsen of NorwAI and Jorunn Thaulow of the University of Oslo have joined forces to bring state-of-the-art AI technology into medical practice. The test period, in which more than 100 GPs participated, showed very good results.
2024-09-30
Learn with NorwAI's experts
NorwAI is preparing a special course on language models for key decision-makers and developers who want to use the technology in their own applications. In the course "Innovation with generative language models", NorwAI's experts will share their knowledge and skills with those who want to lead the way in Norwegian AI utilization. (Full article in Norwegian)
2024-08-29
Introducing IAI MovieBot
IAI MovieBot is an ongoing project by the IAI group at the University of Stavanger. It is a conversational recommender system for movies.
Throughout a conversation, IAI MovieBot asks you questions about your preferences, such as the genre and the release year of the movie you are looking for. Based on your answers, IAI MovieBot tries to recommend a movie that matches your preferences and answers your questions about it.
2024-08-05
Best Paper Honorable Mention Award at ICTIR '24
We are thrilled to announce that the paper, "Towards a Formal Characterization of User Simulation Objectives in Conversational Information Access" by Nolwenn Bernard and Krisztian Balog, has received the Best Paper Honorable Mention Award at the 14th International Conference on the Theory of Information Retrieval (ICTIR '24)!
2024-08-16
How can (Norw)AI protect personal data?
Protecting personal information is challenging with complex AI models that are hungry for data. NorwAI’s pledge to provide an individualized AI experience that provably respects privacy concerns is therefore more important than ever.
The project “MIMIR” on copyrighted content
At the end of 2023, an initiative emerged that brought together the three most active research communities in Norway with expertise in language models. The “Mimir” project united the National Library of Norway, the University of Oslo, and NorwAI in a joint effort.
Large language models at University of Oslo
Many have declared 2023 as the year of Large Language Models (LLM), and it’s hard to disagree. In the Language Technology Group (LTG) at the University of Oslo (UiO), developing language models for Norwegian has been an important priority for several years. While also a NorwAI partner, LTG has not been involved in the LLM efforts of the center. Nonetheless, language modeling has defined our activities in several other collaborations.
- We need generative Norwegian language models
- Do we need generative language models in Norwegian? The answer is a resounding yes! The models must be able to convey Norwegian values and attitudes, and they must convey good Norwegian, both Bokmål and Nynorsk.
So said Åse Wetås, director of Språkrådet (the Language Council of Norway), when she opened Trondheim Tech Port and NorwAI's innovation breakfast on language models and innovation on 14 February 2024.
2024-02-27
New language model for public use this winter
- NorwAI will meet the immense interest in working with language models by releasing an open model that is smaller than the research model NorGPT-23 but still fully operational, says Jon Atle Gulla, professor and director at NorwAI.
2023-10-06
Upgrading infrastructure is critical to meet AI demands
Upgrading the national infrastructure is a critical stepping stone in preparing for the quantum leap that technology is now facing.
2023-08-11
American language models influence ChatGPT. That is problematic.
- Several problems arise with artificial intelligence. Now we need to take control of the infrastructure, says Sven Størmer Thaulow, EVP and Chief Data and Technology Officer at Schibsted ASA.
2023-07-05
Schibsted reports on their AI results
2023-05-15
Large language models as public goods
Large language models have huge potential for value creation - but there is a strong need to address issues of control and risk mitigation.
We are now moving towards a huge change in intellectual value creation, driven by the weird and surprisingly sophisticated mimicry of intelligence that large language models (LLMs) provide.
These models have unleashed a wave of creativity. They, and their model cousins that can process, transform and generate sound, images and any digitizable data, have enabled previously impossible products and services along with a torrent of hype.
Due to the enormous amounts of data, compute and brain power required, these important platforms are now mostly developed and controlled by a few very large private technology companies in the US. This is problematic, because along with all the interesting new functionality, large language models also suffer from serious and complicated challenges such as bias, hallucinations and toxicity. Private companies will invariably balance mitigating these issues with the need for profit. They are likely to do the bare minimum required to avoid regulatory retribution and public relations backlash.
2023-03-28
ChatGPT and its inner workings
Media insiders from seven countries got a lecture from NorwAI researcher Benjamin Kille, as interest in the new language models dominated the discussions at a media lab day in Hamburg, Germany.
2023-01-31
The Kahoot test of the AI summary
Participants at the NxtMedia Conference 2022 were able to test journalistically written articles against summaries written by a language robot.
- The Kahoot game came in handy for choosing the winners in the three examples, says adjunct associate professor Jon Espen Ingvaldsen, who ran the test.
2022-12-13
NorwAI to introduce large Norwegian GPT model
NorwAI GPT Language Modeling Project is currently building its version of a large Norwegian model. The model will go into training this spring and will be ready for demonstrations for interested partners, says NorwAI head professor Jon Atle Gulla.
2023-01-31
A new team of research assistants has started at NorwAI
A new team of research assistants will continue our work with Kaia-The Social Robot
NorwAI will continue the research on Social Robotics. This semester three new research assistants have joined us and will develop new features and conduct extensive benchmarks to test Kaia against the state of the art.
2022-09-02
Language experts to report on speech and text to the Storting
The Norwegian Board of Technology (Teknologirådet) advises lawmakers and government. Starting with speech and later continuing with large language models, expert groups will explain the complex language technology step by step. NorwAI's director Jon Atle Gulla is part of the expert group.
2022-05-31
Norway may take a world-leading AI role
- One particular area where Norwegian AI stands out is the genuine interest in fairness, transparency and explainability, which align with societal values in Norway. Therefore, I can see Norwegian AI research taking a world-leading role in these areas, says Krisztian Balog, professor at the University of Stavanger and Staff Research Scientist at Google.
Krisztian Balog heads NorwAI's work package for language technologies. He cooperates with NorwAI's research director, professor Kjetil Nørvaag. The two professors joined forces as general chairs of the successful ECIR conference in Stavanger during the week before Easter, giving an international audience insight into new research results in the broadly conceived area of Information Retrieval.
2022-04-28
A silent challenge
The Language Council of Norway has contacted NorwAI about current research on sign languages. There is ongoing research in Europe on AI-driven sign language processing, and NorwAI is considering looking into the use of machine learning for interpreting Norwegian Sign Language. This visual and silent language is an official minority language in Norway. Research will face some very special challenges if a project materializes.
2022-03-30
Tailoring news content: How Scandinavian media houses have tested recommender systems
Scandinavian newspapers were early adopters of online services 25 years ago. Gradually, some of them explored how recommender systems could enable individually tailored news streams. In a recent article in AI Magazine, NorwAI associates, headed by center director Jon Atle Gulla, explore how Scandinavian media organizations are coping with these new technological opportunities.
2021-12-20
New Language Models in NorwAI
- The NorwAI center is determined to provide new Norwegian language models that are significantly larger and better than what is available today and that can easily be employed in advanced Norwegian NLP applications for industrial use, says center director professor Jon Atle Gulla.
2021-04-20