Thesis defense Lemei Zhang
Successful PhD thesis defence for Lemei Zhang
Lemei Zhang began her PhD studies at NTNU in 2016, moving from China to Norway.
After giving a trial lecture, she defended her thesis digitally before her opponents on December 7, 2021.
Lemei’s thesis, titled "Exploring Multifaced User Modelling in Textual Data Streams", addresses a difficult problem well known to the media industry.
The opponents were professor Markus Zanker and associate professor Li Chen, both internationally renowned authorities on recommender systems. Both are active in the RecSys community, and Markus Zanker is also co-author of one of the most widely used books on recommender systems, "Recommender Systems: An Introduction", published by Cambridge University Press, 2011. The second opponent, associate professor Dr. Li Chen, works at the Department of Computer Science at Hong Kong Baptist University.
Lemei took on a common challenge for industries facing many fly-by customers or non-subscribers. In many situations it is either impossible or undesirable to build and store user profiles for users of the recommender system. In the media world, for example, there are many users who do not have an account or subscription, and we can hardly recognize them over time when they use many different devices to read the articles. In other cases, it may be privacy audits that make it difficult to store user profiles.
Lemei’s solution to this is to use machine learning (more specifically deep learning) to recommend articles based on what the user has read in the current session. No information about the user is stored over time: the system starts with a clean slate every time the user comes to the newspaper portal. We call this "session-based recommendations".
Session-based recommendations are difficult to perform because one has little information about the user. Such users are often referred to as "cold starters".
Lemei compensates by looking for patterns in which kinds of articles are read in the same session (regardless of who the user in the session is), and she uses more of the user's context to come up with good recommendations.
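To make the idea of session patterns concrete, here is a minimal sketch in Python. It is not Lemei's method, which relies on deep learning; it swaps in a simple co-occurrence count to show how recommendations can be made from anonymous sessions alone. The function names, article ids and example sessions are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(sessions):
    """Count how often two articles are read in the same anonymous session."""
    counts = defaultdict(Counter)
    for session in sessions:
        for a, b in combinations(sorted(set(session)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(counts, current_session, k=3):
    """Rank unread articles by co-occurrence with the current session."""
    seen = set(current_session)
    scores = Counter()
    for article in seen:
        for other, n in counts.get(article, {}).items():
            if other not in seen:
                scores[other] += n
    # Highest score first; ties are broken by article id for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [article for article, _ in ranked[:k]]


# Hypothetical past sessions: lists of article ids, with no user ids at all.
past_sessions = [
    ["election", "budget", "tax"],
    ["election", "budget"],
    ["football", "skiing"],
    ["budget", "tax"],
]
counts = build_cooccurrence(past_sessions)
print(recommend(counts, ["election"]))
```

Nothing in this sketch identifies a reader: the only input is which articles appeared together in past sessions and which articles have been read in the current one.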
In addition, she uses knowledge graphs. These are graphs that define the entities and terms discussed in the articles and how they relate to each other. This makes it possible for her to see connections in content between articles and words, even where these do not emerge directly from the articles themselves.
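The following Python sketch illustrates how a knowledge graph can connect two articles that share no entity directly. It is an illustration under stated assumptions, not the model from the thesis: the toy graph, the one-hop expansion and the Jaccard score are all choices made for this example.

```python
# Hypothetical toy knowledge graph: each entity maps to directly related entities.
knowledge_graph = {
    "Rosenborg": {"Trondheim", "Eliteserien"},
    "Lerkendal": {"Trondheim", "Rosenborg"},
    "Storting": {"Oslo"},
}


def expand(entities, graph):
    """Return the entities plus their one-hop neighbours in the graph."""
    expanded = set(entities)
    for entity in entities:
        expanded |= graph.get(entity, set())
    return expanded


def relatedness(entities_a, entities_b, graph):
    """Jaccard overlap of the graph-expanded entity sets of two articles."""
    a = expand(entities_a, graph)
    b = expand(entities_b, graph)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Entities mentioned in three hypothetical articles.
article_1 = {"Rosenborg"}
article_2 = {"Lerkendal"}
article_3 = {"Storting"}

print(relatedness(article_1, article_2, knowledge_graph))
print(relatedness(article_1, article_3, knowledge_graph))
```

The first two articles mention different entities, so a comparison of the raw text finds no overlap. The link between them appears only after each article's entities are expanded through the graph.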
She has tested her models on social media data (e.g. Twitter), as well as on pure news corpora.
In addition, Lemei has been our chief architect in defining the dataset from Adresseavisen, known as the ‘Adressa dataset’, and making it available to research environments all over the world.
As of today, the article introducing the ‘Adressa dataset’ has been cited 78 times in research articles worldwide. This is quite an achievement, and it makes the dataset one of the world’s most popular of its kind.
Topic for the trial lecture: “Give a general introduction to recommender systems and reflect on their impact on users and society”
Doctoral thesis "Exploring Multifaced User Modelling in Textual Data Streams", by Lemei Zhang
