Abstract
Intent-aware music recommender systems are a relatively recent approach to personalized music recommendation. By taking into account the user’s current intent, that is, their main aim in consuming the content, these systems can provide more meaningful suggestions that improve user satisfaction. At the same time, Large Language Models (LLMs) are now used in many applications and may open up new opportunities for recommender systems as well. Combining the two fields, LLMs and intent-aware music recommendation, therefore offers opportunities to improve personalized music recommendations. This thesis investigates the potential of LLMs for intent-aware music recommendation by studying how listening intents can be included in LLM prompts to improve the relevance and intent alignment of recommendations. Two main research questions guide this work: (i) How does the inclusion of listening intent and user preferences in LLM prompts affect the relevance and intent alignment of music recommendations? and (ii) What is the impact of different track–intent assignment strategies on the recommendation quality of LLM-based intent-aware music recommenders? To address these questions, an LLM-based intent-aware music recommender framework is developed. It combines the 2020 subset of the LFM-2b dataset, which contains user–track interactions, with the Spotify Million Playlist Dataset enriched with listening intent annotations for each playlist. Based on these annotations, five distinct track–intent matching approaches are implemented to define listening intents at the track level as well. Two LLMs (Google’s Gemini 1.5 Flash and Mistral 7B Instruct v0.3) are evaluated against a Factorization Machine baseline capable of integrating contextual features such as the user’s listening intent.
Recommendations are evaluated offline using fuzzy string matching between recommended songs and the tracks in the user’s ground-truth listening history in three ways: (i) content relevance, where the recommended track has to match one of the tracks the user previously listened to, regardless of the listening intent; (ii) intent-aware relevance, where the listening intent also needs to match; and (iii) intent calibration, which measures the distributional alignment between the listening intents in the user’s history and those in the recommended songs. Standard accuracy-based and beyond-accuracy metrics such as precision, recall, F1 score, NDCG, MRR, coverage, artist diversity, and hit rate at 10 are also used to assess the recommended songs. The results show that including listening intents in LLM prompts improves recommendation quality and intent alignment relative to intent-agnostic prompting strategies, while the Factorization Machine model still provides a competitive baseline. However, several limitations emerge from the restricted offline evaluation setting, the reliance on fuzzy string matching, and the sensitivity of the track–intent matching approaches to the robustness of the playlist–intent mappings. Additionally, the LLM-based intent-aware music recommender framework faces scalability challenges, as the need to query an LLM for every user–intent pair introduces computational bottlenecks that can limit feasibility for large-scale deployment. Overall, this thesis provides insights into the integration of LLMs into the field of intent-aware music recommendation and highlights both their potential and their current limitations for large-scale, real-world deployment.
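The three evaluation views described above can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract names neither the fuzzy-matching library, the similarity threshold, nor the calibration measure. The sketch therefore assumes Python's standard `difflib.SequenceMatcher`, a hypothetical threshold of 0.9, and total variation distance as a stand-in for the distributional comparison. All function names are illustrative.

```python
from collections import Counter
from difflib import SequenceMatcher


def normalize(artist, title):
    """Build a lowercase 'artist - title' key for string comparison."""
    return f"{artist} - {title}".lower().strip()


def fuzzy_match(a, b, threshold=0.9):
    """True if the similarity ratio reaches the (assumed) threshold."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def is_hit(rec, history, intent=None, threshold=0.9):
    """Check one recommendation against the user's history.

    rec:     (artist, title)
    history: list of (artist, title, intent)
    intent:  None -> content relevance (intent ignored);
             str  -> intent-aware relevance (history intent must match).
    """
    rec_key = normalize(*rec)
    for artist, title, hist_intent in history:
        if intent is not None and hist_intent != intent:
            continue
        if fuzzy_match(rec_key, normalize(artist, title), threshold):
            return True
    return False


def precision_at_k(recs, history, k=10, intent=None):
    """Fraction of the top-k recommendations that are hits."""
    top = recs[:k]
    if not top:
        return 0.0
    return sum(is_hit(r, history, intent) for r in top) / len(top)


def intent_distribution(intents):
    """Relative frequency of each intent label."""
    counts = Counter(intents)
    total = sum(counts.values())
    return {label: c / total for label, c in counts.items()}


def total_variation(p, q):
    """Total variation distance between two intent distributions
    (0 = identical, 1 = disjoint). Assumed stand-in for calibration."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Toy data, invented for illustration only.
history = [("Daft Punk", "Get Lucky", "party"),
           ("Nils Frahm", "Says", "focus")]
recs = [("DAFT PUNK", "Get Lucky"), ("Nils Frahm", "Says"), ("Metallica", "One")]

content_p = precision_at_k(recs, history)                 # intent ignored
focus_p = precision_at_k(recs, history, intent="focus")   # intent must match
calibration_gap = total_variation(
    intent_distribution([h[2] for h in history]),
    intent_distribution(["focus"]),  # intents of the recommended tracks
)
```

In this toy example two of the three recommendations match the history when intent is ignored, but only one matches when the requested intent is "focus", which is the gap between content relevance and intent-aware relevance that the evaluation is designed to expose.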
Citation
Petra Jósár
Large Language Models for Intent-aware Music Recommendation
Advisor(s): Markus Schedl,
Johannes Kepler University Linz, Master's Thesis, 2025.
BibTeX
@mastersthesis{PetraJósár2025master-thesis,
title = {Large Language Models for Intent-aware Music Recommendation},
author = {Petra Jósár},
school = {Johannes Kepler University Linz},
year = {2025}
}