Utilizing parameter-efficient fine-tuning methods to improve controllable text generation

Abstract

Language models have become a crucial component of natural language processing (NLP). Despite their remarkable performance across many tasks, the text these models generate is often difficult to control accurately. This lack of control can raise social and ethical issues and can be harmful in applications that require guideline adherence, stylistic personalization, or content moderation. These concerns have led to growing interest in controllable text generation, and in particular in multi-aspect controllable text generation (MCTG). Various MCTG methods have been proposed to improve the controllability of language models while preserving the fluency of the generated text. This thesis investigates augmenting an existing method, Disentangled Controllable Generation (DCG), with parameter-efficient fine-tuning (PEFT) methods to improve the controllability of GPT-2. The main experiments compare four DCG + PEFT variants, namely bottleneck adapters, IA3, LoRA, and SSF, against eight baseline methods. The experiments are conducted on two datasets with different attribute spaces: YELP covers three attributes (sentiment, pronoun, and tense), and Mixture covers two (sentiment and topic). The approaches are also evaluated under two settings called protocols: Original, in which all attribute combinations are seen during training, and Few-Shot, in which only a subset of combinations is seen so that performance on novel combinations can be measured. This is an important consideration, since training on a subset of all attribute combinations and generalizing to novel unseen combinations circumvents the need to collect training data for every attribute combination. The results show that all four PEFT-augmented variants of DCG improve standalone DCG's performance in MCTG. The bottleneck adapter variant raises standalone DCG's average accuracy from 75.47% to 79.21% (an improvement of 3.74 percentage points) and reduces its average perplexity from 70.44 to 46.87 (an improvement of 23.57 points), attaining the best overall score among all baseline methods as measured by the accuracy-perplexity score (aps), a composite metric that combines accuracy and perplexity. Further analysis reveals that reducing the parameter count of the bottleneck adapter-augmented DCG method by 75% still improves DCG's average accuracy and average perplexity, by 3.50 and 16.82 absolute points, respectively. A closer examination of this method shows that as the dropout rate of the bottleneck adapter is decreased, slight gains in average accuracy come at the cost of substantial increases in average perplexity. This highlights the importance of regularizing the method to combat overfitting on sample-specific features.
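For context, a bottleneck adapter (the best-performing PEFT variant above) inserts a small residual module into each transformer layer while the backbone model stays frozen, so only the adapter's parameters are trained. The sketch below is a minimal, generic PyTorch implementation of this idea, not the thesis's actual code; the hidden size of 768 matches GPT-2 small, while the bottleneck size and dropout rate are illustrative hyperparameters.

import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Houlsby-style bottleneck adapter: down-project, non-linearity,
    up-project, residual connection. Only these weights are trained;
    the backbone transformer stays frozen."""

    def __init__(self, hidden_size: int = 768, bottleneck_size: int = 48,
                 dropout: float = 0.1):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)  # compress
        self.up = nn.Linear(bottleneck_size, hidden_size)    # expand back
        self.act = nn.ReLU()
        self.dropout = nn.Dropout(dropout)  # regularizes the adapter update

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # The residual connection preserves the frozen backbone's
        # representation; the adapter learns only a small correction.
        update = self.up(self.act(self.down(hidden_states)))
        return hidden_states + self.dropout(update)

The dropout argument corresponds to the regularization discussed in the abstract: in the reported analysis, lowering the adapter's dropout rate slightly raised attribute accuracy but substantially worsened perplexity, consistent with overfitting on sample-specific features.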


Citation

Hector Auvinen
Utilizing parameter-efficient fine-tuning methods to improve controllable text generation
Advisor(s): Markus Schedl, Shahed Masoudian
Johannes Kepler University Linz, Master's Thesis, 2025.

BibTeX

@mastersthesis{HectorAuvinen2025master-thesis,
    title = {Utilizing parameter-efficient fine-tuning methods to improve controllable text generation},
    author = {Hector Auvinen},
    school = {Johannes Kepler University Linz},
    year = {2025}
}