Here you will find topic areas for theses supervised at the Chair of Software Systems. We welcome your own topic proposals, but we will also work with you to find a suitable thesis in one of the topic areas listed below.

If you are interested in a thesis at the Chair of Software Systems, send an email with the subject “Anfrage Abschlussarbeit” (thesis inquiry) and the following information to Alina Mailach >Mail<:

  • Degree program,
  • Date on which you would like to start the thesis,
  • Topic proposal and/or the topic area(s) at the Chair of Software Systems that interest you, and
  • A list of the elective modules (including compulsory electives) that you have completed and consider relevant for the chosen topic areas.

For the winter semester 2024/25, there is no supervision capacity left.

Topic Areas

This topic area explores the development and application of methods that infer the effect configuration options have on software performance and energy consumption. The objective is to maximize insights while minimizing analysis costs, because, typically, a large number of software executions under varying configurations is necessary to obtain a good model. Key strategies include:

  • Active Learning: Selecting the most informative data points for model training, reducing the number of experiments required.
  • Hierarchical Models: Structuring models to first analyze high-level features, then drill down to more specific characteristics, improving predictive accuracy while reducing computational overhead.
  • Statistical Models: Leveraging Bayesian inference and other statistical approaches to quantify uncertainties and identify key performance drivers.
  • Machine Learning Algorithms: Utilizing regression models, decision trees, and neural networks to predict system behavior under varying configurations.
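
To make the active learning strategy listed above more concrete, here is a minimal sketch in Python. It is not a method used at the chair. It assumes a toy system with four binary options, a hypothetical `measure` function standing in for a real benchmark run, and a simple committee of nearest-neighbor predictors trained on bootstrap samples. In each round, the configuration on which the committee disagrees most is measured next.

```python
import itertools
import random
import statistics


def measure(config):
    """Hypothetical stand-in for a real benchmark run (assumption)."""
    a, b, c, d = config
    return 10.0 + 5.0 * a + 2.0 * b + 8.0 * a * c  # option d has no influence


def knn_predict(train, config, k=2):
    """Predict performance as the mean of the k nearest measured configurations."""
    def hamming(item):
        return sum(x != y for x, y in zip(item[0], config))
    nearest = sorted(train, key=hamming)[:k]
    return statistics.mean(perf for _, perf in nearest)


def committee_disagreement(train, config, rng, members=10):
    """Spread of predictions across models trained on bootstrap samples."""
    predictions = []
    for _ in range(members):
        bootstrap = [rng.choice(train) for _ in train]
        predictions.append(knn_predict(bootstrap, config))
    return statistics.pstdev(predictions)


def active_learning(budget, seed=0):
    """Measure configurations one by one, always picking the most uncertain one."""
    rng = random.Random(seed)
    pool = list(itertools.product([0, 1], repeat=4))
    rng.shuffle(pool)
    # Start from a small random sample, then query the pool adaptively.
    train = [(config, measure(config)) for config in pool[:3]]
    pool = pool[3:]
    while len(train) < budget and pool:
        query = max(pool, key=lambda c: committee_disagreement(train, c, rng))
        pool.remove(query)
        train.append((query, measure(query)))
    return train


if __name__ == "__main__":
    for config, perf in active_learning(budget=8):
        print(config, perf)
```

In a real study, `measure` would be replaced by actual executions of the system, and the nearest-neighbor committee by whichever learner is being evaluated.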

Cost efficiency is especially crucial when we try to understand not only each option’s influence on performance, but also the influence of environmental factors, such as workload and hardware platform. To this end, novel methods may be evaluated, which will involve conducting measurements on the chair’s compute hardware.

Example Research Questions:

  • How can active learning be leveraged to reduce the number of performance experiments required while maintaining prediction accuracy?
  • Knowing which options’ influence changes across workloads, can we reduce the sampling effort for new environments?
  • Can we adaptively reduce the number of performance measurement repetitions with an uncertainty-aware model?
  • How can we incorporate environmental factors such as workload characteristics and hardware benchmark scores to generalize to new environments?
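
As an illustration of the question on adaptive measurement repetitions, the following Python sketch repeats a measurement only until the confidence interval of the mean is narrow enough. It is one possible baseline, not an uncertainty-aware model. The function name, the thresholds, and the use of a normal approximation (z = 1.96) are assumptions chosen for illustration; a t-distribution would be more accurate for very few samples.

```python
import math
import random
import statistics


def measure_until_stable(run_once, rel_half_width=0.02, min_runs=5,
                         max_runs=50, z=1.96):
    """Repeat run_once() until the approximate 95% confidence interval of
    the mean is within rel_half_width of the mean, or max_runs is reached."""
    samples = [run_once() for _ in range(min_runs)]
    while len(samples) < max_runs:
        mean = statistics.mean(samples)
        half_width = z * statistics.stdev(samples) / math.sqrt(len(samples))
        if half_width <= rel_half_width * abs(mean):
            break  # estimate is precise enough, stop measuring
        samples.append(run_once())
    return statistics.mean(samples), len(samples)


if __name__ == "__main__":
    rng = random.Random(42)
    # Simulated benchmarks (assumption): same mean, different noise levels.
    quiet = lambda: rng.gauss(100.0, 0.5)
    noisy = lambda: rng.gauss(100.0, 15.0)
    print("quiet:", measure_until_stable(quiet))
    print("noisy:", measure_until_stable(noisy))
```

A stable configuration stops after the minimum number of runs, while a noisy one consumes more of the measurement budget.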

Energy consumption of source code is an important aspect that more and more developers care about nowadays. Being able to understand and reduce the energy consumption of a software product in early development stages would be beneficial, and therefore, CI/CD pipeline integration is necessary. However, integrating energy tests into an automated pipeline adds some complexity. Those pipelines run in parallel, on different servers, with varying workloads, which adds a varying amount of uncertainty.

A thesis in this topic area could focus on identifying the sources of measurement noise in such systems and quantifying its amount. Further, developers need code-level information not only to optimize a software system’s configuration, but also to improve the energy consumption of the code itself. Therefore, suitable is needed.

Example Research Questions:

  • What is the influence of the different layers of virtualization on the significance of energy measurements in CI pipelines?
  • Can we learn code-level performance/energy models using data from CI pipelines?

As covered in the lecture Search-Based Software Engineering, genetic algorithms (GAs) have the potential to find good solutions to NP-hard problems in software engineering within a feasible amount of time. For example, GAs can generate test suites, optimize the performance of a configurable software system, and automate debugging. Challenges include tailoring algorithms towards a specific problem and choosing a suitable problem representation. Prior to working on a thesis in this topic area, one should survey the related work and determine whether the problem has already been studied extensively.
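
The following Python sketch shows the basic loop of a genetic algorithm applied to configuration optimization. It is a minimal illustration under stated assumptions, not material from the lecture: configurations are bit strings of twelve binary options, `runtime` is a hypothetical cost model standing in for real measurements, and the operators (tournament selection, one-point crossover, bit-flip mutation, elitism) are common textbook choices.

```python
import random

N_OPTIONS = 12


def runtime(config):
    """Hypothetical cost model (assumption): lower is better."""
    cost = 100.0
    cost -= 4.0 * sum(config[:6])   # each of the first six options saves time
    cost += 3.0 * sum(config[6:])   # each of the last six options costs time
    if config[0] and config[6]:
        cost += 10.0                # interaction between two options
    return cost


def tournament(population, rng, size=3):
    """Pick the fastest of a few randomly drawn individuals."""
    return min(rng.sample(population, size), key=runtime)


def crossover(parent_a, parent_b, rng):
    """One-point crossover of two bit strings."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def mutate(config, rng, rate=0.05):
    """Flip each bit independently with a small probability."""
    return tuple(1 - bit if rng.random() < rate else bit for bit in config)


def evolve(generations=40, pop_size=30, seed=1):
    rng = random.Random(seed)
    population = [tuple(rng.randint(0, 1) for _ in range(N_OPTIONS))
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: the best individual survives unchanged.
        children = [min(population, key=runtime)]
        while len(children) < pop_size:
            child = crossover(tournament(population, rng),
                              tournament(population, rng), rng)
            children.append(mutate(child, rng))
        population = children
    return min(population, key=runtime)


if __name__ == "__main__":
    best = evolve()
    print(best, runtime(best))
```

The choice of representation is the part that typically needs the most tailoring: bit strings fit binary options well, but numeric options or constrained configuration spaces call for different encodings and operators.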

LLM-based tools, such as Copilot, ChatGPT, or CodeWhisperer, often promise to help developers build software more efficiently and effectively. However, the specific impact these tools have on the development processes of individuals and teams remains unclear. In this topic area, a thesis could investigate questions focusing on productivity, decision making, team communication, developers’ behaviors and emotions, or adoption barriers faced when introducing LLM-based tools.

Example Research Questions:

  • How does the introduction of an LLM-based assistant change different productivity metrics of individuals and teams?
  • How do programming beginners interact with LLM-based chatbots?

In software development, code documentation and commenting are crucial, yet often undervalued. To this end, expensive domain-specific large language models (LLMs) have been proposed that automate typical software engineering tasks such as documentation writing. However, these models usually require considerable computational resources, often high-end GPU-powered cloud servers. The resulting costs pose a considerable challenge, especially for small-to-medium enterprises and individual developers who cannot afford to run such resource-intensive operations continuously.

Example Research Questions:

  • How do smaller, local LLMs perform against proprietary LLMs on [THE SOFTWARE ENGINEERING TASK]?
  • How does [A NOVEL TECHNIQUE TO IMPROVE LOCAL LLMs] improve local LLMs and compare against proprietary LLMs?
  • For task X, how do the environmental emissions/costs differ between large-scale LLMs and small-scale, local LLMs?

In contrast to traditional software, ML software is highly data-dependent and requires continuous monitoring and updating to maintain performance. The unpredictable and experimental nature of data and model behavior adds complexity to development and maintenance. Additionally, new roles with diverse skill sets are driving these products, adding complexity to processes and organizational structures. A thesis in this topic area could focus on technical, social, or ethical aspects of engineering ML-enabled systems, for example by developing tools to solve a specific task, by reviewing (gray) literature, or by conducting interview or survey studies to understand specific challenges or best practices. Currently, we are specifically interested in software systems containing or interacting with an LLM. Nevertheless, if you have a thesis in mind related to “traditional” ML, please approach us.

Example Research Questions:

  • What tools exist to support developers with prompt engineering, and how do they align with developers’ needs?

Modern software development involves a plethora of technologies, including build tools, code frameworks, databases, CI/CD pipelines and containerization technologies. All of these technologies encode not only hundreds of configuration options in their own syntax, semantics and structure, but also introduce non-obvious configuration dependencies across the used technology stack. Unfortunately, there is no complete overview of the intertwined configuration dependencies. The lack of an overview and the rapid evolution of software systems inevitably lead to misconfigurations, which often infiltrate a software project unnoticed. The resulting misconfigurations, such as inconsistent configurations, are complex, far-reaching and much more difficult to detect and resolve than typical software errors. In this area, a thesis could investigate topics that focus on dependency detection, misconfiguration resolution, dependency validation, configuration space evolution and maintenance, and the dimension of software configuration in ML-enabled systems.
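
To illustrate what a cross-technology configuration dependency can look like, the following Python sketch checks one well-known case: the port a Spring Boot application listens on (`server.port` in `application.properties`) should be among the ports declared with `EXPOSE` in the Dockerfile. The function names and the simplified parsing are assumptions made for this example; real files may use variables, profiles, or YAML, which this sketch ignores.

```python
import re


def exposed_ports(dockerfile_text):
    """Collect numeric ports from EXPOSE instructions (variables are skipped)."""
    ports = set()
    for line in dockerfile_text.splitlines():
        match = re.match(r"\s*EXPOSE\s+(.+)", line, flags=re.IGNORECASE)
        if not match:
            continue
        for token in match.group(1).split():
            number = token.split("/")[0]  # drop an optional "/tcp" or "/udp"
            if number.isdigit():
                ports.add(int(number))
    return ports


def property_value(properties_text, key):
    """Return the value of a key in a simple key=value properties file."""
    for line in properties_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            name, value = line.split("=", 1)
            if name.strip() == key:
                return value.strip()
    return None


def check_port_dependency(dockerfile_text, properties_text):
    """Report an inconsistency if the application port is not exposed."""
    port = property_value(properties_text, "server.port")
    if port is None or not port.isdigit():
        return []  # nothing to check in this simplified sketch
    if int(port) in exposed_ports(dockerfile_text):
        return []
    return [f"server.port={port} is not exposed in the Dockerfile"]


if __name__ == "__main__":
    dockerfile = "FROM eclipse-temurin:17\nEXPOSE 8080/tcp\n"
    print(check_port_dependency(dockerfile, "server.port=9090\n"))
```

A single hand-written rule like this does not scale to a whole technology stack, which is why automated dependency detection and validation are open research topics in this area.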

Example Research Questions:

  • How do configuration dependencies manifest in ML-enabled software systems?
  • Can LLMs extract configuration dependencies from Stack Overflow posts?

We are open to supervising thesis topics in companies, provided the topic is a good fit. A non-disclosure agreement (NDA) is explicitly ruled out! We also have contacts with renowned companies and can help arrange theses.

Completed Theses

Managing Prompt Engineering Experiments: Tools and User’s Needs (Bachelor)

Socio-Technical Challenges in Software Engineering (Bachelor)

Evolution of the Configuration Space of Open-Source Software Systems (Bachelor)

Ermittlung von Konfigurationsoptionen im Source Code mit Fokus auf Machine Learning Bibliotheken in Python (Bachelor)

Tracking Configuration Options on Source Code Focusing on Java Frameworks (Bachelor)

Klassifikation von Konfigurationsabhängigkeiten zwischen verschiedenen Technologien (Bachelor)

Software Engineering 4 AI in der Praxis (Bachelor)

Abdeckungsanalyse von konfigurierbaren Performance-Benchmarks und Softwaresystemen (Bachelor/Master)

Energy Consumption in Practice (Bachelor)

Studie über Konfigurationsfehler (Bachelor)

Automatisiertes Beheben von Konfigurationsfehlern (Bachelor/Master)

Gruppenbasierte Sensitivitätsanalyse für die Vorhersage nicht-funktionaler Eigenschaften konfigurierbarer Softwaresysteme (Bachelor)

Group Sampling for Learning Configuration-specific Software Performance (Bachelor)