The 12th International Workshop on Genetic Improvement @ICSE 2023

Skyline at the Shrine of Remembrance, Melbourne, Australia

Event

The 12th instalment of the GI workshop took place in Melbourne on 20 May 2023, collocated with the International Conference on Software Engineering, ICSE 2023.

GI 2023 was a hybrid workshop that ran in-person and virtually.

Best paper award: Generative Art via Grammatical Evolution (Erik M. Fredericks, Abigail C. Diller, and Jared M. Moore)
Best position paper award: Towards Objective-Tailored Genetic Improvement Through Large Language Models (Sungmin Kang and Shin Yoo)
Best presentation award: Towards Objective-Tailored Genetic Improvement Through Large Language Models (Sungmin Kang and Shin Yoo)

Registration

Registration details are available on the ICSE website: https://conf.researchr.org/attending/icse-2023/Registration

Early Bird deadline: 13 March 2023

At least one author of an accepted paper needs to register for the event.

Important Dates

Submission Deadline: 13 January 2023 (Fri)
Notification: 24 February 2023 (Fri)
Camera-ready: 17 March 2023 (Fri)
Workshop: 20 May 2023 (Sat)

Keep up to date with the latest event news via our Twitter: https://twitter.com/gi_of_software.

Keynote

We are happy to announce that Myra B. Cohen (Iowa State University, USA) and Sebastian Baltes will both give keynote speeches at GI@ICSE 2023.

It’s all in the Semantics: When are Genetically Improved Programs Still Correct?
Genetic improvement (GI) is a powerful technique to automatically optimize programs, often for non-functional properties. As such, we expect to retain the original program semantics; hence GI is guided by both a functional test suite and at least one other objective such as program efficiency, memory usage, energy efficiency, etc. An assumption made is that it is possible to improve a program’s non-functional objective while retaining the program’s correctness; however, this assumption may not hold for all types of non-functional properties. In this talk I show why GI is naturally a multi-objective optimization problem and argue that it may be necessary to relax part of the program correctness to satisfy our non-functional goals. I discuss a few recent examples where we have had to balance functional correctness and non-functional objectives and demonstrate how this may lead to programs that are of higher quality in the end. This raises an important question about when it is possible to completely satisfy multiple (potentially competing) program objectives during GI, and when it is semantically impossible. This leads to the ultimate question of what it means for a program to be correct when using GI.

Prof. Cohen is a full professor at Iowa State University (USA), where she holds the Lanh and Oanh Nguyen Chair in Software Engineering in the Department of Computer Science. She is head of Iowa State’s LaVA-Ops, Laboratory for Variability-Aware Assurance and Testing of Organic Programs. As well as genetic improvement, her research covers software testing of highly-configurable software, SBSE, applications of combinatorial designs (CIT), and the synergy between systems and synthetic biology and software engineering. She has served on many software engineering conferences, including this year as the Technical Briefings-track chair of ICSE 2023.

All about the money: Cost modeling and optimization of cloud applications
Cost is an essential non-functional property of cloud applications and is often a primary reason for companies to move to the cloud. One significant advantage of cloud platforms is the possibility to scale compute, storage, and networking resources up and down based on demand. However, as an application scales, so does the cost. Cost transparency of cloud applications is a common problem, and cloud providers have responded by providing means for detecting cost anomalies. However, detecting anomalies after billing is a workaround rather than a solution addressing the core problem. After introducing central cloud computing concepts and typical pricing approaches in the cloud, this talk outlines our vision of a vendor-agnostic cost model enabling reasoning about cost-optimal infrastructure and platform configurations based on expected workloads. The overall goal is to shift cost transparency left, i.e., to the developers and platform engineers who frequently provision cloud environments using web portals or Infrastructure-as-Code (IaC) files. The talk concludes by summarizing the current trend towards Infrastructure-from-Code (IfC), where programming languages and cloud infrastructure descriptions converge into one paradigm, intending to automate infrastructure provisioning as much as possible. This area has huge potential for genetic improvement to optimize the IfC code and the provisioning mechanisms while balancing non-functional properties such as performance and cost.

Dr. Sebastian Baltes is a Principal Expert for Empirical Software Engineering at SAP SE in Germany and an Adjunct Lecturer at the University of Adelaide in Australia. He received his Ph.D. in Computer Science from the University of Trier, Germany, in 2019. His work focuses on software analytics, i.e., processing, analyzing, and visualizing software engineering data to monitor, govern, and improve software development processes and tools. He is further interested in interdisciplinary research and methodological aspects of empirical software engineering. For him, thoroughly analyzing and understanding the state-of-practice is an essential first step towards improving how software is being developed. Dr. Baltes’ research has been published in leading software engineering venues, including ICSE, FSE, TSE, and EMSE. He was awarded a Google Faculty Research Award in 2020 and two ACM SIGSOFT Distinguished Paper Awards (at ICSE 2021 and 2023). For more information, please visit https://empirical-software.engineering.

Accepted Papers

Authors of accepted papers are invited to submit an extended version of their papers to ASE’s Special Issue on Genetic Improvement.

All about the money: Cost modeling and optimization of cloud applications
by Sebastian Baltes

Cost is an essential non-functional property of cloud applications and is often a primary reason for companies to move to the cloud. One significant advantage of cloud platforms is the possibility to scale compute, storage, and networking resources up and down based on demand. However, as an application scales, so does the cost. Cost transparency of cloud applications is a common problem, and cloud providers have responded by providing means for detecting cost anomalies. However, detecting anomalies after billing is a workaround rather than a solution addressing the core problem. After introducing central cloud computing concepts and typical pricing approaches in the cloud, this talk outlines our vision of a vendor-agnostic cost model enabling reasoning about cost-optimal infrastructure and platform configurations based on expected workloads. The overall goal is to shift cost transparency left, i.e., to the developers and platform engineers who frequently provision cloud environments using web portals or Infrastructure-as-Code (IaC) files. The talk concludes by summarising the current trend towards Infrastructure-from-Code (IfC), where programming languages and cloud infrastructure descriptions converge into one paradigm, intending to automate infrastructure provisioning as much as possible. This area has huge potential for genetic improvement to optimize the IfC code and the provisioning mechanisms while balancing nonfunctional properties such as performance and cost.

DebugNS: Novelty Search for Finding Bugs in Simulators
by David Griffin, Susan Stepney, and Ian Vidamour

Novelty search is used to find a range of novel behaviours in a system. Software bugs are behaviours that are a) unexpected and b) incorrect. As the intersection between "novel" and "unexpected" is non-empty, here we overview how novelty search can be employed to find bugs in simulation software. We give an example of this approach applied to the RingSim simulator.

Exploring the Use of Natural Language Processing Techniques for Enhancing Genetic Improvement
by Oliver Krauss

We explore the potential of using large-scale Natural Language Processing (NLP) models, such as GPT-3, for enhancing genetic improvement in software development. These models have previously been used to automatically find bugs, or improve software. We propose using these models as a novel mutator, as well as for explaining the patches generated by genetic improvement algorithms. Our initial findings indicate promising results, but further research is needed to determine the scalability and applicability of this approach across different programming languages.

Generative Art via Grammatical Evolution
by Erik M. Fredericks, Abigail C. Diller, and Jared M. Moore

Generative art produces artistic output via algorithmic design. Common examples include flow fields, particle motion, and mathematical formula visualization. Typically an art piece is generated with the artist/programmer acting as a domain expert to create the final output. A large amount of effort is often spent manipulating and/or refining parameters or algorithms and observing the resulting changes in produced images. Small changes to parameters of the various techniques can substantially alter the final product. We present GenerativeGI, a proof of concept evolutionary framework for creating generative art based on an input suite of artistic techniques and desired aesthetic preferences for outputs. GenerativeGI encodes artistic techniques in a grammar, thereby enabling multiple techniques to be combined and optimized via a many-objective evolutionary algorithm. Specific combinations of evolutionary objectives can help refine outputs reflecting the aesthetic preferences of the designer. Experimental results indicate that GenerativeGI can successfully produce more visually complex outputs than those found by random search.

Genetic Improvement of OLC and H3 with Magpie
by William B. Langdon and Bradley J. Alexander

Magpie (Machine Automated General Performance Improvement via Evolution of software) has been recently developed by Aymeric Blot from PyGGI 2.0. Like PyGGI, it claims to be able to optimise computer source code written in arbitrary programming languages. So far it has been demonstrated on benchmarks written in Python and C. Recently we have used hill climbing to customise two industrial open source programs: Google's Open Location Code (OLC) and Uber's Hexagonal Hierarchical Spatial Index (H3) [W. B. Langdon et al., "Genetic improvement of LLVM intermediate representation", in EuroGP 2023]. Magpie found much faster improvements (reducing instruction counts by up to 15 percent v. 2 percent) which generalise. Various glitches in Magpie are also reported.

It's all in the Semantics: When are Genetically Improved Programs Still Correct?
by Myra B. Cohen

Genetic improvement (GI) is a powerful technique to automatically optimise programs, often for nonfunctional properties. As such, we expect to retain the original program semantics; hence GI is guided by both a functional test suite and at least one other objective such as program efficiency, memory usage, energy efficiency, etc. An assumption made is that it is possible to improve a program's non-functional objective while retaining the program's correctness; however, this assumption may not hold for all types of non-functional properties. In this talk I show why GI is naturally a multi-objective optimization problem and argue that it may be necessary to relax part of the program correctness to satisfy our non-functional goals. I discuss a few recent examples where we have had to balance functional correctness and non-functional objectives and demonstrate how this may lead to programs that are of higher quality in the end. This raises an important question about when it is possible to completely satisfy multiple (potentially competing) program objectives during GI, and when it is semantically impossible. This leads to the ultimate question of what it means for a program to be correct when using GI.

Towards Objective-Tailored Genetic Improvement Through Large Language Models
by Sungmin Kang and Shin Yoo

While Genetic Improvement (GI) is a useful paradigm to improve functional and nonfunctional aspects of software, existing techniques tended to use the same set of mutation operators for differing objectives, due to the difficulty of writing custom mutation operators. We suggest that Large Language Models (LLMs) can be used to generate objective-tailored mutants, expanding the possibilities of software optimisations that GI can perform. We further argue that LLMs and the GI process can benefit from the strengths of one another, and present a simple example demonstrating that LLMs can both improve the effectiveness of the GI optimization process, while also benefiting from the evaluation steps of GI. As a result, we believe that the combination of LLMs and GI has the capability to significantly aid developers in optimizing their software.

Updating Gin's profiler for current Java
by Myles Watkinson and Alexander Brownlee

Genetic improvement is a young and growing field. With much research still to be done, a number of tools to support the research community have emerged, with Gin being one such tool targeted at GI for Java. One core component of Gin is the profiler, which is used to identify hot methods in target applications: methods where the CPU spends most time and so may offer the most fertile sections of code for improvements to run time. Gin's profiler is HPROF, which was included with JDKs up to version 8. HPROF is no longer supported and so needs to be replaced if Gin is to support later versions of Java. Furthermore, little investigation has been made within the GI community comparing different profiling approaches. With this paper and its associated accepted pull request, we replace Gin's CPU profiler with Java Flight Recorder (JFR) to allow Gin to be applied to current Java code, allowing researchers working in GI with more recent JVMs to easily integrate profiling in their pipeline. We also contribute an experimental comparison of the HPROF and JFR profilers for the JVM.

Schedule

This schedule appears in Melbourne’s time zone (UTC+10, Australian Eastern Time).
Presentations for full papers are 20 minutes long, followed by 10 minutes for questions.
Presentations for short papers consist of a 10 minute talk, followed by 5 minutes for questions.

The workshop will take place in Meeting Room 109.

Saturday, May 20, 09:00–10:30 (90 mins)
- 09:00: Welcome and Introductions (15 mins)
- 09:15: Keynote speech — It’s all in the Semantics: When are Genetically Improved Programs Still Correct? — Myra B. Cohen (5+55+15 mins Q&A)
- 10:30: Break & Morning tea (30 mins)
Saturday, May 20, 11:00–12:30 (90 mins)
- 11:00: Generative Art via Grammatical Evolution — Erik M. Fredericks, Abigail C. Diller, and Jared M. Moore (20+10 mins)
- 11:30: Genetic Improvement of OLC and H3 with Magpie — William B. Langdon and Bradley J. Alexander (20+10 mins)
- 12:00: DebugNS: Novelty Search for Finding Bugs in Simulators — David Griffin, Susan Stepney, and Ian Vidamour (10+5 mins)
- 12:15: Discussion (15 mins)
- 12:30: Lunch & Social event (75 mins)
Saturday, May 20, 13:45–15:15 (90 mins)
- 13:45: Keynote speech — All about the money: Cost modeling and optimization of cloud applications — Sebastian Baltes (5+40+15 mins Q&A)
- 14:45: Towards Objective-Tailored Genetic Improvement Through Large Language Models — Sungmin Kang and Shin Yoo (10+5 mins)
- 15:00: Exploring the Use of Natural Language Processing Techniques for Enhancing Genetic Improvement — Oliver Krauss (10+5 mins)
- 15:15: Break & Afternoon tea (30 mins)
Saturday, May 20, 15:45–17:00 (75 mins)
- 15:45: Updating Gin’s profiler for current Java — Myles Watkinson and Alexander Brownlee (20+10 mins)
- 16:15: Discussion, Awards, and Closing (45 mins)

Call For Submissions

We invite submissions that discuss recent developments in all areas of research on, and applications of, Genetic Improvement. The International Workshop on Genetic Improvement is the premier workshop in the field and provides an opportunity for researchers interested in automated program repair and software optimisation to disseminate their work, exchange ideas, and discover new research directions. Topics of interest include both the theory and practice of Genetic Improvement. Applications of GI include, but are not limited to:

Improve efficiency
Decrease memory consumption
Decrease energy consumption
Transplant new functionality
Specialise software
Translate between programming languages
Generate multiple versions of software
Improve low level or binary code
Repair bugs
GI techniques in industrial settings

We invite submissions of two paper types:

Research papers (eight page limit, including references)
Position papers (two page limit, including references)

The best paper and best presentation will be awarded during the workshop.

Detailed formatting instructions for authors are listed here.
We encourage authors to submit early and in-progress work. The workshop emphasises interaction and discussion.

Papers must be submitted through the paper submission website: https://gi-at-icse2023-workshop.hotcrp.com/
These papers will be reviewed in a double-blind manner.
All accepted papers must be presented at the workshop.

Note that per the ICSE 2023 guidelines:

The official publication date is the date the proceedings are made available in the ACM or IEEE Digital Libraries. This date may be up to two weeks prior to the first day of ICSE 2023. The official publication date affects the deadline for any patent filings related to published work.
Purchase of additional pages in the proceedings is not allowed.

Funding Opportunity For Students

We will support up to 5 students by offering to partially reimburse (up to 200 GBP each) registration and travel costs for students whose work is accepted to the GI workshop. Priority will be given based on the student’s need and submission quality.

Students applying for a scholarship should submit a first-author regular paper to the workshop (up to 8 pages long) and plan to present their work in Melbourne in person. Moreover, their supervisor should send a one-paragraph note of recommendation to [email protected] by March 13th listing:

the student’s area of work.
the supervisor’s support of the student’s application.

Workshop Chairs

Vesna Nowack. Since she gained her PhD in Software Engineering in 2016 from the Universitat Politecnica de Catalunya in Barcelona, she has conducted research in supercomputing (Spain) and taught robotics in Germany. Her recent research has been on APR in the UK, including 12 months with Bloomberg (London) published as “On the Introduction of Automatic Program Repair in Bloomberg” by IEEE Software. She is now a Senior Research Assistant at Lancaster University where she continues her work on using GI to automatically fix bugs.

Markus Wagner is currently an Associate Professor at the University of Adelaide, and he will move to Monash University, Melbourne, Australia, in January 2023. He has organised two GI workshops and co-presented GI tutorials at ASE 2020, GECCO 2020 and 2021. This year he was general chair for ACM GECCO-2022 having previously served as Workshop Chair and Competition Chair. Within the IEEE CIS, he has chaired several education-related committees, where he also served as founding chair of two task forces.

Gabin An is a doctoral candidate at KAIST in South Korea, advised by Prof. Shin Yoo. Gabin’s work focuses on faster and more accurate bug assignments and, in close collaboration with industry, improving test suites. She is also behind PyGGI, a Python library for fast prototyping of GI techniques for multiple programming languages. Gabin served as the web chair for GI@ICSE 2020 and the GI workshop PC 2021-2022.

Aymeric Blot is a Senior Lecturer at the Université du Littoral Côte d’Opale, Calais, France. Before that he was a Research Associate conducting research in genetic improvement at the CREST and SOLAR groups at University College London. He received a doctorate from the University of Lille in 2018, following work on automated algorithm design for multi-objective combinatorial optimisation. His research focuses on strengthening GI techniques using knowledge from automated machine learning, algorithm configuration, and evolutionary computation. He maintains and evolves the community website on genetic improvement.

Justyna Petke is a Principal Research Fellow and Proleptic Associate Professor, conducting research in genetic improvement. She is at the Centre for Research on Evolution, Search and Testing at University College London. Her work on genetic improvement was awarded a Silver (GECCO 2014) and a Gold ‘Humie’ (GECCO 2016) and an ACM SIGSOFT Distinguished Paper Award at ISSTA 2015. She was the PC co-Chair for the International Symposium on Search-Based Software Engineering in 2017. She has also organised eight Genetic Improvement Workshops. She currently serves on the editorial boards of the Empirical Software Engineering (EMSE) and Automated Software Engineering (ASE) journals.