Learning Natural Coding Conventions
Miltiadis Allamanis, University of Edinburgh
Software systems are built from source code, which defines formally and unambiguously the instructions that a computer needs to execute. Source code is a core artifact of the software engineering process. Because software systems must be maintained and extended, software engineers frequently revisit source code to read, understand and modify it. Source code thus acts as a means of communication between software developers, and therefore it needs to be easily understandable and modifiable.
To achieve this, software teams enforce a set of coding conventions, i.e. self-imposed restrictions on how source code is written that make communication between developers through source code more efficient. One important coding convention concerns the naming of software artifacts: names need to clearly reveal the role and the function of each code artifact. Other conventions include the idiomatic use of source code constructs. These idioms convey easily understandable semantics and therefore aid humans when reasoning about code functionality.
This thesis presents an automated way of inferring and enforcing coding conventions to help software engineers write conventional, maintainable code. We use machine learning, a set of statistical and mathematical modelling methods whose parameters are learned from data and which can be used to make “smart” predictions about previously unseen observations. In particular, this thesis presents machine learning models that learn to suggest conventional names for software engineering artifacts. This task requires novel machine learning models that “understand” the role and the function of source code artifacts and how they compose to provide a distinct functionality.
This dissertation also presents a machine learning-based method that automatically finds widely used source code idioms in a large corpus of source code. Code idioms are “mental chunks” of code that serve a single, easily identifiable semantic purpose. The mined idioms serve as a form of documentation of how code libraries and programming language constructs are used. Finally, we mine semantic idioms, mental chunks of code that are not syntactic but represent common types of operations. We show how these idioms can be used within software engineering tools and to support the evolution of programming languages.
Algorithms for Game-Theoretic Environments
Alkmini Sgouritsa, University of Liverpool
Game Theory offers an appropriate framework for studying the Internet and for modelling situations where participants interact with each other, such as networking and online auctions. Mechanism Design attempts to implement desired social choices in a strategic setting.
This thesis studies how the efficiency of a system degrades due to the selfish behaviour of its agents, in two well-studied settings, auctions and cost-sharing games. The goal is to design mechanisms where the strategic behaviour of the agents results in outcomes as close as possible to the maximum social welfare.
For auctions that are close to the spirit of eBay, this thesis studies a wide class of mechanisms, answering several state-of-the-art open questions. It shows that the mechanism in which the winner is the highest bidder and pays their own bid is the best in this class.
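As a minimal illustration of the allocation and payment rule named above (highest bidder wins and pays their own bid), the rule can be sketched in Python. The function name, the bid representation and the tie-breaking rule are assumptions made for this sketch only; the abstract does not specify them, nor the wider class of mechanisms the thesis compares against.

```python
from typing import Dict, Tuple


def first_price_auction(bids: Dict[str, float]) -> Tuple[str, float]:
    """Return (winner, payment) for a single-item auction.

    The highest bidder wins and pays exactly their own bid.
    Assumption: ties go to the bidder who appears first in `bids`.
    """
    if not bids:
        raise ValueError("at least one bid is required")
    # max() returns the first key with the maximal bid, which gives
    # the insertion-order tie-breaking assumed above.
    winner = max(bids, key=bids.get)
    return winner, bids[winner]


if __name__ == "__main__":
    # Hypothetical bidders and bids, for illustration only.
    print(first_price_auction({"alice": 3.0, "bob": 5.0, "carol": 4.0}))
```

The sketch covers only the outcome rule; it says nothing about how strategic bidders would choose their bids, which is the question the efficiency analysis addresses.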
In cost-sharing games, where the agents share the cost of the resources they use, this thesis proposes new research directions in which the mechanism designer has more information at hand, such as some structural knowledge of the instance. It studies this area thoroughly, either by designing better-performing mechanisms or by proving impossibility results.