Understanding the internal functioning of evolutionary algorithms is essential for improving their performance and reliability. The computational resources available in current mainstream computers open up new, previously infeasible research directions, making a comprehensive theoretical analysis of the mechanisms and dynamics of these algorithms with modern tools possible. Recent algorithmic advances, such as offspring selection combined with linear scaling, have enabled genetic programming (GP) to achieve high-quality results in system identification in fewer than 50 generations, using populations of only several hundred individuals. As a result, the active gene pool of the evolutionary search remains manageable and can be subjected to new theoretical investigations closely related to GP schema theories, building block hypotheses and bloat theories.
Genetic algorithms emulate emergent systems in which complex patterns form from an initially simple and random pool of elementary structures. In GP, complexity emerges under the influence of stabilizing selection, which preserves the useful genetic variation created by recombination and mutation. The mapping between the structures used for solution representation and those used for fitness evaluation has a major influence on algorithm behavior. Population-wide effects concerning building blocks, genetic diversity and bloat can be seen conceptually as results of the complex interaction between phenotypic operators (selection) and genotypic operators (mutation and recombination). This coupling, known as the “variation-selection loop”, is the main engine of emergent behavior in GP and constitutes the main topic of this research. This thesis aims to provide a unified theoretical framework that can explain GP evolution. To this end, it explores how the genotype-phenotype map, together with the evolutionary operators (selection, recombination, mutation), determines algorithmic behavior. As the title suggests, the main contribution is a novel “tracing” framework that makes it possible to determine the exact patterns of building block and gene propagation through the generations, and the way smaller elements are gradually assembled into more complex structures by the evolutionary algorithm.
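To make the idea of tracing more concrete, the following is a minimal illustrative sketch, not the framework developed in this thesis. It assumes that each variation step (recombination or mutation) records which parents produced which child, so that the ancestry of any individual can later be recovered. The class `Genealogy`, its methods, and the individual identifiers are all hypothetical names chosen for illustration; a full tracing framework would additionally record which subtrees were exchanged or altered.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class Genealogy:
    """Hypothetical parentage record: child id -> ids of its parents."""
    parents: Dict[str, Tuple[str, ...]] = field(default_factory=dict)

    def record(self, child: str, *parent_ids: str) -> None:
        # No parents: member of the initial population.
        # One parent: mutation. Two parents: recombination.
        self.parents[child] = tuple(parent_ids)

    def ancestors(self, individual: str) -> Set[str]:
        # Walk the parentage links backwards, visiting each ancestor once.
        seen: Set[str] = set()
        stack = list(self.parents.get(individual, ()))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(self.parents.get(node, ()))
        return seen


# Illustrative run with made-up individuals.
g = Genealogy()
for founder in ("a", "b", "c"):
    g.record(founder)          # initial population
g.record("d", "a", "b")        # recombination of a and b
g.record("e", "d")             # mutation of d
g.record("f", "e", "c")        # recombination of e and c

print(sorted(g.ancestors("f")))
```

Recording parentage at the level of whole individuals keeps the sketch short; it only shows which individuals contributed to a solution, not which building blocks they contributed.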