Doctoral thesis (Dissertations and theses)
Diversity Preserving Genetic Algorithms - Application to the Inverted Folding Problem and Analogous Formulated Benchmarks
Nielsen, Sune Steinbjorn
2016
 

Files


Full Text
Thesis_Sune_S_Nielsen_24_02_2016_updated_17_08_2016_final.pdf
Author preprint (6.99 MB)
All documents in ORBilu are protected by a user license.

Details



Keywords :
Genetic Algorithms; Inverted Folding Problem; Diversity Preservation
Abstract :
[en] Protein structure prediction is an essential step in understanding the molecular mechanisms of living cells, with widespread applications in biotechnology and health. Among the open problems in the field, the Inverse Folding Problem (IFP), which consists of finding sequences that fold into a defined structure, is in itself an important research problem at the heart of most rational protein design approaches. In brief, solutions to the IFP are protein sequences that will fold into a given protein structure, in contrast to conventional structure prediction, where the solution is the structure into which a given sequence folds. This inverse approach is viewed as a simplification because the near-infinite number of structural conformations of a protein can be disregarded, and only sequence-to-structure compatibility needs to be determined. Additional emphasis has been put on generating many sequences dissimilar from the known reference sequence, rather than finding only one solution. To solve the IFP computationally, a novel formulation of the problem was proposed in which candidate solutions are evaluated in terms of their predicted secondary structure match. In addition, two specialised Genetic Algorithms (GAs) were developed for solving the IFP and compared with existing algorithms in terms of performance. Experimental results demonstrated the superior performance of the developed algorithms, in terms of both model score and diversity of the generated sets of solutions, i.e. new protein sequences. A number of landscape analysis experiments were conducted on the IFP model, enabling the development of an original benchmark suite of analogous problems. These benchmarks were shown to share many characteristics with their IFP model counterparts, but are executable in a fraction of the time.
To validate the IFP model and the algorithm output, a subset of the generated solutions was selected for further inspection through full tertiary structure prediction and comparison to the original protein structure. Congruence was then assessed by superposition and secondary structure annotation statistics. The results demonstrated that an optimisation process relying on a fast secondary structure approximation, such as the IFP model, makes it possible to obtain meaningful sequences.
Research center :
ULHPC - University of Luxembourg: High Performance Computing
Disciplines :
Computer science
Author, co-author :
Nielsen, Sune Steinbjorn; University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC)
Language :
English
Title :
Diversity Preserving Genetic Algorithms - Application to the Inverted Folding Problem and Analogous Formulated Benchmarks
Defense date :
24 February 2016
Number of pages :
248
Institution :
Unilu - University of Luxembourg, Luxembourg
Degree :
Docteur de l’Université du Luxembourg en Informatique
Promotor :
Jury member :
Talbi, El-Ghazali
Danoy, Grégoire  
Jurkowski, Wiktor
Focus Area :
Systems Biomedicine
Name of the research project :
Evoperf
Funders :
AFR
Available on ORBilu :
since 24 August 2016

Statistics


Number of views
128 (21 by Unilu)
Number of downloads
187 (5 by Unilu)
