Results 1-15 of 15.
Full Text
Peer Reviewed
Microscopic energy consumption modelling of electric buses: model development, calibration, and validation
Fiori, Chiara; Montanino, Marcello; Nielsen, Sune et al

in Transportation Research. Part D, Transport and Environment (2021)

Full Text
Peer Reviewed
Machine Learning to Support the Presentation of Complex Pathway Graphs.
Nielsen, Sune Steinbjorn UL; Ostaszewski, Marek UL; McGee, Fintan et al

in IEEE/ACM transactions on computational biology and bioinformatics (2019)

Visualization of biological mechanisms by means of pathway graphs is necessary to better understand the often complex underlying system. Manual layout of such pathways or maps of knowledge is a difficult and time-consuming process. Node duplication is a technique that makes layouts with improved readability possible by reducing edge crossings and shortening edge lengths in drawn diagrams. In this article we propose an approach using Machine Learning (ML) to facilitate parts of this task by training a Support Vector Machine (SVM) with actions taken during manual biocuration. Our training input is a series of incremental snapshots of a diagram describing mechanisms of a disease, progressively curated by a human expert employing node duplication in the process. As a test of the trained SVM models, they are applied to a single large instance and 25 medium-sized instances of hand-curated biological pathways. Finally, in a user validation study, we compare the model predictions to the outcome of a node duplication questionnaire answered by users of biological pathways with varying experience. We successfully predicted nodes for duplication and emulated human choices, demonstrating that our approach can effectively learn human-like node duplication preferences to support curation of pathway diagrams in various contexts.

Full Text
Diversity Preserving Genetic Algorithms - Application to the Inverted Folding Problem and Analogous Formulated Benchmarks
Nielsen, Sune Steinbjorn UL

Doctoral thesis (2016)

Protein structure prediction is an essential step in understanding the molecular mechanisms of living cells with widespread applications in biotechnology and health. Among the open problems in the field, the Inverse Folding Problem (IFP) that consists in finding sequences that fold into a defined structure is, in itself, an important research problem at the heart of most rational protein design approaches. In brief, solutions to the IFP are protein sequences that will fold into a given protein structure, contrary to conventional structure prediction where the solution consists of the structure into which a given sequence folds. This inverse approach is viewed as a simplification due to the fact that the near infinite number of structure conformations of a protein can be disregarded, and only sequence to structure compatibility needs to be determined. Additional emphasis has been put on the generation of many sequences dissimilar from the known reference sequence instead of finding only one solution. To solve the IFP computationally, a novel formulation of the problem was proposed in which possible problem solutions are evaluated in terms of their predicted secondary structure match. In addition, two specialised Genetic Algorithms (GAs) were developed specifically for solving the IFP problem and compared with existing algorithms in terms of performance. Experimental results outlined the superior performance of the developed algorithms, both in terms of model score and diversity of the generated sets of problem solutions, i.e. new protein sequences. A number of landscape analysis experiments were conducted on the IFP model, enabling the development of an original benchmark suite of analogous problems. These benchmarks were shown to share many characteristics with their IFP model counterparts, but are executable in a fraction of the time. 
To validate the IFP model and the algorithm output, a subset of the generated solutions was selected for further inspection through full tertiary structure prediction and comparison to the original protein structure. Congruence was then assessed by super-positioning and secondary structure annotation statistics. The results demonstrated that an optimisation process relying on a fast secondary structure approximation, such as the IFP model, makes it possible to obtain meaningful sequences.

Full Text
Peer Reviewed
Tackling the IFP Problem with the Preference-Based Genetic Algorithm
Nielsen, Sune Steinbjorn UL; Ferreira Torres, Christof UL; Danoy, Grégoire UL et al

in Proceedings of the Genetic and Evolutionary Computation Conference 2016 (2016)

Full Text
Peer Reviewed
Preference-Based Genetic Algorithm for Solving the Bio-Inspired NK Landscape Benchmark
Ferreira Torres, Christof UL; Nielsen, Sune Steinbjorn UL; Danoy, Grégoire UL et al

in 7th European Symposium on Computational Intelligence and Mathematics (ESCIM) (2015, October)

Full Text
Peer Reviewed
An NK Landscape Based Model Mimicking the Protein Inverse Folding Problem
Nielsen, Sune Steinbjorn UL; Danoy, Grégoire UL; Talbi, El-Ghazali et al

in 27th European Conference on Operational Research (EURO) (2015)

Full Text
Peer Reviewed
DISEASES: text mining and data integration of disease-gene associations.
Pletscher-Frankild, Sune; Palleja, Albert; Tsafou, Kalliopi et al

in Methods (2015), 74

Text mining is a flexible technology that can be applied to numerous different tasks in biology and medicine. We present a system for extracting disease-gene associations from biomedical abstracts. The system consists of a highly efficient dictionary-based tagger for named entity recognition of human genes and diseases, which we combine with a scoring scheme that takes into account co-occurrences both within and between sentences. We show that this approach is able to extract half of all manually curated associations with a false positive rate of only 0.16%. Nonetheless, text mining should not stand alone, but be combined with other types of evidence. For this reason, we have developed the DISEASES resource, which integrates the results from text mining with manually curated disease-gene associations, cancer mutation data, and genome-wide association studies from existing databases. The DISEASES resource is accessible through a web interface at http://diseases.jensenlab.org/, where the text-mining software and all associations are also freely available for download.

Full Text
Peer Reviewed
NK Landscape Instances Mimicking the Protein Inverse Folding Problem Towards Future Benchmarks
Nielsen, Sune Steinbjorn UL; Danoy, Grégoire UL; Bouvry, Pascal UL et al

in GECCO Companion '15 Proceedings of the Companion Publication of the 2015 Annual Conference on Genetic and Evolutionary Computation (2015)

This paper introduces two new nominal NK Landscape model instances designed to mimic the properties of one challenging optimisation problem from biology: the Inverse Folding Problem (IFP), here focusing on a simpler secondary structure version. Through landscape analysis tests, numerous problem properties are identified and used to parameterise and validate model instances in terms of epistatic links, adaptive- and random walk characteristics. Then the performance of different Genetic Algorithms (GAs) is compared on both the new NK Models and the original IFP, in terms of population diversity, solution quality and convergence characteristics. It is demonstrated that very similar properties are captured in all presented tests with a significantly faster evaluation time compared to the real IFP. The future purpose of such a model is to provide a generic benchmark for algorithms targeting protein sequence optimisation, specifically in protein design. It may also provide the foundation for more in-depth studies of the size, shape and characteristics of the solution space of good solutions to the IFP.

Full Text
Peer Reviewed
A Novel Multi-objectivisation Approach for Optimising the Protein Inverse Folding Problem
Nielsen, Sune Steinbjorn UL; Danoy, Grégoire UL; Jurkowski, Wiktor et al

in Applications of Evolutionary Computation: 18th European Conference, EvoApplications 2015, Copenhagen, Denmark, April 8-10, 2015, Proceedings (2015)

In biology, the subject of protein structure prediction is of continued interest, not only to chart the molecular map of the living cell, but also to design proteins of new functions. The Inverse Folding Problem (IFP) is in itself an important research problem, but also at the heart of most rational protein design approaches. In brief, the IFP consists in finding sequences that will fold into a given structure, rather than determining the structure for a given sequence, as in conventional structure prediction. In this work we present a Multi Objective Genetic Algorithm (MOGA) using the diversity-as-objective (DAO) variant of multi-objectivisation, to optimise secondary structure similarity and sequence diversity at the same time, hence pushing the search farther into wide-spread areas of the sequence solution-space. To control the high diversity generated by the DAO approach, we add a novel Quantile Constraint (QC) mechanism to discard an adjustable worst quantile of the population. This DAO-QC approach can efficiently emphasise exploitation rather than exploration to a selectable degree, achieving a trade-off producing both better and more diverse sequences than the standard Genetic Algorithm (GA). To validate the final results, a subset of the best sequences was selected for tertiary structure prediction. The super-positioning with the original protein structure demonstrated that meaningful sequences are generated, underlining the potential of this work.

Full Text
Peer Reviewed
Cooperative Selection: Improving Tournament Selection via Altruism
Jimenez Laredo, Juan Luis UL; Nielsen, Sune Steinbjorn UL; Danoy, Grégoire UL et al

in The 14th European Conference on Evolutionary Computation in Combinatorial Optimisation (2014)

This paper analyzes the dynamics of a new selection scheme based on altruistic cooperation between individuals. The scheme, which we refer to as cooperative selection, extends from tournament selection and imposes a stringent restriction on the mating chances of an individual during its lifespan: winning a tournament entails a depreciation of its fitness value. We show that altruism minimizes the loss of genetic diversity while increasing the selection frequency of the fittest individuals. An additional contribution of this paper is the formulation of a new combinatorial problem for maximizing the similarity of proteins based on their secondary structure. We conduct experiments on this problem in order to validate cooperative selection. The new selection scheme outperforms tournament selection for any setting of the parameters and is the best trade-off, maximizing genetic diversity and minimizing computational efforts.

Full Text
Peer Reviewed
COMPARTMENTS: unification and visualization of protein subcellular localization evidence.
Binder, Janos X. UL; Pletscher-Frankild, Sune; Tsafou, Kalliopi et al

in Database: the Journal of Biological Databases and Curation (2014), 2014

Information on protein subcellular localization is important to understand the cellular functions of proteins. Currently, such information is manually curated from the literature, obtained from high-throughput microscopy-based screens and predicted from primary sequence. To get a comprehensive view of the localization of a protein, it is thus necessary to consult multiple databases and prediction tools. To address this, we present the COMPARTMENTS resource, which integrates all sources listed above as well as the results of automatic text mining. The resource is automatically kept up to date with source databases, and all localization evidence is mapped onto common protein identifiers and Gene Ontology terms. We further assign confidence scores to the localization evidence to facilitate comparison of different types and sources of evidence. Finally, we visualize the unified localization evidence for a protein on a schematic cell to provide a simple overview. Database URL: http://compartments.jensenlab.org.

Full Text
Peer Reviewed
Vehicular mobility model optimization using cooperative coevolutionary genetic algorithms
Nielsen, Sune Steinbjorn UL; Danoy, Grégoire UL; Bouvry, Pascal UL

in Proceedings of the Genetic and Evolutionary Computation Conference (GECCO '13) (2013)

Full Text
Peer Reviewed
Novel efficient asynchronous cooperative co-evolutionary multi-objective algorithms
Nielsen, Sune Steinbjorn UL; Dorronsoro, Bernabé UL; Danoy, Grégoire UL et al

in Congress on Evolutionary Computation (CEC) (2012)
