data mining; Random Forests; Kraft recovery boiler
Abstract:
A data mining methodology, random forests, is applied to predict high-pressure steam production from the recovery boiler of a Kraft pulping process. Starting from a large database of raw process data, the goal is to identify the input variables that explain the most significant output variations and to predict the high-pressure steam flow.
Disciplines:
Engineering, computing & technology: Multidisciplinary, general & others
Author, co-author:
SAINLEZ, Matthieu ; University of Luxembourg > Luxembourg Centre for Systems Biomedicine (LCSB)
Heyen, Georges
Lafourcade, Sébastien
Document language:
English
Title:
Supervised learning for a Kraft recovery boiler: a data mining approach with Random Forests.
Publication date:
January 11, 2011
Event name:
ECOS 2010 - 23rd International Conference on Efficiency, Cost, Optimization, Simulation and Environmental Impact of Energy Systems
Event organizer:
EPFL
Event location:
Lausanne, Switzerland
Event date:
June 14, 2010 – June 17, 2010
Event scope:
International
Title of the main work:
ECOS 2010 Volume IV (Power plants and Industrial processes)