data mining; Random Forests; Kraft recovery boiler
Abstract :
A data mining methodology, random forests, is applied to predict high-pressure steam production from the recovery boiler of a Kraft pulping process. Starting from a large database of raw process data, the goal is to identify the input variables that explain the most significant output variations and to predict the high-pressure steam flow.
Disciplines :
Engineering, computing & technology: Multidisciplinary, general & others
Author, co-author :
Sainlez, Matthieu ; University of Luxembourg > Luxembourg Centre for Systems Biomedicine (LCSB)
Heyen, Georges
Lafourcade, Sébastien
Language :
English
Title :
Supervised learning for a Kraft recovery boiler: a data mining approach with Random Forests.
Publication date :
11 January 2011
Event name :
ECOS 2010 - 23rd International Conference on Efficiency, Cost, Optimization, Simulation and Environmental Impact of Energy Systems
Event organizer :
EPFL
Event place :
Lausanne, Switzerland
Event date :
June 14, 2010 – June 17, 2010
Audience :
International
Main work title :
ECOS 2010 Volume IV (Power plants and Industrial processes)