Paper published in a book (Scientific congresses, symposiums and conference proceedings)
Distributed C++-Python embedding for fast predictions and fast prototyping
Varisteas, Georgios; Avanesov, Tigran; State, Radu
2018. In Proceedings of the Second Workshop on Distributed Infrastructures for Deep Learning
Peer reviewed
 

Files


Full Text
didl18-final2.pdf
Author preprint (835.36 kB)




Details



Abstract :
[en] Python has evolved to become the most popular language for data science. It sports state-of-the-art libraries for analytics and machine learning, such as scikit-learn. However, Python lacks the computational performance that an industrial system requires for high-frequency, real-time predictions. Building upon a year-long research project based heavily on scikit-learn (sklearn), we faced performance issues when deploying to production. Replacing sklearn with a better-performing framework would require re-evaluating and tuning hyperparameters from scratch. Instead, we developed a Python embedding in a C++-based server application that increased performance by up to 20x, achieving linear scalability up to a point of convergence. Our implementation targets mainstream, cost-effective hardware, and we observed similar performance gains on small and large systems alike, from a laptop to an Amazon EC2 instance to a high-end server.
Disciplines :
Computer science
Author, co-author :
Varisteas, Georgios ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)
Avanesov, Tigran ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)
State, Radu ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)
External co-authors :
no
Language :
English
Title :
Distributed C++-Python embedding for fast predictions and fast prototyping
Publication date :
2018
Event name :
Second Workshop on Distributed Infrastructures for Deep Learning (DIDL) 2018
Event date :
10-12-2018
Main work title :
Proceedings of the Second Workshop on Distributed Infrastructures for Deep Learning
ISBN/EAN :
978-1-4503-6119-4
Peer reviewed :
Peer reviewed
FnR Project :
FNR11822390 - Optimal Scalability And Performance In Programmatic Advertising Platforms, 2017 (01/09/2017-31/08/2019) - Georgios Varisteas
Available on ORBilu :
since 21 December 2018

Statistics


Number of views
127 (8 by Unilu)
Number of downloads
248 (3 by Unilu)

Scopus citations®
2
Scopus citations® without self-citations
2
