data analytics; graph databases; large-scale graphs; time-evolving graphs
Résumé :
[en] Modern analytics solutions succeed to understand and predict phenomenons in a large diversity of software systems, from social networks to Internet-of-Things platforms. This success challenges analytics algorithms to deal with more and more complex data, which can be structured as graphs and evolve over time. However, the underlying data storage systems that support large-scale data analytics, such as time-series or graph databases, fail to accommodate both dimensions, which limits the integration of more advanced analysis taking into account the history of complex graphs, for example. This paper therefore introduces a formal and practical definition of temporal graphs. Temporal graphs pro- vide a compact representation of time-evolving graphs that can be used to analyze complex data in motion. In particular, we demonstrate with our open-source implementation, named GREYCAT, that the performance of temporal graphs allows analytics solutions to deal with rapidly evolving large-scale graphs.
Disciplines :
Sciences informatiques
Auteur, co-auteur :
HARTMANN, Thomas ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)
FOUQUET, François ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)
JIMENEZ, Matthieu ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)
Rouvoy, Romain; University of Lille / Inria / IUF
LE TRAON, Yves ; University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC)
Co-auteurs externes :
yes
Langue du document :
Anglais
Titre :
Analyzing Complex Data in Motion at Scale with Temporal Graphs
Date de publication/diffusion :
juillet 2017
Nom de la manifestation :
29th International Conference on Software Engineering and Knowledge Engineering
Lieu de la manifestation :
Pittsburgh, Etats-Unis
Date de la manifestation :
05-07-2017 to-07-07-2017
Titre de l'ouvrage principal :
Proceedings of the 29th International Conference on Software Engineering and Knowledge Engineering
G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski, "Pregel: A system for large-scale graph processing, " in Proc. of the 2010 ACM SIGMOD International Conference on Management of Data, ser. SIGMOD'10, 2010.
Y. Low, D. Bickson, J. Gonzalez, C. Guestrin, A. Kyrola, and J. M. Hellerstein, "Distributed graphlab: A framework for macHine learning and data mining in the cloud, " Proc. VLDB Endow., vol. 5, no. 8, Apr. 2012.
J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin, "Powergraph: Distributed graph-parallel computation on natural graphs, " in Proc. of the 10th USENIX Conference on Operating Systems Design and Implementation, ser. OSDI'12, 2012.
J. Leskovec, J. Kleinberg, and C. Faloutsos, "Graphs over time: Densification laws, shrinking diameters and possible explanations, " in Proc. of the 11th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, ser. KDD'05, 2005.
T. Hartmann, F. Fouquet, G. Nain, B. Morin, J. Klein, and Y. L. Traon, "Model-based time-distorted contexts for efficient temporal reasoning, " in Proc of the 26th International Conference on Software Engineering and Knowledge Engineering, 2014.
A. P. Iyer, L. E. Li, T. Das, and I. Stoica, "Time-evolving graph processing at scale, " in Proc. of the 4th International Workshop on Graph Data Management Experiences and Systems, ser. GRADES'16, 2016.
W. Han, Y. Miao, K. Li, M. Wu, F. Yang, L. Zhou, V. Prabhakaran, W. Chen, and E. Chen, "Chronos: A graph engine for temporal graph analysis, " in Proc. of the 9th European Conference on Computer Systems, ser. EuroSys'14, 2014.
Y. Miao, W. Han, K. Li, M. Wu, F. Yang, L. Zhou, V. Prabhakaran, E. Chen, and W. Chen, "Immortalgraph: A system for storage and analysis of temporal graphs, " Trans. Storage, Jul. 2015.
J. Lin, E. Keogh, S. Lonardi, and B. Chiu, "A symbolic representation of time series, with implications for streaming algorithms, " in Proc. of the 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, ser. DMKD'03, 2003.
J. J. Miller, "Graph database applications and concepts with neo4j, " in Proc. of the Southern Association for Information Systems Conference, vol. 2324, 2013.
B. Shao, H. Wang, and Y. Li, "Trinity: A distributed graph engine on a memory cloud, " in Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, ser. SIGMOD '13. New York, NY, USA: ACM, 2013, pp. 505-516.
Y.-M. Wang, "Consistent global checkpoints that contain a given set of local checkpoints, " IEEE Transactions on Computers, vol. 46, no. 4, pp. 456-468, Apr 1997.
R. Elmasri, Y.-J. Kim, and G. T. Wuu, "Efficient implementation techniques for the time index, " in Proc. 7th International Conference on Data Engineering. IEEE, 1991.
L. J. Guibas and R. Sedgewick, "A dichromatic framework for balanced trees, " in Foundations of Computer Science, 1978., 19th Annual Symposium on. IEEE, 1978.
T. Hartmann, F. Fouquet, J. Klein, Y. L. Traon, A. Pelov, L. Toutain, and T. Ropitault, "Generating realistic Smart Grid communication topologies based on real-data, " in 2014 IEEE International Conference on Smart Grid Communications (SmartGridComm).
"influxdb: Time-Series Data Storage, " https://influxdata.com/time-series-platform/influxdb.
E. Keogh, S. Lonardi, and B. Y.-c. Chiu, "Finding surprising patterns in a time series database in linear time and space, " in Proc. of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ser. KDD '02. New York, NY, USA: ACM, 2002, pp. 550-556.
C. Faloutsos, M. Ranganathan, and Y. Manolopoulos, "Fast subsequence matching in time-series databases, " in Proc. of the 1994 ACM SIGMOD International Conference on Management of Data, ser. SIGMOD '94. New York, NY, USA: ACM, 1994.
"Use Checkpoints for Efficient Snapshots, " http://rocksdb.org/blog/2015/11/10/use-checkpoints-for-efficient-snapshots.html.
J. Clifford and D. S. Warren, "Formal semantics for time in databases, " ACM Trans. Database Syst., vol. 8, no. 2, Jun. 1983.
E. Rose and A. Segev, "Tooa: A temporal object-oriented algebra, " in Proc. of the 7th European Conference on Object-Oriented Programming, ser. ECOOP'93, 1993.
K. Kulkarni and J.-E. Michels, "Temporal features in sql:2011, " SIGMOD Rec., vol. 41, no. 3, Oct. 2012.
B. Motik, "Representing and querying validity time in rdf and owl: A logic-based approach, " Web Semant., vol. 12-13, Apr. 2012.
"OpenTSDB: The Scalable Time Series DB, " http://opentsdb.net.
U. Khurana and A. Deshpande, "Storing and analyzing historical graph data at scale, " CoRR, vol. abs/1509.08960, 2015.
A. G. Labouseur, J. Birnbaum, P. W. Olsen Jr., S. R. Spillane, J. Vijayan, J.-H. Hwang, and W.-S. Han, "The g∗ graph database: Efficiently managing large distributed dynamic graphs, " Distrib. Parallel Databases, vol. 33, no. 4, Dec. 2015.
R. Cheng, J. Hong, A. Kyrola, Y. Miao, X. Weng, M. Wu, F. Yang, L. Zhou, F. Zhao, and E. Chen, "Kineograph: Taking the pulse of a fast-changing and connected world, " in Proc. of the 7th ACM European Conference on Computer Systems, ser. EuroSys'12, 2012.