Apache-Spark-team, Spark 2.4.3: Structured Streaming Programming Guide. https://spark.apache.org. (Accessed 27 August 2019)
Akidau, T., Balikov, A., Bekiroğlu, K., Chernyak, S., Haberman, J., Lax, R., McVeety, S., Mills, D., Nordstrom, P., Whittle, S., MillWheel: Fault-Tolerant Stream Processing at Internet Scale. Proc. VLDB Endow. 6:11 (2013), 1033–1044.
Akidau, T., Bradshaw, R., Chambers, C., Chernyak, S., Fernández-Moctezuma, R.J., Lax, R., McVeety, S., Mills, D., Perry, F., Schmidt, E., Whittle, S., The Dataflow Model: A practical approach to balancing correctness, latency, and cost in massive-scale, unbounded, out-of-order data processing. Proc. VLDB Endow. 8 (2015), 1792–1803.
Aldinucci, M., Campa, S., Danelutto, M., Kilpatrick, P., Torquati, M., Targeting Distributed Systems in Fastflow. Euro-Par 2012: Parallel Processing Workshops, 2012, 47–56.
Aldinucci, M., Torquati, M., Meneghin, M., Fastflow: Efficient Parallel Streaming Applications on Multi-core. arXiv:0909.1187, 2009.
Alexandrov, A., Bergmann, R., Ewen, S., Freytag, J.-C., Hueske, F., Heise, A., Kao, O., Leich, M., Leser, U., Markl, V., Naumann, F., Peters, M., Rheinländer, A., Sax, M.J., Schelter, S., Höger, M., Tzoumas, K., Warneke, D., The Stratosphere Platform for Big Data Analytics. VLDB J. 23:6 (2014), 939–964.
Apache-Flink-team. Apache Flink—Stateful Computations over Data Streams. https://flink.apache.org/. (Accessed 21 August 2020)
Arasu, A., Babcock, B., Babu, S., Datar, M., Ito, K., Nishizawa, I., Rosenstein, J., Widom, J., STREAM: The Stanford Stream Data Manager (Demonstration Description). SIGMOD, 2003, 665.
Barbieri, D.F., Braga, D., Ceri, S., Valle, E.D., Grossniklaus, M., Querying RDF streams with C-SPARQL. SIGMOD Rec. 39:1 (2010), 20–26, 10.1145/1860702.1860705.
Bingmann, T., Axtmann, M., Jöbstl, E., Lamm, S., Nguyen, H.C., Noe, A., Schlag, S., Stumpp, M., Sturm, T., Sanders, P., Thrill: High-Performance Algorithmic Distributed Batch Data Processing with C++. BigData, 2016, 172–183.
Bland, W., Bouteiller, A., Herault, T., Bosilca, G., Dongarra, J., Post-Failure Recovery of MPI Communication Capability: Design and Rationale. Int. J. High Perform. Comput. Appl. 27:3 (2013), 244–254, 10.1177/1094342013488238.
Bouteiller, A., Herault, T., Bosilca, G., Dongarra, J.J., Correlated Set Coordination in Fault Tolerant Message Logging Protocols. Jeannot, E., Namyst, R., Roman, J., (eds.) Euro-Par 2011 Parallel Processing, 2011, Springer Berlin Heidelberg, Berlin, Heidelberg, 51–64.
Buddhika, T., Pallickara, S., Neptune: Real-time Stream Processing for Internet of Things and Sensing Environments. IPDPS, 2016, 1143–1152.
Cangialosi, F.J., Ahmad, Y., Balazinska, M., Cetintemel, U., Cherniack, M., Hwang, J.-H., Lindner, W., Maskey, A.S., Rasin, A., Ryvkina, E., Tatbul, N., Xing, Y., Zdonik, S., The Design of the Borealis Stream Processing Engine. CIDR, 2005.
Carbone, P., Katsifodimos, A., Ewen, S., Markl, V., Haridi, S., Tzoumas, K., Apache Flink™: Stream and Batch Processing in a Single Engine. IEEE Data Eng. Bull. 38:4 (2015), 28–38.
Chandramouli, B., Goldstein, J., Barnett, M., DeLine, R., Platt, J.C., Terwilliger, J.F., Wernsing, J., Trill: A High-Performance Incremental Query Processor for Diverse Analytics. Proc. VLDB Endow. 8:4 (2014), 401–412.
Chandy, K.M., Lamport, L., Distributed Snapshots: Determining Global States of Distributed Systems. ACM Trans. Comput. Syst. 3:1 (1985), 63–75, 10.1145/214451.214456.
Chintapalli, S., Dagit, D., Evans, B., Farivar, R., Graves, T., Holderbaugh, M., Liu, Z., Nusbaum, K., Patil, K., Peng, B.J., Poulosky, P., Benchmarking Streaming Computation Engines: Storm, Flink and Spark Streaming. IPDPSW, 2016, 1789–1792.
Coti, C., Herault, T., Lemarinier, P., Pilard, L., Rezmerita, A., Rodriguez, E., Cappello, F., Blocking vs. Non-blocking Coordinated Checkpointing for Large-scale Fault Tolerant MPI. SC'06, 2006, Association for Computing Machinery, New York, NY, USA, 127-es, 10.1145/1188455.1188587.
Databricks, Running the Yahoo benchmark on Databricks. https://databricks.github.io/benchmarks/structured-streaming-yahoo-benchmark/index.html. (Accessed 22 August 2019)
Davies, T., Karlsson, C., Liu, H., Ding, C., Chen, Z., High Performance Linpack Benchmark: A Fault Tolerant Implementation without Checkpointing. Proceedings of the International Conference on Supercomputing, ICS ’11, 2011, Association for Computing Machinery, New York, NY, USA, 162–171, 10.1145/1995896.1995923.
del Rio Astorga, D., Dolz, M.F., Fernández, J., García, J.D., A Generic Parallel Pattern Interface for Stream and Data Processing. Concurrency and Computation: Practice and Experience 29:24 (2017), e4175.
Du, P., Bouteiller, A., Bosilca, G., Herault, T., Dongarra, J., Algorithm-based Fault Tolerance for Dense Matrix Factorizations. Proceedings of the 17th ACM SIGPLAN, PPoPP ’12, 2012, Association for Computing Machinery, New York, NY, USA, 225–234, 10.1145/2145816.2145845.
Ferreira, K., Stearley, J., Laros, J.H., Oldfield, R., Pedretti, K., Brightwell, R., Riesen, R., Bridges, P.G., Arnold, D., Evaluating the Viability of Process Replication Reliability for Exascale Systems. SC '11: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, 2011, 1–12, 10.1145/2063384.2063443.
Gedik, B., Andrade, H., Wu, K.-L., Yu, P.S., Doo, M., SPADE: The System S Declarative Stream Processing Engine. SIGMOD, 2008, 1123–1134.
Guermouche, A., Ropars, T., Snir, M., Cappello, F., HydEE: Failure Containment without Event Logging for Large Scale Send-Deterministic MPI Applications. 2012 IEEE 26th International Parallel and Distributed Processing Symposium, 2012, 1216–1227, 10.1109/IPDPS.2012.111.
Hursey, J., Squyres, J.M., Mattox, T., Lumsdaine, A., The Design and Implementation of Checkpoint/Restart Process Fault Tolerance for Open MPI. 2007 IEEE International Parallel and Distributed Processing Symposium, 2007, 1–8.
Imai, S., Patterson, S., Varela, C.A., Maximum Sustainable Throughput Prediction for Data Stream Processing Over Public Clouds. ACM, CCGrid '17, 2017, 504–513.
Karimov, J., Rabl, T., Katsifodimos, A., Samarev, R., Heiskanen, H., Markl, V., Benchmarking Distributed Stream Data Processing Systems. ICDE, 2018, 1507–1518.
Koliousis, A., Weidlich, M., Fernandez, R.C., Costa, P., Wolf, A.L., Pietzuch, P., SABER: Window-based Hybrid Stream Processing for Heterogeneous Architectures. ACM SIGMOD 2016, 2016.
Kreps, J., Questioning the Lambda Architecture. https://www.oreilly.com/radar/questioning-the-lambda-architecture/. (Accessed 17 October 2019)
Kulkarni, S., Bhagat, N., Fu, M., Kedigehalli, V., Kellogg, C., Mittal, S., Patel, J.M., Ramasamy, K., Taneja, S., Twitter Heron: Stream Processing at Scale. SIGMOD 2015, 2015, 239–250.
Lin, W., Fan, H., Qian, Z., Xu, J., Yang, S., Zhou, J., Zhou, L., StreamScope: Continuous Reliable Distributed Processing of Big Data Streams. NSDI, 2016, 439–453.
McSherry, F., Murray, D.G., Isaacs, R., Isard, M., Differential Dataflow. CIDR, 2013.
Miao, H., Park, H., Jeon, M., Pekhimenko, G., McKinley, K.S., Lin, F.X., StreamBox: Modern Stream Processing on a Multicore Machine. 2017, USENIX, 617–629.
Misale, C., Drocco, M., Tremblay, G., Martinelli, A.R., Aldinucci, M., PiCo: High-performance data analytics pipelines in modern C++. Future Gener. Comput. Syst. 87 (2018), 392–403.
Murray, D.G., McSherry, F., Isaacs, R., Isard, M., Barham, P., Abadi, M., Naiad: A Timely Dataflow System. SOSP, 2013, 439–455.
Narkhede, N., Shapira, G., Palino, T., Kafka: The Definitive Guide: Real-Time Data and Stream Processing at Scale. 2017, O'Reilly Media, Inc.
Noghabi, S.A., Paramasivam, K., Pan, Y., Ramesh, N., Bringhurst, J., Gupta, I., Campbell, R.H., Samza: Stateful Scalable Stream Processing at LinkedIn. Proc. VLDB Endow. 10:12 (2017), 1634–1645.
Phuoc, D.L., Dao-Tran, M., Parreira, J.X., Hauswirth, M., A Native and Adaptive Approach for Unified Processing of Linked Streams and Linked Data. Aroyo, L., Welty, C., Alani, H., Taylor, J., Bernstein, A., Kagal, L., Noy, N.F., Blomqvist, E., (eds.) The Semantic Web - ISWC 2011, Bonn, Germany, October 23-27, 2011, Proceedings, Part I, Lect. Notes Comput. Sci., vol. 7031, October 2011, Springer, Bonn, Germany, 370–388, 10.1007/978-3-642-25073-6_24.
Slurm-team. SLURM Workload Manager. https://slurm.schedmd.com/documentation.html. (Accessed 14 March 2020)
Thies, W., Karczmarek, M., Amarasinghe, S.P., StreamIt: A Language for Streaming Applications. CC, 2002, 179–196.
Toshniwal, A., Taneja, S., Shukla, A., Ramasamy, K., Patel, J.M., Kulkarni, S., Jackson, J., Gade, K., Fu, M., Donham, J., Bhagat, N., Mittal, S., Ryaboy, D., Storm@twitter. SIGMOD, 2014, 147–156.
Tosi, M.D.L., Venugopal, V.E., Theobald, M., Convergence Time Analysis of Asynchronous Distributed Artificial Neural Networks. 9th ACM IKDD CODS and 27th COMAD, 2022, 314–315, 10.1145/3493700.3493758.
Tucker, P.A., Maier, D., Sheard, T., Applying Punctuation Schemes to Queries over Continuous Data Streams. IEEE Data Eng. Bull. 26 (2003), 33–40.
Tucker, P.A., Maier, D., Sheard, T., Fegaras, L., Exploiting Punctuation Semantics in Continuous Data Streams. IEEE Trans. Knowl. Data Eng. 15:3 (2003), 555–568.
Venkataraman, S., Panda, A., Ousterhout, K., Armbrust, M., Ghodsi, A., Franklin, M.J., Recht, B., Stoica, I., Drizzle: Fast and Adaptable Stream Processing at Scale. SOSP 2017, 2017, 374–389.
Venugopal, V.E., Theobald, M., Benchmarking Synchronous and Asynchronous Stream Processing Systems. 7th ACM IKDD CoDS and 25th COMAD, Hyderabad India, January 5-7, 2020, 2020, 322–323, 10.1145/3371158.3371206.
Venugopal, V.E., Theobald, M., Effective Stream Data Processing using Asynchronous Iterative Routing Protocol. IEEE International Conference on Big Data, Big Data 2020, Atlanta, GA, USA, December 10–13, 2020, 2020, IEEE, 5846–5848, 10.1109/BigData50022.2020.9377752.
Venugopal, V.E., Theobald, M., Chaychi, S., Tawakuli, A., AIR: A Light-Weight yet High-performance Dataflow Engine based on Asynchronous Iterative Routing. SBAC-PAD 2020, September 9-11, 2020, 2020, IEEE, Porto, Portugal, 51–58, 10.1109/SBAC-PAD49847.2020.00018.
Venugopal, V.E., Theobald, M., Chaychi, S., Tawakuli, A., AIR: A Light-Weight yet High-Performance Dataflow Engine based on Asynchronous Iterative Routing. arXiv:2001.00164, 2020.
White, T., Hadoop: The Definitive Guide. 1st edition, 2009, O'Reilly Media, Inc.
Yavuz, B., Blog post: Arbitrary Stateful Processing in Apache Spark's Structured Streaming. https://databricks.com/blog/2017/10/17/arbitrary-stateful-processing-in-apache-sparks-structured-streaming.html, 2017. (Accessed 22 August 2019)
Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauley, M., Franklin, M.J., Shenker, S., Stoica, I., Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-memory Cluster Computing. NSDI, 2012, 2–15.
Zeuch, S., Breß, S., Rabl, T., Monte, B.D., Karimov, J., Lutz, C., Renz, M., Traub, J., Markl, V., Analyzing Efficient Stream Processing on Modern Hardware. Proc. VLDB Endow. 12:5 (2019), 516–530.