A Performance Study on Operator-based Stream Processing Systems
Miyuru Dayarathna, Souhei Takeno, Toyotaro Suzumura
Department of Computer Science, Tokyo Institute of Technology, Japan
Stream Computing Systems
Insights from data in motion
◦ It is impossible to store data on disk
◦ The volume of the data is very large
Process data on-the-fly in-memory
[Figure: pipeline of four operators, OP 1 to OP 4. Labels: route keyless input events; join the serve and click events; BotFilter; compute the correct click-through rate.]
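The operator chain in the diagram can be sketched as a sequence of small in-memory stages, each consuming the output of the previous one. This is only an illustrative sketch using Python generators, not the System S or S4 API. The event fields (`type`, `serve_id`, `ad`, `user`), the function names, and the bot rule (more than one click per serve) are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical sample input: serve and click events sharing a serve_id.
SAMPLE_EVENTS = [
    {"type": "serve", "serve_id": 1, "ad": "A", "user": "u1"},
    {"type": "click", "serve_id": 1},
    {"type": "serve", "serve_id": 2, "ad": "A", "user": "u2"},
    {"type": "serve", "serve_id": 3, "ad": "B", "user": "bot"},
    {"type": "click", "serve_id": 3},
    {"type": "click", "serve_id": 3},
    {"type": "click", "serve_id": 3},
]


def route_keyless(events):
    """Attach a routing key (assumed here to be serve_id) to each raw event."""
    for ev in events:
        yield ev["serve_id"], ev


def join_serve_click(keyed_events):
    """Join serve and click events with the same key.

    Simplification: buffers the whole finite input in memory; a real
    stream engine would join over a bounded window instead.
    """
    serves = {}
    clicks = defaultdict(int)
    for key, ev in keyed_events:
        if ev["type"] == "serve":
            serves[key] = ev
        elif ev["type"] == "click":
            clicks[key] += 1
    for key, serve in serves.items():
        yield {"ad": serve["ad"], "user": serve["user"], "clicks": clicks[key]}


def bot_filter(joined, max_clicks=1):
    """Drop records that look automated (assumed rule: too many clicks)."""
    for rec in joined:
        if rec["clicks"] <= max_clicks:
            yield rec


def click_rate(filtered):
    """Compute clicked serves divided by total serves, per ad."""
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for rec in filtered:
        shown[rec["ad"]] += 1
        if rec["clicks"] > 0:
            clicked[rec["ad"]] += 1
    return {ad: clicked[ad] / shown[ad] for ad in shown}


def run_pipeline(events):
    """Chain the four stages in the order shown in the diagram."""
    return click_rate(bot_filter(join_serve_click(route_keyless(events))))


if __name__ == "__main__":
    print(run_pipeline(SAMPLE_EVENTS))
```

Each stage only sees the records handed to it by the previous stage, so stages can be replaced or parallelized independently, which is the property that operator-based systems exploit.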
Essence of our Performance Study
System S (IBM) and S4 (Yahoo)
Four benchmarks (60 application Scenarios)
Five metrics
Results - Throughput
[Figure, panels (c) and (d)]
Throughput observed for four applications on S4. x-axis: Number of Nodes; y-axis: Throughput (thousands of events/s). Series: CDR Optimized, VWAP, Twitter, Micro-benchmark, CDR.
Throughput observed for five applications on System S. x-axis: Number of Nodes; y-axis: Throughput (thousands of tuples/s). Series: CDR, VWAP, Micro-benchmark, CDR Optimized, Twitter.
Essence of our Performance Study
System S (IBM) and S4 (Yahoo)
Four benchmarks (60 application Scenarios)
Five metrics
Conclusions on Stream Processing system architectures