
Publication (paper), Suzumura Laboratory, Large-Scale Data Processing and Stream Computing: VCTL Perf IISWC2011


Academic year: 2018


A Performance Study on Operator-based Stream Processing Systems

Miyuru Dayarathna, Souhei Takeno, Toyotaro Suzumura

Department of Computer Science, Tokyo Institute of Technology, Japan

IBM Research - Tokyo, Japan

[email protected], [email protected], [email protected]

Abstract—This short paper compares and contrasts the performance characteristics of System S and S4, two stream processing systems that use an operator-based programming model. Our aim is to investigate and characterize which architecture is better suited to which type of stream processing workload, and to identify the reasons for these characteristics.

Stream processing [1] has emerged as an exciting new field to support online information processing activities. Operator-based stream programming models are currently attracting growing attention. We conducted this study on System S [2] and S4 [3], which are currently two of the most prominent systems developed following this model.

The contribution of this work is the identification and characterization of the performance and trade-offs of the two stream processing systems with respect to their underlying architectural design principles. These results will help system designers and developers identify which type of stream programming model suits their needs.

We use three different stream programs (Application-Specific Benchmarks), which serve different purposes: Call Detail Record (CDR) processing, Volume Weighted Average Price (VWAP) calculation, and Twitter trend detection. We then use a micro-benchmark to obtain a basic characterization of the two systems’ performance.
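To make the VWAP workload concrete, the following is a minimal sketch of the standard volume weighted average price computation (the sum of price times volume divided by the total volume), maintained incrementally as trades arrive. The names `VwapState` and `vwap` are hypothetical and chosen for illustration only; this is not the benchmark code used in the study.

```python
from dataclasses import dataclass


@dataclass
class VwapState:
    """Running sums needed to compute VWAP incrementally (hypothetical helper)."""
    price_volume_sum: float = 0.0
    volume_sum: int = 0

    def update(self, price, volume):
        """Fold one trade into the state and return the current VWAP.

        Returns None while no volume has been seen, to avoid dividing by zero.
        """
        self.price_volume_sum += price * volume
        self.volume_sum += volume
        if self.volume_sum == 0:
            return None
        return self.price_volume_sum / self.volume_sum


def vwap(trades):
    """Compute VWAP over an iterable of (price, volume) pairs."""
    state = VwapState()
    result = None
    for price, volume in trades:
        result = state.update(price, volume)
    return result


# Two trades: 100 units at 10.0 and 300 units at 20.0.
print(vwap([(10.0, 100), (20.0, 300)]))  # 17.5
```

Keeping only two running sums per stock symbol means each incoming trade is processed in constant time, which is why VWAP is a natural fit for a per-key streaming operator.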

[Figure 1, panels (a) and (b). Panel titles: "Throughput of five applications on S4" (y-axis: Throughput (Events/s), thousands) and "Throughput of five applications on System S" (y-axis: Throughput (Tuples/s), thousands). x-axis in both panels: Number of Nodes. Series: CDR, CDR Optimized, VWAP, Twitter, Micro-benchmark.]

Fig. 1: Throughput comparison of sample applications on System S and S4.

The results we obtained by running ten different applications (50 application runs in total, considering the 1, 2, 4, 8, and 12 node scenarios) on System S and S4 gave us sufficient insight into what kind of processing happens in both systems. It became clear from the throughput comparisons shown in Figure 1 that these two stream programming models must be used carefully to maximize throughput. For example, a SPADE program written with a single source operator might not scale well across different hardware configurations. Likewise, an S4 application that generates huge numbers of processing elements (PEs) for incoming events cannot scale well. However, introducing multiple source operators resulted in a 3.2 times speedup for the CDR application on System S, and reducing the number of Aggregator PEs in S4 from 0.1 million to 100 resulted in a 1.34 times speedup, which indicates possible avenues for performance improvement.
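The idea of replacing a single source operator with several can be illustrated with a small sketch that splits one input into shards, one per source. This is plain Python rather than SPADE, and the function name `shard_round_robin` is hypothetical; the text does not state how the input was divided among the source operators in the study, so round-robin assignment is an assumption.

```python
def shard_round_robin(records, num_sources):
    """Split records into num_sources lists, one per hypothetical source operator.

    Round-robin assignment is an assumption made for illustration.
    """
    if num_sources < 1:
        raise ValueError("num_sources must be at least 1")
    shards = [[] for _ in range(num_sources)]
    for index, record in enumerate(records):
        shards[index % num_sources].append(record)
    return shards


# Seven records spread over three sources.
print(shard_round_robin(range(7), 3))  # [[0, 3, 6], [1, 4], [2, 5]]
```

Each shard can then be read by its own source, so ingestion is no longer serialized through one operator.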

While Java-based stream processing system architectures are gaining considerable attention due to their portability, system designers and stream programmers have to think carefully before choosing a solution. For example, a job with a low input event rate can easily be processed using S4 with a small number of PEs (e.g., the Twitter application run on S4). However, a large-scale application of high commercial importance, such as VWAP or CDR, might produce millions of PEs, since S4 dynamically generates PEs for the new data events it receives.
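The growth in PE count can be modeled with a small sketch in which one PE is created the first time each key value is seen. This is plain Python, not the S4 API, and the names `CountingPE`, `run`, and `bucketed` are hypothetical. The text does not say how the number of Aggregator PEs was reduced to 100; hashing keys into a fixed number of buckets is one plausible approach, shown here as an assumption.

```python
import zlib


class CountingPE:
    """Stand-in for a processing element that counts the events routed to it."""

    def __init__(self):
        self.count = 0

    def process(self, event):
        self.count += 1


def run(events, key_of):
    """Route each event to the PE for its key, creating a PE on first sight of a key."""
    pes = {}
    for event in events:
        key = key_of(event)
        if key not in pes:
            pes[key] = CountingPE()
        pes[key].process(event)
    return pes


def bucketed(key_of, num_buckets):
    """Wrap a key function so that keys are hashed into a fixed number of buckets."""
    def bucket_key(event):
        raw = str(key_of(event)).encode("utf-8")
        return zlib.crc32(raw) % num_buckets
    return bucket_key


# 100,000 events with distinct (made-up) caller IDs.
events = [{"caller": f"user-{i}"} for i in range(100_000)]
per_key = run(events, lambda e: e["caller"])
per_bucket = run(events, bucketed(lambda e: e["caller"], 100))
print(len(per_key))     # 100000, one PE per distinct key
print(len(per_bucket))  # at most 100
```

With bucketing, each PE has to keep state for several keys internally, so the saving in PE count is traded against more work inside each PE.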

By analyzing the throughput and profiling results, we observed that properly designed stream applications achieve high throughput. Another conclusion is that the choice of a stream processing system needs to be made carefully, considering factors such as performance, platform independence, and the size of the jobs. Furthermore, we recognized the key role played by the operating system kernel in a stream processing system’s performance. Although we observed heavy use of network bandwidth by S4, this could be improved by using optimized protocols and techniques such as Java New I/O and InfiniBand Remote Direct Memory Access.

Creating a stream processing system architecture that scales in terms of the number of PEs is future work inspired by this study. We also hope to extend this work to a micro-level performance study of S4, especially to identify which components and code segments are the most resource intensive. Moreover, we hope to implement the Linear Road Benchmark on System S and S4 and to observe and characterize the systems’ performance.

REFERENCES

[1] D. Turaga, H. Andrade, B. Gedik, C. Venkatramani, O. Verscheure, J. D. Harris, J. Cox, W. Szewczyk, and P. Jones, “Design principles for developing stream processing applications,” Software: Practice and Experience, Aug 2010.

[2] B. Gedik, H. Andrade, K.-L. Wu, P. S. Yu, and M. Doo, “SPADE: the System S declarative stream processing engine,” in SIGMOD ’08. New York, NY, USA: ACM, 2008, pp. 1123–1134.

[3] L. Neumeyer, B. Robbins, A. Nair, and A. Kesari, “S4: Distributed stream computing platform,” in KDCloud 2010, December 2010.
