Evaluating Application Error Under Specific Fault Patterns Using a Virtual Machine Emulator
Vol.2016-HPC-155 No.10 2016/8/8. IPSJ SIG Technical Report. © 2016 Information Processing Society of Japan.

[The Japanese body text of this page did not survive extraction. Only the section numbering, citations, and English terms are recoverable.]

Opening section [text lost]. Surviving terms: SDC (silent data corruption), DRAM, QEMU, NAS Parallel Benchmark, CG. Surviving citations: [6], [7], [2, 3, 11, 12].

2. DRAM [title truncated]

2.1 DRAM [title truncated]. Surviving citation: [14].

Figure 1: DRAM [caption truncated]

2.2 DRAM [title truncated]. Surviving terms: SDC.

2.2.1 [title lost]. Refers to the Functional Memory Fault Model [4].

2.2.2 Retention Fault

2.2.3 Row-Hammer. Cites Kim et al. [10] (2014) on Row-Hammer.
Figure 2: [caption lost]

Figure 3: QEMU [caption truncated]. Surviving labels: Guest VM, App, QEMU, RAM, scenario, Host, mmap(), CPU, NIC, tmpfs.

3. [title and text lost]. Surviving terms: OS, Retention Fault.

4. [title and text lost]. Surviving terms: QEMU, 2.3.1, Full System Emulation, x86_64, Row-Hammer.
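The labels that survive from Figure 3 suggest that the guest RAM is backed by a file on tmpfs which the host reaches through mmap(). A minimal Python sketch of that idea follows, run against a scratch file rather than a real QEMU backing file. The function names, the 1 -> 0 direction for retention-style faults, and the uniform random choice of bit positions are assumptions made for illustration; the surviving text does not confirm them.

```python
import mmap
import os
import random
import tempfile


def inject_retention_faults(path, n_bits, seed=0):
    """Clear up to n_bits randomly chosen bits (1 -> 0 only) in the file.

    Assumption: a retention-style fault only turns a stored 1 into a 0.
    Returns the number of bits that actually changed.
    """
    rng = random.Random(seed)
    changed = 0
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as mem:
        for _ in range(n_bits):
            bit = rng.randrange(len(mem) * 8)
            byte, mask = bit // 8, 1 << (bit % 8)
            if mem[byte] & mask:             # bit is currently 1
                mem[byte] &= ~mask & 0xFF    # clear it
                changed += 1
        mem.flush()
    return changed


def flip_single_bit(path, bit_index):
    """Invert exactly one bit of the file, whatever its current value."""
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as mem:
        mem[bit_index // 8] ^= 1 << (bit_index % 8)
        mem.flush()


def make_backing_file(size, fill=0xFF):
    """Create a scratch file standing in for the guest RAM file (hypothetical helper)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(bytes([fill]) * size)
    return path


if __name__ == "__main__":
    path = make_backing_file(4096)
    print("bits cleared:", inject_retention_faults(path, 512))
    os.remove(path)
```

Because positions are drawn with replacement, the number of bits that change can be smaller than the number requested: a position drawn twice is already 0 the second time.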
Algorithm CG(A)
    b = {1, 1, ..., 1};
    DO iter = 1, 75                  // Power Method loop
        Solve Ax = b with CG method;
        ζ_iter = 1 / (b * x);
        b = normalize(x);
    ENDDO
    return ζ_iter;
End-Algorithm

Figure 4: NPB CG [caption truncated]

Table 1: [title lost] (%)
    2       4       8
    44.2    25.4    13.8

Table 2: [title lost]. Surviving entries: 512MB; OS: Scientific Linux 7.1; VT-d.

Section 4, continued [text lost]. Surviving terms: class B, CG, QEMU, -mem-path. A numbered list (1) to (5) survives only as fragments, in order: (1) -mem-path $map_file; (2) -mem-path; (3) file_ram_alloc(), mmap(); (4) Retention Faults; (5) 512.

5. [title lost]

5.1 [title lost]. Surviving terms: NAS Parallel Benchmarks (NPB) [1] 3.3.1, MPI, CG, class B, [15], [8], Figure 4, 75, 25. Convergence criterion fragment: $|Ax - b| < 10^{-10}$, with A, b, and x named individually, and the number 300.

5.2 [title lost]. Surviving terms: Retention Fault, Single Bit Faults, SDC, 512, 30. A second numbered list survives only as fragments: (1) CG; (2) OS, SSH, “ssh $node_name hostname”, 60.
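The loop in Figure 4 can be sketched in Python as follows. This is a dense, pure-Python illustration for small symmetric positive definite matrices, not the NPB implementation. The helper names are hypothetical, and the stopping rule (residual norm below 1e-10, at most 300 inner iterations) is an assumption read from the surviving fragments of the text.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def cg_solve(A, b, tol=1e-10, max_iter=300):
    """Solve Ax = b by conjugate gradients for symmetric positive definite A."""
    x = [0.0] * len(b)
    r = list(b)              # residual b - Ax for the initial guess x = 0
    p = list(r)              # first search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if math.sqrt(rs) < tol:      # assumed criterion: |Ax - b| < 1e-10
            break
        Ap = matvec(A, p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


def cg_power_method(A, n_outer=75):
    """Outer loop of Figure 4: repeated solves with a normalized right-hand side."""
    b = [1.0] * len(A)
    zeta = float("nan")
    for _ in range(n_outer):
        x = cg_solve(A, b)
        zeta = 1.0 / dot(b, x)
        norm = math.sqrt(dot(x, x))
        b = [xi / norm for xi in x]
    return zeta


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    print(cg_power_method(A))
```

Once b has been normalized, 1 / (b * x) approximates the smallest eigenvalue of A, so the returned value is a single scalar that a corrupted matrix or vector entry can shift. That makes it a convenient quantity to compare between a fault-free run and a run with injected faults.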
Figure 5: Retention Faults [caption truncated], panels (a), (b), (c), each labelled with a CG variant [labels truncated]. Horizontal axis: error (-50% to 200%). Vertical axis: ratio of error (%) (0 to 100). Series: CG.B.2, CG.B.4, CG.B.8.

Figure 6: Retention Faults [caption truncated], panels (a), (b), (c), each labelled with a CG variant [labels truncated]. Same axes and series as Figure 5.

Table 3: [title lost] (sec)
                          2     4     8
    [label truncated]     70    67    50
    CG [label truncated]  30    27    21
    CG [label truncated]  8     8     8

Section 5.2, continued [text lost]. Surviving terms: NaN, Inf, SDC, 160%, 5%, 100, NPB, CG, Retention Faults, Single Bit Faults, 2, 4, 8, Figure 6, Figure 7. Item (3) of the second numbered list survives only as: NPB, CG.

5.3 [title lost]. Surviving terms: OS, Retention Faults.
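The figures plot, for each configuration, what share of runs falls at each value of "error", and the surviving text also mentions NaN and Inf outcomes. A minimal Python sketch of such a tally follows. The definition of error as a signed relative deviation from a fault-free reference value, the 50-point bucket width, and all function names are assumptions for illustration; the surviving text does not state how the error was computed.

```python
import math
from collections import Counter


def relative_error_percent(value, reference):
    """Signed deviation from the reference, in percent (assumed definition)."""
    return (value - reference) / reference * 100.0


def bucket(value, reference, width=50.0):
    """Label one run: 'NaN', 'Inf', or the error interval it falls into."""
    if math.isnan(value):
        return "NaN"
    if math.isinf(value):
        return "Inf"
    err = relative_error_percent(value, reference)
    lo = math.floor(err / width) * width
    return f"[{lo:.0f}%, {lo + width:.0f}%)"


def error_distribution(values, reference, width=50.0):
    """Share of runs (in percent) per bucket."""
    counts = Counter(bucket(v, reference, width) for v in values)
    return {label: 100.0 * n / len(values) for label, n in counts.items()}


if __name__ == "__main__":
    runs = [10.0, 10.0, 26.0, float("nan")]   # made-up values, not measurements
    print(error_distribution(runs, reference=10.0))
```

Keeping NaN and Inf as separate buckets avoids silently dropping runs whose result is not a finite number.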
Figure 7: [caption truncated], panels (a) Retention Faults and (b) Single Bit Faults. Horizontal axis: error (-50% to 200%). Vertical axis: ratio of error (%) (0 to 100). Series: CG.B.2, CG.B.4, CG.B.8.

Sections 6 and 7 [titles and text lost]. Surviving terms and citations: Retention Faults; Bronevetsky et al. [3] (NPB, SDC); ABFT; CG; Charng-da Lu et al. [11] (MPI, SDC); ECC; Yixin Luo et al. [12]; Adamu-Fika et al. [2].

Acknowledgments [text lost]. Surviving identifiers: (S)23220003; EBD.

References

[1] NAS Parallel Benchmarks. https://www.nas.nasa.gov/publications/npb.html, March 1994.
[2] Fatimah Adamu-Fika and Arshad Jhumka. An Investigation of the Impact of Double Bit-Flip Error Variants on Program Execution, pages 799–813. Springer International Publishing, Cham, 2015.
[3] Greg Bronevetsky and Bronis de Supinski. Soft error vulnerability of iterative linear algebra methods. In Proceedings of the 22nd Annual International Conference on Supercomputing, ICS '08, pages 155–164, New York, NY, USA, 2008. ACM.
[4] Michael Bushnell. Essentials of electronic testing for digital, memory and mixed-signal VLSI circuits. Kluwer Academic, New York, 2002.
[5] Franck Cappello, Al Geist, William Gropp, Sanjay Kale, Bill Kramer, and Marc Snir. Toward exascale resilience: 2014 update. Supercomputing Frontiers and Innovations, 1(1), 2014.
[6] Kurt Ferreira, Jon Stearley, James H. Laros III, Ron Oldfield, Kevin Pedretti, Ron Brightwell, Rolf Riesen, Patrick G. Bridges, and Dorian Arnold. Evaluating the viability of process replication reliability for exascale systems. In Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, SC '11, pages 44:1–44:12, New York, NY, USA, 2011. ACM.
[7] C. George and S. Vadhiyar. Fault tolerance on large scale systems using adaptive process replication. IEEE Transactions on Computers, 64(8):2213–2225, Aug 2015.
[8] Magnus Rudolph Hestenes and Eduard Stiefel. Methods of conjugate gradients for solving linear systems, volume 49. NBS, 1952.
[9] Kuang-Hua Huang and J. A. Abraham. Algorithm-based fault tolerance for matrix operations. IEEE Transactions on Computers, C-33(6):518–528, June 1984.
[10] Yoongu Kim, Ross Daly, Jeremie Kim, Chris Fallin, Ji Hye Lee, Donghyuk Lee, Chris Wilkerson, Konrad Lai, and Onur Mutlu. Flipping bits in memory without accessing them: An experimental study of DRAM disturbance errors. SIGARCH Comput. Archit. News, 42(3):361–372, June 2014.
[11] Charng-da Lu and Daniel A. Reed. Assessing fault sensitivity in MPI applications. In Proceedings of the 2004 ACM/IEEE Conference on Supercomputing, page 37. IEEE Computer Society, 2004.
[12] Yixin Luo, Sriram Govindan, Bikash Sharma, Mark Santaniello, Justin Meza, Aman Kansal, Jie Liu, Badriddine Khessib, Kushagra Vaid, and Onur Mutlu. Characterizing application memory error vulnerability to optimize datacenter cost via heterogeneous-reliability memory. In 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, pages 467–478. IEEE, 2014.
[13] Bianca Schroeder and Garth A. Gibson. Understanding failures in petascale computers. In Journal of Physics: Conference Series, volume 78, page 012022. IOP Publishing, 2007.
[14] Vilas Sridharan, Nathan DeBardeleben, Sean Blanchard, Kurt B. Ferreira, Jon Stearley, John Shalf, and Sudhanva Gurumurthi. Memory errors in modern systems: The good, the bad, and the ugly. SIGARCH Comput. Archit. News, 43(1):297–310, March 2015.
[15] [author and title lost], Japan, 2nd edition, 2002.