量子化誤差を考慮したニューラルネットワークの学習手法

Quantization Error-aware Neural Network Training

Kazutoshi Hirose¹*, Kota Ando¹, Kodai Ueyoshi¹, Masayuki Ikebe¹, Tetsuya Asai¹, Masato Motomura¹, and Shinya Takamaeda-Yamazaki¹

¹ Graduate School of Information Science and Technology (IST), Hokkaido University

Abstract: Deep neural networks are a widely used technology for various machine learning applications. Low-power neural network hardware calls for training techniques that achieve high accuracy at low numerical precision. We propose a quantization-error-aware training method that improves the accuracy of quantized neural networks. Our approach adds a regularization term, based on the quantization errors of the weights, to the loss function. Evaluation results on MNIST and CIFAR-10 show that the proposed approach achieves higher accuracy than the standard approach.

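1 Introduction

Deep neural networks (DNNs) are widely used as a machine learning technology in applications such as image recognition, speech recognition [9], and translation [6]. Larger and more complex networks have enabled DNNs to handle more advanced tasks than conventional machine learning, but they require enormous computational resources. On servers, where both inference and training are performed, GPUs are mainly used; they offer high performance but consume a large amount of power. In contrast, future IoT devices such as mobile terminals and embedded devices must operate in constrained environments. DNN techniques specialized for hardware that runs with low power and little memory are therefore in demand.

Numerical quantization is one such hardware-oriented DNN technique. Quantization represents values expressed in floating point using representations such as fixed point [11], logarithmic [12], or binary [2]. Quantized values reduce memory usage and simplify arithmetic. For example, computations between binarized weights and activations need no multiplication, which is replaced by XNOR operations. Because an XNOR circuit is far simpler than a multiplier circuit, power consumption is greatly reduced [1].

However, quantization lowers recognition accuracy. The cause is the quantization error that arises when floating-point values are quantized. In general there is a trade-off between the coarseness of the weight representation and recognition accuracy: the more aggressively the weights are quantized, the lower the accuracy tends to be. The challenge is therefore how far the precision of the weights can be reduced while recognition accuracy is preserved.

In this work, we propose a training method that takes quantization error into account in order to improve the recognition accuracy of quantized neural networks. The quantization error of the weights should be small to prevent a drop in accuracy. Our method reduces it by incorporating a regularization term based on the quantization error into the objective function during training.

Figure 1: Logarithmic quantization (example of LogQuant(w, 3, 1)). The plot shows the quantized weight w_q against w for rounding with Eq. (3) and with Eq. (4).

* Contact: Graduate School of Information Science and Technology, Hokkaido University, Kita 14, Nishi 9, Kita-ku, Sapporo, Hokkaido 060-0814, Japan. Information Science Building (Building M), 2F, Integrated Architecture Laboratory. E-mail: [email protected]

Japanese Society for Artificial Intelligence (JSAI) SIG Technical Report, SIG-FPAI-B507-01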

Figure 2: Quantization error (|QE|) under logarithmic quantization. The plot shows |QE| against w for rounding with Eq. (3) and with Eq. (4).

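2 Quantized Neural Networks

Quantizing a neural network means representing values expressed in floating point with less information, for example as fixed-point or binarized (two-valued, ±1) numbers. More recently, neural networks using logarithmic quantization, which represents values by their logarithms, have been proposed. Logarithmic quantization provides both a wide representable range and fine resolution for small values. In this work we consider logarithmic quantization and binarization as weight quantization methods. Quantization of activations is not considered.

Let $w$ be the original real-valued weight. Logarithmic quantization is then written as

$$ \mathrm{AP2}(w) = \mathrm{sign}(w) \times 2^{\mathrm{round}(\log_2 |w|)} \tag{1} $$

AP2(·) stands for approximate-power-of-2 and approximates its argument by the nearest power of 2. With this operation, a floating-point value can be logarithmically quantized. Furthermore, let bitwidth be the number of bits needed to represent one value (including the sign bit), and let maxV and minV be the maximum and minimum values representable with this bit width. The bit constraint is then imposed as

$$ \mathrm{LogQuant}(w, \mathit{bitwidth}, \mathit{maxV}) = \mathrm{Clip}(\mathrm{AP2}(w), \mathit{minV}, \mathit{maxV}) \tag{2} $$

Eq. (1) contains a round operation. Usually round is the following round-half-up operation:

$$ \mathrm{round}(x) = \begin{cases} \mathrm{ceil}(x) & (x - \lfloor x \rfloor \ge 0.5) \\ \mathrm{floor}(x) & (x - \lfloor x \rfloor < 0.5) \end{cases} \tag{3} $$

Because its midpoint is 0.5, this round operation minimizes the quantization error in the real domain. In the logarithmic domain, however, the midpoint is not 0.5 but $\log_2(3/2)$: the real-domain midpoint between $2^n$ and $2^{n+1}$ is $\tfrac{3}{2} \cdot 2^n$, whose base-2 logarithm is $n + \log_2(3/2)$. Using the following rule in Eq. (1) therefore reduces the quantization error:

$$ \mathrm{round}(x) = \begin{cases} \mathrm{ceil}(x) & (x - \lfloor x \rfloor \ge \log_2(3/2)) \\ \mathrm{floor}(x) & (x - \lfloor x \rfloor < \log_2(3/2)) \end{cases} \tag{4} $$

Figure 2 shows the error of logarithmic quantization with these rules. For binarization we use

$$ \mathrm{Binarize}(w) = \mathrm{sign}(w) = \begin{cases} +1 & (\text{if } w \ge 0) \\ -1 & (\text{otherwise}) \end{cases} \tag{5} $$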
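The quantizers of Eqs. (1) to (5) can be sketched for scalar weights in plain Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, zero is mapped to zero by AP2 as an assumption (its logarithm is undefined), and the clipping in `log_quant` assumes that the magnitude levels are `max_v` times successive powers of 1/2, one level per code left after the sign bit, with the sign preserved. The source does not spell out how minV is derived.

```python
import math

LOG2_3_2 = math.log2(1.5)  # threshold of Eq. (4), about 0.585


def round_exponent(x, threshold):
    """Round x up when its fractional part reaches `threshold`, else down."""
    lower = math.floor(x)
    return lower + 1 if x - lower >= threshold else lower


def ap2(w, threshold=LOG2_3_2):
    """Eq. (1): approximate w by a signed power of two."""
    if w == 0.0:
        return 0.0  # assumption: log2(0) is undefined, so zero stays zero
    exponent = round_exponent(math.log2(abs(w)), threshold)
    return math.copysign(2.0 ** exponent, w)


def log_quant(w, bitwidth, max_v, threshold=LOG2_3_2):
    """Eq. (2): AP2 followed by clipping of the magnitude (assumed level layout)."""
    levels = 2 ** (bitwidth - 1)           # codes left after the sign bit
    min_v = max_v * 2.0 ** -(levels - 1)   # assumed smallest magnitude
    magnitude = min(max(abs(ap2(w, threshold)), min_v), max_v)
    return math.copysign(magnitude, w)


def binarize(w):
    """Eq. (5): +1 for w >= 0, -1 otherwise."""
    return 1.0 if w >= 0 else -1.0


# w = 0.72 lies between 0.5 and 1; its real-domain midpoint is 0.75.
print(ap2(0.72, threshold=0.5))  # Eq. (3) rounds up to 1.0, error 0.28
print(ap2(0.72))                 # Eq. (4) rounds down to 0.5, error 0.22
```

For w = 0.72, the fractional part of log2|w| is about 0.526, which is above 0.5 but below log2(3/2), so only the rule of Eq. (4) picks the nearer power of two.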

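3 Quantization Error-based Regularization

Existing quantized neural networks do not take into account the quantization error (QE) that arises when values are quantized. In this work we propose a training method for neural networks that considers the quantization error. To limit the accuracy loss caused by QE, the weights should be trained so that QE becomes small. We therefore add QE to the objective function as a regularization term. Training the weights on this objective reduces QE and the recognition error at the same time, which makes it possible to preserve recognition accuracy.

Let $w$ be the real-valued weights and $w_q$ the weights after quantization (logarithmic quantization or binarization). The quantization error is defined as

$$ \mathrm{QE}(w) = w - w_q \tag{6} $$

The quantization error-based regularization (QER) term is defined as

$$ \mathrm{QER}(w) = \lVert w - w_q \rVert^2 \tag{7} $$

With $E(w)$ denoting the recognition error function, the objective function with the QER term added is defined as

$$ L(w) = E(w) + \eta_2 \, \mathrm{QER}(w) \tag{8} $$

Training proceeds in the direction that minimizes this objective, so that the recognition error and QE decrease together:

$$ \min_w L(w) \tag{9} $$

Ordinary neural networks sometimes use the L2 norm of the weights as a regularization term to prevent overfitting [5]. L2 regularization pulls the weights toward 0 and thereby keeps them from diverging. The proposed QER instead pulls the real-valued weights toward their quantized values. As a result the quantization error shrinks and the recognition error can be kept low.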
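The pull of the QER term toward the quantized weights can be illustrated with a few scalar weights. The sketch below is a hedged example rather than the authors' code: the helper names are hypothetical, binarization is used as the quantizer, and the quantized weight is treated as a constant when differentiating (the quantizer is piecewise constant). Algorithm 1 uses QE(w) itself as the gradient of QER; the exact gradient of a squared norm carries an extra factor of 2, which is assumed here to be absorbed into the coefficient.

```python
def binarize(w):
    """Eq. (5): +1 for w >= 0, -1 otherwise."""
    return 1.0 if w >= 0 else -1.0


def qe(weights, quantize):
    """Eq. (6): element-wise quantization error w - w_q."""
    return [w - quantize(w) for w in weights]


def qer(weights, quantize):
    """Eq. (7): squared norm of the quantization error."""
    return sum(e * e for e in qe(weights, quantize))


def qer_step(weights, quantize, eta2):
    """One descent step on the QER term alone, using QE(w) as the gradient."""
    errors = qe(weights, quantize)
    return [w - eta2 * e for w, e in zip(weights, errors)]


weights = [0.3, -0.6, 0.9]
before = qer(weights, binarize)
after = qer(qer_step(weights, binarize, eta2=0.1), binarize)
print(before, after)  # the regularizer shrinks after the step
```

Each step scales every error by (1 - eta2) as long as no weight changes sign, so the weights drift toward +1 or -1 instead of toward 0 as they would under L2 regularization.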

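Algorithm 1: Training a quantized neural network with QER

Require: a minibatch of inputs and targets $(x_0, x^*)$, previous weights $w$, previous learning rate $\eta^t$.
Ensure: updated weights $w^{t+1}$, updated learning rate $\eta^{t+1}$.

1. Forward propagation
for $l = 1$ to $L$ do
    $w_l^q \Leftarrow \mathrm{Quantize}(w_l)$
    $u_l \Leftarrow x_{l-1} \cdot w_l$
    if $l < L$ then
        $x_l \Leftarrow \mathrm{ReLU}(u_l)$
    end if
end for

2. Backward propagation
Compute $\partial E / \partial u_L$ knowing $u_L$ and $x^*$
for $l = L$ to $1$ do
    $\partial E / \partial u_{l-1}^q \Leftarrow \partial E / \partial u_l \cdot w_l^q$
    $\partial E / \partial w_l \Leftarrow (\partial E / \partial u_l)^T \cdot u_{l-1}^q$
end for

3. Accumulating the parameter gradients
for $l = 1$ to $L$ do
    $\partial \mathrm{QER}(w_l) / \partial w_l \Leftarrow \mathrm{QE}(w_l)$
    $w_l^{t+1} \Leftarrow w_l - \eta_1^t \cdot \partial E / \partial w_l - \eta_2^t \cdot \partial \mathrm{QER}(w_l) / \partial w_l$
    $\eta^{t+1} \Leftarrow \lambda \eta^t$
end for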
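The weight update in step 3 can be traced on a toy model with a single scalar weight. This is only a sketch under stated assumptions: the function `qer_update` is hypothetical, the forward pass is assumed to use the quantized weight, the error is assumed to be E = 0.5 (u - y)^2, and the gradient computed with the quantized weight is applied directly to the real-valued weight.

```python
def binarize(w):
    """Binarization: +1 for w >= 0, -1 otherwise."""
    return 1.0 if w >= 0 else -1.0


def qer_update(w, x, y, eta1, eta2, quantize=binarize):
    """One update of a single real-valued weight w on one sample (x, y)."""
    wq = quantize(w)       # quantized weight used in the forward pass (assumption)
    u = x * wq             # output of the one-weight "layer"
    dE_dw = (u - y) * x    # gradient of E = 0.5 * (u - y)**2, applied to w
    dQER_dw = w - wq       # the algorithm uses QE(w) as the QER gradient
    return w - eta1 * dE_dw - eta2 * dQER_dw


# The output already matches the target, so only the QER term moves w toward +1.
print(qer_update(0.3, x=1.0, y=1.0, eta1=0.1, eta2=0.1))

# The output has the wrong sign: the error term pushes w up,
# while the QER term pulls it toward its current quantized value -1.
print(qer_update(-0.2, x=1.0, y=1.0, eta1=0.1, eta2=0.1))
```

The second call shows why the two coefficients need balancing: with a large eta2 the regularizer can hold a weight at a quantized value that the error term is trying to leave.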

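4 Evaluation

This section compares ordinary training without the QER term against training with the QER term added to the objective function. The evaluation used the machine learning framework TensorFlow. As above, the numerical representations are logarithmic quantization and binarization. For logarithmic quantization, the weights of all layers were quantized with LogQuant(w, 4, 1). Training follows Algorithm 1. The learning rate η1 was set to 0.001. η2 was initialized to 0.00001 and multiplied by 1.2 every 10 epochs. With this coefficient, the weight updates emphasize the recognition error right after training starts and the quantization error toward the end of training. Adam [7] was used as the optimizer. The following two datasets were used for training.

1. MNIST [10] is a dataset of handwritten digit images from 0 to 9, consisting of 28×28 grayscale images. It contains 70,000 images: 60,000 for training and 10,000 for evaluation. No data augmentation was used. The network is a multi-layer perceptron with two hidden layers, as listed below, where FC denotes a fully connected layer.

   FC784-256, FC256-256, FC256-10

2. CIFAR-10 [8] is a 10-class image classification dataset consisting of 32×32 color images. It contains 60,000 images: 50,000 for training and 10,000 for evaluation. No data augmentation was used. The network is a CNN (convolutional neural network), as listed below, where C3-X denotes a convolutional layer with 3×3 filters and X output channels, and MP2 denotes a max-pooling layer.

   C3-64, MP2, C3-64, MP2, FC4096-384, FC384-192, FC192-10

Table 1: Recognition accuracy of each method

Method                         MNIST     CIFAR-10
float                          0.9777    0.6941
LogQuantize (4bit), w/o QER    0.9773    0.6844
LogQuantize (4bit), w/ QER     0.9783    0.7031
Binarize (1bit), w/o QER       0.9664    0.6724
Binarize (1bit), w/ QER        0.9709    0.6839

Figure 3: Convergence of recognition accuracy on CIFAR-10. Two panels plot accuracy against epoch (0 to 300): one for float, LogQuantize (4bit) w/o QER, and LogQuantize (4bit) w/ QER; the other for float, Binarize w/o QER, and Binarize w/ QER.

Table 1 shows the recognition accuracy and Figure 3 shows the convergence. On both the MNIST and CIFAR-10 benchmarks, and for both logarithmic quantization and binarization, training with the QER term improved recognition accuracy. In particular, LogQuantize (4bit) with QER exceeded the accuracy of float on both datasets.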
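The growth of the regularization coefficient and the layer sizes quoted above can be checked with a few lines of arithmetic. The sketch below makes several assumptions that the text does not state: epochs are counted from 0, the factor 1.2 is applied as a step at every 10th epoch, the convolutions preserve the spatial size, and each MP2 layer halves height and width. The function names are hypothetical.

```python
def eta2_at(epoch, initial=1e-5, factor=1.2, every=10):
    """Regularization coefficient after `epoch` epochs (stepwise schedule)."""
    return initial * factor ** (epoch // every)


def flattened_size(side, channels, num_pools):
    """Feature count after `num_pools` 2x2 poolings of a square feature map."""
    for _ in range(num_pools):
        side //= 2  # assumption: each MP2 halves height and width
    return side * side * channels


ETA1 = 1e-3

# First epoch at which eta2 reaches eta1 over a 300-epoch run
# (300 is taken from the epoch axis of the convergence plot).
crossover = next(e for e in range(301) if eta2_at(e) >= ETA1)
print(crossover)

print(28 * 28)                    # MNIST input size for the first FC layer
print(flattened_size(32, 64, 2))  # CIFAR-10 input size for the first FC layer
```

Under these assumptions the coefficient stays two orders of magnitude below η1 at the start and only overtakes it late in the run, which matches the stated intent of emphasizing the recognition error first and the quantization error last.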

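5 Related Work

Several methods have been proposed that quantize weights and compress them into a form suited to hardware. Shin et al. compressed weights in order to use a look-up table (LUT) [13]. Gysel et al. proposed a fine-tuning technique that yields hardware-oriented fixed-point weights [3]. These methods optimize the weights to raise recognition accuracy under a restricted numerical representation. Our work differs from them in that it takes the quantization error into account. It can, however, be used together with these methods.

Loss-aware binarization [4] adopts a proximal Newton algorithm with a diagonal Hessian approximation to directly minimize the loss with respect to the binarized weights. Our work is similar in that it considers the effect of quantization. We aim to improve recognition accuracy through a regularization term, and our method is also applicable to quantization schemes other than binarization.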

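6 Conclusion

In this work we proposed a training method that takes quantization error into account, aimed at hardware implementations of neural networks. Adding a regularization term based on the quantization error to the objective function makes it possible to train so that both the quantization error and the recognition error become small.

Future work includes applying the method to larger networks and evaluating it with linear quantization in addition to logarithmic quantization and binarization. Moreover, because the regularization term is introduced with a coefficient, techniques for dynamically optimizing this coefficient are worth considering.

Acknowledgments

This work was supported by JST ACCEL and Technova.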

References

[1] Ando, K., Orimo, K., Ueyoshi, K., Yonekawa, H., Sato, S., Nakahara, H., Ikebe, M., Asai, T., Takamaeda-Yamazaki, S., Kuroda, T., Motomura, M.: BRein Memory: A 13-layer 4.2 K neuron/0.8 M synapse binary/ternary reconfigurable in-memory deep neural network accelerator in 65 nm CMOS. In: 2017 IEEE Symposium on VLSI Circuits (VLSI-Circuits). pp. C24–C25. Kyoto, Japan (2017)

[2] Courbariaux, M., Hubara, I., Soudry, D., El-Yaniv, R., Bengio, Y.: Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1. ArXiv e-prints (Feb 2016)

[3] Gysel, P., Motamedi, M., Ghiasi, S.: Hardware-oriented Approximation of Convolutional Neural Networks. ArXiv e-prints (Apr 2016)

[4] Hou, L., Yao, Q., Kwok, J.T.: Loss-aware Binarization of Deep Networks. ArXiv e-prints (Nov 2016)

[5] Janocha, K., Czarnecki, W.M.: On Loss Functions for Deep Neural Networks in Classification. ArXiv e-prints (Feb 2017)

[6] Johnson, M., Schuster, M., Le, Q.V., Krikun, M., Wu, Y., Chen, Z., Thorat, N., Viégas, F., Wattenberg, M., Corrado, G., Hughes, M., Dean, J.: Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation. ArXiv e-prints (Nov 2016)

[7] Kingma, D.P., Ba, J.: Adam: A Method for Stochastic Optimization. ArXiv e-prints (Dec 2014)

[8] Krizhevsky, A., Nair, V., Hinton, G.: CIFAR-10 (Canadian Institute for Advanced Research), http://www.cs.toronto.edu/~kriz/cifar.html

[9] LeCun, Y., Bengio, Y., Hinton, G.: Nature. Nature (2016)

[10] LeCun, Y., Cortes, C.: MNIST handwritten digit database (2010), http://yann.lecun.com/exdb/mnist/

[11] Lin, D.D., Talathi, S.S., Sreekanth Annapureddy, V.: Fixed Point Quantization of Deep Convolutional Networks. ArXiv e-prints (Nov 2015)

[12] Miyashita, D., Lee, E.H., Murmann, B.: Convolutional Neural Networks using Logarithmic Data Representation. ArXiv e-prints (Mar 2016)

[13] Shin, D., Lee, J., Lee, J., Yoo, H.J.: 14.2 DNPU: An 8.1TOPS/W reconfigurable CNN-RNN processor for general-purpose deep neural networks. In: 2017 IEEE International Solid-State Circuits Conference (ISSCC). pp. 240–241 (Feb 2017)
