• 検索結果がありません。

Efficient Implementation and PerformanceEvaluation of Lattice/based Cryptography overMemory/Constrained Environments

N/A
N/A
Protected

Academic year: 2021

シェア "Efficient Implementation and PerformanceEvaluation of Lattice/based Cryptography overMemory/Constrained Environments"

Copied!
105
0
0

読み込み中.... (全文を見る)

全文

(1)

Efficient Implementation and Performance

Evaluation of Lattice/based Cryptography over Memory/Constrained Environments

袁, ヤ

https://doi.org/10.15017/4060003

出版情報:九州大学, 2019, 博士(機能数理学), 課程博士 バージョン:

権利関係:

(2)

Kyushu University

Efficient Implementation and Performance Evaluation of Lattice-based Cryptography over

Memory-Constrained Environments

Author:

Ye Yuan

A thesis submitted in fulfillment for the degree of Doctor of Philosophy in Functional Mathematics

in the

Graduate School of Mathematics

Supervisor:

Prof. Tsuyoshi Takagi

(3)
(4)

iii

Abstract

Traditional public-key cryptography algorithms such as the popular RSA and el- liptic curve cryptography is the critical technology to advanced cybersecurity.

However, since 1994 Peter Shor proposed an algorithm that can crack all RSA and elliptic curve cryptography, quantum computers using Shor’s algorithm that brings with it a significant threat to the modern cryptosystems, applications, com- munication protocols, and so on. Thus, the practical research of new cryptogra- phy secure against attacks by quantum computers is required.

In August 2015, the National Security Agency (NSA) announced its plans to tran- sition to quantum-resistant algorithms. In 2016, the National Institute of Stan- dards and Technology (NIST) launched a project to solicit, evaluate, and stan- dardize post-quantum cryptography (PQC) in a bid to develop the next-generation quantum-safe cryptographic standard. Lattice-based cryptography, as one of the strongest candidates of PQC, has attracted a lot of interest from the cryptographic community because of its high-security, efficiency, and applicability. In this thesis, we focus on the efficient implementation and performance evaluation of lattice-based cryptography.

Firstly, we propose an efficient implementation of a lattice-based encryption scheme on a memory-constrained Java Card. We implement the original ring-LWE based encryption scheme on a standard Java Card platform by combining the number theoretic transform and improved Montgomery modular multiplication without any cryptographic co-processor support. We then optimize discrete Ziggurat sam- pling and Knuth-Yao methods to sample from prescribed probability distributions on our Java Card. Our result demonstrates that implementing more lattice-based cryptosystems on memory-constrained devices is feasible.

Secondly, we present the first implementation of several candidate algorithms for the NIST PQC project on JavaScript-enabled platforms with good performance,

(5)

portability, and scalability. We use the number theoretic transform to speed up polynomial multiplication and the modified Knuth-Yao algorithm for efficient discrete Gaussian sampling. We report the performance of our JavaScript im- plementation on multiple platforms, including Web browsers, embedded devices, Android phones, and so on. Our proof-of-concept implementation demonstrates that some of the lattice-based cryptosystems can be implemented efficiently in JavaScript.

Lastly, based on the above results, we present an approach using lattice-based cryptography as a proof-of-concept for securing power substation communica- tions. We implement a digital signature Dilithium and an encryption scheme Kyber on an intelligent electronic device (IED), transmitted encrypted data on a simulated substation Ethernet LAN using multicast message packets based on the international standard IEC 61850, and measure their performance. Our imple- mentation performs better than traditional RSA cryptosystems that the execution of Dilithium and Kyber on the low-specification device could be completed in 2 or 3 ms. We show that some of the lattice-based cryptosystems could run fast on IEDs and probably could be used to ensure the confidentiality and integrity of substation communications in the quantum computing era. Hence, our work could be a good reference for lattice-based cryptography in the standardization process of NIST. In our future work, we expect to improve the implementation for particular platforms and investigate more lattice-based cryptosystems on mul- tiple platforms for the NIST PQC standardization project.

(6)

v

Acknowledgements

My deepest gratitude goes first and foremost to Prof. Tsuyoshi Takagi, my super- visor, for his constant encouragement and guidance. I am grateful to thank my thesis committee members for all of their guidance through this process. I ac- knowledge the contribution of Mr. Junting Xiao for his collaboration in the early stages of this work. I would like to thank Dr. Shinsaku Kiyomoto, Dr. Kazuhide Fukushima, Mr. Takehiro Shimada, and Mr. Noriyuki Ueda for their assistance.

I would also like to thank Prof. Peter Schwabe and his research fellow and all staff of ISARA Corporation for their useful discussions and kindly help during my internship. I am indebted to the “Kyushu University Leading PhD Program in Mathematics for Key Technologies” for my overseas internship support. Spe- cial thanks to the coordinator of Prof. Osamu Saeki for his constant enthusiasm and encouragement, and Ms. Mika Tomonaga and Ms. Sayuri Kojima for their excellent work.

(7)
(8)

vii

Contents

Abstract iii

Acknowledgements v

1 Introduction 1

1.1 Public-Key Cryptography . . . 1

1.2 Post-Quantum Cryptography . . . 2

1.3 Contributions of this Thesis . . . 3

1.4 Outline . . . 5

2 Lattice-based Cryptography 7 2.1 Lattice . . . 7

2.2 Discrete Gaussian Sampling . . . 10

2.3 Learning With Errors (LWE) Problem . . . 11

2.3.1 LWE-based Encryption Scheme . . . 12

2.4 Ring-LWE Problem . . . 13

2.4.1 Ring-LWE based Encryption Scheme . . . 13

2.4.2 Ring-LWE based Digital Signature . . . 14

2.5 Module Learning with Errors problem (MLWE) . . . 15

3 Efficient Algorithms for Lattice-based Cryptography 17 3.1 Discreta Gaussian Sampling . . . 17

3.1.1 Rejection Sampling . . . 17

3.1.2 Knuth-Yao Algorithm . . . 18

3.2 Polynomial Multiplication . . . 19

3.2.1 Karatsuba Algorithm . . . 20

3.2.2 Number Theoretic Transform (NTT) . . . 21

3.3 Montgomery Modular Multiplication (MMM) . . . 22

4 Memory-Constrained Implementation of Lattice-based Encryption Scheme on Standard Java Card 25 4.1 Introduction . . . 25

(9)

4.2 Preliminaries . . . 29

4.2.1 Discrete Gaussian Sampling on Java Card . . . 29

4.2.2 Ring-LWE based Cryptosystem . . . 30

4.3 Java Card Platform Specification . . . 31

4.4 Efficient Algorithms for Ring-LWE Scheme on Java Card . . . . 33

4.4.1 Discrete Gaussian Sampling Methods . . . 33

Discrete Ziggurat Algorithm . . . 33

Optimized Knuth-Yao Algorithm . . . 36

4.4.2 Montgomery Modular Multiplication . . . 38

4.4.3 Number Theoretic Transform on Java Card . . . 40

4.5 Performance of Ring-LWE Cryptosystem on Java Card . . . 43

4.5.1 Performance Results of Discrete Gaussian Sampling . . 43

4.5.2 Performance Results of LPR-LWE . . . 45

4.5.3 Detailed Running Time of LPR-LWE . . . 46

4.5.4 Experimental Comparison of NTT . . . 48

5 Portable Implementation of Post-quantum Encryption Schemes and Key Exchange Protocols on JavaScript-enabled Platforms 49 5.1 Introduction . . . 49

5.2 Lattice-based Cryptosystems . . . 52

5.2.1 LWR problem. . . 52

5.2.2 Binomial distribution . . . 52

5.2.3 Lizard and ring-Lizard . . . 53

5.2.4 Kyber . . . 54

5.2.5 Frodo . . . 54

5.2.6 NewHope . . . 55

5.3 Experimental Runtime Environments . . . 56

5.3.1 Web Browsers . . . 56

5.3.2 Tessel2 . . . 56

5.3.3 Android WebView, PC, and Mac . . . 57

5.4 Polynomial Multiplication in JavaScript . . . 58

5.5 Performance Results on Web Browsers . . . 58

5.6 Performance Results on Other JavaScript-enabled Platforms . . 62

5.6.1 Performance on Tessel2 . . . 62

5.6.2 Performance on Android phone . . . 63

5.6.3 Performance on Other Run-time Environments of Win- dows and macOS . . . 63

(10)

ix

6 Performance Investigation of Lattice-based Cryptography for Quan-

tum Secure Power Grid Communication 67

6.1 Introduction . . . 67

6.2 Overview of Implemented Lattice-based Digital Signature and Encryption Scheme . . . 69

6.2.1 Summary of Dilithium Signature. . . 69

6.2.2 Summary of Kyber Encryption Scheme . . . 71

6.3 Our Implementation and Performance Results for Securing GOOSE Messages . . . 73

6.3.1 Performance of Dilithium . . . 73

6.3.2 Performance of Kyber . . . 74

6.4 Discussion . . . 77

7 Conclusions 79 7.1 Future Work . . . 80

(11)
(12)

xi

List of Figures

2.1 Diagrams of two sets of linearly independent vectors (a) and (b) that generate a same lattice. . . 8 2.2 Diagrams of (a) the shortest vector problem (SVP) and (b) the

closest vector problem. For SVP, blue vectors are a basis, red vector is the shortest vector in the lattice. For CVP, blue vectors are a basis, green vector is the target, and red vector is the closest vector in the lattice. . . 9 3.1 A Knuth-Yao DDG tree . . . 19 4.1 Probability arrayk, includingl blocks . . . 36 4.2 Optimized probability array by deleting zeros in grey areas, the

sign bit indicates the difference between two consecutive block lengths . . . 38 4.3 Decomposition of the total running time (seconds) for different

parameter sets . . . 47 4.4 The proportions of computation time for different parameter sets 47 5.1 Running time (ms) of lattice-based cryptosystems Lizard, ring-

Lizard, Kyber, Frodo, and NewHope on Firefox . . . 59 5.2 Decomposition of computation time (ms) of Lizard, ring-Lizard,

Kyber, Frodo, and NewHope on Firefox . . . 59 5.3 Running time (ms) of Kyber and NewHope on Firefox, Google

Chrome, Opera, and Microsoft Edge . . . 60 5.4 Running time (second) of lattice-based cryptosystems Lizard, ring-

Lizard, Kyber, Frodo, and NewHope on Tessel2 . . . 62 6.1 IEC 61850 substation architecture for directly DERs control and

monitoring. . . 75 6.2 GOOSE message structure . . . 75 6.3 The process of GOOSE PDU data encryption and decryption us-

ing Kyber . . . 76

(13)
(14)

xiii

List of Tables

1.1 The Round-2 candidates of the NIST PQC project (as of October

16th, 2019) . . . 3

4.1 The selected Ring-LWE parameters work on Java Card . . . 30

4.2 The selected parameters for performance comparison of NTT on Java Card . . . 30

4.3 Features of Java Card Platform Specification 2.2.2 . . . 31

4.4 Experimental comparison of discrete Gaussian sampling methods on Java Card for different parameter sets . . . 44

4.5 Performance results (seconds) of LPR-LWE on Java Card for dif- ferent parameter sets (comparing with our previous work in [YFKT17]) 45 4.6 Experimental comparison of NTT for different parameter sets (seconds) . . . 48

5.1 Summary of the selected parameters that provide about 128-bit security . . . 50

5.2 Lattice-based public-key exchange protocol Frodo . . . 55

5.3 Lattice-based public-key exchange protocol NewHope . . . 56

5.4 Performance results on Android phone . . . 63

5.5 Performance results on WSH . . . 64

5.6 Performance results on Node.js . . . 64

5.7 Performance results on osascript . . . 65

5.8 Performance results on Pacifista . . . 65

6.1 Running time of signing and verification of different digital sig- natures . . . 74 6.2 Running time of sign and verify for different digital signatures . 77

(15)
(16)

1

Chapter 1

Introduction

1.1 Public-Key Cryptography

Public-key cryptography (PKC), sometimes referred to as asymmetric cryptogra- phy, is a cryptographic system to encrypt the plaintext and decrypt the ciphertext using key pairs. The security of PKC is based on the difficulties of solving hard mathematical problems. For example, the widely used RSA cryptography and elliptic curve cryptography (ECC) are based on the difficulties of solving integer factorization and discrete logarithm problems, respectively. PKC can generate a public key and a secret key as a key pair; any sender can encrypt a message using the receiver’s public key, and only the secret key owned by the receiver can be used to decrypt the encrypted message. Besides, a sender can also use a secret key to create a digital signature on a message, and anyone who holds the sender’s corresponding public key can verify if the signature is valid. It is infeasible for any other person to determine the kept secret key by analyzing the corresponding public key. Therefore, PKC is different from the symmetric cryptography such as AES or DES, which uses the same keys for both encryption and decryption.

PKC has become an essential element of modern cybersecurity. However, Shor’s algorithm [Shor94] that was invented by Peter Shor for factorization and com- putation of discrete logarithms can efficiently break the widely used RSA and ECC. The rapid development of quantum computers, coupled with Shor’s algo- rithm, brings a significant threat to modern cybersecurity in recent years, and will inevitably exert a significant impact on the traditional PKC.

(17)

1.2 Post-Quantum Cryptography

Post-quantum cryptography (PQC) refers to public-key cryptographic algorithms which are aiming to address concerns about the quantum threat for cybersecurity.

Unlike quantum cryptography of which security features are based on the quan- tum mechanical properties, the security of PQC relies on some hard mathematical problems which are unable to be solved by quantum computers. In the face of the security threat posed by quantum computers, the practical approaches of PQC should be secure against attacks by both classical and quantum computing and be able to run on various platforms, including hardware, applications, browsers, and so on.

In recent years, the rapid development of quantum computers and applications has accelerated PQC research and also affected existing security policies. In Au- gust 2015, the National Security Agency (NSA) announced its plans to move to- wards quantum-resistant algorithms to replace the recommended Suite B elliptic curve cryptographic algorithms specified by the National Institute of Standards and Technology (NIST). In 2016, NIST launched a PQC standardization project to select several acceptable candidate post-quantum cryptosystems as the next- generation cryptographic standard. By the end of November 2017 deadline, the NIST PQC project accepted 69 candidate algorithms as the round-1 submissions.

Among the all submissions, lattice-based cryptography has gained wide attention from academia and industry due to its high-security, efficiency, and applicabil- ity. A lattice is the set of all integer linear combinations of its basis vectors in a Euclidean space. Since the first lattice-based cryptographic construction in- troduced by Ajtai in [Ajt96], many lattice-based cryptographic algorithms pro- vide strong provable-security guarantees based on the worst-case hardness of lat- tice problems. Many efficient lattice-based cryptographic constructions leverage average-case hardness of lattice problems, for example, the short integer solu- tion problem [Ajt96], the learning with errors (LWE) problem [Reg05] and its variant called ring-LWE problem [LPR10]. As of October 2019, 26 candidates are remaining for the round-2 evaluation, and that nearly half (12 algorithms) are lattice-based cryptography (see Table1.1), which means some lattice-based cryp- tographic algorithms will be promising to become the new cryptographic standard in the future. Therefore, it is necessary to carry out the study on those candidate lattice-based cryptographic algorithms with different constructions.

(18)

1.3. Contributions of this Thesis 3 TABLE1.1: The Round-2 candidates of the NIST PQC project (as

of October 16th, 2019)

NIST PQC Project Round 2 Public-key Encryption Scheme/

Key Encapsulation Mechanism Signature

Lattice-based CRYSTALS-KYBER; FrodoKEM; LAC;

NewHope; NTRU; NTRU Prime; Round5;

SABER; Three Bears

CRYSTALS-DILITHIUM; FALCON;

qTESLA

Code-based BIKE; Classic McEliece; HQC; LEDAcrypt;

NTS-KEM; ROLLO; RQC -

Hash-based - SPHINCS+

Multivariate - GeMSS; LUOV; MQDSS; Rainbow

Supersingular Elliptic Curve Isogeny SIKE -

Zero-knowledge proofs - Picnic

1.3 Contributions of this Thesis

In fact, we started to study and investigate lattice-based cryptography even before the PQC plans of NSA and NIST were formally announced. We are focusing on the efficient implementation and performance evaluation of lattice-based cryp- tography. We also have studied many other lattice-based cryptosystems except the candidates for the NIST PQC project. Unlike those submitted algorithms which contain the complete source code, early cryptosystems did not provide full implementation or were short on implementation details. We have no idea about their performance unless we implement them by ourselves. Some of those cryptosystems have shown relatively poor performance after implemented (e.g.

see [SS11]). For the NIST PQC project, the final selected algorithms should be proved secure enough and run on various platforms with excellent performance.

Therefore, it is an important task to evaluate the performance of those candidate algorithms on multiple platforms.

The contributions of this thesis summarized as follows:

We propose an efficient implementation of lattice-based encryption scheme on a memory-constrained Java Card. In the IoT era, including widely used smart cards, memory-constrained devices require resisting attacks by quantum computers. Lattice-based encryption scheme possesses highly efficiency and reliability, which could run on small devices with limited storage capacity and computation resources such as IoT sensor nodes or smart cards. We present the first implementation of the original ring-LWE based encryption scheme on a standard Java Card platform by combining the number theoretic transform and improved Montgomery modular mul- tiplication. Without any cryptographic co-processor support, the running

(19)

time of decryption is nearly optimal (around 7 seconds corresponding to AES-128 security level). We also optimize discrete Ziggurat sampling and Knuth-Yao methods to sample from prescribed probability distributions on the Java Card. More importantly, we indicate that polynomial multiplica- tion can be performed on a standard Java Card platform efficiently, even if the big integers are not supported. Our result demonstrates that im- plementing more lattice-based cryptosystems on existing low-specification Java cards is feasible.

We are the first to present the implementation of several candidate lattice- based cryptographic algorithms for the NIST PQC project on multiple plat- forms in JavaScript with good performance, portability, and scalability.

Compared to powerful machines with ample amount of hardware resources, IoT devices (including the massive number of microcontrollers, smart ter- minals, and sensor nodes with limited computing capacity) and applica- tions should also have some post-quantum cryptography features for secu- rity and privacy. To ensure the correct execution of encryption algorithms on any platform, the portability of implementation becomes more critical.

As distinguished from C/C++, JavaScript is a popular cross-platform lan- guage that can be used for web applications and some hardware platforms directly, and it could be one of the solutions of portability. Therefore, we investigate and implement several round-1 candidate lattice-based cryp- tographic algorithms (Lizard, ring-Lizard, Kyber, Frodo, and NewHope) in JavaScript because of their applicabilities and efficiencies. We show and compare the performance of our JavaScript implementation on web browsers, embedded device Tessel2, Android phone, as well as several JavaScript-enabled platforms on PC and Mac. Our work demonstrates that implementing lattice-based cryptography on JavaScript-enabled platforms is feasible and results in desirable performance and portability.

As a proof-of-concept for message authentication, our implementation of the new lattice-based digital signature and encryption scheme shows the feasibility of post-quantum cryptography for securing modern power sub- station communications. The modern substation communication becomes more efficient using the generic object-oriented substation event (GOOSE) service of the IEC 61850 standard. GOOSE service should have crypto- graphic features to protect message integrity and confidentiality. The inter- national standard IEC 62351 recommends using digital signatures such as

(20)

1.4. Outline 5 RSA to secure GOOSE messages. However, large universal quantum com- puters will be able to break the traditional public-key cryptosystems include RSA and Diffie-Hellman, in the nearly future. Besides, some researches such as [FAM+10, DH11, FHU19] show that traditional RSA cryptogra- phy is unable to meet the stringent timing requirements of GOOSE mes- sages. Therefore, we aim to investigate the availability and performance of lattice-based cryptographic algorithms for securing GOOSE messages. In this paper, We implement the round-2 NIST PQC candidates “Kyber” and

“Dilithium” to ensure the authenticity and the confidentiality of GOOSE messages. We report the performance of our implementation on intelligent electronic devices (IEDs) widely used in substations. Our result shows that lattice-based cryptography runs fast on IEDs and probably could be used to ensure the confidentiality and integrity of substation communications.

1.4 Outline

The rest of this thesis is organized as follows: Chapter 2 explains our notation and gives a brief mathematical background of lattice-based cryptography. Chap- ter 3 describes the efficient implementation techniques for lattice-based cryptog- raphy and cryptosystems for variants of LWE. Chapter 4 proposes our efficient implementation approach of the ring-LWE based cryptosystem on a memory- constrained Java card platform. Chapter 5 introduces the portable implementation of lattice-based encryption schemes and key exchange protocols on JavaScript- enabled platforms. Chapter 6 discusses our feasibility study of new lattice-based cryptosystems in the modern power substation monitoring and control system.

Finally, we conclude the thesis in Chapter 7.

(21)
(22)

7

Chapter 2

Lattice-based Cryptography

In this chapter, we present the relevant mathematical background for lattice, dis- crete Gaussian distribution, lattice problems, the LWE-based cryptosystem and its variants schemes.

Throughout this thesis, bold italic letters denote polynomials such asforF, bold lower-case letters denote vectors such as v, and bold upper-case letters denote matrices such as A. For a matrix A (or a vector v), we denote by A (or v) its transpose. For a setS, we write s←Sto denote that an element sis chosen uniformly at random fromS; ifSis a probability distribution,s←Sdenotes that sis chosen according to S. We denote byRthe field of real numbers. Letnand qbe a positive integers, we denote byZqthe set of integers{0,1, . . . ,q−1}, and Rq=Zq[x]/(xn+1) the quotient polynomial ring such that for any polynomial f∈Rq, its maximum degree isn−1 and coefficients are inZq.

2.1 Lattice

A lattice is a subgroup of the Euclidean space. Letm,nbe two positive integers.

Letb= (b1,b2, ...,bm)be a vector wherebiR,i=1,2, ...,m. We denote byRm the m-dimensional Euclidean space which is the set of all vectors of lengths m with entries in R. Let L denote a subgroup of Rm i.e. LRm. Let b1,b2, ..., bn be linearly independent vectors inRm, such that the equation∑ni=1xibi=0is only satisfied byxi=0 fori=1,2, ...,n.

Definition 2.1 (Lattice) LetBRm×n be a matrix whose columns are linearly independent vectorsb1,b2, ...,bn. We denote byL(B)the lattice generated byB. The latticeL(B)is a set and defined as

L(B) ={∑ni=1xibi|xiZ}.

(23)

(a) (b)

FIGURE2.1: Diagrams of two sets of linearly independent vectors (a) and (b) that generate a same lattice.

We referBas a basis matrix of the latticeL(B), wheremandnare the dimension and the rank of the lattice, respectively. In lattice-based cryptography, it is com- mon to consider integer lattices only, i.e. L(B)Zm. In this chapter we will be concerned with full rank lattices i.e. n=m. Any lattice admits multiple different basises.

Definition 2.2 (Fundamental Parallelepiped) For any basisB= (b1,b2, ...,bn) Rm×n, the fundamental parallelepipedP(B)is defined as

P(B) ={∑ni=1xibi|0≤xi<1}. For the determinant ofL(B), we follow the definitions in [Reg05].

Definition 2.3 (Determinant) LetBRm×n be a basis matrix, the determinant of the latticeL(B)is then-dimensional volume ofP(B)and defined as

det(L(B)) =

det(BB).

If the basisBis a square matrix, the determinant of the full rank latticeL(B) is the absolute value of the determinant of the basis B i.e. det(L(B)) =

|det(B)|.

We denote by v the Euclidean norm of a vector vRm in this chapter. Let λ1(L(B))denote the minimum distance for a latticeL(B), which is the length of a shortest non-zero vector inL(B) i.e. λ1(L(B)) = min

v∈L(B)\{0}v. For an arbitrary target vector t Rm which is not necessarily in a lattice L(B), we denote by dist(t,L(B))its distance toL(B) i.e. dist(t,L(B)) = min

v∈L(B)vt.

Many lattice-based cryptographic algorithms can be proved secure assuming the hardness of lattice problems in the worst-case. For example, two certain compu- tational problems on lattices are shortest vector problem (SVP) and closest vector problem (CVP). Here we give the definitions of SVP and CVP:

(24)

2.1. Lattice 9

(a) SVP (b) CVP

FIGURE 2.2: Diagrams of (a) the shortest vector problem (SVP) and (b) the closest vector problem. For SVP, blue vectors are a basis, red vector is the shortest vector in the lattice. For CVP, blue vectors are a basis, green vector is the target, and red vector is the

closest vector in the lattice.

Definition 2.4 (Shortest Vector Problem (SVP)) Given a basis B of a lattice L(B), the problem is to find a shortest non-zero vectorvL(B)i.e. v= λ1(L(B)).

Definition 2.5 (Closest Vector Problem (CVP)) Given a basisBof a latticeL(B) and a target vectortRmwhich is not necessarily inL(B), the problem is to find a vectorvL(B)closest toti.e.vt=dist(t,L(B)).

The conjectured intractability of such problems is central to constructions of secure lattice-based cryptosystems. Some approximation lattice problems are needed to get average-case to worst-case reductions in practice. We consider the following approximation variants of hard problems on lattices.

Definition 2.6 (Approximate Shortest Vector Problem (Appr. SVP)) Letγ >1 be a real number. Given a basisBof a latticeL(B), the problem is to find a short non-zero vectorvL(B)withv∥ ≤γλ1(L(B)).

Definition 2.7 (Bounded Distance Decoding (BDD)) Letγ>1 be a real number.

Given a basis Bof a lattice L(B) and a target vector tRm which is not necessarily inL(B), the problem is to find a short vectorvL(B)closest toti.e. vt∥ ≤d whered1(L(B))/2γ dist(t,L(B)).

We can see that BDD is a special case of the approximate variant of CVP. Both SVP and CVP have been proved to be NP-hard in their exact (when γ = 1) (see [Ajt98, Boas81]) or approximate versions for small factors (see [Mic98, DKRS03, Khot06, Pei08]).

For convenience, we write L to be the lattice L(B) if the knownBis a basis in the following sections.

(25)

2.2 Discrete Gaussian Sampling

In this section, we introduce the definiton of discrete Gaussian distribution.

Definition 3.1 (Gaussian Function) Given a positive real numbers, the Gaussian function ρs(x) with mean 0 is defined asρs(x) =exp(π|x|2/s2)for any x∈R.

Definition 3.2 (Discrete Gaussian Distribution) Given a positive real numbers, the discrete Gaussian distributionDs(x)with mean 0 is defined as

Ds(x) =ρs(x)/∑y=−∞ρs(y), for anyx∈R.

Let mbe a positive integer, we extend this definition to a lattice LRm. For a xL, we writeρL,s(x) =exp(πx2/s2)andDL,s(x) =ρL,s(x)/ρL,s(L), where ρL,s(L)denotes∑y∈Lρs(y). In this thesis, first we sample integer arrays randomly according to DZm,s, then we output results that each sampled integer is over Zq

by giving a positive integerq.

No finite computation can produce a discrete Gaussian distribution. In some lattice-based cryptosystems, we need efficient algorithms that produce distribu- tions which are close to the desired discrete Gaussian distribution statistically.

Definition 3.3 (Statistical Distance) LetX andY be two random variables cor- responding to given distributions overZ. The statistical distance between their distribution is defined as

∆(X,Y) =12x=−∞|Pr[X=x]−Pr[Y =x]|.

The statistical distance between the produced distribution and the desired dis- crete distribution should be negligible. Therefore, choosing a suitable length of the tail-cut factor for a target discrete Gaussian distribution is necessary; oth- erwise no sampling algorithm could cover it. The tail-bound is closely related to the maximum statistical distance allowed by the security discrete Gaussian parameters [Lyu12, RVV13]. The required statistical distance between those two distributions should be less than 2−λ with λ between 90 and 128. We use [MR04, GPV08] to compute the value ranges of a safe tail-cut of Ds. Note that sampling values from the discrete Gaussian distribution are different from sampling values from a normal distribution [TLLV07].

(26)

2.3. Learning With Errors (LWE) Problem 11 Computing the suitable probabilities of the tails is a resource-consuming oper- ation depending on the floating-point capabilities of the experimental platform.

It requires a high precision floating-point operation and/or large storage require- ment [BCG+13, CWB14]. However, small devices such as smart cards have lim- ited storage space and poor computation capacities and do not support floating- point operation; hence, it is a challenge to sample from a target discrete Gaussian distribution on small devices. The standard way of discrete Gaussian sampling is rejection sampling that costs less memory but is not very efficient. We will introduce two discrete Gaussian sampling algorithms in Chapter 3 and describe our modified approaches for memory-constrained implementation in Chapter 4.

2.3 Learning With Errors (LWE) Problem

In this section, we introduce the definition of the original LWE-based problem proposed by Oded Regev in [Reg05].

Definition 4.1 (q-ary lattices) Letm, nbe two positive integers, and qa prime modulus. Given an integer matrixAZm×nq , the two typesn-dimensional q-ary lattices are defined as follows:

Lq(A) ={vZn|v=Az(mod q), for allzZm}, Lq(A) ={vZn|Av=0(mod q)}.

For a secret vectorschosen uniformly at random fromZnq, LWE outputs(A,b= As+e)Zmq×n×Zmq×1, whereAis chosen uniformly at random, with an errore sampled from a discrete Gaussian distribution.

Definition 4.2 (Search LWE problem) Letm, nbe positive integers, qa prime modulus, andsa positive real number. Given an integer matrix AZmq×n

and an integer vectorbZmq, the problem is to find the secretsZnqsuch thatb=As+e, where the erroreis sampled fromDZm,s.

Definition 4.3 (Decision LWE problem) Letm,nbe positive integers,qa prime modulus, and s a positive real number. Given integer matrices (A,b) Zm×nq ×Zm×1q , the problem is to distinguish b between a uniformly dis- tributed random vector fromZmq and a vectorAs+e(mod q), wheresZnq

is a secret andeis an error sampled fromDZm,s.

LWE can be viewed as a BDD instance in latticeLwith target b=As(mod q) sinceeis taken to be a small error. Without the error adding, both of such LWE

(27)

problems are easy to solve using Gaussian elimination. In [Reg05] Regev proved that search and decision LWE problems are as hard to solve as several worst-case lattice problems. An efficient solution to the LWE problem implies an efficient algorithm for solving NP-hard SVP; therefore, the LWE problem can be used to construct secure public-key cryptosystems assuming the hardness of SVP.

2.3.1 LWE-based Encryption Scheme

Oded Regev proposed the original public-key cryptosystem in [Reg05] based on the hardness of the LWE problem. In this section, we explain the construction using Regev’s multi-bit LWE-based encryption scheme described in [PVW08, MR08] as an example.

Regev’s multi-bit LWE-based encryption scheme is parameterized by positive integersm,n,l,q,t,r, and a real numberα >0. It requires a pair of simple error- tolerant encoder and decoder, given byencode: ZltZlqanddecode: ZlqZlt, whereZlt is the message alphabet, and the message length>1. For example, let t =2, for a vectormZl2, we have encode(m) =m, which means the element¯ mofZl2multiplied by⌊q/2⌋, anddecode(m) =¯ 1 if ¯mis closer to⌊q/2⌋(modq) and 0 otherwise. The following procedures define the Regev’s multi-bit LWE- based encryption scheme:

KEY GENERATION(l,m,n,q,s):

Letsbe the Gaussian parameter. A matrixAZqm×nand a matrixSZnq×l are generated uniformly at random. An errorE= (e1,e2, ...,el)and each columneiis sampled fromDZm,sfori=1,2, ...,l. We compute the matrixB=AS+EZmq×l. The public keypkis the integer matrix pair(A,B)Zqm×n×Zmq×land the secret keyskisSZnq×l.

ENCRYPTION(l,m,n,q,s,pk,m):

Let a vectormZl2be the plaintext, and we sampler←DZm,s. By inputting the public key(A,B), we computea=ArZnq andb=Br+encode(m)∈Zlq. The cipher text is the integer vector pair(a,b).

DECRYPTION(l,n,q,sk,a,b):

Outputdecode(b−Sa)Zl2.

(28)

2.4. Ring-LWE Problem 13

2.4 Ring-LWE Problem

In this section, we explain the definition of the original ring-LWE based problem proposed in [LPR10].

There is a natural bijection between Znq and the previously defined ring Rq = Zq[x]/(xn+1). We can identify vectors in a lattice as ring elements. If our lattice happens to correspond to an ideal in the ring, we call such lattice an ideal lattice. SVP on ideal lattices is connected with the hardness of the ring-LWE problem, which, similar to the LWE problem, asks to find a secret s∈Rq by giving a polynomial pair(a,b=as+e)∈Rq×Rq, where the coefficients of the polynomial a are chosen uniformly at random and the coefficients of the error polynomialeare sampled from a discrete Gaussian distribution.

Definition 5.1 (Search Ring-LWE problem) Letnbe a positive integer,qa prime modulus, and s a positive real number. Given polynomials a,b∈Rq, the problem is to find the secret s∈Rq such thatb=as+e∈Rq, where the coefficients of the erroreare sampled fromDZn,s.

Definition 5.2 (Decision Ring-LWE problem) Let n be a positive integer, q a prime modulus, and s a positive real number. Given polynomials a,b∈ Rq, the problem is to distinguish b between a random polynomial which coefficients are generated uniformly fromRqand a polynomialas+e(mod q), wheres∈Rqis a secret andeis an error with coefficients sampled from DZn,s.

Lyubashevsky et al. proved that such search and decision ring-LWE problems are also hard on average and are reducible to the NP-hard SVP over ideal lattices (see [LPR10]); hence, ring-LWE can be used in efficient constructions of lattice- based scheme.

2.4.1 Ring-LWE based Encryption Scheme

Here we introduce the original ring-LWE based encryption scheme [LPR10]

which is more efficient than the original matrices based Regev’s LWE cryptosys- tem. It uses constructed polynomials over an ideal integer ring Rq instead of the structured integer matrices, which is based on the hardness of solving the ring-LWE problem by appropriately parameterized ideal lattices. All polynomial operations take place inRq.

(29)

Let Σ be a message alphabet, the message encoder and decoder are a pair of functions within a certain tolerance, that is, encode: Σn Rq and decode: Rq

Σn, such that decode(encode(m) +e) =m is an anti-operation of encoding for any error ewith coefficients sampled from DZn,s. The following procedures define the original ring-LWE based encryption scheme:

KEY GENERATION(n,q,s):

Let s be the Gaussian parameter. Sample an error polynomial e∈Rq DnZ,s. Polynomialsa,s∈Rqare chosen uniformly at random. Computeb=as+e∈Rq. The public keypkis the polynomial pair (a,b)∈Rq×Rqand the secret keyskis the polynomials∈Rq.

ENCRYPTION(n,q,s,pk,m):

The positive real numbersis the Gaussian parameter. Let a polynomialm∈R2be the plaintext. Polynomialst,e1,e2∈Rqare sampled according toDZn,s. Let ¯m= encode(m)∈Rq. By inputting the public key(a,b), we computec1=at+e1∈Rq and c2=bt+e2+encode(m)∈Rq. The cipher text is the polynomial pair (c1, c2).

DECRYPTION(n,q,sk,c1,c2):

Outputdecode(c2−c1s)∈Zn2.

2.4.2 Ring-LWE based Digital Signature

Gentry et al. proposed a hash-and-sign signature [GPV08] which is based on the hardness of worst-case lattice problems and has large signature size. Lyuba- shevsky constructed a digital signature [Lyu09] using the Fiat-Shamir frame- work [FS86] which is based on the hardness of ideal lattice problems. The sub- sequent signature scheme [Lyu12] improved by Lyubashevsky is based on the hardness of the ring-LWE problem and reduces the signature size. Güneysu et al. proposed and implemented an optimized signature scheme [GLP12] based on [Lyu09, Lyu12] using signature compression techniques. The summary of the combination of Lyubashevsky’s digital signature schemes [Lyu09, Lyu12] de- scribed in [GLP12] is described as follows:

KEY GENERATION(n,q):

(30)

2.5. Module Learning with Errors problem (MLWE) 15 Choose a uniformly random polynomial a∈Rq, and sample s1,s2 that each co- efficient of them is sampled from a triplet T1={−1,0,1}. Then compute t= as1+s2. The public keypkis (a,t) and the secret keyskis (s1,s2).

SIGN(k,n,q,sk,µ):

Letµ be a message. Denote byHa hash function that outputs an-elements array which has all zero elements except for at most 32 elements are±1. Sampley1,y2 that each coefficient of them is from a 2ktupleT2={−k, ...,0, ...,k}. Compute a polynomial c of which coefficient is H(ay1+y2,µ), and then compute z1 = cs1+y1,z2= cs2+y2. If each coefficient of z1 and z2 are from a 2(k32)- tupleT3={−(k32), ...,0, ...,k−32}, output (z1,z2,c); otherwise return to the beginning and start the signing process over.

VERIFY(k,n,q,pk,µ,z1,z2,c):

Verify and accept if and only if each coefficient of z1 and z2 are from T3 and c=H(az1+z2−ct,µ).

[Lyu09] and its improvement scheme [Lyu12] are based on the worst-case hard- ness of Appr. SVP and are proved to be strongly unforgeable.

2.5 Module Learning with Errors problem (MLWE)

A variant of both LWE and ring-LWE, called MLWE, was proposed by [LS12, BGV14, LS15]. Letnandkbe positive integers, we denote bynandkthe dimen- sion of the polynomial ringRqand the rank of a module over(Rq)k, respectively.

The dimension of the corresponding module lattice is equal tonk. We denote by

a,bthe dot product of two vectorsa,b(Rq)k.

Definition 6.1 (Search MLWE problem) Letk,nbe positive integers,qa prime modulus, and s a positive real number. Given a polynomial vector a (Rq)k, the problem is to find the secret vectors(Rq)ksuch thatb=a,s+ e∈Rq, wheree∈Rqis an error with coefficients sampled fromDZn,s. Definition 6.2 (Decision ring-LWE problem) Let k, n be positive integers, q a

prime modulus, and 0 a positive real number. Given (a(Rq)k,b∈Rq), the problem is to distinguishbbetween a random polynomial which coeffi- cients are generated uniformly fromRqand a polynomiala,s+e(modq),

(31)

wheres(Rq)kis a secret ande∈Rqis an error with coefficients sampled fromDZn,s.

In [AD17] Albrecht and Deo proved that MLWE problem is at least as hard as ring-LWE problem. Some candidates submitted to the NIST PQC project such as Kyber [BDK+17, ABD+19] and Dilithium [DKL+17, DKL+19] are MLWE- based algorithms. We will introduce our implementation of Kyber on multiple platforms in Chapter 5, and 6 and that of Dilithium for securing power system communications in Chapter 6.

(32)

17

Chapter 3

Efficient Algorithms for

Lattice-based Cryptography

In this chapter, we describe some efficient algorithms of our implementation tech- niques for discrete Gaussian sampling, polynomial multiplication, and modular multiplication. We improve those algorithms for the particular implementation and will give detailed introductions of our approaches in Chapters 4, 5, and 6.

3.1 Discreta Gaussian Sampling

In this section, we describe the basic rejection sampling method and an efficient sampling method called Knuth-Yao algorithm which doesn’t require floating- point arithmetic.

3.1.1 Rejection Sampling

Rejection sampling is first error sampling approach for lattice-based cryptogra- phy. Let t be a positive integer. Given a positive real number s, the rejection sampling outputs a valuexthat is chosen uniformly from[−ts, ...,ts]∩Z, wheret is a the tail-cut factor that determines where to drop the negligible probability of far samples. By choosing a suitable tail-cut factort, the resulting distribution can have negligible difference from the desired distribution.

Algorithm 1 shows the basic rejection sampling method for discrete Gaussian distribution (see [GPV08, DN12]). The support methodRandomInt(a,b)in Step 2 outputs an integer that is sampled uniformly at random from the range [a,b], where a and b Z. The support method RandomFloat() in Step 4 outputs a

(33)

limited-precision floating-point number that is sampled uniformly at random from the range (0, 1].

Input: A floating-point numbersand a positive integert Output: A sample valuex∈Z

1 xmin=⌊−ts⌋ ∩Z,xmax=⌈ts⌉ ∩Z;

2 x=RandomInt(xmin,xmax);

3 p= exp(πx2/s2);

4 r=RandomFloat();

5 Ifr<pthen returnx;else gotoStep 2

Algorithm 1:Rejection Sampling overZ

When it comes to actual implementation, rejection sampling only needs to store a few parameters and hence enjoys a small memory footprint. However, it requires a high-precision floating-point number opearation and easily becomes a bottle- neck in runtime. We can create a look-up table to store the probability values of all possible values to speed-up the sampling.

3.1.2 Knuth-Yao Algorithm

A more efficient discrete Gaussian sampling method is Knuth-Yao algorithm.

We use a method described in [DG14] to generate a matrix of the finite binary expansions of the probability values. Let (p1,p2, ...,) be an infinite probability value vector, its binary expansion tree that contains two types of nodes: internal nodes and terminal nodes. Each internal node has two children labeled with 0 and 1, respectively, and each terminal nodes has no child. We define an infinite discrete distribution generating (DDG) tree of a discrete random value with the given probability value vector if pi=∑k>0bi(k)/2kfor alli=1,2, ..., wherebi(k) is the number of terminal nodes that is labeled with the valueion thek-th level.

In practical, we need to truncate the binary expansions of the probability values up to at leastlbits such that a statistical distance is less than 2l.

We convert such tree as a bit integer matrix from the finite binary expansions of the probability values. We denote by l Z the precision of binary expan- sion of the probability values. For n∈Z, there are n binary probability values p0,p1, ...,pn−1Zl2. A matrixPmat= (p0,p1, ...,pn−1)Zl2×nis composed of all the computed probability values, and each column stores one probability value.

(34)

3.2. Polynomial Multiplication 19

I 1

I 3

I

I 2

3 I 1

2 3

FIGURE3.1: A Knuth-Yao DDG tree

Let k0,k1, ...,kl−1Zn2 be all the rows of Pmat. In our case, Pmat can be con- verted and stored as a multi-dimensional array k= (k0,k1, ...,kl−1) Zln2 for Algorithm2.

With limited computing capacity, the computation of probabilities would become a time-consuming operation for some programming languages or platforms. In general, discrete Gaussian sampling requires a high-precision floating-point op- eration or large storage requirement [CWB14] to ensure the security level. Be inspired by the idea of implementing Knuth-Yao algorithm in FPGAs [RVV13], we modify and implement the algorithm. Moreover, discrete Ziggurat algorithm in [BCG+13] which allows for a time-memory trade-off has been changed to be portable in chosen platforms. In this case, Knuth-Yao algorithm shows better performance than modified discrete Ziggurat algorithm. In fact, with different features, the performance of those two sampling algorithms varies on different platforms. Therefore, we choose Knuth-Yao algorithm to speed up discrete Gaus- sian sampling.

3.2 Polynomial Multiplication

To multiply two polynomials, we can use the standard schoolbook polynomial multiplication. But it has high asymptotic complexity equal toO(n2). Other ap- proaches include the Karatsuba algorithm and number theoretic transform (NTT).

In this section, we introduce these two fast approaches of polynomial multiplica- tion.

(35)

Input: l,n∈Zand a multi-dimensional array of probability values k= (k0,k1, ...,kl1)Zln2

Output: Sample valuex∈Z

1 Letd=0,x=0,sign=0 ;

2 whiletruedo

3 r← {0,1};

4 d=2d+r;

5 fori=ndown to 0 by 1do

6 d=d−ki;

7 ifd=1then

8 ifi=0thensign← {0,1};

9 else

10 sign← {−1,1};

11 returnx=sign∗row_num;

12 endif

13 ifsign=1then returnx=i;

14 else

15 d=0;

16 r← {0,1};

17 d=2d+r;

18 x=0;

19 continue

20 endif

21 endif

22 endfor

23 x+ =1;

24 endwhile

Algorithm 2:Knuth-Yao Algorithm

3.2.1 Karatsuba Algorithm

We use an array (f0, . . . ,fn1)to represent a polynomialf=∑n−1i=0 fixi∈Rq. In other words, we simply create an array object to store the coefficients to represent a polynomial. In our case, those coefficients are stored in an ascending order.

That array object provides all coefficients from the lowest to the highest degree such that the position of each coefficient corresponds to the degree of the term to which it belongs. Algorithm3shows the Karatsuba algorithm for polynomial multiplication.

Karatsuba algorithm also has lower asymptotic complexity equal to O(nlogn) for multiplying polynomials with larger degrees. However, as a typical recur- sive method, the Karatsuba algorithm is not easy to run on memory-constrained

(36)

3.2. Polynomial Multiplication 21 Input: Polynomialsaandb

Output: c=ab

1 Letn= max(degree(a), degree(b)) + 1;

2 Ifn==1,then returnab;

3 Leta=a1(x) +a2(x)xn/2;

4 Letb=b1(x) +b2(x)x⌈n/2⌋;

5 c1(x)= KaratsubaAlgorithm(a1(x),b1(x));

6 c2(x)= KaratsubaAlgorithm(a2(x),b2(x));

7 c3(x)= KaratsubaAlgorithm(a1(x) +a2(x),b1(x) +b2(x));

8 returnc1(x) +c2(x)xn+ (c3(x)−c2(x)−c1(x))xn/2 Algorithm 3:Karatsuba algorithm

devices efficiently because it would be very costly in terms of the recursive opera- tion. Even if we rewritten the recursive version into non-recursive, the Karatsuba algorithm will run into a large overhead. Therefore, we use NTT to perform fast polynomial multiplication in this thesis.

3.2.2 Number Theoretic Transform (NTT)

NTT is a generalization of fast Fourier transform (FFT) algorithm by replacing complex numbers with an n-th primitive root of unity in a finite ring Zq, the integers modulo a prime integer q=1(mod 2n). Algorithm 4 shows the basic version of iterative forward NTT algorithm for polynomial multiplication that we used for implementation of ring-LWE cryptosystems and those variants. The support method BitReverse at line 1 is a permutation of a sequence ofn items, such that the new index of each element is the reverse of the bit string of its old index.

As stated in Chapter 2, we perform polynomial operations over the ring Rq = Zq[x]/(xn+1), wheren is a power of 2. Considering the parameters of the tar- get ring-LWE encryption schemes, we need to pad zeros to obtain an equiva- lent polynomial with 2ncoefficients, as well as to reduce the final product mod- ulo (xn+1). To multiply two polynomials a,b∈ Rq, it is required to compute the productc=NTT1(NTT(a)◦NTT(b)), where is point-wise multiplication.

NTT also has asymptotic complexity equal toO(nlogn); thus, it is faster than the standard schoolbook polynomial multiplication when a polynomial multiplier has high-enough degrees (see [GT01]).

(37)

Input: Polynomiala∈Zq[x]of degreen−1 andn-th primitive rootωnZqof unity

Output: PolynomialA=NTT(a)

1 LetA=BitReverse(a);

2 form=2tonbym=2mdo

3 ωm=ωnn/m,ω = 1;

4 for j=0tom/2−1do

5 fork=0ton−1 bymdo

6 tA[k+j+m/2](mod q);

7 u=A[k+j];

8 A[k+j] =u+t(mod q);

9 A[k+j+m/2] =u−t (mod q);

10 ω =ωmω;

11 returnA

Algorithm 4:Iterative number theoretic transform (NTT)

3.3 Montgomery Modular Multiplication (MMM)

In this section, we describe the basic idea of Montgomery multiplication and show the basic method of word-level Montgomery modular multiplication.

In 1985, Peter L. Montgomery proposed an algorithm, commonly referred to as Montgomery modular multiplication (MMM) [Mont85], to perform fast modular multiplication. Let n be a positive integer,q a n-bit odd positive modulus, and x,y <q two n-bit positive integers. The MMM algorithm outputs the integer modular productz=xy(mod q)without performing a division byq.

MMM works in the Montgomery domain. Let r=2nbe a positive integer such thatr>qand gcd(q,r) =1. Given an integerx≤q, the representative ¯xofxin the Montgomery domain is defined as ¯x=xr(mod q). Therefore, for any integer over Zq, we can easily compute its representative in the Montgomery domain.

Since r and qare co-prime, there exists a multiplicative inverse of r moduloq, written as r−1, such that r−1r (mod q) =1. Given ¯x and ¯y, the Montgomery product is defined as ¯z=x¯yr¯ −1(mod q)because

¯

z=x¯yr¯ −1(mod q)

= (xr)(yr)r1(mod q)

=xyr(mod q)

=zr(mod q).

(38)

3.3. Montgomery Modular Multiplication (MMM) 23 To achieve the fast modular multiplication in the Montgomery domain, we need to compute an additional integerqwith the propertyr−1r−qq=1. We can use the extended Euclidean algorithm to computer−1andq. Letω andsbe positive integers that satisfys=⌈n/ω. We can compute that−q0q0=1(mod 2ω)where q0andq0are theω-bit least significant word ofqandq, respectively. Therefore, q0is equal to−q−10 (mod 2ω).

Input: A word-size 2ω andn-bit integersq= (qs−1, ...,q0)2ω,

x= (xs−1, ...,x0)2ω, andy, where 0≤x,y≤q,s=⌈n/ω, and

−q01=q0(mod 2ω)

Output: Non-negative integerxyr−1(mod q)

1 Letz=0;

2 fori=0tos−1do

3 z=z+xiy;

4 z=z+z0(−q01)q;

5 z=z/2ω;

6 ifz≥qthenz=z−q;

7 returnz.

Algorithm 5:Montgomery multiplication (MM)

Algorithm5shows the word-level Montgomery multiplication. It can be used in traditional cryptography such as RSA which requires lots of operations modulo a large integer. For our implementation of ring-LWE based cryptography and its variants, the efficient polynomial multiplication approach, that is, NTT also requires multiple modular multiplication operations. However, some memory- constrained devices, such as the Java Card platform, do not support large integer operations. It tends to results in integer overflow when perform the computa- tion of Step 3 and Step 4 in Algorithm5on memory-constrained devices. Some of optimized Montgomery multiplication approaches (e.g., [OBPV03, MMM04, FSV07, MBCM16]) do not address this concern. Therefore, to implement ring- LWE based cryptographic algorithms on memory-constrained devices, we com- bine NTT and an optimized MMM method which performs only 8-bit integer multiplication within the body of the loop. We will introduce our approach in Chapter 4.

(39)
(40)

25

Chapter 4

Memory-Constrained

Implementation of Lattice-based Encryption Scheme on Standard Java Card

4.1 Introduction

Java Card is a mainstream applet development platform that allows developers to program in a subset of the Java language or run applets on a deployed smart card virtual machine. It is now widely used in telecommunications, finance, and banking, and aims to define a standard smart card computing and developing en- vironment [JCF13]. The card vendors and issuers can develop and test applets on smart cards quickly due to its interoperability and standard development in- terface; hence, the costs of applet development, employment, and maintenance are reduced significantly. Accordingly, Java Card has rapidly become the most popular smart card platform.

In the IoT era, tons of devices will be connected to the Internet and need crypto- graphic schemes for security and privacy. As an essential part of IoT ecosystems, embedded devices like smart cards also require safe and reliable data protection features. A significant unique benefit for Java Card vendors and issuers is inher- ent security, including secure memory/data protection and cryptographic support.

Typically, the standard Java Card platform has an insufficient computational ca- pability (see [GP03, SM06]). To make up for the processing capacity, almost all Java card manufacturers install dedicated cryptographic co-processors to speed up the operations of encryption algorithms such as the 2048-bit RSA. However,

Figure 4.2 shows that by “removing” zeros in grey areas, less memory can be used. Similar to Algorithm 8, P mat is divided into l blocks k ′ 0 , k ′ 1 , ..., k ′ l−1
Table 4.5 shows the performance results of LPR-LWE executed on a standard Java Card without using any additional co-processor
Figure 4.4 shows the proportions of computation time for different parameter sets. We see that though the running time using dimension 128 parameter set only accounts for less than 50% using dimension 256 parameter set, the proportions of running time of e
Figure 5.4 shows the performance of our implementation executed on Tessel2 4 . Note that the running time is measured in seconds
+4

参照

関連したドキュメント