![]() Groszschädl, Johann ![]() ![]() in Proceedings of the 8th International Workshop on Secure Internet of Things 2019 (SIoT 2019) (2019, September) It is widely accepted that public-key cryptosystems play a major role in the security arena of the Internet of Things (IoT), but they need to be implemented efficiently to not deplete the scarce resources ... [more ▼] It is widely accepted that public-key cryptosystems play a major role in the security arena of the Internet of Things (IoT), but they need to be implemented efficiently to not deplete the scarce resources of battery-operated devices such as wireless sensor nodes. This paper describes a highly-optimized software implementation of scalar multiplication for Elliptic Curve Diffie-Hellman (ECDH) key exchange on resource-limited IoT devices that achieves fast execution times along with reasonably small code size and RAM consumption. Our software uses a special class of elliptic curves, namely twisted Edwards curves with an efficiently computable endomorphism similar to that of the so- called Gallant-Lambert-Vanstone (GLV) curves. This allows us to combine the main advantage of the GLV model, which is an efficiently-computable endomorphism to speed up variable-base scalar multiplication, with the fast and complete addition rules of the (twisted) Edwards model. We implemented variable-base scalar multiplication for static ECDH on two such curves, one over a 159-bit and the second over a 207-bit pseudo-Mersenne prime field, respectively, and evaluated their execution time on a 16-bit MSP430F1611 processor. The arithmetic operations in the prime field do not contain operand-dependent conditional statements (in particular no "if-then-else" clauses) and also the scalar multiplication follows a fixed execution path for a given (static) scalar. A variable-base scalar multiplication on curves over the 159 and 207-bit field takes about 2.63 and 4.84 million clock cycles, respectively, on an MSP430F1611 processor. These results compare favorably with the Montgomery ladder on the equivalent Montgomery curves, which is almost 50% slower. [less ▲] Detailed reference viewed: 46 (5 UL)![]() Ghatpande, Sankalp ![]() ![]() ![]() in Blazy, Olivier; Yeun, Chan Y. (Eds.) Information Security Theory and Practice, 12th IFIP WG 11.2 International Conference, WISTP 2018, Brussels, Belgium, December 10-11, 2018, Proceedings (2018, December) We introduce a set of four twisted Edwards curves that satisfy common security requirements and allow for fast implementations of scalar multiplication on 8, 16, and 32-bit processors. Our curves are ... [more ▼] We introduce a set of four twisted Edwards curves that satisfy common security requirements and allow for fast implementations of scalar multiplication on 8, 16, and 32-bit processors. Our curves are defined by an equation of the form -x^2 + y^2 = 1 + dx^2y^2 over a prime field Fp, where d is a small non-square modulo p. The underlying prime fields are based on "pseudo-Mersenne" primes given by p = 2^k - c and have in common that p is congruent to 5 modulo 8, k is a multiple of 32 minus 1, and c is at most eight bits long. Due to these common features, our primes facilitate a parameterized implementation of the low-level arithmetic so that one and the same arithmetic function is able to process operands of different length. Each of the twisted Edwards curves we introduce in this paper is birationally equivalent to a Montgomery curve of the form -(A+2)y^2 = x^3 + Ax^2 + x where 4/(A+2) is small. Even though this contrasts with the usual practice of choosing A such that (A+2)/4 is small, we show that the Montgomery form of our curves allows for an equally efficient implementation of point doubling as Curve25519. The four curves we put forward roughly match the common security levels of 80, 96, 112 and 128 bits. In addition, their Weierstraß representations are isomorphic to curves of the form y^2 = x^3 - 3x + b so as to facilitate inter-operability with TinyECC and other legacy software. [less ▲] Detailed reference viewed: 457 (33 UL)![]() Liu, Zhe ![]() ![]() in IEEE Communications Magazine (2018), 56(2), 158-162 In order to increase the security of edge computing, all data transmitted to and from edge devices, as well as all data stored on edge devices, must be encrypted. Especially when the transmitted or stored ... [more ▼] In order to increase the security of edge computing, all data transmitted to and from edge devices, as well as all data stored on edge devices, must be encrypted. Especially when the transmitted or stored data contains sensitive personal information, long-term protection over periods of ten or more years may be required, which can only be achieved with post-quantum cryptography. This paper first gives a brief overview of post-quantum public-key cryptosystems based on hard mathematical problems related to hash functions, error-correcting codes, multivariate quadratic systems, and lattices. Then, the suitability of lattice-based cryptosystems for resource-constrained devices is discussed and efficient implementations for 8 and 32-bit microcontrollers are outlined. [less ▲] Detailed reference viewed: 212 (3 UL)![]() ; Liu, Zhe ![]() in The 20th Annual International Conference on Information Security and Cryptology - ICISC2017 (2017, December 30) Detailed reference viewed: 174 (0 UL)![]() ; ; Liu, Zhe ![]() in The 20th Annual International Conference on Information Security and Cryptology - ICISC2017 (2017, December 01) Detailed reference viewed: 122 (0 UL)![]() ; ; et al in The 13th International Conference on Information Security Practice and Experience - ISPEC 2017 (2017, December 01) Detailed reference viewed: 133 (1 UL)![]() Liu, Zhe ![]() in Homma, Naofumi; Fischer, Wieland (Eds.) International Conference on Cryptographic Hardware and Embedded Systems - CHES2017 (2017, August 25) Detailed reference viewed: 127 (16 UL)![]() Liu, Zhe ![]() in IEEE 24th Symposium on Computer Arithmetic - ARITH24 (2017, August 01) Detailed reference viewed: 137 (2 UL)![]() ; ; Liu, Zhe ![]() in The 16th IEEE International Conference on Trust, Security and Privacy in Computing and Communications - IEEE TrustCom-17 (2017, August) Detailed reference viewed: 114 (0 UL)![]() ; ; et al in 22nd Australasian Conference on Information Security and Privacy - ACISP2017 (2017, July 03) Detailed reference viewed: 151 (0 UL)![]() Liu, Zhe ![]() ![]() in IEEE Transactions on Computers (2017), 66(5), 773-785 Verification of an ECDSA signature requires a double scalar multiplication on an elliptic curve. In this work, we study the computation of this operation on a twisted Edwards curve with an efficiently ... [more ▼] Verification of an ECDSA signature requires a double scalar multiplication on an elliptic curve. In this work, we study the computation of this operation on a twisted Edwards curve with an efficiently computable endomorphism, which allows reducing the number of point doublings by approximately 50 percent compared to a conventional implementation. In particular, we focus on a curve defined over the 207-bit prime field Fp with p = 2^207 - 5131. We develop several optimizations to the operation and we describe two hardware architectures for computing the operation. The first architecture is a small processor implemented in 0.13 μm CMOS ASIC and is useful in resource-constrained devices for the Internet of Things (IoT) applications. The second architecture is designed for fast signature verifications by using FPGA acceleration and can be used in the server-side of these applications. Our designs offer various trade-offs and optimizations between performance and resource requirements and they are valuable for IoT applications. [less ▲] Detailed reference viewed: 134 (3 UL)![]() ; Liu, Zhe ![]() ![]() in Security and Communication Networks (2016), 9(18), 5401-5411 A steadily increasing number of modern processors support Single Instruction Multiple Data (SIMD) instructions to speed up multimedia, communication, and security applications. The computational power of ... [more ▼] A steadily increasing number of modern processors support Single Instruction Multiple Data (SIMD) instructions to speed up multimedia, communication, and security applications. The computational power of Intel's SSE and AVX extensions as well as ARM's NEON engine has initiated a body of research on SIMD-parallel implementation of multiple-precision integer arithmetic operations, in particular modular multiplication and modular squaring, which are performance-critical components of widely-used public-key cryptosystems such as RSA, DSA, Diffie-Hellman, and their elliptic-curve variants ECDSA and ECDH. In this paper, we introduce the Double Operand Scanning (DOS) method for multiple-precision squaring and describe its implementation for ARM NEON processors. The DOS method uses a full-radix representation of the operand to be squared and aims to maximize performance by reducing the number of Read-After-Write (RAW) dependencies between source and destination registers. We also analyze the benefits of applying Karatsuba's technique to both multiple-precision multiplication and squaring, and present an optimized implementation of Montgomery's algorithm for modular reduction. Our performance evaluation shows that the DOS method along with the other optimizations described in this paper allows one to execute a full 2048-bit modular exponentiation in about 14.25 million clock cycles on an ARM Cortex-A15 processor, which is significantly faster than previously-reported RSA implementations for the ARM-NEON platform. [less ▲] Detailed reference viewed: 149 (2 UL)![]() Galindo, David ![]() ![]() ![]() in Journal of Cryptographic Engineering (2016), 6(3), 229-238 Leakage-resilient cryptography aims to extend the rigorous guarantees achieved through the provable security paradigm to physical implementations. The constructions designed on basis of this new approach ... [more ▼] Leakage-resilient cryptography aims to extend the rigorous guarantees achieved through the provable security paradigm to physical implementations. The constructions designed on basis of this new approach inevitably suffer from an Achilles heel: a bounded leakage assumption is needed. Currently, a huge gap exists between the theory of such designs and their implementation to confirm the leakage resilience in practice. The present work tries to narrow this gap for the leakage-resilient bilinear ElGamal key encapsulation mechanism (BEG-KEM) proposed by Kiltz and Pietrzak in 2010. Our first contribution is a variant of the bounded leakage and the only-computation-leaks model that is closer to practice. We weaken the restriction on the image size of the leakage functions in these models and only insist that the inputs to the leakage functions have sufficient min-entropy left, in spite of the leakage, with no limitation on the quantity of this leakage. We provide a novel security reduction for BEG-KEM in this relaxed leakage model using the generic bilinear group axiom. Secondly, we show that a naive implementation of the exponentiation in BEG-KEM makes it impossible to meet the leakage bound. Instead of trying to find an exponentiation algorithm that meets the leakage axiom (which is a non-trivial problem in practice), we propose an advanced scheme, BEG-KEM+, that avoids exponentiation by a secret value, but rather uses an encoding into the base group due to Fouque and Tibouchi. Thirdly, we present a software implementation of BEG-KEM+ based on the Miracl library and provide detailed experimental results. We also assess its (theoretical) resistance against power analysis attacks from a practical perspective, taking into account the state-of-the-art in side-channel cryptanalysis. [less ▲] Detailed reference viewed: 123 (2 UL)![]() Liu, Zhe ![]() ![]() in Liu, Joseph K.; Steinfeld, Ron (Eds.) Information Security and Privacy - 21st Australasian Conference, ACISP 2016, Melbourne, VIC, Australia, July 4-6, 2016, Proceedings, Part I (2016, July) The Internet is rapidly evolving from a network of personal computers and servers to a network of smart objects ("things") able to communicate with each other and with central resources. This evolution ... [more ▼] The Internet is rapidly evolving from a network of personal computers and servers to a network of smart objects ("things") able to communicate with each other and with central resources. This evolution has created a demand for lightweight implementations of cryptographic algorithms suitable for resource-constrained devices such as RFID tags and wireless sensor nodes. In this paper we describe a highly optimized software implementation of Elliptic Curve Cryptography (ECC) for the MSP430 series of ultra-low-power 16-bit microcontrollers. Our software is scalable in the sense that it supports prime fields and elliptic curves of different order without recompilation, which allows for flexible trade-offs between execution time (i.e. energy consumption) and security. The low-level modular arithmetic is optimized for pseudo-Mersenne primes of the form p = 2^n - c where n is a multiple of 16 minus 1 and c fits in a 16-bit register. All prime-field arithmetic functions are parameterized with respect to the length of operands (i.e. the number of 16-bit words they consist of) and written in Assembly language, whereby we avoided conditional jumps and branches that could leak information about the secret key. Our ECC implementation can perform scalar multiplication on two types of elliptic curves, namely Montgomery curves and twisted Edwards curves. A full scalar multiplication using a Montgomery curve over a 159-bit field requires about 3.86*10^6 clock cycles when executed on an MSP430F1611 microcontroller. [less ▲] Detailed reference viewed: 337 (22 UL)![]() Liu, Zhe ![]() ![]() in IEEE Transactions on Information Forensics and Security (2016), 11(7), 1385-1397 In this paper, we introduce a highly optimized software implementation of standards-compliant elliptic curve cryptography (ECC) for wireless sensor nodes equipped with an 8-bit AVR microcontroller. We ... [more ▼] In this paper, we introduce a highly optimized software implementation of standards-compliant elliptic curve cryptography (ECC) for wireless sensor nodes equipped with an 8-bit AVR microcontroller. We exploit the state-of-the-art optimizations and propose novel techniques to further push the performance envelope of a scalar multiplication on the NIST P-192 curve. To illustrate the performance of our ECC software, we develope the prototype implementations of different cryptographic schemes for securing communication in a wireless sensor network, including elliptic curve Diffie-Hellman (ECDH) key exchange, the elliptic curve digital signature algorithm (ECDSA), and the elliptic curve Menezes-Qu-Vanstone (ECMQV) protocol. We obtain record-setting execution times for fixed-base, point variable-base, and double-base scalar multiplication. Compared with the related work, our ECDH key exchange achieves a performance gain of roughly 27% over the best previously published result using the NIST P-192 curve on the same platform, while our ECDSA performs twice as fast as the ECDSA implementation of the well-known TinyECC library. We also evaluate the impact of Karatsuba's multiplication technique on the overall execution time of a scalar multiplication. In addition to offering high performance, our implementation of scalar multiplication has a highly regular execution profile, which helps to protect against certain side-channel attacks. Our results show that NIST-compliant ECC can be implemented efficiently enough to be suitable for resource-constrained sensor nodes. [less ▲] Detailed reference viewed: 264 (13 UL)![]() Liu, Zhe ![]() Doctoral thesis (2015) This thesis studies techniques for the efficient implementation of public-key cryptography on resource-constrained computing devices such as wireless sensor nodes. We focus on three different families of ... [more ▼] This thesis studies techniques for the efficient implementation of public-key cryptography on resource-constrained computing devices such as wireless sensor nodes. We focus on three different families of public-key cryptosystems, namely RSA, elliptic curve schemes (e.g. ECDH), as well as a lattice-based encryption algorithm using the ring variant of the Learning With Errors problem (Ring-LWE). In the context of efficient RSA implementation, this thesis first describes the Reverse Product Scanning (RPS) technique for multi-precision multiplication and squaring on 8-bit AVR processors. Then, we analyze and compare six algorithms for software implementation of Montgomery multiplication and shed some new light on the relative performance of the different variants. Finally, we present a highly-optimized 1024-bit RSA software that achieves record-setting execution times on 8-bit ATmega processors thanks to a combination of sophisticated implementation options including the Chinese Remainder Theorem (CRT), the m-ary exponentiation method, as well as the RPS multiplication technique. The main contribution of this thesis lies in the area of Elliptic Curve Cryptography (ECC), where we describe three different implementations for wireless sensor nodes equipped with 8 and 16-bit processors. First, we present several techniques to improve the performance of prime-field arithmetic operations (e.g. optimized operand caching, sliding block doubling, and Karatsuba block squaring) and report the fastest implementation of key exchange and signature generation/verification based on the NIST P-192 curve. Second, we introduce an efficient yet highly-regular library for arithmetic in Optimal Prime Fields (OPFs) and propose a novel approach for the implementation of ephemeral ECDH key exchange that exploits the birational equivalence of Montgomery and twisted Edwards curves. Third, we describe an algorithm for the generation of so-called MoTE curves that allows one to use the fastest formulae for point addition/doubling in both the Montgomery form and the twisted Edwards form. Finally, we describe efficient ECC implementations for 159 and 191-bit MoTE curves that achieve record-setting execution times for scalar multiplication on a 16-bit MSP430 processor. The third part of this thesis deals with the efficient implementation of Ring-LWE encryption and proposes novel approaches to speed up the Number Theoretic Transform (NTT) used for polynomial multiplication. Our contributions in this area include the Move-and-Add (MA) method for coefficient multiplication, the Shift-Add-Multiply-Subtract-Subtract (SAMS2) technique for modular reduction, and an optimized variant for Montgomery multiplication. Furthermore, we describe an optimized coefficient storage method to minimize the RAM footprint of NTT multiplications. Finally, we propose a special byte-scanning technique to reduce the execution time of discrete Gaussian sampling based on the Knuth-Yao random walk algorithm. [less ▲] Detailed reference viewed: 706 (34 UL)![]() Liu, Zhe ![]() in Güneysu, Tim; Handschuh, Helena (Eds.) Cryptographic Hardware and Embedded Systems - CHES 2015, 17th International Workshop, Saint-Malo, France, September 13-16, 2015, Proceedings (2015, September) Public-key cryptography based on the "ring-variant" of the Learning with Errors (ring-LWE) problem is both efficient and believed to remain secure in a post-quantum world. In this paper, we introduce a ... [more ▼] Public-key cryptography based on the "ring-variant" of the Learning with Errors (ring-LWE) problem is both efficient and believed to remain secure in a post-quantum world. In this paper, we introduce a carefully-optimized implementation of a ring-LWE encryption scheme for 8-bit AVR processors like the ATxmega128. Our research contributions include several optimizations for the Number Theoretic Transform (NTT) used for polynomial multiplication. More concretely, we describe the Move-and-Add (MA) and the Shift-Add-Multiply-Subtract-Subtract (SAMS2) technique to speed up the performance-critical multiplication and modular reduction of coefficients, respectively. We take advantage of incompletely-reduced intermediate results to minimize the total number of reduction operations and use a special coefficient-storage method to decrease the RAM footprint of NTT multiplications. In addition, we propose a byte-wise scanning strategy to improve the performance of a discrete Gaussian sampler based on the Knuth-Yao random walk algorithm. For medium-term security, our ring-LWE implementation needs 590k, 672k, and 276k clock cycles for key-generation, encryption, and decryption, respectively. On the other hand, for long-term security, the execution time of key-generation, encryption, and decryption amount to 2.2M, 2.6M, and 686k cycles, respectively. These results set new speed records for ring-LWE encryption on an 8-bit processor and outperform related RSA and ECC implementations by an order of magnitude. [less ▲] Detailed reference viewed: 237 (11 UL)![]() Liu, Zhe ![]() in Bao, Feng; Miller, Steven; Zhou, Jianying (Eds.) et al ASIACCS'15: Proceedings of the 10th ACM Symposium on Information, Computer and Communications Security, April 14-17, 2015, Singapore (2015, April) Public-Key Cryptography (PKC) is an indispensable building block of modern security protocols, and, thus, essential for secure communication over insecure networks. Despite a significant body of research ... [more ▼] Public-Key Cryptography (PKC) is an indispensable building block of modern security protocols, and, thus, essential for secure communication over insecure networks. Despite a significant body of research devoted to making PKC more "lightweight," it is still commonly perceived that software implementations of PKC are computationally too expensive for practical use in ultra-low power devices such as wireless sensor nodes. In the present paper we aim to challenge this perception and present a highly-optimized implementation of Elliptic Curve Cryptography (ECC) for the TI MSP430 series of 16-bit microcontrollers. Our software is inspired by MoTE-ECC and supports scalar multiplication on two families of elliptic curves, namely Montgomery and twisted Edwards curves. However, in contrast to MoTE-ECC, we use pseudo-Mersenne prime fields as underlying algebraic structure to facilitate inter-operability with existing ECC implementations. We introduce a novel "zig-zag" technique for multiple-precision squaring on the MSP430 and assess its execution time. Similar to MoTE-ECC, we employ the Montgomery model for variable-base scalar multiplications and the twisted Edwards model if the base point is fixed (e.g. to generate an ephemeral key pair). Our experiments show that the two scalar multiplications needed to perform an ephemeral ECDH key exchange can be accomplished in 4.88 million clock cycles altogether (using a 159-bit prime field), which sets a new speed record for ephemeral ECDH on a 16-bit processor. We also describe the curve generation process and analyze the execution time of various field and point arithmetic operations on curves over a 159-bit and a 191-bit pseudo-Mersenne prime field. [less ▲] Detailed reference viewed: 280 (19 UL)![]() ; ; Liu, Zhe ![]() in Shi, Elaine; Yiu, S.M. (Eds.) Information and Communications Security (2014, December) Detailed reference viewed: 135 (4 UL)![]() ; Liu, Zhe ![]() ![]() in Kim, Jongsung; Lee, Jooyoung (Eds.) Information Security and Cryptology - ICISC 2014, 17th International Conference, Seoul, Korea, December 3-5, 2014, Revised Selected Papers (2014, December) Detailed reference viewed: 236 (8 UL) |
||