Browse ORBi

- What it is and what it isn't
- Green Road / Gold Road?
- Ready to Publish. Now What?
- How can I support the OA movement?
- Where can I learn more?

ORBi

FELICS-AEAD: Benchmarking of Lightweight Authenticated Encryption Algorithms Cardoso Dos Santos, Luan ; Groszschädl, Johann ; Biryukov, Alex in Smart Card Research and Advanced Applications, 18th International Conference (2019) Cryptographic algorithms that can simultaneously provide both encryption and authentication play an increasingly important role in modern security architectures and protocols (e.g.\ TLS v1.3). Dozens of ... [more ▼] Cryptographic algorithms that can simultaneously provide both encryption and authentication play an increasingly important role in modern security architectures and protocols (e.g.\ TLS v1.3). Dozens of authenticated encryption systems have been designed in the past five years, which has initiated a large body of research in cryptanalysis. The interest in authenticated encryption has further risen after the National Institute of Standards and Technology (NIST) announced an initiative to standardize ``lightweight'' authenticated ciphers and hash functions that are suitable for resource-constrained devices. However, while there already exist some cryptanalytic results on these recent designs, little is known about their performance, especially when they are executed on small 8, 16, and 32-bit microcontrollers. In this paper, we introduce an open-source benchmarking tool suite for a fair and consistent evaluation of Authenticated Encryption with Associated Data (AEAD) algorithms written in C or assembly language for 8-bit AVR, 16-bit MSP430, and 32-bit ARM Cortex-M3 platforms. The tool suite is an extension of the FELICS benchmarking framework and provides a new AEAD-specific low-level API that allows users to collect very fine-grained and detailed results for execution time, RAM consumption, and binary code size in a highly automated fashion. FELICS-AEAD comes with two pre-defined evaluation scenarios, which were developed to resemble security-critical operations commonly carried out by real IoT applications to ensure the benchmarks are meaningful in practice. We tested the AEAD tool suite using five authenticated encryption algorithms, namely AES-GCM and the CAESAR candidates ACORN, ASCON, Ketje-Jr, and NORX, and present some preliminary results. [less ▲] Detailed reference viewed: 10 (6 UL)A Family of Lightweight Twisted Edwards Curves for the Internet of Things Ghatpande, Sankalp ; Groszschädl, Johann ; Liu, Zhe in Blazy, Olivier; Yeun, Chan Y. (Eds.) Information Security Theory and Practice, 12th IFIP WG 11.2 International Conference, WISTP 2018, Brussels, Belgium, December 10-11, 2018, Proceedings (2018, December) We introduce a set of four twisted Edwards curves that satisfy common security requirements and allow for fast implementations of scalar multiplication on 8, 16, and 32-bit processors. Our curves are ... [more ▼] We introduce a set of four twisted Edwards curves that satisfy common security requirements and allow for fast implementations of scalar multiplication on 8, 16, and 32-bit processors. Our curves are defined by an equation of the form -x^2 + y^2 = 1 + dx^2y^2 over a prime field Fp, where d is a small non-square modulo p. The underlying prime fields are based on "pseudo-Mersenne" primes given by p = 2^k - c and have in common that p is congruent to 5 modulo 8, k is a multiple of 32 minus 1, and c is at most eight bits long. Due to these common features, our primes facilitate a parameterized implementation of the low-level arithmetic so that one and the same arithmetic function is able to process operands of different length. Each of the twisted Edwards curves we introduce in this paper is birationally equivalent to a Montgomery curve of the form -(A+2)y^2 = x^3 + Ax^2 + x where 4/(A+2) is small. Even though this contrasts with the usual practice of choosing A such that (A+2)/4 is small, we show that the Montgomery form of our curves allows for an equally efficient implementation of point doubling as Curve25519. The four curves we put forward roughly match the common security levels of 80, 96, 112 and 128 bits. In addition, their Weierstraß representations are isomorphic to curves of the form y^2 = x^3 - 3x + b so as to facilitate inter-operability with TinyECC and other legacy software. [less ▲] Detailed reference viewed: 282 (25 UL)Efficient Implementation of the SHA-512 Hash Function for 8-bit AVR Microcontrollers Cheng, Hao ; ; Groszschädl, Johann in Lanet, Jean-Louis; Toma, Cristian (Eds.) Innovative Security Solutions for Information Technology and Communications, 11th International Conference, SecITC 2018, Bucharest, Romania, November 8-9, 2018, Revised Selected Papers (2018, November) SHA-512 is a member of the SHA-2 family of cryptographic hash algorithms that is based on a Davies-Mayer compression function operating on eight 64-bit words to produce a 512-bit digest. It provides ... [more ▼] SHA-512 is a member of the SHA-2 family of cryptographic hash algorithms that is based on a Davies-Mayer compression function operating on eight 64-bit words to produce a 512-bit digest. It provides strong resistance to collision and preimage attacks, and is assumed to remain secure in the dawning era of quantum computers. However, the compression function of SHA-512 is challenging to implement on small 8 and 16-bit microcontrollers because of their limited register space and the fact that 64-bit rotations are generally slow on such devices. In this paper, we present the first highly-optimized Assembler implementation of SHA-512 for the ATmega family of 8-bit AVR microcontrollers. We introduce a special optimization technique for the compression function based on a duplication of the eight working variables so that they can be more efficiently loaded from RAM via the indirect addressing mode with displacement (using the ldd and std instruction). In this way, we were able to achieve high performance without unrolling the main loop of the compression function, thereby keeping the code size small. When executed on an 8-bit AVR ATmega128 microcontroller, the compression function takes slightly less than 60k clock cycles, which corresponds to a compression rate of roughly 467 cycles per byte. The binary code size of the full SHA-512 implementation providing a standard Init-Update-Final (IUF) interface amounts to approximately 3.5 kB. [less ▲] Detailed reference viewed: 169 (22 UL)Energy-Scalable Montgomery-Curve ECDH Key Exchange for ARM Cortex-M3 Microcontrollers Franck, Christian ; Groszschädl, Johann ; Le Corre, Yann et al in Awan, Irfan; Younas, Muhammad; Portela, Filipe (Eds.) Proceedings of the 6th International Conference on Future Internet of Things and Cloud Workshops (W-FICLOUD 2018) (2018, August) The number of smart devices connected to the Internet is growing at an enormous pace and will reach 30 billion within the next five years. A large fraction of these devices have limited processing ... [more ▼] The number of smart devices connected to the Internet is growing at an enormous pace and will reach 30 billion within the next five years. A large fraction of these devices have limited processing capabilities and energy supply, which makes the execution of computation-intensive cryptographic algorithms very costly. This problem is exacerbated by the fact that basic optimization techniques like loop unrolling can not (always) be applied since cryptographic software for the IoT often needs to meet strict constraints on code size to not exceed the program storage capacity of the target device. In this paper we introduce SECCCM3, a "lightweight" software library for scalable elliptic curve cryptography on ARM Cortex-M3 microcontrollers. The current version of SECCCM3 is able to carry out variable-base scalar multiplication on Montgomery-form curves over pseudo-Mersenne prime fields, such as Curve25519, and can be used to implement static ECDH key exchange. SECCCM3 is scalable in the sense that it supports curves of different order (as long as certain conditions are met), thereby enabling trade-offs between security and execution time (resp. energy dissipation). We made an effort to protect the field arithmetic against Timing Attacks (TAs) and Simple Power Analysis (SPA), taking into account the so-called early-termination effect of the Cortex-M3 integer multiplier, which makes the latency of "long" multiply instructions operand-dependent. Our experiments show that the integration of countermeasures against information leakage caused by this effect increases the execution time by 34%, while the code size grows by 13%. A TA and SPA-resistant scalar multiplication on Curve25519 has an execution time of 4.565 million clock cycles and consumes approximately 5.1 mJ of energy when executed on a STM32L152RE Cortex-M3 microcontroller. SECCCM3 has a binary code size of 4.0 kB, which includes domain parameters for curves over 159, 191, 223, and 255-bit prime fields. [less ▲] Detailed reference viewed: 112 (6 UL)Micro-Architectural Power Simulator for Leakage Assessment of Cryptographic Software on ARM Cortex-M3 Processors Le Corre, Yann ; Groszschädl, Johann ; Dinu, Dumitru-Daniel in Fan, Junfeng; Gierlichs, Benedikt (Eds.) Constructive Side-Channel Analysis and Secure Design - 9th International Workshop, COSADE 2018, Singapore, April 23-24, 2018, Proceedings (2018, April) Masking is a common technique to protect software implementations of symmetric cryptographic algorithms against Differential Power Analysis (DPA) attacks. The development of a properly masked version of a ... [more ▼] Masking is a common technique to protect software implementations of symmetric cryptographic algorithms against Differential Power Analysis (DPA) attacks. The development of a properly masked version of a block cipher is an incremental and time-consuming process since each iteration of the development cycle involves a costly leakage assessment. To achieve a high level of DPA resistance, the architecture-specific leakage properties of the target processor need to be taken into account. However, for most embedded processors, a detailed description of these leakage properties is lacking and often not even the HDL model of the micro-architecture is openly available. Recent research has shown that power simulators for leakage assessment can significantly speed up the development process. Unfortunately, few such simulators exist and even fewer take target-specific leakages into account. To fill this gap, we present MAPS, a micro-architectural power simulator for the M3 series of ARM Cortex processors, one of today's most widely-used embedded platforms. MAPS is fast, easy to use, and able to model the Cortex-M3 pipeline leakages, in particular the leakage introduced by the pipeline registers. The M3 leakage properties are inferred from its HDL source code, and therefore MAPS does not need a complicated and expensive profiling phase. Taking first-order masked Assembler implementations of the lightweight cipher Simon as example, we study how the pipeline leakages manifest and discuss some guidelines on how to avoid them. [less ▲] Detailed reference viewed: 71 (0 UL)Micro-architectural Power Simulator for Leakage Assessment of Cryptographic Software on ARM Cortex-M3 Processors Le Corre, Yann ; Groszschädl, Johann ; Dinu, Dumitru-Daniel in Fan, Junfeng; Gierlichs, Benedikt (Eds.) Constructive Side-Channel Analysis and Secure Design - 9th International Workshop, COSADE 2018, Singapore, April 23-24, 2018, Proceedings (2018, April) Masking is a common technique to protect software implementations of symmetric cryptographic algorithms against Differential Power Analysis (DPA) attacks. The development of a properly masked version of a ... [more ▼] Masking is a common technique to protect software implementations of symmetric cryptographic algorithms against Differential Power Analysis (DPA) attacks. The development of a properly masked version of a block cipher is an incremental and time-consuming process since each iteration of the development cycle involves a costly leakage assessment. To achieve a high level of DPA resistance, the architecture-specific leakage properties of the target processor need to be taken into account. However, for most embedded processors, a detailed description of these leakage properties is lacking and often not even the HDL model of the micro-architecture is openly available. Recent research has shown that power simulators for leakage assessment can significantly speed up the development process. Unfortunately, few such simulators exist and even fewer take target-specific leakages into account. To fill this gap, we present MAPS, a micro-architectural power simulator for the M3 series of ARM Cortex processors, one of today’s most widely-used embedded platforms. MAPS is fast, easy to use, and able to model the Cortex-M3 pipeline leakages, in particular the leakage introduced by the pipeline registers. The M3 leakage properties are inferred from its HDL source code, and therefore MAPS does not need a complicated and expensive profiling phase. Taking first-order masked Assembler implementations of the lightweight cipher Simon as example, we study how the pipeline leakages manifest and discuss some guidelines on how to avoid them. [less ▲] Detailed reference viewed: 103 (1 UL)Securing Edge Devices in the Post-Quantum Internet of Things Using Lattice-Based Cryptography Liu, Zhe ; ; Groszschädl, Johann in IEEE Communications Magazine (2018), 56(2), 158-162 In order to increase the security of edge computing, all data transmitted to and from edge devices, as well as all data stored on edge devices, must be encrypted. Especially when the transmitted or stored ... [more ▼] In order to increase the security of edge computing, all data transmitted to and from edge devices, as well as all data stored on edge devices, must be encrypted. Especially when the transmitted or stored data contains sensitive personal information, long-term protection over periods of ten or more years may be required, which can only be achieved with post-quantum cryptography. This paper first gives a brief overview of post-quantum public-key cryptosystems based on hard mathematical problems related to hash functions, error-correcting codes, multivariate quadratic systems, and lattices. Then, the suitability of lattice-based cryptosystems for resource-constrained devices is discussed and efficient implementations for 8 and 32-bit microcontrollers are outlined. [less ▲] Detailed reference viewed: 94 (3 UL)Triathlon of Lightweight Block Ciphers for the Internet of Things Dinu, Dumitru-Daniel ; Le Corre, Yann ; Khovratovich, Dmitry et al in Journal of Cryptographic Engineering (2018) In this paper, we introduce a framework for the benchmarking of lightweight block ciphers on a multitude of embedded platforms. Our framework is able to evaluate the execution time, RAM footprint, as well ... [more ▼] In this paper, we introduce a framework for the benchmarking of lightweight block ciphers on a multitude of embedded platforms. Our framework is able to evaluate the execution time, RAM footprint, as well as binary code size, and allows one to define a custom "figure of merit" according to which all evaluated candidates can be ranked. We used the framework to benchmark implementations of 19 lightweight ciphers, namely AES, Chaskey, Fantomas, HIGHT, LBlock, LEA, LED, Piccolo, PRESENT, PRIDE, PRINCE, RC5, RECTANGLE, RoadRunneR, Robin, Simon, SPARX, Speck, and TWINE, on three microcontroller platforms: 8-bit AVR, 16-bit MSP430, and 32-bit ARM. Our results bring some new insights into the question of how well these lightweight ciphers are suited to secure the Internet of things. The benchmarking framework provides cipher designers with an easy-to-use tool to compare new algorithms with the state of the art and allows standardization organizations to conduct a fair and consistent evaluation of a large number of candidates. [less ▲] Detailed reference viewed: 111 (1 UL)Efficient Masking of ARX-Based Block Ciphers Using Carry-Save Addition on Boolean Shares Dinu, Dumitru-Daniel ; Groszschädl, Johann ; Le Corre, Yann in Nguyen, Phong Q.; Zhou, Jianying (Eds.) Information Security - 20th International Conference, ISC 2017, Ho Chi Minh City, Vietnam, November 22-24, 2017, Proceedings (2017, November) Masking is a widely-used technique to protect block ciphers and other symmetric cryptosystems against Differential Power Analysis (DPA) attacks. Applying masking to a cipher that involves both arithmetic ... [more ▼] Masking is a widely-used technique to protect block ciphers and other symmetric cryptosystems against Differential Power Analysis (DPA) attacks. Applying masking to a cipher that involves both arithmetic and Boolean operations requires a conversion between arithmetic and Boolean masks. An alternative approach is to perform the required arithmetic operations (e.g. modular addition or subtraction) directly on Boolean shares. At FSE 2015, Coron et al. proposed a logarithmic-time algorithm for modular addition on Boolean shares based on the Kogge-Stone carry-lookahead adder. We revisit their addition algorithm in this paper and present a fast implementation for ARM processors. Then, we introduce a new technique for direct modular addition/subtraction on Boolean shares using a simple Carry-Save Adder (CSA) in an iterative fashion. We show that the average complexity of CSA-based addition on Boolean shares grows logarithmically with the operand size, similar to the Kogge-Stone carry-lookahead addition, but consists of only a single AND, an XOR, and a left-shift per iteration. A 32-bit CSA addition~on Boolean shares has an average execution time of 162 clock cycles on an ARM Cortex-M3 processor, which is approximately 43% faster than the Kogge-Stone adder. The performance gain increases to over 55% when comparing the average subtraction times. We integrated both addition techniques into a masked implementation of the block cipher Speck and found that the CSA-based variant clearly outperforms its Kogge-Stone counterpart by a factor of 1.70 for encryption and 2.30 for decryption. [less ▲] Detailed reference viewed: 85 (0 UL)High-Performance Ideal Lattice-Based Cryptography on 8-Bit AVR Microcontrollers ; ; et al in ACM Transactions on Embedded Computing Systems (2017), 16(4), 117 Over recent years lattice-based cryptography has received much attention due to versatile average-case problems like Ring-LWE or Ring-SIS that appear to be intractable by quantum computers. In this work ... [more ▼] Over recent years lattice-based cryptography has received much attention due to versatile average-case problems like Ring-LWE or Ring-SIS that appear to be intractable by quantum computers. In this work, we evaluate and compare implementations of Ring-LWE encryption and the bimodal lattice signature scheme (BLISS) on an 8-bit Atmel ATxmega128 microcontroller. Our implementation of Ring-LWE encryption provides comprehensive protection against timing side-channels and takes 24.9ms for encryption and 6.7ms for decryption. To compute a BLISS signature, our software takes 317ms and 86ms for verification. These results underline the feasibility of lattice-based cryptography on constrained devices. [less ▲] Detailed reference viewed: 76 (1 UL)Efficient Implementation of Pedersen Commitments Using Twisted Edwards Curves Franck, Christian ; Groszschädl, Johann in Bouzefrane, Samia; Banerjee, Soumya; Sailhan, Françoise (Eds.) et al Mobile, Secure, and Programmable Networking - Third International Conference, MSPN 2017, Paris, France, June 29-30, 2017, Revised Selected Papers (2017, June) Cryptographic commitment schemes are used in many contexts, whereby the size of the secret data and the security requirements depend on the target application. Using a software library that has been ... [more ▼] Cryptographic commitment schemes are used in many contexts, whereby the size of the secret data and the security requirements depend on the target application. Using a software library that has been designed for other purposes (e.g., key-exchange or digital signatures) to compute commitments can be complicated or inefficient. We present in this paper a flexible implementation of Pedersen commitments based on elliptic curves in twisted Edwards form. The implementation supports a set of five curves of varying cryptographic strength, which are defined over 127, 159, 191, 223, and 255-bit pseudo-Mersenne prime fields. One can dynamically (i.e., at runtime) choose one of the curves according to the required level of security, and it is also possible to adapt to the size of the data to be committed by varying the number of base points. The point arithmetic is performed with optimized formulas using extended coordinates and dynamically pre-computed tables are utilized to speed up the scalar multiplication. Our implementation is written in ANSI C (with optional x86 assembler optimizations for the field arithmetic) and was compiled and tested successfully with Visual C on Windows, gcc on Linux, and clang on macOS. We present detailed benchmarking results for the field and point arithmetic on all five curves. When using an Intel Core i7 processor clocked at 2.7 GHz as test platform, we can compute more than 38,000 commitments per second on a twisted Edwards curve over a 127-bit field. [less ▲] Detailed reference viewed: 196 (14 UL)Elliptic Curve Cryptography with Efficiently Computable Endomorphisms and Its Hardware Implementations for the Internet of Things Liu, Zhe ; Groszschädl, Johann ; et al in IEEE Transactions on Computers (2017), 66(5), 773-785 Verification of an ECDSA signature requires a double scalar multiplication on an elliptic curve. In this work, we study the computation of this operation on a twisted Edwards curve with an efficiently ... [more ▼] Verification of an ECDSA signature requires a double scalar multiplication on an elliptic curve. In this work, we study the computation of this operation on a twisted Edwards curve with an efficiently computable endomorphism, which allows reducing the number of point doublings by approximately 50 percent compared to a conventional implementation. In particular, we focus on a curve defined over the 207-bit prime field Fp with p = 2^207 - 5131. We develop several optimizations to the operation and we describe two hardware architectures for computing the operation. The first architecture is a small processor implemented in 0.13 μm CMOS ASIC and is useful in resource-constrained devices for the Internet of Things (IoT) applications. The second architecture is designed for fast signature verifications by using FPGA acceleration and can be used in the server-side of these applications. Our designs offer various trade-offs and optimizations between performance and resource requirements and they are valuable for IoT applications. [less ▲] Detailed reference viewed: 72 (2 UL)Efficient Arithmetic on ARM-NEON and Its Application for High-Speed RSA Implementation ; Liu, Zhe ; Groszschädl, Johann et al in Security and Communication Networks (2016), 9(18), 5401-5411 A steadily increasing number of modern processors support Single Instruction Multiple Data (SIMD) instructions to speed up multimedia, communication, and security applications. The computational power of ... [more ▼] A steadily increasing number of modern processors support Single Instruction Multiple Data (SIMD) instructions to speed up multimedia, communication, and security applications. The computational power of Intel's SSE and AVX extensions as well as ARM's NEON engine has initiated a body of research on SIMD-parallel implementation of multiple-precision integer arithmetic operations, in particular modular multiplication and modular squaring, which are performance-critical components of widely-used public-key cryptosystems such as RSA, DSA, Diffie-Hellman, and their elliptic-curve variants ECDSA and ECDH. In this paper, we introduce the Double Operand Scanning (DOS) method for multiple-precision squaring and describe its implementation for ARM NEON processors. The DOS method uses a full-radix representation of the operand to be squared and aims to maximize performance by reducing the number of Read-After-Write (RAW) dependencies between source and destination registers. We also analyze the benefits of applying Karatsuba's technique to both multiple-precision multiplication and squaring, and present an optimized implementation of Montgomery's algorithm for modular reduction. Our performance evaluation shows that the DOS method along with the other optimizations described in this paper allows one to execute a full 2048-bit modular exponentiation in about 14.25 million clock cycles on an ARM Cortex-A15 processor, which is significantly faster than previously-reported RSA implementations for the ARM-NEON platform. [less ▲] Detailed reference viewed: 74 (2 UL)Design Strategies for ARX with Provable Bounds: SPARX and LAX Dinu, Dumitru-Daniel ; Perrin, Léo Paul ; Udovenko, Aleksei et al in Cheon, Jung Hee; Takagi, Tsuyoshi (Eds.) Advances in Cryptology --- ASIACRYPT 2016, 22nd International Conference on the Theory and Application of Cryptology and Information Security, Hanoi, Vietnam, December 4-8, 2016, Proceedings, Part I (2016, December) We present, for the first time, a general strategy for designing ARX symmetric-key primitives with provable resistance against single-trail differential and linear cryptanalysis. The latter has been a ... [more ▼] We present, for the first time, a general strategy for designing ARX symmetric-key primitives with provable resistance against single-trail differential and linear cryptanalysis. The latter has been a long standing open problem in the area of ARX design. The Wide-Trail design Strategy (WTS), that is at the basis of many S-box based ciphers, including the AES, is not suitable for ARX designs due to the lack of S-boxes in the latter. In this paper we address the mentioned limitation by proposing the Long-Trail design Strategy (LTS) -- a dual of the WTS that is applicable (but not limited) to ARX constructions. In contrast to the WTS, that prescribes the use of small and efficient S-boxes at the expense of heavy linear layers with strong mixing properties, the LTS advocates the use of large (ARX-based) S-Boxes together with sparse linear layers. With the help of the so-called long-trail argument, a designer can bound the maximum differential and linear probabilities for any number of rounds of a cipher built according to the LTS. To illustrate the effectiveness of the new strategy, we propose Sparx -- a family of ARX-based block ciphers designed according to the LTS. Sparx has 32-bit ARX-based S-boxes and has provable bounds against differential and linear cryptanalysis. In addition, Sparx is very efficient on a number of embedded platforms. Its optimized software implementation ranks in the top-6 of the most software-efficient ciphers along with Simon, Speck, Chaskey, LEA and RECTANGLE. As a second contribution we propose another strategy for designing ARX ciphers with provable properties, that is completely independent of the LTS. It is motivated by a challenge proposed earlier by Wallen and uses the differential properties of modular addition to minimize the maximum differential probability across multiple rounds of a cipher. A new primitive, called LAX is designed following those principles. LAX partly solves the Wallen challenge. [less ▲] Detailed reference viewed: 183 (13 UL)Implementation of a Leakage-Resilient ElGamal Key Encapsulation Mechanism Galindo, David ; Groszschädl, Johann ; Liu, Zhe et al in Journal of Cryptographic Engineering (2016), 6(3), 229-238 Leakage-resilient cryptography aims to extend the rigorous guarantees achieved through the provable security paradigm to physical implementations. The constructions designed on basis of this new approach ... [more ▼] Leakage-resilient cryptography aims to extend the rigorous guarantees achieved through the provable security paradigm to physical implementations. The constructions designed on basis of this new approach inevitably suffer from an Achilles heel: a bounded leakage assumption is needed. Currently, a huge gap exists between the theory of such designs and their implementation to confirm the leakage resilience in practice. The present work tries to narrow this gap for the leakage-resilient bilinear ElGamal key encapsulation mechanism (BEG-KEM) proposed by Kiltz and Pietrzak in 2010. Our first contribution is a variant of the bounded leakage and the only-computation-leaks model that is closer to practice. We weaken the restriction on the image size of the leakage functions in these models and only insist that the inputs to the leakage functions have sufficient min-entropy left, in spite of the leakage, with no limitation on the quantity of this leakage. We provide a novel security reduction for BEG-KEM in this relaxed leakage model using the generic bilinear group axiom. Secondly, we show that a naive implementation of the exponentiation in BEG-KEM makes it impossible to meet the leakage bound. Instead of trying to find an exponentiation algorithm that meets the leakage axiom (which is a non-trivial problem in practice), we propose an advanced scheme, BEG-KEM+, that avoids exponentiation by a secret value, but rather uses an encoding into the base group due to Fouque and Tibouchi. Thirdly, we present a software implementation of BEG-KEM+ based on the Miracl library and provide detailed experimental results. We also assess its (theoretical) resistance against power analysis attacks from a practical perspective, taking into account the state-of-the-art in side-channel cryptanalysis. [less ▲] Detailed reference viewed: 73 (1 UL)Energy-Efficient Elliptic Curve Cryptography for MSP430-Based Wireless Sensor Nodes Liu, Zhe ; Groszschädl, Johann ; et al in Liu, Joseph K.; Steinfeld, Ron (Eds.) Information Security and Privacy - 21st Australasian Conference, ACISP 2016, Melbourne, VIC, Australia, July 4-6, 2016, Proceedings, Part I (2016, July) The Internet is rapidly evolving from a network of personal computers and servers to a network of smart objects ("things") able to communicate with each other and with central resources. This evolution ... [more ▼] The Internet is rapidly evolving from a network of personal computers and servers to a network of smart objects ("things") able to communicate with each other and with central resources. This evolution has created a demand for lightweight implementations of cryptographic algorithms suitable for resource-constrained devices such as RFID tags and wireless sensor nodes. In this paper we describe a highly optimized software implementation of Elliptic Curve Cryptography (ECC) for the MSP430 series of ultra-low-power 16-bit microcontrollers. Our software is scalable in the sense that it supports prime fields and elliptic curves of different order without recompilation, which allows for flexible trade-offs between execution time (i.e. energy consumption) and security. The low-level modular arithmetic is optimized for pseudo-Mersenne primes of the form p = 2^n - c where n is a multiple of 16 minus 1 and c fits in a 16-bit register. All prime-field arithmetic functions are parameterized with respect to the length of operands (i.e. the number of 16-bit words they consist of) and written in Assembly language, whereby we avoided conditional jumps and branches that could leak information about the secret key. Our ECC implementation can perform scalar multiplication on two types of elliptic curves, namely Montgomery curves and twisted Edwards curves. A full scalar multiplication using a Montgomery curve over a 159-bit field requires about 3.86*10^6 clock cycles when executed on an MSP430F1611 microcontroller. [less ▲] Detailed reference viewed: 254 (20 UL)Efficient Implementation of NIST-Compliant Elliptic Curve Cryptography for 8-bit AVR-Based Sensor Nodes Liu, Zhe ; ; Groszschädl, Johann et al in IEEE Transactions on Information Forensics and Security (2016), 11(7), 1385-1397 In this paper, we introduce a highly optimized software implementation of standards-compliant elliptic curve cryptography (ECC) for wireless sensor nodes equipped with an 8-bit AVR microcontroller. We ... [more ▼] In this paper, we introduce a highly optimized software implementation of standards-compliant elliptic curve cryptography (ECC) for wireless sensor nodes equipped with an 8-bit AVR microcontroller. We exploit the state-of-the-art optimizations and propose novel techniques to further push the performance envelope of a scalar multiplication on the NIST P-192 curve. To illustrate the performance of our ECC software, we develope the prototype implementations of different cryptographic schemes for securing communication in a wireless sensor network, including elliptic curve Diffie-Hellman (ECDH) key exchange, the elliptic curve digital signature algorithm (ECDSA), and the elliptic curve Menezes-Qu-Vanstone (ECMQV) protocol. We obtain record-setting execution times for fixed-base, point variable-base, and double-base scalar multiplication. Compared with the related work, our ECDH key exchange achieves a performance gain of roughly 27% over the best previously published result using the NIST P-192 curve on the same platform, while our ECDSA performs twice as fast as the ECDSA implementation of the well-known TinyECC library. We also evaluate the impact of Karatsuba's multiplication technique on the overall execution time of a scalar multiplication. In addition to offering high performance, our implementation of scalar multiplication has a highly regular execution profile, which helps to protect against certain side-channel attacks. Our results show that NIST-compliant ECC can be implemented efficiently enough to be suitable for resource-constrained sensor nodes. [less ▲] Detailed reference viewed: 187 (11 UL)Correlation Power Analysis of Lightweight Block Ciphers: From Theory to Practice Biryukov, Alex ; Dinu, Dumitru-Daniel ; Groszschädl, Johann in Manulis, Mark; Sadeghi, Ahmad-Reza; Schneider, Steve (Eds.) Applied Cryptography and Network Security - 14th International Conference, ACNS 2016, Guildford, UK, June 19-22, 2016. Proceedings (2016, June) Side-Channel Analysis (SCA) represents a serious threat to the security of millions of smart devices that form part of the so-called Internet of Things (IoT). Choosing the "right" cryptographic primitive ... [more ▼] Side-Channel Analysis (SCA) represents a serious threat to the security of millions of smart devices that form part of the so-called Internet of Things (IoT). Choosing the "right" cryptographic primitive for the IoT is a highly challenging task due to the resource constraints of IoT devices and the variety of primitives. An important criterion to assess the suitability of a lightweight cipher with respect to SCA is the amount of leakage available to an adversary. In this paper, we analyze the efficiency of different selection functions that are commonly used in Correlation Power Analysis (CPA) attacks on symmetric primitives. To this end, we attacked implementations of the lightweight block ciphers AES, Fantomas, LBlock, Piccolo, PRINCE, RC5, Simon, and Speck on an 8-bit AVR processor. By exploring the relation between the nonlinearity of the studied selection functions and the measured leakages, we discovered some imperfections when using nonlinearity to quantify the resilience against CPA. Then, we applied these findings in an evaluation of the "intrinsic" CPA-resistance of unprotected implementations of the eight mentioned ciphers. We show that certain implementation aspects can influence the leakage level and try to explain why. Our results shed new light on the resilience of basic operations executed by these ciphers against CPA and help to bridge the gap between theory and practice. [less ▲] Detailed reference viewed: 454 (70 UL)Efficient Ring-LWE Encryption on 8-bit AVR Processors Liu, Zhe ; ; et al in Güneysu, Tim; Handschuh, Helena (Eds.) Cryptographic Hardware and Embedded Systems - CHES 2015, 17th International Workshop, Saint-Malo, France, September 13-16, 2015, Proceedings (2015, September) Public-key cryptography based on the "ring-variant" of the Learning with Errors (ring-LWE) problem is both efficient and believed to remain secure in a post-quantum world. In this paper, we introduce a ... [more ▼] Public-key cryptography based on the "ring-variant" of the Learning with Errors (ring-LWE) problem is both efficient and believed to remain secure in a post-quantum world. In this paper, we introduce a carefully-optimized implementation of a ring-LWE encryption scheme for 8-bit AVR processors like the ATxmega128. Our research contributions include several optimizations for the Number Theoretic Transform (NTT) used for polynomial multiplication. More concretely, we describe the Move-and-Add (MA) and the Shift-Add-Multiply-Subtract-Subtract (SAMS2) technique to speed up the performance-critical multiplication and modular reduction of coefficients, respectively. We take advantage of incompletely-reduced intermediate results to minimize the total number of reduction operations and use a special coefficient-storage method to decrease the RAM footprint of NTT multiplications. In addition, we propose a byte-wise scanning strategy to improve the performance of a discrete Gaussian sampler based on the Knuth-Yao random walk algorithm. For medium-term security, our ring-LWE implementation needs 590k, 672k, and 276k clock cycles for key-generation, encryption, and decryption, respectively. On the other hand, for long-term security, the execution time of key-generation, encryption, and decryption amount to 2.2M, 2.6M, and 686k cycles, respectively. These results set new speed records for ring-LWE encryption on an 8-bit processor and outperform related RSA and ECC implementations by an order of magnitude. [less ▲] Detailed reference viewed: 170 (9 UL)FELICS - Fair Evaluation of Lightweight Cryptographic Systems Dinu, Dumitru-Daniel ; Biryukov, Alex ; Groszschädl, Johann et al Scientific Conference (2015, July) In this paper we introduce FELICS, a free and open-source benchmarking framework designed for fair and consistent evaluation of software implementations of lightweight cryptographic primitives for ... [more ▼] In this paper we introduce FELICS, a free and open-source benchmarking framework designed for fair and consistent evaluation of software implementations of lightweight cryptographic primitives for embedded devices. The framework is very flexible thanks to its modular structure, which allows for an easy integration of new metrics, target devices and evaluation scenarios. It consists of two modules that can currently asses the performance of lightweight block and stream ciphers on three widely used microcontrollers: 8-bit AVR, 16-bit MSP and 32-bit ARM. The metrics extracted are execution time, RAM consumption and binary code size. FELICS has a simple user interface and is intended to be used by cipher designers to compare new primitives with the state of the art. The extracted metrics are very detailed and assist embedded software engineers in selecting the best cipher to match the requirements of a particular application. The tool aims to increase the transparency and trust in benchmarking results of lightweight primitives and facilitates a fair comparison between different primitives using the same evaluation conditions. [less ▲] Detailed reference viewed: 412 (14 UL) |
||