High-Throughput Elliptic Curve Cryptography Using AVX2 Vector Instructions

CHENG, Hao; GROSZSCHÄDL, Johann; TIAN, Jiaqi; ROENNE, Peter; RYAN, Peter Y A

doi:10.1007/978-3-030-81652-0_27

Paper published in a book (Scientific congresses, symposiums and conference proceedings)

2020 • In Dunkelman, Orr; Jacobson Jr., Michael J.; O'Flynn, Colin (Eds.) Selected Areas in Cryptography, 27th International Conference, Halifax, NS, Canada (Virtual Event), October 21-23, 2020, Revised Selected Papers

Peer reviewed

Permalink
https://hdl.handle.net/10993/48810

DOI
10.1007/978-3-030-81652-0_27

Files

Full Text

SAC2020.pdf

Author postprint (472.23 kB)

All documents in ORBilu are protected by a user license.

Details

Keywords :

Throughput-Optimized Cryptography; Elliptic Curve Cryptography; Curve25519; Single Instruction Multiple Data (SIMD); Advanced Vector Extensions 2 (AVX2)

Abstract :

[en] Single Instruction Multiple Data (SIMD) execution engines such as Intel’s Advanced Vector Extensions 2 (AVX2) offer great potential to accelerate elliptic curve cryptography compared to implementations that use only basic x64 instructions. All existing AVX2 implementations of scalar multiplication on Curve25519 (and alternative curves) are optimized for low latency. We argue in this paper that many real-world applications, such as server-side SSL/TLS handshake processing, would benefit more from throughput-optimized implementations than from latency-optimized ones. To support this argument, we introduce a throughput-optimized AVX2 implementation of variable-base scalar multiplication on Curve25519 and fixed-base scalar multiplication on Ed25519. Both implementations perform four scalar multiplications in parallel, where each uses one 64-bit element of a 256-bit vector. The field arithmetic is based on a radix-2^29 representation of the field elements, which makes it possible to carry out four parallel multiplications modulo a multiple of p = 2^255 − 19 in just 88 cycles on a Skylake CPU. Four variable-base scalar multiplications on Curve25519 require fewer than 250,000 Skylake cycles, which translates to a throughput of 32,318 scalar multiplications per second at a clock frequency of 2 GHz. For comparison, the best latency-optimized AVX2 implementation to date has a throughput of about 21,000 scalar multiplications per second on the same Skylake CPU.
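
The four-way layout and the radix-2^29 arithmetic described in the abstract can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the authors' AVX2 code: it assumes nine 29-bit limbs per field element (9 × 29 = 261 bits) and takes the "multiple of p" to be 64p = 2^261 − 1216, which is one natural choice for this radix. A 256-bit vector is modelled as a tuple of four Python integers, and the names `pack4`, `unpack4`, `carry4`, `mul4` and `FOLD` are hypothetical. The model is neither constant-time nor fast; it only shows that every intermediate value fits in a 64-bit lane.

```python
P = 2**255 - 19
RADIX = 29
NLIMBS = 9                      # assumption: 9 limbs, 9 * 29 = 261 bits
MASK = (1 << RADIX) - 1
FOLD = 64 * 19                  # 2^261 = 64*P + 1216, so 2^261 = 1216 (mod 64*P)

def to_limbs(x):
    """Split an integer below 2^261 into nine 29-bit limbs."""
    return [(x >> (RADIX * i)) & MASK for i in range(NLIMBS)]

def from_limbs(limbs):
    """Recombine limbs (which need not be fully carried) into an integer."""
    return sum(l << (RADIX * i) for i, l in enumerate(limbs))

def pack4(xs):
    """Interleave four integers: entry i holds limb i of all four lanes."""
    limbs = [to_limbs(x) for x in xs]
    return [tuple(l[i] for l in limbs) for i in range(NLIMBS)]

def unpack4(v):
    """Inverse of pack4: recover the four lane values as integers."""
    return [from_limbs([v[i][k] for i in range(len(v))]) for k in range(4)]

def vadd(u, v):
    return tuple(x + y for x, y in zip(u, v))   # lane-wise addition

def vmul(u, v):
    return tuple(x * y for x, y in zip(u, v))   # lane-wise multiplication

def carry4(cols):
    """Lane-wise carry propagation; appends the final carry as an extra entry."""
    out, c = [], (0, 0, 0, 0)
    for v in cols:
        v = vadd(v, c)
        out.append(tuple(x & MASK for x in v))
        c = tuple(x >> RADIX for x in v)
    out.append(c)
    return out

def mul4(a, b):
    """Four independent multiplications modulo 64*P, one per lane."""
    cols = [(0, 0, 0, 0)] * (2 * NLIMBS - 1)
    for i in range(NLIMBS):
        for j in range(NLIMBS):
            cols[i + j] = vadd(cols[i + j], vmul(a[i], b[j]))
    # Each column sums at most nine 58-bit products, so it fits in 64 bits.
    assert all(x < 2**64 for v in cols for x in v)
    wide = carry4(cols)                          # 18 limbs per lane
    # Fold the upper nine limbs (weight 2^261) back using 2^261 = FOLD.
    folded = [vadd(wide[i], tuple(FOLD * x for x in wide[i + NLIMBS]))
              for i in range(NLIMBS)]
    r = carry4(folded)                           # entry 9 is a small carry
    r[0] = vadd(r[0], tuple(FOLD * x for x in r[NLIMBS]))
    return r[:NLIMBS]

# Usage: four unrelated products computed side by side.
xs = [P - 1, 2**254 + 12345, 3, 2**200 - 1]
ys = [P - 2, 7, P - 1, 2**129 + 5]
result = unpack4(mul4(pack4(xs), pack4(ys)))
assert [v % P for v in result] == [(x * y) % P for x, y in zip(xs, ys)]
```

The results are only reduced modulo 64p, so a final reduction modulo p is still needed before values are compared or serialized.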

Research center :

Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Applied Security and Information Assurance Group (APSIA)

Disciplines :

Computer science

Author, co-author :

CHENG, Hao ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > APSIA

GROSZSCHÄDL, Johann ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)

TIAN, Jiaqi ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > APSIA

ROENNE, Peter ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > APSIA

RYAN, Peter Y A ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)

External co-authors :

Language :

English

Title :

High-Throughput Elliptic Curve Cryptography Using AVX2 Vector Instructions

Publication date :

October 2020

Event name :

27th International Conference on Selected Areas in Cryptography (SAC 2020)

Event place :

Halifax, NS, Canada

Event date :

2020-10-19 to 2020-10-23

Audience :

International

Main work title :

Selected Areas in Cryptography, 27th International Conference, Halifax, NS, Canada (Virtual Event), October 21-23, 2020, Revised Selected Papers

Editor :

Dunkelman, Orr

Jacobson Jr., Michael J.

O'Flynn, Colin

Publisher :

Springer Verlag

ISBN/EAN :

978-3-030-81651-3

Collection name :

Lecture Notes in Computer Science, volume 12804

Pages :

698-719

Peer reviewed :

Peer reviewed

Focus Area :

Security, Reliability and Trust

Additional URL :

https://link.springer.com/chapter/10.1007/978-3-030-81652-0_27

European Projects :

H2020 - 779391 - FutureTPM - Future Proofing the Connected World: A Quantum-Resistant Trusted Platform Module

Funders :

CE - Commission Européenne

Available on ORBilu :

since 06 December 2021

Statistics

Number of views

296 (19 by Unilu)

Number of downloads

521 (21 by Unilu)
