Efficient Arithmetic on ARM-NEON and Its Application for High-Speed RSA Implementation

Seo, Hwajeong; LIU, Zhe; GROSZSCHÄDL, Johann; Kim, Howon

doi:10.1002/sec.1706

Article (Scientific journals)

2016 • In Security and Communication Networks, 9 (18), p. 5401-5411

Peer reviewed

Permalink
https://hdl.handle.net/10993/37482

Documents

Full text

SCN2016.pdf

Author postprint (102.13 kB)

All documents in ORBilu are protected by a user license.

Details

Keywords:

Public-Key Cryptography; Multiple-Precision Arithmetic; Modular Reduction; SIMD-Level Parallelism; Vector Instructions; ARM NEON

Abstract:

[en] A steadily increasing number of modern processors support Single Instruction Multiple Data (SIMD) instructions to speed up multimedia, communication, and security applications. The computational power of Intel's SSE and AVX extensions as well as ARM's NEON engine has initiated a body of research on SIMD-parallel implementation of multiple-precision integer arithmetic operations, in particular modular multiplication and modular squaring, which are performance-critical components of widely-used public-key cryptosystems such as RSA, DSA, Diffie-Hellman, and their elliptic-curve variants ECDSA and ECDH. In this paper, we introduce the Double Operand Scanning (DOS) method for multiple-precision squaring and describe its implementation for ARM NEON processors. The DOS method uses a full-radix representation of the operand to be squared and aims to maximize performance by reducing the number of Read-After-Write (RAW) dependencies between source and destination registers. We also analyze the benefits of applying Karatsuba's technique to both multiple-precision multiplication and squaring, and present an optimized implementation of Montgomery's algorithm for modular reduction. Our performance evaluation shows that the DOS method along with the other optimizations described in this paper allows one to execute a full 2048-bit modular exponentiation in about 14.25 million clock cycles on an ARM Cortex-A15 processor, which is significantly faster than previously-reported RSA implementations for the ARM-NEON platform.
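As a rough illustration of the Montgomery reduction mentioned in the abstract, the following minimal sketch runs a square-and-multiply modular exponentiation in Montgomery form. It is not the paper's implementation: it relies on Python's built-in big integers rather than NEON vector instructions, it does not implement the DOS squaring method or Karatsuba's technique, and it is not constant-time. The function names (`montgomery_setup`, `redc`, `mont_pow`) and the `bits` parameter are hypothetical and chosen only for this example.

```python
# Illustrative sketch only (assumed names, requires Python 3.8+ for pow(n, -1, r)).
# R = 2**bits must exceed the odd modulus n.

def montgomery_setup(n, bits):
    """Precompute constants for an odd modulus n with R = 2**bits > n."""
    r = 1 << bits
    n_prime = (-pow(n, -1, r)) % r   # n * n_prime == -1 (mod R)
    r2 = (r * r) % n                 # R^2 mod n, converts values into Montgomery form
    return n_prime, r2

def redc(t, n, bits, n_prime):
    """Montgomery reduction: return t * R^-1 mod n for 0 <= t < n * R."""
    mask = (1 << bits) - 1
    m = ((t & mask) * n_prime) & mask   # makes t + m*n divisible by R
    u = (t + m * n) >> bits             # exact division by R
    return u - n if u >= n else u       # final conditional subtraction

def mont_pow(base, exp, n, bits):
    """Left-to-right square-and-multiply using Montgomery reduction."""
    n_prime, r2 = montgomery_setup(n, bits)
    x = redc((base % n) * r2, n, bits, n_prime)    # base in Montgomery form
    acc = redc(r2, n, bits, n_prime)               # 1 in Montgomery form (R mod n)
    for bit in bin(exp)[2:]:
        acc = redc(acc * acc, n, bits, n_prime)    # modular squaring
        if bit == "1":
            acc = redc(acc * x, n, bits, n_prime)  # modular multiplication
    return redc(acc, n, bits, n_prime)             # convert back to normal form
```

In this sketch every loop iteration performs a squaring and only some perform a multiplication, which is one way to see why the abstract singles out modular squaring as performance-critical for RSA-style exponentiation.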

Disciplines:

Computer science

Author, co-author:

Seo, Hwajeong; Pusan National University > School of Computer Science and Engineering

LIU, Zhe; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > Computer Science and Communications Research Unit (CSC)

GROSZSCHÄDL, Johann; University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC)

Kim, Howon; Pusan National University > School of Computer Science and Engineering

External co-authors:

yes

Document language:

English

Title:

Efficient Arithmetic on ARM-NEON and Its Application for High-Speed RSA Implementation

Publication date:

December 2016

Journal title:

Security and Communication Networks

ISSN:

1939-0114

eISSN:

1939-0122

Publisher:

John Wiley & Sons, Malden, United Kingdom

Volume:

9

Issue:

18

Pages:

5401-5411

Peer reviewed:

Peer reviewed

Focus Area:

Security, Reliability and Trust

Additional URL:

http://onlinelibrary.wiley.com/doi/10.1002/sec.1706

Available on ORBilu:

since 26 November 2018

Statistics

Number of views

208 (including 3 from Unilu)

Number of downloads

0 (including 0 from Unilu)
