Reference : Reverse Product-Scanning Multiplication and Squaring on 8-bit AVR Processors
Scientific congresses, symposiums and conference proceedings : Paper published in a book
Engineering, computing & technology : Computer science
http://hdl.handle.net/10993/19568
Reverse Product-Scanning Multiplication and Squaring on 8-bit AVR Processors
English
Liu, Zhe mailto [University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC) >]
Seo, Hwajeong [Pusan National University > School of Computer Science and Engineering]
Groszschädl, Johann mailto [University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC) >]
Kim, Howon [Pusan National University > School of Computer Science and Engineering]
Dec-2014
Information and Communications Security - 16th International Conference, ICICS 2014, Hong Kong, China, December 16-17, 2014. Proceedings
Hui, Lucas C. K.
Qing, Sihan
Shi, Elaine
Yiu, Siu-Ming
Springer Verlag
Lecture Notes in Computer Science, volume 8958
158–175
Yes
International
978-3-319-21966-0
16th International Conference on Information and Communications Security (ICICS 2014)
from 16-12-2014 to 17-12-2014
Hong Kong
China
[en] Multiple-Precision Arithmetic ; Hybrid Multiplication ; Karatsuba Multiplication ; Optimized Squaring ; AVR Architecture
[en] High performance, small code size, and good scalability are important requirements for software implementations of multi-precision arithmetic algorithms to fit resource-limited embedded systems. In this paper, we describe optimization techniques to speed up multi-precision multiplication and squaring on the AVR ATmega series of 8-bit microcontrollers. First, we present a new approach to perform multi-precision multiplication, called Reverse Product Scanning (RPS), that resembles the hybrid technique of Gura et al., but calculates the byte-products in the inner loop in reverse order. The RPS method processes four bytes of the two operands in each iteration of the inner loop and employs two carry-catcher registers to minimize the number of add instructions. We also describe an optimized algorithm for multi-precision squaring based on the RPS technique that is, depending on the operand length, up to 44.3% faster than multiplication. Our AVR Assembly implementations of RPS multiplication and RPS squaring occupy less than 1 kB of code space each and are written in a parameterized fashion so that they can support operands of varying length without recompilation. Despite this high level of flexibility, our RPS multiplication outperforms the looped variant of Hutter et al.'s operand-caching technique and saves between 40 and 51% of code size. We also combine our RPS multiplication and squaring routines with Karatsuba's method to further reduce execution time. When executed on an ATmega128 processor, the "karatsubarized RPS method" needs only 85k clock cycles for a 1024-bit multiplication (or 48k cycles for a squaring). These results show that it is possible to achieve high performance without sacrificing code size or scalability.
http://hdl.handle.net/10993/19568
10.1007/978-3-319-21966-0_12
http://link.springer.com/chapter/10.1007/978-3-319-21966-0_12

File(s) associated to this reference

Fulltext file(s):

FileCommentaryVersionSizeAccess
Open access
ICICS2014.pdfAuthor postprint335.41 kBView/Open

Bookmark and Share SFX Query

All documents in ORBilu are protected by a user license.