3 Ruby B. Lee, Yu-Yuan Chen. 2010. Processor Accelerator for AES. Proceedings of the 2010 IEEE 8th Symposium on Application Specific Processors, Anaheim, CA, USA, pp. 71-76, June 13-14, 2010.

Software AES ciphers are not fast enough for encryption to be incorporated ubiquitously for all computing needs. Furthermore, fast software implementations of AES that use table lookups are susceptible to software cache-based side-channel attacks, which leak the secret encryption key. To bridge the gap between software and hardware AES implementations, several Instruction Set Architecture (ISA) extensions have been proposed to speed up software AES programs, most notably the recent introduction of six AES-specific instructions for Intel microprocessors. However, algorithm-specific instructions are less desirable than general-purpose ones for microprocessors. In this paper, we propose an enhanced parallel table lookup instruction that achieves the fastest reported software AES encryption and decryption for general-purpose microprocessors, 1.38 cycles/byte, a 1.45X speedup over the fastest previously reported work. Security is also improved: cache-based side-channel attacks are thwarted because all table lookups take the same amount of time. Furthermore, the new instructions can also be used to accelerate any function that can be accelerated through table lookup operations on one or multiple small tables.
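
The idea of looking up several small tables in parallel can be illustrated with a small software model. The sketch below is not the instruction defined in the paper: the function names, the four-lane layout, the XOR combination of results, and the table contents are all assumptions chosen for illustration. It models only the data flow (each byte of an index word selects an entry from its own table). The constant-time property described in the abstract would come from the hardware, not from this Python emulation.

```python
# Illustrative model of a parallel lookup into four small tables.
# All names and table contents here are hypothetical, not taken from the paper.

MASK32 = 0xFFFFFFFF


def rotl32(x, r):
    """Rotate a 32-bit value left by r bits (r in 0..31)."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def make_tables():
    """Build four 256-entry tables of 32-bit words.

    The contents are placeholders (NOT real AES tables); table t is the
    base table with every entry rotated left by 8*t bits.
    """
    base = [(i * 0x9E3779B1) & MASK32 for i in range(256)]
    return [[rotl32(v, 8 * t) for v in base] for t in range(4)]


def parallel_lookup_xor(tables, index_word):
    """Use each byte of a 32-bit index word to index its own table,
    then XOR the four looked-up words together."""
    result = 0
    for lane in range(4):
        idx = (index_word >> (8 * lane)) & 0xFF  # byte for this lane
        result ^= tables[lane][idx]
    return result


tables = make_tables()
word = parallel_lookup_xor(tables, 0x04030201)
```

For scale, the reported 1.38 cycles/byte corresponds to roughly 22 cycles per 16-byte AES block.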