Internet Security - Hardware Crypto Accelerator

As the internet grows daily, network security is becoming a very important aspect to address. IPSec provides a way to encrypt and decrypt IP packets, and it is the foundation of the Virtual Private Network (VPN). There are two ways to encrypt/decrypt an IP packet:

  1. software based: use main CPU cycles to run software that encrypts/decrypts the IP packet. The implementation can be more generic (software wise) and has no additional hardware cost; however, encryption/decryption is CPU-intensive work, so it keeps the CPU busy and results in low throughput.
  2. hardware based: use a dedicated piece of hardware to perform packet encryption/decryption. It frees up the main CPU and provides high throughput; however, it requires additional hardware cost and a specific software driver to make it work.
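
To make the "CPU intensive" point concrete, here is a rough back-of-the-envelope sketch. The function name and the figures in the example (1 Gbps of traffic, 20 cycles per byte, a 1 GHz core) are assumptions chosen purely for illustration, not measurements from any real device:

```python
def cores_needed(throughput_gbps: float, cycles_per_byte: float, clock_ghz: float) -> float:
    """Estimate how many CPU cores software encryption would keep fully busy.

    All three inputs are illustrative assumptions supplied by the caller.
    """
    bytes_per_second = throughput_gbps * 1e9 / 8          # line rate in bytes/s
    cycles_per_second = bytes_per_second * cycles_per_byte  # crypto work per second
    return cycles_per_second / (clock_ghz * 1e9)          # fraction of one core's cycles


# Hypothetical numbers: 1 Gbps of traffic, 20 cycles/byte, 1 GHz core.
print(cores_needed(1.0, 20.0, 1.0))
```

With these made-up numbers the cipher alone would need more cycles than a single core has, which is the situation where dedicated hardware starts to pay for itself.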

As internet traffic increases quickly, the demand for higher IPSec throughput is high. In fact, it has created a new business called “Security Processor”. There are three main players: Cavium Networks, Freescale and Intel.

The following diagram shows, at a high level, how a typical security processor works:


In this post, I will give a summary of the three hardware crypto players.

Cavium Networks: Nitrox series

Cavium Networks is the leader in security processors with its Nitrox line. Nitrox V claims to “offer solutions delivering 500Mbps to 100Gbps of encryption bandwidth with 1K to 120K 2048b RSA operations per second”.

Nitrox V is the latest one offered (as of July 2015).

Features: High performance security processing

  1. 15 Gbps to 100 Gbps Security Performance
  2. 45K to 300K ECC Ops/s (p256)
  3. 20K to 120K RSA Ops/s (2048 bit keys)
  4. 288 RISC engines with instruction space

High-performance, industry standard compression

  1. 20 to 100 Gbps GZIP / LZS Compression


Virtualization support

  1. Single Root I/O Virtualization (SR-IOV) support in hardware
  2. Up to 256 Virtual Functions

High-performance, industry standard interfaces

  1. Dual PCI-Express Gen 3 x4, x8 (100+ Gbps)
  2. Dual Interlaken x8 lanes (100+ Gbps)

Wide variety of algorithms supported

  1. IPSec, SSL, TLS 1.2, DTLS, ECC (p224, p256, p384, p521)
  2. DES, 3DES, AES 256-bit (ECB, CBC, XCBC, CNTR, GCM)

Random Number Generator

  1. FIPS 140-3 compliant True RNG


Check Cavium's product page for the latest data.

Nitrox is widely used in security products such as firewalls, which require high throughput.

Freescale: crypto accelerators

Freescale has developed a range of crypto accelerators, tailored to the applications and performance of its various product lines.

  1. Kinetis MCU mmCAU: Supports DES, 3DES, AES, MD5, SHA-1, and SHA-256 algorithms
  2. Kinetis MCU LTC: Cryptographic co-processor for AES, DES and public key cryptography
  3. SEC 1.x - 3.x: Supports single algorithms, single pass ciphering and message integrity
  4. SEC 4.x: Supports single algorithms, single pass ciphering and message integrity, and protocol encapsulations for IPSec, 802.1ae, SSL/TLS, SRTP, 802.11i, 802.16e
  5. CAAM: The Cryptographic Accelerator and Assurance Module supports single algorithms, single pass ciphering and message integrity. The CAAM also supports platform assurance by providing configurable secure memory that can be automatically erased in the event of a platform security alarm.
  6. DCP: The Data Co-Processor supports single algorithm processing.
  7. SAHARA: The Symmetric / Asymmetric Hashing and Random Accelerator supports single algorithm processing. SAHARA also supports platform assurance by providing configurable secure memory that can be automatically erased in the event of a platform security alarm.

Freescale crypto accelerators are widely used in low to mid-range routers.

Intel QuickAssist Technology (QAT): Cryptographic and Compression Acceleration

Symmetric cryptography functions include

  1. cipher operations (AES, DES, 3DES, ARC4)
  2. wireless (Kasumi, Snow 3G)
  3. hash/authenticate operations (SHA-1, MD5, SHA-2 [SHA-224, SHA-256, SHA-384, SHA-512])
  4. authentication (HMAC, AES-XCBC, AES-CCM); AES-XTS (Chipset 8925 and 8950 only)
  5. random number generation.
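
As a reference point for what "hash/authenticate operations" means, here is the same kind of work done in software with Python's standard library. This is only a sketch of the operation an accelerator offloads, not the QAT API; the helper names `authenticate` and `verify` are made up for this example:

```python
import hashlib
import hmac


def authenticate(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA-256 tag over the message (software path)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check a tag in constant time, as a receiver would for each packet."""
    return hmac.compare_digest(authenticate(key, message), tag)


# Example with an arbitrary key and payload.
key = b"\x0b" * 20
tag = authenticate(key, b"Hi There")
print(tag.hex(), verify(key, b"Hi There", tag))
```

Every packet on an authenticated tunnel goes through a computation like this, which is why moving it off the main CPU matters at high packet rates.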

Public Key functions include

  1. RSA operation;
  2. Diffie-Hellman operation;
  3. digital signature standard operation;
  4. key derivation operation;
  5. elliptic curve cryptography (ECDSA and ECDH);
  6. random number generation;
  7. prime number testing.
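
Most of these public key operations boil down to modular exponentiation on very large numbers, which is what makes them expensive in software. The Diffie-Hellman exchange, for instance, can be sketched as below. The tiny parameters (p = 23, g = 5) and the function names are assumptions for illustration only; real deployments use primes of 2048 bits or more, and this is not how the QAT API is called:

```python
def dh_public(g: int, p: int, private: int) -> int:
    """Public value sent to the peer: g^private mod p."""
    return pow(g, private, p)


def dh_shared(peer_public: int, p: int, private: int) -> int:
    """Shared secret: peer_public^private mod p."""
    return pow(peer_public, private, p)


# Toy parameters, far too small for real use.
P, G = 23, 5
alice_private, bob_private = 4, 3

alice_public = dh_public(G, P, alice_private)
bob_public = dh_public(G, P, bob_private)

# Both sides arrive at the same secret without ever sending it.
alice_secret = dh_shared(bob_public, P, alice_private)
bob_secret = dh_shared(alice_public, P, bob_private)
print(alice_secret == bob_secret)
```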

Compression/decompression include

  1. DEFLATE (Lempel-Ziv 77)
  2. LZS (Lempel-Ziv-Stac)
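
DEFLATE is the same format that zlib implements in software, so the operation being offloaded can be sketched with Python's standard `zlib` module. This is a software illustration only, with made-up helper names; `wbits=-15` selects a raw DEFLATE stream without the zlib header:

```python
import zlib


def deflate(data: bytes, level: int = 6) -> bytes:
    """Compress to a raw DEFLATE stream (no zlib/gzip wrapper)."""
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


def inflate(data: bytes) -> bytes:
    """Decompress a raw DEFLATE stream."""
    return zlib.decompress(data, -15)


# Repetitive payloads, typical of protocol headers and text, compress well.
payload = b"GET /index.html HTTP/1.1\r\n" * 100
packed = deflate(payload)
print(len(payload), len(packed), inflate(packed) == payload)
```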


Intel provides an open source, Linux-based driver. It is relatively easy to use.


This month, I had the chance to work closely with an Intel engineer to integrate the Intel QAT driver into Cisco Classic IOS, as an effort to provide HW crypto support in the Cisco IoT gateways IR829/IR809.

The Intel QAT driver is designed to run in a Linux environment and has about a quarter million lines of code. Three engineers worked intensively for three weeks and completed work that would easily require two engineers for half a year on a normal schedule. It was a fun and challenging game. It is a great time only if it happens once or twice a year, certainly not always; otherwise I feel it would impact my health and family life.

During the course of the integration, a few technical difficulties were resolved:

  1. Modified the large code base of the Intel QAT driver, which was compiled with gcc and ran in the Linux kernel, so that it can be compiled with the Intel Bi-Endian Compiler (ICC) and integrated into the IOS image.
  2. Endian issue between the QAT15 driver and IOS: the Intel QAT driver builds and runs in a little endian environment, whereas IOS builds and runs in a big endian environment unless it is explicitly specified to be little endian. Some types defined in the IOS big endian environment leak into the Intel QAT driver, so an adapter layer had to be created between IOS and the QAT driver.
  3. Integrated the Intel QAT driver into the IOS RAW IPSEC framework. There was a bit of a struggle deciding between the FULL IPSEC framework and the RAW IPSEC framework, as we initially thought Intel QAT had capability equivalent to chips from the Cavium Nitrox series. That turned out not to be the case. One difference: Intel QAT cannot perform the first phase of establishing a security association, so this has to rely on the IOS IPSEC framework.
  4. Performance tuning: hooked up an IXIA traffic generator to feed a high rate of traffic to the IR829 for encryption and decryption.
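
To illustrate the kind of conversion the adapter layer in item 2 has to do, here is a small sketch using Python's `struct` module. The descriptor layout (a 64-bit buffer address, a 32-bit length, and two 16-bit fields) and the function names are entirely hypothetical; the real IOS and QAT structures are different and are not shown here. The point is that each field is swapped individually according to its width, rather than reversing the whole buffer:

```python
import struct

# Hypothetical descriptor: u64 buffer address, u32 length, u16 flags, u16 reserved.
HOST_FMT = ">QIHH"    # big endian side (the IOS-like environment)
DEVICE_FMT = "<QIHH"  # little endian side (the driver-like environment)


def host_to_device(raw: bytes) -> bytes:
    """Re-encode a big endian descriptor as little endian, field by field."""
    fields = struct.unpack(HOST_FMT, raw)
    return struct.pack(DEVICE_FMT, *fields)


def device_to_host(raw: bytes) -> bytes:
    """Re-encode a little endian descriptor as big endian, field by field."""
    fields = struct.unpack(DEVICE_FMT, raw)
    return struct.pack(HOST_FMT, *fields)


host_desc = struct.pack(HOST_FMT, 0x1122334455667788, 0x100, 1, 0)
device_desc = host_to_device(host_desc)
print(host_desc.hex(), device_desc.hex())
```

Keeping all such conversions in one boundary layer, instead of scattering byte swaps through the driver, is what makes a mismatch like this manageable in a code base of that size.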