# News archives 2016

• NIST released the SP 800-185 standard with useful new functions based on Keccak: cSHAKE, KMAC, TupleHash and ParallelHash.

Yesterday, NIST published the SP 800-185 standard [PDF]. It contains the following new functions based on Keccak:

• cSHAKE is a family of two extendable-output functions (cSHAKE128 and cSHAKE256) that generalize upon SHAKE128 and SHAKE256. It is used as a building block for KMAC, TupleHash and ParallelHash. The main difference with the SHAKEs lies in an explicit domain separation mechanism. In addition to the usual input string, the user can supply a function name and a customization string. The former is standardized by NIST to separate different standard functions, while the user is free to supply anything in the latter.
• KMAC is a keyed hash function or pseudo-random function (PRF) that can be used, e.g., to compute a message authentication code (MAC) or to derive a session key from a master key. It is more efficient than HMAC by removing the need for HMAC's nested construction.
• TupleHash is a hash function whose input domain is any number of input strings. The output depends on the exact sequence of strings, not just their concatenation. For instance, TupleHash("a", "b", "c"), TupleHash("a", "bc"), TupleHash("abc") and TupleHash("abc", "") all give unrelated outputs.
• ParallelHash is a hash function that can exploit the parallelism in modern processors by way of a tree hash mode. This significantly speeds up the hashing of long inputs. E.g., we reported speeds of 2.73 cycles/byte and 2.31 c/b on Haswell and Skylake processors, respectively. (These figures are based on the draft specifications, but we do not expect them to differ significantly for the final specifications.)

These new functions all support the 128-bit and 256-bit security strengths. They all consistently support domain separation through the user-chosen customization string input. And they all support variable ouput length in a natural way.

• We congratulate Meicheng Liu1 and Jian Guo2 for being the first ones to solve a 4-round pre-image challenge in the Keccak Crunchy Crypto Collision and Pre-image Contest!

They found a pre-image of a given 80-bit digest for Keccak[r=1440, c=160] reduced to its first 4 rounds. Up to now, only pre-image challenges up to 3 rounds had been solved.

1. State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, China
2. Nanyang Technological University, Singapore
• Ketje and Keyak are authenticated encryption schemes based on Keccak-p. Both were accepted in round 3 of the CAESAR competition. We slightly modified Ketje (now v2) in a way that encourages cryptanalysis, while we kept Keyak unchanged (still v2) but updated and improved its documentation.

#### Ketje v2

Compared to Ketje v1, we now specify a different placement for the outer (input/output) part of the state. This is done by adding a change of coordinates (“twist”), so as to put the outer part on a diagonal and to limit its interaction with the preceding χ and following θ step mappings.

The motivation is to encourage cryptanalysis. Cryptanalysis usually starts by reducing the number of rounds to see at which point a given primitive becomes insecure. In the case of Ketje, one cannot decrease the step calls further than 1 round. Instead, a cryptanalyst can increase the rate to more than 2 lanes to determine at which point Ketje breaks. However, the lanes of the outer part are located in the same plane (i.e., same y coordinate) and contain the result of χ. The knowledge of too many lanes in the same plane could mean that χ is easily inverted on that part of the state. Also, we should not place the outer part on a sheet (i.e., same x coordinate) as this would help the adversary influence the parity computed in θ. Instead, the twist places the outer part on a diagonal.

We illustrate the usage of this twist with two new instances, Ketje Minor and Ketje Major, that have a rate of 4 lanes (instead of 2) and larger permutations (800 and 1600 bits).

The primary recommendation remains Ketje Sr. Both Ketje Jr and Ketje Sr keep their rate of 2 lanes and otherwise remain unchanged modulo the twist.

#### Keyak v2

Compared to round 2, River, Lake, Sea, Ocean and Lunar Keyak remain unchanged.

We nevertheless worked on improving the description of the Motorist mode of operation by simplifying the definition of the Piston, Engine and Motorist algorithms. We also updated the security rationale. These changes are available in version 2.2 of the documentation (see change log in Appendix A).

The definition of Motorist now restricts the tag length to the capacity. As pointed out by Seth Hoffert, a legitimate adversary could, in a session, submit a tag as next block of metadata. If the tag is as long as the rate, this allows the adversary to force the outer part to a constant value, hence increasing the multiplicity. This would not break Motorist but it would prevent it to reach near-capacity generic security.

• We propose a fast and secure arbitrary output-length hash function aiming at a higher speed than the FIPS 202's SHA-3 and SHAKE functions, while retaining their flexibility and basis of security. Furthermore, it can exploit a high degree of parallelism, whether using multiple cores or the single-instruction multiple-data (SIMD) instruction set of modern processors. On Intel's® Haswell and Skylake architectures, KangarooTwelve tops at less than 1.5 cycles/byte for long messages on a single core. Short messages also benefit from about a factor two speed-up compared to the fastest FIPS 202 instance SHAKE128.

• We congratulate Jian Guo1, Meicheng Liu1,2, Ling Song1,2,3 and Kexin Qiao2,3,1,4 for solving another 5-round collision challenge in the Keccak Crunchy Crypto Collision and Pre-image Contest!

They generated a collision for a Keccak variant with capacity 160 bits, calling the Keccak-f permutation with width 800 and 5 rounds.

This solution happens less than two months after their team published the solution for the 5-round collision challenge with width 1600. We are looking forward to seeing the details of this attack, how much it differs from their previous one and whether their method can be adapted for the two remaining collision challenges in the 5-round category.

1. Cryptanalysis Taskforce, Temasek Laboratories @ Nanyang Technological University, Singapore
2. State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, China
3. Data Assurance and Communication Security Research Center, Chinese Academy of Sciences, China
4. University of Chinese Academy of Sciences, China
• We congratulate Jian Guo1, Meicheng Liu1,2, Ling Song1,2,3 and Kexin Qiao2,3,1,4 for being the first ones to solve a 5-round collision challenge in the Keccak Crunchy Crypto Collision and Pre-image Contest!

They generated a collision for a Keccak variant with capacity 160 bits, calling the Keccak-f permutation with width 1600 and 5 rounds.

Remarkably, this breakthrough came only one month after two members of the same team, Jian and Meicheng, generated the first 3-round pre-images in our contest. [More details…]

1. Cryptanalysis Taskforce, Temasek Laboratories @ Nanyang Technological University, Singapore
2. State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, China
3. Data Assurance and Communication Security Research Center, Chinese Academy of Sciences, China
4. University of Chinese Academy of Sciences, China
• When implemented on ASICs or on FPGAs, Keccak is significantly more efficient than other primitives with a similar security level, and allows efficient protections against side-channel attacks. Another area where Keccak's performance shines is on processors that exploit parallelism.

Recently, the NIST posted on the hash forum two draft special publications SP 800-XXX including proposals for customized SHAKE instances (Cshake), pseudo-random functions (KMAC), hash functions taking multiple strings as input (TupleHash) and a parallelized hash mode (Fast Parallel Hash, or FPH).

We implemented FPH in the Keccak Code Package and measured the following speeds for long messages:

HaswellSkylake
Keccak-FPH1282.732.31
Keccak-FPH2563.412.88

Performance in cycles per byte (single core) on Intel® Core™ i5-4570 (Haswell) and i5-6500 (Skylake), both at 3.2GHz

Keccak-FPH beats the speed line drawn by the legacy algorithms MD5 and SHA-1, usually considered fast.

Our implementation exploits the AVX-2 256-bit SIMD instruction set.

• We congratulate Jian Guo (Nanyang Technological University, Singapore) and Meicheng Liu (Nanyang Technological University, Singapore and State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, China) for solving two 3-round pre-image challenges in the Keccak Crunchy Crypto Collision and Pre-image Contest!

Jian and Meicheng solved the 3-round pre-image challenges for widths 800 and 1600. There remains two others, i.e., those for widths 200 and 400, plus of course all the challenges with more rounds!!! [More details…]

• The Keccak Code Package (or KCP) contains different free and open-source implementations of Keccak and closely related variants such as Ketje and Keyak.

We reorganized it to make it easier to use and to develop in or on it. More specifically, the main changes are the following.

• The functions related to the permutations Keccak-f and Keccak-p are name-separated and can therefore coexist. Consequently, the different instances of Keccak, Ketje and Keyak can be compiled together.
• The user can create a library libkeccak.a.
• The user can easily select code optimized for a given platform.
• The parallel implementations can either exploit SIMD instructions or, when unvailable, rely on serial (or less parallel) implementations.

To sum up, the KCP contains:

• The FIPS 202 instances SHAKE128, SHAKE256, and SHA3-224 to SHA3-512.
• All the Keccak[r, c] sponge functions and duplex objects with r+c=200, 400, 800 or 1600.
• The Ketje and Keyak (version 2) authenticated encryption schemes, including
• Ketje Jr, Ketje Sr, River Keyak, Lake Keyak, Sea Keyak, Ocean Keyak and Lunar Keyak.
• Reference and optimized implementations of the Keccak-f[b] and Keccak-p[b] permutations, including
• compact implementations in C;
• generically optimized code in C for 32 or 64-bit platforms;
• assembly-optimized code for AVR8, ARMv6M, ARMv7M and ARMv7A.
• Paralellized implementations of the permutations, exploiting 128-bit and 256-bit SIMD instruction sets (SSE, AVX, AVX2).