Guido Bertoni3, Joan Daemen2, Seth Hoffert, Michaël Peeters1, Gilles Van Assche1 and Ronny Van Keer1
1STMicroelectronics - 2Radboud University - 3Security Pattern
We are glad to announce the final version of the Farfalle construction and of the Kravatte pseudo-random function and encryption schemes.
First published in late 2016 on IACR ePrint, an update of our paper Farfalle: parallel permutation-based cryptography was accepted at the journal Transactions on Symmetric Cryptography (ToSC). We will present it at the yearly Fast Software Encryption (FSE) conference in Brugge, Belgium, in March 2018.
In the last couple of months, we applied some changes to both Farfalle and Kravatte1. This was due to prompt third-party cryptanalysis by different researchers. First Ling Song and Jian Guo contacted us with a key recovery cube attack on the (full) previous version of Kravatte. Then a second team of cryptanalysts (who wish to stay anonymous at this point, as their paper is under submission) sent us the description of even more powerful attacks targeting the expansion layer specifically. Consequently, we modified Kravatte by taking 6 rounds for all four permutation instances. And to counteract the attacks of the second team, we made a more fundamental change by adopting a non-linear rolling function in the expansion layer. We realize that switching from a linear rolling function to a non-linear one is a change in philosophy, and we discuss it in the paper.
The optimized code in the KCP and the reference implementation in KeccakTools are in sync.
1To distinguish the latest version of Kravatte from the previous one, we call it Kravatte Achouffe.
If SHA-2 is not broken, why would one switch to SHA-3 and not just stay with SHA-2? In this post, we highlight another argument why Keccak/SHA-3 is a better choice than SHA-2, namely openness, in analogy with open-source versus closed-source in software development and deployment.
Software has two sides: its executable and its source code. The former is used as a black box by the users, while the latter is of interest to developers who want to extend it, to understand its inner workings or to make sure there is no obvious malicious code. As an analogy, we see the specification of the cryptographic primitive, mode or algorithm in a (proposed) cryptographic standard as the counterpart of the software executable: It allows everyone to include the cryptographic object, as is, in his/her project. The counterpart of the source code in cryptography would be the design rationale, preliminary cryptanalysis and evidence of extensive third-party cryptanalysis: These are the elements that give insight into the inner workings and ultimately trust.
The transition of cryptography from a proprietary activity to a scientific one in the last 50 years can be seen as a move from closed-source to open-source in this analogy. Surprisingly, there are exceptions and we still see closed-source cryptography today.
The SHA-1 and SHA-2 NIST standard hash functions were designed behind closed doors at NSA. The standards were put forward in 1995 and 2001 respectively, without public scrutiny of any significance, despite the fact that at time of publication there was already a considerable cryptographic community doing active research on this subject. Even the 2015 update of FIPS 180, the standard that specifies SHA-2, does not contain, nor refer to, a design rationale.
In contrast, SHA-3 is the result of an open call of NIST to the cryptographic community for hash function proposals. There was no restriction on who could participate, so submissions were open in the broadest possible sense. Every submitted candidate algorithm had to contain a description, a design rationale and preliminary cryptanalysis. The authors of the 64 submissions included the majority of people active in open symmetric crypto research at the time. NIST solicited the symmetric crypto community for performing and publishing research in cryptanalysis, implementations, proofs and comparisons of the candidates and based its decision on the results. After a three-round process involving hundreds of people in the community for several years, NIST finally announced that Keccak was selected to become the SHA-3 standard.
The open effort of the symmetric crypto community did not stop there. Since then, Keccak has remained under public scrutiny and new papers appear regularly. Paper after paper confirms the large safety margin of Keccak. What is important, is that these papers reach a high degree of sophistication as research can start from the preliminary cryptanalysis that we provided in our SHA-3 submission document.
It is true that cryptanalysis of MD5, SHA-1 and SHA-2 has also reached a high degree of sophistication. However, this took longer to develop due to the absence of rationale and preliminary cryptanalysis, but also due to the adoption of the ARX design methodology.
SHA-2 is essentially a security patch of SHA-1 while SHA-3 is its open-source alternative, much in the same way that Triple-DES is a security patch for DES and AES the open-source alternative. In retrospect, even if Triple-DES is not broken, would you still recommend not to switch to AES?
If SHA-2 is not broken, why would one switch to SHA-3 and not just stay with SHA-2? There are several arguments why Keccak/SHA-3 is a better choice than SHA-2. In this post, we come back on a particular design choice of Keccak and explain why Keccak is not ARX, unlike SHA-2.
We specified Keccak at the bit-level using only transpositions, bit-level additions and multiplications (in GF(2)). We arranged these operations to allow efficient software implementations using fixed sequences of bitwise Boolean instructions and (cyclic) shifts. In contrast, many designers specify their primitives directly in pseudocode similarly including bitwise Boolean instructions and (cyclic) shifts, but on top of that also additions. These additions are modulo 2n with n a popular CPU word length such as 8, 32 or 64. Such primitives are dubbed ARX that stands for “addition, rotation and exclusive-or (XOR)”. The ARX approach is widespread and adopted by popular designs MD4, MD5, SHA-1, SHA-2, Salsa, ChaCha, Blake(2) and Skein.
So why isn't Keccak following the ARX road? We give some arguments in the following paragraphs.
One of the main selling points of ARX is its efficiency in software: Addition, rotation and XOR usually only take a single CPU cycle. For addition, this is not trivial because the carry bits may need to propagate from the least to the most significant bit of a word. Processor vendors have gone through huge efforts to make additions fast, and ARX primitives take advantage of this in a smart way. When trying to speed up ARX primitives by using dedicated hardware, not so much can be gained, unlike in bit-oriented primitives as Keccak. Furthermore, the designer of an adder must choose between complexity (area, consumption) or gate delay (latency): It is either compact or fast, but not at the same time. A bitwise Boolean XOR (or AND, OR, NOT) does not have this trade-off: It simply take a single XOR per bit and has a gate delay of a single binary XOR (or AND, OR, NOT) circuit. So the inherent computational cost of additions is a factor 3 to 5 higher than that of bitwise Boolean operations.
But even software ARX gets into trouble when protection against power or electromagnetic analysis is a threat. Effective protection at primitive level requires masking, namely, where each sensitive variable is represented as the sum of two (or more) shares and where the operations are performed on the shares separately. For bitwise Boolean operations and (cyclic) shifts, this sum must be understood bitwise (XOR), and for addition the sum must be modulo 2n. The trouble is that ARX primitives require many computationally intensive conversions between the two types of masking.
The cryptographic strength of ARX comes from the fact that addition is not associative with rotation or XOR. However, it is very hard to estimate the security of such primitives. We give some examples to illustrate this. For MD5, it took almost 15 years to be broken while the collision attacks that have finally been found can be mounted almost by hand. For SHA-1, it took 10 years to convert the theoretical attacks of around 2006 into a real collision. More recently, at the FSE 2017 conference in Tokyo, some attacks on Salsa and ChaCha were presented, which in retrospect look trivial but that remained undiscovered for many years.
Nowadays, when a new cryptographic primitive is published, one expects arguments on why it would provide resistance against differential and linear cryptanalysis. Evaluating this resistance implies investigating propagation of difference patterns and linear masks through the round function. In ARX designs, the mere description of such difference propagation is complicated, and the study of linear mask propagation has only barely started, more than 25 years after the publication of MD5.
A probable reason for this is that (crypt)analyzing ARX, despite its merits, is relatively unrewarding in terms of scientific publications: It does not lend itself to a clean mathematical description and usually amounts to hard and ad-hoc programming work. A substantial part of the cryptographic community is therefore reluctant to spend their time trying to cryptanalyze ARX designs. We feel that the cryptanalysis of more structured designs such as Rijndael/AES or Keccak/SHA-3 leads to publications that provide more insight.
But if ARX is really so bad, why are there so many primitives from prominent cryptographers using it? Actually, the most recent hash function in Ronald L. Rivest's MD series, the SHA-3 candidate MD6, made use of only bitwise Boolean instructions and shifts. More recently, a large team including Salsa and ChaCha designer Daniel J. Bernstein published the non-ARX permutation Gimli. Gimli in turn refers to NORX for its design approach, a CAESAR candidate proposed by a team including Jean-Philippe Aumasson and whose name stems from a rather explicit “NO(T A)RX”. Actually, they are moving in the direction where Keccak and its predecessors (e.g., RadioGatún, Noekeon, BaseKing) always were.
So, maybe better skip ARX?
We congratulate Yao Sun1 and Ting Li1 for solving the 3-round pre-image challenge on Keccak[r=240, c=160].
The previous pre-image challenge on the 400-bit version was solved on 2 rounds by Paweł Morawiecki in 2011. This present challenge was solved by combining brute-force and algebraic techniques. It took 5 days with eight GPU cards (nvidia 1080Ti).
We are happy to announce that we moved the contents from {keccak, sponge, ketje, keyak}.noekeon.org
to our new domain keccak.team
. Many thanks to Benoit Viguier for designing the engine behind these new pages!
We hope that you will enjoy the refreshed look. Do not hesitate to contact us if you see something incorrect or missing.
In a recent post, Adam Langley complains that “SHA-3 is slow”. Similar comments appear from time to time on the web (see also David Wong's post). But what does it mean precisely? Let us dig into it.
There are a couple of ambiguities in this statement. Let's start with the first one: where would it be “slow”?
Keccak, the winner of the SHA-3 competition, is blazing fast when implemented on dedicated (ASIC) or programmable (FPGA) hardware. Its throughput for a given circuit area is an order of magnitude higher than SHA-2 or any of the SHA-3 finalists. And if you care beyond plain speed, note that it also consumes much less energy per bit. In this sense, Keccak is a green cryptographic primitive.
Keccak has other implementation advantages, like efficient protections against side-channel attacks, but let's go to the point: what seems to be at stake is the speed in software.
How come then, that SHA3-512 is slower than SHA-512 on modern processors? This brings up to the second ambiguity of the statement: what is “SHA-3”?
“SHA-3” initially refers to the target of the competition that NIST organized from 2008 to 2012, namely a new hash standard that would complement SHA-2 in case it is broken. Hence, the initially-intended outcome of the competition is a set of four functions called SHA3-224, SHA3-256, SHA3-384, and SHA3-512.
If “SHA-3” means these four functions, then indeed SHA3-512 is unnecessarily slowed down by an excessive security parameter. Due to an absurd rule in the competition, followed by a fierce controversy in 2013, the parameters of SHA3-512 are stuck at offering 512 bits of pre-image security, a nonsense for anyone who can compute powers of two.
It turned out that Keccak has more to offer than just the drop-in replacements for SHA-{224, 256, 384, 512}. In FIPS 202 (“the SHA-3 standard”), NIST also approves two extendable-output functions (XOFs) called SHAKE128 and SHAKE256. These generalize the traditional hashing paradigm by allowing any arbitrary output length and by being parameterized by their security strength, instead of their output length.
If the term “SHA-3” embraces the aforementioned XOFs, then SHAKE{128, 256} are on par with SHA-2 on common processors. But could “SHA-3” actually be super fast in software?
To answer this last question, let us go further down the standardization road. Recently, NIST released the SP 800-185 standard (“SHA-3 derived functions”), which proposes a framework for customizable functions, called cSHAKE, that generalize SHAKE{128, 256}. And, as an application of this framework, it approves the ParallelHash functions. These functions internally use a parallel mode of operation, which an implementation can exploit to speed-up the processing.
With these standard functions included, “SHA-3” can actually outperform SHA-2 and even SHA-1 (broken but usually considered fast) for long messages on modern processors.
Of course, a cryptographic functions should carefully balance speed and security. Keccak makes use of sound design properties, like fast linear and differential propagation or purposely weak alignment, and it clearly stays away from the ARX (modular additions, rotations and exclusive or's) approach. What we understand of Keccak now made us wonder: aren't 24 rounds too much?
KangarooTwelve is a recently proposed variant of Keccak, in which the number of rounds has been reduced to 12—yet, with exactly the same round function, no tweak! It may seem like a drastic reduction, but it is in line with our proposed solution for authenticated encryption, Keyak, submitted to the CAESAR competition. And, more importantly, this wouldn't have been possible if Keccak hadn't had the chance of getting such a rather extensive scrutiny from cryptanalysis experts throughout the years—and to resist to them.
KangarooTwelve is defined on the “SHA-3” components from FIPS 202 and may please the aficionados of extreme software speed. But beware: with such speeds, the hard drive or the network connection has long become the bottleneck for most applications.
In a joint work with Silvia Mella (STMicroelectronics and University of Milano), we propose a framework for bounding the weight of differential trails. We apply this on Keccak-f with widths of 200, 400, 800 and 1600 bits to show that no trail of weight less than 92 over 6 rounds exist in either of these Keccak-f instances. Should a 6-round differential trail be used as part of a collision attack, the ratio of complying pairs is thus guaranteed to be at most 2-92.
This work improves over our results of FSE 2012, both extending the coverage of differential trails both over the different Keccak-f widths and over a higher weight per round. In particular, Silvia was able to scan all 3-round trails up to weight 45. At 15 per round, and given the exponential growth of trail number per weight, this is a significant improvement over previous works.
Silvia presented her work at FSE 2017. The paper can be found here.
We are happy to announce a new cryptanalysis prize! The subject of the stress-test is the authenticated encryption scheme Ketje.
We are particularly interested in attacks aiming at recovering the internal state, the secret key or at forging a MAC. Other attacks would be appreciated as well. Competitors have the freedom to increase the input rate of Ketje. The attack can be applied to any of the four instances of Ketje.
Who wins the prize will be decided by consensus in the Keccak team. Some hints:
The Ketje authenticated encryption scheme has already established such a solid reputation in Belgium that a beer was named after it. Unfortunately, it is already sold out since 2011. Instead, the winner will receive a selection of Belgian beers.
The attacks should be submitted to the crypto-competitions
-at- googlegroups.com
mailing list (with ketje
-at- noekeon.org
in CC) before January 31st, 2018.
We congratulate Ling Song1,2,3, Guohong Liao4,1 and Jian Guo1 for solving the 6-round collision challenge on Keccak[r=1440, c=160].
The collision search took a computational effort of about 250 evaluations of the Keccak-f rounds. It follows an improvement of the technique previously used for solving the 5-round collision challenge in May last year (then solved by Jian Guo, Meicheng Liu, Ling Song and Kexin Qiao).