mirror of
https://github.com/GaloisInc/cryptol.git
synced 2024-12-18 13:31:50 +03:00
687 lines
25 KiB
Markdown
687 lines
25 KiB
Markdown
% Poly1305 for IETF protocols
|
|
% Y. Nir (Check Point), A. Langley (Google Inc), D. McNamee (Galois, Inc)
|
|
% July 28, 2014
|
|
%
|
|
% Poly1305 extracted from the original RFC by Thomas DuBuisson
|
|
|
|
```cryptol
|
|
module Poly1305 where
|
|
```
|
|
|
|
|
|
# The Poly1305 algorithm
|
|
|
|
Poly1305 is a one-time authenticator designed by D. J. Bernstein.
|
|
Poly1305 takes a 32-byte one-time key and a message and produces a
|
|
16-byte tag.
|
|
|
|
The original article ([poly1305]) is entitled "The Poly1305-AES
|
|
message-authentication code", and the MAC function there requires a
|
|
128-bit AES key, a 128-bit "additional key", and a 128-bit (non-
|
|
secret) nonce. AES is used there for encrypting the nonce, so as to
|
|
get a unique (and secret) 128-bit string, but as the paper states,
|
|
"There is nothing special about AES here. One can replace AES with
|
|
an arbitrary keyed function from an arbitrary set of nonces to 16-
|
|
byte strings."
|
|
|
|
Regardless of how the key is generated, the key is partitioned into
|
|
two parts, called "r" and "s". The pair ``(r,s)`` should be unique, and
|
|
MUST be unpredictable for each invocation (that is why it was
|
|
originally obtained by encrypting a nonce), while "r" MAY be
|
|
constant, but needs to be modified as follows before being used: ("r"
|
|
is treated as a 16-octet little-endian number):
|
|
|
|
* r[3], r[7], r[11], and r[15] are required to have their top four
|
|
bits clear (be smaller than 16)
|
|
|
|
```cryptol
|
|
Om = 15 // odd masks - for 3, 7, 11 & 15
|
|
```
|
|
* r[4], r[8], and r[12] are required to have their bottom two bits
|
|
clear (be divisible by 4)
|
|
|
|
The following Cryptol code clamps "r" to be appropriate:
|
|
|
|
```cryptol
|
|
Em = 252 // even masks - for 4, 8 & 12
|
|
nm = 255 // no mask
|
|
|
|
PolyMasks : [16][8] // mask indices
|
|
PolyMasks = [ nm, nm, nm, Om, // 0-3
|
|
Em, nm, nm, Om, // 4-7
|
|
Em, nm, nm, Om, // 8-11
|
|
Em, nm, nm, Om ] // 12-15
|
|
|
|
Poly1305_clamp : [16][8] -> [16][8]
|
|
Poly1305_clamp r = [ re && mask | re <- r | mask <- PolyMasks ]
|
|
```
|
|
|
|
The "s" should be unpredictable, but it is perfectly acceptable to
|
|
generate both "r" and "s" uniquely each time. Because each of them
|
|
is 128-bit, pseudo-randomly generating them (see "Generating the
|
|
Poly1305 key using ChaCha20") is also acceptable.
|
|
|
|
The inputs to Poly1305 are:
|
|
|
|
* A 256-bit one-time key
|
|
* An arbitrary length message (comprised of `floorBlocks` 16-byte blocks,
|
|
and `rem` bytes left over)
|
|
|
|
The output is a 128-bit tag.
|
|
|
|
```cryptol
|
|
Poly1305 : {m, floorBlocks, rem} (fin m, floorBlocks == m/16, rem == m - floorBlocks*16)
|
|
=> [256] -> [m][8] -> [16][8]
|
|
```
|
|
|
|
Set the constant prime "P" be 2^130-5.
|
|
|
|
```cryptol
|
|
P : [136]
|
|
P = 2^^130 - 5
|
|
```
|
|
|
|
First, the "r" value should be clamped.
|
|
|
|
```cryptol
|
|
Poly1305 key msg = result where
|
|
[ru, su] = split key
|
|
r : [136] // internal arithmetic on (128+8)-bit numbers
|
|
r = littleendian ((Poly1305_clamp (split ru)) # [0x00])
|
|
s = littleendian ((split su) # [0x00])
|
|
```
|
|
|
|
Next, divide the message into 16-byte blocks. The last block might be shorter:
|
|
|
|
* Read each block as a little-endian number.
|
|
* Add one bit beyond the number of octets. For a 16-byte block this
|
|
is equivalent to adding 2^128 to the number. For the shorter
|
|
block it can be 2^120, 2^112, or any power of two that is evenly
|
|
divisible by 8, all the way down to 2^8.
|
|
|
|
```cryptol
|
|
// pad all the blocks uniformly (we'll handle the final block later)
|
|
paddedBlocks = [ 0x01 # (littleendian block)
|
|
| block <- groupBy`{16}(msg # (zero:[inf][8])) ]
|
|
```
|
|
* If the block is not 17 bytes long (the last block), then left-pad it with
|
|
zeros. This is meaningless if you're treating them as numbers.
|
|
|
|
```cryptol
|
|
lastBlock : [136]
|
|
lastBlock = zero # 0x01 # (littleendian (drop`{16*floorBlocks} msg))
|
|
```
|
|
|
|
* Add the current block to the accumulator.
|
|
* Multiply by "r"
|
|
* Set the accumulator to the result modulo p. To summarize:
|
|
``accum[i+1] = ((accum[i]+block)*r) % p``.
|
|
|
|
```cryptol
|
|
accum:[_][136]
|
|
accum = [zero:[136]] # [ computeElt a b r P | a <- accum | b <- paddedBlocks ]
|
|
// ^ the accumulator starts at zero
|
|
```
|
|
|
|
* If the block division leaves no remainder, the last value of the accumulator is good
|
|
otherwise compute the special-case padded block, and compute the final value of the accumulator
|
|
|
|
```cryptol
|
|
lastAccum : [136]
|
|
lastAccum = if `rem == 0
|
|
then accum@`floorBlocks
|
|
else computeElt (accum@`floorBlocks) lastBlock r P
|
|
```
|
|
|
|
Finally, the value of the secret key "s" is added to the accumulator,
|
|
and the 128 least significant bits are serialized in little-endian
|
|
order to form the tag.
|
|
|
|
```cryptol
|
|
result = reverse (groupBy`{8} (drop`{8}(lastAccum + s)))
|
|
|
|
// Compute ((a + b) * r ) % P being pedantic about bit-widths
|
|
computeElt : [136] -> [136] -> [136] -> [136] -> [136]
|
|
computeElt a b r p = (drop`{137}bigResult) where
|
|
bigResult : [273]
|
|
aPlusB : [137]
|
|
aPlusB = (0b0#a) + (0b0#b) // make room for carry
|
|
timesR : [273]
|
|
timesR = ((zero:[136])#aPlusB) * ((zero:[137])#r) // [a]*[b]=[a+b]
|
|
bigResult = timesR % ((zero:[137])#p)
|
|
|
|
```
|
|
|
|
### Poly1305 Example and Test Vector
|
|
|
|
For our example, we will dispense with generating the one-time key
|
|
using AES, and assume that we got the following keying material:
|
|
|
|
* Key Material: 85:d6:be:78:57:55:6d:33:7f:44:52:fe:42:d5:06:a8:01:
|
|
03:80:8a:fb:0d:b2:fd:4a:bf:f6:af:41:49:f5:1b
|
|
|
|
```cryptol
|
|
Poly1305TestKey = join (parseHexString
|
|
( "85:d6:be:78:57:55:6d:33:7f:44:52:fe:42:d5:06:a8:01:"
|
|
# "03:80:8a:fb:0d:b2:fd:4a:bf:f6:af:41:49:f5:1b."
|
|
) )
|
|
```
|
|
|
|
* s as an octet string: 01:03:80:8a:fb:0d:b2:fd:4a:bf:f6:af:41:49:f5:1b
|
|
* s as a 128-bit number: 1bf54941aff6bf4afdb20dfb8a800301
|
|
|
|
```cryptol
|
|
Poly1305Test_s = parseHexString
|
|
"01:03:80:8a:fb:0d:b2:fd:4a:bf:f6:af:41:49:f5:1b."
|
|
Poly1305Test_sbits = join (reverse Poly1305Test_s)
|
|
|
|
property poly1306Sokay = Poly1305Test_sbits == 0x1bf54941aff6bf4afdb20dfb8a800301
|
|
```
|
|
|
|
|
|
```cryptol
|
|
Poly1305TestMessage = "Cryptographic Forum Research Group"
|
|
```
|
|
|
|
* r before clamping: 85:d6:be:78:57:55:6d:33:7f:44:52:fe:42:d5:06:a8
|
|
* Clamped r as a number: 806d5400e52447c036d555408bed685.
|
|
|
|
Since Poly1305 works in 16-byte chunks, the 34-byte message divides
|
|
into 3 blocks. In the following calculation, "Acc" denotes the
|
|
accumulator and "Block" the current block:
|
|
|
|
Here we define a Cryptol function that returns all of the intermediate
|
|
values of the accumulator:
|
|
|
|
```cryptol
|
|
// TODO: refactor the Poly function in terms of this AccumBlocks
|
|
// challenge: doing so while maintaining the clean literate correspondence with the spec
|
|
AccumBlocks : {m, floorBlocks, rem} (fin m, floorBlocks == m/16, rem == m - floorBlocks*16)
|
|
=> [256] -> [m][8] -> ([_][136], [136])
|
|
|
|
AccumBlocks key msg = (accum, lastAccum) where
|
|
[ru, su] = split key
|
|
r : [136] // internal arithmetic on (128+8)-bit numbers
|
|
r = littleendian ((Poly1305_clamp (split ru)) # [0x00])
|
|
s = littleendian ((split su) # [0x00])
|
|
// pad all the blocks uniformly (we'll handle the final block later)
|
|
paddedBlocks = [ 0x01 # (littleendian block)
|
|
| block <- groupBy`{16}(msg # (zero:[inf][8])) ]
|
|
lastBlock : [136]
|
|
lastBlock = zero # 0x01 # (littleendian (drop`{16*floorBlocks} msg))
|
|
accum:[_][136]
|
|
accum = [zero:[136]] # [ computeElt a b r P | a <- accum | b <- paddedBlocks ]
|
|
// ^ the accumulator starts at zero
|
|
lastAccum : [136]
|
|
lastAccum = if `rem == 0
|
|
then accum@`floorBlocks
|
|
else computeElt (accum@`floorBlocks) lastBlock r P
|
|
|
|
```
|
|
|
|
```example
|
|
Block #1
|
|
|
|
Acc = 00
|
|
Block = 6f4620636968706172676f7470797243
|
|
Block with 0x01 byte = 016f4620636968706172676f7470797243
|
|
Acc + block = 016f4620636968706172676f7470797243
|
|
(Acc+Block) * r =
|
|
b83fe991ca66800489155dcd69e8426ba2779453994ac90ed284034da565ecf
|
|
Acc = ((Acc+Block)*r) % P = 2c88c77849d64ae9147ddeb88e69c83fc
|
|
|
|
Block #2
|
|
|
|
Acc = 2c88c77849d64ae9147ddeb88e69c83fc
|
|
Block = 6f7247206863726165736552206d7572
|
|
Block with 0x01 byte = 016f7247206863726165736552206d7572
|
|
Acc + block = 437febea505c820f2ad5150db0709f96e
|
|
(Acc+Block) * r =
|
|
21dcc992d0c659ba4036f65bb7f88562ae59b32c2b3b8f7efc8b00f78e548a26
|
|
Acc = ((Acc+Block)*r) % P = 2d8adaf23b0337fa7cccfb4ea344b30de
|
|
|
|
Last Block
|
|
|
|
Acc = 2d8adaf23b0337fa7cccfb4ea344b30de
|
|
Block = 7075
|
|
Block with 0x01 byte = 017075
|
|
Acc + block = 2d8adaf23b0337fa7cccfb4ea344ca153
|
|
(Acc + Block) * r =
|
|
16d8e08a0f3fe1de4fe4a15486aca7a270a29f1e6c849221e4a6798b8e45321f
|
|
((Acc + Block) * r) % P = 28d31b7caff946c77c8844335369d03a7
|
|
```
|
|
|
|
```cryptol
|
|
property polyBlocksOK =
|
|
(blocks @ 1 == 0x02c88c77849d64ae9147ddeb88e69c83fc) &&
|
|
(blocks @ 2 == 0x02d8adaf23b0337fa7cccfb4ea344b30de) &&
|
|
(lastBlock == 0x028d31b7caff946c77c8844335369d03a7) where
|
|
(blocks, lastBlock) = AccumBlocks Poly1305TestKey Poly1305TestMessage
|
|
```
|
|
|
|
Adding s we get this number, and serialize if to get the tag:
|
|
|
|
Acc + s = 2a927010caf8b2bc2c6365130c11d06a8
|
|
|
|
Tag: a8:06:1d:c1:30:51:36:c6:c2:2b:8b:af:0c:01:27:a9
|
|
|
|
```cryptol
|
|
// Putting it all together and testing:
|
|
|
|
Poly1305TestTag = "a8:06:1d:c1:30:51:36:c6:c2:2b:8b:af:0c:01:27:a9."
|
|
|
|
property Poly1305_passes_test = Poly1305 Poly1305TestKey Poly1305TestMessage ==
|
|
parseHexString Poly1305TestTag
|
|
```
|
|
|
|
## Generating the Poly1305 key using ChaCha20
|
|
|
|
As said in the "Poly 1305 Algorithm" section, it is acceptable to generate
|
|
the one-time Poly1305 pseudo-randomly. This section proposes such a method.
|
|
|
|
To generate such a key pair (r,s), we will use the ChaCha20 block
|
|
function described in Section 2.3. This assumes that we have a 256-
|
|
bit session key for the MAC function, such as SK_ai and SK_ar in
|
|
IKEv2 ([RFC5996]), the integrity key in ESP and AH, or the
|
|
client_write_MAC_key and server_write_MAC_key in TLS. Any document
|
|
that specifies the use of Poly1305 as a MAC algorithm for some
|
|
protocol must specify that 256 bits are allocated for the integrity
|
|
key. Note that in the AEAD construction defined in Section 2.8, the
|
|
same key is used for encryption and key generation, so the use of
|
|
SK_a* or *_write_MAC_key is only for stand-alone Poly1305.
|
|
|
|
The method is to call the block function with the following
|
|
parameters:
|
|
|
|
* The 256-bit session integrity key is used as the ChaCha20 key.
|
|
* The block counter is set to zero.
|
|
* The protocol will specify a 96-bit or 64-bit nonce. This MUST be
|
|
unique per invocation with the same key, so it MUST NOT be
|
|
randomly generated. A counter is a good way to implement this,
|
|
but other methods, such as an LFSR are also acceptable. ChaCha20
|
|
as specified here requires a 96-bit nonce. So if the provided
|
|
nonce is only 64-bit, then the first 32 bits of the nonce will be
|
|
set to a constant number. This will usually be zero, but for
|
|
protocols with multiple senders it may be different for each
|
|
sender, but should be the same for all invocations of the function
|
|
with the same key by a particular sender.
|
|
|
|
After running the block function, we have a 512-bit state. We take
|
|
the first 256 bits or the serialized state, and use those as the one-
|
|
time Poly1305 key: The first 128 bits are clamped, and form "r",
|
|
while the next 128 bits become "s". The other 256 bits are
|
|
discarded.
|
|
|
|
Note that while many protocols have provisions for a nonce for
|
|
encryption algorithms (often called Initialization Vectors, or IVs),
|
|
they usually don't have such a provision for the MAC function. In
|
|
that case the per-invocation nonce will have to come from somewhere
|
|
else, such as a message counter.
|
|
|
|
### Poly1305 Key Generation Test Vector
|
|
|
|
For this example, we'll set:
|
|
|
|
```cryptol
|
|
PolyKeyTest = join (parseHexString (
|
|
"80 81 82 83 84 85 86 87 88 89 8a 8b 8c 8d 8e 8f " #
|
|
"90 91 92 93 94 95 96 97 98 99 9a 9b 9c 9d 9e 9f "
|
|
))
|
|
|
|
PolyNonceTest : [96]
|
|
PolyNonceTest = join (
|
|
parseHexString ("00 00 00 00 00 01 02 03 04 05 06 07 "))
|
|
```
|
|
|
|
The ChaCha state set up with key, nonce, and block counter zero:
|
|
|
|
## A Pseudo-Random Function for ChaCha/Poly-1305 based Crypto Suites
|
|
|
|
Some protocols such as IKEv2([RFC5996]) require a Pseudo-Random
|
|
Function (PRF), mostly for key derivation. In the IKEv2 definition,
|
|
a PRF is a function that accepts a variable-length key and a
|
|
variable-length input, and returns a fixed-length output. This
|
|
section does not specify such a function.
|
|
|
|
Poly-1305 is an obvious choice, because MAC functions are often used
|
|
as PRFs. However, Poly-1305 prohibits using the same key twice,
|
|
whereas the PRF in IKEv2 is used multiple times with the same key.
|
|
Adding a nonce or a counter to Poly-1305 can solve this issue, much
|
|
as we do when using this function as a MAC, but that would require
|
|
changing the interface for the PRF function.
|
|
|
|
Chacha20 could be used as a key-derivation function, by generating an
|
|
arbitrarily long keystream. However, that is not what protocols such
|
|
as IKEv2 require.
|
|
|
|
For this reason, this document does not specify a PRF, and recommends
|
|
that crypto suites use some other PRF such as PRF_HMAC_SHA2_256
|
|
(section 2.1.2 of [RFC4868])
|
|
|
|
## AEAD Construction
|
|
|
|
AEAD_CHACHA20-POLY1305 is an authenticated encryption with additional
|
|
data algorithm. The inputs to AEAD_CHACHA20-POLY1305 are:
|
|
|
|
* A 256-bit key
|
|
* A 96-bit nonce - different for each invocation with the same key.
|
|
* An arbitrary length plaintext (fewer than 2^64 bytes)
|
|
* Arbitrary length additional data (AAD) (fewer than 2^64 bytes)
|
|
|
|
Some protocols may have unique per-invocation inputs that are not 96-
|
|
bit in length. For example, IPsec may specify a 64-bit nonce. In
|
|
such a case, it is up to the protocol document to define how to
|
|
transform the protocol nonce into a 96-bit nonce, for example by
|
|
concatenating a constant value.
|
|
|
|
The ChaCha20 and Poly1305 primitives are combined into an AEAD that
|
|
takes a 256-bit key and 96-bit nonce as follows:
|
|
|
|
* First, a Poly1305 one-time key is generated from the 256-bit key
|
|
and nonce using the procedure described in "Generating the Poly1305 key using ChaCha20".
|
|
|
|
* Next, the ChaCha20 encryption function is called to encrypt the
|
|
plaintext, using the input key and nonce, and with the initial
|
|
counter set to 1.
|
|
|
|
* Finally, the Poly1305 function is called with the Poly1305 key
|
|
calculated above, and a message constructed as a concatenation of
|
|
the following:
|
|
* The AAD
|
|
* padding1 - the padding is up to 15 zero bytes, and it brings
|
|
the total length so far to an integral multiple of 16. If the
|
|
length of the AAD was already an integral multiple of 16 bytes,
|
|
this field is zero-length.
|
|
* The ciphertext
|
|
* padding2 - the padding is up to 15 zero bytes, and it brings
|
|
the total length so far to an integral multiple of 16. If the
|
|
length of the ciphertext was already an integral multiple of 16
|
|
bytes, this field is zero-length.
|
|
* The length of the additional data in octets (as a 64-bit
|
|
little-endian integer).
|
|
* The length of the ciphertext in octets (as a 64-bit little-
|
|
endian integer).
|
|
|
|
The output from the AEAD is twofold:
|
|
|
|
* A ciphertext of the same length as the plaintext.
|
|
* A 128-bit tag, which is the output of the Poly1305 function.
|
|
|
|
Decryption is pretty much the same thing.
|
|
|
|
A few notes about this design:
|
|
|
|
1. The amount of encrypted data possible in a single invocation is
|
|
2^32-1 blocks of 64 bytes each, because of the size of the block
|
|
counter field in the ChaCha20 block function. This gives a total
|
|
of 247,877,906,880 bytes, or nearly 256 GB. This should be
|
|
enough for traffic protocols such as IPsec and TLS, but may be
|
|
too small for file and/or disk encryption. For such uses, we can
|
|
return to the original design, reduce the nonce to 64 bits, and
|
|
use the integer at position 13 as the top 32 bits of a 64-bit
|
|
block counter, increasing the total message size to over a
|
|
million petabytes (1,180,591,620,717,411,303,360 bytes to be
|
|
exact).
|
|
|
|
1. Despite the previous item, the ciphertext length field in the
|
|
construction of the buffer on which Poly1305 runs limits the
|
|
ciphertext (and hence, the plaintext) size to 2^64 bytes, or
|
|
sixteen thousand petabytes (18,446,744,073,709,551,616 bytes to
|
|
be exact).
|
|
|
|
### Example and Test Vector for AEAD_CHACHA20-POLY1305
|
|
|
|
For a test vector, we will use the following inputs to the
|
|
AEAD_CHACHA20-POLY1305 function:
|
|
|
|
Plaintext:
|
|
|
|
```cryptol
|
|
AeadPt = "Ladies and Gentlemen of the class of '99: " #
|
|
"If I could offer you only one tip for " #
|
|
"the future, sunscreen would be it."
|
|
|
|
AeadAAD = parseHexString "50 51 52 53 c0 c1 c2 c3 c4 c5 c6 c7 "
|
|
|
|
AeadKey = join (parseHexString (
|
|
"80 81 82 83 84 85 86 87 88 89 8a 8b 8c 8d 8e 8f " #
|
|
"90 91 92 93 94 95 96 97 98 99 9a 9b 9c 9d 9e 9f " ))
|
|
|
|
|
|
AeadIV = join [ 0x40, 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x47 ]
|
|
|
|
AeadC = join [0x07, 0x00, 0x00, 0x00]
|
|
|
|
AeadNonce = AeadC # AeadIV
|
|
```
|
|
|
|
32-bit fixed-common part:
|
|
|
|
Poly1305 Key:
|
|
|
|
```cryptol
|
|
AeadConstructionTestVector = parseHexString (
|
|
"50:51:52:53:c0:c1:c2:c3:c4:c5:c6:c7:00:00:00:00:" #
|
|
"d3:1a:8d:34:64:8e:60:db:7b:86:af:bc:53:ef:7e:c2:" #
|
|
"a4:ad:ed:51:29:6e:08:fe:a9:e2:b5:a7:36:ee:62:d6:" #
|
|
"3d:be:a4:5e:8c:a9:67:12:82:fa:fb:69:da:92:72:8b:" #
|
|
"1a:71:de:0a:9e:06:0b:29:05:d6:a5:b6:7e:cd:3b:36:" #
|
|
"92:dd:bd:7f:2d:77:8b:8c:98:03:ae:e3:28:09:1b:58:" #
|
|
"fa:b3:24:e4:fa:d6:75:94:55:85:80:8b:48:31:d7:bc:" #
|
|
"3f:f4:de:f0:8e:4b:7a:9d:e5:76:d2:65:86:ce:c6:4b:" #
|
|
"61:16:00:00:00:00:00:00:00:00:00:00:00:00:00:00:" #
|
|
"0c:00:00:00:00:00:00:00:72:00:00:00:00:00:00:00." )
|
|
```
|
|
|
|
Note the 4 zero bytes in line 000 and the 14 zero bytes in line 128
|
|
|
|
```cryptol
|
|
// Tag:
|
|
AeadTagTestVector = parseHexString "1a:e1:0b:59:4f:09:e2:6a:7e:90:2e:cb:d0:60:06:91."
|
|
```
|
|
|
|
# Implementation Advice
|
|
|
|
Each block of ChaCha20 involves 16 move operations and one increment
|
|
operation for loading the state, 80 each of XOR, addition and Roll
|
|
operations for the rounds, 16 more add operations and 16 XOR
|
|
operations for protecting the plaintext. Section 2.3 describes the
|
|
ChaCha block function as "adding the original input words". This
|
|
implies that before starting the rounds on the ChaCha state, we copy
|
|
it aside, only to add it in later. This is correct, but we can save
|
|
a few operations if we instead copy the state and do the work on the
|
|
copy. This way, for the next block you don't need to recreate the
|
|
state, but only to increment the block counter. This saves
|
|
approximately 5.5% of the cycles.
|
|
|
|
It is not recommended to use a generic big number library such as the
|
|
one in OpenSSL for the arithmetic operations in Poly1305. Such
|
|
libraries use dynamic allocation to be able to handle any-sized
|
|
integer, but that flexibility comes at the expense of performance as
|
|
well as side-channel security. More efficient implementations that
|
|
run in constant time are available, one of them in DJB's own library,
|
|
NaCl ([NaCl]). A constant-time but not optimal approach would be to
|
|
naively implement the arithmetic operations for a 288-bit integers,
|
|
because even a naive implementation will not exceed 2^288 in the
|
|
multiplication of (acc+block) and r. An efficient constant-time
|
|
implementation can be found in the public domain library poly1305-
|
|
donna ([poly1305_donna]).
|
|
|
|
|
|
# Security Considerations
|
|
|
|
The ChaCha20 cipher is designed to provide 256-bit security.
|
|
|
|
The Poly1305 authenticator is designed to ensure that forged messages
|
|
are rejected with a probability of 1-(n/(2^102)) for a 16n-byte
|
|
message, even after sending 2^64 legitimate messages, so it is SUF-
|
|
CMA in the terminology of [AE].
|
|
|
|
Proving the security of either of these is beyond the scope of this
|
|
document. Such proofs are available in the referenced academic
|
|
papers.
|
|
|
|
The most important security consideration in implementing this draft
|
|
is the uniqueness of the nonce used in ChaCha20. Counters and LFSRs
|
|
are both acceptable ways of generating unique nonces, as is
|
|
encrypting a counter using a 64-bit cipher such as DES. Note that it
|
|
is not acceptable to use a truncation of a counter encrypted with a
|
|
128-bit or 256-bit cipher, because such a truncation may repeat after
|
|
a short time.
|
|
|
|
The Poly1305 key MUST be unpredictable to an attacker. Randomly
|
|
generating the key would fulfill this requirement, except that
|
|
Poly1305 is often used in communications protocols, so the receiver
|
|
should know the key. Pseudo-random number generation such as by
|
|
encrypting a counter is acceptable. Using ChaCha with a secret key
|
|
and a nonce is also acceptable.
|
|
|
|
The algorithms presented here were designed to be easy to implement
|
|
in constant time to avoid side-channel vulnerabilities. The
|
|
operations used in ChaCha20 are all additions, XORs, and fixed
|
|
rotations. All of these can and should be implemented in constant
|
|
time. Access to offsets into the ChaCha state and the number of
|
|
operations do not depend on any property of the key, eliminating the
|
|
chance of information about the key leaking through the timing of
|
|
cache misses.
|
|
|
|
For Poly1305, the operations are addition, multiplication and
|
|
modulus, all on >128-bit numbers. This can be done in constant time,
|
|
but a naive implementation (such as using some generic big number
|
|
library) will not be constant time. For example, if the
|
|
multiplication is performed as a separate operation from the modulus,
|
|
the result will some times be under 2^256 and some times be above
|
|
2^256. Implementers should be careful about timing side-channels for
|
|
Poly1305 by using the appropriate implementation of these operations.
|
|
|
|
|
|
# IANA Considerations
|
|
|
|
There are no IANA considerations for this document.
|
|
|
|
|
|
# Acknowledgements
|
|
|
|
ChaCha20 and Poly1305 were invented by Daniel J. Bernstein. The AEAD
|
|
construction and the method of creating the one-time poly1305 key
|
|
were invented by Adam Langley.
|
|
|
|
Thanks to Robert Ransom, Watson Ladd, Stefan Buhler, and kenny
|
|
patterson for their helpful comments and explanations. Thanks to
|
|
Niels Moeller for suggesting the more efficient AEAD construction in
|
|
this document. Special thanks to Ilari Liusvaara for providing extra
|
|
test vectors, helpful comments, and for being the first to attempt an
|
|
implementation from this draft.
|
|
|
|
# References
|
|
|
|
## Normative References
|
|
|
|
```example
|
|
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
|
|
Requirement Levels", BCP 14, RFC 2119, March 1997.
|
|
|
|
[chacha] Bernstein, D., "ChaCha, a variant of Salsa20", Jan 2008.
|
|
|
|
[poly1305]
|
|
Bernstein, D., "The Poly1305-AES message-authentication
|
|
code", Mar 2005.
|
|
```
|
|
|
|
## Informative References
|
|
|
|
```example
|
|
[AE] Bellare, M. and C. Namprempre, "Authenticated Encryption:
|
|
Relations among notions and analysis of the generic
|
|
composition paradigm",
|
|
<http://cseweb.ucsd.edu/~mihir/papers/oem.html>.
|
|
|
|
[FIPS-197]
|
|
National Institute of Standards and Technology, "Advanced
|
|
Encryption Standard (AES)", FIPS PUB 197, November 2001.
|
|
|
|
[FIPS-46] National Institute of Standards and Technology, "Data
|
|
Encryption Standard", FIPS PUB 46-2, December 1993,
|
|
<http://www.itl.nist.gov/fipspubs/fip46-2.htm>.
|
|
|
|
[LatinDances]
|
|
Aumasson, J., Fischer, S., Khazaei, S., Meier, W., and C.
|
|
Rechberger, "New Features of Latin Dances: Analysis of
|
|
Salsa, ChaCha, and Rumba", Dec 2007.
|
|
|
|
[NaCl] Bernstein, D., Lange, T., and P. Schwabe, "NaCl:
|
|
Networking and Cryptography library",
|
|
<http://nacl.cace-project.eu/index.html>.
|
|
|
|
[RFC4868] Kelly, S. and S. Frankel, "Using HMAC-SHA-256, HMAC-SHA-
|
|
384, and HMAC-SHA-512 with IPsec", RFC 4868, May 2007.
|
|
|
|
[RFC5116] McGrew, D., "An Interface and Algorithms for Authenticated
|
|
Encryption", RFC 5116, January 2008.
|
|
|
|
[RFC5996] Kaufman, C., Hoffman, P., Nir, Y., and P. Eronen,
|
|
"Internet Key Exchange Protocol Version 2 (IKEv2)",
|
|
RFC 5996, September 2010.
|
|
|
|
[Zhenqing2012]
|
|
Zhenqing, S., Bin, Z., Dengguo, F., and W. Wenling,
|
|
"Improved key recovery attacks on reduced-round salsa20
|
|
and chacha", 2012.
|
|
|
|
[poly1305_donna]
|
|
Floodyberry, A., "Poly1305-donna",
|
|
<https://github.com/floodyberry/poly1305-donna>.
|
|
|
|
[standby-cipher]
|
|
McGrew, D., Grieco, A., and Y. Sheffer, "Selection of
|
|
Future Cryptographic Standards",
|
|
draft-mcgrew-standby-cipher (work in progress).
|
|
```
|
|
|
|
|
|
Authors' Addresses
|
|
|
|
```verbatim
|
|
Yoav Nir
|
|
Check Point Software Technologies Ltd.
|
|
5 Hasolelim st.
|
|
Tel Aviv 6789735
|
|
Israel
|
|
Email: ynir.ietf@gmail.com
|
|
|
|
Adam Langley
|
|
Google Inc
|
|
Email: agl@google.com
|
|
|
|
Dylan McNamee
|
|
Galois Inc
|
|
Email: dylan@galois.com
|
|
```
|
|
|
|
## Utilities
|
|
|
|
|
|
```cryptol
|
|
// Converts a bytestring encoded like "fe:ed:fa:ce." into a sequence of bytes
|
|
// Note: the trailing punctuation is needed
|
|
parseHexString : {n} (fin n) => [3*n][8] -> [n][8]
|
|
parseHexString hexString = [ charsToByte (take`{2} cs) | cs <- groupBy`{3} hexString ] where
|
|
charsToByte : [2][8] -> [8]
|
|
charsToByte [ ub, lb ] = (charToByte ub) << 4 || (charToByte lb)
|
|
charToByte c = if c >= '0' && c <= '9' then c-'0'
|
|
| c >= 'a' && c <= 'f' then 10+(c-'a')
|
|
else 0 // error case
|
|
|
|
property parseHexString_check =
|
|
join (parseHexString
|
|
("00:01:02:03:04:05:06:07:08:09:0a:0b:0c:0d:0e:0f:10:11:12:13:" #
|
|
"14:15:16:17:18:19:1a:1b:1c:1d:1e:1f.")) ==
|
|
0x000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f
|
|
|
|
// Takes a finite sequence of bytes, and turns them into a word via
|
|
// a little-endian interpretation
|
|
littleendian : {a}(fin a) => [a][8] -> [a*8]
|
|
littleendian b = join(reverse b)
|
|
|
|
```
|