# Biscuit Authentication
## Introduction
Distributed authorization is traditionally done through
centralized systems like OAuth, where any new authorization
will be delivered by a server, and validated by that same server.
This is fine when working with a monolithic system, or a small
set of microservices, but it does not scale: a single request coming from a
user agent could result in hundreds of internal requests between microservices,
each requiring a verification of authorization, and we cannot have
a centralized server handle authorization for every service.
### Inspiration
This system draws ideas from X.509 certificates,
JWT, Macaroons and Vanadium.
JSON Web Tokens were designed in part to handle distributed authorization,
and in part to provide a stateless authentication token.
While it has been shown that state management cannot be avoided (it is
the only way to have correct revocation), distributed authorization
has proven useful. JSON Web Tokens are JSON objects that carry
data about their principal, expiration dates and a series of claims,
all signed with the authorization server's private key. Any service that
knows and trusts the corresponding public key will be able to validate the token.
JWTs are also quite large and often cannot fit in a cookie, so they are
often stored in localstorage, where they are easily stolen via XSS.
Macaroons provide a token that can be delegated: the holder can
create a new, valid token from the first one, by attenuating its
rights. They are built from a secret known to the authorization server.
A token can be created from a caveat and the HMAC of the secret and the caveat.
To build a new token, we add a caveat, remove the previous HMAC signature,
and add a HMAC of the previous signature and the new caveat (so from
an attenuated token we cannot go back to a more general one).
This allows us to build tokens with very limited access that we can hand
over to an external service, or to build unique restricted tokens per request.
Building macaroons on a secret means that any service that wants to validate
the token must know that secret.
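As an illustration of that chaining, here is a minimal sketch of Macaroon-style attenuation, assuming an HMAC-SHA256 primitive (the `hmac` and `sha2` crates and the function names are only used for the example, they are not part of this design):
```
use hmac::{Hmac, Mac};
use sha2::Sha256;

type HmacSha256 = Hmac<Sha256>;

fn hmac(key: &[u8], data: &[u8]) -> Vec<u8> {
    let mut mac = HmacSha256::new_from_slice(key).expect("HMAC accepts any key length");
    mac.update(data);
    mac.finalize().into_bytes().to_vec()
}

// Issuing: the first signature chains the server's secret and the first caveat.
fn issue(secret: &[u8], caveat: &[u8]) -> (Vec<Vec<u8>>, Vec<u8>) {
    (vec![caveat.to_vec()], hmac(secret, caveat))
}

// Attenuating: the new signature is the HMAC of the previous signature and the
// new caveat, and the previous (more powerful) signature is discarded, so the
// holder cannot go back to a less restricted token.
fn attenuate(caveats: &mut Vec<Vec<u8>>, previous_sig: &[u8], caveat: &[u8]) -> Vec<u8> {
    caveats.push(caveat.to_vec());
    hmac(previous_sig, caveat)
}
```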
Vanadium builds a distributed authorization and delegation system
based on public keys, by binding a token to a public key with
a certificate, and a blessing (a name with an optional prefix).
Attenuating the token means generating a new blessing by appending
a name, and signing a list of caveats and the public key of the new
holder. The token is then validated first by validating the certificate,
then validating the caveats, then applying ACLs based on patterns
in the blessings.
### Goals
Here is what we want:
- distributed authorization: any node could validate the token only with public information
- delegation: a new, valid token can be created from another one by attenuating its rights
- avoiding identity and impersonation: in a distributed system, not all services
need to know about the token holder's identity. Instead, they care about
specific authorizations
- capabilities: a request carries a token that contains a set of rights
that will be used for authorization, instead of deploying ACLs on every node
## Format
A biscuit token is an ordered list of key and value tuples, stored in HPACK
format. HPACK was chosen to avoid specifying yet another serialization format,
and to reuse its data compression features to make tokens small enough to
fit in a cookie.
```
biscuit := block*, signature
block := HPACK{ kv* }
kv := ["rights", rights] | ["pub", pubkey] | [TEXT, TEXT]
TEXT := characters (UTF-8 or ASCII?)
pubkey := base64(public key)
rights := namespace { right,* }
namespace := TEXT
right := (+|-) tag : feature(options)
tag := TEXT | /regexp/ | *
feature := TEXT | /regexp/ | *
options := (r|w|e),*
```
Example:
```
[
issuer = Clever Cloud
user = user_id_123
rights = clevercloud{-/.*prod/ : *(*) +/org_456-*/: *(*) +lapin-prod:log(r) +sozu-prod:metric(r)}
]
<signature = base_64(64 bytes signature)>
```
This token was issued by "Clever Cloud" for user "user_id_123".
It defines the following capabilities, applied in order:
- remove all rights from any tag with the "prod" suffix
- give all rights on any tag that has the "org_456" prefix (even those with "prod" suffix)
- add on the "lapin-prod" tag the "log" feature with right "r"
- add on the "sozu-prod" tag the "metric" feature with right "r"
Example of an attenuated token:
```
[
issuer = Clever Cloud
user = user_id_123
organization = org_456
rights = clevercloud{-/.*prod/ : *(*) +/org_456-*/: *(*) +lapin-prod:log(r) +sozu-prod:metric(r)}
]
[
pub = base64(128 bytes key)
rights = clevercloud { -/org_456-*/: *(*) +/org_456-test/: database(*) }
]
<signature = base_64(new 64 bytes signature)>
```
This new token starts from the same rights as the previous one, but attenuates them
as follows:
- all access to tags with the "org_456-" prefix is removed
- except for the "org_456-test" tag, on which we activate the "database" feature with all access rights
The new token has a signature derived from the previous one and the second block.
### Common keys and values
Key-value tuples can contain arbitrary data, but some of them have predefined
semantics (and could be part of HPACK's static tables to reduce the size of
the token):
- issuer: original creator of the token (validators are able to look up the root public key from the issuer field). Appears in the first block
- holder: current holder of the token (will be used for audit purposes). Can appear once per block
- pub: public key used to sign the current block. Appears in every block except the first
- created-on: creation date, in ISO 8601 format. Can appear once per block
- expires-on: expiration date, in ISO 8601 format. Can appear once per block. Must be earlier than the expiration dates from previous blocks if present
- restricts: comma separated list of public keys. Any future block can only be signed by one of those keys
- sealed: if present, stops delegation (no further block can be added). Its only value is "true"
- rights: string specifying the rights restriction for this block
Those common keys and values will be present in the HPACK static table.
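As an illustration, a first block using those common keys could look like this (the values are hypothetical, and the notation is the same informal one as in the examples above):
```
[
issuer = Clever Cloud
holder = user_id_123
created-on = 2019-01-02T18:00:00Z
expires-on = 2019-02-01T00:00:00Z
sealed = true
rights = clevercloud{+lapin-prod:log(r)}
]
<signature = base_64(64 bytes signature)>
```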
## Rights management
The rules are defined to allow flexibility in their verification. The default token
will start with all the rights, and restrict them with the "rights" field in each
new block. But what those restrictions mean will depend on which service verifies
the token, as different services might care about (or even know of) different sets of capabilities.
We start from a set of rights `R` that contains a list of namespaces. Each namespace
has a list of tuples `(tag, feature, [options])`. Tags and features can appear in
multiple tuples.
A `rights` field contains a list of namespaces, and for each namespace,
a list of right patterns matching `(tag, feature, [options])` tuples,
each prefixed with `+` or `-` to indicate whether the matched rights should be added or removed.
Applying rights attenuation works as follows (a code sketch is given after the list):
- for each namespace `N`:
  - load the current set of rights `R`:
    - either the original set of rights for the verifier
    - or the set of rights after attenuation by the previous block
  - all rights in `R` are marked as `+` (active)
  - for each right pattern ( `RP = (+|-) tag : feature(options)` ):
    - for each right tuple `r = (tag, feature, [options])` in `R` matched by `RP`:
      - if r is active ( `+` ) but `RP` contains `-`, mark r as inactive ( `-` )
      - if r is inactive ( `-` ) but `RP` contains `+`, mark r as active ( `+` )
  - filter `R` to keep only the tuples marked as active
  - store `R` as the new rights for `N`
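Here is a minimal sketch of that attenuation loop for a single namespace, assuming a simplified in-memory representation (the types and the pattern-matching closure are hypothetical, not part of the format):
```
#[derive(Clone, Copy, PartialEq)]
enum Op { Add, Remove }

// A concrete right held by the verifier: (tag, feature, [options]).
#[derive(Clone)]
struct Right { tag: String, feature: String, options: Vec<char> }

// One right pattern from a "rights" field: a +/- operation and a matcher.
struct RightPattern { op: Op, matches: Box<dyn Fn(&Right) -> bool> }

// Apply one block's patterns to the current set of rights of a namespace.
fn attenuate(current: &[Right], patterns: &[RightPattern]) -> Vec<Right> {
    // every right starts marked as + (active)
    let mut active = vec![true; current.len()];

    for pattern in patterns {
        for (i, right) in current.iter().enumerate() {
            if (pattern.matches)(right) {
                // a matching - deactivates the right, a matching + reactivates it
                active[i] = pattern.op == Op::Add;
            }
        }
    }

    // keep only the tuples still marked as active
    current.iter()
        .zip(active)
        .filter(|(_, keep)| *keep)
        .map(|(right, _)| right.clone())
        .collect()
}
```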
## Cryptography
This design requires a non interactive signature aggregation scheme.
We have multiple propositions, described in an annex to this document.
We have not chosen yet which scheme will be used. The choice will
depend on the speed of the algorithm (for signature, aggregation and
verification) and the size of the keys and signatures, and is pending
an audit.
The system needs to be non interactive, so that delegation can
be done "offline", without talking to the initial authorization
system, or any of the other participants in the delegation chain.
A signature aggregation scheme can take a list of tuples
(message, signature, public key), and produce one signature
that can be verified with the list of messages and public keys.
An additional important property we need here: we cannot get
the original signatures from an aggregated one.
### Biscuit signature scheme
Assuming we have the following primitives:
- `Keygen()` can give us a public key `pk` and a private key `sk`
- `Sign(sk, message)` can give us a signature `S`, with `message` a byte array of arbitrary length
- `Aggregate(S1, S2)` can give us an aggregated signature `S`. Additionally, `Aggregate`
can be called with an aggregated signature `S` and a single signature `S'`, and return a new
aggregated signature `S"`
- `Verify([message], [pk], S)` will return true if the signature `S`
is valid for the list of messages `[message]` and the list of public keys `[pk]`
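As a sketch, those primitives could be expressed as the following Rust interface (the trait and its names are hypothetical, and the concrete scheme is discussed in the annex):
```
// Hypothetical interface for the aggregated signature scheme described above.
pub trait AggregatedSignatureScheme {
    type PublicKey: Clone;
    type PrivateKey;
    type Signature: Clone;

    /// Keygen(): produce a key pair.
    fn keygen() -> (Self::PublicKey, Self::PrivateKey);

    /// Sign(sk, message): sign a byte message of arbitrary length.
    fn sign(sk: &Self::PrivateKey, message: &[u8]) -> Self::Signature;

    /// Aggregate(S1, S2): combine two signatures (either of which may already
    /// be an aggregate) into a single aggregated signature.
    fn aggregate(s1: &Self::Signature, s2: &Self::Signature) -> Self::Signature;

    /// Verify([message], [pk], S): check the aggregated signature against the
    /// lists of messages and public keys, in block order.
    fn verify(messages: &[Vec<u8>], pks: &[Self::PublicKey], s: &Self::Signature) -> bool;
}
```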
#### First layer of the authorization token
The issuing server performs the following steps:
- `(pk1, sk1) <- Keygen()` (done once)
- create the first block (we can omit `pk1` from that block, since we assume the
token will be verified on a system that knows that public key)
- Serialize that first block to `m1`
- `S <- Sign(sk1, m1)`
- `token1 <- m1||S`
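Using the hypothetical interface sketched above, issuing the first layer could look like this (`m1` is the first block, already serialized with HPACK):
```
// Sketch only: S is any implementation of the hypothetical
// AggregatedSignatureScheme trait defined earlier.
fn issue_token<S: AggregatedSignatureScheme>(
    sk1: &S::PrivateKey,
    m1: Vec<u8>,
) -> (Vec<Vec<u8>>, S::Signature) {
    let sig = S::sign(sk1, &m1); // S <- Sign(sk1, m1)
    (vec![m1], sig)              // token1 <- m1 || S
}
```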
#### Adding a block to the token
The holder of a token can attenuate it by adding a new block and
signing it, with the following steps:
- With `token1` containing `[messages]||S`, and a way to get
the list of public keys `[pk]` for each block from the blocks, or
from the environment
- `(pk2, sk2) <- Keygen()`
- With `message2` the block we want to add (containing `pk2`, so it
can be found in further verifications)
- `S2 <- Sign(sk2, message2)`
- `S' <- Aggregate(S, S2)`
- `token2 <- [messages]||message2||S'`
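A corresponding sketch of attenuation, still against the same hypothetical interface; the caller is expected to run `Keygen()` first, embed `pk2` in the new block and serialize it to `message2` before calling this:
```
// Sketch only: attenuate a token by appending a new block.
fn attenuate_token<S: AggregatedSignatureScheme>(
    mut messages: Vec<Vec<u8>>,  // existing serialized blocks
    sig: S::Signature,           // existing aggregated signature
    sk2: &S::PrivateKey,         // key generated for this attenuation
    message2: Vec<u8>,           // new serialized block, containing pk2
) -> (Vec<Vec<u8>>, S::Signature) {
    let s2 = S::sign(sk2, &message2);    // S2 <- Sign(sk2, message2)
    let s_agg = S::aggregate(&sig, &s2); // S' <- Aggregate(S, S2)
    messages.push(message2);             // token2 <- [messages] || message2 || S'
    (messages, s_agg)
}
```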
Note: the block can contain `sealed: true` in its keys and values, to
indicate a token should not be attenuated further.
Question: should the previous signature be verified before adding the
new block?
#### Verifying the token
- With `token` containing `[messages]||S`
- extract `[pk]` from `[messages]` and the environment: the first public
key should already be known, and for performance reasons, some public keys
could also be present in the list of common keys and values
- `b <- Verify([messages], [pk], S)`
- if `b` is true, the signature is valid
- proceed to validating rights
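And a verification sketch against the same hypothetical interface; gathering `[pk]` from the blocks and from the environment is left abstract here:
```
// Sketch only: verify the aggregated signature, then move on to rights validation.
// `pks` holds one public key per block, in order: the issuer's root key for the
// first block, then the pk embedded in each following block.
fn verify_token<S: AggregatedSignatureScheme>(
    messages: &[Vec<u8>],
    pks: &[S::PublicKey],
    sig: &S::Signature,
) -> bool {
    // b <- Verify([messages], [pk], S); if true, proceed to validating rights
    S::verify(messages, pks, sig)
}
```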
## Annex 1: Cryptographic design proposals
### Pairing based cryptography
proposed by @geal
Assuming we have a pairing `e: G1 x G2 -> Gt`, with G1 and G2 two additive cyclic groups of prime order q, and Gt a multiplicative cyclic group of order q.
With `a`, `b` from Fq* (the finite field of order q), `P` from G1 and `Q` from G2,
we have the following properties:
- `e(aP, bQ) == e(P, Q)^(ab)`
- `e != 1`
More specifically:
- `e(aP, Q) == e(P, aQ) == e(P,Q)^a`
- `e(P1 + P2, Q) == e(P1, Q) * e(P2, Q)`
#### Signature
- choose k from Fq* as private key, g2 a generator of G2
- public key P = k*g2
- Signature: S = k*H1(message), with H1 a function hashing the message to G1
- Verifying: knowing message, P and S
```
e(S, g2) == e( k*H1(message), g2)
         == e( H1(message), k*g2)
         == e( H1(message), P)
```
#### Signature aggregation
- knowing messages m1 and m2, public keys P1 and P2
- signatures S1 = Sign(k1, m1), S2 = Sign(k2, m2)
- the aggregated signature S = S1 + S2
Verifying:
```
e(S, g2) == e(S1+S2, g2)
         == e(S1, g2)*e(S2, g2)
         == e(k1*H1(m1), g2) * e(k2*H1(m2), g2)
         == e(H1(m1), k1*g2) * e(H1(m2), k2*g2)
         == e(H1(m1), P1) * e(H1(m2), P2)
```
So we calculate the signature verification pairing for every caveat,
then multiply the results and check equality.
We use the BLS12-381 (Boneh-Lynn-Shacham) curve for security reasons
(cf. https://github.com/zcash/zcash/issues/2502
for a comparison with Barreto-Naehrig curves).
This assumes computational Diffie-Hellman is hard.
Performance is not stellar: with the pairing crate, verifying a token
with 3 blocks can take 30ms, versus 1.7ms with mcl.
Examples of libraries this can be implemented with:
- pairing crate: https://github.com/zkcrypto/pairing
- mcl: https://github.com/herumi/mcl
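As a rough sketch of how aggregated verification would look, using hypothetical group and pairing helpers (this is not the actual API of the pairing crate or of mcl):
```
// Hypothetical stand-ins for G1/G2/Gt elements and operations.
struct G1;                    // point in G1 (signatures, hashed messages)
struct G2;                    // point in G2 (public keys, generator g2)
#[derive(PartialEq)]
struct Gt;                    // element of the target group

fn hash_to_g1(_message: &[u8]) -> G1 { unimplemented!() } // H1
fn pairing(_p: &G1, _q: &G2) -> Gt { unimplemented!() }   // e(P, Q)
fn gt_mul(_a: Gt, _b: Gt) -> Gt { unimplemented!() }      // product in Gt
fn g2_generator() -> G2 { unimplemented!() }              // g2

// Verify an aggregated signature S over blocks m_i with public keys P_i:
// e(S, g2) must equal the product over i of e(H1(m_i), P_i).
fn verify_aggregate(messages: &[Vec<u8>], pks: &[G2], s: &G1) -> bool {
    let lhs = pairing(s, &g2_generator());
    let mut rhs: Option<Gt> = None;
    for (m, pk) in messages.iter().zip(pks) {
        let e = pairing(&hash_to_g1(m), pk);
        rhs = Some(match rhs {
            None => e,
            Some(acc) => gt_mul(acc, e),
        });
    }
    rhs.map_or(false, |r| lhs == r)
}
```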
### Elliptic curve verifiable random functions
proposed by @KellerFuchs
https://tools.ietf.org/html/draft-goldbe-vrf-01
Using the primitives defined in https://tools.ietf.org/html/draft-goldbe-vrf-01#section-5 :
```
F - finite field
2n - length, in octets, of a field element in F
E - elliptic curve (EC) defined over F
m - length, in octets, of an EC point encoded as an octet string
G - subgroup of E of large prime order
q - prime order of group G
cofactor - number of points on E divided by q
g - generator of group G
Hash - cryptographic hash function
hLen - output length in octets of Hash
```
Constraints on options:
- field elements in F have bit lengths divisible by 16
- hLen is equal to 2n
Steps:
Keygen:
- `(pk, sk) <- Keygen()`: sk is a random x with 0 < x < q, and pk = g^x
Sign(pk, sk, message):
creating a proof pi = ECVRF_prove(pk, sk, message):
- h = ECVRF_hash_to_curve(pk, message)
- gamma = h^sk
- choose a random integer nonce k from [0, q-1]
- c = ECVRF_hash_points(g, h, pk, gamma, g^k, h^k)
- s = k - c*sk mod q
- pi = (gamma, c, s)
Verify(pk, pi, message) for one message and its signature:
- (gamma, c, s) = pi
```
u = pk^c * g^s
  = g^(sk*c)*g^(k - c*sk)
  = g^k
```
2019-01-02 18:49:01 +03:00
- h = ECVRF_hash_to_curve(pk, message)
2019-01-02 18:55:59 +03:00
```
2019-01-02 18:54:54 +03:00
v = gamma^c * h^s
= h^(sk*c)*h^(k - c*sk)
= h^k
```
- c' = ECVRF_hash_points(g, h, pk, gamma, u, v)
- return c == c'
Aggregate(pk', pi', [pk], PI) with [pk] list of public keys and PI aggregated signature:
- (gamma', c', s') = pi'
- ([gamma], [c], S, W, C) = PI
- h' = ECVRF_hash_to_curve(pk', message')
- S' = S + s'
- if the previous signature was the first block:
  - set W' = h_0^-s_1 * h_1^-s_0
- else, with W == (h_0^(s_0 - S) * .. * h_n^(s_n - S)):
```
W' = W * (h_0^-s') * .. * (h_n^-s') * (h'^-S)
   = (h_0^(s_0 - S - s') * .. * h_n^(s_n - S - s')) * h'^(s' - S')
   = (h_0^(s_0 - S') * .. * h_n^(s_n - S')) * h'^(s' - S')
```
- C' = ECVRF_hash_points(g, h, pk_0 * .. * pk_n, gamma_0 * .. * gamma_n, U', V')
- PI' = ([gamma]||gamma', [c]||c', S', W', C')
Verify([pk], PI, [message]):
- ([gamma], [c], S, W, C) = PI
- check that n = |[pk]| == |[message]| == |[gamma]| == |[c]|
- for each tuple (pk_i, message_i, gamma_i, c_i) with i from 0 to n:
- p_i = pk_i^c_i
- h_i = ECVRF_hash_to_curve(pk_i, message_i)
- v_i = gamma_i^c_i * h_i^S
```
U = (p_0 * .. * p_n) * g^S
  = pk_0^c_0 * .. * pk_n^c_n * g^((k_0 - c_0*sk_0) + .. + (k_n - c_n*sk_n))
  = g^((sk_0*c_0 + .. + sk_n*c_n) + (k_0 - c_0*sk_0) + .. + (k_n - c_n*sk_n))
  = g^(k_0 + .. + k_n)
```
```
V = v_0 * .. * v_n * W
  = gamma_0^c_0 * .. * gamma_n^c_n * h_0^S * .. * h_n^S * h_0^(s_0 - S) * .. * h_n^(s_n - S)
  = h_0^(sk_0*c_0) * .. * h_n^(sk_n*c_n) * h_0^s_0 * .. * h_n^s_n
  = h_0^(sk_0*c_0) * .. * h_n^(sk_n*c_n) * h_0^(k_0 - sk_0*c_0) * .. * h_n^(k_n - sk_n*c_n)
  = h_0^k_0 * .. * h_n^k_n
```
- C' = ECVRF_hash_points(g, h, pk_0 * .. * pk_n, gamma_0 * .. * gamma_n, U, V)
- return C == C'
### Gamma signatures
proposed by @bascule
Yao, A. C.-C., & Zhao, Y. (2013). Online/Offline Signatures for Low-Power Devices. IEEE Transactions on Information Forensics and Security, 8(2), 283–294.
Zhao, Y. Aggregation of Gamma-Signatures and Applications to Bitcoin. https://eprint.iacr.org/2018/414.pdf
### BIP32 derived keys
proposed by @bascule
https://github.com/bitcoin/bips/blob/master/bip-0032.mediawiki