Merge branch 'master' into geal/check-all

2024-10-06 02:27:33 +03:00 · 2022-10-28 18:59:28 +02:00 · 2022-10-28 18:59:28 +02:00 · c861db415b
commit c861db415b
parent f31e9b8804 e92bbc7c60
1 changed files with 86 additions and 84 deletions
--- a/SPECIFICATIONS.md
+++ b/SPECIFICATIONS.md
@ -5,20 +5,20 @@
 Biscuit is a bearer token that supports offline attenuation, can be verified
 by any system that knows the root public key, and provides a flexible
 caveat language based on logic programming. It is serialized as
-Protocol Buffers [^Protobuf], and designed to be small enough for storage in
+Protocol Buffers [^protobuf], and designed to be small enough for storage in
 HTTP cookies.

 ### Vocabulary

 - Datalog: a declarative logic language that works on facts defining data relationship,
-rules creating more facts if conditions are met, and queries to test such conditions
+  rules creating more facts if conditions are met, and queries to test such conditions
 - check: a restriction on the kind of operation that can be performed with
-the token that contains it, represented as a datalog query in biscuit. For the operation
-to be valid, all of the checks defined in the token and the authorizer must succeed
+  the token that contains it, represented as a datalog query in biscuit. For the operation
+  to be valid, all of the checks defined in the token and the authorizer must succeed
 - allow/deny policies: a list of datalog queries that are tested in a sequence
-until one of them matches. They can only be defined in the authorizer
+  until one of them matches. They can only be defined in the authorizer
 - block: a list of datalog facts, rules and checks. The first block is the authority
-block, used to define the basic rights of a token
+  block, used to define the basic rights of a token
 - (Verified) Biscuit: a completely parsed biscuit, whose signatures and final proof
  have been successfully verified
 - Unverified Biscuit: a completely parsed biscuit, whose signatures and final proof
@ -54,60 +54,60 @@ variant of datalog that accepts expressions on some data types.

 ## Semantics

-A biscuit is structured as an append-only list of blocks, containing *checks*,
-and describing authorization properties.  As with Macaroons[^MACAROONS],
+A biscuit is structured as an append-only list of blocks, containing _checks_,
+and describing authorization properties. As with Macaroons[^macaroons],
 an operation must comply with all checks in order to be allowed by the biscuit.

 Checks are written as queries defined in a flavor of Datalog that supports
-expressions on some data types[^DATALOG], without support for negation. This
+expressions on some data types[^datalog], without support for negation. This
 simplifies its implementation and makes the check more precise.

 ### Logic language

 #### Terminology

-A Biscuit Datalog program contains *facts* and *rules*, which are made of
-*predicates* over the following types:
+A Biscuit Datalog program contains _facts_ and _rules_, which are made of
+_predicates_ over the following types:

- *variable*
- *integer*
- *string*
- *byte array*
- *date*
- *boolean*
- *set* a deduplicated list of values of any type, except *variable* or *set*
+- _variable_
+- _integer_
+- _string_
+- _byte array_
+- _date_
+- _boolean_
+- _set_ a deduplicated list of values of any type, except _variable_ or _set_

 While a Biscuit token does not use a textual representation for storage, we
 use one for parsing and pretty printing of Datalog elements.

-A *predicate* has the form `Predicate(v0, v1, ..., vn)`.
+A _predicate_ has the form `Predicate(v0, v1, ..., vn)`.

-A *fact* is a *predicate* that does not contain any *variable*.
+A _fact_ is a _predicate_ that does not contain any _variable_.

-A *rule* has the form:
-`Pr(r0, r1, ..., rk) <- P0(t1_1, t1_2, ..., t1_m1), ..., Pn(tn_1, tn_2, ..., tn_mn), E0(v0, ..., vi), ..., Ex(vx, ..., vy)`.
-The part of the left of the arrow is called the *head* and on the right, the
-*body*. In a *rule*, each of the `ri` or `ti_j` terms can be of any type. A
-*rule* is safe if all of the variables in the head appear somewhere in the body.
-We also define an *expression* `Ex` over the variables `v0` to `vi`. *Expressions*
-define a test of variable values when applying the *rule*. If the *expression*
-returns `false`, the *rule* application fails.
+A _rule_ has the form:
+`Pr(r0, r1, ..., rk) <- P0(t0_1, t0_2, ..., t0_m1), ..., Pn(tn_1, tn_2, ..., tn_mn), E0(v0, ..., vi), ..., Ex(vx, ..., vy)`.
+The part of the left of the arrow is called the _head_ and on the right, the
+_body_. In a _rule_, each of the `ri` or `ti_j` terms can be of any type. A
+_rule_ is safe if all of the variables in the head appear somewhere in the body.
+We also define an _expression_ `Ex` over the variables `v0` to `vi`. _Expressions_
+define a test of variable values when applying the _rule_. If the _expression_
+returns `false`, the _rule_ application fails.

-A *query* is a type of *rule* that has no head. It has the following form:
+A _query_ is a type of _rule_ that has no head. It has the following form:
 `?- P0(t1_1, t1_2, ..., t1_m1), ..., Pn(tn_1, tn_2, ..., tn_mn), C0(v0), ..., Cx(vx)`.
-When applying a *rule*, if there is a combination of *facts* that matches the
-body's predicates, we generate a new *fact* corresponding to the head (with the
+When applying a _rule_, if there is a combination of _facts_ that matches the
+body's predicates, we generate a new _fact_ corresponding to the head (with the
 variables bound to the corresponding values).

-A *check* is a list of *query* for which the token validation will fail if it cannot
+A _check_ is a list of _query_ for which the token validation will fail if it cannot
 produce any fact. A single query needs to match for the fact to succeed.
 If any of the cheks fails, the entire verification fails.

-An *allow policy* or *deny policy* is a list of *query*. If any of the queries produces something,
+An _allow policy_ or _deny policy_ is a list of _query_. If any of the queries produces something,
 the policy matches, and we stop there, otherwise we test the next one. If an
-*allow policy* succeeds, the token verification succeeds, while if a *deny policy*
+_allow policy_ succeeds, the token verification succeeds, while if a _deny policy_
 succeeds, the token verification fails. Those policies are tested after all of
-the *checks* have passed.
+the _checks_ have passed.

 We will represent the various types as follows:

@ -135,22 +135,22 @@ rules application does not generate any new facts, we can stop.

 #### Data types

-An *integer* is a signed 64 bits integer. It supports the following
+An _integer_ is a signed 64 bits integer. It supports the following
 operations: lower than, greater than, lower than or equal, greater than or equal, equal, set
 inclusion.

-A *string* is a suite of UTF-8 characters. It supports the following
+A _string_ is a suite of UTF-8 characters. It supports the following
 operations: prefix, suffix, equal, set inclusion, regular expression, concatenation (with `+`), substring test (with `.contains()`).

-A *byte array* is a suite of bytes. It supports the following
+A _byte array_ is a suite of bytes. It supports the following
 operations: equal, set inclusion.

-A *date* is a 64 bit unsigned integer representing a TAI64. It supports the
+A _date_ is a 64 bit unsigned integer representing a TAI64. It supports the
 following operations: `<`, `<=` (before), `>`, `>=` (after), equality, set inclusion.

-A *boolean* is `true` or `false`. It supports the following operations: `==`, `||`, `&&`, set inclusion. 
+A _boolean_ is `true` or `false`. It supports the following operations: `==`, `||`, `&&`, set inclusion.

-A *set* is a deduplicated list of terms of the same type. It cannot contain
+A _set_ is a deduplicated list of terms of the same type. It cannot contain
 variables or other sets. It supports equality, intersection, union, set inclusion.

 #### Grammar
@ -183,16 +183,16 @@ The logic language is descibed by the following EBNF grammar:
 <bytes> ::= "hex:" ([a-z] | [0-9])+
 <boolean> ::= "true" | "false"
 <date> ::= [0-9]* "-" [0-9] [0-9] "-" [0-9] [0-9] "T" [0-9] [0-9] ":" [0-9] [0-9] ":" [0-9] [0-9] ( "Z" | ( ("+" | "-") [0-9] [0-9] ":" [0-9] [0-9] ))
-<set> ::= "[" <sp>? ( <fact_term> ( <sp>? "," <sp>? <set_term>)* <sp>? )? "]"
+<set> ::= "[" <sp>? ( <set_term> ( <sp>? "," <sp>? <set_term>)* <sp>? )? "]"

 <expression> ::= <expression_element> (<sp>? <operator> <sp>? <expression_element>)*
-<expression_element> ::= <expression_unary> | (<expression_term> <expression_method>? ) 
+<expression_element> ::= <expression_unary> | (<expression_term> <expression_method>? )
 <expression_unary> ::= "!" <sp>? <expression>
-<expression_method> ::= "." <method_name> "(" <sp>? (<term> ( <sp>? "," <sp>? <term>)* )? <sp>? ")" 
+<expression_method> ::= "." <method_name> "(" <sp>? (<term> ( <sp>? "," <sp>? <term>)* )? <sp>? ")"
 <method_name> ::= ([a-z] | [A-Z] ) ([a-z] | [A-Z] | [0-9] | "_" )*

 <expression_term> ::= <term> | ("(" <sp>? <expression> <sp>? ")")
-<operator> ::= "<" | ">" | "<=" | ">=" | "==" | "&&" | "||" | "+" | "-" | "*" | "/" 
+<operator> ::= "<" | ">" | "<=" | ">=" | "==" | "&&" | "||" | "+" | "-" | "*" | "/"

 <sp> ::= (" " | "\t" | "\n")+
 ```
@ -205,7 +205,7 @@ The `name`, `variable` and `string` rules are defined as:
 - `variable`:
  - first character is `$`
  - following characters are any UTF-8 letter character, numbers, `_` or `:`
- ` string`:
+- `string`:
  - first character is `"`
  - any printable UTF-8 character except `"` which must be escaped as `\"`
  - last character is `"`
@ -318,7 +318,7 @@ check if
 These rules will define authority facts depending on verifier data.
 If we had the facts `resource("file1")` and
 `owner("alice", "file1")`, the authority rules will define
-`right"file1", "read")` and `right("file1", "write")`,
+`right("file1", "read")` and `right("file1", "write")`,
 which will allow check 1 and check 2 to succeed.

 If the owner ambient fact does not match the restriction in the second check, the
@ -366,42 +366,43 @@ rule.
 Expressions are internally represented as a series of opcodes for a stack based
 virtual machine. There are three kinds of opcodes:

- *value*: a raw value of any type. If it is a variable, the variable must also
-appear in a predicate, so the variable gets a real value for execution. When
-encountering a *value* opcode, we push it onto the stack
- *unary operation*: an operation that applies on one argument. When executed,
-it pops a value from the stack, applies the operation, then pushes the result
- *binary operation*: an operation that applies on two arguments. When executed,
-it pops two values from the stack, applies the operation, then pushes the result
+- _value_: a raw value of any type. If it is a variable, the variable must also
+  appear in a predicate, so the variable gets a real value for execution. When
+  encountering a _value_ opcode, we push it onto the stack
+- _unary operation_: an operation that applies on one argument. When executed,
+  it pops a value from the stack, applies the operation, then pushes the result
+- _binary operation_: an operation that applies on two arguments. When executed,
+  it pops two values from the stack, applies the operation, then pushes the result

 After executing, the stack must contain only one value, of the boolean type.

 Here are the currently defined unary operations:

- *negate*: boolean negation
- *parens*: returns its argument without modification (this is used when printing
-the expression, to avoid precedence errors)
- *length*: defined on strings, byte arrays and sets
+- _negate_: boolean negation
+- _parens_: returns its argument without modification (this is used when printing
+  the expression, to avoid precedence errors)
+- _length_: defined on strings, byte arrays and sets

 Here are the currently defined binary operations:
- *less than*, defined on integers and dates, returns a boolean
- *greater than*, defined on integers and dates, returns a boolean
- *less or equal*, defined on integers and dates, returns a boolean
- *greater or equal*, defined on integers and dates, returns a boolean
- *equal*, defined on integers, strings, byte arrays, dates, set, returns a boolean
- *contains* takes a set and another value as argument, returns a boolean. Between two sets, indicates if the first set is a superset of the second one.
-  between two strings, indicates a substring test. 
- *prefix*, defined on strings, returns a boolean
- *suffix*, defined on strings, returns a boolean
- *regex*, defined on strings, returns a boolean
- *add*, defined on integers, returns an integer. Defined on strings, concatenates them.
- *sub*, defined on integers, returns an integer
- *mul*, defined on integers, returns an integer
- *div*, defined on integers, returns an integer
- *and*, defined on booleans, returns a boolean
- *or*, defined on booleans, returns a boolean
- *intersection*, defined on sets, return a set that is the intersection of both arguments
- *union*, defined on sets, return a set that is the union of both arguments
+
+- _less than_, defined on integers and dates, returns a boolean
+- _greater than_, defined on integers and dates, returns a boolean
+- _less or equal_, defined on integers and dates, returns a boolean
+- _greater or equal_, defined on integers and dates, returns a boolean
+- _equal_, defined on integers, strings, byte arrays, dates, set, returns a boolean
+- _contains_ takes a set and another value as argument, returns a boolean. Between two sets, indicates if the first set is a superset of the second one.
+  between two strings, indicates a substring test.
+- _prefix_, defined on strings, returns a boolean
+- _suffix_, defined on strings, returns a boolean
+- _regex_, defined on strings, returns a boolean
+- _add_, defined on integers, returns an integer. Defined on strings, concatenates them.
+- _sub_, defined on integers, returns an integer
+- _mul_, defined on integers, returns an integer
+- _div_, defined on integers, returns an integer
+- _and_, defined on booleans, returns a boolean
+- _or_, defined on booleans, returns a boolean
+- _intersection_, defined on sets, return a set that is the intersection of both arguments
+- _union_, defined on sets, return a set that is the union of both arguments

 Integer operations must have overflow checks. If it overflows, the expression
 fails.
@ -593,11 +594,11 @@ token.

 #### Signature (one block)

-* `(pk_0, sk_0)` the root public and private Ed25519 keys
-* `data_0` the serialized Datalog
-* `(pk_1, sk_1)` the next key pair, generated at random
-* `alg_1` the little endian representation of the signature algorithm fr `pk1, sk1` (see protobuf schema)
-* `sig_0 = sign(sk_0, data_0 + alg_1 + pk_1)`
+- `(pk_0, sk_0)` the root public and private Ed25519 keys
+- `data_0` the serialized Datalog
+- `(pk_1, sk_1)` the next key pair, generated at random
+- `alg_1` the little endian representation of the signature algorithm fr `pk1, sk1` (see protobuf schema)
+- `sig_0 = sign(sk_0, data_0 + alg_1 + pk_1)`

 The token will contain:

@ -621,6 +622,7 @@ Token {
 With a token containing blocks 0 to n:

 Block n contains:
+
 - `data_n`
 - `pk_n+1`
 - `sig_n`
@ -780,6 +782,6 @@ We provide sample tokens and the expected result of their verification at

 - "Trust Management Languages" https://www.cs.purdue.edu/homes/ninghui/papers/cdatalog_padl03.pdf

-[^ProtoBuf]: ProtoBuf https://developers.google.com/protocol-buffers/
-[^DATALOG]: "Datalog with Constraints: A Foundation for Trust Management Languages" http://crypto.stanford.edu/~ninghui/papers/cdatalog_padl03.pdf
-[^MACAROONS]: "Macaroons: Cookies with Contextual Caveats for Decentralized Authorization in the Cloud" https://ai.google/research/pubs/pub41892
+[^protobuf]: ProtoBuf https://developers.google.com/protocol-buffers/
+[^datalog]: "Datalog with Constraints: A Foundation for Trust Management Languages" http://crypto.stanford.edu/~ninghui/papers/cdatalog_padl03.pdf
+[^macaroons]: "Macaroons: Cookies with Contextual Caveats for Decentralized Authorization in the Cloud" https://ai.google/research/pubs/pub41892