remove symbols

They complicated the Datalog, and the #authority and #ambient symbols
are not needed anymore with the scoped execution
This commit is contained in:
Geoffroy Couprie 2021-09-09 21:56:02 +02:00
parent deb45ac1db
commit 6092bd1614
2 changed files with 74 additions and 97 deletions

View File

@ -19,7 +19,6 @@ to be valid, all of the checks defined in the token and the verifier must succee
until one of them matches. They can only be defined in the verifier
- block: a list of datalog facts, rules and checks. The first block is the authority
block, used to define the basic rights of a token
- symbol: string that is stored in a table and referred to by its index to save space
### Overview
@ -60,7 +59,6 @@ simplifies its implementation and makes the check more precise.
A Biscuit Datalog program contains *facts* and *rules*, which are made of
*predicates* over the following types:
- *symbol*
- *variable*
- *integer*
- *string*
@ -94,36 +92,29 @@ the policy matches, and we stop there, otherwise we test the next one. If an
succeeds, the token verification fails. Those policies are tested after all of
the *checks* have passed.
We will represent the various types as follows:
- symbol: `#a`
- variable: `$variable` (the variable name is converted to an integer id through the symbol table)
- integer: `12`
- string: `"hello"`
- byte array: `hex:01A2`
- date in RFC 3339 format: `1985-04-12T23:20:50.52Z`
- boolean: `true` or `false`
- set: `[ #a, #b, #c]`
- set: `[ "a", "b", "c"]`
As an example, assuming we have the following facts: `parent(#a, #b)`,
`parent(#b, #c)`, `#parent(#c, #d)`. If we apply the rule
As an example, assuming we have the following facts: `parent("a", "b")`,
`parent("b", "c")`, `parent("c", "d")`. If we apply the rule
`grandparent($x, $z) <- parent($x, $y), parent($y, $z)`, we will try to replace
the predicates in the body by matching facts. We will get the following
combinations:
- `grandparent(#a, #c) <- parent(#a, #b), parent(#b, #c)`
- `grandparent(#b, #d) <- parent(#b, #c), parent(#c, #d)`
- `grandparent("a", "c") <- parent("a", "b"), parent("b", "c")`
- `grandparent("b", "d") <- parent("b", "c"), parent("c", "d")`
The system will now contain the two new facts `grandparent(#a, #c)` and
`grandparent(#b, #d)`. Whenever we generate new facts, we have to reapply all of
The system will now contain the two new facts `grandparent("a", "c")` and
`grandparent("b", "d")`. Whenever we generate new facts, we have to reapply all of
the system's rules on the facts, because some rules might give a new result. Once
rules application does not generate any new facts, we can stop.
#### Data types
A *symbol* indicates a value that supports equality, set inclusion and set
exclusion checks. Its internal representation is an index into the token's
symbol table, which is a list of strings. The symbol table reduces the size of
tokens by storing common symbols in a predefined table, and writing new symbols
only once per token.
An *integer* is a signed 64 bits integer. It supports the following
operatios: lower, larger, lower or equal, larger or equal, equal, set
inclusion and set exclusion.
@ -153,15 +144,15 @@ apply on facts created in the current or previous blocks. Facts, rules, checks
and policies of the verifier are executed in the context of the authority block.
Example:
- the token contains `right("file1", #read)` in the first block
- the token holder adds a block with the fact `right("file2", #read)`
- the token contains `right("file1", "read")` in the first block
- the token holder adds a block with the fact `right("file2", "read")`
- the verifier adds:
- `resource("file2")`
- `operation(#read)`
- `operation("read")`
- `check if resource($res), operation($op), right($res, $op)`
The verifier's check will fail because when it is evaluated, it only sees
`right("file1", #read)` from the authority block.
`right("file1", "read")` from the authority block.
### Checks
@ -190,15 +181,15 @@ The second caveat checks that the resource is `file1`.
```
authority:
right("file1", #read);
right("file2", #read);
right("file1", #write);
right("file1", "read");
right("file2", "read");
right("file1", "write");
----------
Block 1:
check if
resource($0),
operation(#read),
right($0, #read) // restrict to read operations
operation("read"),
right($0, "read") // restrict to read operations
----------
Block 2:
check if
@ -209,13 +200,13 @@ The verifier side provides the `resource` and `operation` facts with information
from the request.
If the verifier provided the facts `resource("file2")` and
`operation(#read)`, the rule application of the first check would see
`resource("file2"), operation(#read), right("file2", #read)`
`operation("read")`, the rule application of the first check would see
`resource("file2"), operation("read"), right("file2", "read")`
with `X = "file2"`, so it would succeed, but the second check would fail
because it expects `resource("file1")`.
If the verifier provided the facts `resource("file1")` and
`operation(#read)`, both checks would succeed.
`operation("read")`, both checks would succeed.
#### Broad authority rules
@ -228,9 +219,9 @@ the token.
authority:
// if there is an ambient resource and we own it, we can read it
right($0, #read) <- resource($0), owner($1, $0);
right($0, "read") <- resource($0), owner($1, $0);
// if there is an ambient resource and we own it, we can write to it
right($0, #write) <- resource($0), owner($1, $0);
right($0, "write") <- resource($0), owner($1, $0);
----------
Block 1:
@ -243,13 +234,13 @@ Block 2:
check if
resource($0),
owner(#alice, $0) // defines a token only usable by alice
owner("alice", $0) // defines a token only usable by alice
```
These rules will define authority facts depending on verifier data.
If we had the facts `resource("file1")` and
`owner(#alice, "file1")`, the authority rules will define
`right"file1", #read)` and `right("file1", #write)`,
`owner("alice", "file1")`, the authority rules will define
`right"file1", "read")` and `right("file1", "write")`,
which will allow check 1 and check 2 to succeed.
If the owner ambient fact does not match the restriction in the second check, the
@ -273,9 +264,9 @@ restrict usage based on ambient values:
```
authority:
right("/folder/file1", #read);
right("/folder/file2", #read);
right("/folder2/file3", #read);
right("/folder/file1", "read");
right("/folder/file2", "read");
right("/folder2/file3", "read");
----------
check if resource($0), right($0, $1)
----------
@ -315,7 +306,7 @@ Here are the currently defined binary operations:
- *greater than*, defined on integers and dates, returns a boolean
- *less or equal*, defined on integers and dates, returns a boolean
- *greater or equal*, defined on integers and dates, returns a boolean
- *equal*, defined on integers, strings, byte arrays, dates, symbols, set, returns a boolean
- *equal*, defined on integers, strings, byte arrays, dates, set, returns a boolean
- *contains* takes a set and another value as argument, returns a boolean. Between two sets, indicates if the first set is a superset of the second one
- *prefix*, defined on strings, returns a boolean
- *suffix*, defined on strings, returns a boolean

View File

@ -137,7 +137,6 @@ It has the following base types (for elements inside of a fact):
- string
- date (seconds from epoch, UTC)
- byte array
- symbol (interned strings that are stored in a dictionary to spare space)
- boolean (true or false)
- set: a deduplicated list of values, that can be of any type except variables or sets
@ -171,25 +170,13 @@ using a string expression:
```
check if
resource(#ambient, $path),
resource($path),
$path.matches("file[0-9]+.txt")
```
This check matches only if there exists a `resource(#ambient, $path)` fact for
This check matches only if there exists a `resource($path)` fact for
which `$path` matches a pattern.
## `#ambient` and `#authority` symbols
This check uses a _symbol_ named `#ambient` (symbols start with a `#`).
There are two special symbols that can appear in facts:
-`#ambient`: facts that are _provided by the verifier_, and that depend on the **request**, like which resource we want to access(file path, REST endpoint, etc), operation(read, write...), current date and time, source IP address, HTTP headers...
- `#authority`: facts _defined by the token's original creator_ or _the verifier_, that indicates the basic rights of the **token**. Every new attenation of the token will reduce those rights by adding checks
`#ambient` and `#authority` tokens can only be provided by the token's origin
or by the verifier, **they cannot be added by attenuating the token**.
## Allow and deny policies
The validation in Biscuit relies on a list of allow or deny policies, that are
@ -204,12 +191,12 @@ Example policies:
```
// verifies that we have rights for this request
allow if
resource(#ambient, $res),
operation(#ambient, $op),
right(#authority, $res, $op)
resource($res),
operation($op),
right($res, $op)
// otherwise, allow if we're admin
allow if is_admin(#authority)
allow if is_admin()
// catch all if non of the policies matched
deny if true
@ -251,7 +238,7 @@ Biscuit {
context: ""
version: 1
facts: [
user_id(#authority, "user_1234"),
user_id("user_1234"),
]
rules: []
checks: []
@ -264,35 +251,34 @@ Biscuit {
Let's unpack what's displayed here:
- `symbols` carries a list of symbols used in the biscuit.
- `authority` carries information provided by the token creator. It gives the initial scope of the bicuit.
- `authority` this block carries information provided by the token creator. It gives the initial scope of the biscuit.
- `blocks` carries a list of blocks, which can refine the scope of the biscuit
Here, `authority` provides the initial block, which can be refined in subsequent blocks.
A block comes with new symbols it adds to the system (there's a default symbol
table that already contains values like `#authority` or `#operation`). It can
A block comes with new symbols it adds to the system. It can
contain facts, rules and checks. A block contains:
- `symbols`: a block can introduce new symbols: these symbols are available in the current block, _and the following blocks_. **It is not possible to re-declare an existing symbol**.
- `context`: free form text used either for documentation purpose, or to give a hind about which facts should be retrieved from DB
- `facts`: each block can define new facts (but only `authority` can define facts mentioning `#authority`)
- `rules` each block can define new rules (but only `authority` can define rules deriving facts mentioning `#authority`)
- `checks` each block can define new checks (queries that need to match in order to make the biscuit valid)
- `facts`: each block can define new facts
- `rules` each block can define new rules but they only have access to facts from the current and previous blocks
- `checks` each block can define new checks (queries that need to match in order to make the biscuit valid) but they only have access to facts from the current and previous blocks
Let's assume the user is sending this token with a `PUT /bucket_5678/folder1/hello.txt` HTTP
request. The verifier would then load the token's facts and rules, along with
facts from the request:
```
user_id(#authority, "user_1234");
operation(#ambient, #write);
resource(#ambient, "bucket_5678", "/folder1/hello.txt");
current_time(#ambient, 2020-11-17T12:00:00+00:00);
user_id("user_1234");
operation("write");
resource("bucket_5678", "/folder1/hello.txt");
current_time(2020-11-17T12:00:00+00:00);
```
The verifier would also be able to load authorization data from its database,
like ownership information: `owner(#authority, "user_1234", "bucket_1234")`,
`owner(#authority, "user_1234", "bucket_5678")` `owner(#authority, "user_ABCD", "bucket_ABCD")`.
like ownership information: `owner("user_1234", "bucket_1234")`,
`owner("user_1234", "bucket_5678")` `owner("user_ABCD", "bucket_ABCD")`.
In practice,this data could be filtered by limiting it to facts related to
the current ressource, or extracting the user id from the token with a query.
@ -301,11 +287,11 @@ if we own a specific folder:
```
// the resource owner has all rights on the resource
right(#authority, $bucket, $path, $operation) <-
resource(#ambient, $bucket, $path),
operation(#ambient, $operation),
user_id(#authority, $id),
owner(#authority, $id, $bucket)
right($bucket, $path, $operation) <-
resource($bucket, $path),
operation($operation),
user_id($id),
owner($id, $bucket)
```
This rule will generate a `right` fact if it finds data matching the variables.
@ -313,14 +299,14 @@ This rule will generate a `right` fact if it finds data matching the variables.
We end up with a system with the following facts:
```
user_id(#authority, "user_1234");
operation(#ambient, #write);
resource(#ambient, "bucket_5678", "/folder1/hello.txt");
current_time(#ambient, 2020-11-17T12:00:00+00:00);
owner(#authority, "user_1234", "bucket_1234");
owner(#authority, "user_1234", "bucket_5678");
owner(#authority, "user_ABCD", "bucket_ABCD");
right(#authority, "bucket_5678", "/folder1/hello.txt", #write);
user_id("user_1234");
operation("write");
resource("bucket_5678", "/folder1/hello.txt");
current_time(2020-11-17T12:00:00+00:00);
owner("user_1234", "bucket_1234");
owner("user_1234", "bucket_5678");
owner("user_ABCD", "bucket_ABCD");
right("bucket_5678", "/folder1/hello.txt", "write");
```
At last, the verifier provides a policy to test that we have the rights for this
@ -328,9 +314,9 @@ operation:
```
allow if
right(#authority, $bucket, $path, $operation),
resource(#ambient, $bucket, $path),
operation(#ambient, $operation)
right($bucket, $path, $operation),
resource($bucket, $path),
operation($operation)
```
Here we can find matching facts, so the request succeeds. If the request was
@ -351,7 +337,7 @@ Biscuit {
context: ""
version: 1
facts: [
right(#authority, "bucket_5678", "/folder1/hello.txt", #read)
right("bucket_5678", "/folder1/hello.txt", "read")
]
rules: []
checks: []
@ -375,7 +361,7 @@ Biscuit {
context: ""
version: 1
facts: [
user_id(#authority, "user_1234"),
user_id("user_1234"),
]
rules: []
checks: []
@ -388,7 +374,7 @@ Biscuit {
facts: []
rules: []
checks: [
check if resource(#ambient, "bucket_5678", "/folder1/hello.txt"), operation(#ambient, #read)
check if resource("bucket_5678", "/folder1/hello.txt"), operation("read")
]
}
@ -400,18 +386,18 @@ With that token, if the holder tried to do a `PUT /bucket_5678/folder1/hello.txt
request, we would end up with the following facts:
```
user_id(#authority, "user_1234");
operation(#ambient, #write);
resource(#ambient, "bucket_5678", "/folder1/hello.txt");
current_time(#ambient, 2020-11-17T12:00:00+00:00);
owner(#authority, "user_1234", "bucket_1234");
owner(#authority, "user_1234", "bucket_5678");
owner(#authority, "user_ABCD", "bucket_ABCD");
right(#authority, "bucket_5678", "/folder1/hello.txt", #write);
user_id("user_1234");
operation("write");
resource("bucket_5678", "/folder1/hello.txt");
current_time(2020-11-17T12:00:00+00:00);
owner("user_1234", "bucket_1234");
owner("user_1234", "bucket_5678");
owner("user_ABCD", "bucket_ABCD");
right("bucket_5678", "/folder1/hello.txt", "write");
```
The verifier's policy would still succeed, but the check from block 1 would
fail because it cannot find `operation(#ambient, #read)`.
fail because it cannot find `operation("read")`.
By playing with the facts provided on the token and verifier sides, generating
data through rules, and restricting access with a series of checks, it is