mirror of
https://github.com/digital-asset/daml.git
synced 2024-09-20 09:17:43 +03:00
document lf-value-json's encoding of various types (#2519)
* copy design draft for JSON LF encoding with minimal changes
* literal block formatting
* :: on its own
* only describe the variant notation we've chosen
* wrap
* mark some text literal
* correct reason why null is not a valid Unit
* alignment
* explain that the presence of type variables has no bearing on nested Optional encoding
* correct copyright header
* quotes and subscripts
* remove disallowed examples of variants
* positive? negative?
* you never know when JavaScript will ruin your day
* what's {} anyway
* we are talking about JSON, you know
parent 0d72f84fe8
commit f77e4229a2
438
ledger-service/lf-value-json/specification.rst
Normal file
@ -0,0 +1,438 @@
.. Copyright (c) 2019 The DAML Authors. All rights reserved.
.. SPDX-License-Identifier: Apache-2.0

DAML-LF JSON Encoding
=====================

We describe how to decode and encode DAML-LF values as JSON. For each
DAML-LF type we explain what JSON inputs we accept (decoding), and what
JSON output we produce (encoding).

The output format is parameterized by two flags::

    encodeDecimalAsString: boolean
    encodeInt64AsString: boolean

The suggested default for both of these flags is false. If the
intended recipient is written in JavaScript, however, note that the
JavaScript data model will decode JSON numbers as IEEE doubles,
discarding data in some cases; encoding as strings avoids this, as
mentioned with respect to ``JSON.parse`` below.

Note that throughout the document the decoding is type-directed. In
other words, the same JSON value can correspond to many DAML-LF values,
and the expected DAML-LF type is needed to decide which one.

ContractId
----------

Contract ids are expressed as their string representation::

    "123"
    "XYZ"
    "foo:bar#baz"

Decimal
-------

Input
~~~~~

Decimals can be expressed as JSON numbers or as JSON strings. Strings
must follow the same format as JSON numbers, and are treated as the
equivalent JSON number::

    -?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?

Note that JSON numbers would be enough to represent all
Decimals. However, many languages (most notably JavaScript) use IEEE
Doubles to express JSON numbers, and IEEE Doubles cannot express
DAML-LF Decimals correctly. Therefore, we also accept strings so that
JavaScript users can use them to specify Decimals that do not fit in
IEEE Doubles.

Numbers must be within the bounds of Decimal, [–(10³⁸–1)÷10¹⁰,
(10³⁸–1)÷10¹⁰]. Numbers outside those bounds will be rejected. Numbers
inside the bounds will always be accepted, using banker's rounding to
fit them within the precision supported by Decimal.

A few valid examples::

    42 --> 42
    42.0 --> 42
    "42" --> 42
    9999999999999999999999999999.9999999999 -->
        9999999999999999999999999999.9999999999
    -42 --> -42
    "-42" --> -42
    0 --> 0
    -0 --> 0
    0.30000000000000004 --> 0.3
    2e3 --> 2000

A few invalid examples::

    " 42 "
    "blah"
    99999999999999999999999999990
    +42

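The input rules above can be sketched in Python. This is only an
illustration, not the reference implementation: ``decode_decimal`` and
the constants are hypothetical names, and we assume the JSON text was
parsed with ``json.loads(text, parse_float=Decimal)`` so that numbers
arrive as ``int`` or ``Decimal`` rather than as lossy floats.

```python
import re
from decimal import ROUND_HALF_EVEN, Context, Decimal

# The JSON number grammar quoted above; string inputs must match it entirely.
JSON_NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")
# Largest Decimal, (10^38 - 1) / 10^10, written out so that it is exact.
MAX_DECIMAL = Decimal("9999999999999999999999999999.9999999999")
# Ten fractional digits; banker's rounding is ROUND_HALF_EVEN in Python.
SCALE = Decimal("0.0000000001")
ROUNDING = Context(prec=60, rounding=ROUND_HALF_EVEN)


def decode_decimal(value):
    """Decode a parsed JSON number (int or Decimal) or string (hypothetical helper)."""
    if isinstance(value, bool) or not isinstance(value, (int, str, Decimal)):
        raise ValueError(f"not a Decimal: {value!r}")
    if isinstance(value, str) and not JSON_NUMBER.fullmatch(value):
        raise ValueError(f"not a Decimal: {value!r}")
    number = Decimal(value)
    # copy_abs and comparisons are exact, unlike context-rounded arithmetic.
    if not number.is_finite() or number.copy_abs() > MAX_DECIMAL:
        raise ValueError(f"out of bounds for Decimal: {value!r}")
    # Bounds are checked first, then the value is rounded to ten places.
    return number.quantize(SCALE, context=ROUNDING)


# Collect the inputs that the sketch rejects.
rejected = []
for candidate in [" 42 ", "blah", 99999999999999999999999999990]:
    try:
        decode_decimal(candidate)
    except ValueError:
        rejected.append(candidate)
```

The bounds check runs before rounding, so a value just above the upper
bound is rejected rather than rounded down into range. ``+42`` does not
appear in the loop because it is not valid JSON and would already fail
in the JSON parser.
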
Output
~~~~~~

If encodeDecimalAsString is set, decimals are encoded as strings, using
the format ``-?[0-9]{1,28}(\.[0-9]{1,10})?``. If encodeDecimalAsString
is not set, they are encoded as JSON numbers, also using the format
``-?[0-9]{1,28}(\.[0-9]{1,10})?``.

Note that the flag encodeDecimalAsString is useful because it lets
JavaScript consumers consume Decimals safely with the standard
``JSON.parse``.

Int64
-----

Input
~~~~~

Int64s, much like Decimals, can be represented as JSON numbers or as
strings, with the string representation being ``[+-]?[0-9]+``. The
numbers must fall within [-9223372036854775808,
9223372036854775807]. Moreover, if represented as JSON numbers, they
must have no fractional part.

A few valid examples::

    42
    "+42"
    -42
    0
    -0
    9223372036854775807
    "9223372036854775807"
    -9223372036854775808
    "-9223372036854775808"

A few invalid examples::

    42.3
    +42
    9223372036854775808
    -9223372036854775809
    "garbage"
    " 42 "

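A minimal Python sketch of these checks follows. ``decode_int64`` is a
hypothetical name, and the sketch assumes the value comes from
``json.loads``. As a simplifying assumption it rejects every JSON number
that the parser produced as a float, which may be stricter than the
rule above requires.

```python
import re

# String form accepted on input; note the optional leading "+".
INT64_STRING = re.compile(r"[+-]?[0-9]+")
INT64_MIN = -(2**63)   # -9223372036854775808
INT64_MAX = 2**63 - 1  # 9223372036854775807


def decode_int64(value):
    """Decode a parsed JSON number or string into an Int64 (hypothetical helper)."""
    if isinstance(value, str):
        if not INT64_STRING.fullmatch(value):
            raise ValueError(f"not an Int64: {value!r}")
        number = int(value)
    elif isinstance(value, int) and not isinstance(value, bool):
        number = value
    else:
        # Assumption: anything json.loads parsed as a float is rejected here.
        raise ValueError(f"not an Int64: {value!r}")
    if not INT64_MIN <= number <= INT64_MAX:
        raise ValueError(f"out of bounds for Int64: {value!r}")
    return number


# Collect the inputs that the sketch rejects.
rejected = []
for candidate in [42.3, 9223372036854775808, -9223372036854775809, "garbage", " 42 "]:
    try:
        decode_int64(candidate)
    except ValueError:
        rejected.append(candidate)
```
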
Output
~~~~~~

If encodeInt64AsString is set, Int64s are encoded as strings, using the
format ``-?[0-9]+``. If encodeInt64AsString is not set, they are encoded as
JSON numbers, also using the format ``-?[0-9]+``.

Note that the flag encodeInt64AsString is useful because it lets
JavaScript consumers consume Int64s safely with the standard
``JSON.parse``.

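To see why the string encoding matters, the following sketch imitates
the JavaScript behaviour in Python, whose ``float`` is also an IEEE
double. ``parse_like_javascript`` is a hypothetical helper; it relies on
the standard ``parse_int`` hook of ``json.loads`` to turn every JSON
integer into a double.

```python
import json


def parse_like_javascript(text):
    """Parse JSON so that every number becomes an IEEE double (hypothetical helper)."""
    return json.loads(text, parse_int=float)


# The same Int64 value, encoded as a JSON number and as a JSON string.
as_number = parse_like_javascript("9223372036854775807")
as_string = parse_like_javascript('"9223372036854775807"')

# The number was silently rounded to 2^63; the string survived untouched.
print(int(as_number))  # 9223372036854775808
print(as_string)       # 9223372036854775807
```
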
Timestamp
---------

Input
~~~~~

Timestamps are represented as ISO 8601 strings, rendered using the
format ``yyyy-mm-ddThh:mm:ss[.ssssss]Z``::

    1990-11-09T04:30:23.1234569Z
    1990-11-09T04:30:23Z
    1990-11-09T04:30:23.123Z
    0001-01-01T00:00:00Z
    9999-12-31T23:59:59.999999Z

It's OK to omit the microsecond part partially or entirely. Sub-second
data beyond microseconds will be dropped. The UTC timezone designator
must be included. The rationale behind the inclusion of the timezone
designator is minimizing the risk that users pass in local times.

The timestamp must be between the bounds specified by DAML-LF and ISO
8601, [0001-01-01T00:00:00Z, 9999-12-31T23:59:59.999999Z].

JavaScript

::

    > new Date().toISOString()
    '2019-06-18T08:59:34.191Z'

Python

::

    >>> import datetime
    >>> datetime.datetime.utcnow().isoformat() + 'Z'
    '2019-06-18T08:59:08.392764Z'

Java

::

    import java.time.Instant;
    class Main {
        public static void main(String[] args) {
            Instant instant = Instant.now();
            // prints e.g. 2019-06-18T09:02:16.652Z
            System.out.println(instant.toString());
        }
    }

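Going the other way, a decoder for this input format might look like
the sketch below. ``decode_timestamp`` is a hypothetical name; the
sketch relies on Python's ``datetime`` covering exactly the years 0001
to 9999, and it truncates (rather than rounds) digits beyond
microseconds, matching "will be dropped" above.

```python
import re
from datetime import datetime, timezone

TIMESTAMP = re.compile(
    r"([0-9]{4})-([0-9]{2})-([0-9]{2})"
    r"T([0-9]{2}):([0-9]{2}):([0-9]{2})(?:\.([0-9]+))?Z"
)


def decode_timestamp(text):
    """Decode a timestamp string into an aware UTC datetime (hypothetical helper)."""
    match = TIMESTAMP.fullmatch(text)
    if match is None:
        raise ValueError(f"not a Timestamp: {text!r}")
    year, month, day, hour, minute, second = map(int, match.groups()[:6])
    # Keep at most six fractional digits and right-pad shorter fractions.
    fraction = (match.group(7) or "")[:6].ljust(6, "0")
    # datetime() itself rejects year 0000 and impossible dates or times.
    return datetime(year, month, day, hour, minute, second, int(fraction),
                    tzinfo=timezone.utc)
```
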
Output
~~~~~~

Timestamps are encoded as ISO 8601 strings, rendered using the format
``yyyy-mm-ddThh:mm:ss[.ssssss]Z``.

The sub-second part will be formatted as follows:

- If no sub-second part is present in the timestamp (i.e. the timestamp
  represents whole seconds), the sub-second part will be omitted
  entirely;
- If the sub-second part does not go beyond milliseconds, the sub-second
  part will be up to milliseconds, padding with trailing 0s if
  necessary;
- Otherwise, the sub-second part will be up to microseconds, padding
  with trailing 0s if necessary.

In other words, the encoded timestamp will either have no sub-second
part, a sub-second part of length 3, or a sub-second part of length 6.

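The three cases for the sub-second part can be sketched as follows.
``encode_timestamp`` is a hypothetical name, and the sketch assumes the
timestamp is available as a timezone-aware Python ``datetime``.

```python
from datetime import datetime, timezone


def encode_timestamp(ts):
    """Encode an aware datetime as yyyy-mm-ddThh:mm:ss[.ssssss]Z (hypothetical helper)."""
    if ts.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    ts = ts.astimezone(timezone.utc)
    # isoformat zero-pads the year to four digits and, with
    # timespec="seconds", leaves out the fractional part.
    base = ts.replace(tzinfo=None).isoformat(timespec="seconds")
    micros = ts.microsecond
    if micros == 0:
        fraction = ""                        # whole seconds: omit entirely
    elif micros % 1000 == 0:
        fraction = f".{micros // 1000:03d}"  # milliseconds: three digits
    else:
        fraction = f".{micros:06d}"          # microseconds: six digits
    return f"{base}{fraction}Z"
```
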
Party
-----

Parties are represented using their string representation, without any
additional quotes::

    "Alice"
    "Bob"

Unit
----

Represented as the empty object ``{}``. Note that in JavaScript ``{} !==
{}``; however, ``null`` would be ambiguous; for the type ``Optional
Unit``, ``null`` decodes to ``None``, but ``{}`` decodes to ``Some ()``.

Additionally, we think that this is the least confusing encoding for
Unit since unit is conceptually an empty record. We do not want to
imply that Unit is used similarly to null in JavaScript or None in
Python.

Date
----

Represented as an ISO 8601 date rendered using the format
``yyyy-mm-dd``::

    2019-06-18
    9999-12-31
    0001-01-01

The dates must be between the bounds specified by DAML-LF and ISO 8601,
[0001-01-01, 9999-12-31].

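A short Python sketch of the date check, assuming a hypothetical
``decode_date`` helper; Python's ``date`` type happens to cover exactly
the years 0001 to 9999, so its constructor enforces the bounds.

```python
import re
from datetime import date

DATE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def decode_date(text):
    """Decode a yyyy-mm-dd string into a date (hypothetical helper)."""
    match = DATE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a Date: {text!r}")
    year, month, day = map(int, match.groups())
    # date() rejects year 0000 and impossible dates such as 2019-02-30.
    return date(year, month, day)


# Collect the inputs that the sketch rejects.
rejected = []
for candidate in ["2019-06-18", "9999-12-31", "0001-01-01",
                  "0000-01-01", "2019-6-18", "2019-02-30"]:
    try:
        decode_date(candidate)
    except ValueError:
        rejected.append(candidate)
```
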
Text
----

Represented as strings.

Bool
----

Represented as booleans.

Record
------

Input
~~~~~

Records can be represented in two ways. As objects::

    { f₁: v₁, ..., fₙ: vₙ }

And as arrays::

    [ v₁, ..., vₙ ]

Note that DAML-LF record fields are ordered. So if we have

::

    record Foo = {f1: Int64, f2: Bool}

when representing the record as an array the user must specify the
fields in order::

    [42, true]

The motivation for the array format for records is to allow specifying
tuple types closer to how they look in DAML. Note that a DAML
tuple, e.g. ``(42, True)``, will be compiled to a DAML-LF record
``Tuple2 { _1 = 42, _2 = True }``.

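Both input forms can be normalized to the same shape. The sketch below
is illustrative only: ``decode_record`` is a hypothetical name, the
record type is stood in for by a plain list of field names in
declaration order, field values are left undecoded, and the shortcut
for omitted Optional fields (described under Optional) is not handled.

```python
def decode_record(field_names, value):
    """Normalize either JSON form of a record into a dict in field order.

    ``field_names`` is a hypothetical stand-in for the record type: the
    declared field names, in declaration order.
    """
    if isinstance(value, list):
        # Array form: positions are matched to fields by declaration order.
        if len(value) != len(field_names):
            raise ValueError("array form must list every field, in order")
        return dict(zip(field_names, value))
    if isinstance(value, dict):
        # Object form: key order in the JSON text does not matter.
        if set(value) != set(field_names):
            raise ValueError("object form must use exactly the declared fields")
        return {name: value[name] for name in field_names}
    raise ValueError(f"not a record: {value!r}")


foo_fields = ["f1", "f2"]  # record Foo = {f1: Int64, f2: Bool}
from_array = decode_record(foo_fields, [42, True])
from_object = decode_record(foo_fields, {"f2": True, "f1": 42})
```
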
Output
~~~~~~

Records are always encoded as objects.

List
----

Lists are represented as

::

    [v₁, ..., vₙ]

Map
---

Maps are represented as objects:

::

    { k₁: v₁, ..., kₙ: vₙ }

Optional
--------

Input
~~~~~

Optionals are encoded using ``null`` if the value is None, and with the
value itself if it's Some. However, this alone does not let us encode
nested optionals unambiguously. Therefore, nested Optionals are encoded
using an empty list for None, and a list with one element for Some. Note
that after the top-level Optional, all the nested ones must be
represented using the list notation.

A few examples, using the form

::

    JSON --> DAML-LF : Expected DAML-LF type

to make clear what the target DAML-LF type is::

    null    -->  None                   :  Optional Int64
    null    -->  None                   :  Optional (Optional Int64)
    42      -->  Some 42                :  Optional Int64
    []      -->  Some None              :  Optional (Optional Int64)
    [42]    -->  Some (Some 42)         :  Optional (Optional Int64)
    [[]]    -->  Some (Some None)       :  Optional (Optional (Optional Int64))
    [[42]]  -->  Some (Some (Some 42))  :  Optional (Optional (Optional Int64))
    ...

Finally, if Optional values appear in records, they can be omitted to
represent None. Given DAML-LF types

::

    record Depth1 = { foo: Optional Int64 }
    record Depth2 = { foo: Optional (Optional Int64) }

We have

::

    { }            -->  Depth1 { foo: None }            :  Depth1
    { }            -->  Depth2 { foo: None }            :  Depth2
    { foo: 42 }    -->  Depth1 { foo: Some 42 }         :  Depth1
    { foo: [42] }  -->  Depth2 { foo: Some (Some 42) }  :  Depth2
    { foo: null }  -->  Depth1 { foo: None }            :  Depth1
    { foo: null }  -->  Depth2 { foo: None }            :  Depth2
    { foo: [] }    -->  Depth2 { foo: Some None }       :  Depth2

Note that the shortcut for records and Optional fields does not apply to
Maps (which are also represented as objects), since a Map relies on the
absence of a key to determine what keys are present in the Map to begin
with. Nor does it apply to the ``[f₁, ..., fₙ]`` record form; ``Depth1
None`` in the array notation must be written as ``[null]``.

Type variables may appear in the DAML-LF language, but are always
resolved before deciding on a JSON encoding. So, for example, even
though ``Oa`` doesn't appear to contain a nested ``Optional``, it may
contain a nested ``Optional`` by virtue of substituting the type
variable ``a``::

    record Oa a = { foo: Optional a }

    { foo: 42 }    -->  Oa { foo: Some 42 }         :  Oa Int
    { }            -->  Oa { foo: None }            :  Oa Int
    { foo: [] }    -->  Oa { foo: Some None }       :  Oa (Optional Int)
    { foo: [42] }  -->  Oa { foo: Some (Some 42) }  :  Oa (Optional Int)

In other words, the correct JSON encoding for any LF value is the one
you get when you have eliminated all type variables.

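The type-directed handling of nested Optionals can be sketched as
follows. All names are hypothetical: types are modelled as the string
``"Int64"`` or a tuple ``("Optional", inner)``, Python's ``None`` stands
for DAML-LF ``None``, and a small ``Some`` class wraps present values.
The record shortcut is not covered by this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Some:
    value: object


def optional(inner):
    """Build the hypothetical type descriptor for ``Optional inner``."""
    return ("Optional", inner)


def is_optional(ty):
    return isinstance(ty, tuple) and ty[0] == "Optional"


def decode(ty, value):
    """Decode ``value`` against ``ty``, treating ``ty`` as top-level."""
    if not is_optional(ty):
        # Only Int64 leaves are supported in this sketch.
        if ty != "Int64" or isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"cannot decode {value!r} as {ty!r}")
        return value
    if value is None:
        return None
    inner = ty[1]
    if is_optional(inner):
        # Below the top-level Optional, the list notation is mandatory.
        return Some(decode_nested(inner, value))
    return Some(decode(inner, value))


def decode_nested(ty, value):
    """Decode a nested Optional type, written as [] or [x]."""
    if not isinstance(value, list) or len(value) > 1:
        raise ValueError(f"nested Optional must be [] or [x]: {value!r}")
    if not value:
        return None
    inner = ty[1]
    if is_optional(inner):
        return Some(decode_nested(inner, value[0]))
    return Some(decode(inner, value[0]))
```

For instance, ``decode(optional(optional("Int64")), [])`` yields
``Some(None)``, while the same JSON ``[]`` is rejected for
``optional("Int64")``; the expected type decides the outcome.
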
Output
~~~~~~

Encoded as described above, always applying the shortcut for None record
fields.

Variant
-------

Variants are expressed as

::

    { constructor: argument }

For example, if we have

::

    variant Foo = Bar Int64 | Baz Unit | Quux (Optional Int64)

These are all valid JSON encodings for values of type Foo::

    {"Bar": 42}
    {"Baz": {}}
    {"Quux": null}
    {"Quux": 42}

Note that DAML data types with named fields are compiled by factoring
out the record. So for example if we have

::

    data Foo = Bar {f1: Int64, f2: Bool} | Baz

We'll get in DAML-LF

::

    record Foo.Bar = {f1: Int64, f2: Bool}
    variant Foo = Bar Foo.Bar | Baz Unit

and then, from JSON

::

    {"Bar": {"f1": 42, "f2": true}}
    {"Baz": {}}

This can be encoded and used in TypeScript, including exhaustiveness
checking; see `a keyed example`_.

.. _a keyed example: https://www.typescriptlang.org/play/#src=type%20Foo%20%3D%0D%0A%20%20%20%20%7B%20Bar%3A%20%7B%20f1%3A%20number%2C%20f2%3A%20boolean%20%7D%20%7D%0D%0A%20%20%7C%20%7B%20Baz%3A%20%7B%20f3%3A%20string%20%7D%20%7D%3B%0D%0A%0D%0Afunction%20test(v%3A%20Foo)%20%7B%0D%0A%20%20if%20(%22Bar%22%20in%20v)%20%7B%0D%0A%20%20%20%20console.log(v.Bar.f1%2C%20v.Bar.f2)%3B%0D%0A%20%20%7D%20else%20if%20(%22Baz%22%20in%20v)%20%7B%0D%0A%20%20%20%20console.log(v.Baz.f3)%3B%0D%0A%20%20%7D%20else%20%7B%0D%0A%20%20%20%20const%20_%3A%20never%20%3D%20v%3B%0D%0A%20%20%7D%0D%0A%7D%20%0D%0A

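On the decoding side, the single-key object form can be checked with a
few lines of Python. ``decode_variant`` is a hypothetical name, the
variant type is stood in for by a tuple of constructor names, and the
argument is returned undecoded.

```python
def decode_variant(constructors, value):
    """Split { constructor: argument } into a (constructor, argument) pair.

    ``constructors`` is a hypothetical stand-in for the variant type: the
    collection of its constructor names.
    """
    if not isinstance(value, dict) or len(value) != 1:
        raise ValueError("a variant is an object with exactly one key")
    [(name, argument)] = value.items()
    if name not in constructors:
        raise ValueError(f"unknown constructor: {name!r}")
    return name, argument


FOO = ("Bar", "Baz", "Quux")  # variant Foo = Bar ... | Baz ... | Quux ...

# Collect the inputs that the sketch rejects.
rejected = []
for candidate in [{"Bar": 42}, {"Baz": {}}, {"Quux": None},
                  {}, {"Bar": 42, "Baz": {}}, {"Nope": 1}, "Bar"]:
    try:
        decode_variant(FOO, candidate)
    except ValueError:
        rejected.append(candidate)
```
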
Enum
----

Enums are represented as strings. So if we have

::

    enum Foo = Bar | Baz

There are exactly two valid JSON values for Foo, ``"Bar"`` and ``"Baz"``.