# NetText

A text-based data format for cryptographic network protocols.

## Principles

- Only uses a limited subset of ASCII characters
- Has a minimal set of fundamental data types
- Retains the raw representation of complex data structures for hashing and cryptographic signing
- Minimal value data type: a string type that can only be used to represent identifiers, numbers and base64-encoded byte strings.

## Fundamental types

A term can be of any of the following kinds:

- a string, which may contain only ASCII alphanumeric characters and `.-_*?`
- a dict, which maps strings (as defined above) to terms of any kind
- a list, which is a consecutive sequence of at least 2 strings or dicts (can be mixed), simply separated by whitespace

Dicts are represented as follows:

```
{
    key1 = value1,
    key2 = value2
}
```

Lists are represented as follows:

```
term1 term2 term3
```

As a consequence, complex data structures can be defined as follows:

```
SENDTO alex {
    topic = blah,
    body = blah blah
}
```

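
As an illustration, the example above could be produced by a small encoder. The following Python sketch is not part of NetText itself: the function name `encode`, the mapping of Python `str`/`dict`/`list` values to terms, and the four-space indentation are all assumptions made for this example.

```python
import re

# Characters allowed in a string term: ASCII alphanumerics and . - _ * ?
STRING_RE = re.compile(r"[A-Za-z0-9.\-_*?]+")


def encode(value, indent=0):
    """Encode a Python str, dict or list as text (hypothetical helper)."""
    if isinstance(value, str):
        if not STRING_RE.fullmatch(value):
            raise ValueError(f"invalid string term: {value!r}")
        return value
    if isinstance(value, dict):
        if not value:
            return "{}"
        pad = " " * (indent + 4)
        entries = []
        for key, item in value.items():
            if not isinstance(key, str):
                raise TypeError("dict keys must be strings")
            entries.append(f"{pad}{encode(key)} = {encode(item, indent + 4)}")
        return "{\n" + ",\n".join(entries) + "\n" + " " * indent + "}"
    if isinstance(value, (list, tuple)):
        if len(value) < 2:
            raise ValueError("a list needs at least 2 items")
        if any(isinstance(item, (list, tuple)) for item in value):
            raise ValueError("list items must be strings or dicts")
        return " ".join(encode(item, indent) for item in value)
    raise TypeError(f"cannot encode {type(value).__name__}")


message = encode(["SENDTO", "alex", {"topic": "blah", "body": ["blah", "blah"]}])
print(message)
```

The checks mirror the rules stated above: strings are validated against the allowed character set, and a list must have at least two items, none of which is itself a list.
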
The raw representation of a parsed dict or list is retained for hashing purposes.
It is the sequence of bytes in the encoded string, trimmed of whitespace at both ends,
that represents the encoded dict or list.

In the complex example above, here are the lists and dicts and their raw representations:

- the toplevel term is a list, whose raw representation is the entire encoded string (assuming no whitespace at the beginning or end)
- the third term of the list is a dict, whose raw representation starts at `{` and ends at `}`
- the value of the second mapping of the dict is a list, whose raw representation is exactly `blah blah`.

Since strings cannot contain whitespace, they are always equivalent to their raw representation.

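
The rules in this section can be sketched as a small recursive-descent parser that keeps the raw slice of every term. This is a minimal Python sketch under stated assumptions, not a reference implementation: the `Term` class and the function names are hypothetical, and the sketch tolerates a trailing comma in dicts, which the text above does not specify.

```python
import re
from dataclasses import dataclass

STRING_RE = re.compile(r"[A-Za-z0-9.\-_*?]+")


@dataclass
class Term:
    kind: str      # "string", "dict" or "list"
    raw: str       # exact slice of the input, without surrounding whitespace
    value: object  # str, dict of str -> Term, or list of Term


def skip_ws(src, pos):
    while pos < len(src) and src[pos].isspace():
        pos += 1
    return pos


def parse_item(src, pos):
    """Parse one string or dict starting at pos; return (term, next_pos)."""
    if src[pos:pos + 1] == "{":
        start, entries = pos, {}
        pos = skip_ws(src, pos + 1)
        while src[pos:pos + 1] != "}":
            key = STRING_RE.match(src, pos)
            if key is None:
                raise ValueError(f"expected a key at offset {pos}")
            pos = skip_ws(src, key.end())
            if src[pos:pos + 1] != "=":
                raise ValueError(f"expected '=' at offset {pos}")
            entries[key.group()], pos = parse_term(src, pos + 1)
            if src[pos:pos + 1] == ",":
                pos = skip_ws(src, pos + 1)
            elif src[pos:pos + 1] != "}":
                raise ValueError(f"expected ',' or '}}' at offset {pos}")
        return Term("dict", src[start:pos + 1], entries), pos + 1
    match = STRING_RE.match(src, pos)
    if match is None:
        raise ValueError(f"expected a term at offset {pos}")
    return Term("string", match.group(), match.group()), match.end()


def parse_term(src, pos=0):
    """Parse a string, dict or list; stop at ',', '}' or end of input."""
    pos = skip_ws(src, pos)
    start, end, items = pos, pos, []
    while pos < len(src) and src[pos] not in ",}":
        item, end = parse_item(src, pos)
        items.append(item)
        pos = skip_ws(src, end)
    if not items:
        raise ValueError(f"expected a term at offset {start}")
    if len(items) == 1:
        return items[0], pos
    # A list's raw form runs from its first item to the end of its last item.
    return Term("list", src[start:end], items), pos


def parse(src):
    term, pos = parse_term(src)
    if pos != len(src):
        raise ValueError(f"unexpected {src[pos]!r} at offset {pos}")
    return term


source = "SENDTO alex {\n    topic = blah,\n    body = blah blah\n}"
top = parse(source)
print(top.kind, repr(top.value[2].value["body"].raw))
```

Because every `Term` carries its `raw` slice, a caller can hash or sign exactly the bytes that were received, for instance with `hashlib.sha256(term.raw.encode("ascii"))`; the choice of hash function here is only illustrative.
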
## Structural mappings

Terms can be interpreted in a number of different ways, depending on the context:

- RAW: the term is interpreted as its raw encoding (see above)
- STRING: if the term is a string or a list composed exclusively of strings, the term is interpreted as its raw encoding
- VARIANT: if the term is a list whose first item is a string, it is interpreted as a variant with the following properties:
  - a discriminator (the first item)
  - a value, which is either the second item if there are only two items, or the list composed of all items starting from the second if there are more than two
- DICT: if the term is a dict, interpret it as such
- LIST: if the term is a string or a dict, interpret it as a list composed of that single term. Otherwise, the term is a list and is interpreted as the list of its items.

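
The structural mappings above can be sketched as plain functions over parsed terms. In this Python sketch the `Term` class, the `string` and `make_list` constructors, and the `as_*` function names are all hypothetical; the way the raw form of a variant's multi-item value is derived is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass
class Term:
    kind: str      # "string", "dict" or "list"
    raw: str       # raw representation of the term
    value: object  # str, dict of str -> Term, or list of Term


def string(text):
    return Term("string", text, text)


def make_list(*items):
    # Hand-built list term; a real parser would slice raw from its input.
    return Term("list", " ".join(item.raw for item in items), list(items))


def as_raw(term):
    return term.raw


def as_string(term):
    if term.kind == "string":
        return term.raw
    if term.kind == "list" and all(i.kind == "string" for i in term.value):
        return term.raw
    raise TypeError("not a string or a list of strings")


def as_variant(term):
    if term.kind != "list" or term.value[0].kind != "string":
        raise TypeError("not a list starting with a string")
    head, rest = term.value[0], term.value[1:]
    if len(rest) == 1:
        return head.value, rest[0]
    # Assumption: the list's raw form starts with the discriminator, so what
    # follows it (minus leading whitespace) is the raw form of the value.
    tail_raw = term.raw[len(head.raw):].lstrip()
    return head.value, Term("list", tail_raw, rest)


def as_dict(term):
    if term.kind != "dict":
        raise TypeError("not a dict")
    return term.value


def as_list(term):
    if term.kind in ("string", "dict"):
        return [term]
    return term.value


msg = make_list(string("SENDTO"), string("alex"), string("hello"))
tag, value = as_variant(msg)
print(tag, repr(value.raw))
```

Each function raises `TypeError` when the term does not fit the requested interpretation, so the caller chooses the mapping and the term is never coerced silently.
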
## Data mappings

Terms further have mappings as different data types:

- BYTES: if the term maps as a STRING, decode it using base64
- INT: if the term maps as a STRING, decode it as an integer written in decimal notation
- HASH, PUBKEY, SECKEY, SIGNATURE, ENCKEY, DECKEY, SYMKEY: types that interpret BYTES as specific cryptographic items

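
The BYTES and INT mappings could look like the following Python sketch. Several points are assumptions rather than statements from this document: the function names are hypothetical; the URL-safe base64 alphabet without `=` padding is inferred from the fact that `+`, `/` and `=` are not valid string characters; whitespace inside a multi-string STRING is dropped before decoding; and a leading `-` is accepted for negative integers.

```python
import base64
import re


def encode_bytes(data: bytes) -> str:
    # URL-safe alphabet, padding removed so the result is a valid string term.
    return base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")


def decode_bytes(string_repr: str) -> bytes:
    # Assumption: whitespace between the strings of a list is not significant.
    compact = "".join(string_repr.split())
    padding = "=" * (-len(compact) % 4)
    return base64.urlsafe_b64decode(compact + padding)


def decode_int(string_repr: str) -> int:
    # Stricter than int(): no underscores, signs other than '-', or spaces.
    if not re.fullmatch(r"-?[0-9]+", string_repr):
        raise ValueError(f"not a decimal integer: {string_repr!r}")
    return int(string_repr)


encoded = encode_bytes(b"hello")
print(encoded, decode_bytes(encoded), decode_int("-42"))
```

The cryptographic mappings (HASH, PUBKEY and so on) would then wrap the result of `decode_bytes` in whatever key or digest types the chosen cryptographic library provides.
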