Implementation Details
NOP tags and Alignment
Alignment refers to the position of an element with respect to the start of a sequence of LiteVector elements (the start of a message or beginning of a file for example).
Individual elements are unaligned. It is good form for vectors to be aligned to the size of their type. For example, a vector of 32-bit float values would have the first byte of the first element start at a 4-byte offset from the start of the message. This can be accomplished by inserting NOP
values in the stream before the vector's tag.
Aligning vectors allows deserializer code to optimize vector processing by processing vectors in place. If they are not aligned, then the receiver may need to perform additional work in order to use the vector data. However, serialization, deserialization, and alignment needs can change between applications, so alignment is not strictly required. Standard conformant deserializers should be able to deal with both aligned and unaligned vectors.
Standalone Integer Sizing
For arrays of integers, LiteVectors takes the relatively hands off approach of 'set a datatype and go to town' - all elements are the same size and type.
For standalone integers, however, different environments use different strategies. Some like C allow you to define an 'int' that adapts to the native word size of the hardware. Some JavaScript engines track whether a 'number' is an integer or not, and can fit it into about 2^53. Python 3 implements arbitrarily large integers.
In order to implement a normalized interoperable format and not be wasteful with space, LiteVectors recommends a Goldilocks 'best fit' integer encoding for standalone integers. The rules are simple:
- Non-negative integers are encoded as unsigned types
- Negative integers are encoded as signed types
- Integers are encoded in the smallest type that will hold them (among u8, u16, u32, u64, i8, i16, i32, i64).
By following these rules, most integers are stored relatively efficiently. Additionally, integer format is deterministic even between platforms with normally incompatible integer representations.
Encoders should goldilocks fit standalone integers, and decoders should be ready to deserialize them into whichever platform type is appropriate.
JSON Representation
LiteVectors support JSON representation for interoperability and diagnostics - JSON is often easier to read than hex.
LiteVector | JSON | JSON Example | Note |
---|---|---|---|
nil | null | JSON null | |
struct | object | { 'age': 2, 'cat': true } | JSON objects |
list | array | [1, "brown", "cat"] | JSON array |
string | string | "Hello JSON | |
bool | bool | true, false | |
(i,u)(8,16,32) | number | -100, 0, 200 | Integers are JSON numbers |
i64, u64 | string | "-100", "0", "200" | JSON string to avoid number overflow |
f32, f64 | number | 1.2, -5.9, "NaN, "Infinity" | JSON value is a number or one of the special strings "NaN", "Infinity" or "-Infinity" |
vector | array | [1, 2, 3, 4] | Vectors of primitives follow the same encoding above, but in a JSON array |