Type Mapping Rules
Because MessagePack uses a schema-less, polymorphic type system, and Go is a strongly-typed language, any Go implementation of MessagePack serialization will have to make choices about how Go types map onto MessagePack types, and vice-versa. This document aims to explain the rules that `msgp` uses, and the justifications behind them.
`msgp` always attempts to encode Go values in the smallest wire representation possible without any loss in numerical precision. For example, even though a Go `int` is 64 bits on 64-bit hardware, the encoding of `int(5)` is one byte on the wire, and it will still be `5` when decoded.
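To make the size rule concrete, here is a minimal sketch that picks the narrowest signed integer format listed in the MessagePack specification and reports its size in bytes. The helper `encodedIntSize` is hypothetical and is not part of `msgp`; it only assumes the format sizes from the specification.

```go
package main

import "fmt"

// encodedIntSize is a hypothetical helper (not part of msgp). It reports how
// many bytes a signed integer occupies when the narrowest signed MessagePack
// format is chosen, using the format sizes from the MessagePack specification.
func encodedIntSize(v int64) int {
	switch {
	case v >= -32 && v <= 127:
		return 1 // positive or negative fixint: the value lives in the tag byte
	case v >= -128 && v <= 127:
		return 2 // int8: tag byte + 1 byte
	case v >= -32768 && v <= 32767:
		return 3 // int16: tag byte + 2 bytes
	case v >= -2147483648 && v <= 2147483647:
		return 5 // int32: tag byte + 4 bytes
	default:
		return 9 // int64: tag byte + 8 bytes
	}
}

func main() {
	for _, v := range []int64{5, -100, 1000, 1 << 40} {
		fmt.Printf("%d -> %d byte(s)\n", v, encodedIntSize(v))
	}
}
```

The Go type of the value never enters the calculation; only its magnitude does, which is why `int(5)` and `int64(5)` would occupy the same single byte.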
As a consequence of this rule, `msgp` will never let you decode a value that would overflow the object you are decoding into. For instance, if you use `msgp.ReadInt16Bytes()` or `(*Reader).ReadInt16()` to read out an integer value, the method will only succeed if the value of the integer is between `math.MinInt16` and `math.MaxInt16`. For clarity's sake, here is the actual code for `(*Reader).ReadInt16()`:
    // ReadInt16 reads an int16 from the reader
    func (m *Reader) ReadInt16() (i int16, err error) {
        var in int64
        in, err = m.ReadInt64()
        if in > math.MaxInt16 || in < math.MinInt16 {
            err = IntOverflow{Value: in, FailedBitsize: 16}
            return
        }
        i = int16(in)
        return
    }
Note that the methods that read `int64` values never return overflow errors, since MessagePack does not support integers wider than 64 bits.
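The same range check can be tried in isolation. The sketch below is a standalone illustration, not the library's code: `narrowToInt16` and `overflowError` are hypothetical names that stand in for the read method and its overflow error.

```go
package main

import (
	"fmt"
	"math"
)

// overflowError is a hypothetical stand-in for an overflow error type.
type overflowError struct {
	Value   int64
	Bitsize int
}

func (e overflowError) Error() string {
	return fmt.Sprintf("%d overflows int%d", e.Value, e.Bitsize)
}

// narrowToInt16 converts a decoded 64-bit value to int16, refusing any value
// outside [math.MinInt16, math.MaxInt16] instead of silently truncating it.
func narrowToInt16(v int64) (int16, error) {
	if v > math.MaxInt16 || v < math.MinInt16 {
		return 0, overflowError{Value: v, Bitsize: 16}
	}
	return int16(v), nil
}

func main() {
	for _, v := range []int64{5, 40000} {
		n, err := narrowToInt16(v)
		fmt.Println(n, err)
	}
}
```

A plain `int16(40000)` conversion would wrap around silently; returning an error instead is what keeps a decode from corrupting data.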
Tip: Use `int`/`uint` or `int64`/`uint64` when you cannot be sure of the magnitude of the encoded value.
`msgp` will always encode a Go `float32` as a 32-bit IEEE-754 float, and a `float64` as a 64-bit IEEE-754 float.
When decoding, it is legal to decode a 32-bit float on the wire as a Go `float64`, but the opposite is illegal. This is to avoid the possibility of losing numerical precision.
Using the `//msgp:compactfloats` file directive, or `msgp.AppendFloat`/`(*Writer).WriteFloat`, will store `float64` values as `float32` whenever that can be done without precision loss.
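One common way to test "without precision loss" is a round trip through `float32`. The sketch below shows that check under the assumption that this is the criterion meant here; `fitsFloat32` is a hypothetical helper, not a `msgp` function.

```go
package main

import "fmt"

// fitsFloat32 is a hypothetical helper. It reports whether f survives a
// round trip through float32 unchanged, i.e. whether narrowing is lossless.
// NaN compares unequal to itself, so this sketch reports it as not fitting.
func fitsFloat32(f float64) bool {
	return float64(float32(f)) == f
}

func main() {
	fmt.Println(fitsFloat32(0.5)) // true: 0.5 is exactly representable in 32 bits
	fmt.Println(fitsFloat32(0.1)) // false: 0.1 needs the extra float64 precision
}
```

Values such as small integers and powers of two shrink to four bytes of payload, while most decimal fractions keep the full eight.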
Tip: Don't mix-and-match; pick either `float32` or `float64` and use it everywhere.
`msgp` will not allow a value that is a `uint` on the wire to be decoded into an `int`, and vice-versa.
The justification behind this is to prevent applications from failing sporadically because, for example, one implementation encodes `int` values and the other decodes `uint` values. Those types are not strictly compatible, so the mismatch is treated as a type error.
(This is unlike the floating-point conversion rules, as neither `uint` nor `int` is a strict sub- or super-set of the other.)
Tip: Use mostly signed integers.
Like JSON, MessagePack has no notion of strongly-typed data structures. `msgp` encodes Go `struct` objects as MessagePack maps by default, but it can also encode them as tuples (ordered arrays). Decoding maps into Go `struct`s can present some peculiar edge cases.
`msgp` does not support decoding maps with keys that are not "string-able" (either `str` or `bin` type, although `str` is preferred).
You can still manually decode arbitrary maps with the primitives built into the library.
The generated implementations of `msgp.Unmarshaler` and `msgp.Decodable` decode the intersection of the map being decoded and the map represented by the struct. One of the most important consequences of this is that it is perfectly valid for a decode operation to not mutate the method receiver and return no error.
For example, let's assume we have the following type:
    type Thing struct {
        Name  string  `msg:"name"`
        Value float64 `msg:"value"`
    }
The following objects would all be legal to decode for the `Thing` type:
    {}                           // the object is not mutated
    {"name":"bob"}               // only "name" is mutated
    {"name":"bob","value":0.0}   // both "name" and "value" are mutated
    {"name":"bob","uncle":"joe"} // "name" is mutated; "uncle" is ignored
Users should take care to reset the values of objects that are repeatedly decoded in order to avoid conflating a previously decoded value with a new one.
The advantage of such a "forgiving" decoding algorithm is that the user can change `struct` definitions in production and still maintain some level of backwards-compatibility with previously-encoded values.
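The stale-value pitfall is easy to reproduce without the library. In the sketch below, `applyFields` is a hypothetical stand-in for a generated decoder: it copies only the keys it recognizes and leaves every other field untouched. A real decoder would report a type mismatch as an error, which this sketch skips for brevity.

```go
package main

import "fmt"

type Thing struct {
	Name  string
	Value float64
}

// applyFields is a hypothetical stand-in for a generated decoder. It mutates
// only the fields whose keys appear in m; all other fields keep their
// previous contents. Type mismatches are ignored here for brevity.
func applyFields(t *Thing, m map[string]interface{}) {
	for k, v := range m {
		switch k {
		case "name":
			if s, ok := v.(string); ok {
				t.Name = s
			}
		case "value":
			if f, ok := v.(float64); ok {
				t.Value = f
			}
		default:
			// Unknown keys such as "uncle" are skipped.
		}
	}
}

func main() {
	var t Thing
	applyFields(&t, map[string]interface{}{"name": "bob", "value": 1.5})
	applyFields(&t, map[string]interface{}{"name": "alice"})
	fmt.Printf("%+v\n", t) // Value still holds 1.5 from the first decode

	t = Thing{} // reset before reusing the object
	applyFields(&t, map[string]interface{}{"name": "alice"})
	fmt.Printf("%+v\n", t) // Value is back to its zero value
}
```

Assigning the zero value (`t = Thing{}`) before each decode is the simplest way to avoid mixing data from two different messages.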
If you use the tuple encoding directive for a struct, it will be encoded as a list of its fields rather than a map. Structs encoded/decoded this way are only compatible with lists of the same size and constituent types.
Tuple encoding is faster and stricter (and therefore "safer") than map encoding, but comes at the cost of backwards-compatibility.
In Go, maps and slices can both have `nil` values. However, `msgp` will never encode a `nil` map or slice as a MessagePack `nil`; instead, it encodes them as a zero-length map and a zero-length array, respectively.
To allow maps and slices to be represented as `nil`, use the `allownil` tag. This will allow "clean" roundtrips for Go slices and maps, but may be problematic for other languages to understand.
The only default types that are encoded as MessagePack `null` are pointers and `interface{}`.
Decoding a `null` object into anything yields a `TypeError`.
Timestamps were implemented in this library before MessagePack added an official timestamp extension, so the library uses its own extension for them (Extension #5).
From version 1.24, this package can read either format as input, which makes exchanging data with other platforms easier.
To write cross-compatible timestamps in the official format, add the file directive `//msgp:newtime` on a line of its own to generate output using the official `-1` timestamp extension.
If you are implementing your own writer, use `(*Writer).WriteTimeExt` or `AppendTimeExt` to add fields in the compatible format.
Encoding and decoding of `json.Number` is possible, either as struct members or as interface members.
Numbers are encoded as integers if possible; otherwise `float64` is used. The zero value of `json.Number` is encoded as 0.
It is possible to encode them as strings instead with `//msgp:replace json.Number with:string`.
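The "integer if possible, otherwise float64" choice for `json.Number` can be sketched with the standard library alone. `classify` is a hypothetical helper, not part of `msgp`, and it does not treat unsigned values above `math.MaxInt64` specially; those simply fall through to `float64` here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// classify is a hypothetical helper. It returns an int64 when the number
// parses as an integer, a float64 otherwise, and 0 for the zero value.
func classify(n json.Number) (interface{}, error) {
	if n == "" {
		return int64(0), nil // the zero value json.Number is treated as 0
	}
	if i, err := n.Int64(); err == nil {
		return i, nil
	}
	f, err := n.Float64()
	if err != nil {
		return nil, err
	}
	return f, nil
}

func main() {
	for _, n := range []json.Number{"", "42", "1.5"} {
		v, err := classify(n)
		fmt.Printf("%q -> %T %v (err=%v)\n", n, v, v, err)
	}
}
```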
Fields of type `time.Duration` are encoded as signed integers.
This means that a field encoded this way will come back as a number representing nanoseconds.
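Since `time.Duration` is defined in the standard library as an `int64` count of nanoseconds, the conversion in both directions is a plain type conversion. A short sketch (the helper names are hypothetical):

```go
package main

import (
	"fmt"
	"time"
)

// durationToWire returns the signed integer that represents d: its
// nanosecond count.
func durationToWire(d time.Duration) int64 {
	return int64(d)
}

// wireToDuration rebuilds a Duration from a decoded nanosecond count.
func wireToDuration(n int64) time.Duration {
	return time.Duration(n)
}

func main() {
	wire := durationToWire(1500 * time.Millisecond)
	fmt.Println(wire)                 // 1500000000
	fmt.Println(wireToDuration(wire)) // 1.5s
}
```

A consumer in another language sees only the integer, so the nanosecond unit has to be agreed on out of band.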
The following concrete types are legal to encode with methods that take `interface{}`:
- `int{8,16,32,64}`, `uint{8,16,32,64}`, `complex{64,128}`, `time.Time`, `string`, `[]byte`, `float{32,64}`, `map[string]interface{}`, `map[string]string`, `nil`
- A pointer to one of the above
- A type that satisfies the `msgp.Encoder` or `msgp.Marshaler` interface, depending on method (defer to the documentation of the method in question)
- A type that satisfies the `msgp.Extension` interface
When using decoding methods that return `interface{}`, the following types will be returned depending on the MessagePack encoding (MessagePack type -> Go type):
- `uint` -> `uint64`
- `int` -> `int64`
- `bin` -> `[]byte`
- `str` -> `string`
- `map` -> `map[string]interface{}`
- `array` -> `[]interface{}`
- `float32` -> `float32`
- `float64` -> `float64`
- `ext` -> `time.Time`, `complex64`, `complex128`, `msgp.RawExtension`, or a registered extension type
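Code that consumes such `interface{}` values typically uses a type switch over exactly the Go types listed above. The following sketch assumes only that mapping; `describe` is a hypothetical helper, and `msgp.RawExtension` and registered extension types are left to the default branch so that the example needs nothing beyond the standard library.

```go
package main

import (
	"fmt"
	"time"
)

// describe is a hypothetical helper. It names the MessagePack type that,
// per the mapping above, would have produced the decoded value v.
func describe(v interface{}) string {
	switch v.(type) {
	case uint64:
		return "uint"
	case int64:
		return "int"
	case []byte:
		return "bin"
	case string:
		return "str"
	case map[string]interface{}:
		return "map"
	case []interface{}:
		return "array"
	case float32:
		return "float32"
	case float64:
		return "float64"
	case time.Time, complex64, complex128:
		return "ext"
	default:
		return "unhandled"
	}
}

func main() {
	fmt.Println(describe(int64(5)))  // int
	fmt.Println(describe(uint64(5))) // uint
	fmt.Println(describe(5))         // unhandled: a plain int is never returned
}
```

The last call illustrates a frequent mistake: asserting a decoded integer to `int` fails, because integers always come back as `int64` or `uint64`.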
Tagging a field on a struct with `msg:"fieldname,omitempty"` will cause the code generator to emit additional code to check if the field is empty (Go zero value) before writing it. The behavior of this option generally attempts to emulate that of the `encoding/json` package. See Zero Values; Omitempty and Allownil for more information.
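Because the option is described as emulating `encoding/json`, the standard library gives a quick way to see the intended effect. The sketch below uses `json` tags and `json.Marshal` as an analogy only; with `msgp` the tag key would be `msg`, and the output would be MessagePack rather than JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Thing mirrors the earlier example, with omitempty on Value.
type Thing struct {
	Name  string  `json:"name"`
	Value float64 `json:"value,omitempty"`
}

// encode marshals t to JSON and returns it as a string.
func encode(t Thing) string {
	b, err := json.Marshal(t)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(encode(Thing{Name: "bob"}))             // {"name":"bob"}
	fmt.Println(encode(Thing{Name: "bob", Value: 1.5})) // {"name":"bob","value":1.5}
}
```

Note that a field omitted on encode is absent from the map on decode, so the receiver keeps whatever that field held before.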