ext family encoding support
+semver:minor
deltics committed Mar 18, 2024
1 parent bc63f16 commit 0bb2f0e
Showing 21 changed files with 1,013 additions and 402 deletions.
40 changes: 22 additions & 18 deletions README.md
@@ -15,19 +15,23 @@

# blugnu/msgpack

Provides an efficient implementation of an encoder that may be used to stream structured data to an `io.Writer` in [`msgpack`](https://msgpack.org) format.
An implementation of a msgpack encoder that may be used to stream structured data to an `io.Writer` in [`msgpack`](https://msgpack.org) format.

## Using the Encoder

A new `Encoder` is obtained using `NewEncoder()`, supplying an initial `io.Writer` to which the encoder output is sent. To avoid allocations of encoders when encoding to various outputs, an existing `Encoder` may be retargeted to a different `io.Writer` using the `SetWriter()` method. To temporarily redirect output to a different `io.Writer`, the `Using()` method may be used.
A new `Encoder` is obtained using `NewEncoder()`, supplying an initial `io.Writer` to which the encoder output is sent. To avoid allocations of encoders when encoding to various outputs an existing `Encoder` may be retargeted to a different `io.Writer` using the `SetWriter()` method. To temporarily redirect output to a different `io.Writer`, the `Using()` method may be used.

`Encoder` offers high and low-level encoding functions to cater for a wide range of encoding scenarios.

The `Encode(any)` method will encode an `any` value in the most efficient manner possible according to the underlying type. There is a small overhead using this method, due to the need to type-switch on the supplied value to determine the appropriate encoding method.
The `Encode(any)` method will encode an `any` value in the most efficient manner possible according to the underlying type. There is a small overhead using this method due to the need to type-switch on the supplied value to determine the appropriate encoding method.

For more efficient encoding, when streaming values of known types, type-specific encoder methods may be used directly (_`EncodeBool()`, `EncodeString()` etc_) to avoid this type-switch.
> _NOTE: currently the `Encoder` will not attempt to reflect fields of struct types and will panic if encoding a struct is attempted (or is attempting to encode any other unsupported type)_
Whichever encoder method is used, _all_ determine the most efficient encoding possible for the values they are given.


For more efficient encoding when streaming values of known types, type-specific encoder methods may be used directly (_`EncodeBool()`, `EncodeString()` etc_) to avoid this type-switch.

Whichever encoder method is used, _all_ encoding methods determine the most efficient encoding possible for the values they are given.

e.g. `Encode()`, `EncodeInt()`, `EncodeInt16()`, `EncodeInt32()` will encode the following values as described:

@@ -40,24 +44,22 @@ e.g. `Encode()`, `EncodeInt()`, `EncodeInt16()`, `EncodeInt32()` will encode the

## Error Handling

If an error is returned from the `io.Writer` when encoding a value the error is returned but is also captured on the `Encoder`.
If a non-`nil` error is returned from the `io.Writer` when encoding a value, the error is returned and is also captured on the `Encoder`, placing it in error state. Any further encoder calls will return this captured error _without attempting to encode any further information to the `io.Writer`_. The `Encoder` remains in this error state until the `ClearErr()` method is called.

Any further encoder calls will return this captured error without attempting to encode any further information to the `io.Writer`. The `Encoder` remains in this error state until the `ResetError()` method is called.
`ClearErr()` returns and clears (sets `nil`) any currently captured error.

`ResetError()` returns and clears (sets `nil`) any currently captured error.

This enables error handling to be simplified by deferring a single error check to the end of compound encoding statements.
This enables error handling to be simplified by ignoring errors returned from a series of encoding statements and performing a `ClearErr()` check at the end.

i.e. instead of:

```go
if err := enc.EncodeString("id"); err != nil {
if err := enc.Encode("id"); err != nil {
return err
}
if err := enc.Encode(id); err != nil {
return err
}
if err := enc.EncodeString("name"); err != nil {
if err := enc.Encode("name"); err != nil {
return err
}
if err := enc.Encode(name); err != nil {
@@ -68,16 +70,18 @@ i.e. instead of:
You may instead use:

```go
_ = enc.EncodeString("id")
_ = enc.Encode("id")
_ = enc.Encode(id)
_ = enc.EncodeString("name")
_ = enc.Encode("name")
_ = enc.Encode(name)
return enc.ResetError()
return enc.ClearErr()
```

> The returned error must be explicitly ignored to avoid lint problems when using this approach.
Although convenient this approach is less efficient when an error occurs; when there is no error the difference is negligible.
Although convenient, this approach is less efficient when an error occurs; when there is no error the difference is negligible.

> _If required, any current error on an encoder may be queried without clearing the error state by use of the `Err()` method_
## `EncodeArray[T]()` / `EncodeMap[K, V]()`
These generic functions are provided to encode slices and maps.
@@ -88,7 +92,7 @@ These are not `Encoder` _methods_ but are first order functions accepting an `En
In addition to an `Encoder`, both functions accept a `slice` or `map` to be encoded and an optional encoder function. The encoder function is called for each item in the slice or map to encode that item. If `nil` is specified for this function then a default encoder function is assumed, encoding items using the high-level `Encode()` method of the supplied `Encoder`.

More efficient encoding may be achieved by supplying a function which uses encoder methods appropriate to the types/values involved (to avoid type-switching in the `Encode()` method).
More efficient encoding may be achieved by supplying a function which uses encoder methods appropriate to the type of the slice elements or map keys/values (to avoid type-switching in the `Encode()` method).

### Slices, Maps and Errors
If an `io.Writer` error occurs while writing the items in a slice or map, the encoder stops processing any further items and returns immediately from the `EncodeArray()` or `EncodeMap()` function.
@@ -123,4 +127,4 @@ If the supplied function returns an error, the encoder is retargeted to the orig

_**Not currently implemented.**_

These may be implemented in the future, but at this time this module provides only an Encoder.
These may be implemented in the future, but at this time this module provides only an encoder for manual encoding of values.
26 changes: 13 additions & 13 deletions benchmarks/benchmarks_test.go
@@ -30,7 +30,7 @@ import (

func Benchmark(b *testing.B) {
b.Run("encode(256)", func(b *testing.B) {
enc := msgpack.NewEncoder(io.Discard)
enc, _ := msgpack.NewEncoder(io.Discard)

b.ResetTimer()
b.RunParallel(func(pb *testing.PB) {
@@ -40,7 +40,7 @@ func Benchmark(b *testing.B) {
})
})
b.Run("encodeint(256)", func(b *testing.B) {
enc := msgpack.NewEncoder(io.Discard)
enc, _ := msgpack.NewEncoder(io.Discard)

b.ResetTimer()
b.RunParallel(func(pb *testing.PB) {
@@ -50,7 +50,7 @@ func Benchmark(b *testing.B) {
})
})
b.Run("encodeint16(256)", func(b *testing.B) {
enc := msgpack.NewEncoder(io.Discard)
enc, _ := msgpack.NewEncoder(io.Discard)

b.ResetTimer()
b.RunParallel(func(pb *testing.PB) {
@@ -60,7 +60,7 @@ func Benchmark(b *testing.B) {
})
})
b.Run("encodestring", func(b *testing.B) {
enc := msgpack.NewEncoder(io.Discard)
enc, _ := msgpack.NewEncoder(io.Discard)

b.ResetTimer()
b.RunParallel(func(pb *testing.PB) {
@@ -71,7 +71,7 @@ func Benchmark(b *testing.B) {
})
})
b.Run("encodemap(.., nil)", func(b *testing.B) {
enc := msgpack.NewEncoder(io.Discard)
enc, _ := msgpack.NewEncoder(io.Discard)
data := map[string]int{
"one": 1,
"two": 2,
@@ -80,12 +80,12 @@ func Benchmark(b *testing.B) {
b.ResetTimer()
b.RunParallel(func(pb *testing.PB) {
for pb.Next() {
_ = msgpack.EncodeMap(enc, data, nil)
_ = msgpack.EncodeMap(*enc, data, nil)
}
})
})
b.Run("encodemap(.., fn)", func(b *testing.B) {
enc := msgpack.NewEncoder(io.Discard)
enc, _ := msgpack.NewEncoder(io.Discard)
data := map[string]int{
"one": 1,
"two": 2,
@@ -94,15 +94,15 @@ func Benchmark(b *testing.B) {
b.ResetTimer()
b.RunParallel(func(pb *testing.PB) {
for pb.Next() {
_ = msgpack.EncodeMap(enc, data, func(enc msgpack.Encoder, k string, v int) error {
_ = msgpack.EncodeMap(*enc, data, func(enc msgpack.Encoder, k string, v int) error {
_ = enc.EncodeString(k)
return enc.EncodeInt(v)
})
}
})
})
b.Run("encode x4 + x4 error checks", func(b *testing.B) {
enc := msgpack.NewEncoder(io.Discard)
enc, _ := msgpack.NewEncoder(io.Discard)
id := 1
name := "foo"
b.ResetTimer()
@@ -120,12 +120,12 @@ func Benchmark(b *testing.B) {
if err := enc.EncodeString(name); err != nil {
return
}
_ = enc.ResetError()
_ = enc.ClearErr()
}
})
})
b.Run("encode x4 + x1 error check", func(b *testing.B) {
enc := msgpack.NewEncoder(io.Discard)
enc, _ := msgpack.NewEncoder(io.Discard)
id := 1
name := "foo"
b.ResetTimer()
@@ -135,13 +135,13 @@ func Benchmark(b *testing.B) {
_ = enc.EncodeInt(id)
_ = enc.EncodeString("name")
_ = enc.EncodeString(name)
_ = enc.ResetError()
_ = enc.ClearErr()
}
})
})

b.Run("logfmt", func(b *testing.B) {
enc := msgpack.NewEncoder(io.Discard)
enc, _ := msgpack.NewEncoder(io.Discard)
_ = enc.Using(io.Discard, func() error { return errors.New("encoder error") })

b.ResetTimer()
5 changes: 5 additions & 0 deletions benchmarks/go.mod
@@ -0,0 +1,5 @@
module github.com/blugnu/msgpack/benchmarks

go 1.20

require github.com/ugorji/go/codec v1.2.11 // indirect
2 changes: 2 additions & 0 deletions benchmarks/go.sum
@@ -0,0 +1,2 @@
github.com/ugorji/go/codec v1.2.11 h1:BMaWp1Bb6fHwEtbplGBGJ498wD+LKlNSl25MjdZY4dU=
github.com/ugorji/go/codec v1.2.11/go.mod h1:UNopzCgEMSXjBc6AOMqYvWC1ktqTAfzJZUZgYf6w6lg=
38 changes: 38 additions & 0 deletions chars.go
@@ -0,0 +1,38 @@
package msgpack

import "strconv"

var chars = struct {
digit [10][]byte
digits2 [100][]byte
newline []byte
}{
newline: []byte{'\n'},
}

func init() {
chars.digit[0] = []byte("0")
chars.digit[1] = []byte("1")
chars.digit[2] = []byte("2")
chars.digit[3] = []byte("3")
chars.digit[4] = []byte("4")
chars.digit[5] = []byte("5")
chars.digit[6] = []byte("6")
chars.digit[7] = []byte("7")
chars.digit[8] = []byte("8")
chars.digit[9] = []byte("9")
chars.digits2[0] = []byte("00")
chars.digits2[1] = []byte("01")
chars.digits2[2] = []byte("02")
chars.digits2[3] = []byte("03")
chars.digits2[4] = []byte("04")
chars.digits2[5] = []byte("05")
chars.digits2[6] = []byte("06")
chars.digits2[7] = []byte("07")
chars.digits2[8] = []byte("08")
chars.digits2[9] = []byte("09")

for i := 10; i <= 99; i++ {
chars.digits2[i] = []byte(strconv.FormatInt(int64(i), 10))
}
}
28 changes: 28 additions & 0 deletions chars_test.go
@@ -0,0 +1,28 @@
package msgpack

import (
"testing"

"github.com/blugnu/test"
)

func TestInit(t *testing.T) {
// ARRANGE/ACT
// init() is implicitly called

// ASSERT
test.That(t, chars.newline).Equals([]byte{'\n'})
test.That(t, chars.digit).Equals([10][]byte{{'0'}, {'1'}, {'2'}, {'3'}, {'4'}, {'5'}, {'6'}, {'7'}, {'8'}, {'9'}})
test.That(t, chars.digits2).Equals([100][]byte{
{'0', '0'}, {'0', '1'}, {'0', '2'}, {'0', '3'}, {'0', '4'}, {'0', '5'}, {'0', '6'}, {'0', '7'}, {'0', '8'}, {'0', '9'},
{'1', '0'}, {'1', '1'}, {'1', '2'}, {'1', '3'}, {'1', '4'}, {'1', '5'}, {'1', '6'}, {'1', '7'}, {'1', '8'}, {'1', '9'},
{'2', '0'}, {'2', '1'}, {'2', '2'}, {'2', '3'}, {'2', '4'}, {'2', '5'}, {'2', '6'}, {'2', '7'}, {'2', '8'}, {'2', '9'},
{'3', '0'}, {'3', '1'}, {'3', '2'}, {'3', '3'}, {'3', '4'}, {'3', '5'}, {'3', '6'}, {'3', '7'}, {'3', '8'}, {'3', '9'},
{'4', '0'}, {'4', '1'}, {'4', '2'}, {'4', '3'}, {'4', '4'}, {'4', '5'}, {'4', '6'}, {'4', '7'}, {'4', '8'}, {'4', '9'},
{'5', '0'}, {'5', '1'}, {'5', '2'}, {'5', '3'}, {'5', '4'}, {'5', '5'}, {'5', '6'}, {'5', '7'}, {'5', '8'}, {'5', '9'},
{'6', '0'}, {'6', '1'}, {'6', '2'}, {'6', '3'}, {'6', '4'}, {'6', '5'}, {'6', '6'}, {'6', '7'}, {'6', '8'}, {'6', '9'},
{'7', '0'}, {'7', '1'}, {'7', '2'}, {'7', '3'}, {'7', '4'}, {'7', '5'}, {'7', '6'}, {'7', '7'}, {'7', '8'}, {'7', '9'},
{'8', '0'}, {'8', '1'}, {'8', '2'}, {'8', '3'}, {'8', '4'}, {'8', '5'}, {'8', '6'}, {'8', '7'}, {'8', '8'}, {'8', '9'},
{'9', '0'}, {'9', '1'}, {'9', '2'}, {'9', '3'}, {'9', '4'}, {'9', '5'}, {'9', '6'}, {'9', '7'}, {'9', '8'}, {'9', '9'},
})
}
62 changes: 17 additions & 45 deletions encode.array_test.go
@@ -1,10 +1,11 @@
package msgpack

import (
"bytes"
"errors"
"fmt"
"testing"

"github.com/blugnu/test"
)

func TestEncodeArray(t *testing.T) {
@@ -15,21 +16,22 @@ func TestEncodeArray(t *testing.T) {
type expect struct {
header []byte
error
n int
}
testcases := []struct {
errorState bool
n int
expect
skip bool
}{
{n: 0, expect: expect{header: []byte{atomEmptyArray}}},
{n: 1, expect: expect{header: []byte{maskFixArray | byte(1)}}},
{n: 15, expect: expect{header: []byte{maskFixArray | byte(15)}}},
{n: 16, expect: expect{header: []byte{typeArray16, 0x00, 0x10}}},
{n: 65535, expect: expect{header: []byte{typeArray16, 0xff, 0xff}}},
{n: 65536, expect: expect{header: []byte{typeArray32, 0x00, 0x01, 0x00, 0x00}}},
{n: 1 << 24, expect: expect{header: []byte{typeArray32, 0x01, 0x00, 0x00, 0x00}}, skip: !*allTests},
{n: (1 << 32) - 1, expect: expect{header: []byte{typeArray32, 0x01, 0x00, 0x00, 0x00}}, skip: true}, // NOTE: this test cannot be run by passing -all; it must be explicitly set to skip: false
{n: 0, expect: expect{n: 0, header: []byte{atomEmptyArray}}},
{n: 1, expect: expect{n: 1, header: []byte{maskFixArray | byte(1)}}},
{n: 15, expect: expect{n: 15, header: []byte{maskFixArray | byte(15)}}},
{n: 16, expect: expect{n: 16, header: []byte{typeArray16, 0x00, 0x10}}},
{n: 65535, expect: expect{n: 65535, header: []byte{typeArray16, 0xff, 0xff}}},
{n: 65536, expect: expect{n: 65536, header: []byte{typeArray32, 0x00, 0x01, 0x00, 0x00}}},
{n: 1 << 24, expect: expect{n: 1 << 24, header: []byte{typeArray32, 0x01, 0x00, 0x00, 0x00}}, skip: !*allTests},
{n: (1 << 32) - 1, expect: expect{n: (1 << 32) - 1, header: []byte{typeArray32, 0x01, 0x00, 0x00, 0x00}}, skip: true}, // NOTE: this test cannot be run by passing -all; it must be explicitly set to skip: false
{errorState: true, n: 0, expect: expect{error: encerr}},
{errorState: true, n: 1, expect: expect{error: encerr}},
{errorState: true, n: 15, expect: expect{error: encerr}},
@@ -45,7 +47,7 @@ func TestEncodeArray(t *testing.T) {
t.Skip("skipping slow test")
}
defer buf.Reset()
defer func() { _ = enc.ResetError() }()
defer func() { _ = enc.ClearErr() }()

// ARRANGE
if tc.errorState {
@@ -60,26 +62,9 @@ func TestEncodeArray(t *testing.T) {
err := EncodeArray(enc, s, nil)

// ASSERT
testError(t, tc.expect.error, err)

t.Run("array header", func(t *testing.T) {
wanted := tc.header
got := buf.Bytes()[:len(wanted)]
if !bytes.Equal(wanted, got) {
t.Errorf("\nwanted %#v\ngot %#v", wanted, got)
}
})

t.Run("value bytes", func(t *testing.T) {
wanted := tc.n
if tc.errorState {
wanted = 0
}
got := buf.Len() - len(tc.header)
if wanted != got {
t.Errorf("\nwanted %#v\ngot %#v", wanted, got)
}
})
test.Error(t, err).Is(tc.expect.error)
test.Value(t, buf.Len()-len(tc.header), "# encoded items").Equals(tc.expect.n)
test.Slice(t, buf.Bytes()[:len(tc.header)], "encoded header").Equals(tc.header)
})
}

@@ -97,20 +82,7 @@ func TestEncodeArray(t *testing.T) {
})

// ASSERT
t.Run("returns error", func(t *testing.T) {
wanted := encerr
got := err
if !errors.Is(got, wanted) {
t.Errorf("\nwanted %#v\ngot %#v", wanted, got)
}
})

t.Run("writes expected items", func(t *testing.T) {
wanted := []byte{maskFixArray | byte(3), 0x01}
got := buf.Bytes()
if !bytes.Equal(wanted, got) {
t.Errorf("\nwanted %#v\ngot %#v", wanted, got)
}
})
test.Error(t, err).Is(encerr)
test.Slice(t, buf.Bytes()).Equals([]byte{maskFixArray | byte(3), 0x01})
})
}