Releases: streamingfast/substreams
v1.1.9
Backend changes
- Massive refactoring of the scheduler: prevent excessive splitting of jobs, grouping them into stages when they have the same dependencies. This should reduce the required number of tier2 workers (2x to 3x, depending on the substreams).
- The tier1 and tier2 config have a new configuration StateStoreDefaultTag, which will be appended to the StateStoreURL value to form the final state store URL, ex: StateStoreURL="/data/states" and StateStoreDefaultTag="v2" will make /data/states/v2 the default state store location, while allowing users to provide an X-Sf-Substreams-Cache-Tag header (gated by auth module) to point to /data/states/v1, and so on.
- Authentication plugin trust can now specify an exclusive list of allowed headers (all lowercase), ex: trust://?allowed=x-sf-user-id,x-sf-api-key-id,x-real-ip,x-sf-substreams-cache-tag
- The tier2 app no longer has customizable auth plugin (or any Modules), trust will always be used, so that tier1 can pass down its headers (e.g. X-Sf-Substreams-Cache-Tag). The tier2 instances should not be accessible publicly.
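The state store tag resolution and the header allow-list described above can be sketched as follows. This is only an illustration in Python, not the server's code: the helper names filter_allowed_headers and resolve_state_store_url are hypothetical, and joining the tag with a "/" is inferred from the /data/states/v2 example.

```python
from typing import Dict, Iterable

# Lowercase form of the X-Sf-Substreams-Cache-Tag header named in the notes.
CACHE_TAG_HEADER = "x-sf-substreams-cache-tag"


def filter_allowed_headers(headers: Dict[str, str], allowed: Iterable[str]) -> Dict[str, str]:
    """Keep only headers whose lowercased name is in the allow-list (hypothetical helper)."""
    allowed_set = {name.lower() for name in allowed}
    return {k.lower(): v for k, v in headers.items() if k.lower() in allowed_set}


def resolve_state_store_url(base_url: str, default_tag: str, headers: Dict[str, str]) -> str:
    """Append the request's cache tag, or the default tag, to the base URL (hypothetical helper)."""
    tag = headers.get(CACHE_TAG_HEADER) or default_tag
    if not tag:
        return base_url
    return base_url.rstrip("/") + "/" + tag


allowed = "x-sf-user-id,x-sf-api-key-id,x-real-ip,x-sf-substreams-cache-tag".split(",")
trusted = filter_allowed_headers({"X-Sf-Substreams-Cache-Tag": "v1", "X-Other": "1"}, allowed)
print(resolve_state_store_url("/data/states", "v2", trusted))  # /data/states/v1
print(resolve_state_store_url("/data/states", "v2", {}))       # /data/states/v2
```

Headers outside the allow-list are dropped before the tag is read, which is why an unlisted header cannot redirect a request to another state store location.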
GUI changes
- Color theme is now adapted to the terminal background (fixes readability on 'light' background)
- Provided parameters are now shown in the 'Request' tab.
CLI changes
Added
- alpha init command: replaces initialBlock in the generated manifest based on the contract creation block.
- alpha init now prompts for the Ethereum chain. Added: Mainnet, BNB, Polygon, Goerli, Mumbai.
Fixed
- alpha init reports better progress, especially when performing ABI & creation block retrieval.
- alpha init command without contracts: fixed Protogen command invocation.
v1.1.8
Backend changes
Added
- Max-subrequests can now be overridden by auth header X-Sf-Substreams-Parallel-Jobs (note: if your auth plugin is 'trust', make sure that you filter out this header from public access).
- Request Stats logging. When enabled, it will log metrics associated with a Tier1 and Tier2 request.
- On request, save "substreams.partial.spkg" file to the state cache for debugging purposes.
- Manifest reader can now read 'partial' spkg files (without protobuf and metadata) with an option.
Fixed
- Fixed a bug which caused "live" blocks to be sent while the stream previously received block(s) were historic.
CLI changes
Fixed
- In GUI, module output now shows fields with default values, i.e. 0, "", false
v1.1.7
Highlights
Now using plugin: buf.build/community/neoeinstein-prost-crate:v0.3.1 when generating the Protobuf Rust mod.rs, which fixes the warning that remote plugins are deprecated.
Previously we were using remote: buf.build/prost/plugins/crate:v0.3.1-1. But remote plugins when using https://buf.build (which we use to generate the Protobuf) are now deprecated and will cease to function on July 10th, 2023.
The net effect of this is that if you don't update your Substreams CLI to 1.1.7, on July 10th 2023 and after, the substreams protogen command will not work anymore.
v1.1.6
Backend changes
- substreams-tier1 and substreams-tier2 are now standalone Apps, to be used as such by server implementations (firehose-ethereum, etc.)
- substreams-tier1 now listens to Connect protocol, enabling browser-based substreams clients
- Authentication has been overhauled to take advantage of https://github.com/streamingfast/dauth, allowing the use of a GRPC-based sidecar or reverse-proxy to provide authentication.
- Metering has been overhauled to take advantage of https://github.com/streamingfast/dmetering plugins, allowing the use of a GRPC sidecar or logs to expose usage metrics.
- The tier2 logs no longer show a parent_trace_id: the trace_id is now the same as tier1 jobs. Unique tier2 jobs can be distinguished by their stage and segment, corresponding to the output_module_name and startblock:stopblock
CLI changes
- The substreams protogen command now uses this Buf plugin https://buf.build/community/neoeinstein-prost to generate the Rust code for your Substreams definitions.
- The substreams protogen command no longer generates the FILE_DESCRIPTOR_SET constant, which generates an unused warning in Rust. We don't think anybody relied on having the FILE_DESCRIPTOR_SET constant generated, but if that's the case, you can provide your own buf.gen.yaml that will be used instead of the generated one when doing substreams protogen.
- Added -H flag on the substreams run command, to set HTTP Headers in the Substreams request.
Fixed
- Fixed generated buf.gen.yaml not being deleted when an error occurs while generating the Rust code.
v1.1.5
Highlights
This release fixes data determinism issues. This comes at a 20% performance cost but is necessary for integration with The Graph ecosystem.
Operators
- When upgrading a substreams server to this version, you should delete all existing module caches to benefit from deterministic output
Added
- Tier1 now records deterministic failures in wasm, "blacklists" identical requests for 10 minutes (by serving them the same InvalidArgument error) with a forced incremental backoff. This prevents accidental bad actors from hogging tier2 resources when their substreams cannot go past a certain block.
- Tier1 now sends the ResolvedStartBlock, LinearHandoffBlock and MaxJobWorkers in SessionInit message for the client and gui to show
- Substreams CLI can now read manifests/spkg directly from an IPFS address (subgraph deployment or the spkg itself), using ipfs://Qm... notation
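The 10-minute blacklisting of deterministically failing requests can be pictured as a small time-bounded cache. The sketch below is an assumption-laden illustration in Python, not Tier1's implementation: the FailureBlacklist class and the request_key argument are hypothetical, and the incremental backoff mentioned in the notes is not modeled.

```python
import time
from typing import Callable, Dict, Optional, Tuple

BLACKLIST_TTL_SECONDS = 10 * 60  # the 10-minute window mentioned in the notes


class FailureBlacklist:
    """Remembers deterministic failures per request key for a fixed time window."""

    def __init__(self, ttl: float = BLACKLIST_TTL_SECONDS,
                 now: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._now = now  # injectable clock, so the expiry logic can be tested
        self._entries: Dict[str, Tuple[float, str]] = {}  # key -> (expiry, error)

    def record(self, request_key: str, error: str) -> None:
        """Store the error so identical requests get the same answer until expiry."""
        self._entries[request_key] = (self._now() + self._ttl, error)

    def lookup(self, request_key: str) -> Optional[str]:
        """Return the recorded error if the key is still blacklisted, else None."""
        entry = self._entries.get(request_key)
        if entry is None:
            return None
        expiry, error = entry
        if self._now() >= expiry:
            del self._entries[request_key]  # window elapsed, allow a retry
            return None
        return error
```

A caller would check lookup before scheduling any work and return the stored error immediately when one is found, which is what keeps repeated failing requests away from tier2.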
Fixed
- When talking to an updated server, the gui will not overflow on a negative start block, using the newly available resolvedStartBlock instead.
- When running in development mode with a start-block in the future on a cold cache, you would sometimes get invalid "updates" from the store passed down to your modules that depend on them. It did not impact the caches but caused invalid output.
- The WASM engine was incorrectly reusing memory, preventing deterministic output. It made things go faster, but at the cost of determinism. Memory is now reset between WASM executions on each block.
- The GUI no longer panics when an invalid output-module is given as argument
Changed
- Changed default WASM engine from wasmtime to wazero, use SUBSTREAMS_WASM_RUNTIME=wasmtime to revert to prior engine. Note that wasmtime will now run a lot slower than before because resetting the memory in wasmtime is more expensive than in wazero.
- Execution of modules is now done in parallel within a single instance, based on a tree of module dependencies.
- The substreams gui and substreams run now accept commas inside a param value. For example: substreams run --param=p1=bar,baz,qux --param=p2=foo,baz. However, you can no longer pass multiple parameters using an ENV variable, or a .yaml config file.
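To illustrate why commas can now live inside a param value, here is a minimal Python sketch that treats each --param occurrence as one name=value pair and splits on the first "=" only. The parse_params helper is hypothetical and is not the CLI's own parser.

```python
from typing import Dict, List


def parse_params(raw_params: List[str]) -> Dict[str, str]:
    """Map each 'name=value' string to its value, splitting on the first '=' only."""
    params: Dict[str, str] = {}
    for raw in raw_params:
        name, sep, value = raw.partition("=")
        if not sep or not name:
            raise ValueError(f"expected name=value, got {raw!r}")
        params[name] = value  # commas stay part of the value
    return params


# Values taken from the example command in the notes.
print(parse_params(["p1=bar,baz,qux", "p2=foo,baz"]))
# {'p1': 'bar,baz,qux', 'p2': 'foo,baz'}
```

Because the comma is no longer a separator, each parameter has to be passed with its own --param flag.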
v1.1.4
HIGHLIGHTS
- Module hashing changed to fix cache reuse on substreams that use imported modules
- Memory leak fixed on rpc-enabled servers
- GUI more responsive
Fixed
- BREAKING: The module hashing algorithm wrongfully changed the hash for imported modules, which made it impossible to leverage caches when composing new substreams off of imported ones.
  - Operationally, if you want to keep your caches, you will need to copy or move the old hashes to the new ones.
    - You can obtain the prior hashes for a given spkg with: substreams info my.spkg, using a prior release of substreams
    - With a more recent substreams release, you can obtain the new hashes with the same command.
    - You can then cp or mv the caches for each module hash.
  - You can also ignore this change. This will simply invalidate your cache.
- Fixed a memory leak where "PostJobHooks" were not always called. These are used to hook in rpc calls in ethereum chain. They are now always called, even if no block has been processed (can be called with nil value for the clock)
- Jobs that fail deterministically (during WASM execution) on tier2 will fail faster, without retries from tier1.
- substreams gui command now handles params flag (it was ignored)
- Substreams GUI responsiveness improved significantly when handling large payloads
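The copy-or-move step for keeping caches across the hash change could be scripted along these lines. This Python sketch assumes the state store is a local folder with one directory per module hash, which is an assumption about the layout and not something these notes confirm; migrate_module_caches and the old_to_new mapping (built by hand from the two substreams info outputs) are hypothetical.

```python
import shutil
from pathlib import Path
from typing import Dict, List


def migrate_module_caches(store_root: Path, old_to_new: Dict[str, str],
                          move: bool = False) -> List[str]:
    """Copy (or move) <store_root>/<old_hash> to <store_root>/<new_hash>.

    Returns the new hashes that were populated. Assumes one directory per
    module hash under store_root (an assumed layout).
    """
    migrated: List[str] = []
    for old_hash, new_hash in old_to_new.items():
        src = store_root / old_hash
        dst = store_root / new_hash
        if not src.is_dir() or dst.exists():
            continue  # nothing to migrate, or the new cache already exists
        if move:
            shutil.move(str(src), str(dst))
        else:
            shutil.copytree(src, dst)
        migrated.append(new_hash)
    return migrated
```

Copying is the default here so the old caches remain available until the new release has been verified.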
Added
- Added Tracing capabilities, using https://github.com/streamingfast/sf-tracing . See repository for details on how to enable.
Known issues
- If the cached substreams states are missing a 'full-kv' file in its sequence (not a normal scenario), requests will fail with opening file: not found #222
v1.1.3
Highlights
This release contains fixes for race conditions that happen when multiple requests try to sync the same range using the same .spkg. Those fixes will avoid weird state errors at the cost of duplicating work in some circumstances. A future refactor of the Substreams engine scheduler will come later to fix those inefficiencies.
Operators, please read the operators section for upgrade instructions.
Operators
Note: This upgrade procedure applies to you if your Substreams deployment topology includes both tier1 and tier2 processes. If you have defined somewhere the config value substreams-tier2: true, then this applies to you; otherwise, you can ignore the upgrade procedure.
This release includes a small change in the internal RPC layer between tier1 processes and tier2 processes. This change requires an ordered upgrade of the processes to avoid errors.
The components should be deployed in this order:
- Deploy and roll out tier1 processes first
- Deploy and roll out tier2 processes second
If you upgrade in the wrong order or if somehow tier2 processes start using the new protocol without tier1 being aware, users will end up with backend error(s) saying that some partial files are not found. Those will be resolved only when tier1 processes have been upgraded successfully.
Fixed
- Fixed a race when multiple Substreams requests execute on the same .spkg, it was causing races between the two executors.
- GUI: fixed an issue which would slow down message consumption when progress page was shown in ascii art "bars" mode
- GUI: fixed the display of blocks per second to represent actual blocks, not messages count
Changed
- [binary]: Commands substreams <...> that fail now correctly return an exit code 1.
- [library]: The manifest.NewReader signature changed and will now return a *Reader, error (previously *Reader).
Added
- [library]: The manifest.Reader gained the ability to infer the path if provided with input "" based on the current working directory.
- [library]: The manifest.Reader gained the ability to infer the path if provided with input that is a directory.
v1.1.2
Highlights
This release contains bug fixes and speed/scaling improvements around the Substreams engine. It also contains a few small enhancements for substreams gui.
This release contains the fix to an important bug that could have generated corrupted store state files. This is important for developers and operators.
Sinkers & Developers
The store state files will be fully deleted on the Substreams server to start fresh again. The impact for you as a developer is that Substreams that were fully synced will now need to regenerate the store's state from the initial block. So you might see long delays before getting new block data while the Substreams engine is re-computing the store states from scratch.
Operators
You need to clear the state store and remove all the files that are stored under the location given by the substreams-state-store-url flag. You can also make it point to a brand new folder and delete the old one after the rollout.
Fixed
- Fix a bug where not all extra modules would be sent back on debug mode
- Fixed a bug in tier1 that could result in corrupted state files when getting close to chain HEAD
- Fixed some performance and stalling issues when using GCS for blocks
- Fixed storage logs not being shown properly
- GUI: Fixed panic race condition
- GUI: Cosmetic changes
Added
- GUI: Added traceID
v1.1.1
Highlights
This release introduces a new RPC protocol and the old one has been removed. The new RPC protocol is in a new Protobuf package sf.substreams.rpc.v2 and it drastically changes how chain re-orgs are signaled to the user. Here are the highlights of this release:
- Getting rid of undo payload during re-org
- substreams gui Improvements
- Substreams integration testing
- Substreams Protobuf definitions updated
Getting rid of undo payload during re-org
Previously, the GRPC endpoint sf.substreams.v1.Stream/Blocks would send a payload with the corresponding "step", NEW or UNDO.
Unfortunately, this led to some cases where the payload could not be deterministically generated for old blocks that had been forked out, resulting in a stalling request, a failure, or in some worst cases, incomplete data.
The new design, under sf.substreams.rpc.v2.Stream/Blocks, takes care of these situations by removing the 'step' component and using these two message types:
- sf.substreams.rpc.v2.BlockScopedData when chain progresses, with the payload
- sf.substreams.rpc.v2.BlockUndoSignal during a reorg, with the last valid block number + block hash
The client now has the burden of keeping the necessary means of performing the undo actions (ex: a map of previous values for each block). The BlockScopedData message now includes the final_block_height to let you know when this "undo data" can be discarded.
With these changes, a substreams server can even handle a cursor for a block that it has never seen, provided that it is a valid cursor, by signaling the client to revert up to the last known final block, trading efficiency for resilience in these extreme cases.
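The client-side bookkeeping described above (a map of previous values per block, discarded once a block is final) can be sketched like this. It is a simplified Python illustration, not a reference client: the UndoableStore class and its method names are hypothetical, message payloads are reduced to a dict of key/value changes, and the block hash and cursor carried by the real messages are ignored.

```python
from typing import Any, Dict, List, Tuple

_MISSING = object()  # marks keys that did not exist before a block touched them


class UndoableStore:
    """Key/value state that can be rolled back to any non-final block."""

    def __init__(self) -> None:
        self.state: Dict[str, Any] = {}
        # (block number, previous values of the keys changed in that block), oldest first
        self._undo_log: List[Tuple[int, Dict[str, Any]]] = []

    def on_block_scoped_data(self, block_num: int, changes: Dict[str, Any],
                             final_block_height: int) -> None:
        """Apply a block's changes and remember how to revert them."""
        previous = {key: self.state.get(key, _MISSING) for key in changes}
        self.state.update(changes)
        self._undo_log.append((block_num, previous))
        # Undo data at or below the final block height can never be needed again.
        self._undo_log = [(n, p) for n, p in self._undo_log if n > final_block_height]

    def on_block_undo_signal(self, last_valid_block_num: int) -> None:
        """Revert newest blocks first until only blocks up to the last valid one remain."""
        while self._undo_log and self._undo_log[-1][0] > last_valid_block_num:
            _, previous = self._undo_log.pop()
            for key, value in previous.items():
                if value is _MISSING:
                    self.state.pop(key, None)
                else:
                    self.state[key] = value
```

Storing only the previous values of touched keys keeps the undo log proportional to the changes in non-final blocks instead of to the whole state.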
substreams gui Improvements
- Added key 'f' shortcut for changing display encoding of bytes value (hex, pruned string, base64)
- Added jq search mode (hit / twice). Filters the output with the jq expression, and applies the search to match all blocks.
- Added search history (with up/down), similar to less.
- Running a search now applies it to all blocks, and highlights the matching ones in the blocks bar (in red).
- Added O and P, to jump to prev/next block with matching search results.
- Added module search with m, to quickly switch from module to module.
Substreams integration testing
Added a basic Substreams testing framework that validates module outputs against expected values.
The testing framework currently runs on the substreams run command, where you can specify the following flags:
- test-file: Points to a file that contains your test specs
- test-verbose: Enables verbose mode while testing.
The test file specifies the expected output for a given substreams module at a given block.
Substreams Protobuf definitions updated
We changed the Substreams Protobuf definitions making a major overhaul of the RPC communication. This is a breaking change for those consuming Substreams through gRPC.
Note: There are no breaking changes for Substreams developers regarding your Rust code, Substreams manifest and Substreams package.
- Removed the Request and Response messages (and related) from sf.substreams.v1, they have been moved to sf.substreams.rpc.v2. You will need to update your usage if you were consuming Substreams through gRPC.
- The new Request excludes fields and usages that were already deprecated, like using multiple module_outputs.
- The Response now contains a single module output
- In development mode, the additional modules output can be inspected under debug_map_outputs and debug_store_outputs.
Separating Tier1 vs Tier2 gRPC protocol (for Substreams server operators)
Now that the Blocks request has been moved from sf.substreams.v1 to sf.substreams.rpc.v2, the communication between a substreams instance acting as tier1 and a tier2 instance that performs the background processing has also been reworked, and put under sf.substreams.internal.v2.Stream/ProcessRange. It has also been stripped of parameters that were not used for that level of communication (ex: cursor, logs...)
Fixed
- The final_blocks_only: true on the Request was not honored on the server. It now correctly sends only blocks that are final/irreversible (according to Firehose rules).
- Prevent substreams panic when requested module has unknown value for "type"
Added
- The substreams run command now has flag --final-blocks-only
v1.0.3
This should be the last release before a breaking change in the API and handling of the reorgs and UNDO messages.
Highlights
- Added support for resolving a negative start-block on server (also added to run command)
- The run and gui commands no longer resolve a start-block=-1 to the 'initialBlock' of the requested module. To get this behavior, simply assign an empty string value to the flag start-block instead.
- Added support for search within the Substreams gui output view. Usage of search within output behaves similar to the less command, and can be toggled with "/".
Note: This was initially released as 1.0.2 but it has been retracted because it accidentally included the upcoming refactoring.