[RFC] gRPC-based API for Search #15190
Comments
Thanks @amberzsy for the proposal. Should we also call out that the new 'gRPC SearchService' abstraction will sit behind an 'Experimental Flag' for the proposed timeline of this feature? |
I really like this. Do I understand correctly that the stated goal of this implementation is that a user can switch from REST/HTTP/application/(nd)json to HTTP2/grpc/protobuf via a configuration option on the client (and then it just works(TM) for all APIs)? |
Correct. Some lightweight translator/adaptor would be needed. |
@amberzsy @dblock I have two questions please:
|
Re: benefits, I expect gRPC + protobuf to improve both performance and throughput over HTTP/2 + JSON. You're right to call this out, though. @amberzsy, were your benchmarks using HTTP/2? |
@dblock The previous benchmarks for the REST API were just sending binary protobuf blobs over the HTTP/1.1 protocol. It essentially showed that parsing protobuf was more performant than XContent parsing JSON (no surprise there). I expect any solution that is able to replace XContent parsing with protobuf to show performance improvements. I don't know if gRPC would show better performance when compared to any other HTTP/2-based solution that sent protobuf blobs but I think it is worth experimenting with some prototypes. |
Thanks @andrross , this is exactly what we need to figure out: tangible benefits of using gRPC vs HTTP/2 + JSON (since this RFC specifically focuses on gRPC and not HTTP/1.1 + Protobuf). Thank you. |
with HTTP/1.
gRPC uses HTTP/2 as its transport protocol, and it has built-in protobuf support as its default serialization format. |
Thanks for the proposal @amberzsy. I just went through some of the linked OpenSearch issues that talk about the Protobuf implementation. In this RFC, the input and output will be in Protobuf binary format (including the streaming-style search API with a gRPC endpoint). To ensure that API behavior remains unchanged for OpenSearch users, is there a plan to implement a generic interface that converts Protobuf messages back to a JSON-friendly format for output? Additionally, could this interface be used to read user input as JSON and convert it to Protocol Buffers? |
There are no plans to remove the existing JSON APIs. |
Closing this issue and moving to the execution and implementation details listed in #16787. |
Is your feature request related to a problem? Please describe
Inspiration
Building on the effort in #6844 and its benchmarking result (#10684 (comment), ~20%), we can go a step further and add support for a gRPC-based API that uses protobuf for serialization and deserialization. To validate our assumption that protobuf, which should be more efficient and compact than JSON, yields a performance gain, we ran a PoC for client <> server protobuf on the Search API with specific query types. The results in opensearch-project/opensearch-clients#69 are promising.
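To make the "more compact than JSON" assumption concrete, here is a minimal sketch that hand-encodes a small request in the protobuf wire format and compares its size with compact JSON. The request shape and field numbers are hypothetical and are not taken from the PoC or from opensearch-api-specification. It illustrates payload size only and says nothing about parsing speed or the benchmark numbers above.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int_field(field_number: int, value: int) -> bytes:
    # Tag = (field_number << 3) | wire_type; wire type 0 is varint.
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    # Wire type 2 is length-delimited: tag, byte length, then the UTF-8 bytes.
    data = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data


# Hypothetical request shape and field numbers, for illustration only.
request = {"index": "logs", "query": "error", "size": 10}

json_payload = json.dumps(request, separators=(",", ":")).encode("utf-8")
proto_payload = (
    encode_string_field(1, request["index"])
    + encode_string_field(2, request["query"])
    + encode_int_field(3, request["size"])
)

print(len(json_payload), len(proto_payload))
```

Most of the difference comes from protobuf replacing field names with small numeric tags, so the saving depends heavily on the message shape.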
Proposal
The ongoing effort on node-to-node communication focuses on the transport layer, implementing StreamInput and StreamOutput with protobuf serializers/deserializers. We can expand that effort and add client <> server protobuf support in parallel to achieve a more significant performance gain.
The proto definitions for the Search API, including the parts that overlap with the transport layer, should follow opensearch-api-specification, which is widely adopted by clients.
For the server-side change there are two options:
1. Introduce a new content type and give end users the option to send and receive protobuf binary payloads.
Pros: a faster development cycle to begin with, as this is potentially an extension of the existing searchRequest/Response, builder, and XContent code.
Cons: potentially requires significant code refactoring, which adds complexity during development.
2. Implement a new streaming-style search API (gRPC) using protobuf and expose a new gRPC endpoint for the Search API.
Pros:
a) gRPC natively supports client-side, server-side, and bidirectional streaming, allowing for real-time communication. This is more efficient than the HTTP/1.1 used by REST.
b) gRPC tooling generates client and server code in multiple programming languages from the proto files. This reduces boilerplate code and ensures consistency across different languages and platforms.
c) Less code refactoring.
Cons:
a) The development cycle might not be as fast as with approach 1.
b) Although bringing up a new gRPC service and hooking it into the internal transport layer might not be too complicated, there will be unknowns around overall integration with the existing ecosystem, e.g. related plugins (security, knn, sql, monitoring, etc.).
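As a rough illustration of the streaming-style response in the second option, the sketch below models a server-streaming call as a plain Python generator next to a unary (request/response) call. It does not use the gRPC library. `Hit`, `search_unary`, and `search_server_streaming` are hypothetical names, not part of any OpenSearch API.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Hit:
    doc_id: str
    score: float


def search_unary(hits: List[Hit]) -> List[Hit]:
    """Request/response style: the caller receives the full result list at once."""
    return list(hits)


def search_server_streaming(hits: List[Hit], batch_size: int) -> Iterator[List[Hit]]:
    """Server-streaming style: results are yielded in batches as they become ready."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(hits), batch_size):
        yield hits[start:start + batch_size]


# Hypothetical result set, for illustration only.
hits = [Hit(f"doc-{i}", 1.0 / (i + 1)) for i in range(5)]

batches = list(search_server_streaming(hits, batch_size=2))
for batch in batches:
    # A streaming client could start processing each batch before the rest arrives.
    print([hit.doc_id for hit in batch])
```

In an actual gRPC service the same shape would be declared in the proto file as a streaming response, and the generated stubs would hand the client an iterator of messages.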
Clients (Java, Go, Python, etc.) would gain support for optionally using the new protobuf-based server API with minimal changes (i.e. no need to rewrite an application already using the client).
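One way to picture the "minimal changes" goal is a client whose transport is chosen by a configuration option while the application-facing `search` call stays the same. The following is a sketch under that assumption. `SearchClient`, the transport names, and both transport functions are hypothetical placeholders and do not reflect any existing OpenSearch client.

```python
import json
from typing import Callable, Dict


def rest_json_transport(request: dict) -> dict:
    """Placeholder for the existing REST path: serialize to JSON and back."""
    body = json.dumps(request)
    return {"transport": "rest+json", "echo": json.loads(body)}


def grpc_protobuf_transport(request: dict) -> dict:
    """Placeholder for the proposed path. A real implementation would translate
    the request into a protobuf message and call a generated gRPC stub."""
    return {"transport": "grpc+protobuf", "echo": dict(request)}


# Hypothetical registry mapping a configuration value to a transport.
TRANSPORTS: Dict[str, Callable[[dict], dict]] = {
    "rest": rest_json_transport,
    "grpc": grpc_protobuf_transport,
}


class SearchClient:
    """Application code calls search() the same way regardless of transport."""

    def __init__(self, transport: str = "rest") -> None:
        if transport not in TRANSPORTS:
            raise ValueError(f"unknown transport: {transport!r}")
        self._send = TRANSPORTS[transport]

    def search(self, index: str, query: dict) -> dict:
        return self._send({"index": index, "query": query})


# Switching transports is a one-argument change at construction time.
rest_result = SearchClient().search("logs", {"match_all": {}})
grpc_result = SearchClient(transport="grpc").search("logs", {"match_all": {}})
print(rest_result["transport"], grpc_result["transport"])
```

The lightweight translator/adaptor mentioned in the comments would live behind the gRPC transport in this layout, keeping request translation out of application code.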
Next Steps
Timeline
2.17 release: (09/03/2024 ~ 09/17/2024)
[Experimental Feature]
Related
Transport layer Protobuf support: #6844