[RFC] Remote Vector Index Build Feature with OpenSearch Vector Engine #2294
Labels: enhancement, indexing, indexing-improvements
Introduction
As part of the RFC "Boosting OpenSearch Vector Engine Performance using GPUs", we proposed using GPUs to reduce vector index build time. In this RFC we propose the high-level design of a generic Remote Vector Index Build capability in the Vector Engine, which will be used to connect to that remote GPU-based index build fleet.
Requirements
Proposed Vector Engine Architecture to integrate with Remote Vector Index Build Service
Assumptions
Below are some of the assumptions made while designing the integration of the Vector Engine with the remote Index Build Service.
Roles and Responsibilities
Overall Flow/High Level Design
Local Index Build Flow (Old Flow)
Remote Index Build Flow:
New Components Definition
Alternatives Considered
Use flat vector files stored in the segment (for remote store cases) rather than streaming vectors to and from the object store
This is an interesting alternative that could avoid the heavy lifting of transferring vectors to the object store, but it has some feasibility challenges:
Next Steps
Open Questions
Appendix