-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add MVI data methods #506
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the biggest point I'm not sure about. We need a clean type-safe way of expressing the different types of metadata in the upsert and search methods. Doing this lets them be defined in a clean way, e.g. here, but parsing it feels a bit clunky since you need to switch on the different types like I'm doing here. I'm open to any suggestions to clean it up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you have examples of what user code will look like?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at a glance this seems reasonable and what I would have probably started with, but if you have example user code that would help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The linked test code is what the user code would look like for calling search or upsert. Specifically, creating metadata is easy for the different types:
var metadata = new Dictionary<string, MetadataValue>
{
{ "string_key", "string_value" },
{ "long_key", 123 },
{ "double_key", 3.14 },
{ "bool_key", true },
{ "list_key", new List<string> { "a", "b", "c" } },
{ "empty_list_key", new List<string>() }
};
For search hits that have metadata that could be any type you would need to do something like this:
foreach (var hit in successResponse.Hits)
{
foreach (var metadataPair in hit.Metadata)
{
var field = metadataPair.Key;
switch (metadataPair.Value)
{
case StringValue stringValue:
// do stuff
break;
case LongValue longValue:
// do stuff
break;
case DoubleValue doubleValue:
// do stuff
break;
case BoolValue boolValue:
// do stuff
break;
case StringListValue stringListValue:
// do stuff
break;
}
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This uses a similar trick to MetadataValue to allow a method to take either a List, or an ALL sentinel value. The downside is it can't take an IEnumerable. I'm also open to suggestions here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment here, looks reasonable to me at first glance but would love to see example code
{ | ||
private readonly VectorIndexControlClient controlClient; | ||
|
||
private readonly VectorIndexDataClient dataClient; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no data client round robin like in the cache client currently, nor are there eager connections or retries. I assumed we could add those in a future pr if there is demand.
/// </summary> | ||
/// <param name="fields">The fields to look up.</param> | ||
/// <returns></returns> | ||
public static implicit operator MetadataFields(List<string> fields) => new List(fields); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
neat, I didn't know you could do that.
You can't put an enumerable as the argument though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, it has to be a specific implementation. The best we can do is add conversions for all implementations that make sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i see.
i actually think list is probably fine, especially for MVP. was just curious because I hadn't seen this feature before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i didn't review super closely, and definitely want michael to weigh in. but i skimmed and replied to your comments, and it looks reasonable to me from what I saw!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since .NET has namespacing, I'd prefer the request and response names to not be prefixed with Vector
. Feels redundant. In python we have elided it, but not in typescript since we don't have a nice option for namespacing.
@cprice404 thoughts?
var metadataRequest = metadataFields switch | ||
{ | ||
MetadataFields.AllFields => new _MetadataRequest { All = new _MetadataRequest.Types.All() }, | ||
MetadataFields.List list => new _MetadataRequest | ||
{ | ||
Some = new _MetadataRequest.Types.Some { Fields = { list.Fields } } | ||
}, | ||
_ => throw new InvalidArgumentException($"Unknown metadata fields type {metadataFields.GetType()}") | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't know it C# had switch expressions. That's cool!
/// <summary> | ||
/// A item in a vector index. Contains an ID, the vector, and any associated metadata. | ||
/// </summary> | ||
public class VectorIndexItem |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could use a record class here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can since we the metadata dictionary would be compared by reference. I tried making MetadataValue a record for a while but I couldn't overwrite the equals method in ListStringValue to make it compare correctly. Looks like I overlooked adding an equals and hashcode method here. I can add that in.
In a greenfield I definitely agree. The inconsistency bugs me, so if we do it, I think we should be planning to make a consistent version of the existing ones. Which I think can be done, by basically renaming the We had the same convo about Python and I think we decided to do this, but I don't think we've actually come back and updated the Cache/Topic responses yet. I'm good with making this change for vectors now as long as we at least capture a ticket for updating the Cache/Topic ones and put it on the board. |
5aa0bad
to
078b735
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good; a few minor comments
var indexName = $"index-{Utils.NewGuidString()}"; | ||
|
||
var createResponse = await vectorIndexClient.CreateIndexAsync(indexName, 2, SimilarityMetric.InnerProduct); | ||
Assert.True(createResponse is CreateIndexResponse.Success, $"Unexpected response: {createResponse}"); | ||
|
||
try |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This repeated code and cleanup can be refactored to a disposable class, then you can use a using block; see https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/statements/using
LGTM. I will 👍 once the conflicts are resolved. |
Add the upsert, delete, and search MVI methods to the vector client. Add an internal MVI data client and grpc manager. Update the protos to get the latest MVI changes.
705021c
to
0dd4fca
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😎
Add the upsert, delete, and search MVI methods to the vector client.
Add an internal MVI data client and grpc manager.
Update the protos to get the latest MVI changes.