Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add MVI data methods #506

Merged
merged 4 commits into from
Nov 3, 2023
Merged

feat: Add MVI data methods #506

merged 4 commits into from
Nov 3, 2023

Conversation

nand4011
Copy link
Contributor

Add the upsert, delete, and search MVI methods to the vector client.

Add an internal MVI data client and grpc manager.

Update the protos to get the latest MVI changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the biggest point I'm not sure about. We need a clean type-safe way of expressing the different types of metadata in the upsert and search methods. Doing this lets them be defined in a clean way, e.g. here, but parsing it feels a bit clunky since you need to switch on the different types like I'm doing here. I'm open to any suggestions to clean it up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you have examples of what user code will look like?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at a glance this seems reasonable and what I would have probably started with, but if you have example user code that would help.

Copy link
Contributor Author

@nand4011 nand4011 Oct 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The linked test code is what the user code would look like for calling search or upsert. Specifically, creating metadata is easy for the different types:

var metadata = new Dictionary<string, MetadataValue>
            {
                { "string_key", "string_value" },
                { "long_key", 123 },
                { "double_key", 3.14 },
                { "bool_key", true },
                { "list_key", new List<string> { "a", "b", "c" } },
                { "empty_list_key", new List<string>() }
            };

For search hits that have metadata that could be any type you would need to do something like this:

foreach (var hit in successResponse.Hits)
{
    foreach (var metadataPair in hit.Metadata)
    {
        var field = metadataPair.Key;
        switch (metadataPair.Value)
        {
            case StringValue stringValue:
                // do stuff
                break;
            case LongValue longValue:
                // do stuff
                break;
            case DoubleValue doubleValue:
                // do stuff
                break;
            case BoolValue boolValue:
                // do stuff
                break;
           case StringListValue stringListValue:
                // do stuff
                break;
        }
    }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This uses a similar trick to MetadataValue to allow a method to take either a List, or an ALL sentinel value. The downside is it can't take an IEnumerable. I'm also open to suggestions here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment here, looks reasonable to me at first glance but would love to see example code

{
private readonly VectorIndexControlClient controlClient;

private readonly VectorIndexDataClient dataClient;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no data client round robin like in the cache client currently, nor are there eager connections or retries. I assumed we could add those in a future pr if there is demand.

@nand4011 nand4011 requested review from malandis and a team October 26, 2023 21:23
/// </summary>
/// <param name="fields">The fields to look up.</param>
/// <returns></returns>
public static implicit operator MetadataFields(List<string> fields) => new List(fields);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

neat, I didn't know you could do that.

You can't put an enumerable as the argument though?

Copy link
Contributor Author

@nand4011 nand4011 Oct 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, it has to be a specific implementation. The best we can do is add conversions for all implementations that make sense.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i see.

i actually think list is probably fine, especially for MVP. was just curious because I hadn't seen this feature before.

Copy link
Contributor

@cprice404 cprice404 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i didn't review super closely, and definitely want michael to weigh in. but i skimmed and replied to your comments, and it looks reasonable to me from what I saw!

Copy link
Contributor

@malandis malandis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since .NET has namespacing, I'd prefer the request and response names to not be prefixed with Vector. Feels redundant. In python we have elided it, but not in typescript since we don't have a nice option for namespacing.

@cprice404 thoughts?

src/Momento.Sdk/IPreviewVectorIndexClient.cs Outdated Show resolved Hide resolved
Comment on lines +78 to +87
var metadataRequest = metadataFields switch
{
MetadataFields.AllFields => new _MetadataRequest { All = new _MetadataRequest.Types.All() },
MetadataFields.List list => new _MetadataRequest
{
Some = new _MetadataRequest.Types.Some { Fields = { list.Fields } }
},
_ => throw new InvalidArgumentException($"Unknown metadata fields type {metadataFields.GetType()}")
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know it C# had switch expressions. That's cool!

src/Momento.Sdk/Internal/VectorIndexDataClient.cs Outdated Show resolved Hide resolved
/// <summary>
/// A item in a vector index. Contains an ID, the vector, and any associated metadata.
/// </summary>
public class VectorIndexItem
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use a record class here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can since we the metadata dictionary would be compared by reference. I tried making MetadataValue a record for a while but I couldn't overwrite the equals method in ListStringValue to make it compare correctly. Looks like I overlooked adding an equals and hashcode method here. I can add that in.

src/Momento.Sdk/Responses/Vector/SearchHit.cs Show resolved Hide resolved
@cprice404
Copy link
Contributor

Since .NET has namespacing, I'd prefer the request and response names to not be prefixed with Vector. Feels redundant. In python we have elided it, but not in typescript since we don't have a nice option for namespacing.

@cprice404 thoughts?

In a greenfield I definitely agree.

The inconsistency bugs me, so if we do it, I think we should be planning to make a consistent version of the existing ones. Which I think can be done, by basically renaming the CacheFoo response to Cache.Foo, and then making a type alias (possibly deprecated) using the old name so we don't break backward compat.

We had the same convo about Python and I think we decided to do this, but I don't think we've actually come back and updated the Cache/Topic responses yet.

I'm good with making this change for vectors now as long as we at least capture a ticket for updating the Cache/Topic ones and put it on the board.

Base automatically changed from initial-mvi to main October 30, 2023 22:10
@nand4011 nand4011 requested a review from malandis October 30, 2023 23:03
Copy link
Contributor

@malandis malandis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good; a few minor comments

src/Momento.Sdk/IPreviewVectorIndexClient.cs Outdated Show resolved Hide resolved
src/Momento.Sdk/IPreviewVectorIndexClient.cs Outdated Show resolved Hide resolved
src/Momento.Sdk/IPreviewVectorIndexClient.cs Outdated Show resolved Hide resolved
src/Momento.Sdk/IPreviewVectorIndexClient.cs Outdated Show resolved Hide resolved
src/Momento.Sdk/IPreviewVectorIndexClient.cs Outdated Show resolved Hide resolved
Comment on lines +63 to +68
var indexName = $"index-{Utils.NewGuidString()}";

var createResponse = await vectorIndexClient.CreateIndexAsync(indexName, 2, SimilarityMetric.InnerProduct);
Assert.True(createResponse is CreateIndexResponse.Success, $"Unexpected response: {createResponse}");

try
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This repeated code and cleanup can be refactored to a disposable class, then you can use a using block; see https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/statements/using

@malandis
Copy link
Contributor

malandis commented Nov 1, 2023

LGTM. I will 👍 once the conflicts are resolved.

Add the upsert, delete, and search MVI methods to the vector client.

Add an internal MVI data client and grpc manager.

Update the protos to get the latest MVI changes.
Copy link
Contributor

@malandis malandis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😎

@nand4011 nand4011 merged commit 5ca7411 into main Nov 3, 2023
7 checks passed
@nand4011 nand4011 deleted the mvi-data-methods branch November 3, 2023 22:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants