[WIP] Multi-Vector support for HNSW search #13525

Open
wants to merge 135 commits into
base: main
Commits
96e09b3
tensor partial
vigyasharma May 15, 2024
50f3789
docstring edit
vigyasharma May 18, 2024
0050b30
define a KnnFloatTensorField
vigyasharma May 20, 2024
b543668
clean up tensor similarity function
vigyasharma May 20, 2024
c58a3fb
add to fieldIndo and indexing chain schema setup
vigyasharma Jun 13, 2024
b237cdb
started creating new tensor format
vigyasharma Jun 14, 2024
4324151
modified Lucene99FlatVectorsWriter to write tensors
vigyasharma Jun 16, 2024
774e91d
Tensor writer with changes to write tensorDataOffsets
vigyasharma Jun 20, 2024
d15cffc
write byte tensor values
vigyasharma Jun 20, 2024
ada3db1
read tensor metadata and create FieldEntry
vigyasharma Jun 20, 2024
e42a389
Added OffHeapFloatTensorValues
vigyasharma Jun 20, 2024
9bc98bc
support for OffHeapByteTensorValues with code to reuse the bytebuffer
vigyasharma Jun 21, 2024
5e11640
default flat tensor scorer impl
vigyasharma Jun 21, 2024
851a14b
Tensor reader impl. done
vigyasharma Jun 21, 2024
7c39b76
Flat tensor writer impl.
vigyasharma Jun 22, 2024
318455a
revert all tensor changes in FlatVectorsWriter
vigyasharma Jun 22, 2024
b24ae9a
remove unused imports
vigyasharma Jun 23, 2024
a35062f
modify hnswVectorsWriter to build graph with tensors
vigyasharma Jun 25, 2024
c94127d
reuse hnswvectorsreader for tensors
vigyasharma Jun 25, 2024
ced69a3
add knnTensorsformat with an hnsw impl
vigyasharma Jun 25, 2024
b20859f
plumb tensors into the indexing chain
vigyasharma Jun 25, 2024
a30e06e
update lucene94FieldInfosFormat constructor
vigyasharma Jun 25, 2024
13bd4c5
update FieldInfos ctor calls and bw codecs with new params
vigyasharma Jun 25, 2024
834db2b
get tensorReader from codec readers and segment readers
vigyasharma Jun 26, 2024
d214b1b
minor bug fix with readers
vigyasharma Jun 26, 2024
4adefed
add KnnFloatTensorQuery wrapper
vigyasharma Jun 26, 2024
b31e0be
add ByteTensorQuery
vigyasharma Jun 26, 2024
208b041
gradle tidy
vigyasharma Jun 26, 2024
4d173e7
remove commented code
vigyasharma Jun 26, 2024
b5551fc
tidy again
vigyasharma Jun 26, 2024
421b6d6
fix ctor bug
vigyasharma Jun 26, 2024
ee2e0d7
gradle check minus test files
vigyasharma Jun 26, 2024
f50e91a
add no commit tags
vigyasharma Jun 26, 2024
edde79a
make Tensor Similarity Fn a class with Agg enum
vigyasharma Jun 27, 2024
a2a91c4
changed tensor fields, fieldTypes and fieldInfos
vigyasharma Jun 27, 2024
0f7b1dd
modify FlatTensor reader and writers with new FI format
vigyasharma Jun 27, 2024
825e267
modify HnswVectors reader and writer for new FieldInfo
vigyasharma Jun 27, 2024
42475cc
update codecreader and tensor query classes
vigyasharma Jun 27, 2024
0316e02
fix FieldInfo in some files
vigyasharma Jun 27, 2024
0ccb40b
update FieldInfo ctor
vigyasharma Jun 27, 2024
ebc2518
add hashcode to TSF
vigyasharma Jun 27, 2024
21261af
gradle tidy
vigyasharma Jun 27, 2024
872874c
remove nocommits for builds
vigyasharma Jun 27, 2024
d52868f
linting, licenses, and compilation bugs
vigyasharma Jun 27, 2024
749a670
tidy
vigyasharma Jun 27, 2024
ece1728
add missing java docs
vigyasharma Jun 27, 2024
30bdb49
add no commits
vigyasharma Jun 28, 2024
b92d6eb
start single format multivectors change
vigyasharma Jun 30, 2024
1579aea
rename TensorSimilarity to MultiVectorSimilarity
vigyasharma Jul 1, 2024
4f5e494
Add interface to FlatVectorsScorer; del FlatTensorsScorer
vigyasharma Jul 1, 2024
f667e4d
Use DefaultFlatMultiVectorScorer that extends DefFlatVecScorer
vigyasharma Jul 1, 2024
6f13b2e
change OffHeapMVValues; change MVDataOffsetReader
vigyasharma Jul 1, 2024
429a106
change field values to Byte/FloatMultiVectorValue
vigyasharma Jul 1, 2024
104d802
changes to FlatVectorsWriter; del FlatTensorsWriter
vigyasharma Jul 1, 2024
5afa0d9
update FlatVectorsReader to handle multi-vectors
vigyasharma Jul 1, 2024
da32aae
minor import bug
vigyasharma Jul 2, 2024
3fb3808
del FlatTensorReader and FlatTensorFormat
vigyasharma Jul 2, 2024
d0a0b65
Hnsw reader, writer and format changes
vigyasharma Jul 2, 2024
001afd7
remove multivector metadata from hnsw format; not needed
vigyasharma Jul 2, 2024
ca41e85
remove codec change; use static default var for agg fn
vigyasharma Jul 2, 2024
db97c0d
change FieldInfo values; remove TensorQuery classes
vigyasharma Jul 2, 2024
7769e48
restore readers to main branch versions
vigyasharma Jul 2, 2024
1274a27
reword some occurrences of tensor in comments
vigyasharma Jul 2, 2024
05174ff
indexing change and vectorValConsumer changes
vigyasharma Jul 2, 2024
cd08506
some comment changes; use DEF_AGG for MultiVecSimFnAgg
vigyasharma Jul 2, 2024
79450dd
tidy
vigyasharma Jul 2, 2024
cd7e890
compile errors from FI ctor
vigyasharma Jul 2, 2024
0edb868
use ByteBuffer.clear instead of reset
vigyasharma Jul 2, 2024
19fe230
add logs for assert failures
vigyasharma Jul 2, 2024
bca27ff
missed a ;
vigyasharma Jul 2, 2024
93c5077
fix bug around multi-vector condition
vigyasharma Jul 3, 2024
bc0ebc1
bug fix for non-multi-vector case
vigyasharma Jul 3, 2024
1d80b28
tidy
vigyasharma Jul 3, 2024
a33d533
move changes to new format files
vigyasharma Jul 3, 2024
8a098f2
tidy; remove unused imports
vigyasharma Jul 3, 2024
c591b67
remove multi-vector specific check
vigyasharma Jul 3, 2024
51ca4dc
missing docstring
vigyasharma Jul 3, 2024
b5ada6f
tensor partial
vigyasharma May 15, 2024
bc68c8f
docstring edit
vigyasharma May 18, 2024
785c958
define a KnnFloatTensorField
vigyasharma May 20, 2024
d7197be
clean up tensor similarity function
vigyasharma May 20, 2024
9c6d4f5
add to fieldIndo and indexing chain schema setup
vigyasharma Jun 13, 2024
9ce6977
started creating new tensor format
vigyasharma Jun 14, 2024
462e150
modified Lucene99FlatVectorsWriter to write tensors
vigyasharma Jun 16, 2024
b2f3fad
Tensor writer with changes to write tensorDataOffsets
vigyasharma Jun 20, 2024
a88c4ba
write byte tensor values
vigyasharma Jun 20, 2024
8a3ec4f
read tensor metadata and create FieldEntry
vigyasharma Jun 20, 2024
9d6ffd5
Added OffHeapFloatTensorValues
vigyasharma Jun 20, 2024
cf4ca9c
support for OffHeapByteTensorValues with code to reuse the bytebuffer
vigyasharma Jun 21, 2024
a9a345b
default flat tensor scorer impl
vigyasharma Jun 21, 2024
555b055
Tensor reader impl. done
vigyasharma Jun 21, 2024
0abc881
Flat tensor writer impl.
vigyasharma Jun 22, 2024
eebf38e
revert all tensor changes in FlatVectorsWriter
vigyasharma Jun 22, 2024
5334fdb
remove unused imports
vigyasharma Jun 23, 2024
bc02b0b
modify hnswVectorsWriter to build graph with tensors
vigyasharma Jun 25, 2024
1a9ed21
reuse hnswvectorsreader for tensors
vigyasharma Jun 25, 2024
f567095
add knnTensorsformat with an hnsw impl
vigyasharma Jun 25, 2024
059a8ff
plumb tensors into the indexing chain
vigyasharma Jun 25, 2024
7b86353
update lucene94FieldInfosFormat constructor
vigyasharma Jun 25, 2024
1626195
update FieldInfos ctor calls and bw codecs with new params
vigyasharma Jun 25, 2024
27449f5
get tensorReader from codec readers and segment readers
vigyasharma Jun 26, 2024
34d4063
minor bug fix with readers
vigyasharma Jun 26, 2024
543565f
add KnnFloatTensorQuery wrapper
vigyasharma Jun 26, 2024
9a0633b
add ByteTensorQuery
vigyasharma Jun 26, 2024
90c42b2
gradle tidy
vigyasharma Jun 26, 2024
8c3b1c0
remove commented code
vigyasharma Jun 26, 2024
533c2bf
tidy again
vigyasharma Jun 26, 2024
9b7cb0b
fix ctor bug
vigyasharma Jun 26, 2024
c2ae83b
gradle check minus test files
vigyasharma Jun 26, 2024
0ecda0f
add no commit tags
vigyasharma Jun 26, 2024
24d231e
make Tensor Similarity Fn a class with Agg enum
vigyasharma Jun 27, 2024
d08bc68
changed tensor fields, fieldTypes and fieldInfos
vigyasharma Jun 27, 2024
419cc9e
modify FlatTensor reader and writers with new FI format
vigyasharma Jun 27, 2024
75ecb2c
modify HnswVectors reader and writer for new FieldInfo
vigyasharma Jun 27, 2024
e6e6bdf
update codecreader and tensor query classes
vigyasharma Jun 27, 2024
c0912d1
fix FieldInfo in some files
vigyasharma Jun 27, 2024
c24a99f
update FieldInfo ctor
vigyasharma Jun 27, 2024
04636b9
add hashcode to TSF
vigyasharma Jun 27, 2024
4b2bad4
gradle tidy
vigyasharma Jun 27, 2024
a98035c
remove nocommits for builds
vigyasharma Jun 27, 2024
56db6d0
linting, licenses, and compilation bugs
vigyasharma Jun 27, 2024
97d327d
tidy
vigyasharma Jun 27, 2024
acdd158
add missing java docs
vigyasharma Jun 27, 2024
c80cbcf
add no commits
vigyasharma Jun 28, 2024
6ca9e14
merge new changes
vigyasharma Jul 3, 2024
3365889
fix all merge conflicts
vigyasharma Jul 4, 2024
12f5b54
no commits
vigyasharma Jul 4, 2024
ffda63f
remove isMultiVector member in Lucene99FlatMultiVectorsWriter.FieldWr…
cpoerschke Jul 12, 2024
b2b95cd
remove FieldInfo.isMultiVector in favour of MultiVectorSimilarityFunc…
cpoerschke Jul 12, 2024
eacc63c
in IndexingChain.FieldSchema fold setMultiVectors into setVectors
cpoerschke Jul 12, 2024
1570690
remove metadata check for fixed vector lengths. add debug logs
vigyasharma Oct 21, 2024
b1e5568
fix hnsw vector to read MV scorer supplier; fix loop and array copy i…
vigyasharma Oct 22, 2024
ce2face
add Aggregate to FieldInfos format
vigyasharma Oct 22, 2024
2e52495
assert and logs on vector-data rw offsets
vigyasharma Oct 24, 2024
7aa9555
go to base vectorscorer for single valued use-case
vigyasharma Oct 25, 2024
tensor partial
vigyasharma committed Jul 3, 2024
commit b5ada6fcd875424fa0c8955ca5c21d7ab73e2632
34 changes: 34 additions & 0 deletions lucene/core/src/java/org/apache/lucene/document/FieldType.java
@@ -25,6 +25,7 @@
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexableFieldType;
import org.apache.lucene.index.PointValues;
import org.apache.lucene.index.TensorSimilarityFunction;
import org.apache.lucene.index.VectorEncoding;
import org.apache.lucene.index.VectorSimilarityFunction;

@@ -48,6 +49,9 @@ public class FieldType implements IndexableFieldType {
private int vectorDimension;
private VectorEncoding vectorEncoding = VectorEncoding.FLOAT32;
private VectorSimilarityFunction vectorSimilarityFunction = VectorSimilarityFunction.EUCLIDEAN;
private int tensorDegree;
private VectorEncoding tensorEncoding = VectorEncoding.FLOAT32;
private TensorSimilarityFunction tensorSimilarityFunction = TensorSimilarityFunction.MAX_EUCLIDEAN;
private Map<String, String> attributes;

/** Create a new mutable FieldType with all of the properties from <code>ref</code> */
@@ -400,6 +404,36 @@ public VectorSimilarityFunction vectorSimilarityFunction() {
return vectorSimilarityFunction;
}

/** Enable tensor indexing for fixed degree tensors. Dimensions for individual vectors within
* the tensor can vary.
*
* @param degree Number of vectors in the tensor
* @param encoding {@link VectorEncoding} for each tensor vector. Should be the same for all vectors
* @param similarity Used to compare tensors during indexing and search.
*/
public void setTensorAttributes(
int degree, VectorEncoding encoding, TensorSimilarityFunction similarity) {
checkIfFrozen();
if (degree <= 0) {
throw new IllegalArgumentException("tensor degree must be > 0; got " + degree);
}
this.tensorDegree = degree;
this.tensorEncoding = Objects.requireNonNull(encoding);
this.tensorSimilarityFunction = Objects.requireNonNull(similarity);
}

public int tensorDegree() {
return tensorDegree;
}

public VectorEncoding tensorEncoding() {
return tensorEncoding;
}

public TensorSimilarityFunction tensorSimilarityFunction() {
return tensorSimilarityFunction;
}

/**
* Puts an attribute value.
*
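Review note: the new setter follows the same freeze-guarded pattern as the existing vector attributes. A minimal standalone sketch of that pattern is below. `FreezableTensorType` and its nested `Encoding` enum are hypothetical stand-ins, not Lucene classes, and the `IllegalStateException` on a frozen instance is an assumption about how `checkIfFrozen()` behaves.

```java
import java.util.Objects;

/** Standalone sketch of a freezable type holder; a stand-in, not Lucene's FieldType. */
final class FreezableTensorType {

  /** Hypothetical stand-in for VectorEncoding. */
  enum Encoding {
    BYTE,
    FLOAT32
  }

  private boolean frozen;
  private int tensorDegree;
  private Encoding tensorEncoding = Encoding.FLOAT32;

  /** Validates every argument first, then assigns, so a rejected call leaves no partial state. */
  void setTensorAttributes(int degree, Encoding encoding) {
    checkIfFrozen();
    if (degree <= 0) {
      throw new IllegalArgumentException("tensor degree must be > 0; got " + degree);
    }
    Objects.requireNonNull(encoding, "encoding must not be null");
    this.tensorDegree = degree;
    this.tensorEncoding = encoding;
  }

  /** After this call, any further setter call is rejected. */
  void freeze() {
    frozen = true;
  }

  int tensorDegree() {
    return tensorDegree;
  }

  Encoding tensorEncoding() {
    return tensorEncoding;
  }

  // Assumption: a frozen type rejects mutation with an IllegalStateException.
  private void checkIfFrozen() {
    if (frozen) {
      throw new IllegalStateException("type is frozen and cannot be changed");
    }
  }
}
```

In this sketch the null check runs before either field is assigned, so a failed call cannot leave the degree updated while the encoding is stale.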
178 changes: 178 additions & 0 deletions lucene/core/src/java/org/apache/lucene/document/KnnFloatTensorField.java
@@ -0,0 +1,178 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.lucene.document;

import org.apache.lucene.index.FloatVectorValues;
import org.apache.lucene.index.TensorSimilarityFunction;
import org.apache.lucene.index.VectorEncoding;
import org.apache.lucene.index.VectorSimilarityFunction;
import org.apache.lucene.search.KnnFloatVectorQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.util.VectorUtil;

import java.util.List;
import java.util.Objects;

/**
* A field that contains multiple (or none) floating-point numeric vectors for each document.
* Similar to {@link KnnFloatVectorField}, vectors are dense - that is, every dimension of a vector
* contains an explicit value, stored packed into an array (of type float[]) whose length is
* the vector dimension. <TODO: add more info>
*
* A field that contains a single floating-point numeric vector (or none) for each document. Vectors
* are dense - that is, every dimension of a vector contains an explicit value, stored packed into
* an array (of type float[]) whose length is the vector dimension. Values can be retrieved using
* {@link FloatVectorValues}, which is a forward-only docID-based iterator and also offers
* random-access by dense ordinal (not docId). {@link VectorSimilarityFunction} may be used to
* compare vectors at query time (for example as part of result ranking). A {@link
* KnnFloatTensorField} may be associated with a search similarity function defining the metric used
* for nearest-neighbor search among vectors of that field.
*
* @lucene.experimental
*/
public class KnnFloatTensorField extends Field {

private static FieldType createType(List<float[]> t, TensorSimilarityFunction similarityFunction) {
if (t == null) {
throw new IllegalArgumentException("tensor value must not be null");
}
int degree = t.size();
if (degree == 0) {
throw new IllegalArgumentException("cannot index an empty tensor");
}
if (similarityFunction == null) {
throw new IllegalArgumentException("similarity function must not be null");
}
for (int i = 0; i < t.size(); i++) {
if (t.get(i).length == 0) {
throw new IllegalArgumentException("empty vector found at index (" + i + "). Tensor cannot have empty vectors");
}
}
FieldType type = new FieldType();
type.setTensorAttributes(degree, VectorEncoding.FLOAT32, similarityFunction);
type.freeze();
return type;
}

/**
* A convenience method for creating a tensor field type.
*
* @param degree Number of vectors in each tensor
* @param similarityFunction a function defining tensor proximity.
* @throws IllegalArgumentException if degree is not positive.
* @throws NullPointerException if similarityFunction is null.
*/
public static FieldType createFieldType(
int degree, TensorSimilarityFunction similarityFunction) {
FieldType type = new FieldType();
type.setTensorAttributes(degree, VectorEncoding.FLOAT32, similarityFunction);
type.freeze();
return type;
}

/**
* Create a new vector query for the provided field targeting the float vector
*
* @param field The field to query
* @param queryVector The float vector target
* @param k The number of nearest neighbors to gather
* @return A new vector query
*/
public static Query newVectorQuery(String field, float[] queryVector, int k) {
return new KnnFloatVectorQuery(field, queryVector, k);
}

/**
* Creates a numeric vector field. Fields are single-valued: each document has either one value or
* no value. Vectors of a single field share the same dimension and similarity function. Note that
* some vector similarities (like {@link VectorSimilarityFunction#DOT_PRODUCT}) require values to
* be unit-length, which can be enforced using {@link VectorUtil#l2normalize(float[])}.
*
* @param name field name
* @param vector value
* @param similarityFunction a function defining vector proximity.
* @throws IllegalArgumentException if any parameter is null, or the vector is empty or has
* dimension &gt; 1024.
*/
public KnnFloatTensorField(
String name, float[] vector, VectorSimilarityFunction similarityFunction) {
super(name, createType(vector, similarityFunction));
fieldsData = VectorUtil.checkFinite(vector); // null check done above
}

/**
* Creates a numeric vector field with the default EUCLIDEAN_HNSW (L2) similarity. Fields are
* single-valued: each document has either one value or no value. Vectors of a single field share
* the same dimension and similarity function.
*
* @param name field name
* @param vector value
* @throws IllegalArgumentException if any parameter is null, or the vector is empty or has
* dimension &gt; 1024.
*/
public KnnFloatTensorField(String name, float[] vector) {
this(name, vector, VectorSimilarityFunction.EUCLIDEAN);
}

/**
* Creates a numeric vector field. Fields are single-valued: each document has either one value or
* no value. Vectors of a single field share the same dimension and similarity function.
*
* @param name field name
* @param vector value
* @param fieldType field type
* @throws IllegalArgumentException if any parameter is null, or the vector is empty or has
* dimension &gt; 1024.
*/
public KnnFloatTensorField(String name, float[] vector, FieldType fieldType) {
super(name, fieldType);
if (fieldType.vectorEncoding() != VectorEncoding.FLOAT32) {
throw new IllegalArgumentException(
"Attempt to create a vector for field "
+ name
+ " using float[] but the field encoding is "
+ fieldType.vectorEncoding());
}
Objects.requireNonNull(vector, "vector value must not be null");
if (vector.length != fieldType.vectorDimension()) {
throw new IllegalArgumentException(
"The number of vector dimensions does not match the field type");
}
fieldsData = VectorUtil.checkFinite(vector);
}

/** Return the vector value of this field */
public float[] vectorValue() {
return (float[]) fieldsData;
}

/**
* Set the vector value of this field
*
* @param value the value to set; must not be null, and length must match the field type
*/
public void setVectorValue(float[] value) {
if (value == null) {
throw new IllegalArgumentException("value must not be null");
}
if (value.length != type.vectorDimension()) {
throw new IllegalArgumentException(
"value length " + value.length + " must match field dimension " + type.vectorDimension());
}
fieldsData = value;
}
}
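Review note: `createType` rejects null tensors, empty tensors and empty member vectors, while the constructors rely on `VectorUtil.checkFinite` for a single `float[]`. A rough standalone sketch of how those checks could be combined for a `List<float[]>` value is below. `TensorChecks` and `validatedDegree` are hypothetical names, and the per-element finiteness loop is an assumption about the intended behaviour, not code from this PR.

```java
import java.util.List;

/** Standalone sketch of tensor input validation; names are illustrative only. */
final class TensorChecks {

  /** Returns the tensor degree (number of vectors) once the input passes all checks. */
  static int validatedDegree(List<float[]> tensor) {
    if (tensor == null) {
      throw new IllegalArgumentException("tensor value must not be null");
    }
    if (tensor.isEmpty()) {
      throw new IllegalArgumentException("cannot index an empty tensor");
    }
    for (int i = 0; i < tensor.size(); i++) {
      float[] vector = tensor.get(i);
      if (vector == null || vector.length == 0) {
        throw new IllegalArgumentException(
            "empty vector found at index (" + i + "). Tensor cannot have empty vectors");
      }
      for (float value : vector) {
        // Assumed to mirror the intent of a finiteness check: reject NaN and infinities.
        if (!Float.isFinite(value)) {
          throw new IllegalArgumentException("non-finite value in vector at index (" + i + ")");
        }
      }
    }
    return tensor.size();
  }
}
```

Vectors of different lengths pass these checks, matching the `setTensorAttributes` Javadoc, which says dimensions of individual vectors within a tensor can vary.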
141 changes: 141 additions & 0 deletions lucene/core/src/java/org/apache/lucene/index/TensorSimilarityFunction.java
@@ -0,0 +1,141 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.lucene.index;

import java.util.List;

import static org.apache.lucene.util.VectorUtil.*;

/**
* Tensor similarity function; used in search to return the top K tensors most similar to a target
* tensor. This is a label describing the method used during indexing and searching of the tensors
* in order to determine the nearest neighbors.
*/
public enum TensorSimilarityFunction {

/**
* Max Euclidean similarity - returns the maximum of the pairwise {@link
* VectorSimilarityFunction#EUCLIDEAN} similarity scores across corresponding vectors in the two
* tensors.
*/
MAX_EUCLIDEAN {
@Override
public float compare(List<float[]> t1, List<float[]> t2) {
if (t1.size() != t2.size()) {
throw new IllegalArgumentException("tensor degrees differ: " + t1.size() + "!=" + t2.size());
}
float sim = Float.NEGATIVE_INFINITY; // Float.MIN_VALUE is the smallest positive float, not the lowest value
for (int i = 0; i < t1.size(); i++) {
sim = Math.max(sim, VectorSimilarityFunction.EUCLIDEAN.compare(t1.get(i), t2.get(i)));
}
return sim;
}

// @Override
// public float compare(byte[] v1, byte[] v2) {
// throw new UnsupportedOperationException("not implemented for tensors");
// }
},

/**
* Max Dot product. NOTE: this similarity is intended as an optimized way to perform cosine
* similarity. In order to use it, all vectors must be normalized, including both document and
* query vectors. Using dot product with vectors that are not normalized can result in errors or
* poor search results. Floating point vectors must be normalized to be of unit length, while byte
* vectors should simply all have the same norm.
*/
MAX_DOT_PRODUCT {
@Override
public float compare(List<float[]> t1, List<float[]> t2) {
if (t1.size() != t2.size()) {
throw new IllegalArgumentException("tensor degrees differ: " + t1.size() + "!=" + t2.size());
}
float sim = Float.NEGATIVE_INFINITY; // Float.MIN_VALUE is the smallest positive float, not the lowest value
for (int i = 0; i < t1.size(); i++) {
sim = Math.max(sim, VectorSimilarityFunction.DOT_PRODUCT.compare(t1.get(i), t2.get(i)));
}
return sim;
}

// @Override
// public float compare(byte[] v1, byte[] v2) {
// throw new UnsupportedOperationException("not implemented for tensors");
// }
},

/**
* Max Cosine similarity. NOTE: the preferred way to perform cosine similarity is to normalize all
* vectors to unit length, and instead use {@link TensorSimilarityFunction#MAX_DOT_PRODUCT}. You
* should only use this function if you need to preserve the original vectors and cannot normalize
* them in advance. The cosine similarity score per vector is normalised to assure it is positive.
*/
MAX_COSINE {
@Override
public float compare(List<float[]> t1, List<float[]> t2) {
if (t1.size() != t2.size()) {
throw new IllegalArgumentException("tensor degrees differ: " + t1.size() + "!=" + t2.size());
}
float sim = Float.NEGATIVE_INFINITY; // Float.MIN_VALUE is the smallest positive float, not the lowest value
for (int i = 0; i < t1.size(); i++) {
sim = Math.max(sim, VectorSimilarityFunction.COSINE.compare(t1.get(i), t2.get(i)));
}
return sim;
}

// @Override
// public float compare(byte[] v1, byte[] v2) {
// return (1 + cosine(v1, v2)) / 2;
// }
};

// /**
// * Maximum inner product. This is like {@link TensorSimilarityFunction#DOT_PRODUCT}, but does not
// * require normalization of the inputs. Should be used when the embedding vectors store useful
// * information within the vector magnitude
// */
// MAXIMUM_INNER_PRODUCT {
// @Override
// public float compare(float[] v1, float[] v2) {
// return scaleMaxInnerProductScore(dotProduct(v1, v2));
// }
//
// @Override
// public float compare(byte[] v1, byte[] v2) {
// return scaleMaxInnerProductScore(dotProduct(v1, v2));
// }
// };

/**
* Calculates a similarity score between the two tensors with a specified function. Higher
* similarity scores correspond to closer tensors.
*
* @param t1 a tensor with non-empty vectors
* @param t2 another tensor, of the same degree with corresponding vectors of the same dimension.
* @return the value of the similarity function applied to the two tensors
*/
public abstract float compare(List<float[]> t1, List<float[]> t2);

// /**
// * Calculates a similarity score between the two vectors with a specified function. Higher
// * similarity scores correspond to closer vectors. Each (signed) byte represents a vector
// * dimension.
// *
// * @param v1 a vector
// * @param v2 another vector, of the same dimension
// * @return the value of the similarity function applied to the two vectors
// */
// public abstract float compare(byte[] v1, byte[] v2);
}
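Review note: the three enum constants share one shape, which is to score each pair of corresponding vectors and keep the maximum. A minimal standalone sketch of that aggregation is below. `MaxSimSketch` is a hypothetical class, and the per-vector score `1 / (1 + squaredDistance)` is an assumption about how the Euclidean similarity is scaled, not something taken from this diff. The sketch seeds the running maximum with `Float.NEGATIVE_INFINITY`, because `Float.MIN_VALUE` is the smallest positive float and would be returned in place of a true score of 0.

```java
import java.util.List;

/** Standalone sketch of max-aggregated similarity; not the PR's TensorSimilarityFunction. */
final class MaxSimSketch {

  /** Assumed per-vector score: 1 / (1 + squared L2 distance), so closer vectors score higher. */
  static float euclideanScore(float[] a, float[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("vector dimensions differ: " + a.length + "!=" + b.length);
    }
    float squaredDistance = 0f;
    for (int i = 0; i < a.length; i++) {
      float diff = a[i] - b[i];
      squaredDistance += diff * diff;
    }
    return 1f / (1f + squaredDistance);
  }

  /** Maximum per-vector score over corresponding vectors of two equal-degree tensors. */
  static float maxEuclidean(List<float[]> t1, List<float[]> t2) {
    if (t1.size() != t2.size()) {
      throw new IllegalArgumentException("tensor degrees differ: " + t1.size() + "!=" + t2.size());
    }
    // Seed below every possible score so the first comparison always wins.
    float best = Float.NEGATIVE_INFINITY;
    for (int i = 0; i < t1.size(); i++) {
      best = Math.max(best, euclideanScore(t1.get(i), t2.get(i)));
    }
    return best;
  }
}
```

For two empty tensors this sketch returns negative infinity. The `compare` Javadoc requires non-empty vectors but leaves the empty-tensor case open, so rejecting it up front may be preferable.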