-
Notifications
You must be signed in to change notification settings - Fork 404
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't store signature in ElasticSearch index #74
base: master
Are you sure you want to change the base?
Conversation
ecdeveloper
commented
May 9, 2017
- Sort/cutoff by elasticsearch relevance score instead (references Signature Compression #2)
Hey thanks for submitting. Hopefully I can review this soon. The test is failing because of something on my end, so I have to fix that first. |
Hey @ecdeveloper sorry I took so long to get around to this. I fixed the build in master, can you rebase? Thank you |
22a0523
to
45afa8c
Compare
Thanks, @rhsimplex. I fixed some of my tests, still have to fix a few of them, and probably need to add some tests around checking the score (as you do with dist). |
8730577
to
05bdf55
Compare
Fixed all tests, except for a single one in |
05bdf55
to
1fda05a
Compare
That's the test for adding metadata, which should be ignored in the image search. Maybe the query is including it. I'll look too. |
Another update on my side -- since your change affects the behavior on a lot of the code, it's important we make sure the tests are actually hitting all your changes. I'll need to get codecov up and running (which means enlisting the help of @vrde since I no longer work at Ascribe) and possibly update the tests to do all checks with and without signatures. Thanks for your patience, it's just a big change so I want to get it right 🥇 |
Sure, no rush here. Thanks for your assistance. |
Sorry @ecdeveloper, I haven't forgotten this. Both myself and @vrde were coincidentally on vacation! |
@rhsimplex No worries. I just got back from a vacation too ;) |
We finally got Codecov. Can you rebase on master so it runs the report? |
* Sort/cutoff by elasticsearch relevance score instead
1fda05a
to
d5d4020
Compare
@rhsimplex yay! k, rebased. |
Believe it or not, I haven't forgotten about this PR. Looking into why the metadata filter test is failing. |
@rhsimplex Come on, man! I believe in you 😄 Actually, no worries. Take your time ;) |
@@ -117,7 +117,7 @@ def insert_single_record(self, rec): | |||
raise NotImplementedError | |||
|
|||
def __init__(self, k=16, N=63, n_grid=9, | |||
crop_percentile=(5, 95), distance_cutoff=0.45, | |||
crop_percentile=(5, 95), distance_cutoff=0.45, score_cutoff=9.0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stuff for Elasticsearch should be in the Elasticsearch driver. You can include a **kwargs here and pick it up in the child class (tell me if it's not clear what I'm saying).
@@ -159,6 +159,8 @@ def __init__(self, k=16, N=63, n_grid=9, | |||
considering how much variance to keep in the image (default (5, 95)) | |||
distance_cutoff (Optional [float]): maximum image signature distance to | |||
be considered a match (default 0.45) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the default value of 9.0
is the reason the test is failing ... only 8 words match, it looks like. Maybe I need to think a bit more about the default value, but if you change to something less than 8, the tests should pass.
|
||
return formatted_res | ||
|
||
def insert_single_record(self, rec, refresh_after=False): | ||
rec['timestamp'] = datetime.now() | ||
|
||
# Don't store signature in index | ||
if 'signature' in rec: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should include a setting in the class constructor whether we want to store the signature or not. The default should be yes, to keep it backwards compatible. I can work on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you think you can work on that? Or should I handle this?
ps Can't wait to get that PR in ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can look at it sometime in the near future