Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't store signature in ElasticSearch index #74

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ecdeveloper
Copy link

@rhsimplex
Copy link
Owner

Hey thanks for submitting. Hopefully I can review this soon. The test is failing because of something on my end, so I have to fix that first.

@rhsimplex
Copy link
Owner

Hey @ecdeveloper sorry I took so long to get around to this. I fixed the build in master, can you rebase?

Thank you

@rhsimplex rhsimplex self-requested a review May 14, 2017 20:43
@ecdeveloper ecdeveloper force-pushed the no-signature-indexed branch 3 times, most recently from 22a0523 to 45afa8c Compare May 15, 2017 15:57
@ecdeveloper
Copy link
Author

Thanks, @rhsimplex. I fixed some of my tests, still have to fix a few of them, and probably need to add some tests around checking the score (as you do with dist).

@ecdeveloper ecdeveloper force-pushed the no-signature-indexed branch 2 times, most recently from 8730577 to 05bdf55 Compare May 16, 2017 06:25
@ecdeveloper
Copy link
Author

Fixed all tests, except for a single one in test_elasticsearch_driver_metadata_as_nested. Will take a closer look tomorrow.

@rhsimplex
Copy link
Owner

That's the test for adding metadata, which should be ignored in the image search. Maybe the query is including it. I'll look too.

@rhsimplex
Copy link
Owner

Another update on my side -- since your change affects the behavior on a lot of the code, it's important we make sure the tests are actually hitting all your changes. I'll need to get codecov up and running (which means enlisting the help of @vrde since I no longer work at Ascribe) and possibly update the tests to do all checks with and without signatures.

Thanks for your patience, it's just a big change so I want to get it right 🥇

@ecdeveloper
Copy link
Author

Sure, no rush here. Thanks for your assistance.

@rhsimplex
Copy link
Owner

Sorry @ecdeveloper, I haven't forgotten this. Both myself and @vrde were coincidentally on vacation!

@ecdeveloper
Copy link
Author

@rhsimplex No worries. I just got back from a vacation too ;)

@rhsimplex rhsimplex closed this Jul 10, 2017
@rhsimplex rhsimplex reopened this Jul 10, 2017
@rhsimplex
Copy link
Owner

We finally got Codecov. Can you rebase on master so it runs the report?

* Sort/cutoff by elasticsearch relevance score instead
@ecdeveloper
Copy link
Author

@rhsimplex yay! k, rebased.

@rhsimplex
Copy link
Owner

Believe it or not, I haven't forgotten about this PR. Looking into why the metadata filter test is failing.

@ecdeveloper
Copy link
Author

@rhsimplex Come on, man! I believe in you 😄 Actually, no worries. Take your time ;)

@@ -117,7 +117,7 @@ def insert_single_record(self, rec):
raise NotImplementedError

def __init__(self, k=16, N=63, n_grid=9,
crop_percentile=(5, 95), distance_cutoff=0.45,
crop_percentile=(5, 95), distance_cutoff=0.45, score_cutoff=9.0,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stuff for Elasticsearch should be in the Elasticsearch driver. You can include a **kwargs here and pick it up in the child class (tell me if it's not clear what I'm saying).

@@ -159,6 +159,8 @@ def __init__(self, k=16, N=63, n_grid=9,
considering how much variance to keep in the image (default (5, 95))
distance_cutoff (Optional [float]): maximum image signature distance to
be considered a match (default 0.45)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the default value of 9.0 is the reason the test is failing ... only 8 words match, it looks like. Maybe I need to think a bit more about the default value, but if you change to something less than 8, the tests should pass.


return formatted_res

def insert_single_record(self, rec, refresh_after=False):
rec['timestamp'] = datetime.now()

# Don't store signature in index
if 'signature' in rec:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should include a setting in the class constructor whether we want to store the signature or not. The default should be yes, to keep it backwards compatible. I can work on this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you think you can work on that? Or should I handle this?

ps Can't wait to get that PR in ;)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can look at it sometime in the near future

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants