-
-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multilink #296
Conversation
There's a REST VOL url-encoding test with a group named |
The issue with URL encoding is actually on the REST VOL's side. This PR fixed an issue that was obscuring it. LGTM for now. Details on the encoding issue are below. The issue comes from backslashes in the link name. Before this PR, the link name After this PR, the link is successfully created with the entire name. When creating the link, the name is provided as in body of a I think this is because backslashes in the URL don't need to be escaped since they are (or should be) already percent-encoded, but backslashes in JSON strings do need to be escaped with another backslash. So when |
This looks good - I'll approve once I have a build of the REST VOL working with the new responses. |
obj_json["class"] = getObjectClass(obj_id) | ||
|
||
json_objs[h5path] = obj_json | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some response parsing callbacks in the REST VOL use hrefs to get information about the target domain, so POST_Domain
not returning any hrefs becomes a problem. As a first pass, this solved the problem by attaching hrefs to each returned object individually (though a complete version might need to handle when some objects have different domains due to external links):
hrefs = []
hrefs.append({"rel": "self", "href": getHref(request, "/")})
if "root" in domain_json:
root_uuid = domain_json["root"]
href = getHref(request, "/datasets")
hrefs.append({"rel": "database", "href": href})
href = getHref(request, "/groups")
hrefs.append({"rel": "groupbase", "href": href})
href = getHref(request, "/datatypes")
hrefs.append({"rel": "typebase", "href": href})
href = getHref(request, "/groups/" + root_uuid)
hrefs.append({"rel": "root", "href": href})
href = getHref(request, "/")
hrefs.append({"rel": "home", "href": href})
hrefs.append({"rel": "acls", "href": getHref(request, "/acls")})
parent_domain = getParentDomain(domain)
if not parent_domain or getPathForDomain(parent_domain) == "/":
is_toplevel = True
else:
is_toplevel = False
if not is_toplevel:
href = getHref(request, "/", domain=parent_domain)
hrefs.append({"rel": "parent", "href": href})
for h5path in json_objs:
(json_objs[h5path])["hrefs"] = hrefs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've come to think of the hrefs as mainly for human consumption - i.e. clicking on a link that shows up in the browser. As such, hrefs for POST and PUT don't seem as useful.
Another issue is that some of the POST responses could potentially involve a lot of hrefs in the response - e.g. multiple hrefs for each link in POST_links response.
Could you describe what info that you are currently pulling out of the hrefs? How about we just include whatever that is in the POST response as regular json?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The REST VOL gets the domain that contains the object from the hrefs, to satisfy some library API calls that retrieve the object's file number (H5Oget_info3
). It stores the hash of the domain name as a file number, so that comparisons between different objects' file numbers still work.
Returning the Domain that contains the object directly could also solve this. The REST VOL uses this information in GET_Datatype/Group/Dataset
and POST_Domain
requests.
Besides my two comments, this seems good to go in - I have a draft PR to the REST VOL which works with a version of this PR that incorporates the changes I requested |
I've checked in some new code to let you return links based on a regex-like pattern. Also added a follow_links paran to GET_Links. |
tests/integ/link_test.py
Outdated
expected_dset_links = ("dset1.1.1", "dset1.1.2", "dset2.1", "dset2.2") | ||
expected_soft_links = ("slink", ) | ||
expected_external_links = ("extlink", ) | ||
self.assertEqual(len(grp_links), 6) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: It's not clear what this 6 corresponds to, since expected_group_links
has 5 entries.
Does grp_links
refer to 'links under the targeted group in the request' rather than 'links to groups'? Renaming grp_links
would make this more clear.
tests/integ/link_test.py
Outdated
hrefs = rspJson["hrefs"] | ||
self.assertEqual(len(hrefs), 3) | ||
self.assertTrue("links" in rspJson) | ||
grp_links = rspJson["links"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above about grp_links
tests/integ/link_test.py
Outdated
@@ -507,6 +507,23 @@ def testGet(self): | |||
expected_uuid = helper.getUUIDByPath(domain, "/g1/g1.2/g1.2.1", session=self.session) | |||
self.assertEqual(expected_uuid, g1_2_1_uuid) | |||
|
|||
# do get with a regex pattern | |||
params = {"pattern": "ext*"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section should probably be moved under testGetPattern
tests/integ/link_test.py
Outdated
@@ -1115,7 +1259,7 @@ def testPostLinkMultiple(self): | |||
obj_links = rspJson["links"] | |||
self.assertEqual(len(obj_links), 6) | |||
expected_group_links = ("g1", "g2", "g1.1", "g1.2", "g1.2.1", ) | |||
expected_dset_links = ("dset1.2", "dset2.2", "dset1.1.1", "dset1.1.2", "dset2.1", ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was there a reason to remove dset1.2
from this test? It doesn't cause any issues when added back in.
hsds/dset_lib.py
Outdated
|
||
# get all chunk ids for chunks that have been allocated | ||
chunk_ids = await getAllocatedChunkIds(app, dset_id, bucket=bucket) | ||
chunk_ids.sort() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Double sort
hsds/link_dn.py
Outdated
@@ -109,6 +114,16 @@ async def GET_Links(request): | |||
titles.sort() # sort by key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
globmatch
returns the link titles in arbitrary order, so it should probably be moved before this
ret_links = [] | ||
for link in grp_links: | ||
title = link["title"] | ||
if globmatch(title, pattern): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this use of globmatch redundant, since domain_crawl.sn/get_links
-> getLinks
->GET_Links (DN)
already does the pattern matching?
Added post method for links and multi support for delete link