Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gcp: respect permissions inherited from parent drives #563

Merged
merged 6 commits into from
Oct 17, 2020

Conversation

bobheadxi
Copy link
Member

@bobheadxi bobheadxi commented Oct 16, 2020

Drive shares frequently get recreated because we fail to page through all results (we likely have over 100 shares for large folders), and when deletion was enabled (prior to #540) we would also erroneously delete inherited permissions. Both of these issues result in many users receiving duplicate invites whenever Rocket performs a sync (#510, #497). To amend this, we make the following changes:

  • Permissions listing is now paginated
  • Permissions listing now happens for direct parents of target folders - permissions discovered in parents are treated as "inherited", and for these permissions we skip both recreation and deletion.

Ticket(s)

Closes (hopefully) #497

Details

This was really gnarly.

Note that this looks quite convoluted because there is no way in Drive to detect if a permission is inherited (attempt was made in #499) - that functionality is restricted to "Shared Drives", which we do not pay for. There are most misc. comments and context in #497 and #510

I have also not performed an integration test because I'm inclined to just yolo-deploy and revert if broken, which should be easy to do now that we've adopted #560 🙃

@bobheadxi bobheadxi requested a review from chuck-sys October 16, 2020 07:10
@bobheadxi bobheadxi requested review from a team as code owners October 16, 2020 07:10
@bobheadxi bobheadxi changed the title gcp: respect drive parents permissions gcp: respect permissions inherited from parent drives Oct 16, 2020
@codecov
Copy link

codecov bot commented Oct 16, 2020

Codecov Report

Merging #563 into master will decrease coverage by 0.09%.
The diff coverage is 84.61%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #563      +/-   ##
==========================================
- Coverage   93.81%   93.71%   -0.10%     
==========================================
  Files          47       47              
  Lines        2669     2707      +38     
  Branches      361      369       +8     
==========================================
+ Hits         2504     2537      +33     
- Misses        108      111       +3     
- Partials       57       59       +2     
Impacted Files Coverage Δ
interface/gcp_utils.py 41.37% <0.00%> (ø)
interface/gcp.py 84.44% <85.93%> (+1.75%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 61572d3...6c712aa. Read the comment docs.

interface/gcp.py Outdated Show resolved Hide resolved
interface/gcp.py Outdated Show resolved Hide resolved
interface/gcp.py Outdated
Comment on lines 52 to 62
next_page_req = self.drive.permissions()\
.list(fileId=drive_id,
fields=", ".join(fields),
pageSize=50)
while next_page_req is not None:
list_res = next_page_req.execute()
permissions: List[Any] = list_res['permissions']
for p in permissions:
if 'emailAddress' in p:
perm = GCPDrivePermission(p['id'], p['emailAddress'])
perms.append(perm)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Abstract pagination away with Python generators?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh took a look at it and seems unnecessary and convoluted? Happy to take a suggestion if you believe theres a way to make this more succinct by setting up a generator for this single request (none of the others used, including parent listing, requires paging)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. It just makes it easier for me to read.

def get_paginated_permissions(self, drive_id: str, fields: List[str]): Iterator[Any]:
	req = self.drive.permissions()\
		.list(fileId=drive_id,
			  fields=', '.join(fields),
			  pageSize=50)
	while req is not None:
		resp = req.execute()
		for perm in resp['permissions']:
			yield perm
		req = self.drive.permissions().list_next(req, resp)
	raise StopIteration

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this, but then the tests started failing. Help.

diff --git a/interface/gcp.py b/interface/gcp.py
index 78830b5..e6a29da 100644
--- a/interface/gcp.py
+++ b/interface/gcp.py
@@ -1,5 +1,5 @@
 """Utility classes for interacting with Google APIs"""
-from typing import Any, List
+from typing import Any, List, Iterator
 from googleapiclient.discovery import Resource
 import logging
 
@@ -35,6 +35,27 @@ class GCPInterface:
                 return parents
         return []
 
+    def get_paginated_permissions(self,
+                                  drive_id: str,
+                                  fields: List[str]) -> Iterator[Any]:
+        """
+        Retrieve generator of permissions present on the given Drive item.
+
+        Pages through them so you don't have to.
+        """
+        req = self.drive.permissions()\
+            .list(fileId=drive_id,
+                  fields=', '.join(fields),
+                  pageSize=50)
+
+        while req is not None:
+            resp = req.execute()
+            for perm in resp['permissions']:
+                yield perm
+            req = self.drive.permissions().list_next(req, resp)
+
+        raise StopIteration
+
     def get_drive_permissions(self, drive_id: str) -> List[GCPDrivePermission]:
         """
         Retrieves list of permissions present on the given Drive item. It
@@ -42,34 +63,18 @@ class GCPInterface:
         as defined in :class:`GCPDrivePermission`.
         """
         perms: List[GCPDrivePermission] = []
-        page_count = 0
         fields = [
             "permissions/id",
             "permissions/emailAddress",
         ]
         # See http://googleapis.github.io/google-api-python-client/docs/dyn/drive_v3.permissions.html#list # noqa
         # pylint: disable=no-member
-        next_page_req = self.drive.permissions()\
-            .list(fileId=drive_id,
-                  fields=", ".join(fields),
-                  pageSize=50)
-        while next_page_req is not None:
-            list_res = next_page_req.execute()
-            permissions: List[Any] = list_res['permissions']
-            for p in permissions:
-                if 'emailAddress' in p:
-                    perm = GCPDrivePermission(p['id'], p['emailAddress'])
-                    perms.append(perm)
-
-            # set state for the next page
-            # see https://googleapis.github.io/google-api-python-client/docs/dyn/drive_v3.permissions.html#list_next # noqa
-            # pylint: disable=no-member
-            next_page_req = self.drive.permissions()\
-                .list_next(next_page_req, list_res)
-            page_count += 1
-
-        logging.info(f"Found {len(perms)} permissions across {page_count} "
-                     + f"pages for {drive_id}")
+        for p in self.get_paginated_permissions(drive_id, fields):
+            if 'emailAddress' in p:
+                perm = GCPDrivePermission(p['id'], p['emailAddress'])
+                perms.append(perm)
+
+        logging.info(f"Found {len(perms)} permissions for {drive_id}")
         return perms
 
     def get_parents_permissions(self,

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

---------------------------------------------------------------- Captured log call -----------------------------------------------------------------
WARNING  root:gcp.py:118 Unable to fetch parents for drive item(team, target-drive): name 'Iterator' is not defined
ERROR    root:gcp.py:141 Failed to load permissions for drive item(team, target-drive): name 'Iterator' is not defined

Working on applying a version of your patch

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1df27bb

I'm against having the paginated function be a to-level API for a similar reason to #563 (comment) (don't expose unused APIs), so I scoped it to the only function that will use it (this one)

interface/gcp.py Outdated Show resolved Hide resolved
interface/gcp.py Outdated Show resolved Hide resolved
interface/gcp.py Show resolved Hide resolved
@bobheadxi bobheadxi requested a review from chuck-sys October 17, 2020 00:45
@bobheadxi bobheadxi merged commit bc9a6d8 into master Oct 17, 2020
@bobheadxi bobheadxi deleted the permissions-paging branch October 17, 2020 01:28
@bobheadxi bobheadxi linked an issue Oct 17, 2020 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

drive: permission invites get re-sent on occasion
2 participants