HJ-97 - Update systems endpoint to filter vendor deleted systems #5553

Merged on Dec 4, 2024 (4 commits)
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -25,6 +25,7 @@ The types of changes are:
### Changed
- Adding hashes to system tab URLs [#5535](https://github.com/ethyca/fides/pull/5535)
- Updated Cookie House to be responsive [#5541](https://github.com/ethyca/fides/pull/5541)
- Updated `/system` endpoint to filter vendor deleted systems [#5553](https://github.com/ethyca/fides/pull/5553)

### Developer Experience
- Migrated remaining instances of Chakra's Select component to use Ant's Select component [#5502](https://github.com/ethyca/fides/pull/5502)
52 changes: 33 additions & 19 deletions src/fides/api/api/v1/endpoints/system.py
@@ -1,3 +1,4 @@
import datetime
from typing import Annotated, Dict, List, Optional, Union

from fastapi import Depends, HTTPException, Query, Response, Security
@@ -9,6 +10,7 @@
from fideslang.validation import FidesKey
from loguru import logger
from pydantic import Field
from sqlalchemy import or_
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.future import select
from sqlalchemy.orm import Session
@@ -17,11 +19,7 @@

from fides.api.api import deps
from fides.api.api.v1.endpoints.saas_config_endpoints import instantiate_connection
from fides.api.db.crud import (
get_resource,
get_resource_with_custom_fields,
list_resource,
)
from fides.api.db.crud import get_resource, get_resource_with_custom_fields
from fides.api.db.ctl_session import get_async_db
from fides.api.db.system import (
create_system,
@@ -391,30 +389,23 @@ async def ls( # pylint: disable=invalid-name
data_subjects: Optional[List[FidesKey]] = Query(None),
dnd_relevant: Optional[bool] = Query(None),
show_hidden: Optional[bool] = Query(False),
show_deleted: Optional[bool] = Query(False),
) -> List:
"""Get a list of all of the Systems.
If any parameters or filters are provided the response will be paginated and/or filtered.
Otherwise all Systems will be returned (this may be a slow operation if there are many systems,
so using the pagination parameters is recommended).
"""
if not (
size
or page
or search
or data_uses
or data_categories
or data_subjects
or dnd_relevant
or show_hidden
):
return await list_resource(System, db)

query = select(System)

pagination_params = Params(page=page or 1, size=size or 50)
# Need to join with PrivacyDeclaration in order to be able to filter
# by data use, data category, and data subject
query = select(System).outerjoin(
PrivacyDeclaration, System.id == PrivacyDeclaration.system_id
)
if any([data_uses, data_categories, data_subjects]):
query = query.outerjoin(
PrivacyDeclaration, System.id == PrivacyDeclaration.system_id
)

# Fetch any system that is relevant for Detection and Discovery, ie any of the following:
# - has connection configurations (has some integration for DnD or SaaS)
@@ -431,6 +422,15 @@ async def ls( # pylint: disable=invalid-name
System.hidden == False # pylint: disable=singleton-comparison
)

# Filter out any vendor deleted systems, unless explicitly asked for
if not show_deleted:
query = query.filter(
or_(
System.vendor_deleted_date.is_(None),
System.vendor_deleted_date >= datetime.datetime.now(),
)
)

filter_params = FilterParams(
search=search,
data_uses=data_uses,
@@ -446,6 +446,20 @@ async def ls( # pylint: disable=invalid-name

# Add a distinct so we only get one row per system
duplicates_removed = filtered_query.distinct(System.id)

if not (
size
or page
or search
or data_uses
or data_categories
or data_subjects
or dnd_relevant
or show_hidden
):
result = await db.execute(duplicates_removed)
return result.scalars().all()
Comment on lines +460 to +461
Contributor

any particular reason that we're changing the functionality here? if it's just a general optimization, i understand that motivation - but i think i'd lean toward keeping this using the simple `return await list_resource(System, db)`, because i believe this was in place for backward compatibility of this endpoint with uses in the fides CLI, etc - when it specifically did not paginate, filter, or deduplicate (or do anything besides list all the systems, raw).

granted, that backward compatibility is not well commented in the code, so if we do keep this as it was, then a nice addition would be a code comment to clarify why it's kept as `list_resource`, i.e. for backward compatibility reasons.

i see/know that @erosselli and @galvana worked on refactoring this endpoint a little while back, may be worth getting them to chime in quickly.

also, as more of a stylistic comment - i preferred this conditional coming at the beginning of the endpoint function, as it had been previously. if there's a branch of the function that's a "special case" and returns immediately, i'd prefer to see that at the beginning - i just find that easier to make sense of when reading the function...

Contributor

Hey guys, yes, that's the reason we kept the non-paginated version of the system retrieval, for backwards compatibility. I also agree with @adamsachs's comment about having this higher up so it's a little more obvious that there's a possible early return.
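
A minimal sketch of the layout both reviewers describe: the no-parameters special case is checked first, with a comment recording that it exists for backward compatibility. It uses plain Python lists instead of the real `System` model and database session. The names `list_all_raw` and `ls_sketch`, and the dict-shaped rows, are hypothetical stand-ins and not the fides implementation.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def list_all_raw(systems: List[Row]) -> List[Row]:
    """Hypothetical stand-in for list_resource(System, db): every row, untouched."""
    return list(systems)


def ls_sketch(
    systems: List[Row],
    size: Optional[int] = None,
    page: Optional[int] = None,
    search: Optional[str] = None,
    show_hidden: bool = False,
) -> Any:
    # Backward compatibility: callers that pass no parameters (for example a CLI)
    # get the raw list with no pagination, filtering, or deduplication.
    # Keeping this check first makes the early return easy to spot.
    if not (size or page or search or show_hidden):
        return list_all_raw(systems)

    # Filtered and paginated branch (assumed shape, for illustration only).
    rows = [s for s in systems if show_hidden or not s.get("hidden")]
    if search:
        rows = [s for s in rows if search.lower() in s["name"].lower()]

    page = page or 1
    size = size or 50
    start = (page - 1) * size
    return {
        "items": rows[start : start + size],
        "total": len(rows),
        "page": page,
        "size": size,
    }
```

With no arguments the sketch returns a bare list, and with any argument it returns a paginated dict, which mirrors the two response shapes the endpoint's docstring describes.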


return await async_paginate(db, duplicates_removed, pagination_params)
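
The vendor-deleted filter added to `ls` in this diff keeps a system when its `vendor_deleted_date` is null or has not yet passed, unless `show_deleted` is set. A minimal sketch of that rule as a plain Python predicate follows. `is_listed` and its `now` parameter are hypothetical names introduced for illustration; the endpoint itself expresses the rule as a SQLAlchemy filter. The sketch uses naive datetimes, as the diff's `datetime.datetime.now()` call does.

```python
from datetime import datetime, timedelta
from typing import Optional


def is_listed(
    vendor_deleted_date: Optional[datetime],
    show_deleted: bool = False,
    now: Optional[datetime] = None,
) -> bool:
    """Hypothetical predicate mirroring the filter's intent, not the real query."""
    if show_deleted:
        # Caller explicitly asked for vendor deleted systems: no filtering.
        return True
    now = now or datetime.now()
    # Keep systems that were never vendor deleted, or whose deletion date
    # is still in the future.
    return vendor_deleted_date is None or vendor_deleted_date >= now


if __name__ == "__main__":
    reference = datetime(2024, 12, 4)
    yesterday = reference - timedelta(days=1)
    tomorrow = reference + timedelta(days=1)
    print(is_listed(yesterday, now=reference))
    print(is_listed(yesterday, show_deleted=True, now=reference))
    print(is_listed(tomorrow, now=reference))
    print(is_listed(None, now=reference))
```

The four calls correspond to the four parametrized cases in `test_vendor_deleted_systems`.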


36 changes: 35 additions & 1 deletion tests/ctl/core/test_api.py
@@ -2,7 +2,7 @@
"""Integration tests for the API module."""
import json
import typing
from datetime import datetime, timezone
from datetime import datetime, timedelta, timezone
from json import loads
from typing import Dict, List, Tuple
from uuid import uuid4
@@ -1665,6 +1665,40 @@ def test_list_with_pagination_and_multiple_filters_2(

assert result_json["items"][0]["fides_key"] == tcf_system.fides_key

@pytest.mark.parametrize(
"vendor_deleted_date, expected_systems_count, show_deleted",
[
(datetime.now() - timedelta(days=1), 1, True),
(datetime.now() - timedelta(days=1), 0, None),
Contributor

nit - reads a bit weird to me that the value here is True or None, although i know that None is falsey 😛

Suggested change
(datetime.now() - timedelta(days=1), 0, None),
(datetime.now() - timedelta(days=1), 0, False),

Contributor Author

More than falsey, I was trying to test the parameter being not present.

(datetime.now() + timedelta(days=1), 1, None),
(None, 1, None),
],
)
andres-torres-marroquin marked this conversation as resolved.
def test_vendor_deleted_systems(
self,
db,
test_config,
system_with_cleanup,
vendor_deleted_date,
expected_systems_count,
show_deleted,
):

system_with_cleanup.vendor_deleted_date = vendor_deleted_date
db.commit()

result = _api.ls(
url=test_config.cli.server_url,
headers=test_config.user.auth_header,
resource_type="system",
query_params={"show_deleted": True} if show_deleted else {},
)

assert result.status_code == 200
result_json = result.json()

assert len(result_json) == expected_systems_count
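
The review thread on the parametrized values turns on the difference between sending `show_deleted=False` and not sending the parameter at all. One way to make that distinction explicit is sketched below, under the assumption that `None` means "omit the parameter". `build_query_params` is a hypothetical helper and not part of the fides test suite, whose test builds the dict inline.

```python
from typing import Dict, Optional


def build_query_params(show_deleted: Optional[bool]) -> Dict[str, bool]:
    """Hypothetical helper: None omits the parameter, True/False send it explicitly."""
    if show_deleted is None:
        # Parameter absent: the server-side default applies.
        return {}
    return {"show_deleted": show_deleted}


if __name__ == "__main__":
    for value in (None, False, True):
        print(value, build_query_params(value))
```

With a helper like this, `None` and `False` become separate test cases: the first exercises the endpoint's default, the second an explicit opt-out.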


@pytest.mark.unit
class TestSystemUpdate: