---
authors: Krzysztof Skrzętnicki ([email protected])
state: draft
---

# RFD 151 - Database Permission Management

## Required Approvals

- Engineering: @r0mant && @smallinsky
- Product: @xinding33 || @klizhentas
- Security: @reedloden || @jentfoo

## What

Extends [automated db user provisioning](0113-automatic-database-users.md) with
permission management capabilities.

## Why

Managing database-level permissions is a natural extension of current Teleport
RBAC capabilities. The described model should also integrate seamlessly with
TAG, providing a future-proof solution.

Database administrators will be able to use Teleport for both user and
permission management.

## Details

### UX

A set of global "database object import rules" resources is defined for the
Teleport cluster. At certain points in time, the database schema is read and
passed through the import rules. Individual import rules may apply custom
labels. A database object resource is only created if at least one import rule
matches it.

Only object attributes (standard ones like `protocol` or custom ones from the
`attributes` field), which are sourced from the object spec, are matched against
the import rules. Labels present on an object, whether added by an import rule
or coming from another source, are not matched by the import rules. Permission
matching, described below, differs in this regard.

Example: the `sales-prod` import rule, which imports all tables from the `sales`
Postgres database in prod:

```yaml
kind: db_object_import_rule
metadata:
  name: sales-prod
spec:
  db_labels:
    env: prod
  mappings:
    - object_match:
        - database: sales
          object_kind: table
          protocol: postgres
      add_labels:
        env: prod
        product: sales
  priority: 10
version: v1
```

Example database object, with applied labels:

```yaml
kind: db_object
metadata:
  labels:
    env: prod
    product: sales
  name: sales_main
spec:
  attributes:
    attr1: custom attr1 value
    attr2: custom attr2 value
  database: sales
  name: sales_main
  object_kind: table
  protocol: postgres
  schema: public
  service_name: sales-prod-123
version: v1
```

The permissions for particular objects are defined in a role using the new
`db_permissions` field. Each permission specifies the object kind (e.g. `table`),
the permission to grant or revoke (e.g. `SELECT`), and the labels to be matched.

The matching is performed against the database object's:

- attributes: standard and custom, sourced from the object spec and provided by
  the database-specific schema import code;
- resource labels: aside from the standard labels, these can be manipulated by
  the import rules.

If both attributes and labels provide the same non-empty key, the attributes
take precedence.
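
A minimal sketch of how the combined set of matchable key/value pairs could be
assembled for a single object, assuming simplified stand-in types rather than
the actual Teleport implementation; a non-empty attribute overrides a label with
the same key:

```go
package main

import "fmt"

// databaseObject is a simplified stand-in for a db_object resource:
// resource labels plus the attributes sourced from the object spec.
type databaseObject struct {
	labels     map[string]string
	attributes map[string]string
}

// matchableProperties merges labels and attributes into a single map.
// Attributes are written last, so a non-empty attribute value takes
// precedence over a label with the same key.
func matchableProperties(obj databaseObject) map[string]string {
	out := make(map[string]string, len(obj.labels)+len(obj.attributes))
	for k, v := range obj.labels {
		out[k] = v
	}
	for k, v := range obj.attributes {
		if v != "" {
			out[k] = v
		}
	}
	return out
}

func main() {
	obj := databaseObject{
		labels:     map[string]string{"env": "prod", "product": "sales"},
		attributes: map[string]string{"object_kind": "table", "product": "sales_main"},
	}
	// "product" comes from the attributes; the rest comes from the labels.
	fmt.Println(matchableProperties(obj))
}
```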

Example role `db-dev`, which grants the `SELECT` permission to some objects and
denies `UPDATE` on all tables:

```yaml
kind: role
metadata:
  name: db-dev
spec:
  allow:
    db_permissions:
      - match:
          product: sales
          protocol: postgres
        object_kind: table
        permission: SELECT
  deny:
    db_permissions:
      - object_kind: table
        permission: UPDATE
```

The database objects are imported at connection time and used for permission
calculation.

Additionally, imports will run on a predetermined schedule (e.g. every
10 minutes), and the resulting objects will be stored in the backend.
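
A rough sketch of what the scheduled import loop could look like on the Database
Service side, assuming a hypothetical `importObjects` callback; a random jitter
helps spread out scans when multiple agents serve the same database:

```go
package main

import (
	"context"
	"log"
	"math/rand"
	"time"
)

// runPeriodicImport invokes importObjects on a fixed schedule with random
// jitter, so that multiple Database Service instances serving the same
// database do not all scan and write to the backend at the same moment.
func runPeriodicImport(ctx context.Context, interval time.Duration, importObjects func(context.Context) error) {
	for {
		// Add up to 10% of random jitter to the base interval.
		jitter := time.Duration(rand.Int63n(int64(interval / 10)))
		select {
		case <-ctx.Done():
			return
		case <-time.After(interval + jitter):
			if err := importObjects(ctx); err != nil {
				log.Printf("periodic database object import failed: %v", err)
			}
		}
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	runPeriodicImport(ctx, 10*time.Minute, func(context.Context) error {
		log.Println("reading schema, applying import rules, updating backend")
		return nil
	})
}
```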

The database objects stored in the backend will be used by TAG.

The permissions will be applied to the user after the user is provisioned in the
database. After the session is finished, the user is removed/deactivated, and
all permissions are revoked.

The exact statements depend on the database engine and the object kind. For
Postgres, the grants can be applied with a stored procedure along the following
lines (this version handles tables only, but can be extended to further object
kinds):

```sql
CREATE OR REPLACE PROCEDURE teleport_update_permissions(username VARCHAR, permissions_ JSONB)
LANGUAGE plpgsql
AS $$
DECLARE
    grant_data JSONB;
    grant_item JSONB;
BEGIN
    CALL teleport_remove_permissions(username);

    -- Grant the requested table permissions to the user, if any are provided.
    grant_data = permissions_->'tables';
    IF grant_data != 'null'::jsonb THEN
        FOR grant_item IN SELECT * FROM jsonb_array_elements(grant_data)
        LOOP
            EXECUTE 'GRANT ' || text(grant_item->>'privilege') || ' ON TABLE ' || QUOTE_IDENT(grant_item->>'schema') || '.' || QUOTE_IDENT(grant_item->>'table') || ' TO ' || QUOTE_IDENT(username);
        END LOOP;
    END IF;
END;
$$;
```

#### Configuration

##### Resource: `db_object_import_rule`

A new kind of resource `db_object_import_rule` is introduced.

The import rules are processed in order (defined by the `priority` field) and
matched against the database labels.

If the database matches, the object-level mapping rules are processed.

```protobuf
// DatabaseObjectImportRuleV1 is the resource representing a global database object import rule.
message DatabaseObjectImportRuleV1 {
  option (gogoproto.goproto_stringer) = false;
  option (gogoproto.stringer) = false;

  ResourceHeader Header = 1 [
    (gogoproto.nullable) = false,
    (gogoproto.jsontag) = "",
    (gogoproto.embed) = true
  ];

  DatabaseObjectImportRuleSpec Spec = 2 [
    (gogoproto.nullable) = false,
    (gogoproto.jsontag) = "spec"
  ];
}

// DatabaseObjectImportRuleSpec is the spec for database object import rule.
message DatabaseObjectImportRuleSpec {
  // Priority represents the priority of the rule application. Lower numbered rules will be applied first.
  int32 Priority = 1 [(gogoproto.jsontag) = "priority"];

  // DatabaseLabels is a set of labels which must match the database for the rule to be applied.
  map<string, string> DatabaseLabels = 2 [(gogoproto.jsontag) = "db_labels,omitempty"];

  // Mappings is a list of matches that will map match conditions to labels.
  repeated DatabaseObjectImportRuleMapping Mappings = 3 [
    (gogoproto.nullable) = false,
    (gogoproto.jsontag) = "mappings,omitempty"
  ];
}
```

Individual objects are matched against all object matches defined in the
`DatabaseObjectImportRuleMapping` message. If any of the matches succeeds (or if
the list of matches is empty), the labels specified in the mapping are applied
to the object. Processing then continues with the next mapping: multiple
mappings can match any given object.

```protobuf
// DatabaseObjectImportRuleMapping is the mapping between object properties and labels that will be added to the object.
message DatabaseObjectImportRuleMapping {
  // ObjectMatches is a set of object matching rules for this mapping.
  // For a given database object, each of the matches is attempted.
  // If any of them succeed, the labels are applied.
  repeated DatabaseObjectSpec ObjectMatches = 2 [
    (gogoproto.nullable) = false,
    (gogoproto.jsontag) = "object_match"
  ];

  // AddLabels specifies which labels to add if any of the previous matches match.
  map<string, string> AddLabels = 3 [(gogoproto.jsontag) = "add_labels"];
}
```
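
To make the intended flow concrete, here is a rough Go sketch of how the import
rules could be applied to a freshly read schema. The types and matching helpers
below are simplified stand-ins, not the generated protobuf types:

```go
package main

import (
	"fmt"
	"sort"
)

// Simplified stand-ins for the resources described above.
type importRule struct {
	priority int
	dbLabels map[string]string // labels that must match the database
	mappings []mapping
}

type mapping struct {
	objectMatches []map[string]string // each entry is matched against object attributes
	addLabels     map[string]string
}

type dbObject struct {
	attributes map[string]string // protocol, object_kind, database, schema, name, custom ones...
	labels     map[string]string
}

// subset reports whether every key/value pair in want is present in have.
func subset(want, have map[string]string) bool {
	for k, v := range want {
		if have[k] != v {
			return false
		}
	}
	return true
}

// applyImportRules applies all matching rules, in priority order, to every
// object read from the schema. Objects that match no rule at all are dropped:
// no db_object resource is created for them.
func applyImportRules(rules []importRule, dbLabels map[string]string, objects []dbObject) []dbObject {
	// Lower priority numbers are applied first.
	sort.Slice(rules, func(i, j int) bool { return rules[i].priority < rules[j].priority })

	var imported []dbObject
	for _, obj := range objects {
		obj.labels = map[string]string{}
		matched := false
		for _, rule := range rules {
			if !subset(rule.dbLabels, dbLabels) {
				continue // the rule does not apply to this database
			}
			for _, m := range rule.mappings {
				// An empty match list matches every object.
				ok := len(m.objectMatches) == 0
				for _, match := range m.objectMatches {
					if subset(match, obj.attributes) {
						ok = true
						break
					}
				}
				if !ok {
					continue
				}
				matched = true
				for k, v := range m.addLabels {
					obj.labels[k] = v
				}
			}
		}
		if matched {
			imported = append(imported, obj)
		}
	}
	return imported
}

func main() {
	objects := []dbObject{{attributes: map[string]string{
		"database": "sales", "object_kind": "table", "protocol": "postgres", "name": "sales_main",
	}}}
	rules := []importRule{{
		priority: 10,
		dbLabels: map[string]string{"env": "prod"},
		mappings: []mapping{{
			objectMatches: []map[string]string{{"database": "sales"}},
			addLabels:     map[string]string{"env": "prod", "product": "sales"},
		}},
	}}
	fmt.Println(applyImportRules(rules, map[string]string{"env": "prod"}, objects))
}
```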

##### Resource: `db_object`

Another new resource is the database object (`db_object`), imported from the
database using the import rules.

The spec for this resource is the `DatabaseObjectSpec` message, which is also
used in the `DatabaseObjectImportRuleMapping`.

The spec is equivalent to a `map<string, string>`, except it predefines a few
optional properties.

In the case of a `db_object`, the entirety of the spec is provided by the
database-specific schema introspection tool. The custom attributes provide a way
to express additional properties of the object that cannot rightly be placed in
the other attributes. The `object_kind` property is mandatory for `db_object`
resources.

```protobuf
// DatabaseObjectSpec is the spec for the database object.
message DatabaseObjectSpec {
  string Protocol = 1 [(gogoproto.jsontag) = "protocol,omitempty"];
  string ServiceName = 2 [(gogoproto.jsontag) = "service_name,omitempty"];
  string ObjectKind = 3 [(gogoproto.jsontag) = "object_kind"];
  string Database = 4 [(gogoproto.jsontag) = "database,omitempty"];
  string Schema = 5 [(gogoproto.jsontag) = "schema,omitempty"];
  string Name = 6 [(gogoproto.jsontag) = "name,omitempty"];
  // extra attributes for matching
  map<string, string> Attributes = 7 [(gogoproto.jsontag) = "attributes,omitempty"];
}
```

#### Role extension: `spec.{allow,deny}.db_permissions` fields

The role is extended with a `db_permissions` field, which consists of a list of
permissions to be applied to particular database objects, provided the
permission's properties (object kind, match labels) match the given object.

```protobuf

  // ...

  // DatabasePermission specifies a set of permissions that will be granted
  // to the database user when using automatic database user provisioning.
  repeated DatabasePermission DatabasePermissions = 38 [
    (gogoproto.nullable) = false,
    (gogoproto.jsontag) = "db_permissions,omitempty"
  ];
}

// DatabasePermission specifies the database object permission for the user.
message DatabasePermission {
  // ObjectKind is the database object kind: table, schema, etc.
  string ObjectKind = 1 [(gogoproto.jsontag) = "object_kind"];
  // Permission is the string representation of the permission to be given, e.g. SELECT, INSERT, UPDATE, ...
  string Permission = 2 [(gogoproto.jsontag) = "permission"];
  // Match is a list of labels (key, value) to match against database object properties.
  map<string, string> Match = 3 [(gogoproto.jsontag) = "match,omitempty"];
}

```

### Permission Semantics

The precise meaning of individual permissions is left to the database engine.
The sole exception is the interaction between the `deny` and `allow` parts:
denied permissions are removed from the allowed set, and the comparison is
performed in a case-insensitive way after trimming whitespace. As a special
case, `*` is allowed as a permission in the `deny` part of the role.

For example, if the `allow` permission is `SELECT`, then it can be removed with
`select`, `SELECT` or `*`.
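
A minimal sketch of this allow/deny reconciliation as a stand-alone helper (not
the actual Teleport code):

```go
package main

import (
	"fmt"
	"strings"
)

// effectivePermissions removes denied permissions from the allowed set.
// The comparison is case-insensitive and ignores surrounding whitespace;
// "*" in the deny set removes everything.
func effectivePermissions(allowed, denied []string) []string {
	norm := func(s string) string { return strings.ToLower(strings.TrimSpace(s)) }

	deny := make(map[string]bool, len(denied))
	for _, d := range denied {
		deny[norm(d)] = true
	}
	if deny["*"] {
		return nil
	}

	var out []string
	for _, a := range allowed {
		if !deny[norm(a)] {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	fmt.Println(effectivePermissions([]string{"SELECT", "INSERT"}, []string{" select "})) // [INSERT]
	fmt.Println(effectivePermissions([]string{"SELECT", "INSERT"}, []string{"*"}))        // []
}
```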

The engines should reject invalid permissions. This is a fatal error and should
cause the connection to fail.

### Permission Lifecycle

Permissions are applied after the user is automatically provisioned.
Deprovisioning (deletion/deactivation) of the user should remove ALL permissions
from the user, without needing to reference the particular permissions that were
granted.

The permission synchronization is performed at least once, at connection time:

- database schema is read,
- import rules are applied,
- effective permissions are calculated based on user roles,
- database-side permissions are updated.

Optional additional syncs may happen as needed (e.g. when a schema change is
detected).
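
A rough sketch of this connection-time sequence, with hypothetical interfaces
standing in for the machinery described above:

```go
package dbpermissions

import "context"

// The types below are hypothetical stand-ins for the steps listed above.

type dbObject struct {
	Attributes map[string]string
	Labels     map[string]string
}

// grants maps a permission (e.g. SELECT) to the objects it applies to.
type grants map[string][]dbObject

type (
	schemaReader      interface{ ReadObjects(ctx context.Context) ([]dbObject, error) }
	importRuleEngine  interface{ Apply(objects []dbObject) []dbObject }
	permissionChecker interface{ Calculate(objects []dbObject) (grants, error) }
	databaseAdmin     interface{ UpdatePermissions(ctx context.Context, user string, perms grants) error }
)

// syncPermissions runs the connection-time sequence: read the schema, apply
// the import rules, compute the effective permissions from the user's roles,
// and push them to the database. Deprovisioning later revokes everything
// without consulting this state.
func syncPermissions(ctx context.Context, user string, schema schemaReader, rules importRuleEngine, checker permissionChecker, db databaseAdmin) error {
	objects, err := schema.ReadObjects(ctx)
	if err != nil {
		return err
	}
	imported := rules.Apply(objects)
	perms, err := checker.Calculate(imported)
	if err != nil {
		return err
	}
	return db.UpdatePermissions(ctx, user, perms)
}
```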

### TAG Integration

Periodic application of the import rules will populate `db_object` resources in
the backend. These can be imported into TAG, where the permission calculation
may happen. The permission algorithm is based on label matching, which should be
a primitive operation that is easy to support in TAG. Performance-wise, it may
be necessary to filter the set of objects - a single database can easily produce
thousands of objects - but the approach should remain viable.

### Backward Compatibility

The new `db_permissions` field should be automatically ignored by Teleport
versions that don't support it. The new resources will likewise be ignored.

### Audit Events

[RFD 113](0113-automatic-database-users.md) introduces the events
`db.user.created` and `db.user.disabled`, but these are yet to be implemented.

The `db.user.created` event should be extended with the effective list of
applied permissions. Since the list of database objects may be long, it may be
necessary to summarize the changes in the audit event.
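
One possible shape for such a summary (purely illustrative, not the actual event
schema) is a per-permission count of affected objects:

```go
package main

import "fmt"

// permissionSummary is a hypothetical way to keep the db.user.created audit
// event small: record how many objects each permission was applied to, per
// object kind, instead of enumerating every object.
type permissionSummary struct {
	Permission string `json:"permission"`  // e.g. SELECT
	ObjectKind string `json:"object_kind"` // e.g. table
	Count      int    `json:"count"`       // number of affected database objects
}

func main() {
	summary := []permissionSummary{
		{Permission: "SELECT", ObjectKind: "table", Count: 1500},
		{Permission: "INSERT", ObjectKind: "table", Count: 10},
	}
	fmt.Printf("%+v\n", summary)
}
```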

### Observability

The expectation is that permission changes should occur swiftly. If necessary,
we may consider monitoring the latency of schema queries and the time required
to apply permissions. This becomes particularly relevant when permissions are
managed externally to the database instance, for instance through a call to AWS
Security Token Service (STS).

To enhance observability, it is advisable to introduce appropriate logging.
Debug-level logs can be used to detail individual permissions granted, while
keeping in mind that the list might encompass thousands of entries.

### Product Usage

At this time, there are no plans to introduce telemetry.

### Test Plan

A new section should be added to the test plan, alongside the coverage for the
"automated user provisioning" feature, enumerating and testing each supported
configuration separately. At the time of writing this RFD, test plan coverage
for the "automated user provisioning" feature is absent and should be added.

### Security

The introduction of the new feature has no direct impact on the security of
Teleport itself. However, it does have implications for the security of
connected databases within the supported configurations. For environments
requiring heightened security, there may be value in explicitly excluding
specific databases from this feature, potentially using a resource label for
this purpose.

It's important to recognize that with sufficiently broad permissions, a user
might have the potential to elevate their database permissions further via
database-specific means, including creating additional users and granting
permissions. Given that this feature does not attempt to model the implications
of individual database permissions, there is no foolproof mechanism to prevent
such excessive permissions. Therefore, it falls to the system administrator to
ensure that the granted permissions are kept to a minimum.