Puts more focus on U.X. in the RFD (#43447)
* Puts more focus on U.X. in the RFD

* Some tweaks

* Update header

* Update rfd/0000-rfds.md

Co-authored-by: Zac Bergquist <[email protected]>

* Update rfd/0000-rfds.md

Co-authored-by: Zac Bergquist <[email protected]>

* Update rfd/0000-rfds.md

Co-authored-by: Zac Bergquist <[email protected]>

* Update rfd/0000-rfds.md

Co-authored-by: Zac Bergquist <[email protected]>

---------

Co-authored-by: Zac Bergquist <[email protected]>
klizhentas and zmb3 authored Jun 26, 2024
1 parent 47654ca commit 1acd9cb
Showing 1 changed file with 142 additions and 13 deletions.
155 changes: 142 additions & 13 deletions rfd/0000-rfds.md
@@ -117,10 +117,151 @@ something like the following.
```
# Required Approvers
* Engineering: @zmb3 && (@codingllama || @nklaassen)
* Security: (@reedloden || @jentfoo)
* Security: (@rjones || @klizhentas)
* Product: (@xinding33 || @klizhentas)
```

### UX

Always start the RFD with a user experience section, and begin that section with user stories. Every other part of your design (security, scale and privacy) will flow from the UX, not vice versa.

#### User stories

Explore UI, CLI and API user experience by going through scenarios that users would go through while solving specific problems.

In each story, explain the specific step-by-step UI, CLI and API requests/responses that the user would observe,
as if you were writing a step-by-step guide for a user who knows as little as possible about Teleport.

If you find too many steps or concepts end users would have to learn, start again to reduce it to a minimum.

In each user story, think about failure modes - what will happen if your integration fails?

**Example: Alice integrates Okta via UI**

Here is an example of a UI-driven user story:

Alice is a system administrator and she would like to integrate Okta with Teleport. She does not know anything about Teleport except the basics, but she has detailed Okta knowledge.

She logs into Teleport, looks for "Integrations", quickly finds an Okta tile and clicks on it.

In the Okta tile, she is asked to add a name for her Okta tenant. She can find the tenant in Okta's UI, and the information
bubble shows her how to do that.

The next step for Alice is to locate the SCIM bearer token. Alice needs to go back to Okta again, create a Teleport API services
app in the Okta catalog, copy the SCIM token and paste it back into Teleport. Teleport's UI directs her to do just that.

Alice copies the token into the Teleport UI. Let's assume she makes a mistake, and the token is broken or lacks the required permissions.

Alice is directed to test the integration. The test fails and shows her that Okta returned an error:

"Insufficient permissions when synchronizing a user". Teleport shows the detailed response from the Okta service and offers to check the token permissions and try the test again.

Finally, Alice figures out the right permission set on Okta's side and the Teleport test passes.

Teleport tries a test sync run and offers Alice the option to tweak the integration parameters. If Alice is happy with the settings, she clicks Save.

#### Make failure modes a first class citizen.

Administrators and system managers spend most of their day debugging integration
issues, failures and errors. Make their day pleasant by building user experiences
for the most common failure scenarios:

* What if the integration fails after its setup? Can Alice learn that it's broken, then find out where to go back and troubleshoot it?
* What if Alice needs to tweak the parameters of the integration after setup? Can she go back to the integration and test it?
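
One way to make a post-setup failure discoverable is to keep the result of the last check next to the integration and render it wherever the integration is listed. The Python sketch below only illustrates that idea; `IntegrationStatus`, its fields and the banner wording are hypothetical and do not describe Teleport's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IntegrationStatus:
    """Hypothetical record of the most recent health check for one integration."""

    name: str
    last_error: Optional[str] = None  # None means the last check succeeded

    def banner(self) -> Optional[str]:
        """Text to show next to the integration, or None when it is healthy."""
        if self.last_error is None:
            return None
        return (
            f'Integration "{self.name}" is failing: {self.last_error}. '
            "Open the integration to review its settings and re-run the test."
        )
```

The point of the design is that the same screen that reports the failure also tells Alice where to go to tweak the parameters and test again.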

#### Build Poka-Yoke Devices

In manufacturing, a poka-yoke device is anything that prevents an error within the manufacturing process or makes defects visible.

Translated to Teleport, you can build a UX that can prevent people from making a mistake.

For example, if an admin is assigned to a role and changes a mapping that would lock them out and leave no other admins, Teleport could prevent the error by blocking the action:

"You can't unassign yourself, because there will be no more admins left."
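
A guard of this kind could be sketched as follows. This is a minimal Python illustration under stated assumptions, not Teleport code: `check_unassign`, `LockoutError`, the `admin` role name and the in-memory assignment mapping are all hypothetical.

```python
from typing import Dict, Set


class LockoutError(Exception):
    """Raised when a change would leave the cluster with no holders of a role."""


def check_unassign(assignments: Dict[str, Set[str]], user: str, role: str = "admin") -> None:
    """Refuse to remove `role` from `user` if nobody else would hold it.

    `assignments` maps a user name to the set of role names that user holds.
    All names are hypothetical and only illustrate the guard.
    """
    # Everyone except `user` who currently holds the role.
    others = [u for u, roles in assignments.items() if u != user and role in roles]
    if role in assignments.get(user, set()) and not others:
        raise LockoutError(
            f"You can't unassign yourself, because there will be no more {role}s left."
        )
```

The check runs before the change is applied, so the mistake is blocked rather than reported after the fact.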

#### Make UX that reduces information overload and work

Let's take a look at Gmail. When a user clicks on an e-mail, they are offered the option "Filter messages like this". Instead of deleting or moving messages one by one, Gmail offers to write, test and set up a rule that also applies to all other messages.

This reduces the amount of manual, tedious work, and works well for one message or a thousand.

When possible, build UX that helps users reduce the number of steps and does extra work on their behalf, instead of prompting them to do work that can be automated.
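
As a rough sketch of the same idea, the Python helpers below turn one example item into a rule and preview everything the rule would match before it is saved. `Message`, `rule_like` and `preview` are hypothetical names chosen for illustration, and matching on the sender alone is an assumption made to keep the example short.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Message:
    sender: str
    subject: str


def rule_like(example: Message) -> Callable[[Message], bool]:
    """Build a filter that matches every message from the example's sender."""
    return lambda m: m.sender == example.sender


def preview(rule: Callable[[Message], bool], inbox: List[Message]) -> List[Message]:
    """Return what the rule would affect, so the user can test it before saving."""
    return [m for m in inbox if rule(m)]
```

The user starts from a single concrete item, and the product does the generalization and the bulk work for them.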

#### Think through the Day One and Day Two user experiences

As Day 1 users, we don't have any domain knowledge of the product; we are novices.

That's why the Day 1 flow should be the first user story we think through. It does not have to be scalable, but it must be easy.

For example, as a Day 1 user, I need a step-by-step guide on how to add one or two servers and databases without learning about RBAC, configs and other Teleport internals. In the UI, the Day 1 flow guides the user through each step to enroll a server, test its connection and get to success in the minimum number of steps.

As a Day two user, I'm concerned about setting up a feature at scale. My Day two user experience is different, and I know a bit more about Teleport.

For example, I would like to spend a bit more time setting up Teleport to automatically discover all my AWS resources and add them to the cluster.

Here are two imaginary examples demonstrating how the Day 1 and Day 2 CLI UX differ.

**Example: Day 1 CLI certificates**

As a day one user, I would like to issue a certificate to two services to set up mTLS in my cluster.

```bash
tbot join service-a --cluster=teleport.example.com
[1] Joining to cluster teleport.example.com...
[2] Issuing a certificate to ./tbot/certs/service-a/cert.pem and key.pem..

To test using this certificate, try:

curl https://teleport.example.com/ --cert=... --key=... --ca-cert=...
```

On Day 1 we keep the number of new concepts and ideas that users need to think about to a minimum, and automate most of the steps.

This flow does not have to cover all possible scenarios, just the 80% most common ones, to
get the user to success as fast as possible.

**Example: Day 2 CLI certificates**

The UX in the previous example won't scale for Day two, as there are many configuration options to consider, so for a day two user we can offer something more flexible at the expense of adding complexity.


```bash
tbot bootstrap service-a --cluster=teleport.example.com

[1] Generating tbot.yaml for service-a in ./tbot/configuration/tbot.yaml
[2] Generating service-a role...
[3] Generating systemd unit ./tbot/certs/service-a/cert.pem and key.pem...
[4] Starting a daemon...
```

In this case, instead of a simple one-liner, we generate detailed step-by-step parts and instruct users how to configure them.

#### Make error and info messages actionable.

Make sure error and info messages give specific instructions and enough information.
Explore common failure modes and how users can recover from them.

Here are a couple of examples of messages that need work:

> Please review the access list "My-Awesome-Team", the review is due in 4 days.

This message is missing the actual link and any specific steps users need to take to review the list.

> Failed to set up Okta integration - "Bad request".

This is one of the most frustrating error messages users can encounter: they don't see any logs, have no way to re-test the integration or trigger the error again,
and all they can do is reach out to support.
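
One way to keep messages actionable is to make the "what to do next" part a required field rather than an afterthought. The Python sketch below assumes a hypothetical `ActionableError` type; the field names and the rendered layout are illustrative only, and the example values reuse the Okta story from earlier in this section.

```python
from dataclasses import dataclass


@dataclass
class ActionableError:
    summary: str    # what failed, in the user's terms
    detail: str     # the upstream response, shown rather than hidden
    next_step: str  # one concrete thing the user can do right now

    def render(self) -> str:
        """Format the message so the next step is always visible."""
        return f"{self.summary}\n  Details: {self.detail}\n  Next step: {self.next_step}"


err = ActionableError(
    summary="Failed to set up Okta integration.",
    detail='Okta returned "Insufficient permissions when synchronizing a user".',
    next_step="Check the SCIM token permissions in Okta, then run the integration test again.",
)
print(err.render())
```

Because all three fields are required, a message like the bare "Bad request" above cannot be constructed without someone deciding what the user should do about it.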

#### Consider Cloud UX from the start.

Cloud is a first-class citizen. Feature setup can no longer rely on static teleport.yaml configuration, as this automatically
excludes all cloud customers.

#### Upgrade UX

Consider the UX of configuration changes and their impact on Teleport upgrades.

### Security

Describe the security considerations for your design doc.
@@ -149,18 +290,6 @@ Describe the privacy considerations for your design doc.
and how it will be retained/deleted
* Explore if there are sufficient logs showing any data access or modification
### UX

Describe the UX changes and impact of your design doc.
(Non-exhaustive list below.)

* Explore UI, CLI and API user experience by diving through common scenarios
that users would go through
* Show UI, CLI and API requests/responses that the user would observe
* Make error messages actionable, explore common failure modes and how users can
recover
* Consider the UX of configuration changes and their impact on Teleport upgrades
* Consider the UX scenarios for Cloud users
### Proto Specification