From d42b9b01f179fc1fc9ed60968bb7f26567755967 Mon Sep 17 00:00:00 2001 From: Alexander Klizhentas Date: Mon, 24 Jun 2024 16:00:11 -0700 Subject: [PATCH 1/7] Puts more focus on U.X. in the RFD --- rfd/0000-rfds.md | 144 +++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 132 insertions(+), 12 deletions(-) diff --git a/rfd/0000-rfds.md b/rfd/0000-rfds.md index 84fbab0649b8e..6a5fd9f0a5338 100644 --- a/rfd/0000-rfds.md +++ b/rfd/0000-rfds.md @@ -120,6 +120,138 @@ something like the following. * Security: (@reedloden || @jentfoo) * Product: (@xinding33 || @klizhentas) ``` +### UX + +Always start the RFD with a user experience section where you start with user stories. Every other part of your design - security, scale and privacy will flow from the UX, not vice-versa. + +#### User stories + +Explore UI, CLI and API user experience by diving through common scenarios that users would go through while solving specific problems. + +In each story, explain specific step-by-step UI, CLI and API requests/responses that the user would observe, +as if you are writing a step by step guide. + +If you see too many steps or concepts end users would have to learn, start again to reduce it to a minimum. + +In each user story, think about failure modes - what will happen if your integration fails? + +**Example: Alice integrates Okta via UI** + +UI-driven user story: + +Alice is a system administrator and she would like to integrate Okta with Teleport. She does not know anything about Teleport +except the basics, but she has detailed Okta knowledge. + +She logs into Teleport, looks for "Integrations", quickly finds an Okta tile and clicks on it. + +In the Okta tile, she is asked to add a name for her Okta tenant. She can find the tenant in the Okta's UI and the information +bubble shows her how to do that. + +The next step for Alice is to find and locate the SCIM bearer token. 
Alice needs to go back to Okta again, create Teleport API services +app in the Okta catalog, copy the SCIM token and paste it back to Teleport. Teleport's UI directs her to do just that. + +Alice copies the token into Teleport UI. Let's assume she makes a mistake, and the token is broken or misses the permissions. + +Alice is directed to Test the integration. The test finds an error and shows her that Okta returns an error: + +`Insufficient permissions when synchronizing a user`. Teleport shows a detailed response from Okta service, offers to check the token permissions and try the test again. + +Finally, Alice figures out the right permission set on Okta's side and Teleport test passes. + +Teleport tries a test sync run and offers Alice to tweak the integration parameters. If Alice is happy with the set she clicks save. + +#### Make failure modes a first class citizen. + +* What if integration fails after its setup? Can Alice go back and troubleshoot it? +* What if Alice needs to tweak the parameters of the integration after setup? Can she go back to the integration and test it? + +#### Build Poka-Yoke Devices + +In Manufacturing, a Poka-yoke device is anything that prevents an error within the manufacturing process or makes defects easily detectable. + +Translated to Teleport, you can build a UX that can prevent people from making a mistake. + +For example, if an admin assigned to a role, and changes a mapping that will lock themselves out and leave no other admins, Teleport could prevent the error by blocking the action: + +"You can't unassign yourself, because there will be no more admins left." + +#### Make UX that reduces information overload and work + +Let's go back to the Gmail example. When a user clicks on a message, they are offered an option - "Filter messages like this”. Instead of deleting or moving messages one by one, Gmail offers to write, test and set up a rule that also applies to all other messages. 
+ +This reduces the amount of manual, tedious work, and works well for one message or a thousand. + +When possible, build UX that offers users to reduce the amount of steps and do extra work on their behalf, instead of prompting them to do work that can be automated. + +#### Think through the Day One and Day Two user experiences + +As a Day 1 user, we don't have any domain knowledge of the product, we are novices. +That's why Day 1 flow should be the first user story we think through. It does not have to be scalable, but it must be easy. + +For example, as a Day 1 user, I need step by step guide on how to add one or two servers and databases without learning +about RBAC, configs and other Teleport internals. On the UI, Day 1 flow is guiding user each step of the way to enroll a server, test its connection and get to success in the minimum amount of steps. + +As a Day two user, I'm concerned about setting up a feature at scale. My Day two user experience is different, I can spend a bit more time setting up Teleport to automatically discover all my AWS resources and add them to the cluster. + +Here are two imaginary examples of how Day 1 and Day 2 CLI U.X. are different. + +**Example: Day 1 CLI certificates** + +As a day one user, I would like to issue a certificate to two services to set up mTLS in my cluster. + +```bash +tbot join service-a --cluster=teleport.example.com +[1] Joining to cluster teleport.example.com... +[2] Issuing a certificate to ./tbot/certs/service-a/cert.pem and key.pem.. + +To test using this certificate, try: + +curl https://teleport.example.com/ --cakey=... --cacert... +``` + +On day 1 we keep the amount of new concepts, ideas that users need to think about here to a minimum, and automate most of the steps. 
+ +***Example Day 2 CLI certificates*** + +The UX in the previous example won't scale for Day two, as there are many configuration options to consider, so for a day two user we can offer something more flexible at the expense of adding complexity. + + +```bash +tbot bootstrap service-a --cluster=teleport.example.com + +[1] Generating tbot.yaml for service a in ./tbot/configuration/tbot.yaml +[2] Generating service-a role... +[3] Generating systemd unit ./tbot/certs/service-a/cert.pem and key.pem... +[4] Starting a daemon... +``` + +In this case instead of a simple one liner, we generate detailed step-by-step parts and instruct users how to configure those. + +#### Make error and info messages actionable. + +Make sure errors and info give specific instructions and give enough information. + +Explore common failure modes and how users can recover from them. + +Here are a couple of examples of messages that need work: + +> Please review the access list "My-Awesome-Team", the review is due in 4 days. + +This error message misses the actual link or any specific steps users need to take to review the list. + +> Failed to set up Okta integration - "Bad request". + +This is the most frustrating error messages users can encounter - they don't' see any logs, no way to re-test it or trigger the error, +and all they can do is to reach out to support. + +#### Consider Cloud UX from the start. + +Cloud is a first class citizen. The feature setup can no longer rely on static teleport.yaml configuration, as this automatically +excludes all cloud customers. + +#### Upgrade UX + +Consider the UX of configuration changes and their impact on Teleport upgrades. ### Security @@ -149,18 +281,6 @@ Describe the privacy considerations for your design doc. and how it will be retained/deleted * Explore if there are sufficient logs showing any data access or modification -### UX - -Describe the UX changes and impact of your design doc. -(Non-exhaustive list below.) 
- -* Explore UI, CLI and API user experience by diving through common scenarios - that users would go through -* Show UI, CLI and API requests/responses that the user would observe -* Make error messages actionable, explore common failure modes and how users can - recover -* Consider the UX of configuration changes and their impact on Teleport upgrades -* Consider the UX scenarios for Cloud users ### Proto Specification From 5ca49f3af1c2db52b6c883f3e5a82eda754bec7e Mon Sep 17 00:00:00 2001 From: Alexander Klizhentas Date: Mon, 24 Jun 2024 16:13:44 -0700 Subject: [PATCH 2/7] Some tweaks --- rfd/0000-rfds.md | 34 +++++++++++++++++++++------------- 1 file changed, 21 insertions(+), 13 deletions(-) diff --git a/rfd/0000-rfds.md b/rfd/0000-rfds.md index 6a5fd9f0a5338..7a8e559ca2974 100644 --- a/rfd/0000-rfds.md +++ b/rfd/0000-rfds.md @@ -126,21 +126,20 @@ Always start the RFD with a user experience section where you start with user st #### User stories -Explore UI, CLI and API user experience by diving through common scenarios that users would go through while solving specific problems. +Explore UI, CLI and API user experience by going through scenarios that users would go through while solving specific problems. In each story, explain specific step-by-step UI, CLI and API requests/responses that the user would observe, -as if you are writing a step by step guide. +as if you are writing a step by step guide for a user who knows as little as possible about Teleport. -If you see too many steps or concepts end users would have to learn, start again to reduce it to a minimum. +If you find too many steps or concepts end users would have to learn, start again to reduce it to a minimum. In each user story, think about failure modes - what will happen if your integration fails? **Example: Alice integrates Okta via UI** -UI-driven user story: +Here is an example of a UI-driven user story: -Alice is a system administrator and she would like to integrate Okta with Teleport. 
She does not know anything about Teleport -except the basics, but she has detailed Okta knowledge. +Alice is a system administrator and she would like to integrate Okta with Teleport. She does not know anything about Teleport except the basics, but she has detailed Okta knowledge. She logs into Teleport, looks for "Integrations", quickly finds an Okta tile and clicks on it. @@ -162,12 +161,16 @@ Teleport tries a test sync run and offers Alice to tweak the integration paramet #### Make failure modes a first class citizen. -* What if integration fails after its setup? Can Alice go back and troubleshoot it? +Administrators and system managers spend most of their day debugging integration +issues, failures and errors. Make their day pleasant by building user experiences +for most common failure scenarios: + +* What if the integration fails after its setup? Can Alice learn that it's broken, thenfind out where to go back and troubleshoot it? * What if Alice needs to tweak the parameters of the integration after setup? Can she go back to the integration and test it? #### Build Poka-Yoke Devices -In Manufacturing, a Poka-yoke device is anything that prevents an error within the manufacturing process or makes defects easily detectable. +In Manufacturing, a Poka-yoke device is anything that prevents an error within the manufacturing process or makes defects visible. Translated to Teleport, you can build a UX that can prevent people from making a mistake. @@ -177,7 +180,7 @@ For example, if an admin assigned to a role, and changes a mapping that will loc #### Make UX that reduces information overload and work -Let's go back to the Gmail example. When a user clicks on a message, they are offered an option - "Filter messages like this”. Instead of deleting or moving messages one by one, Gmail offers to write, test and set up a rule that also applies to all other messages. +Let's take a look at the Gmail. 
When a user clicks on an e-mail, they are offered an option - "Filter messages like this”. Instead of deleting or moving messages one by one, Gmail offers to write, test and set up a rule that also applies to all other messages. This reduces the amount of manual, tedious work, and works well for one message or a thousand. @@ -186,14 +189,16 @@ When possible, build UX that offers users to reduce the amount of steps and do e #### Think through the Day One and Day Two user experiences As a Day 1 user, we don't have any domain knowledge of the product, we are novices. + That's why Day 1 flow should be the first user story we think through. It does not have to be scalable, but it must be easy. -For example, as a Day 1 user, I need step by step guide on how to add one or two servers and databases without learning -about RBAC, configs and other Teleport internals. On the UI, Day 1 flow is guiding user each step of the way to enroll a server, test its connection and get to success in the minimum amount of steps. +For example, as a Day 1 user, I need step by step guide on how to add one or two servers and databases without learning about RBAC, configs and other Teleport internals. On the UI, Day 1 flow is guiding user each step of the way to enroll a server, test its connection and get to success in the minimum amount of steps. -As a Day two user, I'm concerned about setting up a feature at scale. My Day two user experience is different, I can spend a bit more time setting up Teleport to automatically discover all my AWS resources and add them to the cluster. +As a Day two user, I'm concerned about setting up a feature at scale. My Day two user experience is different, and I know a bit more about Teleport. -Here are two imaginary examples of how Day 1 and Day 2 CLI U.X. are different. +For example, I would like to spend a bit more time setting up Teleport to automatically discover all my AWS resources and add them to the cluster. 
+ +Here are two imaginary examples demonstrating how Day 1 and Day 2 CLI U.X. are different. **Example: Day 1 CLI certificates** @@ -211,6 +216,9 @@ curl https://teleport.example.com/ --cakey=... --cacert... On day 1 we keep the amount of new concepts, ideas that users need to think about here to a minimum, and automate most of the steps. +This flow does not have to cover all possible scenarios, just 80% most common ones to +get user to success as fast as possible. + ***Example Day 2 CLI certificates*** The UX in the previous example won't scale for Day two, as there are many configuration options to consider, so for a day two user we can offer something more flexible at the expense of adding complexity. From 7bccda3552a671edd027701025ac67500339f2dd Mon Sep 17 00:00:00 2001 From: Alexander Klizhentas Date: Mon, 24 Jun 2024 16:14:47 -0700 Subject: [PATCH 3/7] Update header --- rfd/0000-rfds.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/rfd/0000-rfds.md b/rfd/0000-rfds.md index 7a8e559ca2974..1572d7011f2f8 100644 --- a/rfd/0000-rfds.md +++ b/rfd/0000-rfds.md @@ -117,9 +117,10 @@ something like the following. ``` # Required Approvers * Engineering: @zmb3 && (@codingllama || @nklaassen) -* Security: (@reedloden || @jentfoo) +* Security: (@rjones || @klizhentas) * Product: (@xinding33 || @klizhentas) ``` + ### UX Always start the RFD with a user experience section where you start with user stories. Every other part of your design - security, scale and privacy will flow from the UX, not vice-versa. 
From cad0f34cbb7caf8447ee8eabb8efbcbc2977e665 Mon Sep 17 00:00:00 2001 From: Alexander Klizhentas Date: Tue, 25 Jun 2024 10:10:24 -0700 Subject: [PATCH 4/7] Update rfd/0000-rfds.md Co-authored-by: Zac Bergquist --- rfd/0000-rfds.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/rfd/0000-rfds.md b/rfd/0000-rfds.md index 1572d7011f2f8..d2d419cbe1bfb 100644 --- a/rfd/0000-rfds.md +++ b/rfd/0000-rfds.md @@ -166,7 +166,7 @@ Administrators and system managers spend most of their day debugging integration issues, failures and errors. Make their day pleasant by building user experiences for most common failure scenarios: -* What if the integration fails after its setup? Can Alice learn that it's broken, thenfind out where to go back and troubleshoot it? +* What if the integration fails after its setup? Can Alice learn that it's broken, then find out where to go back and troubleshoot it? * What if Alice needs to tweak the parameters of the integration after setup? Can she go back to the integration and test it? #### Build Poka-Yoke Devices From 0e0fc7eccc8d10b629d40033e2543e19e52db820 Mon Sep 17 00:00:00 2001 From: Alexander Klizhentas Date: Tue, 25 Jun 2024 10:10:30 -0700 Subject: [PATCH 5/7] Update rfd/0000-rfds.md Co-authored-by: Zac Bergquist --- rfd/0000-rfds.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/rfd/0000-rfds.md b/rfd/0000-rfds.md index d2d419cbe1bfb..2ad8a92f393e0 100644 --- a/rfd/0000-rfds.md +++ b/rfd/0000-rfds.md @@ -212,7 +212,7 @@ tbot join service-a --cluster=teleport.example.com To test using this certificate, try: -curl https://teleport.example.com/ --cakey=... --cacert... +curl https://teleport.example.com/ --cert=... --key=... --ca-cert=... ``` On day 1 we keep the amount of new concepts, ideas that users need to think about here to a minimum, and automate most of the steps. 
From 1b342a78c4c335062e1ba4f59ba0dc650fe6e584 Mon Sep 17 00:00:00 2001 From: Alexander Klizhentas Date: Tue, 25 Jun 2024 10:10:35 -0700 Subject: [PATCH 6/7] Update rfd/0000-rfds.md Co-authored-by: Zac Bergquist --- rfd/0000-rfds.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/rfd/0000-rfds.md b/rfd/0000-rfds.md index 2ad8a92f393e0..7a1d36722c3c8 100644 --- a/rfd/0000-rfds.md +++ b/rfd/0000-rfds.md @@ -250,7 +250,7 @@ This error message misses the actual link or any specific steps users need to ta > Failed to set up Okta integration - "Bad request". -This is the most frustrating error messages users can encounter - they don't' see any logs, no way to re-test it or trigger the error, +This is one of the most frustrating error messages users can encounter - they don't see any logs, have no way to re-test it or trigger the error, and all they can do is to reach out to support. #### Consider Cloud UX from the start. From 114ad692746c7fdbe845407ab58456ee867442c4 Mon Sep 17 00:00:00 2001 From: Alexander Klizhentas Date: Tue, 25 Jun 2024 10:10:41 -0700 Subject: [PATCH 7/7] Update rfd/0000-rfds.md Co-authored-by: Zac Bergquist --- rfd/0000-rfds.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/rfd/0000-rfds.md b/rfd/0000-rfds.md index 7a1d36722c3c8..990258a880331 100644 --- a/rfd/0000-rfds.md +++ b/rfd/0000-rfds.md @@ -181,7 +181,7 @@ For example, if an admin assigned to a role, and changes a mapping that will loc #### Make UX that reduces information overload and work -Let's take a look at the Gmail. When a user clicks on an e-mail, they are offered an option - "Filter messages like this”. Instead of deleting or moving messages one by one, Gmail offers to write, test and set up a rule that also applies to all other messages. +Let's take a look at Gmail. When a user clicks on an e-mail, they are offered an option - "Filter messages like this". 
Instead of deleting or moving messages one by one, Gmail offers to write, test and set up a rule that also applies to all other messages. This reduces the amount of manual, tedious work, and works well for one message or a thousand.
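
The Poka-yoke guidance added by this patch series (block an admin from unassigning themselves when no other admins would remain) can be illustrated with a minimal Python sketch. Everything in it is hypothetical and for illustration only: `check_unassign`, `LockoutError` and the set-of-usernames model are assumptions, not Teleport APIs. Only the error text is taken from the RFD.

```python
from __future__ import annotations


class LockoutError(Exception):
    """Hypothetical error: the change would leave the cluster without admins."""


def check_unassign(admins: set[str], user: str) -> None:
    """Reject removing `user` from `admins` if nobody would be left.

    `admins` is an assumed in-memory set of usernames holding the admin role.
    """
    if user not in admins:
        return  # Nothing to remove, so no lockout is possible.
    remaining = admins - {user}
    if not remaining:
        # Message text quoted from the RFD; it states what is blocked and why.
        raise LockoutError(
            "You can't unassign yourself, because there will be no more admins left."
        )


if __name__ == "__main__":
    check_unassign({"alice", "bob"}, "alice")  # allowed: bob remains an admin
    try:
        check_unassign({"alice"}, "alice")  # blocked: alice is the last admin
    except LockoutError as err:
        print(err)
```

In this sketch the check runs before the change is saved, so the mistake is prevented rather than reported after the fact.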