From 4e81ed04b515d31a27d9f087760629f2fffdedfb Mon Sep 17 00:00:00 2001
From: Darren Edge
Date: Thu, 26 Sep 2024 16:16:08 +0100
Subject: [PATCH] Update non-case workflow dataset links

---
 app/workflows/detect_entity_networks/README.md | 8 ++++----
 app/workflows/extract_record_data/README.md    | 5 ++++-
 app/workflows/match_entity_records/README.md   | 2 +-
 app/workflows/query_text_data/README.md        | 2 +-
 4 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/app/workflows/detect_entity_networks/README.md b/app/workflows/detect_entity_networks/README.md
index c1ef7029..9b4a572d 100644
--- a/app/workflows/detect_entity_networks/README.md
+++ b/app/workflows/detect_entity_networks/README.md
@@ -26,23 +26,23 @@ Select the `View example outputs` tab (in app) or navigate to [example_outputs/d
 The task for this tutorial is detect networks of entities and their associated level of relationship-based risk using the `company_grievances` dataset available for download either:

 - in app, via `View example outputs` tab → `Input data` tab
-- on GitHub, at [example_outputs/detect_entity_networks/company_grievances](https://github.com/microsoft/intelligence-toolkit/tree/main/example_outputs/detect_entity_networks/company_grievances)
+- on GitHub, at [example_outputs/detect_entity_networks/company_grievances/company_grievances_input.csv](https://github.com/microsoft/intelligence-toolkit/tree/main/example_outputs/detect_entity_networks/company_grievances/company_grievances_input.csv)

 ### Creating the data model

-Navigate to the `Create data model` tab and upload the `company_grievances_data.csv` file.
+Navigate to the `Create data model` tab and upload the `company_grievances_input.csv` file.

 Under `Map columns to model`, we will start with the `Link type` of `Entity-Attribute` to link entities to their attributes. These should be distinctive, i.e., linked only to that entity or closely related entities.

 Set `name` as the `Entity ID column` and select `address`, `city`, `email`, `phone`, and `owner` as `Attribute column(s) to link on`.

-We would not select `sector` or `country` as attribute columns to link on since these are too broad, and would connect too many unrelated entities into the same networks. While `city` could be narrow or broad depending on the dataset (and city), the workflow has a way of showing shared attributes of relevance (like `city`) without using them to detect the entity networks.
+We would not select `sector` or `country` as attribute columns to link on since these are too broad, and would connect too many unrelated entities into the same networks. While `city` could be narrow or broad depending on the dataset (and city), the workflow has a way of showing shared attributes of relevance (like `city`) without using them to detect the entity networks.

 Press `Add links to model` to see a summary of data model so far.

 Next, set `Link type` to `Entity-Flag`, keep `name` as the `Entity ID column`, and set `safety_grievances`, `pay_grievances`, `conditions_grievances`, `treatment_grievances`, and `workload_grievances` as `Flag value column(s)`.

-The format of these columns is as counts of the corresponding grievances, or "flags" more generally, so set `Flag format` to `Count`.
+The format of these columns is as counts of the corresponding grievances, or "flags" more generally, so set `Flag format` to `Count`.

 If flags were formatted as a column of flag labels representing instances of that flag type for the adjacent entity, then you would select `Instance` instead.

diff --git a/app/workflows/extract_record_data/README.md b/app/workflows/extract_record_data/README.md
index 5b471240..b94781bf 100644
--- a/app/workflows/extract_record_data/README.md
+++ b/app/workflows/extract_record_data/README.md
@@ -27,7 +27,10 @@ Select the `View example outputs` tab (in app) or navigate to [example_outputs/e

 ## Tutorial

-The task for this tutorial is extracting structured data records from transcripts of customer complaint calls (mock data).
+The task for this tutorial is extracting structured data records from transcripts of customer complaint calls (mock data) available for download either:
+
+- in app, via `View example outputs` tab → `Input data` tab
+- on GitHub, at [example_outputs/extract_record_data/customer_complaints/customer_complaints_texts.csv](https://github.com/microsoft/intelligence-toolkit/tree/main/example_outputs/extract_record_data/customer_complaints/customer_complaints_texts.csv)

 From the [`Extract Record Data`](https://github.com/microsoft/intelligence-toolkit/blob/main/app/workflows/extract_record_data/README.md) homepage in a running instance of Intelligence Toolkit, select `Prepare data schema`.

diff --git a/app/workflows/match_entity_records/README.md b/app/workflows/match_entity_records/README.md
index d64b389a..a5fa8d95 100644
--- a/app/workflows/match_entity_records/README.md
+++ b/app/workflows/match_entity_records/README.md
@@ -25,7 +25,7 @@ Select the `View example outputs` tab (in app) or navigate to [example_outputs/m
 The task for this tutorial is detecting matching records across two related `company_grievances` datasets available for download either:

 - in app, via `View example outputs` tab → `Input dataset 1`, `Input dataset 2` tabs
-- on GitHub, at [example_outputs/match_entity_records/company_grievances](https://github.com/microsoft/intelligence-toolkit/tree/main/example_outputs/match_entity_records/company_grievances).
+- on GitHub, at [example_outputs/match_entity_records/company_grievances/company_grievances_input_data_1.csv](https://github.com/microsoft/intelligence-toolkit/tree/main/example_outputs/match_entity_records/company_grievances/company_grievances_input_data_1.csv) and [example_outputs/match_entity_records/company_grievances/company_grievances_input_data_2.csv](https://github.com/microsoft/intelligence-toolkit/tree/main/example_outputs/match_entity_records/company_grievances/company_grievances_input_data_2.csv).

 ### How record embedding works

diff --git a/app/workflows/query_text_data/README.md b/app/workflows/query_text_data/README.md
index a0ad540b..f6f66738 100644
--- a/app/workflows/query_text_data/README.md
+++ b/app/workflows/query_text_data/README.md
@@ -27,7 +27,7 @@ Select the `View example outputs` tab (in app) or navigate to [example_outputs/q
 The task for this tutorial is querying the `news_articles` dataset available for download either:

 - in app, via `View example outputs` tab → `Input texts` tab
-- on GitHub, at [example_outputs/query_text_data/news_articles](https://github.com/microsoft/intelligence-toolkit/tree/main/example_outputs/query_text_data/news_articles)
+- on GitHub, at [example_outputs/query_text_data/news_articles/news_articles_texts.csv](https://github.com/microsoft/intelligence-toolkit/tree/main/example_outputs/query_text_data/news_articles/news_articles_texts.csv)

 This dataset contains mock news articles spanning a range of categories including world events, local events, sports, politics, lifestyle, and culture.