From e956d3a2729bc9a64eb27db28cb886eece27c2c9 Mon Sep 17 00:00:00 2001 From: adamkucharski Date: Sat, 25 May 2024 18:06:29 +0800 Subject: [PATCH 1/9] Add AI review blog post --- posts/ai-learner-review/index.qmd | 99 +++++++++++++++++++++++++++++++ 1 file changed, 99 insertions(+) create mode 100644 posts/ai-learner-review/index.qmd diff --git a/posts/ai-learner-review/index.qmd b/posts/ai-learner-review/index.qmd new file mode 100644 index 00000000..efcc35fc --- /dev/null +++ b/posts/ai-learner-review/index.qmd @@ -0,0 +1,99 @@ +--- +title: "Using LLM agents to review tutorials 'in character' as learners" +author: + - name: "Adam Kucharski" + orcid: "0000-0001-8814-9421" +date: "2024-05-28" +categories: [tutorials, R, R package] +format: + html: + toc: true +--- + +## Turning learner personas into LLM agents + +Part of the Epiverse-TRACE initiative involves development of training materials that span early, middle and late stage outbreak analysis and modelling tasks. To ensure that our tutorials are accessible to target audiences, we have developed a series of [learner personas](https://github.com/epiverse-trace/personas) to inform the design of learning materials. These personas include the following: + +As the volume of training materials increases, we have explored automating the generation of initial reviews using large language models (LLMs) that take the form of 'in character' agents with instructions to provide constructive comments. This reflects a wider focus within the field of outbreak analytics on how LLMs agents can be used to increase the efficiency and scalability of common tasks (e.g. [van Hoek et al, Lancet Microbe, 2024](XX) ). + +To generate the AI tutorial reviews, we use the OpenAI GPT-4 API, via the `openai` R package, as described in [this repository](https://github.com/adamkucharski/llm-api-scripts/). We also use the `gh` package to load the `.Rmd` materials from a given repository (e.g. `epiverse-trace/tutorials-middle`). 
Full illustrative code is [available here](https://github.com/adamkucharski/llm-api-scripts/scripts/content_review_gpt.R), with the GPT-4 API prompts outlined below. + +```r +# Define first prompt +user_prompt_1 <- "You are the following person, and give all your answers in character:" + +# Load Lucia persona +persona_bio <- read_file("https://raw.githubusercontent.com/epiverse-trace/personas/master/lucia-outbreaks.qmd") + +# Define second prompt +user_prompt_2 <- "Now suppose you want to complete the following tutorial about outbreak analysis in R. The content is in R markdown but would be knitted to HTML in reality, with additional figures where relevant. Provide a critique of the tutorial from your perspective as a learner. What is unclear? What is useful? What is difficult? What could be refined? Support comments with brief quotes. In your feedback be succinct, positive, constructive and specific. State what content to keep and what to improve. Provide clear suggestions for next steps to remove, change or add content. Note that visualisations will be in the tutorial, but are not shown in the Rmd, so do not comment on these. Summarise your review and suggestions for specific improvements in short bullet point paragraphs. If some of the points are similar, amalgamate these into a single bullet point. 
\n\n" + +# Call OpenAI API (credentials must already be defined - see full illustrative code link above) +output <- create_chat_completion( + model = "gpt-4", + messages = list(list("role"="user","content" = paste0(user_prompt_1,persona_bio, + user_prompt_2,tutorial_contents))) + ) + +# Write review to .md using readr +write_lines(output,"output_review.md") + +``` + + +## Reviewing transmissibilty training materials as a senior academic + +To illustrate the sort of reviews that can be generated with the above approach, the below is the output produced when we imported the [Vania senior academic persona](XX) alongside an early pre-Alpha version of the training materials for our transmissibility estimation tutorial: + +Overall, the tutorial is well-structured and provides a comprehensive guide on how to use the `EpiNow2` package to estimate transmission metrics from case data. However, there are several areas that could be improved to enhance clarity and understanding for learners like me. + +1. **Prerequisites**: The prerequisites section could be expanded to include specific resources for learners to review the necessary concepts. For instance, links to resources on Bayesian analysis and the effective reproduction number would be helpful. + +2. **Jargon and Terminology**: The tutorial contains several technical terms and jargon that could be confusing for learners. For example, terms like "Bayesian inference", "posterior probability", "likelihood", and "prior probability" are used without clear explanations. Providing definitions or links to resources for these terms would be beneficial. + +3. **Code Explanation**: While the tutorial does a good job of explaining what the code does, it could provide more context on why certain functions or packages are used. For instance, it would be helpful to explain why the `EpiNow2` package is used and how it differs from other packages. + +4. 
**Visualizations**: Although visualizations are not shown in the Rmd, it would be beneficial to explain what learners should expect to see in the visualizations and how to interpret them. + +5. **Exercises**: The tutorial lacks exercises for learners to practice the concepts and skills they have learned. Including exercises with solutions would greatly enhance the learning experience. + +Suggestions for improvements: + +- Expand the prerequisites section to include specific resources for learners to review necessary concepts. +- Provide definitions or links to resources for technical terms and jargon. +- Provide more context on why certain functions or packages are used. +- Explain what learners should expect to see in the visualizations and how to interpret them. +- Include exercises with solutions for learners to practice the concepts and skills they have learned. + + +## Reviewing forecast materials as a field epidemiologist +To further illustrate review outputs, the below is what was produced when we imported the [Lucia field epidemiologist persona](XX) alongside an early pre-Alpha version of the training materials for our forecasting tutorial: + +**Positive Feedback:** + +1. The tutorial is well-structured and follows a logical progression, which makes it easier to follow. +2. The use of real-world examples and datasets (COVID-19 and Ebola) makes the tutorial more relatable and practical. +3. The inclusion of questions, objectives, prerequisites, and key points at the beginning and end of the tutorial respectively, helps to guide the learning process. +4. The tutorial provides a good balance between theory and practice, with clear explanations and practical coding examples. +5. The use of callouts to highlight important information or provide additional context is very helpful. + +**Constructive Feedback:** + +1. The tutorial assumes a certain level of familiarity with R and statistical concepts, which might be challenging for beginners. 
It would be helpful to provide more explanations or links to resources for beginners. +2. The code chunks are quite complex and could be broken down into smaller parts with more explanations. +3. The tutorial could benefit from more visualizations to illustrate the concepts and results. +4. The challenge section could provide more guidance or hints to help learners complete the task. + +**Suggestions for Improvement:** + +1. Provide more explanations or links to resources for beginners to learn R and statistical concepts. +2. Break down complex code chunks into smaller parts with more explanations. +3. Include more visualizations to illustrate the concepts and results. +4. Provide more guidance or hints in the challenge section to help learners complete the task. +5. Include a glossary of terms or a list of R packages used in the tutorial for quick reference. + +## Overcoming feedback bottlenecks + +A challenge with LLMs trained for general use is finding domain-specific tasks where they can add sufficient value beyond existing human input. Tasks like providing early sense checking and tailored feedback, particularly from differing perspectives, therefore has potential to overcome common bottlenecks in developing training materials (e.g. providing initial comments and flagging obvious issues while waiting for more detailed human feedback). + +As Epiverse-TRACE training materials continue to develop, we plan to explore further uses beyond simple first-pass reviews. For example, LLMs are well suited to synthesising qualitative feedback, increasing the range of insights that can be collected and summarised from learners as we move into beta testing. We also hope to identify opportunities where LLMs can help supplement the learner experience, as demonstrated by emerging tools like [RTutor](http://rtutor.ai/) for descriptive plotting functionality in R, which combines generation of code in response to user queries with translation into shiny outputs. 
From 2eb77c7671532e7f4149c01748ff528375ea31cb Mon Sep 17 00:00:00 2001 From: Adam Kucharski Date: Thu, 7 Nov 2024 10:31:15 -1000 Subject: [PATCH 2/9] Update index.qmd with reference links Add links --- posts/ai-learner-review/index.qmd | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/posts/ai-learner-review/index.qmd b/posts/ai-learner-review/index.qmd index efcc35fc..8f4e2b7c 100644 --- a/posts/ai-learner-review/index.qmd +++ b/posts/ai-learner-review/index.qmd @@ -14,7 +14,7 @@ format: Part of the Epiverse-TRACE initiative involves development of training materials that span early, middle and late stage outbreak analysis and modelling tasks. To ensure that our tutorials are accessible to target audiences, we have developed a series of [learner personas](https://github.com/epiverse-trace/personas) to inform the design of learning materials. These personas include the following: -As the volume of training materials increases, we have explored automating the generation of initial reviews using large language models (LLMs) that take the form of 'in character' agents with instructions to provide constructive comments. This reflects a wider focus within the field of outbreak analytics on how LLMs agents can be used to increase the efficiency and scalability of common tasks (e.g. [van Hoek et al, Lancet Microbe, 2024](XX) ). +As the volume of training materials increases, we have explored automating the generation of initial reviews using large language models (LLMs) that take the form of 'in character' agents with instructions to provide constructive comments. This reflects a wider focus within the field of outbreak analytics on how LLMs agents can be used to increase the efficiency and scalability of common tasks (e.g. [van Hoek et al, Lancet Microbe, 2024](https://www.thelancet.com/journals/lanmic/article/PIIS2666-5247(24)00104-6/fulltext) ). 
To generate the AI tutorial reviews, we use the OpenAI GPT-4 API, via the `openai` R package, as described in [this repository](https://github.com/adamkucharski/llm-api-scripts/). We also use the `gh` package to load the `.Rmd` materials from a given repository (e.g. `epiverse-trace/tutorials-middle`). Full illustrative code is [available here](https://github.com/adamkucharski/llm-api-scripts/scripts/content_review_gpt.R), with the GPT-4 API prompts outlined below. @@ -40,10 +40,9 @@ write_lines(output,"output_review.md")) ``` - ## Reviewing transmissibilty training materials as a senior academic -To illustrate the sort of reviews that can be generated with the above approach, the below is the output produced when we imported the [Vania senior academic persona](XX) alongside an early pre-Alpha version of the training materials for our transmissibility estimation tutorial: +To illustrate the sort of reviews that can be generated with the above approach, the below is the output produced when we imported the [Vania senior academic persona](https://epiverse-trace.github.io/personas/vania-academica.html) alongside an early pre-Alpha version of the training materials for our transmissibility estimation tutorial: Overall, the tutorial is well-structured and provides a comprehensive guide on how to use the `EpiNow2` package to estimate transmission metrics from case data. However, there are several areas that could be improved to enhance clarity and understanding for learners like me. 
@@ -67,7 +66,7 @@ Suggestions for improvements: ## Reviewing forecast materials as a field epidemiologist -To further illustrate review outputs, the below is what was produced when we imported the [Lucia field epidemiologist persona](XX) alongside an early pre-Alpha version of the training materials for our forecasting tutorial: +To further illustrate review outputs, the below is what was produced when we imported the [Lucia field epidemiologist persona](https://epiverse-trace.github.io/personas/lucia-outbreaks.html) alongside an early pre-Alpha version of the training materials for our forecasting tutorial: **Positive Feedback:** From e915f75d0aba7ded2d27df37d46a4ffd841153b6 Mon Sep 17 00:00:00 2001 From: Adam Kucharski Date: Thu, 7 Nov 2024 10:34:28 -1000 Subject: [PATCH 3/9] Update date on index.qmd --- posts/ai-learner-review/index.qmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/posts/ai-learner-review/index.qmd b/posts/ai-learner-review/index.qmd index 8f4e2b7c..e5d2ce81 100644 --- a/posts/ai-learner-review/index.qmd +++ b/posts/ai-learner-review/index.qmd @@ -3,7 +3,7 @@ title: "Using LLM agents to review tutorials 'in character' as learners" author: - name: "Adam Kucharski" orcid: "0000-0001-8814-9421" -date: "2024-05-28" +date: "2024-11-07" categories: [tutorials, R, R package] format: html: From b4d5b9a40dacb42bcb73e570e17b9850e3167162 Mon Sep 17 00:00:00 2001 From: Adam Kucharski Date: Thu, 7 Nov 2024 10:35:42 -1000 Subject: [PATCH 4/9] Fix linting on index.qmd --- posts/ai-learner-review/index.qmd | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/posts/ai-learner-review/index.qmd b/posts/ai-learner-review/index.qmd index e5d2ce81..f6596d80 100644 --- a/posts/ai-learner-review/index.qmd +++ b/posts/ai-learner-review/index.qmd @@ -44,7 +44,7 @@ write_lines(output,"output_review.md")) To illustrate the sort of reviews that can be generated with the above approach, the below is the output produced when we imported 
the [Vania senior academic persona](https://epiverse-trace.github.io/personas/vania-academica.html) alongside an early pre-Alpha version of the training materials for our transmissibility estimation tutorial: -Overall, the tutorial is well-structured and provides a comprehensive guide on how to use the `EpiNow2` package to estimate transmission metrics from case data. However, there are several areas that could be improved to enhance clarity and understanding for learners like me. +Overall, the tutorial is well-structured and provides a comprehensive guide on how to use the `EpiNow2` package to estimate transmission metrics from case data. However, there are several areas that could be improved to enhance clarity and understanding for learners like me. 1. **Prerequisites**: The prerequisites section could be expanded to include specific resources for learners to review the necessary concepts. For instance, links to resources on Bayesian analysis and the effective reproduction number would be helpful. @@ -64,8 +64,8 @@ Suggestions for improvements: - Explain what learners should expect to see in the visualizations and how to interpret them. - Include exercises with solutions for learners to practice the concepts and skills they have learned. 
- ## Reviewing forecast materials as a field epidemiologist + To further illustrate review outputs, the below is what was produced when we imported the [Lucia field epidemiologist persona](https://epiverse-trace.github.io/personas/lucia-outbreaks.html) alongside an early pre-Alpha version of the training materials for our forecasting tutorial: **Positive Feedback:** From 90b66e699be00bb850b8164f19ae3456daf4e6dd Mon Sep 17 00:00:00 2001 From: Adam Kucharski Date: Tue, 12 Nov 2024 10:27:50 -1000 Subject: [PATCH 5/9] Update posts/ai-learner-review/index.qmd Co-authored-by: Chris Hartgerink --- posts/ai-learner-review/index.qmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/posts/ai-learner-review/index.qmd b/posts/ai-learner-review/index.qmd index f6596d80..bafbc148 100644 --- a/posts/ai-learner-review/index.qmd +++ b/posts/ai-learner-review/index.qmd @@ -3,7 +3,7 @@ title: "Using LLM agents to review tutorials 'in character' as learners" author: - name: "Adam Kucharski" orcid: "0000-0001-8814-9421" -date: "2024-11-07" +date: "2024-11-18" categories: [tutorials, R, R package] format: html: From 8a4e0d768cb652475f2fb3c5e401c7463379037c Mon Sep 17 00:00:00 2001 From: Adam Kucharski Date: Tue, 12 Nov 2024 10:29:21 -1000 Subject: [PATCH 6/9] Update posts/ai-learner-review/index.qmd Co-authored-by: Chris Hartgerink --- posts/ai-learner-review/index.qmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/posts/ai-learner-review/index.qmd b/posts/ai-learner-review/index.qmd index bafbc148..865123a1 100644 --- a/posts/ai-learner-review/index.qmd +++ b/posts/ai-learner-review/index.qmd @@ -14,7 +14,7 @@ format: Part of the Epiverse-TRACE initiative involves development of training materials that span early, middle and late stage outbreak analysis and modelling tasks. 
To ensure that our tutorials are accessible to target audiences, we have developed a series of [learner personas](https://github.com/epiverse-trace/personas) to inform the design of learning materials. These personas include the following: -As the volume of training materials increases, we have explored automating the generation of initial reviews using large language models (LLMs) that take the form of 'in character' agents with instructions to provide constructive comments. This reflects a wider focus within the field of outbreak analytics on how LLMs agents can be used to increase the efficiency and scalability of common tasks (e.g. [van Hoek et al, Lancet Microbe, 2024](https://www.thelancet.com/journals/lanmic/article/PIIS2666-5247(24)00104-6/fulltext) ). +As the volume of training materials increases, I have explored automating the generation of initial reviews using large language models (LLMs) that take the form of 'in character' agents with instructions to provide constructive comments. This reflects a wider focus within the field of outbreak analytics on how LLMs agents can be used to increase the efficiency and scalability of common tasks (e.g. [van Hoek et al, Lancet Microbe, 2024](https://www.thelancet.com/journals/lanmic/article/PIIS2666-5247(24)00104-6/fulltext) ). To generate the AI tutorial reviews, we use the OpenAI GPT-4 API, via the `openai` R package, as described in [this repository](https://github.com/adamkucharski/llm-api-scripts/). We also use the `gh` package to load the `.Rmd` materials from a given repository (e.g. `epiverse-trace/tutorials-middle`). Full illustrative code is [available here](https://github.com/adamkucharski/llm-api-scripts/scripts/content_review_gpt.R), with the GPT-4 API prompts outlined below. 
From 84237acccd4ff6d5193b1ea7108205140d337b63 Mon Sep 17 00:00:00 2001 From: Adam Kucharski Date: Tue, 12 Nov 2024 10:38:23 -1000 Subject: [PATCH 7/9] Add block quotes to index.qmd Highlight LLM generated text --- posts/ai-learner-review/index.qmd | 92 ++++++++++++++++--------------- 1 file changed, 49 insertions(+), 43 deletions(-) diff --git a/posts/ai-learner-review/index.qmd b/posts/ai-learner-review/index.qmd index 865123a1..1e5f990c 100644 --- a/posts/ai-learner-review/index.qmd +++ b/posts/ai-learner-review/index.qmd @@ -14,6 +14,12 @@ format: Part of the Epiverse-TRACE initiative involves development of training materials that span early, middle and late stage outbreak analysis and modelling tasks. To ensure that our tutorials are accessible to target audiences, we have developed a series of [learner personas](https://github.com/epiverse-trace/personas) to inform the design of learning materials. These personas include the following: +- **Lucia**, a Field Epidemiologist that use R for data cleaning, plotting and report for Outbreak response. +- **Juan**, a Statistician and R user in a National Health Agency with constant deployment to outbreak response. +- **Patricia**, a PhD student learning to use R and analyse Outbreak data for her collaborative project on GitHub. +- **Vania**, a professor who needs ready-to-use training for her research and to pass on to students. +- **Danielle**, a Trainer that wants to remix content to create specific training materials for public health practitioners. + As the volume of training materials increases, I have explored automating the generation of initial reviews using large language models (LLMs) that take the form of 'in character' agents with instructions to provide constructive comments. This reflects a wider focus within the field of outbreak analytics on how LLMs agents can be used to increase the efficiency and scalability of common tasks (e.g. 
[van Hoek et al, Lancet Microbe, 2024](https://www.thelancet.com/journals/lanmic/article/PIIS2666-5247(24)00104-6/fulltext) ). To generate the AI tutorial reviews, we use the OpenAI GPT-4 API, via the `openai` R package, as described in [this repository](https://github.com/adamkucharski/llm-api-scripts/). We also use the `gh` package to load the `.Rmd` materials from a given repository (e.g. `epiverse-trace/tutorials-middle`). Full illustrative code is [available here](https://github.com/adamkucharski/llm-api-scripts/scripts/content_review_gpt.R), with the GPT-4 API prompts outlined below. @@ -42,54 +48,54 @@ write_lines(output,"output_review.md")) ## Reviewing transmissibilty training materials as a senior academic -To illustrate the sort of reviews that can be generated with the above approach, the below is the output produced when we imported the [Vania senior academic persona](https://epiverse-trace.github.io/personas/vania-academica.html) alongside an early pre-Alpha version of the training materials for our transmissibility estimation tutorial: - -Overall, the tutorial is well-structured and provides a comprehensive guide on how to use the `EpiNow2` package to estimate transmission metrics from case data. However, there are several areas that could be improved to enhance clarity and understanding for learners like me. - -1. **Prerequisites**: The prerequisites section could be expanded to include specific resources for learners to review the necessary concepts. For instance, links to resources on Bayesian analysis and the effective reproduction number would be helpful. - -2. **Jargon and Terminology**: The tutorial contains several technical terms and jargon that could be confusing for learners. For example, terms like "Bayesian inference", "posterior probability", "likelihood", and "prior probability" are used without clear explanations. Providing definitions or links to resources for these terms would be beneficial. - -3. 
**Code Explanation**: While the tutorial does a good job of explaining what the code does, it could provide more context on why certain functions or packages are used. For instance, it would be helpful to explain why the `EpiNow2` package is used and how it differs from other packages. - -4. **Visualizations**: Although visualizations are not shown in the Rmd, it would be beneficial to explain what learners should expect to see in the visualizations and how to interpret them. - -5. **Exercises**: The tutorial lacks exercises for learners to practice the concepts and skills they have learned. Including exercises with solutions would greatly enhance the learning experience. - -Suggestions for improvements: - -- Expand the prerequisites section to include specific resources for learners to review necessary concepts. -- Provide definitions or links to resources for technical terms and jargon. -- Provide more context on why certain functions or packages are used. -- Explain what learners should expect to see in the visualizations and how to interpret them. -- Include exercises with solutions for learners to practice the concepts and skills they have learned. +> To illustrate the sort of reviews that can be generated with the above approach, the below is the output produced when we imported the [Vania senior academic persona](https://epiverse-trace.github.io/personas/vania-academica.html) alongside an early pre-Alpha version of the training materials for our transmissibility estimation tutorial: +> +> Overall, the tutorial is well-structured and provides a comprehensive guide on how to use the `EpiNow2` package to estimate transmission metrics from case data. However, there are several areas that could be improved to enhance clarity and understanding for learners like me. +> +> 1. **Prerequisites**: The prerequisites section could be expanded to include specific resources for learners to review the necessary concepts. 
For instance, links to resources on Bayesian analysis and the effective reproduction number would be helpful. +> +> 2. **Jargon and Terminology**: The tutorial contains several technical terms and jargon that could be confusing for learners. For example, terms like "Bayesian inference", "posterior probability", "likelihood", and "prior probability" are used without clear explanations. Providing definitions or links to resources for these terms would be beneficial. +> +> 3. **Code Explanation**: While the tutorial does a good job of explaining what the code does, it could provide more context on why certain functions or packages are used. For instance, it would be helpful to explain why the `EpiNow2` package is used and how it differs from other packages. +> +> 4. **Visualizations**: Although visualizations are not shown in the Rmd, it would be beneficial to explain what learners should expect to see in the visualizations and how to interpret them. +> +> 5. **Exercises**: The tutorial lacks exercises for learners to practice the concepts and skills they have learned. Including exercises with solutions would greatly enhance the learning experience. +> +> Suggestions for improvements: +> +> - Expand the prerequisites section to include specific resources for learners to review necessary concepts. +> - Provide definitions or links to resources for technical terms and jargon. +> - Provide more context on why certain functions or packages are used. +> - Explain what learners should expect to see in the visualizations and how to interpret them. +> - Include exercises with solutions for learners to practice the concepts and skills they have learned. 
## Reviewing forecast materials as a field epidemiologist To further illustrate review outputs, the below is what was produced when we imported the [Lucia field epidemiologist persona](https://epiverse-trace.github.io/personas/lucia-outbreaks.html) alongside an early pre-Alpha version of the training materials for our forecasting tutorial: -**Positive Feedback:** - -1. The tutorial is well-structured and follows a logical progression, which makes it easier to follow. -2. The use of real-world examples and datasets (COVID-19 and Ebola) makes the tutorial more relatable and practical. -3. The inclusion of questions, objectives, prerequisites, and key points at the beginning and end of the tutorial respectively, helps to guide the learning process. -4. The tutorial provides a good balance between theory and practice, with clear explanations and practical coding examples. -5. The use of callouts to highlight important information or provide additional context is very helpful. - -**Constructive Feedback:** - -1. The tutorial assumes a certain level of familiarity with R and statistical concepts, which might be challenging for beginners. It would be helpful to provide more explanations or links to resources for beginners. -2. The code chunks are quite complex and could be broken down into smaller parts with more explanations. -3. The tutorial could benefit from more visualizations to illustrate the concepts and results. -4. The challenge section could provide more guidance or hints to help learners complete the task. - -**Suggestions for Improvement:** - -1. Provide more explanations or links to resources for beginners to learn R and statistical concepts. -2. Break down complex code chunks into smaller parts with more explanations. -3. Include more visualizations to illustrate the concepts and results. -4. Provide more guidance or hints in the challenge section to help learners complete the task. -5. 
Include a glossary of terms or a list of R packages used in the tutorial for quick reference. +> **Positive Feedback:** +> +> 1. The tutorial is well-structured and follows a logical progression, which makes it easier to follow. +> 2. The use of real-world examples and datasets (COVID-19 and Ebola) makes the tutorial more relatable and practical. +> 3. The inclusion of questions, objectives, prerequisites, and key points at the beginning and end of the tutorial respectively, helps to guide the learning process. +> 4. The tutorial provides a good balance between theory and practice, with clear explanations and practical coding examples. +> 5. The use of callouts to highlight important information or provide additional context is very helpful. +> +> **Constructive Feedback:** +> +> 1. The tutorial assumes a certain level of familiarity with R and statistical concepts, which might be challenging for beginners. It would be helpful to provide more explanations or links to resources for beginners. +> 2. The code chunks are quite complex and could be broken down into smaller parts with more explanations. +> 3. The tutorial could benefit from more visualizations to illustrate the concepts and results. +> 4. The challenge section could provide more guidance or hints to help learners complete the task. +> +> **Suggestions for Improvement:** +> +> 1. Provide more explanations or links to resources for beginners to learn R and statistical concepts. +> 2. Break down complex code chunks into smaller parts with more explanations. +> 3. Include more visualizations to illustrate the concepts and results. +> 4. Provide more guidance or hints in the challenge section to help learners complete the task. +> 5. Include a glossary of terms or a list of R packages used in the tutorial for quick reference. 
## Overcoming feedback bottlenecks From 402a2aa9d93eb827f856837654d013f6e620afe3 Mon Sep 17 00:00:00 2001 From: Adam Kucharski Date: Tue, 12 Nov 2024 10:39:44 -1000 Subject: [PATCH 8/9] Fix inconsistency in block quote --- posts/ai-learner-review/index.qmd | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/posts/ai-learner-review/index.qmd b/posts/ai-learner-review/index.qmd index 1e5f990c..1a2396ea 100644 --- a/posts/ai-learner-review/index.qmd +++ b/posts/ai-learner-review/index.qmd @@ -48,8 +48,8 @@ write_lines(output,"output_review.md")) ## Reviewing transmissibilty training materials as a senior academic -> To illustrate the sort of reviews that can be generated with the above approach, the below is the output produced when we imported the [Vania senior academic persona](https://epiverse-trace.github.io/personas/vania-academica.html) alongside an early pre-Alpha version of the training materials for our transmissibility estimation tutorial: -> +To illustrate the sort of reviews that can be generated with the above approach, the below is the output produced when we imported the [Vania senior academic persona](https://epiverse-trace.github.io/personas/vania-academica.html) alongside an early pre-Alpha version of the training materials for our transmissibility estimation tutorial: + > Overall, the tutorial is well-structured and provides a comprehensive guide on how to use the `EpiNow2` package to estimate transmission metrics from case data. However, there are several areas that could be improved to enhance clarity and understanding for learners like me. > > 1. **Prerequisites**: The prerequisites section could be expanded to include specific resources for learners to review the necessary concepts. For instance, links to resources on Bayesian analysis and the effective reproduction number would be helpful. 
From 20e813a9e333de053cca87e285c778018b8437ef Mon Sep 17 00:00:00 2001 From: Adam Kucharski Date: Thu, 14 Nov 2024 15:14:25 -1000 Subject: [PATCH 9/9] Add Andree as author --- posts/ai-learner-review/index.qmd | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/posts/ai-learner-review/index.qmd b/posts/ai-learner-review/index.qmd index 1a2396ea..ade7eab0 100644 --- a/posts/ai-learner-review/index.qmd +++ b/posts/ai-learner-review/index.qmd @@ -3,6 +3,8 @@ title: "Using LLM agents to review tutorials 'in character' as learners" author: - name: "Adam Kucharski" orcid: "0000-0001-8814-9421" + - name: "Andree Valle Campos" + orcid: "0000-0002-7779-481X" date: "2024-11-18" categories: [tutorials, R, R package] format: @@ -20,7 +22,7 @@ Part of the Epiverse-TRACE initiative involves development of training materials - **Vania**, a professor who needs ready-to-use training for her research and to pass on to students. - **Danielle**, a Trainer that wants to remix content to create specific training materials for public health practitioners. -As the volume of training materials increases, I have explored automating the generation of initial reviews using large language models (LLMs) that take the form of 'in character' agents with instructions to provide constructive comments. This reflects a wider focus within the field of outbreak analytics on how LLMs agents can be used to increase the efficiency and scalability of common tasks (e.g. [van Hoek et al, Lancet Microbe, 2024](https://www.thelancet.com/journals/lanmic/article/PIIS2666-5247(24)00104-6/fulltext) ). +As the volume of training materials increases, we have explored automating the generation of initial reviews using large language models (LLMs) that take the form of 'in character' agents with instructions to provide constructive comments. This reflects a wider focus within the field of outbreak analytics on how LLMs agents can be used to increase the efficiency and scalability of common tasks (e.g. 
[van Hoek et al, Lancet Microbe, 2024](https://www.thelancet.com/journals/lanmic/article/PIIS2666-5247(24)00104-6/fulltext) ). To generate the AI tutorial reviews, we use the OpenAI GPT-4 API, via the `openai` R package, as described in [this repository](https://github.com/adamkucharski/llm-api-scripts/). We also use the `gh` package to load the `.Rmd` materials from a given repository (e.g. `epiverse-trace/tutorials-middle`). Full illustrative code is [available here](https://github.com/adamkucharski/llm-api-scripts/scripts/content_review_gpt.R), with the GPT-4 API prompts outlined below.