Text Generation

[ Slides | Sketches ]

Objectives

  • Learn about sequential data and Markov Chains.
  • Learn about Recurrent Neural Network (RNN) architectures.
  • Learn about the transformer architecture.
  • Learn to work with large language models (LLMs) in JavaScript.

Lecture Notes

Tools

Related Projects

Text

Drawing

Music

Coding

Code Example

Note: The Ollama examples below only run locally, with Ollama installed and running on your machine.

Markov Chain
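
As a rough illustration of the idea, here is a minimal sketch of a character-level Markov chain in plain JavaScript. It is not the course's own example: the function names `buildModel` and `generate`, the sample corpus, and the choice of character n-grams (rather than words) are all assumptions made for this sketch. The model records which character follows each n-gram in the source text, then generates new text by repeatedly picking one of those recorded followers at random.

```javascript
// Hypothetical helper: map each n-gram of `order` characters to the list of
// characters that follow it somewhere in the source text.
function buildModel(text, order) {
  const model = new Map();
  for (let i = 0; i + order < text.length; i++) {
    const gram = text.slice(i, i + order);
    const next = text[i + order];
    if (!model.has(gram)) model.set(gram, []);
    // Repeated followers are kept, so frequent ones are picked more often.
    model.get(gram).push(next);
  }
  return model;
}

// Hypothetical helper: grow `seed` one character at a time until it reaches
// `length` characters or hits an n-gram the model has never seen.
// Assumes seed.length equals the order used to build the model.
// `random` is injectable so results can be made repeatable.
function generate(model, seed, length, random = Math.random) {
  const order = seed.length;
  let output = seed;
  while (output.length < length) {
    const gram = output.slice(-order);
    const options = model.get(gram);
    if (!options) break; // dead end: no recorded follower
    output += options[Math.floor(random() * options.length)];
  }
  return output;
}

// Example usage with a made-up corpus.
const corpus = "the theremin thereby thrills the thespian";
const model = buildModel(corpus, 3);
console.log(generate(model, "the", 40));
```

A higher order produces output that stays closer to the source text, while a lower order produces looser, more surprising results.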

Ollama

OpenAI

Supplemental Materials

Markov Chains

RNNs and LSTMs

Transformers

Video Tutorials

Note: The ml5.js tutorials below were taught using an older version of ml5.js. Refer to the ml5.js Resources Wiki page for more information.

Text

Drawing

Assignment 7

  1. Read What Can Machine Learning Teach Us About Ourselves?, an interview with Emily Martinez, ml5.js Fellow 2020.
  2. Read The Subtext of a Black Corpus, in conversation with ITP research fellows Nikita Huggins & Ayodamola Okunseinde by Ashley Lewis.
  3. Emily Martinez proposes a set of questions to ask related to working with a corpus of text data. Pick one (or two) of the questions to reflect on as you respond to the above two readings:
    • How can we be more intentional about what we build given the current limitations, problems, and constraints of ML algorithms?
    • How do we prepare datasets and set up guidelines that protect the bodies of knowledge of our communities, that honors lineage, that upholds ethical frameworks rooted in shared, agreed-upon values?
    • How do we work in consensual and respectful ways with texts by marginalized authors that are not as well-represented, and by virtue of that fact alone, much more likely to be misrepresented, misappropriated, or misunderstood if we are not careful?
    • How well can we ensure that the essence of these texts doesn’t dissolve into a word-soup that gets misconstrued?
    • Given that so many of the existing “big data” language models are trained with Western texts and proprietary datasets, what does it even mean to try to decolonize AI?
    • Who do we entrust to do this work?
    • How do we deal with credit and attribution of our new creations?
    • How do we really do ethics with machine learning?
    • How do we get through this whole list of concerns and still build AI that is fun, respectful, tender, pleasurable, kind?
  4. Document your response to the readings in a blog post and add a link to the post on the Assignment 7 Wiki page.
  5. Review the final project proposal guidelines and post your final project proposal and slides on the Final Proposal Wiki page.