Notes on the short course from DeepLearning.AI
- Document loaders convert raw data into a standardized document format
- You can use a speech-to-text model (e.g. OpenAI's Whisper model) to load data from audio/video files
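For example, loading a PDF takes only a couple of lines. A minimal sketch, assuming the course-era `langchain` package layout (newer releases expose the same loaders from `langchain_community.document_loaders`) and a hypothetical file path:

```python
from langchain.document_loaders import PyPDFLoader  # needs the pypdf package

loader = PyPDFLoader("docs/example.pdf")  # hypothetical path
pages = loader.load()                     # returns one Document per page
print(pages[0].page_content[:200])
print(pages[0].metadata)                  # e.g. source file and page number

# For audio/video the course combines a blob loader with the Whisper parser, e.g.
# GenericLoader(YoutubeAudioLoader([url], save_dir), OpenAIWhisperParser())
```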
- Documents need to be split in order to be processed efficiently. Select a splitter and use `create_documents()` to create documents from a list of texts and `split_documents()` to split existing documents
- Splitters can be configured with custom separators, including regular expressions
- Store the splits, converted to embeddings, in a vector store (e.g. Chroma to get started) for efficient retrieval
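A minimal sketch of splitting and indexing, assuming the course-era imports (newer LangChain versions move these to `langchain_text_splitters`, `langchain_openai`, and `langchain_community.vectorstores`), an `OPENAI_API_KEY` in the environment, and the `pages` list from the loader sketch above; the chunk sizes and persist directory are illustrative:

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma

splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
splits = splitter.split_documents(pages)       # Documents in, smaller Documents out
# splitter.create_documents(["some raw text"]) # use this when starting from plain strings

# Embed the splits and persist them in a local Chroma collection
vectordb = Chroma.from_documents(
    documents=splits,
    embedding=OpenAIEmbeddings(),
    persist_directory="docs/chroma",           # hypothetical path
)
```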
- Start with semantic similarity search, then try maximum marginal relevance (MMR), SelfQuery (a.k.a. LLM-aided retrieval), or compression
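These retrieval variants map onto vector-store methods roughly as in the sketch below, reusing the `vectordb` built above; the query is made up for illustration:

```python
question = "What did the lecturer say about regression?"  # hypothetical query

# Plain semantic similarity search
docs = vectordb.similarity_search(question, k=3)

# MMR trades some similarity for diversity among the returned splits
docs_mmr = vectordb.max_marginal_relevance_search(question, k=3, fetch_k=10)

# The same behaviour is available when wrapping the store as a retriever
retriever = vectordb.as_retriever(search_type="mmr", search_kwargs={"k": 3})
# SelfQueryRetriever and ContextualCompressionRetriever wrap a base retriever
# with an LLM for metadata filtering or for trimming the retrieved splits.
```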
- The general retrieval workflow looks as follows:
  - The user asks a question
  - The question is converted to an embedding, sent to the vector store, and used to retrieve relevant splits
  - The relevant splits (system prompt) and the original question (human prompt) are sent to the LLM, as sketched below
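Done by hand, those steps look roughly like the sketch below (it assumes the course-era `ChatOpenAI` chat-model API and the `vectordb` from above; the question is made up). The chains in the next note automate exactly this:

```python
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage

question = "How is the course graded?"  # hypothetical question

# Retrieve relevant splits and stuff them into the system prompt
relevant_splits = vectordb.similarity_search(question, k=3)
context = "\n\n".join(doc.page_content for doc in relevant_splits)

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
answer = llm([
    SystemMessage(content=f"Answer using only this context:\n{context}"),
    HumanMessage(content=question),
])
print(answer.content)
```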
- Use the RetrievalQA chain to ask questions about your documents, and switch to ConversationalRetrievalChain with ConversationBufferMemory if you need to introduce memory (both sketched below)
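A minimal sketch of both chains, assuming the course-era imports and the `llm` and `vectordb` objects from the earlier sketches; the questions are made up:

```python
from langchain.chains import RetrievalQA, ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

# One-shot question answering over the indexed documents
qa_chain = RetrievalQA.from_chain_type(llm, retriever=vectordb.as_retriever())
result = qa_chain({"query": "Is probability a prerequisite?"})
print(result["result"])

# Conversational variant: the buffer memory keeps the chat history so
# follow-up questions are understood in context
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
chat_chain = ConversationalRetrievalChain.from_llm(
    llm,
    retriever=vectordb.as_retriever(),
    memory=memory,
)
print(chat_chain({"question": "Is probability a prerequisite?"})["answer"])
```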