Name
archive
code_huggingface
code_pytorch
depth-to-image
domain-adaptation
generated-images
img
inpainting
outpainting
style-transfer
upscaling
wav
.gitignore
01_generate_text2image_sdxl.ipynb
02_style_transfer_sdxl.ipynb
03_domain_adaptation_sdxl.ipynb
04_upscaling_sd.ipynb
05_inpainting_sd.ipynb
06_depth_to_image_sd.ipynb
07_visual_question_answer_idefics.ipynb
08_visual_question_answer_idefics_sagemaker.ipynb
09_speech_to_text_whisper_sagemaker_huggingface.ipynb
10_speech_to_text_whisper_sagemaker.ipynb
README.md

README.md

Chapter 10: Multimodal Foundation Models

Questions and Answers

Q: What are the typical use cases for multimodal foundation models?

A: Typical use cases include text summarization, rewriting, information extraction, question answering (QA) and visual question answering (VQA), detection of toxic or harmful content, classification and content moderation, conversational interfaces, translation, source code generation, reasoning, masking of personally identifiable information (PII), and personalized marketing and ads.

Q: How does image generation differ from image editing and enhancement?

A: Image generation involves creating images from text prompts, while image editing and enhancement modify existing images based on instructions and prompts, supporting use cases like artistic style transfer, domain adaptation, and upscaling.

Q: What are best practices for multimodal prompt engineering for image-based generative AI?

A: Understand the nuances of the foundation model, define the type of image, describe the subject, specify style and artists, be specific about quality, and be expressive in prompt writing.
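As a minimal sketch of these practices, the snippet below assembles a text-to-image prompt from the parts named in the answer: image type, subject, style, artists, and quality descriptors. The helper `build_image_prompt` and its parameter names are hypothetical and are not taken from the chapter's notebooks; the resulting string could be passed to any text-to-image model.

```python
def build_image_prompt(image_type, subject, style=None, artists=(), quality=()):
    """Assemble a text-to-image prompt from its parts (hypothetical helper)."""
    # Start with the type of image and the subject it depicts.
    parts = [f"{image_type} of {subject}"]
    # Optionally name a style and one or more artists.
    if style:
        parts.append(f"in the style of {style}")
    if artists:
        parts.append("inspired by " + " and ".join(artists))
    # Append explicit quality descriptors last.
    parts.extend(quality)
    return ", ".join(parts)


prompt = build_image_prompt(
    image_type="oil painting",
    subject="a lighthouse on a rocky coast at sunset",
    style="impressionism",
    quality=("highly detailed", "soft warm lighting"),
)
print(prompt)
```

Keeping each part as a separate argument makes it easier to vary one element at a time, for example the style, while holding the subject fixed.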

Q: Can you explain inpainting, outpainting, and depth-to-image techniques?

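A: Inpainting, outpainting, and depth-to-image are image-editing tasks within generative AI. Inpainting regenerates a masked region inside an existing image, outpainting extends an image beyond its original borders, and depth-to-image generates a new image guided by a depth map of the source image so that the structure of the scene is preserved.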
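Inpainting and outpainting pipelines commonly take a mask alongside the image that marks which pixels the model should generate. The sketch below prepares such inputs with plain nested lists standing in for image arrays; the function names `inpaint_mask` and `outpaint_inputs` are hypothetical, and the convention that 1 means "generate" and 0 means "keep" is an assumption that varies between libraries.

```python
def inpaint_mask(height, width, box):
    """Return a mask where 1 marks pixels to regenerate and 0 marks pixels to keep.

    box is (top, left, bottom, right), with bottom and right exclusive.
    """
    top, left, bottom, right = box
    return [
        [1 if top <= r < bottom and left <= c < right else 0 for c in range(width)]
        for r in range(height)
    ]


def outpaint_inputs(image, pad, fill=0):
    """Pad the image on all sides and return (canvas, mask).

    The mask is 1 on the new border (to be generated) and 0 over the original.
    """
    h, w = len(image), len(image[0])
    new_h, new_w = h + 2 * pad, w + 2 * pad
    canvas = [[fill] * new_w for _ in range(new_h)]
    mask = [[1] * new_w for _ in range(new_h)]
    for r in range(h):
        for c in range(w):
            canvas[r + pad][c + pad] = image[r][c]
            mask[r + pad][c + pad] = 0
    return canvas, mask


# Inpainting: regenerate a 1x2 region inside a 3x4 image.
mask = inpaint_mask(3, 4, (1, 1, 2, 3))

# Outpainting: grow a 2x2 image by one pixel on every side.
canvas, border_mask = outpaint_inputs([[5, 6], [7, 8]], pad=1)
```

The two tasks differ only in where the mask sits: inside the original image for inpainting, around it for outpainting.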

Q: How does image captioning contribute to visual question answering?

A: Image captioning combines computer vision and natural language processing to describe an image in text. VQA builds on the same capability: the model must understand both the visual information in the image and the text of the question to give an accurate and relevant answer.

Chapters

Chapter 1 - Generative AI Use Cases, Fundamentals, Project Lifecycle
Chapter 2 - Prompt Engineering and In-Context Learning
Chapter 3 - Large-Language Foundation Models
Chapter 4 - Quantization and Distributed Computing
Chapter 5 - Fine-Tuning and Evaluation
Chapter 6 - Parameter-Efficient Fine-Tuning (PEFT)
Chapter 7 - Fine-Tuning Using Reinforcement Learning from Human Feedback (RLHF)
Chapter 8 - Optimize and Deploy Generative AI Applications
Chapter 9 - Retrieval Augmented Generation (RAG) and Agents
Chapter 10 - Multimodal Foundation Models
Chapter 11 - Controlled Generation and Fine-Tuning with Stable Diffusion
Chapter 12 - Amazon Bedrock Managed Service for Generative AI

Related Resources

YouTube Channel: https://youtube.generativeaionaws.com
Generative AI on AWS Meetup (Global, Virtual): https://meetup.generativeaionaws.com
Generative AI on AWS O'Reilly Book: https://www.amazon.com/Generative-AI-AWS-Multimodal-Applications/dp/1098159225/
Data Science on AWS O'Reilly Book: https://www.amazon.com/Data-Science-AWS-End-End/dp/1492079391/
