
Commit

adding content to llm_explore
Andrew Sheet committed Oct 14, 2024
1 parent d7a04b1 commit 6b676f6
Showing 1 changed file with 20 additions and 1 deletion.
21 changes: 20 additions & 1 deletion content/modules/ROOT/pages/60_llm_explore.adoc
@@ -1,15 +1,29 @@
# What is a Large Language Model?

A Large Language Model (LLM) is an instance of a foundation model. Foundation models are pre-trained on large amounts of unlabeled data using self-supervised learning. This means that the model learns from patterns in the data in a way that produces generalizable and adaptable output. LLMs are foundation models applied specifically to text and text-like content, such as code.

Large language models are trained on large datasets of text, such as books, articles, and conversations. These datasets can be extremely large, on the order of petabytes. Training is the process of teaching the LLM to understand and generate language; it uses algorithms to learn patterns in the data and predict what comes next. ~https://www.ibm.com/topics/large-language-models[1]~ Training an LLM on your own data can help ensure that it gives answers appropriate to your domain.

The term 'large' in LLM refers to the number of parameters in the model. These parameters are variables that the model uses to make predictions. The higher the number of parameters, the more detailed and nuanced the AI's understanding of language can be. However, training such models requires considerable computational resources and specialized expertise. ~https://www.run.ai/guides/machine-learning-engineering/llm-training[2]~
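To put parameter counts in perspective: stored in 16-bit precision, each parameter occupies two bytes, so the 80-million-parameter flan-t5-small model used later in this module takes roughly 80M × 2 bytes ≈ 160 MB for its weights, while a 3-billion-parameter model such as granite-3b-code-base needs about 6 GB before accounting for activations and KV cache.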

There are many different types of LLMs for different use cases. Be sure to choose the appropriate one for your specific use case.

# Explore LLMs

In the https://github.com/redhat-ai-services/ai-accelerator[ai-accelerator project], there is an example of an LLM. Let's look at the https://github.com/redhat-ai-services/ai-accelerator/tree/main/tenants/ai-example/single-model-serving-tgis[single-model-serving-tgis] example.

This inference service uses the https://huggingface.co/google/flan-t5-small[flan-t5-small] model.

FLAN-T5 is a Large Language Model open sourced by Google under the Apache license at the end of 2022. We are using the small variant, which has 80 million parameters. FLAN-T5 models combine the pretrained T5 (Text-to-Text Transfer Transformer) model with the FLAN (Finetuning Language Models) collection to fine-tune the model on multiple tasks.

The model was uploaded to MinIO S3 storage automatically when we ran the bootstrap script. The inference service uses the _TGIS Standalone ServingRuntime for KServe_ and is _**not**_ using a GPU.
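As a rough sketch of how these pieces fit together, a KServe InferenceService points at a model in S3 storage and names the runtime that should serve it. The name, model format, and bucket path below are illustrative assumptions, not the repository's actual values; check the manifests under tenants/ai-example/single-model-serving-tgis for the real definitions.

[source,yaml]
----
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: flan-t5-small          # illustrative name
spec:
  predictor:
    model:
      modelFormat:
        name: pytorch          # assumption; check the actual manifest
      runtime: tgis-runtime    # must match the ServingRuntime's name
      storageUri: s3://models/flan-t5-small/   # illustrative bucket path
      resources:
        limits:
          cpu: "2"             # CPU only; no nvidia.com/gpu request here
          memory: 8Gi
----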

Take a look at the InferenceService and the ServingRuntime resource in your _**Demo**_ cluster.

Now let's take a look at the https://github.com/redhat-ai-services/ai-accelerator/tree/main/tenants/ai-example/single-model-serving-vllm[single-model-serving-vllm] example. This inference service uses IBM's https://huggingface.co/ibm-granite/granite-3b-code-base[granite-3b-code-base] model.

The Granite-3B-Code-Base-2K is a decoder-only code model designed for code generative tasks (e.g., code generation, code explanation, code fixing). It was trained from scratch with a two-phase training strategy. In phase 1, the model was trained on 4 trillion tokens sourced from 116 programming languages, ensuring a comprehensive understanding of programming languages and syntax. In phase 2, the model was trained on 500 billion tokens with a carefully designed mixture of high-quality data from code and natural-language domains to improve its ability to reason and follow instructions.

Prominent enterprise use cases of LLMs in software engineering productivity include code generation, code explanation, code fixing, generating unit tests, generating documentation, addressing technical debt, vulnerability detection, code translation, and more. All Granite Code Base models, including this 3B-parameter model, can handle these tasks, as they were trained on a large amount of code data from 116 programming languages.
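Since this model is served on a GPU, its InferenceService additionally requests a GPU and tolerates the taint on the GPU nodes (taints are covered in the next section). Again, this is a hedged sketch with illustrative names and values, not the repository's exact manifest:

[source,yaml]
----
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: granite-3b-code-base   # illustrative name
spec:
  predictor:
    model:
      modelFormat:
        name: pytorch          # assumption; check the actual manifest
      runtime: vllm-runtime    # must match the vLLM ServingRuntime's name
      storageUri: s3://models/granite-3b-code-base/   # illustrative path
      resources:
        limits:
          nvidia.com/gpu: "1"  # schedules the predictor onto a GPU node
    tolerations:
      - key: nvidia.com/gpu    # assumption; must match the GPU node taint
        operator: Exists
        effect: NoSchedule
----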

The InferenceService uses a vLLM ServingRuntime, which can be found https://github.com/rh-aiservices-bu/llm-on-openshift/blob/main/serving-runtimes/vllm_runtime/vllm-runtime.yaml[here].
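In outline, a ServingRuntime declares the model formats it can serve and the container that serves them; KServe mounts the model from the InferenceService's storageUri into that container. The image, arguments, and port below are illustrative assumptions; see the linked vllm-runtime.yaml for the actual definition.

[source,yaml]
----
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: vllm-runtime           # illustrative name
spec:
  supportedModelFormats:
    - name: pytorch            # assumption; check the linked YAML
      autoSelect: true
  containers:
    - name: kserve-container
      image: vllm/vllm-openai:latest   # illustrative image and tag
      args:
        - --model=/mnt/models  # KServe mounts the model at this path
        - --port=8080
      ports:
        - containerPort: 8080
          protocol: TCP
----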

### Nodes and Taints
@@ -62,4 +76,9 @@ After exploring the GPU Node details, open RHOAI and launch new workbench and ru
- tenants/ai-example/single-model-serving-tgis/test
- tenants/ai-example/single-model-serving-vllm/test
These are very simple tests to make sure that the InferenceService is working. View the logs of the inference service pod while you test.
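As an illustration only (not the repository's actual test), a minimal smoke test against the vLLM service's OpenAI-compatible endpoint could be expressed as a Kubernetes Job that posts a completion request; the Job name, service URL, and model name below are hypothetical:

[source,yaml]
----
apiVersion: batch/v1
kind: Job
metadata:
  name: vllm-smoke-test        # hypothetical name
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: curl
          image: curlimages/curl:latest
          args:
            - -sS
            - -X
            - POST
            - -H
            - "Content-Type: application/json"
            - -d
            - '{"model": "/mnt/models", "prompt": "def hello():", "max_tokens": 20}'
            # hypothetical in-cluster predictor URL; use your actual endpoint
            - http://granite-3b-code-base-predictor:8080/v1/completions
----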


### References
1. https://www.ibm.com/topics/large-language-models[]
2. https://www.run.ai/guides/machine-learning-engineering/llm-training[]
