From 9f69b2e9adba1ce052879cc934818b02237bc393 Mon Sep 17 00:00:00 2001
From: Owen
Date: Mon, 30 Oct 2023 22:13:53 -0400
Subject: [PATCH] Add setup docs

---
 docs/source/index.rst              |  18 ++--
 docs/source/setup/installation.rst |  93 ++++++++++++++++++++
 docs/source/setup/quickstart.rst   | 136 +++++++++++++++++++++++++++++
 3 files changed, 238 insertions(+), 9 deletions(-)
 create mode 100644 docs/source/setup/installation.rst
 create mode 100644 docs/source/setup/quickstart.rst

diff --git a/docs/source/index.rst b/docs/source/index.rst
index 6c48864..4833308 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -21,14 +21,14 @@ Welcome to SciPHi!

-SciPhi [ΨΦ]: AI's Knowledge Engine for Tailored Data Generation 💡
+SciPhi [ΨΦ]: AI's Knowledge Engine 💡
 -----------------------------------------------------------------
 
 SciPhi is a powerful knowledge engine tailored for LLM-based data generation and management. With SciPhi, you can:
 
-* Generate datasets using various LLMs, supporting **Anthropic**, **OpenAI**, **vLLM**, and **SciPhi API**.
+* Generate datasets using various LLMs, supporting **Anthropic**, **OpenAI**, **vLLM**, and **SciPhi**.
 * Tap into the **Retriever-Augmented Generation (RAG)** for data anchoring to real-world sources.
   - Features like end-to-end cloud and local RAG knowledge engine APIs are underway!
 * Custom tailor your data creation for applications such as LLM training, RAG, and beyond.
@@ -41,19 +41,19 @@ Quick and easy setup:
 Diverse Features:
 
-* Engage with the community on platforms like `Discord `_.
-* Seamlessly integrate multiple LLM and RAG providers like OpenAI, Anthropic, HuggingFace, and vLLM.
+* Seamlessly integrate multiple LLM and RAG providers like SciPhi, OpenAI, Anthropic, HuggingFace, and vLLM.
 * Generate custom datasets and even full textbooks using SciPhi's unique capabilities.
 * Evaluate your RAG systems effectively with the SciPhi evaluation harness.
+* Engage with the community on platforms like `Discord `_.
 
 Developers can also instantiate their own LLM and RAG providers using the SciPhi framework. The supported LLM providers include popular choices like OpenAI, Anthropic, HuggingFace, and vLLM. For specialized RAG capabilities, SciPhi offers the **World Databasef API** for comprehensive database access.
 For a detailed setup guide, deeper feature exploration, and developer insights, refer to:
 
 * `SciPhi GitHub Repository `_
-* `Example Textbook Generated with SciPhi `_
-* `Default Settings for Textbook Generation `_
-* `Library of SciPhi Books `_
+* `Example Textbook Generated with SciPhi `_
+* `Default Settings for Textbook Generation `_
+* `Library of SciPhi Books `_
 
 Do consider citing our work if SciPhi aids your research. Check the citation section for details.
@@ -66,8 +66,8 @@ Documentation
    :maxdepth: 1
    :caption: Getting Started
 
-   getting_started/installation
-   getting_started/quickstart
+   setup/installation
+   setup/quickstart
 
 .. toctree::
    :maxdepth: 1
diff --git a/docs/source/setup/installation.rst b/docs/source/setup/installation.rst
new file mode 100644
index 0000000..c04499a
--- /dev/null
+++ b/docs/source/setup/installation.rst
@@ -0,0 +1,93 @@
+.. _sciphi_installation:
+
+Installation for SciPhi [ΨΦ]: AI's Knowledge Engine 💡
+=====================================================
+
+

+SciPhi Logo +

+
+SciPhi is a powerful knowledge engine that integrates with multiple LLM providers and RAG providers, allowing for customizable data creation, retriever-augmented generation, and even textbook generation.
+
+Requirements
+------------
+
+- **Python**: `>=3.9,<3.12`
+- **Libraries**: (Please refer to the README for a detailed list)
+
+Fast Installation with pip
+--------------------------
+
+Installing SciPhi is as simple as using pip:
+
+.. code-block:: console
+
+    $ pip install sciphi
+
+Optional Extra Dependencies
+---------------------------
+
+For complete advanced features and provider support:
+
+.. code-block:: console
+
+    $ pip install 'sciphi[all_with_extras]'
+
+Setting Up Your Environment
+---------------------------
+
+After installation, set up your environment to link with supported LLM providers:
+
+.. code-block:: console
+
+    $ cd your_working_directory
+    $ nano .env  # Adjust the .env file with your specific configurations.
+
+Here is an example of the configuration in the `.env` file:
+
+.. code-block:: bash
+
+    OPENAI_API_KEY=your_openai_api_key
+    ANTHROPIC_API_KEY=your_anthropic_api_key
+    HF_TOKEN=your_huggingface_token
+    VLLM_API_KEY=your_vllm_api_key
+    SCIPHI_API_KEY=your_sciphi_api_key
+    RAG_API_KEY=your_rag_server_api_key
+    RAG_API_BASE=your_rag_api_base_url
+
+.. note::
+    Make sure to save and exit the file after making changes.
+
+Development Setup
+-----------------
+
+To set up SciPhi for development:
+
+.. code-block:: console
+
+    $ git clone https://github.com/emrgnt-cmplxty/sciphi.git
+    $ cd sciphi
+    $ pip3 install poetry  # If you do not have Poetry installed.
+    $ poetry install
+    $ poetry install -E all_with_extras
+
+Licensing and Acknowledgment
+----------------------------
+
+SciPhi is licensed under the `Apache-2.0 License <./LICENSE>`_.
+
+Citing Our Work
+---------------
+
+If you're using SciPhi in your research or project, please cite our work:
+
+.. code-block:: plaintext
+
+    @software{SciPhi,
+        author = {Colegrove, Owen},
+        doi = {Pending},
+        month = {09},
+        title = {{SciPhi: A Framework for LLM Powered Data}},
+        url = {https://github.com/sciphi-ai/sciphi},
+        year = {2023}
+    }
diff --git a/docs/source/setup/quickstart.rst b/docs/source/setup/quickstart.rst
new file mode 100644
index 0000000..21217ca
--- /dev/null
+++ b/docs/source/setup/quickstart.rst
@@ -0,0 +1,136 @@
+.. _sciphi_quickstart:
+
+SciPhi Quickstart
+=================
+
+Welcome to the SciPhi quickstart guide! SciPhi, or ΨΦ, is your portal to using large language models (LLMs) like OpenAI's models, Anthropic, HuggingFace, and vLLM, combined with the power of Retriever-Augmented Generation (RAG).
+
+This guide will introduce you to:
+- Generating data tailored to your needs.
+- Using the RAG provider interface.
+- Creating RAG-enhanced textbooks.
+- Evaluating your RAG pipeline.
+
+Let's get started!
+
+Setting Up Your Environment
+---------------------------
+
+Before you start, ensure you've installed SciPhi:
+
+.. code-block:: bash
+
+    pip install sciphi
+
+For additional details, refer to the `installation guide `_.
+
+Instantiate Your LLM and RAG Provider
+-------------------------------------
+
+Here's a simple example of how you can utilize SciPhi to work with your own LLM and RAG provider:
+
+.. code-block:: python
+
+    from sciphi.core import LLMProviderName, RAGProviderName
+    from sciphi.interface import LLMInterfaceManager, RAGInterfaceManager
+    from sciphi.llm import GenerationConfig
+
+    # Define your parameters here...
+
+    # RAG Provider Settings
+    rag_interface = (
+        RAGInterfaceManager.get_interface_from_args(
+            RAGProviderName(rag_provider_name),
+            api_base=rag_api_base or llm_api_base,
+            api_key=rag_api_key or llm_api_key,
+            top_k=rag_top_k,
+        )
+        if rag_enabled
+        else None
+    )
+
+    # LLM Provider Settings
+    llm_interface = LLMInterfaceManager.get_interface_from_args(
+        LLMProviderName(llm_provider_name),
+        api_key=llm_api_key,
+        api_base=llm_api_base,
+        rag_interface=rag_interface,
+        model_name=llm_model_name,
+    )
+
+    # Set up typical LLM generation settings
+    completion_config = GenerationConfig(
+        temperature=llm_temperature,
+        top_k=llm_top_k,
+        max_tokens_to_sample=llm_max_tokens_to_sample,
+        model_name=llm_model_name,
+        skip_special_tokens=llm_skip_special_tokens,
+        stop_token=SciPhiFormatter.INIT_PARAGRAPH_TOKEN,
+    )
+
+    # Get the completion for a prompt
+    completion = llm_interface.get_completion(prompt, completion_config)
+
+    # Continue with your process...
+This example showcases the flexibility and power of SciPhi, allowing you to seamlessly integrate various LLM and RAG providers into your applications.
+
+
+Generating Data with SciPhi
+---------------------------
+
+To generate data tailored to your specifications, you can use the provided scripts. For instance, to generate a dataset with a desired number of samples:
+
+.. code-block:: bash
+
+    python -m sciphi.scripts.data_augmenter --config-path=$PWD/sciphi/config/prompts/question_and_answer.yaml --config_name=None --n_samples=1
+
+
+Inspecting the output:
+
+.. code-block:: bash
+
+    {"question": "What is the reaction called when alcohol and carboxylic acids react?", "answer": "Fischer esterification"}
+    ...
+    {"question": "Are tertiary alcohols resistant to oxidation?", "answer": "Yes"}
+
+
+This command can be readily expanded to other configurations.
+
+RAG-Enhanced Textbooks
+----------------------
+
+With SciPhi, you can generate textbooks with the assistance of RAG. To perform a dry-run:
+
+.. code-block:: bash
+
+    python -m sciphi.scripts.textbook_generator dry_run --toc_dir=sciphi/data/sample/table_of_contents --rag-enabled=False
+
+To generate a textbook:
+
+.. code-block:: bash
+
+    python -m sciphi.scripts.textbook_generator run --toc_dir=sciphi/data/sample/table_of_contents --rag-enabled=False --filter_existing_books=False
+
+You can also use a custom table of contents:
+
+.. code-block:: bash
+
+    python -m sciphi.scripts.textbook_generator run --toc_dir=toc --output_dir=books --data_dir=$PWD
+
+RAG Evaluation
+--------------
+
+Measure the efficacy of your RAG pipeline using SciPhi's evaluation harness:
+
+.. code-block:: bash
+
+    python -m sciphi.scripts.rag_harness --n-samples=100 --rag-enabled=True --evals_to_run="science_multiple_choice"
+
+This will evaluate your RAG over a set of questions and report the final accuracy.
+
+
+Wrapping Up
+-----------
+
+Congratulations! You've now been introduced to the core functionalities of SciPhi. This is just the beginning; delve deeper into the documentation, explore the community on Discord, or reach out for tailored inquiries. Happy modeling!
\ No newline at end of file
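Note on the quickstart's "Inspecting the output" sample: the records shown there look like JSON Lines, one object per line with ``question`` and ``answer`` keys. Assuming that format holds for the generated output, a minimal sketch for loading such records could look like the following. The helper name ``parse_qa_lines`` is hypothetical and not part of SciPhi; it uses only the Python standard library.

```python
import json
from typing import Iterable


def parse_qa_lines(lines: Iterable[str]) -> list:
    """Parse JSON Lines text into question/answer records.

    Hypothetical helper, not part of SciPhi. Assumes one JSON object
    per line; blank lines and literal "..." placeholders are skipped.
    """
    records = []
    for line in lines:
        line = line.strip()
        if not line or line == "...":
            continue
        record = json.loads(line)
        # Keep only records that carry both expected keys.
        if "question" in record and "answer" in record:
            records.append(record)
    return records


# Sample input copied from the quickstart's example output.
sample = [
    '{"question": "Are tertiary alcohols resistant to oxidation?", "answer": "Yes"}',
    "...",
    "",
]
records = parse_qa_lines(sample)
```

To read from disk, pass an open file handle instead of the list, since iterating a text file yields one line at a time.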