This repository has been archived by the owner on Feb 12, 2024. It is now read-only.

Add setup docs #115

Merged
merged 1 commit into from
Oct 31, 2023
18 changes: 9 additions & 9 deletions docs/source/index.rst
@@ -21,14 +21,14 @@ Welcome to SciPhi!
</p>


SciPhi [ΨΦ]: AI's Knowledge Engine for Tailored Data Generation 💡
SciPhi [ΨΦ]: AI's Knowledge Engine 💡
-----------------------------------------------------------------

SciPhi is a powerful knowledge engine tailored for LLM-based data generation and management.

With SciPhi, you can:

* Generate datasets using various LLMs, supporting **Anthropic**, **OpenAI**, **vLLM**, and **SciPhi API**.
* Generate datasets using various LLMs, supporting **Anthropic**, **OpenAI**, **vLLM**, and **SciPhi**.
* Tap into **Retrieval-Augmented Generation (RAG)** to anchor generated data in real-world sources.
- Features like end-to-end cloud and local RAG knowledge engine APIs are underway!
* Custom tailor your data creation for applications such as LLM training, RAG, and beyond.
@@ -41,19 +41,19 @@ Quick and easy setup:

Diverse Features:

* Engage with the community on platforms like `Discord <https://discord.gg/j9GxfbxqAe>`_.
* Seamlessly integrate multiple LLM and RAG providers like OpenAI, Anthropic, HuggingFace, and vLLM.
* Seamlessly integrate multiple LLM and RAG providers like SciPhi, OpenAI, Anthropic, HuggingFace, and vLLM.
* Generate custom datasets and even full textbooks using SciPhi's unique capabilities.
* Evaluate your RAG systems effectively with the SciPhi evaluation harness.
* Engage with the community on platforms like `Discord <https://discord.gg/j9GxfbxqAe>`_.

Developers can also instantiate their own LLM and RAG providers using the SciPhi framework. The supported LLM providers include popular choices like OpenAI, Anthropic, HuggingFace, and vLLM. For specialized RAG capabilities, SciPhi offers the **World Database API** for comprehensive database access.

For a detailed setup guide, deeper feature exploration, and developer insights, refer to:

* `SciPhi GitHub Repository <https://github.com/emrgnt-cmplxty/sciphi>`_
* `Example Textbook Generated with SciPhi <sciphi/data/sample/textbooks/Aerodynamics_of_Viscous_Fluids.md>`_
* `Default Settings for Textbook Generation <sciphi/config/generation_settings/textbook_generation_settings.yaml>`_
* `Library of SciPhi Books <https://github.com/SciPhi-AI/library-of-phi>`_
* `Example Textbook Generated with SciPhi <https://github.com/SciPhi-AI/sciphi/data/sample/textbooks/Aerodynamics_of_Viscous_Fluids.md>`_
* `Default Settings for Textbook Generation <https://github.com/SciPhi-AI/sciphi/config/generation_settings/textbook_generation_settings.yaml>`_
* `Library of SciPhi Books <https://github.com/SciPhi-AI/library-of-phi>`_

Please consider citing our work if SciPhi aids your research; see the citation section for details.

@@ -66,8 +66,8 @@ Documentation
:maxdepth: 1
:caption: Getting Started

getting_started/installation
getting_started/quickstart
setup/installation
setup/quickstart

.. toctree::
:maxdepth: 1
93 changes: 93 additions & 0 deletions docs/source/setup/installation.rst
@@ -0,0 +1,93 @@
.. _sciphi_installation:

Installation for SciPhi [ΨΦ]: AI's Knowledge Engine 💡
============================================================

.. raw:: html

   <p align="center">
     <img width="716" alt="SciPhi Logo" src="https://github.com/emrgnt-cmplxty/sciphi/assets/68796651/195367d8-54fd-4281-ace0-87ea8523f982">
   </p>

SciPhi is a powerful knowledge engine that integrates with multiple LLM and RAG providers, allowing for customizable data creation, retrieval-augmented generation, and even textbook generation.

Requirements
------------

- **Python**: ``>=3.9,<3.12`` (a quick version check is sketched below)
- **Libraries**: (Please refer to the README for a detailed list)
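
A quick sketch to confirm that your interpreter falls inside the supported range before installing:

.. code-block:: python

   import sys

   # SciPhi supports Python >=3.9,<3.12 (see the requirement above).
   assert (3, 9) <= sys.version_info[:2] < (3, 12), (
       "SciPhi requires Python >=3.9,<3.12"
   )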

Fast Installation with pip
--------------------------

Installing SciPhi is as simple as using pip:

.. code-block:: console

$ pip install sciphi

Optional Extra Dependencies
---------------------------

For complete advanced features and provider support:

.. code-block:: console

$ pip install 'sciphi[all_with_extras]'

Setting Up Your Environment
---------------------------

After installation, set up your environment to link with supported LLM providers:

.. code-block:: console

$ cd your_working_directory
$ nano .env # Adjust the .env file with your specific configurations.

Here is an example of the configuration in the `.env` file:

.. code-block:: bash

OPENAI_API_KEY=your_openai_api_key
ANTHROPIC_API_KEY=your_anthropic_api_key
HF_TOKEN=your_huggingface_token
VLLM_API_KEY=your_vllm_api_key
SCIPHI_API_KEY=your_sciphi_api_key
RAG_API_KEY=your_rag_server_api_key
RAG_API_BASE=your_rag_api_base_url

.. note::
Make sure to save and exit the file after making changes.
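
If you want to confirm from Python that these variables are visible, the following is a minimal sketch; it assumes the optional ``python-dotenv`` package (``pip install python-dotenv``), although SciPhi may simply read the values from the process environment:

.. code-block:: python

   import os

   from dotenv import load_dotenv

   # Load the variables defined in .env into the process environment.
   load_dotenv()

   # Spot-check a few of the keys configured above.
   for name in ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "SCIPHI_API_KEY"):
       print(name, "is set" if os.environ.get(name) else "is missing")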

Development Setup
-----------------

To set up SciPhi for development:

.. code-block:: console

$ git clone https://github.com/emrgnt-cmplxty/sciphi.git
$ cd sciphi
$ pip3 install poetry # If you do not have Poetry installed.
$ poetry install
$ poetry install -E all_with_extras

Licensing and Acknowledgment
------------------------------

SciPhi is licensed under the `Apache-2.0 License <./LICENSE>`_.

Citing Our Work
---------------

If you're using SciPhi in your research or project, please cite our work:

.. code-block:: plaintext

@software{SciPhi,
author = {Colegrove, Owen},
doi = {Pending},
month = {09},
title = {{SciPhi: A Framework for LLM Powered Data}},
url = {https://github.com/sciphi-ai/sciphi},
year = {2023}
}
136 changes: 136 additions & 0 deletions docs/source/setup/quickstart.rst
@@ -0,0 +1,136 @@
.. _sciphi_quickstart:

SciPhi Quickstart
=================

Welcome to the SciPhi quickstart guide! SciPhi, or ΨΦ, is your portal to large language models (LLMs) from providers such as OpenAI, Anthropic, HuggingFace, and vLLM, combined with the power of Retrieval-Augmented Generation (RAG).

This guide will introduce you to:

- Generating data tailored to your needs.
- Using the RAG provider interface.
- Creating RAG-enhanced textbooks.
- Evaluating your RAG pipeline.

Let's get started!

Setting Up Your Environment
---------------------------

Before you start, ensure you've installed SciPhi:

.. code-block:: bash

pip install sciphi

For additional details, refer to the :ref:`installation guide <sciphi_installation>`.

Instantiate Your LLM and RAG Provider
-------------------------------------

Here's a simple example of how you can utilize SciPhi to work with your own LLM and RAG provider:

.. code-block:: python

   from sciphi.core import LLMProviderName, RAGProviderName
   from sciphi.interface import LLMInterfaceManager, RAGInterfaceManager
   from sciphi.llm import GenerationConfig

   # Define your parameters here...

   # RAG Provider Settings
   rag_interface = (
       RAGInterfaceManager.get_interface_from_args(
           RAGProviderName(rag_provider_name),
           api_base=rag_api_base or llm_api_base,
           api_key=rag_api_key or llm_api_key,
           top_k=rag_top_k,
       )
       if rag_enabled
       else None
   )

   # LLM Provider Settings
   llm_interface = LLMInterfaceManager.get_interface_from_args(
       LLMProviderName(llm_provider_name),
       api_key=llm_api_key,
       api_base=llm_api_base,
       rag_interface=rag_interface,
       model_name=llm_model_name,
   )

   # Set up typical LLM generation settings
   # (SciPhiFormatter supplies the stop token; import it from the sciphi
   # module that provides it in your installed version.)
   completion_config = GenerationConfig(
       temperature=llm_temperature,
       top_k=llm_top_k,
       max_tokens_to_sample=llm_max_tokens_to_sample,
       model_name=llm_model_name,
       skip_special_tokens=llm_skip_special_tokens,
       stop_token=SciPhiFormatter.INIT_PARAGRAPH_TOKEN,
   )

   # Get the completion for a prompt
   completion = llm_interface.get_completion(prompt, completion_config)

   # Continue with your process...

This example showcases the flexibility and power of SciPhi, allowing you to seamlessly integrate various LLM and RAG providers into your applications.
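
Before the snippet above will run, the placeholder parameters need concrete values. Purely as an illustration (the exact strings accepted by ``LLMProviderName`` and ``RAGProviderName``, and the available models, depend on your installed version of SciPhi), they might look like this:

.. code-block:: python

   # Illustrative values only -- check the LLMProviderName / RAGProviderName
   # enums in sciphi.core for the names your version actually accepts.
   llm_provider_name = "openai"
   llm_model_name = "gpt-3.5-turbo"
   llm_api_key = "sk-..."             # or read it from the environment
   llm_api_base = None
   llm_temperature = 0.7
   llm_top_k = 100
   llm_max_tokens_to_sample = 256
   llm_skip_special_tokens = False

   rag_enabled = False                # set True to anchor output in retrieved context
   rag_provider_name = "sciphi-wiki"  # hypothetical provider name
   rag_api_base = None
   rag_api_key = None
   rag_top_k = 5

   prompt = "Explain Fischer esterification in one sentence."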


Generating Data with SciPhi
---------------------------

To generate data tailored to your specifications, you can use the provided scripts. For instance, to generate a dataset with a desired number of samples:

.. code-block:: bash

python -m sciphi.scripts.data_augmenter --config-path=$PWD/sciphi/config/prompts/question_and_answer.yaml --config_name=None --n_samples=1


Inspecting the output:

.. code-block:: bash

{"question": "What is the reaction called when alcohol and carboxylic acids react?", "answer": "Fischer esterification"}
...
{"question": "Are tertiary alcohols resistant to oxidation?", "answer": "Yes"}


This command can be readily expanded to other configurations.
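
Because each record is a JSON object on its own line, the output is straightforward to post-process. Here is a minimal sketch (the filename is an assumption; point it at whatever file your run actually produced):

.. code-block:: python

   import json

   # Hypothetical output path -- substitute the file produced by your run.
   with open("question_and_answer_output.jsonl") as f:
       pairs = [json.loads(line) for line in f if line.strip()]

   for pair in pairs:
       print(f"Q: {pair['question']}\nA: {pair['answer']}\n")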

RAG-Enhanced Textbooks
----------------------

With SciPhi, you can generate textbooks with the assistance of RAG. To perform a dry-run:

.. code-block:: bash

python -m sciphi.scripts.textbook_generator dry_run --toc_dir=sciphi/data/sample/table_of_contents --rag-enabled=False

To generate a textbook:

.. code-block:: bash

python -m sciphi.scripts.textbook_generator run --toc_dir=sciphi/data/sample/table_of_contents --rag-enabled=False --filter_existing_books=False

You can also use a custom table of contents:

.. code-block:: bash

python -m sciphi.scripts.textbook_generator run --toc_dir=toc --output_dir=books --data_dir=$PWD

RAG Evaluation
--------------

Measure the efficacy of your RAG pipeline using SciPhi's evaluation harness:

.. code-block:: bash

python -m sciphi.scripts.rag_harness --n-samples=100 --rag-enabled=True --evals_to_run="science_multiple_choice"

This will evaluate your RAG over a set of questions and report the final accuracy.


Wrapping Up
-----------

Congratulations! You've now been introduced to the core functionalities of SciPhi. This is just the beginning; delve deeper into the documentation, explore the community on Discord, or reach out for tailored inquiries. Happy modeling!