diff --git a/README.md b/README.md
index d474342..6536bc9 100644
--- a/README.md
+++ b/README.md
@@ -14,61 +14,73 @@ With Synthesizer, users can:
 
 ---
 
-## Documentation
-
-For more detailed information, tutorials, and API references, please visit the official [Synthesizer Documentation](https://sciphi.readthedocs.io/en/latest/).
-
 ## Fast Setup
 
 ```bash
 pip install sciphi-synthesizer
 ```
 
-## Features
+
+### Using Synthesizer
+
+1. **Generate synthetic question-answer pairs**
+
+   ```bash
+   export SCIPHI_API_KEY=MY_SCIPHI_API_KEY
+   python -m synthesizer.scripts.data_augmenter run --dataset="wiki_qa"
+   ```
+
+   ```bash
+   tail augmented_output/config_name_eq_answer_question__dataset_name_eq_wiki_qa.jsonl
+   ```
+
+2. **Evaluate RAG pipeline performance**
+
+   ```bash
+   export SCIPHI_API_KEY=MY_SCIPHI_API_KEY
+   python -m synthesizer.scripts.rag_harness --rag_provider="agent-search" --llm_provider_name="sciphi" --n_samples=25
+   ```
+
+### Documentation
+
+For more detailed information, tutorials, and API references, please visit the official [Synthesizer Documentation](https://sciphi.readthedocs.io/en/latest/).
 
 ### Community & Support
 
 - Engage with our vibrant community on [Discord](https://discord.gg/j9GxfbxqAe).
 - For tailored inquiries or feedback, please [email us](mailto:owen@sciphi.ai).
 
+### Developing with Synthesizer
 
-### Example
-
-The following example demonstrates how to construct a connection to the AgentSearch API with the synthesizer RAG interface. Then, the example goes on to use the RAG interface to generate a response with an OpenAI hosted LLM.
+Quickly set up RAG-augmented generation with your choice of provider, including OpenAI, Anthropic, vLLM, and SciPhi:
 
 ```python
-
-    from synthesizer.core import LLMProviderName, RAGProviderName
-    from synthesizer.interface import (
-        LLMInterfaceManager,
-        RAGInterfaceManager,
-    )
-    from synthesizer.llm import GenerationConfig
-
-    # RAG Provider Settings
-    rag_interface = RAGInterfaceManager.get_interface_from_args(
-        RAGProviderName(rag_provider_name),
-        api_base=rag_api_base,
-        limit_hierarchical_url_results=rag_limit_hierarchical_url_results,
-        limit_final_pagerank_results=rag_limit_final_pagerank_results,
-    )
-    rag_context = rag_interface.get_rag_context(query)
-
-    # LLM Provider Settings
-    llm_interface = LLMInterfaceManager.get_interface_from_args(
-        LLMProviderName(llm_provider_name),
-    )
-
-    generation_config = GenerationConfig(
-        model_name=llm_model_name,
-        max_tokens_to_sample=llm_max_tokens_to_sample,
-        temperature=llm_temperature,
-        top_p=llm_top_p,
-        # other generation params here ...
-    )
-
-    formatted_prompt = rag_prompt.format(rag_context=rag_context)
-    completion = llm_interface.get_completion(
-        formatted_prompt, generation_config
-    )
-    print(completion)
-```
+# Requires SCIPHI_API_KEY in env (and OPENAI_API_KEY for the OpenAI LLM provider)
+
+from synthesizer.core import LLMProviderName, RAGProviderName
+from synthesizer.interface import LLMInterfaceManager, RAGInterfaceManager
+from synthesizer.llm import GenerationConfig
+
+# Example inputs; substitute your own query, prompt template, and settings
+query = "What is Fermat's Last Theorem?"
+raw_prompt = "Using the context below, answer the query.\n\nContext:\n{rag_context}\n\nQuery: " + query
+
+# RAG Provider Settings
+rag_interface = RAGInterfaceManager.get_interface_from_args(
+    RAGProviderName("agent-search"),
+    limit_hierarchical_url_results=50,  # example limits; tune for your use case
+    limit_final_pagerank_results=20,
+)
+rag_context = rag_interface.get_rag_context(query)
+
+# LLM Provider Settings
+llm_interface = LLMInterfaceManager.get_interface_from_args(
+    LLMProviderName("openai"),
+)
+
+generation_config = GenerationConfig(
+    model_name="gpt-3.5-turbo",  # any model supported by the chosen provider
+    max_tokens_to_sample=256,
+    temperature=0.1,
+    top_p=0.95,
+    # other generation params here ...
+)
+
+formatted_prompt = raw_prompt.format(rag_context=rag_context)
+completion = llm_interface.get_completion(formatted_prompt, generation_config)
+print(completion)
+```
\ No newline at end of file
diff --git a/docs/source/setup/quickstart.rst b/docs/source/setup/quickstart.rst
index 9a48cb6..ef6c548 100644
--- a/docs/source/setup/quickstart.rst
+++ b/docs/source/setup/quickstart.rst
@@ -48,6 +48,7 @@ Using Synthesizer
 
    python -m synthesizer.scripts.rag_harness --rag_provider="agent-search" --llm_provider_name="sciphi" --n_samples=25
 
 .. code-block:: bash
+
    ...
    INFO:__main__:Now generating completions...
    100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:29<00:00, 3.40it/s]