-
To achieve concurrency with LangChain and Ollama, you should leverage the asynchronous capabilities of the `Ollama` LLM class. Here is an updated version of your code that uses `asyncio` to send all the requests concurrently:

```python
import asyncio

from langchain_community.llms import Ollama

SYSTEM_PROMPT = """You're a call center analyst, tasked to analyze the call transcripts, fix grammatical errors and typos
and provide an in-depth summary consisting of 6 lines with fixed grammar."""

llm = Ollama(
    base_url=BASE_URL,
    model=MODEL,
    system=SYSTEM_PROMPT,
    temperature=0.0,
)

# Define an asynchronous function to process each row
async def process_row(row):
    transcript = row["TranscriptContent"]
    prompt = (
        "You're a call analyst. Summarize the following call transcript "
        f"into a paragraph of 6 concise sentences.\n\nCall Transcript: {transcript}"
    )
    # ainvoke sends the request without blocking, so many rows can be in flight at once
    return await llm.ainvoke(prompt)

# Define the main function to process all rows concurrently
async def main(df):
    tasks = [process_row(row) for _, row in df.iterrows()]
    summaries = await asyncio.gather(*tasks)
    return summaries

# Run the main function
summaries = asyncio.run(main(df))

# Print the summaries
for summary in summaries:
    print(summary)
```

This approach uses `asyncio.gather` to fire all the requests at once, so the Ollama server can process them in parallel (up to `OLLAMA_NUM_PARALLEL`). I've also swapped the private `llm._agenerate([HumanMessage(...)])` call, which expects plain prompt strings rather than `HumanMessage` objects, for the public `ainvoke` method. One caveat: in a Jupyter notebook an event loop is already running, so use `summaries = await main(df)` instead of `asyncio.run(main(df))`.
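If you're on a recent LangChain version, the Runnable `abatch` method is a simpler alternative to hand-rolling `asyncio.gather`, and its `max_concurrency` setting lets you cap the number of in-flight requests to match `OLLAMA_NUM_PARALLEL`. A minimal sketch under the same assumptions as above (`BASE_URL`, `MODEL`, and `df` already defined; the limit of 4 is just an example):

```python
import asyncio

from langchain_community.llms import Ollama

llm = Ollama(base_url=BASE_URL, model=MODEL, temperature=0.0)

async def summarize_all(df):
    # One prompt string per DataFrame row
    prompts = [
        "Summarize the following call transcript into a paragraph of 6 concise sentences.\n\n"
        f"Call Transcript: {row['TranscriptContent']}"
        for _, row in df.iterrows()
    ]
    # abatch fans the prompts out concurrently; max_concurrency caps how many
    # requests are in flight at once (match it to OLLAMA_NUM_PARALLEL)
    return await llm.abatch(prompts, config={"max_concurrency": 4})

summaries = asyncio.run(summarize_all(df))
```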
-
@ArslanKAS Have you solved this problem?
-
Description
Hi everyone,
After the release of OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED_MODELS, I tried them out using Ollama directly in the terminal and was able to get concurrent results. But when I used LangChain, I couldn't get any concurrency at all.
I searched through both the Ollama and LangChain repos and discussions but couldn't find any starter code for it. If someone has already implemented this, please guide me.
By the way, I've installed Ollama on an AWS EC2 instance and I use its BASE URL in my SageMaker Jupyter Notebook with LangChain. That's how I'm able to use Ollama. All that's missing now is the concurrency (parallelism).
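For reference, this is roughly how I point LangChain at the remote instance (the endpoint and model name below are placeholders, not my actual values):

```python
from langchain_community.llms import Ollama

BASE_URL = "http://<ec2-public-dns>:11434"  # placeholder for the EC2 instance's address
MODEL = "llama3"                            # placeholder model name

llm = Ollama(base_url=BASE_URL, model=MODEL)
print(llm.invoke("Hello"))  # single synchronous requests work fine; concurrency is what's missing
```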
Thank you