Feature request: Use multithreading for simple operations on larger dfs #19924

Chuck321123 · 2024-11-22T10:46:26Z

Description

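Polars seems to be about as fast as pandas for simple operations on larger DataFrames. In the benchmark below, adding 1 to a 200-million-row float column takes roughly 500 ms in both libraries. Is there anything that can be done to make such operations faster, for example by using multiple threads?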

import numpy as np
import polars as pl

# Define the number of rows
n_rows = 200_000_000

df = pl.DataFrame({
    "col1": np.random.rand(n_rows)
})

%timeit -r 1 -n 7 df.select(pl.col("col1")+1)

# Same operation in pandas, on the same data
pdf = df.to_pandas()
%timeit -r 1 -n 7 pdf["col1"] + 1

polars: 504 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 7 loops each)
pandas: 484 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 7 loops each)
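
To make the request concrete, here is a minimal sketch of what "multithreading a simple operation" could look like: split the column into contiguous chunks and add 1 to each chunk on its own thread. It uses only NumPy and the standard library. It is not how polars works internally, and the helper name `add_one_chunked` and the `n_threads` default are made up for illustration. Elementwise addition is usually limited by memory bandwidth, so any speedup from this approach may be small.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def add_one_chunked(values: np.ndarray, n_threads: int = 4) -> np.ndarray:
    """Hypothetical helper: compute values + 1 with the work split across threads."""
    out = np.empty_like(values)
    # Boundaries of n_threads roughly equal, contiguous chunks.
    bounds = np.linspace(0, len(values), n_threads + 1, dtype=np.int64)

    def work(i: int) -> None:
        lo, hi = bounds[i], bounds[i + 1]
        # Write directly into the matching slice of the output buffer.
        # NumPy generally releases the GIL inside this kind of numeric loop,
        # so the chunks can be processed in parallel.
        np.add(values[lo:hi], 1, out=out[lo:hi])

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # list() waits for all chunks and re-raises any worker exception.
        list(pool.map(work, range(n_threads)))
    return out


# Much smaller than the 200M rows in the benchmark, to keep the sketch cheap to run.
data = np.random.rand(1_000_000)
result = add_one_chunked(data)
```

Timing `add_one_chunked(data)` against `data + 1` at the original size would show whether threads help on a given machine.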

Chuck321123 added the enhancement label (New feature or an improvement of an existing feature) on Nov 22, 2024

Chuck321123 changed the title ~~Using multithreading for simple operations on larger dfs~~ Feature request: Use multithreading for simple operations on larger dfs Nov 22, 2024
