how to optimize reading long tuples from julia to python #575

Open
dpinol opened this issue Nov 18, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

Contributor

dpinol commented Nov 18, 2024

Is your feature request related to a problem? Please describe.
The code below takes around 7 s on a modern i9 the first time it is executed. Subsequent executions are immediate. Executing (zeros(20_000)...,) directly in Julia is also immediate.
I am using Python 3.12 and Julia 1.10.5.

import juliacall
v = juliacall.Main.seval("(zeros(20_000)...,)")
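
One way to separate the one-time cost from the steady-state cost is to time the first call apart from the later ones. This is only a minimal sketch: time_first_and_later is a hypothetical helper (not part of JuliaCall), and the lambda is a stand-in workload so that the snippet runs without Julia installed.

```python
import time


def time_first_and_later(fn, repeats=3):
    """Return (first_call_seconds, best_later_seconds) for a zero-argument callable."""
    start = time.perf_counter()
    fn()
    first = time.perf_counter() - start

    later = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        later.append(time.perf_counter() - start)
    return first, min(later)


# Stand-in workload (an assumption for illustration only).
# With JuliaCall available, one would instead pass something like:
#   lambda: juliacall.Main.seval("(zeros(20_000)...,)")
calls = []
first, later = time_first_and_later(lambda: calls.append(1))
print(f"first call: {first:.6f} s, best later call: {later:.6f} s")
```

A large gap between the two numbers would point at one-time work such as compilation rather than at the per-call cost.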

Describe the solution you'd like
Is there any way to speed up the initial compilation?

Describe alternatives you've considered
I tried to use this code to speed up reading data from Julia into Python by going through a tuple. Looping over a long Julia vector from Python is much slower than looping over a tuple (around 4 times slower for a vector of 100_000 floats).

@dpinol dpinol added the enhancement New feature or request label Nov 18, 2024
Collaborator

cjdoris commented Nov 20, 2024

I can't explain the speed difference (seval is directly calling Julia to execute that code, there is essentially zero overhead from JuliaCall) but you shouldn't be creating tuples of thousands of elements in Julia anyway. What are you actually trying to do?

Contributor Author

dpinol commented Nov 20, 2024

What are you actually trying to do?

I tried to use this code to speed up reading long vectors from Julia into Python by going through a tuple. Looping over a long Julia vector from Python is much slower than looping over a tuple (around 4 times slower for a vector of 100_000 floats). So I tried converting the vector to a tuple in Julia before reading it. It is indeed much faster, but the compilation overhead is huge.
An alternative I found is that, instead of reading the long Julia vector from Python, I pass a Python list to Julia and Julia populates it. This is also around 3 times faster.

Collaborator

cjdoris commented Nov 21, 2024

Can you give some code to demonstrate what you're doing, i.e. some code that works but is slow that you'd like to be faster?

Contributor Author

dpinol commented Nov 27, 2024

get_v is what I'd like to optimize (reading a long Julia vector).
set_v is the workaround: by passing a Python list to Julia and letting Julia fill it, reading the vector becomes about 3 times faster.

from juliacall import Main as jl
import timeit

# get_v returns a fresh random Julia vector; its first argument is unused and
# is there only so that get_v and set_v can be called the same way.
jl.seval("get_v(v, n) = rand(n)")

# Copy jl_v element by element into the Python list py_v.
jl.seval("""function setPyVector(py_v, jl_v)
             @inbounds for (i, v) in enumerate(jl_v)
                 py_v[i] = v
             end
         end""")

# set_v fills the list passed in from Python and returns it.
jl.seval("function set_v(v, n) setPyVector(v, get_v(v, n)); return v end")


def benchmark(f, vector_len, retries) -> None:
    def workload():
        total = 0
        v = vector_len * [None]
        fn = f(v, vector_len)
        for i in fn:
            total += i

    workload()  # warmup
    print("running ", f)
    print("elapsed ", timeit.timeit(workload, number=retries))


benchmark(jl.get_v, vector_len=100000, retries=100)
benchmark(jl.set_v, vector_len=100000, retries=100)

This is what I get on a modern i9:

running  get_v
elapsed  6.038048821996199
running  set_v
elapsed  1.921932346012909
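
For reference, these figures work out to roughly a 3.1x difference. A small sketch of the arithmetic follows; the variable names are only for illustration. Each run also includes allocating the list and calling rand, so the per-element figures are upper bounds on the iteration cost itself.

```python
# Timings reported above (seconds for 100 runs of the workload).
elapsed_get_v = 6.038048821996199
elapsed_set_v = 1.921932346012909
retries = 100
vector_len = 100_000

speedup = elapsed_get_v / elapsed_set_v
per_run_get_v_ms = elapsed_get_v / retries * 1e3
per_run_set_v_ms = elapsed_set_v / retries * 1e3
per_elem_get_v_us = elapsed_get_v / (retries * vector_len) * 1e6
per_elem_set_v_us = elapsed_set_v / (retries * vector_len) * 1e6

print(f"speedup: {speedup:.2f}x")
print(f"get_v: {per_run_get_v_ms:.1f} ms per run, {per_elem_get_v_us:.2f} us per element")
print(f"set_v: {per_run_set_v_ms:.1f} ms per run, {per_elem_set_v_us:.2f} us per element")
```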
