Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make PyO3 bindings an optional feature #14

Merged
merged 3 commits into from
Aug 29, 2024
Merged

Conversation

kc611
Copy link
Contributor

@kc611 kc611 commented Aug 22, 2024

As titled,

This PR makes use of cargo --features flag to optionally compile the python bindings when required. Without the usage of --features python-bindings, cargo will only compile the pure rust functions.

Note:: The last function create_fsm_index_end_to_end cannot be internalized right away as a pure Rust function because there is usage of PyDict within the function logic, but that should be easy enough to resolve. The main issue is that if we use the logic that we are using for the rest of the functions (wrapping the Python bindings around a *_internal Rust version) we'd have to convert the huge states_to_token_subsets dictionary from Rust HashMaps to PyDict wherein it could've been constructed as PyDict in the first place, as it is being done right now. This would be a regression in terms of performance.

  • [ ] Make a Rust version of create_fsm_index_end_to_end so that it's available in the pure Rust package without python bindings. Create Rust-only index construction function #9
  • Check: Confirm that the pure Rust package (without Python bindings feature) doesn't use PyO3.

@kc611 kc611 linked an issue Aug 22, 2024 that may be closed by this pull request
5 tasks
@kc611 kc611 marked this pull request as draft August 22, 2024 08:41
@ErikKaum ErikKaum mentioned this pull request Aug 22, 2024
3 tasks
Cargo.toml Outdated Show resolved Hide resolved
@brandonwillard brandonwillard changed the title Made PyO3 bindings optional through the use if Cargo features flag Make PyO3 bindings an optional feature Aug 22, 2024
@brandonwillard brandonwillard force-pushed the rust_features branch 2 times, most recently from 8a8fa49 to 5b6e765 Compare August 22, 2024 22:43
brandonwillard
brandonwillard previously approved these changes Aug 26, 2024
Copy link
Member

@brandonwillard brandonwillard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note:: The last function create_fsm_index_end_to_end cannot be internalized right away as a pure Rust function because there is usage of PyDict within the function logic, but that should be easy enough to resolve. The main issue is that if we use the logic that we are using for the rest of the functions (wrapping the Python bindings around a *_internal Rust version) we'd have to convert the huge states_to_token_subsets dictionary from Rust HashMaps to PyDict wherein it could've been constructed as PyDict in the first place, as it is being done right now. This would be a regression in terms of performance.

  • Make a Rust version of create_fsm_index_end_to_end so that it's available in the pure Rust package without python bindings.

I'm fine moving forward with this and addressing the Rust-only version of states_to_token_subsets in a follow-up.

We might want to wait until #15 is merged so that we can rebase this and then merge it.

@brandonwillard brandonwillard marked this pull request as ready for review August 26, 2024 20:07
@brandonwillard brandonwillard dismissed their stale review August 29, 2024 15:35

The merge-base changed after approval.

@brandonwillard brandonwillard force-pushed the main branch 2 times, most recently from 347e191 to c448997 Compare August 29, 2024 15:36
ErikKaum
ErikKaum previously approved these changes Aug 29, 2024
Copy link
Collaborator

@ErikKaum ErikKaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍

One small (but important thing) is to have the rlib in there as well, otherwise I think it can't be imported by rust projects:
crate-type = ["cdylib", "rlib"]

I also when running cargo build --features python-bindings, I get a compile error:

...
            "__Py_NoneStruct", referenced from:
                pyo3_ffi::object::Py_None::hb5c6297611a670d5 in outlines_core_rs.ax4ei8tw6kqx8u3n4jsaoie49.rcgu.o
                pyo3_ffi::object::Py_None::h0a3a434a6d719957 in libpyo3-f9af37ea37f46715.rlib[15](pyo3-f9af37ea37f46715.pyo3.a12b04b38ef4efd4-cgu.12.rcgu.o)
            "__Py_TrueStruct", referenced from:
                pyo3_ffi::boolobject::Py_True::h325d151b7e5600ac in libpyo3-f9af37ea37f46715.rlib[6](pyo3-f9af37ea37f46715.pyo3.a12b04b38ef4efd4-cgu.03.rcgu.o)
          ld: symbol(s) not found for architecture arm64
          clang: error: linker command failed with exit code 1 (use -v to see invocation)
          

error: could not compile `outlines-core-rs` (lib) due to 1 previous error

But I'm pretty sure this is something off on my system. So probably nothing that should block merging this PR.

@brandonwillard
Copy link
Member

One small (but important thing) is to have the rlib in there as well, otherwise I think it can't be imported by rust projects:
crate-type = ["cdylib", "rlib"]

Added.

@ErikKaum ErikKaum self-requested a review August 29, 2024 19:50
Copy link
Collaborator

@ErikKaum ErikKaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome 🚀

@brandonwillard brandonwillard merged commit 2717ffd into main Aug 29, 2024
5 checks passed
@brandonwillard brandonwillard deleted the rust_features branch August 29, 2024 19:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Separate Python bindings code from Rust-only code
3 participants