Dev (#2)
* regex now compiled using regex-automata crate not python interegular pkg.

* regex now compiled using regex-automata crate not python interegular pkg.

* SPEED. we now compute the FSM index lazily, among other optimizations.

* troubleshooting

* maybe fix?

* more fixes

* that didnt work..

* fix

* fix

* fix.. idek anymore

* may be it will work

* may be it will work

* i think it works now. this shit should not be this difficult..

* it works now.

* save point before cargo fix cmd

* rollback point. remove commit if not needed

* final + working
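
The "compute the FSM index lazily" item above can be illustrated with a minimal Python sketch. It is not the Rust code from this commit: the toy DFA (for the regex `ab*c`), the toy vocabulary, and the names `walk` and `LazyIndex` are all hypothetical. The idea shown is that the mapping from an FSM state to its allowed tokens is built only when that state is first queried, then cached, instead of being computed for every state up front.

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy DFA for the regex "ab*c": (state, char) -> next state.
TRANSITIONS: Dict[Tuple[int, str], int] = {
    (0, "a"): 1,
    (1, "b"): 1,
    (1, "c"): 2,
}

# Hypothetical toy vocabulary: token id -> token string.
VOCAB: Dict[int, str] = {0: "a", 1: "b", 2: "c", 3: "ab", 4: "bc", 5: "x"}


def walk(state: int, token: str) -> Optional[int]:
    """Feed a token's characters through the DFA; return the end state,
    or None if any character has no transition."""
    for ch in token:
        nxt = TRANSITIONS.get((state, ch))
        if nxt is None:
            return None
        state = nxt
    return state


class LazyIndex:
    """Maps state -> {token_id: next_state}, filling each row on first lookup."""

    def __init__(self) -> None:
        self._cache: Dict[int, Dict[int, int]] = {}

    def allowed(self, state: int) -> Dict[int, int]:
        if state not in self._cache:
            # First visit to this state: scan the vocabulary once and cache the row.
            row: Dict[int, int] = {}
            for tok_id, tok in VOCAB.items():
                end = walk(state, tok)
                if end is not None:
                    row[tok_id] = end
            self._cache[state] = row
        return self._cache[state]


index = LazyIndex()
start_row = index.allowed(0)  # only state 0 has been indexed at this point
```

States that are never reached during generation are never indexed, which is where the saving over an eager, all-states index comes from in this sketch.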
unaidedelf8777 authored May 5, 2024
1 parent 2046b58 commit 564b60d
Showing 18 changed files with 2,878 additions and 401 deletions.
92 changes: 92 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default.

19 changes: 18 additions & 1 deletion Cargo.toml
@@ -10,9 +10,26 @@ path = "function_sampler/fsm/fsm_utils/src/lib.rs"

# Dependencies for this specific package
[dependencies]
regex-automata = { git = "https://github.com/unaidedelf8777/regex.git", rev = "855b023e2d7c88722dfb0df8c669cda597604260" }
pyo3 = { version = "0.21.1", features = ["extension-module"] }
rayon = "1.10.0"
rustc-hash = "1.1.0"
lazy_static = "1.4.0"
mimalloc = "0.1.41"
dashmap = { version = "5.5.3", features = ["inline"] }



[profile.release]
-opt-level = 3
+opt-level = 3
lto = true
codegen-units = 1
# make sure to strip any debug info from binary
# this way it loads to python faster!
strip = true
panic = 'abort'


[features]
default = []
e2e_experimental = []
8 changes: 6 additions & 2 deletions function_sampler/fsm/__init__.py
@@ -1,4 +1,8 @@
-from .fsm import RegexFSM, FSMState, FSM
 from .tokenizer_fsm_patch import TransformerTokenizer as FsmTokenizer
+from .fsm_utils import create_fsm_index_end_to_end #, LazyFSMIndex
 
-__all__ = ["RegexFSM", "FSMState", "FSM", "FsmTokenizer"]
+from .regex import create_fsm_index_tokenizer, FSMState
+
+
+
+__all__ = [ "FsmTokenizer", "create_fsm_index_tokenizer", "create_fsm_index_end_to_end", "FSMState"]
166 changes: 0 additions & 166 deletions function_sampler/fsm/fsm.py

This file was deleted.
