Releases: columbia-applied-data-science/rosetta
Streamers split and LDA results speedups.
Parallelized VW methods for general Text Streaming
new parallel_easy utils for memory-friendly iterator functionality
new threading_easy utils for easy multithreading
VW methods are parallelized for generic text streamers
protected import statements for non-standard libs
.to_scipysparse method added for streamers; bug fixes
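To illustrate what "memory-friendly iterator functionality" can mean in practice, here is a minimal sketch that maps a function over an arbitrary iterator in parallel while holding only one chunk of items in memory at a time. It is not the rosetta API: the names `chunked` and `lazy_parallel_map`, and the `n_workers` and `chunksize` parameters, are hypothetical and chosen for this example only. It uses threads from the standard library; a process pool would be the swap-in for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def lazy_parallel_map(func, iterable, n_workers=4, chunksize=100):
    """Apply `func` to each item and yield results in input order.

    Hypothetical helper, not part of rosetta. The input is consumed one
    chunk at a time, so a large stream is never fully materialized.
    """
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for chunk in chunked(iterable, chunksize):
            # pool.map preserves the order of items within the chunk.
            for result in pool.map(func, chunk):
                yield result


if __name__ == "__main__":
    # A generator stands in for a text stream that is too large to load at once.
    lines = (line for line in ["first doc", "second doc", "third doc"])
    for n_tokens in lazy_parallel_map(lambda s: len(s.split()), lines, chunksize=2):
        print(n_tokens)
```

Because `lazy_parallel_map` is itself a generator, downstream consumers can process results as they arrive instead of waiting for the whole stream to finish.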
0.2.4 minor nb changes
New Streamers, enhanced nlp
- new TextStreamer class to handle general text stream processes
- explicit doc path passing option in TextFileStreamer
- updated version of nlp.word_tokenize
- minor bug fixes
Improved EDA
Major improvements to the modeling.eda module.
Improved LDA prediction
- improved LDA predict function
- improved documentation
- removed old dependencies