filter_pos bug #13

Open
aflyax opened this issue Aug 9, 2020 · 3 comments

aflyax commented Aug 9, 2020

Describe the bug
When I try to run ops.text.clean.filter_pos("NOUN", keep_matching_tokens=True), I get: module 'jange.ops.text.clean' has no attribute 'filter_pos'.

When I change it to ops.text.clean.pos_filter("NOUN", keep_matching_tokens=True), I get: OSError: [E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.

To Reproduce

From examples:

clusters_ds = ds.apply(
    ops.text.clean.pos_filter("NOUN", keep_matching_tokens=True),
    ops.text.encode.tfidf(max_features=5000, name="tfidf"),
    ops.cluster.minibatch_kmeans(n_clusters=5),
    result_collector=result_collector,
)
Owner

jangedoo commented Aug 9, 2020

You'll need to download spaCy's model using python -m spacy download en_core_web_sm
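
For anyone who wants to guard against the E050 error from a script, here is a minimal sketch. It assumes the model ends up installed as a regular, importable Python package named en_core_web_sm (which is how the spacy download command installs it). The helper names model_is_installed, download_command and ensure_model are hypothetical and are not part of jange or spaCy.

```python
import importlib.util
import subprocess
import sys
from typing import List

MODEL_NAME = "en_core_web_sm"  # the model named in the E050 error


def model_is_installed(name: str) -> bool:
    """Return True if `name` can be found as an importable package."""
    return importlib.util.find_spec(name) is not None


def download_command(name: str) -> List[str]:
    """Build the same command suggested in this thread, for the current interpreter."""
    return [sys.executable, "-m", "spacy", "download", name]


def ensure_model(name: str = MODEL_NAME) -> None:
    """Download the model only if it is not already installed."""
    if not model_is_installed(name):
        # check=True raises CalledProcessError if the download fails
        subprocess.run(download_command(name), check=True)
```

Calling ensure_model() once before building the pipeline should avoid the error on a fresh environment; a new Python process may be needed before the freshly installed package becomes importable.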

Author

aflyax commented Aug 9, 2020

Thanks, that worked! Although now I am getting an error with features_ds = result_collector[clusters_ds.applied_ops.find_by_name("tfidf")] from the example (if I print features_ds, I get a StopIteration error). I can open a separate issue for that.

@TheWoodKidJr

> Thanks, that worked! Although now I am getting an error with features_ds = result_collector[clusters_ds.applied_ops.find_by_name("tfidf")] from the example (if I print features_ds, I get a StopIteration error). I can open a separate issue for that.

Hey, did you find a solution? I have the same error...
