Crash with numpy 2 #17

Open
jlqibm opened this issue Nov 10, 2024 · 1 comment
Comments

jlqibm commented Nov 10, 2024

With numpy 1.26.4 and python 3.11, things work fine.
With numpy 2.1.3 and python 3.11, I get a crash:

    Successfully installed numpy-2.1.3
    (hf) [jlquinn@cccxc520 fms-dgt-internal]$ python
    Python 3.11.0 (main, Mar 1 2023, 18:26:19) [GCC 11.2.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from ftlangdetect import detect
    >>> detect('hi there')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/dccstor/jlquinn01/miniforge3/envs/hf/lib/python3.11/site-packages/ftlangdetect/detect.py", line 45, in detect
        labels, scores = model.predict(text)
                         ^^^^^^^^^^^^^^^^^^^
      File "/dccstor/jlquinn01/miniforge3/envs/hf/lib/python3.11/site-packages/fasttext/FastText.py", line 239, in predict
        return labels, np.array(probs, copy=False)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ValueError: Unable to avoid copy while creating an array as requested.
    If using np.array(obj, copy=False) replace it with np.asarray(obj) to allow a copy when needed (no behavior change in NumPy 1.x).
    For more details, see https://numpy.org/devdocs/numpy_2_0_migration_guide.html#adapting-to-changes-in-the-copy-keyword.
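
For anyone who wants to reproduce the failure without fastText installed, here is a minimal sketch of the behavior change the error message describes. The tuple `probs` and the two helper names are made up for illustration; only `np.array(..., copy=False)` and `np.asarray(...)` come from the traceback above.

```python
import numpy as np

# Hypothetical stand-in for the tuple of probabilities built inside predict().
probs = (0.98, 0.01)


def to_array_strict(values):
    """Mimic the failing call: ask for an array while forbidding a copy."""
    return np.array(values, copy=False)


def to_array_safe(values):
    """Replacement suggested by the error message: copy only when needed."""
    return np.asarray(values)


try:
    to_array_strict(probs)
    strict_failed = False  # NumPy 1.x treats copy=False as "copy only if needed"
except ValueError:
    strict_failed = True   # NumPy 2.x refuses: a tuple cannot be wrapped without copying

safe = to_array_safe(probs)
print("NumPy", np.__version__, "- strict call failed:", strict_failed)
print(safe)
```

On NumPy 1.x both calls should succeed; on NumPy 2.x only the `np.asarray` call does, which matches the crash reported here.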

@Roman-9182

Hi, @jlqibm.

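Temporary workaround: override the model's predict method so that it calls np.asarray(probs) instead of np.array(probs, copy=False), and call custom_detect in place of detect: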

    import numpy as np
    from ftlangdetect.detect import get_or_load_model
    
    def custom_predict(self, text, k=1, threshold=0.0, on_unicode_error="strict"):
        """
        Given a string, get a list of labels and a list of
        corresponding probabilities. k controls the number
        of returned labels. A choice of 5, will return the 5
        most probable labels. By default this returns only
        the most likely label and probability. threshold filters
        the returned labels by a threshold on probability. A
        choice of 0.5 will return labels with at least 0.5
        probability. k and threshold will be applied together to
        determine the returned labels.

        This function assumes to be given
        a single line of text. We split words on whitespace (space,
        newline, tab, vertical tab) and the control characters carriage
        return, formfeed and the null character.

        If the model is not supervised, this function will throw a ValueError.

        If given a list of strings, it will return a list of results as usually
        received for a single line of text.
        """

        def check(entry):
            if entry.find("\n") != -1:
                raise ValueError("predict processes one line at a time (remove '\\n')")
            entry += "\n"
            return entry

        if isinstance(text, list):
            text = [check(entry) for entry in text]
            all_labels, all_probs = self.f.multilinePredict(
                text, k, threshold, on_unicode_error
            )

            return all_labels, all_probs
        else:
            text = check(text)
            predictions = self.f.predict(text, k, threshold, on_unicode_error)
            if predictions:
                probs, labels = zip(*predictions)
            else:
                probs, labels = ([], ())

            # np.asarray copies only when needed, so it works on NumPy 1.x and 2.x.
            return labels, np.asarray(probs)

    def custom_detect(text: str, low_memory=False) -> dict[str, str | float]:
        model = get_or_load_model(low_memory)
        # Patch the model's class so predict() uses the NumPy 2-compatible version above.
        model.__class__.predict = custom_predict
        labels, scores = model.predict(text)
        label = labels[0].replace("__label__", '')
        score = min(float(scores[0]), 1.0)
        return {
            "lang": label,
            "score": score,
        }
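
The workaround reassigns the class attribute on every call to custom_detect. That is harmless, but the patch can also be applied a single time. Below is a rough sketch of that idea using a stand-in class so it runs without fastText or a model download. `StandInModel`, `patched_predict`, and `apply_patch_once` are hypothetical names for illustration, not part of fasttext or ftlangdetect.

```python
import numpy as np


class StandInModel:
    """Hypothetical stand-in for the fastText model class (not the real API)."""

    def predict(self, text):
        labels, probs = ("__label__en",), (0.9,)
        # Same pattern as the failing line in the traceback; raises on NumPy 2.
        return labels, np.array(probs, copy=False)


def patched_predict(self, text):
    """Replacement that lets NumPy copy when it has to."""
    labels, probs = ("__label__en",), (0.9,)
    return labels, np.asarray(probs)


def apply_patch_once(cls, replacement):
    """Install `replacement` as cls.predict unless it is already in place."""
    if cls.predict is not replacement:
        cls.predict = replacement


apply_patch_once(StandInModel, patched_predict)
apply_patch_once(StandInModel, patched_predict)  # second call changes nothing

labels, scores = StandInModel().predict("hi there")
print(labels[0].replace("__label__", ""), float(scores[0]))
```

With the real library, the same helper would presumably be called once on the class of the object returned by get_or_load_model, though that is untested here.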
