Indexing cannot be completed (on Windows) #14
Hi @vanetreg, sorry about that! I believe this is related to the issue making it not work on Google Colab -- everything currently uses multiprocessing even when it doesn't need to be, and it hangs in certain environments outside Colab too. I don't have a Windows machine to try this, but it might be the Windows + Cursor combo. We'll be looking at fixing this shortly (cc @okhat)
I think indexing and search should definitely work on Colab? https://colab.research.google.com/github/stanford-futuredata/ColBERT/blob/main/docs/intro2new.ipynb
I did notice that it works on the main repo but doesn't with RAGatouille, so it must be how we handle the Run... I need to track down exactly why, but it actually hangs:
Same issue here, same environment (even Cursor!)
I was using Windows 11, Cursor, and Python 3.10 through WSL... it worked for me. So it may be a Windows-not-in-WSL thing. I gotta say, at this point it would be hard for me to imagine it not working in WSL on a Windows machine.
@bclavie @okhat I tested again, and while executing the first cell:
I got:
So after installing ipywidgets (shouldn't that be in requirements?!) and restarting Cursor, the warning above no longer appears.
After trying to run the next cell:
I got this error:
I've checked every cell's execution timestamp each time, so all previous cells (especially the one where RAG is defined) ran without errors this time.
Hey, thanks for confirming @MikeRenwick-ICG, and thanks for shining some light on this being a Windows (non-WSL) issue @jponline77. @vanetreg While you're not using Google Colab, this is definitely the same multiprocessing issue that's causing it to hang in Colab. I believe what you're seeing is still the same problem -- RAG isn't defined because the previous cell never actually ran and just timed out. The likely cause has been identified (#13); I'll ping you when we get a fix out for it!
This is a bit of an annoying warning, but it doesn't negatively impact running anything. To avoid overloading the lib with dependencies one wouldn't use outside a notebook, we don't generally add ipython/notebook-related dependencies to requirements, but definitely do install it if you're going to be running notebooks a lot!
@bclavie Today I tested this again and it really must be Windows-related:
Note
@jponline77
Has anyone tried this outside of Windows Jupyter? I'm keen to drop this in as a direct replacement for single-vector RAG.
Hi @bclavie, I'm not sure the problem is related to Colab; I also get an error using Jupyter locally on my Ubuntu server. Here's the code and stacktrace, if that helps:

```python
from ragatouille import RAGPretrainedModel

RAG = RAGPretrainedModel.from_pretrained("colbert-ir/colbertv2.0")

my_documents = [
    "This is a great excerpt from my wealth of documents",
    "Once upon a time, there was a great document"
]

index_path = RAG.index(index_name="my_index", collection=my_documents)
```

The last line outputs the following:
Hey @timothepearce, thanks for flagging! I believe this is a very separate problem (the multiprocessing in your case runs fine, but there seems to be another issue). Could you create a new issue so I can look into it a bit more? And could you try out the notebooks in examples/? I think there might be something wrong with the README example, which is (probably) that there aren't enough documents in it (something I could fix by adopting separate logic for n_docs values that are far too small).
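For anyone who wants to test the small-collection theory above, a rough sketch (an assumption on my part, not a confirmed fix) is to pad the toy collection out to a few dozen entries before indexing and see whether the error goes away:

```python
# Hypothetical check of the "too few documents" theory: repeat the two toy excerpts
# (with a counter so the strings stay unique) so the collection has ~50 documents,
# then index as usual. The index name here is a placeholder.
from ragatouille import RAGPretrainedModel

RAG = RAGPretrainedModel.from_pretrained("colbert-ir/colbertv2.0")

base_documents = [
    "This is a great excerpt from my wealth of documents",
    "Once upon a time, there was a great document",
]
padded_documents = [f"{doc} (copy {i})" for i in range(25) for doc in base_documents]

index_path = RAG.index(index_name="my_padded_index", collection=padded_documents)
print(index_path)
```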
@runonthespot Feel free to try https://github.com/bclavie/RAGatouille/blob/main/examples/01-basic_indexing_and_search.ipynb, it's fully plug-and-play!
@bclavie You're right, the code doesn't work in the Python CLI either, and it seems related to the ColBERT library. I'll open a new issue and dig a little bit more.
Hey @vanetreg, for your other issue, the partial init -- no idea what's going on there; it seems like something weird happened when initialising nltk? I've tested some things on my end and I can confirm this is due to how ColBERT does multiprocessing, which causes the issue in some environments (seemingly Colab and Windows 10). This will eventually be fixed once the multiprocessing handling is changed upstream, but sadly there doesn't seem to be a good in-notebook workaround on those two platforms at the moment. If you use RAGatouille in a Python script (making sure the calls are inside an `if __name__ == "__main__":` block), indexing should run.
Hey @bclavie,
Yeah, maybe it's a Windows 10 issue. Just to be sure: if you are using WSL, check that it's actually running in WSL. If you are set up to run in WSL, you should be able to run it from the WSL command line directly, without VSC or Cursor. My experience with WSL is that it runs everything that runs in Ubuntu much as a standalone Linux system would, so it would surprise me a little if Windows 10 vs. 11 mattered. That said, any reason you aren't interested in upgrading to 11? I've now got RAGatouille running on two different systems with Windows 11 and WSL. One was a laptop with a low-end integrated GPU and 16GB of memory; it took 10 minutes to index a small file, but it worked.
@jponline77 |
Hey, thanks for this @jponline77 -- indexing is sadly slow; taking a while to create the index is the tradeoff for querying very large corpora in near-constant time. It could maybe be optimised (that'd require work on the upstream ColBERT repo), but that's something for the future! @vanetreg I think (not sure) you could try it out in a standalone script like I mentioned earlier? Wrap it in an `if __name__ == "__main__":` block, as in the sketch below.
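For reference, a standalone script along those lines might look like the sketch below. It reuses the model name and `index()` arguments already shown in this thread; the `if __name__ == "__main__":` guard is the standard Python multiprocessing safeguard rather than anything RAGatouille-specific, and the file name and index name are placeholders.

```python
# index_docs.py -- minimal standalone sketch (placeholder names), keeping all
# RAGatouille calls under the __main__ guard so multiprocessing workers can
# safely re-import this module on Windows.
from ragatouille import RAGPretrainedModel


def main():
    RAG = RAGPretrainedModel.from_pretrained("colbert-ir/colbertv2.0")
    my_documents = [
        "This is a great excerpt from my wealth of documents",
        "Once upon a time, there was a great document",
    ]
    index_path = RAG.index(
        index_name="my_index",
        collection=my_documents,
        max_document_length=180,
        split_documents=True,
    )
    print(f"Index written to: {index_path}")


if __name__ == "__main__":
    main()
```

Run it with `python index_docs.py` from a plain terminal rather than inside a notebook kernel.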
I was actually a little surprised it worked at all on the laptop. Indexing speed was much faster on my RTX4080 system with 128GB of RAM :)
I'm getting it hanging on WSL2 Ubuntu (Win 11) as well, both in a notebook and as a standalone Python script (including wrapped in main). I've been using CUDA + PyTorch in WSL2 for a long time and this is the first time I've seen this NCCL issue pop up; I'm trying to trace where it might be coming from. Pretty sure it's something to do with NCCL, and likely ColBERT (edit: although the ColBERT notebook posted by @okhat above works fine). My best guess so far is https://github.com/stanford-futuredata/ColBERT/blob/03fb1becb30c1d01e83d210ba0c4a25108543809/colbert/utils/distributed.py#L27 (edit: the error is torch.distributed.DistBackendError: NCCL error in: ../torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1333, unhandled system error (run with NCCL_DEBUG=INFO for details), NCCL version 2.18.1)
Multiprocessing is no longer enforced for indexing when using no GPU or a single GPU, thanks to @Anmol6's excellent upstream work on stanford-futuredata/ColBERT#290, propagated by #51. This is likely to fix the indexing problems on Windows (or at least, one of the problems). Please let me know if the latest version of RAGatouille fixes it for you!
@bclavie
I'm getting these error messages:
I think this is an issue with Windows 10 and loading cpp extensions in PyTorch? I saw a few similar issues floating around on other projects... I think the current stance will be that the lib doesn't support Win10 unless someone can figure out a solid fix for this 😞
In case others are trying to get it working on Windows 10: I did get past the cl non-zero exit status error above (by installing the C++ components of the VS 2022 Build Tools), but then ran into pthread.h not being found. I tried vcpkg to install it (which was possible), but I still couldn't get it to work with the compiler, and when I saw that cpp_extensions now seems archived, that, plus the time and effort it took to get that far, made me give up on Windows directly (for now at least!). However, I didn't have any problems with RAGatouille using WSL on Windows (Ubuntu 20.04), via pip install ragatouille inside a conda env with Python 3.11.7.
I'm having similar issues. I'm using WSL2 on Windows 10 with faiss-gpu installed and faiss-cpu uninstalled. The basic script below had been running for 30 minutes...
After about 30 minutes, I got the error:
By comparison, the T4 GPU Colab ran very quickly, around 5 minutes.
(Copy/pasting this message in a few related issues) Hey guys! Thanks a lot for bearing with me as I juggle everything and try to diagnose this. It's complicated to fix with relatively little time to dedicate to it, as it seems like the dependencies causing issues aren't the same for everyone, with no clear platform pattern as of yet. Overall, the issues center around the usual dependency suspects.

While this means I can't fix the issue with PLAID optimised indices just yet, I'm also noticing that most of the bug reports here are about relatively small collections (100s-to-low-1000s). To lower the barrier to entry as much as possible, #137 is introducing a second index format, which doesn't actually build an index but performs an exact search over all documents (as a stepping stone towards #110, which would use an HNSW index as an in-between compromise between PLAID optimisation and exact search). The PR above (#137) is still a work in progress, as it needs CRUD support, tests, documentation, better precision routing (fp32/bfloat16), etc. (and potentially searching only a subset of document ids). It is used via:

```python
index(
    ...
    index_type="FULL_VECTORS",
)
```

Any feedback is appreciated, as always, and thanks again!
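Since #137 is still a work in progress the exact API may change, but based on the snippet above, usage would presumably look something like this sketch (index and collection names are placeholders):

```python
# Sketch of the proposed FULL_VECTORS index type from #137 (WIP, subject to change):
# instead of building a PLAID-optimised index, it keeps the full document vectors
# and performs an exact search over them -- aimed at small collections.
from ragatouille import RAGPretrainedModel

RAG = RAGPretrainedModel.from_pretrained("colbert-ir/colbertv2.0")

small_collection = [
    "This is a great excerpt from my wealth of documents",
    "Once upon a time, there was a great document",
]

index_path = RAG.index(
    index_name="my_small_index",
    collection=small_collection,
    index_type="FULL_VECTORS",  # proposed in #137, not yet in a released version
)
```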
The CUBLAS errors turned out to be |
I'm testing 01-basic_indexing_and_search.ipynb on a Windows 10 PC, in the Cursor IDE, using Python 3.11.6.

The cell

RAG.index(collection=[full_document], index_name="Miyazaki", max_document_length=180, split_documents=True)

cannot be completed even after almost an hour!

is shown; I restarted the kernel after an hour.

The previous cell, which prints the length of full_document, worked properly.