Wed Dec 18 19:46:22 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.120 Driver Version: 550.120 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA RTX 5000 Ada Gene... Off | 00000000:18:00.0 Off | Off |
| 30% 30C P8 12W / 250W | 5720MiB / 32760MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA RTX 5000 Ada Gene... Off | 00000000:3B:00.0 Off | 1 |
| 30% 35C P8 17W / 250W | 14MiB / 30712MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 2 NVIDIA RTX 5000 Ada Gene... Off | 00000000:86:00.0 Off | 0 |
| 30% 38C P8 13W / 250W | 5720MiB / 30712MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 3 NVIDIA RTX 5000 Ada Gene... Off | 00000000:AF:00.0 Off | Off |
| 54% 78C P2 206W / 250W | 14887MiB / 32760MiB | 87% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 4070 G /usr/lib/xorg/Xorg 4MiB |
| 0 N/A N/A 341143 C /mnt/HDD1/wmz19/miniconda3/bin/python3 5702MiB |
| 1 N/A N/A 4070 G /usr/lib/xorg/Xorg 4MiB |
| 2 N/A N/A 4070 G /usr/lib/xorg/Xorg 4MiB |
| 2 N/A N/A 341237 C /mnt/HDD1/wmz19/miniconda3/bin/python3 5702MiB |
| 3 N/A N/A 4070 G /usr/lib/xorg/Xorg 4MiB |
| 3 N/A N/A 341286 C /mnt/HDD1/wmz19/miniconda3/bin/python3 5702MiB |
| 3 N/A N/A 392717 C python 9164MiB |
+-----------------------------------------------------------------------------------------+
I have been running this since 11 pm last night. Here is the progress so far:
pre tokenize: 100%|██████████████████████████████████████████████████████████████████| 50000/50000 [06:17<00:00, 132.47it/s]
pre tokenize: 71%|██████████████████████████████████████████████▊ | 35438/50000 [06:21<01:54, 126.81it/s]
You're using a XLMRobertaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
pre tokenize: 100%|██████████████████████████████████████████████████████████████████| 50000/50000 [06:20<00:00, 131.41it/s]
You're using a XLMRobertaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
pre tokenize: 100%|██████████████████████████████████████████████████████████████████| 50000/50000 [08:10<00:00, 101.90it/s]
You're using a XLMRobertaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
Inference Embeddings: 100%|█████████████████████████████████████████████████████████| 50000/50000 [2:22:23<00:00, 5.85it/s]
Inference Embeddings: 100%|█████████████████████████████████████████████████████████| 50000/50000 [2:46:34<00:00, 5.00it/s]
Inference Embeddings: 100%|█████████████████████████████████████████████████████████| 50000/50000 [3:31:52<00:00, 3.93it/s]
Chunks: 75%|████████████████████████████████████████████████████████▎ | 3/4 [3:40:11<1:02:32, 3752.99s/it]
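To put numbers on "slow", the elapsed times of the three finished `Inference Embeddings` bars can be converted to seconds and compared. This is only a small arithmetic sketch over the figures printed in the log; the helper name `parse_hms` is made up for illustration.

```python
def parse_hms(text):
    """Convert a tqdm-style elapsed string such as '2:22:23' or '06:17' to seconds."""
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


# Figures copied from the three finished "Inference Embeddings" bars in the log.
ITEMS_PER_WORKER = 50000
ELAPSED = ["2:22:23", "2:46:34", "3:31:52"]

durations = [parse_hms(t) for t in ELAPSED]
rates = [round(ITEMS_PER_WORKER / d, 2) for d in durations]

for text, secs, rate in zip(ELAPSED, durations, rates):
    print(f"{text} = {secs} s -> {rate} it/s")

# How much longer the slowest finished worker took than the fastest one.
print(f"slowest / fastest = {max(durations) / min(durations):.2f}")
```

The computed rates match the 5.85, 5.00 and 3.93 it/s shown by tqdm, and the slowest finished worker took roughly 1.5 times as long as the fastest. Total wall time is set by the slowest worker, which here is the fourth one that has not finished yet.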
Do you have any suggestions? It feels quite slow. Also, what is the `Chunks` progress bar doing here?
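One plausible reading of the `Chunks` bar, stated as an assumption rather than something confirmed in this thread: the input is split into one contiguous chunk per GPU worker, each worker encodes its own chunk (the three `Inference Embeddings` bars of 50000 items each), and `Chunks` counts how many workers have handed their results back. The sketch below only illustrates that kind of split. The function name `split_into_chunks` and the corpus size of 200000 (4 workers x 50000 items) are hypothetical and are not taken from the library's code.

```python
import math


def split_into_chunks(items, num_workers):
    """Split items into at most num_workers contiguous chunks of near-equal size."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    chunk_size = math.ceil(len(items) / num_workers)
    if chunk_size == 0:  # empty input: nothing to distribute
        return []
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


# Hypothetical corpus: 4 workers x 50000 items, matching the per-worker bars in the log.
corpus = list(range(200_000))
chunks = split_into_chunks(corpus, num_workers=4)
print([len(c) for c in chunks])  # [50000, 50000, 50000, 50000]
```

Under that reading, `3/4` means three workers are done and one is still encoding. The `nvidia-smi` output lists a second `python` process (PID 392717, 9164MiB) on GPU 3, which was at 87% utilization when the snapshot was taken, so the lagging worker may be the one sharing that GPU, though the snapshot alone does not prove it.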
JackTan25 changed the title from "Problem when using the BGE-M3 model to embed the English MLDR dataset" to "Embedding the English MLDR dataset with the BGE-M3 model is very slow" on Dec 18, 2024
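I am currently using the BGE-M3 model to encode the MLDR dataset into dense vectors. Here is my code:
I run the following script:
Here is my machine's GPU configuration: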