I got the following error:

[2022-01-13 14:47:32,154] [INFO] [launch.py:131:sigkill_handler] Killing subprocess 2273
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/gpt2_lm/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/ubuntu/anaconda3/envs/gpt2_lm/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/ubuntu/anaconda3/envs/gpt2_lm/lib/python3.8/site-packages/deepspeed/launcher/launch.py", line 167, in <module>
    main()
  File "/home/ubuntu/anaconda3/envs/gpt2_lm/lib/python3.8/site-packages/deepspeed/launcher/launch.py", line 156, in main
    sigkill_handler(signal.SIGTERM, None) # not coming back
  File "/home/ubuntu/anaconda3/envs/gpt2_lm/lib/python3.8/site-packages/deepspeed/launcher/launch.py", line 137, in sigkill_handler
    raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/ubuntu/anaconda3/envs/gpt2_lm/bin/python', '-u', 'run_clm.py', '--local_rank=0', '--deepspeed', 'ds_config.json', '--model_name_or_path', 'gpt2-xl', '--train_file', '../../dataset/train.txt', '--validation_file', '../../dataset/test.txt', '--do_train', '--do_eval', '--fp16', '--overwrite_cache', '--evaluation_strategy=steps', '--output_dir', 'finetuned', '--eval_steps', '500', '--num_train_epochs', '1', '--gradient_accumulation_steps', '2', '--per_device_train_batch_size', '1', '--per_device_eval_batch_size', '1']' died with <Signals.SIGKILL: 9>.
Having the same issue here.
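For anyone reading the last line of the traceback: `died with <Signals.SIGKILL: 9>` is how Python's `subprocess.CalledProcessError` renders a negative return code, so the training process was terminated by signal 9 from outside rather than exiting through a Python exception. The snippet below is only a minimal sketch of that decoding on a POSIX system; `describe_failure` and the self-killing child command are hypothetical names made up for illustration and are not part of DeepSpeed or `run_clm.py`.

```python
import signal
import subprocess
import sys


def describe_failure(returncode: int) -> str:
    """Turn a subprocess return code into a short description (hypothetical helper)."""
    if returncode < 0:
        # On POSIX, a return code of -N means the child was terminated by signal N.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    # Illustrative child process that sends SIGKILL to itself (POSIX only).
    child = [
        sys.executable,
        "-c",
        "import os, signal; os.kill(os.getpid(), signal.SIGKILL)",
    ]
    try:
        subprocess.run(child, check=True)
    except subprocess.CalledProcessError as err:
        print(describe_failure(err.returncode))
```

The log in this thread does not say what sent the signal. On Linux, one common source of an unexpected SIGKILL is the kernel's out-of-memory killer, so checking host RAM usage and the kernel log around the time of the crash may be a reasonable first step, but that is an assumption rather than something the traceback confirms.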