
Unable to Run SFT #66

Open
Rui-Yuan91 opened this issue Feb 13, 2024 · 3 comments

@Rui-Yuan91

When I run the SFT script from the example, choosing BasicTrainer instead of FSDPTrainer and disabling wandb logging to avoid unrelated issues:

python -u train.py model=pythia28 datasets=[hh] loss=sft exp_name=anthropic_dpo_pythia28 gradient_accumulation_steps=2 batch_size=64 eval_batch_size=32 trainer=BasicTrainer sample_during_eval=false wandb.enabled=False

I encountered the following error:

building policy
starting single-process worker
Creating trainer on process 0 with world size 1
Loading tokenizer EleutherAI/pythia-2.8b
Loaded train data iterator
Loading HH dataset (test split) from Huggingface...
Error executing job with overrides: ['model=pythia28', 'datasets=[hh]', 'loss=sft', 'exp_name=anthropic_dpo_pythia28', 'gradient_accumulation_steps=2', 'batch_size=64', 'eval_batch_size=32', 'trainer=BasicTrainer', 'sample_during_eval=false', 'wandb.enabled=False']
Traceback (most recent call last):
File "/shared/home/se79359/direct-preference-optimization/train.py", line 114, in main
worker_main(0, 1, config, policy, reference_model)
File "/shared/home/se79359/direct-preference-optimization/train.py", line 42, in worker_main
trainer = TrainerClass(policy, config, config.seed, config.local_run_dir, reference_model=reference_model, rank=rank, world_size=world_size)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/shared/home/se79359/direct-preference-optimization/trainers.py", line 179, in __init__
self.eval_batches = list(self.eval_iterator)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/shared/home/se79359/direct-preference-optimization/preference_datasets.py", line 320, in get_batch_iterator
for prompt, data in get_dataset(name, split, silent=silent, cache_dir=cache_dir).items():
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/shared/home/se79359/direct-preference-optimization/preference_datasets.py", line 168, in get_dataset
data = get_hh(split, silent=silent, cache_dir=cache_dir)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/shared/home/se79359/direct-preference-optimization/preference_datasets.py", line 142, in get_hh
dataset = datasets.load_dataset('Anthropic/hh-rlhf', split=split, cache_dir=cache_dir)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/datasets/load.py", line 1773, in load_dataset
builder_instance = load_dataset_builder(
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/datasets/load.py", line 1502, in load_dataset_builder
dataset_module = dataset_module_factory(
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/datasets/load.py", line 1219, in dataset_module_factory
raise e1 from None
File "/usr/local/lib/python3.11/site-packages/datasets/load.py", line 1203, in dataset_module_factory
).get_module()
^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/datasets/load.py", line 769, in get_module
else get_data_patterns_in_dataset_repository(hfh_dataset_info, self.data_dir)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/datasets/data_files.py", line 662, in get_data_patterns_in_dataset_repository
return _get_data_files_patterns(resolver)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/datasets/data_files.py", line 223, in _get_data_files_patterns
data_files = pattern_resolver(pattern)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/datasets/data_files.py", line 473, in _resolve_single_pattern_in_dataset_repository
glob_iter = [PurePath(filepath) for filepath in fs.glob(PurePath(pattern).as_posix()) if fs.isfile(filepath)]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/fsspec/spec.py", line 606, in glob
pattern = glob_translate(path + ("/" if ends_with_sep else ""))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/fsspec/utils.py", line 734, in glob_translate
raise ValueError(
ValueError: Invalid pattern: '**' can only be an entire path component

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

@BigBinnie

Have you solved the problem? I ran into the same issue.

@BigBinnie

I think it's caused by the version of the datasets package. Upgrading datasets fixed it for me.
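For reference, the suggestion above can be applied either by upgrading `datasets` or, if `datasets` must stay pinned, by downgrading `fsspec` instead. The fsspec pin below is an assumption based on the `fsspec/utils.py` frame in the traceback, not something stated in this thread:

```shell
# Option 1: upgrade datasets so it is compatible with the installed fsspec
pip install -U datasets

# Option 2 (assumption): keep the current datasets version and pin fsspec
# to a release from before the stricter '**' glob handling
pip install "fsspec<=2023.9.2"
```

After either change, rerunning the `train.py` command should get past the `Loading HH dataset` step if the version mismatch was the cause.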

@Yanfors

Yanfors commented May 13, 2024

@BigBinnie What version of the datasets package are you using? I still have the same problem after updating.
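When comparing setups across machines, it helps to report the exact installed versions of the two packages implicated in the traceback. A small stdlib-only sketch (the helper name `installed_version` is my own, not from the thread):

```python
from importlib.metadata import version, PackageNotFoundError

def installed_version(pkg: str) -> str:
    """Return the installed version of pkg, or 'not installed'."""
    try:
        return version(pkg)
    except PackageNotFoundError:
        return "not installed"

# The two packages implicated in this traceback:
for pkg in ("datasets", "fsspec"):
    print(pkg, installed_version(pkg))
```

Posting this output alongside the error makes it much easier to tell whether an upgrade actually took effect in the environment that runs `train.py`.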
