Infinite Loop Error (keeps starting train.py for some reason) #14

Open
civiliangame opened this issue Jul 4, 2022 · 2 comments

civiliangame commented Jul 4, 2022

Hello,
I haven't made any modifications to the code. I cloned it, installed the requirements, and ran the script for training. No other options, no custom data.

As you can see, it ran train.py twice for some reason and then exited with a broken pipe error. I've included the error output below (Output 1).

I then tried debugging this on another machine, with the if __name__ == "__main__" modification to prevent train.py from calling itself.

It seemed like line 74 from train.py was causing this issue:
[image: screenshot of line 74 of train.py]

This too caused an error, albeit a different one. I've included that one as well (Output 2)

Thank you.

Here's the error Output 1:
(deepsim2) D:\DeepSIM>python ./train.py --dataroot ./datasets/car --primitive seg --no_instance --tps_aug 1 --name DeepSIMCar
C:\Users\Public\Anaconda\envs\deepsim2\lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension:
warn(f"Failed to load image Python extension: {e}")
name DeepSIMCar
[0]
------------ Options -------------
affine_aug: none
batchSize: 1
beta1: 0.5
canny_aug: 0
canny_color: 0
canny_sigma_l_bound: 1.2
canny_sigma_step: 0.3
canny_sigma_u_bound: 3
checkpoints_dir: ./checkpoints
continue_train: False
cutmix_aug: 0
cutmix_max_size: 96
cutmix_min_size: 32
data_type: 32
dataroot: ./datasets/car
debug: False
display_freq: 100
display_winsize: 512
feat_num: 3
fineSize: 256
fp16: False
gpu_ids: [0]
input_nc: 3
instance_feat: False
isTrain: True
label_feat: False
label_nc: 0
lambda_feat: 10.0
loadSize: 256
load_features: False
load_pretrain:
local_rank: 0
lr: 0.0002
max_dataset_size: inf
model: pix2pixHD
nThreads: 2
n_blocks_global: 9
n_blocks_local: 3
n_clusters: 10
n_downsample_E: 4
n_downsample_global: 4
n_layers_D: 3
n_local_enhancers: 1
name: DeepSIMCar
ndf: 64
nef: 16
netG: global
ngf: 64
niter: 8000
niter_decay: 8000
niter_fix_global: 0
no_flip: False
no_ganFeat_loss: False
no_html: False
no_instance: True
no_lsgan: False
no_vgg_loss: False
norm: instance
num_D: 2
output_nc: 3
phase: train
pool_size: 0
primitive: seg
print_freq: 100
resize_or_crop: none
save_epoch_freq: 20000
save_latest_freq: 20000
serial_batches: False
test_canny_sigma: 2
tf_log: False
tps_aug: 1
tps_percent: 0.99
tps_points_per_dim: 3
use_dropout: False
verbose: False
which_epoch: latest
-------------- End ----------------
./train.py:11: DeprecationWarning: fractions.gcd() is deprecated. Use math.gcd() instead.
def lcm(a, b): return abs(a * b) / fractions.gcd(a, b) if a and b else 0
CustomDatasetDataLoader
dataset [AlignedDataset] was created
#training images = 1
C:\Users\Public\Anaconda\envs\deepsim2\lib\site-packages\torchvision\models\_utils.py:209: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and will be removed in 0.15, please use 'weights' instead.
f"The parameter '{pretrained_param}' is deprecated since 0.13 and will be removed in 0.15, "
C:\Users\Public\Anaconda\envs\deepsim2\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing weights=VGG19_Weights.IMAGENET1K_V1. You can also use weights=VGG19_Weights.DEFAULT to get the most up-to-date weights.
warnings.warn(msg)
create web directory ./checkpoints\DeepSIMCar\web...
display_delta 0
print_delta 0.0
save_delta 0
C:\Users\Public\Anaconda\envs\deepsim2\lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension:
warn(f"Failed to load image Python extension: {e}")
name DeepSIMCar
[0]
------------ Options -------------
affine_aug: none
batchSize: 1
beta1: 0.5
canny_aug: 0
canny_color: 0
canny_sigma_l_bound: 1.2
canny_sigma_step: 0.3
canny_sigma_u_bound: 3
checkpoints_dir: ./checkpoints
continue_train: False
cutmix_aug: 0
cutmix_max_size: 96
cutmix_min_size: 32
data_type: 32
dataroot: ./datasets/car
debug: False
display_freq: 100
display_winsize: 512
feat_num: 3
fineSize: 256
fp16: False
gpu_ids: [0]
input_nc: 3
instance_feat: False
isTrain: True
label_feat: False
label_nc: 0
lambda_feat: 10.0
loadSize: 256
load_features: False
load_pretrain:
local_rank: 0
lr: 0.0002
max_dataset_size: inf
model: pix2pixHD
nThreads: 2
n_blocks_global: 9
n_blocks_local: 3
n_clusters: 10
n_downsample_E: 4
n_downsample_global: 4
n_layers_D: 3
n_local_enhancers: 1
name: DeepSIMCar
ndf: 64
nef: 16
netG: global
ngf: 64
niter: 8000
niter_decay: 8000
niter_fix_global: 0
no_flip: False
no_ganFeat_loss: False
no_html: False
no_instance: True
no_lsgan: False
no_vgg_loss: False
norm: instance
num_D: 2
output_nc: 3
phase: train
pool_size: 0
primitive: seg
print_freq: 100
resize_or_crop: none
save_epoch_freq: 20000
save_latest_freq: 20000
serial_batches: False
test_canny_sigma: 2
tf_log: False
tps_aug: 1
tps_percent: 0.99
tps_points_per_dim: 3
use_dropout: False
verbose: False
which_epoch: latest
-------------- End ----------------
CustomDatasetDataLoader
dataset [AlignedDataset] was created
#training images = 1
C:\Users\Public\Anaconda\envs\deepsim2\lib\site-packages\torchvision\models\_utils.py:209: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and will be removed in 0.15, please use 'weights' instead.
f"The parameter '{pretrained_param}' is deprecated since 0.13 and will be removed in 0.15, "
C:\Users\Public\Anaconda\envs\deepsim2\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing weights=VGG19_Weights.IMAGENET1K_V1. You can also use weights=VGG19_Weights.DEFAULT to get the most up-to-date weights.
warnings.warn(msg)
create web directory ./checkpoints\DeepSIMCar\web...
display_delta 0
print_delta 0.0
save_delta 0
Traceback (most recent call last):
File "<string>", line 1, in <module>
Traceback (most recent call last):
File "./train.py", line 74, in <module>
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\multiprocessing\spawn.py", line 105, in spawn_main
for i, data in enumerate(dataset, start=epoch_iter):
exitcode = _main(fd)
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\site-packages\torch\utils\data\dataloader.py", line 438, in __iter__
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\multiprocessing\spawn.py", line 114, in _main
prepare(preparation_data)
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\multiprocessing\spawn.py", line 225, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
run_name="__mp_main__")
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "D:\DeepSIM\train.py", line 74, in <module>
for i, data in enumerate(dataset, start=epoch_iter):
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\site-packages\torch\utils\data\dataloader.py", line 438, in __iter__
return self._get_iterator()
return self._get_iterator()
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\site-packages\torch\utils\data\dataloader.py", line 384, in _get_iterator
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\site-packages\torch\utils\data\dataloader.py", line 384, in _get_iterator
return _MultiProcessingDataLoaderIter(self)
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\site-packages\torch\utils\data\dataloader.py", line 1048, in __init__
return _MultiProcessingDataLoaderIter(self)
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\site-packages\torch\utils\data\dataloader.py", line 1048, in __init__
w.start()
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\multiprocessing\process.py", line 112, in start
w.start()
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\multiprocessing\process.py", line 112, in start
self._popen = self._Popen(self)
self._popen = self._Popen(self)
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\multiprocessing\context.py", line 223, in _Popen
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\multiprocessing\context.py", line 322, in _Popen
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
return Popen(process_obj)
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\multiprocessing\popen_spawn_win32.py", line 46, in __init__
reduction.dump(process_obj, to_child)
prep_data = spawn.get_preparation_data(process_obj._name)
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\multiprocessing\reduction.py", line 60, in dump
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
_check_not_importing_main()
File "C:\Users\Public\Anaconda\envs\deepsim2\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
ForkingPickler(file, protocol).dump(obj)
is not going to be frozen to produce an executable.''')
BrokenPipeError: [Errno 32] Broken pipe
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.

Here is the error Output 2:
(deepsim) PS E:\JM\GAN\deepsim> python ./train.py --dataroot ./datasets/car --primitive seg --no_instance --tps_aug 1 --name DeepSIMCar
C:\Users\TWiM.conda\envs\deepsim\lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension:
warn(f"Failed to load image Python extension: {e}")
name DeepSIMCar
[0]
------------ Options -------------
affine_aug: none
batchSize: 1
beta1: 0.5
canny_aug: 0
canny_color: 0
canny_sigma_l_bound: 1.2
canny_sigma_step: 0.3
canny_sigma_u_bound: 3
checkpoints_dir: ./checkpoints
continue_train: False
cutmix_aug: 0
cutmix_max_size: 96
cutmix_min_size: 32
data_type: 32
dataroot: ./datasets/car
debug: False
display_freq: 100
display_winsize: 512
feat_num: 3
fineSize: 256
fp16: False
gpu_ids: [0]
input_nc: 3
instance_feat: False
isTrain: True
label_feat: False
label_nc: 0
lambda_feat: 10.0
loadSize: 256
load_features: False
load_pretrain:
local_rank: 0
lr: 0.0002
max_dataset_size: inf
model: pix2pixHD
nThreads: 2
n_blocks_global: 9
n_blocks_local: 3
n_clusters: 10
n_downsample_E: 4
n_downsample_global: 4
n_layers_D: 3
n_local_enhancers: 1
name: DeepSIMCar
ndf: 64
nef: 16
netG: global
ngf: 64
niter: 8000
niter_decay: 8000
niter_fix_global: 0
no_flip: False
no_ganFeat_loss: False
no_html: False
no_instance: True
no_lsgan: False
no_vgg_loss: False
norm: instance
num_D: 2
output_nc: 3
phase: train
pool_size: 0
primitive: seg
print_freq: 100
resize_or_crop: none
save_epoch_freq: 20000
save_latest_freq: 20000
serial_batches: False
test_canny_sigma: 2
tf_log: False
tps_aug: 1
tps_percent: 0.99
tps_points_per_dim: 3
use_dropout: False
verbose: False
which_epoch: latest
-------------- End ----------------
./train.py:16: DeprecationWarning: fractions.gcd() is deprecated. Use math.gcd() instead.
def lcm(a, b): return abs(a * b) / fractions.gcd(a, b) if a and b else 0
CustomDatasetDataLoader
dataset [AlignedDataset] was created
#training images = 1
C:\Users\TWiM.conda\envs\deepsim\lib\site-packages\torchvision\models\_utils.py:209: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and will be removed in 0.15, please use 'weights' instead.
f"The parameter '{pretrained_param}' is deprecated since 0.13 and will be removed in 0.15, "
C:\Users\TWiM.conda\envs\deepsim\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing weights=VGG19_Weights.IMAGENET1K_V1. You can also use weights=VGG19_Weights.DEFAULT to get the most up-to-date weights.
warnings.warn(msg)
create web directory ./checkpoints\DeepSIMCar\web...
display_delta 0
print_delta 0.0
save_delta 0
C:\Users\TWiM.conda\envs\deepsim\lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension:
warn(f"Failed to load image Python extension: {e}")
Traceback (most recent call last):
File "<string>", line 1, in <module>
Traceback (most recent call last):
File "C:\Users\TWiM.conda\envs\deepsim\lib\multiprocessing\spawn.py", line 105, in spawn_main
File "./train.py", line 197, in <module>
exitcode = _main(fd)
main()
File "./train.py", line 81, in main
File "C:\Users\TWiM.conda\envs\deepsim\lib\multiprocessing\spawn.py", line 115, in _main
for i, data in enumerate(dataset, start=epoch_iter):
self = reduction.pickle.load(from_parent)
File "C:\Users\TWiM.conda\envs\deepsim\lib\site-packages\torch\utils\data\dataloader.py", line 438, in __iter__
EOFError: Ran out of input
return self._get_iterator()
File "C:\Users\TWiM.conda\envs\deepsim\lib\site-packages\torch\utils\data\dataloader.py", line 384, in _get_iterator
return _MultiProcessingDataLoaderIter(self)
File "C:\Users\TWiM.conda\envs\deepsim\lib\site-packages\torch\utils\data\dataloader.py", line 1048, in __init__
w.start()
File "C:\Users\TWiM.conda\envs\deepsim\lib\multiprocessing\process.py", line 112, in start
self._popen = self._Popen(self)
File "C:\Users\TWiM.conda\envs\deepsim\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Users\TWiM.conda\envs\deepsim\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Users\TWiM.conda\envs\deepsim\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
reduction.dump(process_obj, to_child)
File "C:\Users\TWiM.conda\envs\deepsim\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'CustomDatasetDataLoader.initialize.<locals>.<lambda>'


civiliangame commented Jul 4, 2022

I found the problem and it seems to be fixed for now. Documenting it for future adventurers.

Go to DeepSIM/data/custom_dataset_data_loader.py in your repository.

https://github.com/eliahuhorwitz/DeepSIM/blob/master/data/custom_dataset_data_loader.py

class CustomDatasetDataLoader(BaseDataLoader):

    def name(self):
        return 'CustomDatasetDataLoader'

    def initialize(self, opt):
        BaseDataLoader.initialize(self, opt)
        if opt.isTrain:
            self.dataset = CreateDataset(opt)
        else:
            self.dataset = CreateDataset_test(opt)
        self.dataloader = torch.utils.data.DataLoader(
            self.dataset,
            batch_size=opt.batchSize,
            shuffle=not opt.serial_batches,
            num_workers=int(opt.nThreads),
            worker_init_fn=lambda _: np.random.seed())

    def load_data(self):
        return self.

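Replace the entirety of line 38, the one starting with worker_init_fn=, with a single closing parenthesis. It seems like the multiprocessing workers are what is breaking things: on Windows the DataLoader workers are spawned as new processes and their arguments are pickled, and that lambda cannot be pickled (see the AttributeError at the end of Output 2).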
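
A minimal sketch of why that line fails, using only the standard library rather than PyTorch. It checks whether an object survives pickling, which is what the spawn start method requires of everything handed to a worker. The names `seed_worker`, `make_local_init`, and `can_pickle` are hypothetical and not part of DeepSIM. Passing a module-level function as `worker_init_fn` instead of deleting the argument is an untested assumption, not something confirmed in this thread.

```python
import pickle
import random


def seed_worker(worker_id):
    # Hypothetical module-level replacement for the lambda: reseed in each worker.
    random.seed()


def make_local_init():
    # Mirrors the pattern in initialize(): a lambda created inside a function.
    return lambda _: random.seed()


def can_pickle(obj):
    """Return True if obj can be serialized with pickle.dumps."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


if __name__ == "__main__":
    # A function defined at module level is pickled by reference to its name.
    print("module-level function picklable:", can_pickle(seed_worker))
    # A lambda defined inside another function has no importable name.
    print("local lambda picklable:", can_pickle(make_local_init()))
```

If the per-worker reseeding matters for your augmentations, a function like `seed_worker` defined at the top of custom_dataset_data_loader.py might let you keep it. I have not tried that here.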

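Then, go to train.py, move everything into a main() function, and call it from an if __name__ == "__main__" guard at the bottom of the file:

def main():
    # literally all of train.py here
    ...

if __name__ == "__main__":
    main()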

This worked for me and at least the code is running.
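
A small standard-library sketch of the guard, assuming the spawn start method that Windows uses. The names `child_task` and `main` are hypothetical stand-ins for the DataLoader workers and the body of train.py. Under spawn, the child re-imports the script under a different `__name__` (the `run_name="__mp_main__"` frame in Output 1), so anything outside the guard runs again in every child, which would explain why the options were printed twice.

```python
import multiprocessing as mp


def child_task():
    # Runs in the child process; shows the name the script was re-imported under.
    print("child sees __name__ =", __name__)


def main():
    # Request "spawn" explicitly so the behaviour matches Windows on any OS.
    ctx = mp.get_context("spawn")
    proc = ctx.Process(target=child_task)
    proc.start()
    proc.join()
    return proc.exitcode


if __name__ == "__main__":
    # Only the parent process enters this block; the re-imported copy in the
    # child skips it, so no second training run is started.
    print("child exit code:", main())
```

Without the guard, the call to `main()` would execute during the child's import and try to start yet another process, which is the situation the RuntimeError in Output 1 describes.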

civiliangame commented

For test.py, you should also put everything inside an if __name__ == "__main__" guard.
