training error: OutOfRangeError: End of sequence #60
Comments
I just found the reason. My dataset has only 7 classes (persons), but I set batch_p to 8.
Because of that, this step ends up as take(0), so the data iterator is exhausted on the very first iteration, which raises the error above. It's a silly mistake, but I suggest adding an if-else check that reports this condition.
Thanks for following up with the reason. Indeed, we could add code that catches this mistake; I'd happily accept a PR doing so!
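For readers hitting the same error, here is a minimal sketch of the arithmetic behind the take(0) case and of the kind of check suggested above. It assumes the pipeline keeps `(num_classes // batch_p) * batch_p` class IDs per epoch, which matches the take(0) result reported for 7 classes and batch_p = 8. The function names `usable_class_count` and `check_batch_p` are hypothetical and not part of the repository.

```python
def usable_class_count(num_classes, batch_p):
    """Class IDs kept per epoch when truncating to a multiple of batch_p.

    Assumption: the pipeline truncates with integer division, as the
    take(0) outcome described in this thread suggests.
    """
    return (num_classes // batch_p) * batch_p


def check_batch_p(num_classes, batch_p):
    """Hypothetical guard: fail early with a readable message instead of
    hitting OutOfRangeError on the first training step."""
    if batch_p < 1:
        raise ValueError("batch_p must be at least 1, got {}".format(batch_p))
    if num_classes < batch_p:
        raise ValueError(
            "batch_p ({}) is larger than the number of classes ({}); "
            "the input pipeline would be empty.".format(batch_p, num_classes))


# 7 classes with batch_p = 8: integer division gives 0, so nothing is kept.
print(usable_class_count(7, 8))
# 7 classes with batch_p = 4: one full group of 4 class IDs is kept per epoch.
print(usable_class_count(7, 4))
```

Calling `check_batch_p(len(unique_pids), args.batch_p)` before building the dataset would be one place for such a check; `unique_pids` and `args` are stand-in names here, not confirmed identifiers from train.py.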
You did a good job.
@muxizju I just came across this, as I also have only a few classes. My question is what happens to the remaining classes. Say I have 7 classes and batch_p is 4: what happens to the other 3 classes? Do they get cycled into later batches, or are they simply ignored?
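The thread does not answer this question, so the following is only a sketch under stated assumptions. It assumes the class IDs are shuffled first, then truncated to a multiple of batch_p, and that the shuffle is redone on every epoch. Under those assumptions the 3 leftover classes are skipped for that epoch only, and a different set is left out next time. If the truncation instead happened before the shuffle, the same classes would be dropped every epoch. The helper `simulate_epochs` is hypothetical and uses plain Python rather than the real input pipeline.

```python
import random


def simulate_epochs(class_ids, batch_p, num_epochs, seed=0):
    """Simulate shuffle -> truncate -> repeat over class IDs.

    Assumption: the order is reshuffled on every epoch and only the first
    (n // batch_p) * batch_p IDs of each shuffled order are used.
    """
    rng = random.Random(seed)
    ids = list(class_ids)
    keep = (len(ids) // batch_p) * batch_p
    epochs = []
    for _ in range(num_epochs):
        shuffled = list(ids)
        rng.shuffle(shuffled)
        epochs.append(shuffled[:keep])
    return epochs


epochs = simulate_epochs(range(7), batch_p=4, num_epochs=50)

# Every epoch uses exactly 4 of the 7 classes ...
print(all(len(kept) == 4 for kept in epochs))

# ... but across many epochs each class is eventually used.
seen = set()
for kept in epochs:
    seen.update(kept)
print(sorted(seen))
```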
I am using this code to train on my own dataset, but it raises this error at sess.run(). The full log is printed below; I changed some arguments, such as net_input_height and batch_p. My TensorFlow version is 1.7. I don't know what's wrong here.
Instructions for updating:
Use the retry module or similar alternatives.
2018-09-27 11:12:06,474 [INFO] train: Training using the following parameters:
2018-09-27 11:12:06,474 [INFO] train: batch_k: 4
2018-09-27 11:12:06,474 [INFO] train: batch_p: 8
2018-09-27 11:12:06,474 [INFO] train: checkpoint_frequency: 1000
2018-09-27 11:12:06,474 [INFO] train: crop_augment: False
2018-09-27 11:12:06,474 [INFO] train: decay_start_iteration: 100000
2018-09-27 11:12:06,474 [INFO] train: detailed_logs: False
2018-09-27 11:12:06,474 [INFO] train: embedding_dim: 128
2018-09-27 11:12:06,475 [INFO] train: experiment_root: F:/projector/GestureClassification/TripletBasedGestureRecognition/experiment_root/20180926/
2018-09-27 11:12:06,475 [INFO] train: flip_augment: False
2018-09-27 11:12:06,475 [INFO] train: head_name: fc1024
2018-09-27 11:12:06,475 [INFO] train: image_root: F:/projector/GestureClassification/data/img/20180919/triplet_data/img/
2018-09-27 11:12:06,475 [INFO] train: initial_checkpoint: None
2018-09-27 11:12:06,475 [INFO] train: learning_rate: 0.0003
2018-09-27 11:12:06,475 [INFO] train: loading_threads: 4
2018-09-27 11:12:06,475 [INFO] train: loss: batch_hard
2018-09-27 11:12:06,476 [INFO] train: margin: soft
2018-09-27 11:12:06,476 [INFO] train: metric: euclidean
2018-09-27 11:12:06,476 [INFO] train: model_name: resnet_v1_50
2018-09-27 11:12:06,476 [INFO] train: net_input_height: 64
2018-09-27 11:12:06,476 [INFO] train: net_input_width: 64
2018-09-27 11:12:06,476 [INFO] train: pre_crop_height: 64
2018-09-27 11:12:06,476 [INFO] train: pre_crop_width: 64
2018-09-27 11:12:06,476 [INFO] train: resume: False
2018-09-27 11:12:06,476 [INFO] train: train_iterations: 250000
2018-09-27 11:12:06,476 [INFO] train: train_set: F:/projector/GestureClassification/data/img/20180919/triplet_data/gesture_train.csv
2018-09-27 11:12:07,403 [INFO] tensorflow: Scale of 0 disables regularizer.
2018-09-27 11:12:07,403 [INFO] tensorflow: Scale of 0 disables regularizer.
2018-09-27 11:12:08,569 [WARNING] tensorflow: From F:\projector\GestureClassification\TripletBasedGestureRecognition\triplet-reid\nets\resnet_v1.py:219: calling reduce_mean (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
2018-09-27 11:12:08,569 [WARNING] tensorflow: From F:\projector\GestureClassification\TripletBasedGestureRecognition\triplet-reid\nets\resnet_v1.py:219: calling reduce_mean (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\ops\gradients_impl.py:100: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
2018-09-27 11:12:11.533610: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-09-27 11:12:11.936193: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1344] Found device 0 with properties:
name: GeForce GTX 1060 5GB major: 6 minor: 1 memoryClockRate(GHz): 1.7085
pciBusID: 0000:01:00.0
totalMemory: 5.00GiB freeMemory: 4.12GiB
2018-09-27 11:12:11.936710: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1423] Adding visible gpu devices: 0
2018-09-27 11:12:14.388590: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-09-27 11:12:14.388811: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:917] 0
2018-09-27 11:12:14.388948: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:930] 0: N
2018-09-27 11:12:14.415769: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3871 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 5GB, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-09-27 11:12:16.275624: I T:\src\github\tensorflow\tensorflow\core\kernels\cuda_solvers.cc:159] Creating CudaSolver handles for stream 000001A50E54E080
2018-09-27 11:12:20,572 [INFO] tensorflow: F:/projector/GestureClassification/TripletBasedGestureRecognition/experiment_root/20180926/checkpoint-0 is not in all_model_checkpoint_paths. Manually adding it.
2018-09-27 11:12:20,572 [INFO] tensorflow: F:/projector/GestureClassification/TripletBasedGestureRecognition/experiment_root/20180926/checkpoint-0 is not in all_model_checkpoint_paths. Manually adding it.
2018-09-27 11:12:23,207 [INFO] train: Starting training from iteration 0.
Traceback (most recent call last):
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\client\session.py", line 1327, in _do_call
return fn(*args)
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\client\session.py", line 1312, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\client\session.py", line 1420, in _call_tf_sessionrun
status, run_metadata)
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 516, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
[[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[?,64,64,3], [?], [?]], output_types=[DT_FLOAT, DT_STRING, DT_STRING], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "F:/projector/GestureClassification/TripletBasedGestureRecognition/triplet-reid/train.py", line 439, in <module>
main()
File "F:/projector/GestureClassification/TripletBasedGestureRecognition/triplet-reid/train.py", line 393, in main
prec_at_k, endpoints['emb'], losses, fids])
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\client\session.py", line 905, in run
run_metadata_ptr)
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\client\session.py", line 1140, in _run
feed_dict_tensor, options, run_metadata)
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\client\session.py", line 1321, in _do_run
run_metadata)
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\client\session.py", line 1340, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
[[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[?,64,64,3], [?], [?]], output_types=[DT_FLOAT, DT_STRING, DT_STRING], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
Caused by op 'IteratorGetNext', defined at:
File "F:/projector/GestureClassification/TripletBasedGestureRecognition/triplet-reid/train.py", line 439, in <module>
main()
File "F:/projector/GestureClassification/TripletBasedGestureRecognition/triplet-reid/train.py", line 280, in main
images, fids, pids = dataset.make_one_shot_iterator().get_next()
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\data\ops\iterator_ops.py", line 366, in get_next
name=name)), self._output_types,
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\ops\gen_dataset_ops.py", line 1484, in iterator_get_next
output_shapes=output_shapes, name=name)
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\framework\ops.py", line 3290, in create_op
op_def=op_def)
File "D:\Program Files\Python3.5\lib\site-packages\tensorflow\python\framework\ops.py", line 1654, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
OutOfRangeError (see above for traceback): End of sequence
[[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[?,64,64,3], [?], [?]], output_types=[DT_FLOAT, DT_STRING, DT_STRING], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
Process finished with exit code 1