fix GPU mapping error for Horovod + finetune
When fine-tuning with Horovod, the same error as in #2712 is raised at the line modified in this PR.

Signed-off-by: Jinzhe Zeng <[email protected]>
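
The condition changed by this commit can be read as a two-step probe: on TensorFlow 1.14 or newer, first ask which GPUs are visible, and only fall back to `tf.test.is_gpu_available()` when that check does not apply or finds nothing. The sketch below mirrors that short-circuit order without importing TensorFlow. The names `parse_version` and `gpu_seems_available` and the injected callables are hypothetical and are not part of deepmd-kit; `parse_version` is a simplified stand-in for `packaging.version.Version`, which the actual patch uses.

```python
from typing import Callable, Sequence, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Hypothetical stand-in for packaging.version.Version.

    Keeps only the leading numeric fields, e.g. "2.14.0-rc1" -> (2, 14, 0).
    """
    parts = []
    for field in text.split("."):
        digits = ""
        for ch in field:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def gpu_seems_available(
    tf_version: str,
    get_visible_gpus: Callable[[], Sequence[object]],
    is_gpu_available: Callable[[], bool],
) -> bool:
    """Mirror the short-circuit order of the patched condition.

    get_visible_gpus stands in for
    tf.config.experimental.get_visible_devices('GPU') and
    is_gpu_available for tf.test.is_gpu_available; both are injected
    so the sketch runs without TensorFlow installed.
    """
    if parse_version(tf_version) >= parse_version("1.14") and get_visible_gpus():
        # A visible GPU was found, so the fallback probe is never called.
        return True
    # Older TensorFlow, or no visible GPU reported: use the original check.
    return is_gpu_available()


if __name__ == "__main__":
    fallback_calls = []

    def fake_fallback() -> bool:
        fallback_calls.append("called")
        return False

    print(gpu_seems_available("2.14.0", lambda: ["GPU:0"], fake_fallback))
    print(fallback_calls)
```

With a new enough version and a non-empty device list, the fallback callable is never invoked, which appears to be the point of the patch: the original probe is skipped whenever the visible-device query already answers the question.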
njzjz authored Dec 10, 2023
1 parent 0547940 commit 778c5d0
Showing 1 changed file with 5 additions and 1 deletion.
6 changes: 5 additions & 1 deletion deepmd/utils/batch_size.py
@@ -7,8 +7,12 @@
 )

 import numpy as np
+from packaging.version import (
+    Version,
+)

 from deepmd.env import (
+    TF_VERSION,
     tf,
 )
 from deepmd.utils.errors import (
@@ -59,7 +63,7 @@ def __init__(self, initial_batch_size: int = 1024, factor: float = 2.0) -> None:
             self.minimal_not_working_batch_size = self.maximum_working_batch_size + 1
         else:
             self.maximum_working_batch_size = initial_batch_size
-            if tf.test.is_gpu_available():
+            if (Version(TF_VERSION) >= Version("1.14") and tf.config.experimental.get_visible_devices('GPU')) or tf.test.is_gpu_available():
                 self.minimal_not_working_batch_size = 2**31
             else:
                 self.minimal_not_working_batch_size = (
