[chatllama] Do I need to split the llama model manually? #322

Open
balcklive opened this issue Mar 31, 2023 · 2 comments

@balcklive

I downloaded a LLaMA 7B model. It came with only one model file ending in .pth. But according to the model-loading code in llama_model.py shown below, if I want to train the model on multiple GPUs, I need to split the checkpoint into the same number of files as there are GPUs. May I ask how to do that? Or is there something I misunderstood?

import json
from pathlib import Path
from typing import Tuple

import torch


def load_checkpoints(
    ckpt_dir: str, local_rank: int, world_size: int
) -> Tuple[dict, dict]:
    # One checkpoint shard (.pth file) per model-parallel rank.
    checkpoints = sorted(Path(ckpt_dir).glob("*.pth"))
    assert world_size == len(checkpoints), (
        f"Loading a checkpoint for MP={len(checkpoints)} but world "  # world size means number of GPUs used, right?
        f"size is {world_size}"
    )
    ckpt_path = checkpoints[local_rank]
    print("Loading")
    checkpoint = torch.load(ckpt_path, map_location="cpu")
    with open(Path(ckpt_dir) / "params.json", "r") as f:
        params = json.loads(f.read())
    return checkpoint, params
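
A quick way to see what that assert expects is to count the .pth shards in the checkpoint directory. This is just a sketch; the path below is an example assuming the standard LLaMA download layout, not code from the repo:

from pathlib import Path

ckpt_dir = "llama/7B"  # example path to the downloaded weights
shards = sorted(Path(ckpt_dir).glob("*.pth"))
print(f"{len(shards)} shard(s): {[p.name for p in shards]}")
# load_checkpoints() asserts world_size == len(shards), so the single
# consolidated.00.pth of the 7B release implies one model-parallel process,
# unless the checkpoint is re-sharded first.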


sharlec commented Apr 4, 2023

I wonder about this question as well. I want to serve the 7B model on two servers, but I am not sure what needs to be done to the model architecture.

@PierpaoloSorbellini
Collaborator

Hi @balcklive, you may have to enable FairScale and set the MP (model parallelism) size as stated in the LLaMA documentation.
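
Roughly, this is what the FairScale/MP setup looks like in the reference LLaMA example. A minimal sketch, assuming the script is launched with torchrun (which sets LOCAL_RANK and WORLD_SIZE for each process); the training-script name in the launch command is a placeholder, and the exact chatllama code may differ:

import os
from typing import Tuple

import torch
from fairscale.nn.model_parallel.initialize import initialize_model_parallel


def setup_model_parallel() -> Tuple[int, int]:
    # torchrun provides LOCAL_RANK and WORLD_SIZE to every spawned process.
    local_rank = int(os.environ.get("LOCAL_RANK", -1))
    world_size = int(os.environ.get("WORLD_SIZE", -1))

    torch.distributed.init_process_group("nccl")
    # Model-parallel size must match the number of checkpoint shards.
    initialize_model_parallel(world_size)
    torch.cuda.set_device(local_rank)

    # Use the same seed in all processes so model-parallel ranks stay in sync.
    torch.manual_seed(1)
    return local_rank, world_size


# Example launch: one process for the single-shard 7B checkpoint.
#   torchrun --nproc_per_node 1 your_training_script.py

Larger checkpoints ship with more shards, so --nproc_per_node (and the number of GPUs) has to match their shard count.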
