parallel-tutorial

Simple PyTorch cases for understanding parallelism in AI training and inference.

Unless otherwise specified, all code is run on Linux on a DGX A100 (40 GB) using the nvcr.io/nvidia/pytorch:23.04-py3 container (PyTorch 2.0).

To set up this environment, please refer to the corresponding installation tutorials.

Unless otherwise specified, all code is written by shh2000@github; none of it is copied from other repositories.

Some simple cases in training_basic_model also include an xx_forward.py file, which contains only the forward pass (no training) to make the case easier to understand.

Cases:

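| category | task | case | parallel type | api | manual with README |
| --- | --- | --- | --- | --- | --- |
| train | simple | matmul | None | / | see code |
| | | | data | torch DDP (DistributedDataParallel) | see code |
| | | | 1D Tensor | / | see code |
| | | | Pipeline | torch Pipe() | / |
| | | C=A*B | 2D-Tensor | / | see code |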
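
To give a rough feel for how the "data", "1D Tensor" and "2D-Tensor" rows differ, here is a minimal plain-Python sketch of sharding C = A*B. It is not the code from this repo: it uses no GPUs and no PyTorch, each "worker" is just a loop iteration, and all function names (`data_parallel`, `tensor_parallel_1d`, `tensor_parallel_2d`, and the helpers) are hypothetical. The mapping of each function to a table row is an assumption about what those rows mean, and the pipeline case is left out.

```python
# Plain-Python sketch: every "worker" is simulated by a loop iteration.
# All names here are illustrative, not taken from the repo.

def matmul(a, b):
    """Reference C = A * B for matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def split_rows(m, parts):
    """Split a matrix into `parts` equal row blocks."""
    step = len(m) // parts
    return [m[i * step:(i + 1) * step] for i in range(parts)]

def split_cols(m, parts):
    """Split a matrix into `parts` equal column blocks."""
    step = len(m[0]) // parts
    return [[row[i * step:(i + 1) * step] for row in m] for i in range(parts)]

def hconcat(blocks):
    """Concatenate blocks side by side (along columns)."""
    return [sum(rows, []) for rows in zip(*blocks)]

def vconcat(blocks):
    """Stack blocks on top of each other (along rows)."""
    return [row for block in blocks for row in block]

def add(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def data_parallel(a, b, workers):
    # Each worker holds all of B and a slice of the rows of A (the "batch").
    return vconcat([matmul(a_shard, b) for a_shard in split_rows(a, workers)])

def tensor_parallel_1d(a, b, workers):
    # Each worker holds all of A and a column slice of B.
    return hconcat([matmul(a, b_shard) for b_shard in split_cols(b, workers)])

def tensor_parallel_2d(a, b, grid):
    # A and B are both cut into grid x grid blocks. Worker (i, j) owns block
    # C[i][j] and accumulates A[i][k] * B[k][j] over k.
    a_blocks = [split_cols(r, grid) for r in split_rows(a, grid)]
    b_blocks = [split_cols(r, grid) for r in split_rows(b, grid)]
    c_rows = []
    for i in range(grid):
        row_blocks = []
        for j in range(grid):
            acc = matmul(a_blocks[i][0], b_blocks[0][j])
            for k in range(1, grid):
                acc = add(acc, matmul(a_blocks[i][k], b_blocks[k][j]))
            row_blocks.append(acc)
        c_rows.append(hconcat(row_blocks))
    return vconcat(c_rows)
```

All three functions should return the same matrix as `matmul(a, b)` as long as the matrix dimensions divide evenly by the number of workers. In a real multi-GPU run, the concatenation and accumulation steps would be replaced by collective communication between devices.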
