Deep Learning with PyTorch and Hugging Face

  • You can access the slide deck that covers PyTorch here
  • You can access the slide deck that covers various concepts related to Transformers here
  • It is recommended to read the slide decks before using the following Colab notebooks
  • Once you get a good grip on the first four modules, you can easily walk through the documentation or other code to build an application. I will keep updating this repository.
  • Recorded videos

Colab Notebooks

  1. The Fuel: Tensors

    • Difficulty Level: Easy if you have prior experience using NumPy or TensorFlow
    • Understand the PyTorch architecture
    • Create tensors of 0d, 1d, 2d, 3d, ... (multidimensional arrays, as in NumPy)
    • Understand the attributes: storage, stride, offset, device
    • Manipulate tensor dimensions
    • Operations on tensors
  2. The Engine: Autograd

    • Difficulty Level: Hard; requires a good understanding of the backprop algorithm. However, you can skip this and still follow the subsequent notebooks easily.
    • A few more tensor attributes and methods: requires_grad, grad, grad_fn, _saved_tensors, backward, retain_grad, zero_grad
    • Computation graph: Leaf node (parameters) vs non-leaf node (intermediate computation)
    • Accumulate gradients and update parameters inside the torch.no_grad context manager
    • Implementing a neural network from scratch
  3. The Factory: nn.Module, Data Utils

    • Difficulty Level: Medium
    • A brief tour of the source code of nn.Module
    • Everything is a module (layer in other frameworks)
    • Stack modules by subclassing nn.Module and build any neural network
    • Managing data with the Dataset class and the DataLoader class
  4. Convolutional Neural Network: Image Classification

    • Difficulty Level: Medium
    • Using torchvision for datasets
    • Build a CNN and move it to the GPU
    • Train and test
    • Transfer learning
    • Image segmentation
  5. Recurrent Neural Network: Sequence Classification

    • Difficulty Level: Hard for the pre-processing part, Medium for the model-building part
    • torchdata
    • torchtext
    • Embedding for words
    • Build RNN
    • Train, test, infer
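
The first notebook above lists storage, stride, and offset as tensor attributes. As a rough illustration of how they fit together, here is a minimal sketch in plain Python (no PyTorch required). The class `StridedView` and its methods `transpose` and `row` are hypothetical names made up for this example; they are not the PyTorch implementation and not the notebook's code. The sketch only mirrors the idea that a tensor is metadata (shape, stride, offset) over a flat buffer, so operations such as a transpose or a row selection can return a new view without copying data.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class StridedView:
    """Hypothetical stand-in for a tensor: metadata over a flat, shared buffer."""
    storage: List[float]        # flat buffer holding every element
    shape: Tuple[int, ...]      # size along each dimension
    stride: Tuple[int, ...]     # elements to skip per step along each dimension
    offset: int = 0             # position in storage of the first element

    def __getitem__(self, index):
        if isinstance(index, int):
            index = (index,)
        if len(index) != len(self.shape):
            raise IndexError("expected one index per dimension")
        for i, n in zip(index, self.shape):
            if not 0 <= i < n:
                raise IndexError("index out of range")
        # The core rule: flat position = offset + sum(index_k * stride_k)
        flat = self.offset + sum(i * s for i, s in zip(index, self.stride))
        return self.storage[flat]

    def transpose(self):
        """Swap the two axes of a 2-D view by swapping shape and stride."""
        if len(self.shape) != 2:
            raise ValueError("transpose is 2-D only in this sketch")
        return StridedView(self.storage, self.shape[::-1],
                           self.stride[::-1], self.offset)

    def row(self, r):
        """Select row r of a 2-D view by moving the offset; no data is copied."""
        if len(self.shape) != 2:
            raise ValueError("row is 2-D only in this sketch")
        if not 0 <= r < self.shape[0]:
            raise IndexError("row out of range")
        return StridedView(self.storage, self.shape[1:], self.stride[1:],
                           self.offset + r * self.stride[0])


storage = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
a = StridedView(storage, shape=(2, 3), stride=(3, 1))  # row-major 2x3
t = a.transpose()                                      # 3x2 view, same buffer
r = a.row(1)                                           # 1-D view of the second row

print(a[1, 2], t[2, 1], r[0])   # 5.0 5.0 3.0
print(t.storage is a.storage)   # True
```

In PyTorch itself the corresponding values can be inspected with `tensor.stride()` and `tensor.storage_offset()`; comparing them before and after a transpose shows the same swap of strides as in the sketch.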

Please take a look at the official tutorial series if you want to perform distributed training using a multi-GPU or multi-node setup in PyTorch (requires minimal modifications to the existing code). It covers various approaches, including:

  • Distributed Data-Parallel (DDP)
  • Fully Sharded Data Parallel (FSDP)
  • Model, Tensor, and Pipeline parallelism

Now, let's move on to the Hugging Face library, which further simplifies these training strategies.

  1. Using pre-trained models Notebook
    • Difficulty Level: Easy
    • AutoTokenizer
    • AutoModel
  2. Fine-Tuning Pre-Trained Models Notebook
    • Difficulty Level: Medium
    • datasets
    • tokenizer
    • data collator with padding
    • Trainer
  3. Loading Datasets Notebook
    • Difficulty Level: Easy
    • Dataset from local data files
    • Dataset from Hub
    • Preprocessing the dataset: slice, select, map, filter, flatten, interleave, concatenate
    • Loading from external links
  4. Build a Custom Tokenizer for a translation task Notebook
    • Difficulty Level: Medium
    • Translation dataset as running example
    • Building the tokenizer by encapsulating the normalizer, pre-tokenizer, and tokenization algorithm (BPE)
    • Save and load the tokenizer locally
    • Using it in the Transformer module
    • Exercise: Build a tokenizer with a shared vocabulary.
  5. Training a Custom Seq2Seq Model using the Vanilla Transformer Architecture Notebook
    • Difficulty Level: Medium, if you know how to build models in PyTorch.
    • Build the vanilla Transformer architecture in PyTorch
    • Create a configuration file for the model using the PretrainedConfig class
    • Wrap it with the HF PreTrainedModel class
    • Use the custom tokenizer built in the previous notebook
    • Use the Trainer API to train the model
  6. Gradient Accumulation - Continual Pre-training Notebook
    • Difficulty Level: Easy
    • Understand the memory requirements for training and inference
    • Understand how gradient accumulation works around limited memory
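
To make the gradient accumulation idea from the last notebook concrete, here is a minimal sketch in plain Python (no PyTorch required). It is not the notebook's code: the one-parameter model `y = w * x`, the toy data, and the helper names `grad_mse` and `accumulated_grad` are all assumptions chosen for illustration. The point it shows is that summing the gradients of equally sized micro-batches, each scaled by `1 / steps`, gives the same gradient as one large batch, while only one micro-batch has to be processed at a time.

```python
def grad_mse(w, batch):
    """Gradient w.r.t. w of mean((w * x - y) ** 2) over the batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, micro_batch_size):
    """Accumulate scaled micro-batch gradients instead of one big batch."""
    if len(data) % micro_batch_size != 0:
        raise ValueError("this sketch assumes equally sized micro-batches")
    steps = len(data) // micro_batch_size
    total = 0.0
    for k in range(steps):
        micro = data[k * micro_batch_size:(k + 1) * micro_batch_size]
        # Scale by 1 / steps so the running sum equals the full-batch mean.
        total += grad_mse(w, micro) / steps
    return total


# Toy data generated from y = 2 * x (an assumption for this example).
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w = 0.5

full = grad_mse(w, data)                                  # one batch of 4
accum = accumulated_grad(w, data, micro_batch_size=2)     # two batches of 2

print(full, accum)   # -22.5 -22.5
```

In a PyTorch training loop the same pattern usually means dividing each micro-batch loss by the number of accumulation steps, calling `loss.backward()` on every micro-batch, and calling `optimizer.step()` followed by `optimizer.zero_grad()` only once per accumulated batch.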