This is an optional bonus homework assignment for the course 11-485/11-685/11-785 Introduction to Deep Learning at Carnegie Mellon University.
Most modern machine learning and deep learning frameworks rely on a technique called "Automatic Differentiation" (Autodiff for short) to compute gradients. In this homework assignment, we introduce a new Autodiff-based framework for computing these gradients, called new_grad, which focuses on the backbone of an autodiff framework, the Autograd Engine, without the complexity of a special "Tensor" class or the need to perform a depth-first search (DFS) over the computational graph during the backward pass.
In this assignment, you will build your own version of an automatic differentiation library, grounded in the deep learning concepts you learn in the course. What you will learn:
- What a computational graph is, and how mathematical operations are recorded on one,
- The benefits of thinking at the granularity of operations instead of layers,
- The simplicity of the chain rule of differentiation and backpropagation,
- Building a PyTorch-like API without a 'Tensor' class by working directly with numpy arrays (see the sketch after this list).
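To make these ideas concrete, here is a minimal, self-contained sketch of a tape-based autograd engine that works directly on numpy arrays. It is not the new_grad or starter-code API; all names (TapeAutograd, Operation, add_op) are illustrative only. Each operation records its inputs, its output, and one gradient function per input; because operations are recorded in execution order, the backward pass can simply walk the tape in reverse and apply the chain rule, with no graph search.

```python
import numpy as np

class Operation:
    """One recorded step on the tape: the op's inputs, output, and per-input gradient functions."""
    def __init__(self, inputs, output, grad_fns):
        self.inputs = inputs      # list of numpy arrays fed into the op
        self.output = output      # numpy array the op produced
        self.grad_fns = grad_fns  # one function per input: upstream grad -> gradient w.r.t. that input

class TapeAutograd:
    """Records operations in execution order; backward() replays the tape in reverse."""
    def __init__(self):
        self.tape = []
        self.grads = {}  # id(array) -> accumulated gradient (keying by id() is a simplification)

    def add_op(self, inputs, output, grad_fns):
        self.tape.append(Operation(inputs, output, grad_fns))

    def backward(self, output, grad=None):
        # Seed the gradient of the final output (dL/dL = 1).
        self.grads[id(output)] = np.ones_like(output) if grad is None else grad
        # The tape is already in execution order, so one reverse sweep applies
        # the chain rule without any DFS over the graph.
        for op in reversed(self.tape):
            upstream = self.grads.get(id(op.output))
            if upstream is None:
                continue
            for x, grad_fn in zip(op.inputs, op.grad_fns):
                self.grads[id(x)] = self.grads.get(id(x), 0) + grad_fn(upstream)

# Tiny example: y = a * b + c, recorded op by op.
engine = TapeAutograd()
a, b, c = np.array([2.0]), np.array([3.0]), np.array([1.0])

t = a * b
engine.add_op([a, b], t, [lambda g: g * b, lambda g: g * a])  # dt/da = b, dt/db = a
y = t + c
engine.add_op([t, c], y, [lambda g: g, lambda g: g])          # dy/dt = 1, dy/dc = 1

engine.backward(y)
print(engine.grads[id(a)], engine.grads[id(b)], engine.grads[id(c)])  # [3.] [2.] [1.]
```

The engine you build in this assignment follows the same spirit, though the actual classes and method signatures are defined in the starter code and writeup.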
Though we provide some starter code, you will complete the key components of the library, including the main Autograd engine. For specific instructions, please refer to the writeup included in this repository. Students enrolled in the course should submit their solutions through Autolab.
Once you have completed all the key components of the assignment, you will be able to build and train simple neural networks:
- Import mytorch and the nn module:
>>> from mytorch import autograd_engine
>>> import mytorch.nn as nn
- Declare the autograd object, layers, activations, and loss:
>>> autograd = autograd_engine.Autograd()
>>> linear1 = nn.Linear(input_shape, output_shape, autograd)
>>> activation1 = nn.ReLU(autograd)
>>> loss = nn.SoftmaxCrossEntropy(autograd)
- Calculate the loss and kick off backprop:
>>> y_hat = activation1(linear1(x))
>>> loss_val = loss(y, y_hat)
>>> loss.backward()
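- Train with a simple loop (this is only a hedged sketch, not the official interface: the parameter and gradient attributes W, b, dW, db and the variables x, y, num_epochs, lr are assumed here and may differ in the starter code):
>>> lr = 0.01
>>> for epoch in range(num_epochs):
...     y_hat = activation1(linear1(x))
...     loss_val = loss(y, y_hat)
...     loss.backward()
...     # W, b, dW, db are hypothetical attribute names; how gradients are
...     # reset between steps is engine-specific and not shown here
...     linear1.W -= lr * linear1.dW
...     linear1.b -= lr * linear1.db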
Developed by: Kinori Rosnow, Anurag Katakkar, David Park, Chaoran Zhang, Shriti Priya, Shayeree Sarkar, and Bhiksha Raj.