A lightweight header-only C++ library for differentiable programming. Unlike the popular TensorFlow and Torch, DiffNum is implemented simply as forward-mode inference with the chain rule, instead of a computation graph, source-code transformation, or other high-level autograd machinery. It therefore takes little effort to implement and apply.
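For intuition, here is a minimal, library-independent sketch of that idea (an illustration only, not DiffNum's internal code): each operation propagates a value together with its derivative by applying the chain rule.

```cpp
#include <cmath>
#include <cstdio>

// Illustration only (not DiffNum's internals): forward-mode autodiff carries a
// value and its derivative through every operation via the chain rule.
struct Dual { double v, d; };                                    // value, dv/dx
Dual mul(Dual a, Dual b) { return { a.v * b.v, a.d * b.v + a.v * b.d }; }
Dual sine(Dual a) { return { std::sin(a.v), std::cos(a.v) * a.d }; }

int main() {
    Dual x{ 1.5, 1.0 };                  // seed dx/dx = 1
    Dual y = sine(mul(x, x));            // y = sin(x^2)
    std::printf("%f %f\n", y.v, y.d);    // derivative: 2*x*cos(x^2)
    return 0;
}
```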
Advantages
- Extremely easy to use and flexible. Just replace `float`/`double` with `dfloat`/`ddouble`, and specify the independent variables via `setVar()`. It then does the autograd for you, and the gradients can be accessed at any stage of the computation. Indexing is flexible, because differentiation is applied to single variables rather than to vectors or tensors.
- Saves memory in long iterations. It does not record a computation graph (unlike the tape-based approach of Torch), so it stays efficient when the computation involves a large number of iterations, especially self-accumulating ones (see the sketch after this list).
- Second- and higher-order derivatives supported. Higher-order derivatives are obtained through the recursive template definition of `DiffVar`, written for example as `DiffVar<DiffVar<double, 0>, 0>`. This means autograding the gradients themselves, i.e. second derivatives or Hessian matrices.
- CUDA supported. DiffNum can even be used in CUDA kernel functions. Just replace `float`/`double` with `dfloat_cuda`/`ddouble_cuda`. This data structure can be used seamlessly in CUDA functions as local variables, parameters, or for other purposes. (Note that the auto-grad loops of a single variable are still sequential, not parallel, in CUDA code. In large-scale computations, parallelism should be applied at a higher level, such as matrix operations.)
- (Extra, work in progress) Independent of DiffNum, we offer the template classes `Vec`, `Matrix` and `Tensor`. They might be somewhat useful.
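To illustrate the memory point above, here is a minimal sketch (not taken from the library's documentation; it only uses `dfloat`, `setVar()` and the arithmetic operators described in this README) that differentiates a self-accumulating iteration, Newton's method for the square root, with no tape being stored:

```cpp
#include <DiffNum.h>
#include <iostream>
using namespace DiffNum;

int main() {
    dfloat<1> a = 2.0f;
    a.setVar(0);                    // a is the only independent variable
    dfloat<1> x = a;                // x_0 = a
    for (int k = 0; k < 20; ++k)
        x = 0.5f * (x + a / x);     // x_{k+1} = (x_k + a/x_k) / 2, converges to sqrt(a)
    // Memory use stays constant no matter how many iterations run; no graph is kept.
    std::cout << x << std::endl;    // value ~ 1.4142, gradient ~ 1/(2*sqrt(2)) ~ 0.3536
    return 0;
}
```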
Disadvantages
- The time complexity is many times that of reverse-mode differentiation (back propagation) when there is a large number of independent variables, since forward mode carries one gradient entry per independent variable through every operation. It can therefore be extremely inefficient when there are many variables!
Usage
- Using the differentiable variable. We offer the differentiable variable template class `DiffVar<d_type, size>`, where `d_type` can be `float`, `double`, or even another differentiable variable type. The template parameter `size` is the number of independent variables, i.e. the length of the gradient vector. If `size` is zero, the number of independent variables is left open and can be changed dynamically. Assigning values to `DiffVar` variables works just like with `float` or `double`: values are assigned simply with `operator=`, and real values can be assigned to `DiffVar` variables directly. For short, you can use `dfloat<size>` and `ddouble<size>`; they are the same as `DiffVar<float, size>` and `DiffVar<double, size>`. Here is an example:
```cpp
#include <DiffNum.h>
#include <iostream>

using namespace DiffNum;

// Let's define a function: f(x, y, z) = 2*x^2 - 3*y*z + 1, and apply DiffVar to autograd it.
dfloat<3> f(float _x, float _y, float _z) {
    // We have 3 independent variables to study: x, y, z.
    dfloat<3> x = _x, y = _y, z = _z;

    // The independent variables must be specified, otherwise they are treated as constants.
    // Here, let x be the 1st, y the 2nd, z the 3rd. Their indices are 0, 1 and 2 respectively.
    x.setVar(0); y.setVar(1); z.setVar(2);

    // Then use them like floats.
    return 2 * x * x - 3 * y * z + 1.0;
}

int main(void) {
    dfloat<3> u = f(3.7, 4.0, 4.3);
    std::cout << u << std::endl; // A DiffNum can be streamed directly to an ostream.
    // Output: -23.22(14.8, -12.9, -12)
    // The first number is the value of u; the following vector is the gradient w.r.t. (x, y, z).
    return 0;
}
```
- Access the value and gradients. To access the value, use `.getValue()`. To access the gradient, use `operator[]`. With higher-order derivatives, chain multiple `[]` to get the derivatives, for example `a[1]` or `b[1][2]` (see the short sketch after this list).
- Higher-order derivatives. Use `DiffVar` recursively, for example `DiffVar<DiffVar<double, 3>, 3>`. Currently, the gradient size of each recursion level must be the same.
- Primary mathematical functions. For `DiffVar`, we provide mathematical functions that perform autograd. They are in the template class `Math<T>`, where `T` is the type of the `DiffVar` you are using.
- CUDA supported DiffNum. To make DiffNum available in CUDA programs, which many scientific computing tasks require, we offer a CUDA version. Use `DiffVar_cuda` and `Math_cuda` in both host and device code. `DiffVar_cuda` can be a parameter of `__global__` functions.
- Directly memcpy DiffVar or DiffVar_cuda arrays? Except for `DiffVar<d_type, 0>`, which is dynamic, all other `DiffVar` arrays can be copied directly.
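Before the fuller examples below, here is a minimal sketch of reading values and gradients (the variable names are made up; the accessors are the ones listed above):

```cpp
#include <DiffNum.h>
#include <iostream>
using namespace DiffNum;

int main() {
    ddouble<2> x = 2., y = 5.;
    x.setVar(0); y.setVar(1);                // x is variable 0, y is variable 1
    ddouble<2> f = x * y + y;

    std::cout << f.getValue() << std::endl;  // value of f: 15
    std::cout << f[0] << std::endl;          // df/dx = y = 5
    std::cout << f[1] << std::endl;          // df/dy = x + 1 = 3
    return 0;
}
```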
Example 1. a, b are independent variables. c = a+b; d = log(max(sin(a/c), b))
```cpp
using dmath = Math<ddouble<0>>;

// Example 1. a, b are variables. c = a+b; d = log(max(sin(a/c), b))
ddouble<0> a = 2., b = 3.;
// 2 total variables, a is the first, b is the second
a.setVar(2, 0); b.setVar(2, 1);
auto c = a + b;
auto d = dmath::Log(dmath::Max(dmath::Sin(a / c), b));
std::cout << d << std::endl;
```
Example 2. Vec v1, v2; v1[2] is the variable; q = v1 dot v2. We also offer dense `Vec` and `Mat`. Since `DiffVar` behaves so much like `float` and `double`, it can easily be adopted into any advanced numerical structure.
```cpp
// Example 2. Vec v1 v2. v1[2] is the variable. q = v1 dot v2.
Vec<ddouble<0>, 3> v1, v2;
v1[0] = 8.7;
v1[1] = 4.3;
v1[2] = 7.;
v2[0] = -6.7;
v2[1] = 4.1;
v2[2] = 2.3;

// Set v1[2] as the only variable.
v1[2].setVar(1, 0);

auto q = Vec<ddouble<0>, 3>::dot(v1, v2);
std::cout << q << std::endl;
std::cout << std::endl;
```
Example 3. Evaluating second derivatives.
```cpp
// Example 3. Evaluating second derivatives.
using ddmath = Math<dddouble<2>>;

dddouble<2> x = 2., y = 3.;
x.setVar(0); y.setVar(1);

std::cout << "x := 2, y := 3" << std::endl;
std::cout << "x^3 + 2*y^2 = ";
std::cout << ddmath::Pow(x, unsigned(3)) + 2. * ddmath::Pow(y, unsigned(2)) << std::endl;
std::cout << "x + x^3*y + x*y + 2*y = ";
std::cout << x + ddmath::Pow(x, unsigned(3)) * y + x * y + 2. * y << std::endl;
```
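As a speculative continuation of Example 3 (not in the original README), an individual second derivative could be read through chained `[]`, assuming `d2[i][j]` denotes the derivative with respect to variable `i` and then variable `j`, as described above:

```cpp
// Assumption: chained [] gives second derivatives, i.e. d2[i][j] = d/dvar_j (d f / dvar_i).
// For f = x + x^3*y + x*y + 2*y at x = 2, y = 3, the mixed derivative d2f/dxdy = 3*x^2 + 1 = 13.
auto d2 = x + ddmath::Pow(x, unsigned(3)) * y + x * y + 2. * y;
std::cout << d2[0][1] << std::endl;
```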
This is a header-only library. Just clone this repository and include the headers in your code.
```cpp
#include <DiffNum.h>
```
And for CUDA applications:

```cpp
#include <DiffNum_cuda.h>
```
Thanks to this project, I learned CUDA...