Basic definitions for autodifferentiation.

## Overview¶

This module defines a simple autodiff system for Klamp’t objects, including 3D math, kinematics, dynamics, and geometry functions. This is useful for all sorts of optimization functions, e.g., machine learning, calibration, and trajectory optimization.

This can be used standalone, but is more powerful in conjunction with PyTorch or CadADi. Use the functions in pytorch and casadi to convert these Klamp’t functions to PyTorch or CasADi functions.

## Calling and evaluating functions¶

When you call an ADFunctionInterface object on some arguments, the function isn’t immediately called. Instead, it creates an expression DAG that can either be evaluated using eval(), or a derivative can be taken with derivative(), or the expression itself can be passed as an argument to another ADFunctionInterface.

Many built-in Python numeric functions and some Numpy functions are made available in autodiff variants. Standard numeric operators like +, -, , /, *, abs, and sum will automatically create autodiff-able expressions.

Array indexing works properly on expressions. To create larger arrays from scalars or smaller arrays, you must use the ad.stack function, you can’t just write [expr1,expr2].

Variables are named by strings. If you pass a string to a function, it creates an expression that is dependent on a variable. It can later be evaluated by passing the named variables as arguments to eval(). For example:

from klampt.math.autodiff.ad import *
import numpy as np
print(mul('x','y').eval(x=2,y=np.array([3.0,4.0]))) #gives the vector [6,8]


Note that if want to use variables in Python operators, you can’t simply do something like 'x'+'y' since Python sees this as string addition. Instead, you should explicitly declare a variable using the var keyword: var('x') + var('y').

## Everything evaluates to a scalar or 1D array¶

An important note is that all arguments to functions, return values, and variable assignments must either be scalars or 1D numpy arrays.

Derivatives of a function are always of shape (size(f(x)),size(x)).

If you are creating your own function that can accept either scalars or vectors, the ad._size() and ad._scalar() functions are your friends. The size of a scalar is 1.

Note

Arrays should be float type! Weird results can result if you, for example initialize your arrays with ints, e.g., np.array([0,0,0])

## Expression DAG and performance¶

The expression DAG can be nested arbitrarily deeply. For improved performance, you can reuse the same sub-expression multiple times in a DAG. For example, in the following setup:

diff = var('x')**2 - var('y')**2
poly = diff**3 - 2.0*diff**2 + 4.0*diff - 6
poly.eval(x=2.5,y=1.5)


the term diff will only be evaluated once and then reused through the evaluation of poly. Similarly, derivatives will be smart in reusing common subexpressions constructed in this fashion. For complex subexpressions, this can be much faster than evaluating the full expression.

Note, however, it is the user’s job to construct expressions such that common subexpressions are the same Python object. So the following code:

x = var('x')
y = var('y')
poly = (x**2-y**2)**3 - 2.0*(x**2-y**2)**2 + 4.0*(x**2-y**2) - 6
poly.eval(x=2.5,y=1.5)


doesn’t know that the term (x**2-y**2) is repeated, and instead it will be recalculated three times.

Caching is only done within a single eval call. If you would like to have multiple expressions evaluated with a single eval call, you can either put them in a stack expression, or use the eval_multiple() function.

If you print an expression DAG that contains repeated subexpressions, you will see terms like #X and @X. The term #X after a subexpression indicates that this subexpression may be reused, and @X indicates an instance of reuse.

## Derivatives¶

To get the derivative of an expression, call ADFunctionCall.derivative(). This will use the chain rule to handle arbitrarily nested expressions. A derivative is always a 2D array, even if the expression or the argument is scalar.

Finite differences is performed automatically if some internal function doesn’t have derivative information defined. To modify the finite difference step size, modify FINITE_DIFFERENCE_RES.

## Creating your own functions¶

The easiest way to create your own function is the function() function. Just pass a Python function to it, indicate the input and output sizes (for better error handling), and any available derivatives as keywords. This will then automatically generate an ADFunctionInterface instance that can be called to produce an auto-differentable function.

You can either provide a derivative function, J(arg,x1,...,xn), which returns the Jacobian $$\frac{df}{dx_{arg}}(x1,..,xn)$$ , or a jvp function Jdx(arg,dx,x1,...,xn), which calculates the jacobian-vector product $$\frac{df}{dx_{arg}}(x1,...,xn)\cdot dx$$ . More conveniently, you can just provide a list of functions Ji(x1,...,xn), with i=1,..,n, with Ji giving $$\frac{df}{dx_i}(x1,...,xn)$$ , or in the JVP form, functions Jdxi(dx,x1,...,xn) each giving $$\frac{df}{dx_{i}}(x1,...,xn)\cdot dx$$. See the function() documentation for more details.

Example:

from klampt.math.autodiff import ad

def my_func(x,y):
return x**2 - y**2

my_func_ad = ad.function(my_func)   #does no error checking, no derivatives
my_func_ad2 = ad.function(my_func,'auto',[1,1],1)  #indicates a scalar function of two scalars
derivative=[lambda x,y:np.array(2*x).resize((1,1)),
lambda x,y:np.array(-2*y).resize((1,1))])  #gives Jacobians
jvp=[lambda dx,x,y:2*x*dx, lambda dy,x,y:-2*y*dy])  #gives Jacobian-vector products

#now you can do a bunch of things...
print(my_func_ad4(var('x'),var('x')).derivative('x',x=2))  #evaluates the derivative, which is always []
print(my_func_ad4(var('x'),2.0).derivative('x',x=2))  #evaluates the derivative, which is 4
print(my_func_ad(var('x'),2.0).derivative('x',x=2))   #uses finite differencing to approximate the derivative


For more sophisticated functions, such as those that need to be constructed with some custom user data, you may subclass the ADFunctionInterface interface class. At the very least, you should overload the eval method, although you will likely also want to overload n_args, n_in, n_out, and one of the derivative methods.

The same example as above:

class MyFunc(ADFunctionInterface):
def __str__(self):
return 'my_func'
def n_args(self):
return 2
def n_in(self,arg):
return 1
def n_out(self):
return 1
def eval(self,x,y):
return x**2 - y**2
def jvp(self,arg,darg,x,y):
if arg == 0:  #x
return 2*x*darg
else: #y
return -2*y*darg

print(my_func_ad4(var('x'),var('x')).derivative('x',x=2))  #evaluates the derivative, which is always []
print(my_func_ad4(var('x'),2.0).derivative('x',x=2))  #evaluates the derivative, which is 4


If you are going to be taking a lot of derivatives w.r.t. vector variables, then the most efficient way to implement your functions is to provide the derivative method. Otherwise, the most efficient way is to provide the jvp method. The reason is that taking the derivative w.r.t. a vector will create intermediate matrices, and to perform the chain rule, derivative will perform a single call and a matrix multiply. On the other hand, jvp will be called once for each column in the matrix. If you are taking derivatives w.r.t. scalars, jvp won’t create the intermediate Jacobian matrix, so it will be faster. Implementing both will let the system figure out the best alternative.

When you create a new function with its derivatives, you ought to run it through check_derivatives to see whether they match with finite differences. To modify the finite difference step size, modify FINITE_DIFFERENCE_RES.

This module defines var(), function(), and the helper functions check_derivatives(), eval_multiple(), derivative_multiple() and finite_differences().

The operators stack, sum_, minimum, and maximum operate somewhat like their Numpy counterparts:

• stack acts like np.stack, since all items are 1D arrays.

• sum_ acts like Numpy’s sum if only one item is provided. Otherwise, it acts like the builtin sum.

• minimum/maximum act like ndarray.min/max if only one item is provided, and otherwise they act like Numpy’s minimum/maximum. If more than 2 items are provided, they are applied elementwise and sequentially.

The operators add, sub, mul, div, neg, pow_, abs_, and getitem are also provided here, but you will probably just want to use the standard Python operators +, -, , /, -, *, abs, and []. On arrays, they act just like the standard numpy functions.

Another convenience function is setitem(), which has the signature setitem(x,indices,v) and is equivalent to:

xcopy = np.copy(x)
xcopy[indices] = v
return xcopy


which is sort of a proper functional version of the normal x[indices]=v.

The function cond acts like an if statement, where cond(pred,posval,negval) returns posval if pred > 0 and negval otherwise. There’s also cond3(pred,posval,zeroval,negval) that returns posval if pred > 0, zeroval if pred==0, and negval otherwise. Conditions can also be applied when pred is an array, but it must be of equal size to posval and negval. The result in this case is an array with the i’th value taking on posval[i] when pred[i] > 0 and negval[i] otherwise.

## Other Klampt ad Modules¶

autodiff versions of many Klamp’t functions are found in:

• math_ad : basic vector math, including functions from the vectorops module.

• so3_ad : operations on SO(3), mostly compatible with the so3 module.

• se3_ad : operations on SE(3), mostly compatible with the se3 module.

• kinematics_ad kinematics functions, e.g., robot Jacobians.

• geometry_ad geometry derivative functions.

• dynamics_ad dynamics derivative functions.

• trajectory_ad trajectory derivative functions.

Data:

 FINITE_DIFFERENCE_RES The resolution used for finite differences. add Autodiff’ed function comparable to the addition operator +. sub Autodiff’ed function comparable to the subtraction operator -. mul Autodiff’ed function comparable to the multiplication operator *. div Autodiff’ed function comparable to the division operator /. neg Autodiff’ed function comparable to the negation operator -. sum_ Autodiff’ed function comparable to sum. pow_ Autodiff’ed function comparable to np.power. abs_ Autodiff’ed function comparable to abs. stack Autodiff’ed function comparable to np.stack. minimum Autodiff’ed function comparable to np.minimum. maximum Autodiff’ed function comparable to np.maximum. cond Autodiff’ed function that performs a conditional (pred,trueval,falseval). cond3 Autodiff’ed function that performs a triple conditional (pred,trueval,zeroval,falseval).

Classes:

 The base class for a new auto-differentiable function. ADTerminal(name) A symbolic value that will be bound later. ADFunctionCall(func, args) A function call in an expression tree.

Functions:

 Creates a new variable symbol with name x. function(func[, name, argspec, outspec, …]) Declares and instantiates a ADFunctionInterface object that calls the Python function func. finite_differences(f, x[, h]) Performs forward differences to approximate the derivative of f at x. finite_differences_hessian(f, x, h) Performs forward differences to approximate the Hessian of f(x) w.r.t. finite_differences_hessian2(f, x, y, hx[, hy]) Performs forward differences to approximate the Hessian of f(x,y) w.r.t. check_derivatives(f, x[, h, rtol, atol]) Checks the derivatives of a ADFunctionCall or an ADFunctionInterface. eval_multiple(exprs, **kwargs) Given a list of expressions, and keyword arguments that set variable values, returns a list of the evaluations of the expressions. derivative_multiple(exprs, arg, **kwargs) Given a list of expressions, the argument to which you’d like to take the derivatives, and keyword arguments that set variable values, returns a list of the derivatives of the expressions. getitem(a, index) Creates an Autodiff’ed function comparable to the [] operator. setitem(x, index, v) Creates an Autodiff’ed function similar to x[index] = v, return x but without modifying x.
klampt.math.autodiff.ad.FINITE_DIFFERENCE_RES = 1e-06

The resolution used for finite differences.

class klampt.math.autodiff.ad.ADFunctionInterface[source]

Bases: object

The base class for a new auto-differentiable function.

The function has the form $$f(x_0,...,x_k)$$ with k+1 == self.n_args(). All inputs and outputs are either scalars or numpy ndarrays.

To define a new function, you need to implement n_args(), eval(*args), and optionally n_in(arg), n_out(), derivative(arg,*args), and jvp(arg,darg,*args).

A function class can be instantiated with non-numeric data. For example, to perform forward kinematics the call:

wp = WorldPosition(link,localPos)


(see kinematics_ad) creates a function wp(q) that will accept a robot configuration q and return the world position of the point localPos on link link.

This makes these function classes more expressive, but also leads to a slight annoyance in that to define a function that can be called, you need to instantiate a function. For exaple, the sin function is defined in math_ad as sin = ADSinFunction() where ADSinFunction is the actual class that implements the sin function. This construction lets you call math_ad.sin(var('x')) as though it were a normal function.

You can also call the convenience routine fn_name = function(func,'fn_name',argspec,outspec) to do just this. See the documentation of function() for more inf.

The general convention is that CapitalizedFunctions are function classes that must be instantiated, while lowercase_functions are already instantiated.

Methods:

 argname(arg) Returns a descriptive string for argument #arg Returns the number of arguments. n_in(arg) Returns the number of entries in argument #arg. Returns the number of entries in the output of the function. eval(*args) Evaluates the application of the function to the given (instantiated) arguments. derivative(arg, *args) Returns the Jacobian of the function w.r.t. gen_derivative(arg, *args) Generalized derivative that allows higher-order derivatives to be taken. jvp(arg, darg, *args) Performs a Jacobian-vector product for argument #arg.
argname(arg)[source]

Returns a descriptive string for argument #arg

n_args()[source]

Returns the number of arguments.

n_in(arg)[source]

Returns the number of entries in argument #arg. If 1, this can either be a 1-D vector or a scalar. If -1, the function can accept a variable sized argument.

n_out()[source]

Returns the number of entries in the output of the function. If -1, this can output a variable sized argument.

eval(*args)[source]

Evaluates the application of the function to the given (instantiated) arguments.

Parameters

args (list) – a list of arguments, which are either ndarrays or scalars.

derivative(arg, *args)[source]

Returns the Jacobian of the function w.r.t. argument #arg.

Parameters
• arg (int) – A value from 0,…,self.n_args()-1 indicating that we wish to take $$df/dx_{arg}$$.

• args (list of ndarrays) – arguments for the function.

Returns

A numpy array of shape (self.n_out(),self.n_in(arg)). Keep in mind that even if the argument or result is a scalar, this needs to be a 2D array.

If the derivative is not implemented, raise a NotImplementedError.

If the derivative is zero, can just return 0 (the integer) regardless of the size of the result.

Return type

ndarray

gen_derivative(arg, *args)[source]

Generalized derivative that allows higher-order derivatives to be taken.

Parameters
• arg (list) – Indicates the order of derivatives to be taken. For example, to take the 2nd derivative w.r.t. x0,x1, arg = [0,1] should be specified.

• args (list of ndarrays) – arguments for the function.

Returns

A tensor of shape (n_out,n_in(arg),..., n_in(arg[-1]).

If the generalized derivative is not implemented, raise a NotImplementedError.

If the generalized derivative is zero, can just return 0 (the integer) regardless of the size of the result.

Return type

ndarray

jvp(arg, darg, *args)[source]

Performs a Jacobian-vector product for argument #arg.

Parameters
• arg (int) – A value from 0,…,self.n_args()-1 indicating that we wish to calculate df/dx_arg * darg.

• darg (ndarray) – the derivative of x_arg w.r.t some other parameter. Must have darg.shape = (self.n_in(arg),).

• args (list of ndarrays) – arguments for the function.

Returns

A numpy array of length self.n_out()

If the derivative is not implemented, raise a NotImplementedError.

Return type

ndarray

class klampt.math.autodiff.ad.ADTerminal(name)[source]

Bases: object

A symbolic value that will be bound later.

Consider using var() rather than calling this explicitly.

class klampt.math.autodiff.ad.ADFunctionCall(func, args)[source]

Bases: object

A function call in an expression tree.

Methods:

 eval(**kwargs) Evaluate a function call with bound values given by the keyword arguments. Example::. derivative(arg, **kwargs) Evaluates the derivative of this computation graph with respect to the named variable at the given setings in kwargs. gen_derivative(args, **kwargs) Evaluates the generalized derivative of this computation graph with respect to the named variables in args, at the given settings in kwargs. replace(**kwargs) Replaces any symbols whose names are keys in kwargs with the matching value as an expression. terminals([sort]) Returns the list of terminals in this expression.
eval(**kwargs)[source]

Evaluate a function call with bound values given by the keyword arguments. Example:

print((3*var('x')**2 - 2*var('x')).eval(x=1.5))

derivative(arg, **kwargs)[source]

Evaluates the derivative of this computation graph with respect to the named variable at the given setings in kwargs.

If any derivatives are not defined, finite differences will be used with step size FINITE_DIFFERENCE_RES.

If the prior eval() call wasn’t with the same arguments, then this will call eval() again with kwargs. So, the sequence eval(**kwargs), derivative(**kwargs) will save some time compared to changing the args.

gen_derivative(args, **kwargs)[source]

Evaluates the generalized derivative of this computation graph with respect to the named variables in args, at the given settings in kwargs.

Examples:

• (var('x')**3).gen_derivative(['x','x']) returns the

second derivative of $$x^3$$ with respect to x.

• var('x')**2*var('y').gen_derivative(['x','y']) returns

$$\frac{d^2}{dx dy} x^2 y$$.

If args is [], this is equivalent to self.eval(**kwargs).

If args has only one element, this is equivalent to self.derivative(args,**kwargs).

Parameters
• args (list of str) – the denominators of the derivative to be evaluated.

• kwargs (dict) – the values at which the derivative is evaluated.

Result:

ndarray: A tensor of shape (_size(this),_size(args),...,_size(args[-1])) containing the derivatives.

replace(**kwargs)[source]

Replaces any symbols whose names are keys in kwargs with the matching value as an expression. The result is an ADFunctionCall.

For example, (var('x')**2 + var('y')).replace(x=var('z'),y=3) would return the expression var('z')**2 + 3.

terminals(sort=False)[source]

Returns the list of terminals in this expression. If sort=True, they are sorted by alphabetical order.

Returns

list of str

klampt.math.autodiff.ad.var(x)[source]

Creates a new variable symbol with name x.

klampt.math.autodiff.ad.function(func, name='auto', argspec='auto', outspec=None, argnames=None, derivative=None, jvp=None, gen_derivative=None, order=None)[source]

Declares and instantiates a ADFunctionInterface object that calls the Python function func.

Parameters
• func (callable) – a Python function.

• name (str, optional) – the string that the ADFunctionInterface prints.

• argspec (list, optional) – a list of argument sizes, or ‘auto’ to auto- set the argument list from the function signature.

• outspec (int, optional) – a size of the output.

• argnames

• derivative (callable or list of callables, optional) – one or more derivative functions. If a callable, takes the index-arguments tuple (arg,*args). If a list, each element just takes the same arguments as func.

• jvp (callable or list of callables, optional) – one or more Jacobian- vector product functions. If a single callable, has signature (arg,darg,*args). If a list, each element has the signature (darg,*args).

• gen_derivative (callable, list of callables, or dict, optional) –

a higher-order derivative function.

If a callable is given, it has standard signature (arg,*args).

If a list of callables is given, they are a list of higher-order derivatives, starting at the second derivative. In this case, the function may only have 1 argument, and each callable has the signature (*args).

If a dict is given, it must map tuples to callables, each of which has the signature (*args).

• order (int, optional) – if provided, we know that all derivatives higher than this order will be zero.

klampt.math.autodiff.ad.finite_differences(f, x, h=1e-06)[source]

Performs forward differences to approximate the derivative of f at x. f can be a vector or scalar function, and x can be a scalar or vector.

The result is a Numpy array of shape (_size(f(x)),_size(x)).

h is the step size.

klampt.math.autodiff.ad.finite_differences_hessian(f, x, h)[source]

Performs forward differences to approximate the Hessian of f(x) w.r.t. x.

klampt.math.autodiff.ad.finite_differences_hessian2(f, x, y, hx, hy=None)[source]

Performs forward differences to approximate the Hessian of f(x,y) w.r.t. x and y.

klampt.math.autodiff.ad.check_derivatives(f, x, h=1e-06, rtol=0.01, atol=0.001)[source]

Checks the derivatives of a ADFunctionCall or an ADFunctionInterface.

If f is an ADFunctionCall, the derivatives of all terminal values are checked, and x must be a dict matching terminal values to their settings. Example:

f = var('x')+var('y')**2+10
check_derivatives(f,{'x':3,'y':2})


If f is an ADFunctionInterface, x is list of the function’s arguments. All derivative and jvp functions are tested.

The derivatives must match the finite differences result by relative tolerance rtol and absolute tolerance atol (see np.allclose for details). Otherwise, an AssertionError is raised.

klampt.math.autodiff.ad.eval_multiple(exprs, **kwargs)[source]

Given a list of expressions, and keyword arguments that set variable values, returns a list of the evaluations of the expressions.

This can leverage common subexpressions in exprs to speed up running times compared to multiple eval() calls.

klampt.math.autodiff.ad.derivative_multiple(exprs, arg, **kwargs)[source]

Given a list of expressions, the argument to which you’d like to take the derivatives, and keyword arguments that set variable values, returns a list of the derivatives of the expressions.

This can leverage common subexpressions in exprs to speed up running times compared to multiple derivative() calls.

klampt.math.autodiff.ad.add(*args) = <klampt.math.autodiff.ad._ADAdd object>

Autodiff’ed function comparable to the addition operator +. Applied automatically when the + operator is called on an ADTerminal or an ADFunctionCall.

klampt.math.autodiff.ad.sub(*args) = <klampt.math.autodiff.ad._ADSub object>

Autodiff’ed function comparable to the subtraction operator -. Applied automatically when the - operator is called on an ADTerminal or an ADFunctionCall.

klampt.math.autodiff.ad.mul(*args) = <klampt.math.autodiff.ad._ADMul object>

Autodiff’ed function comparable to the multiplication operator *. Applied automatically when the * operator is called on an ADTerminal or an ADFunctionCall. Multiplication of vectors is performed element-wise.

klampt.math.autodiff.ad.div(*args) = <klampt.math.autodiff.ad._ADDiv object>

Autodiff’ed function comparable to the division operator /. Applied automatically when the / operator is called on an ADTerminal or an ADFunctionCall. Divison of vectors is performed element-wise.

klampt.math.autodiff.ad.neg(*args) = <klampt.math.autodiff.ad._ADNeg object>

Autodiff’ed function comparable to the negation operator -. Applied automatically when the - operator is called on an ADTerminal or an ADFunctionCall.

klampt.math.autodiff.ad.sum_(*args) = <klampt.math.autodiff.ad._ADSum object>

Autodiff’ed function comparable to sum. It acts like np.sum if only one item is provided. Otherwise, it acts like the builtin sum.

klampt.math.autodiff.ad.pow_(*args) = <klampt.math.autodiff.ad._ADPow object>

Autodiff’ed function comparable to np.power. Applied automatically when the ** operator is called on an ADTerminal or an ADFunctionCall.

klampt.math.autodiff.ad.abs_(*args) = <klampt.math.autodiff.ad._ADAbs object>

Autodiff’ed function comparable to abs. Applied automatically when abs is called on an ADTerminal or an ADFunctionCall.

klampt.math.autodiff.ad.stack(*args) = <klampt.math.autodiff.ad._ADStack object>

Autodiff’ed function comparable to np.stack.

klampt.math.autodiff.ad.getitem(a, index)[source]

Creates an Autodiff’ed function comparable to the [] operator. Note that the index argument cannot be a variable.

klampt.math.autodiff.ad.setitem(x, index, v)[source]

Creates an Autodiff’ed function similar to x[index] = v, return x but without modifying x. The index argument cannot be a variable.

klampt.math.autodiff.ad.minimum(*args) = <klampt.math.autodiff.ad._ADMinimum object>

Autodiff’ed function comparable to np.minimum. It acts like ndarray.min if only one item is provided, and otherwise it acts like np.minimum. If more than 2 items are provided, they are applied elementwise and sequentially.

klampt.math.autodiff.ad.maximum(*args) = <klampt.math.autodiff.ad._ADMaximum object>

Autodiff’ed function comparable to np.maximum. It acts like ndarray.max if only one item is provided, and otherwise it acts like np.maximum. If more than 2 items are provided, they are applied elementwise and sequentially.

klampt.math.autodiff.ad.cond(*args) = <klampt.math.autodiff.ad._ADCond object>

Autodiff’ed function that performs a conditional (pred,trueval,falseval). It returns trueval if pred > 0 and falseval otherwise. If pred is an array, then it must have the same size as trueval and falseval, and performs the conditioning element-wise.

klampt.math.autodiff.ad.cond3`(*args) = <klampt.math.autodiff.ad._ADCond3 object>

Autodiff’ed function that performs a triple conditional (pred,trueval,zeroval,falseval). It returns trueval if pred > 0, zeroval if pred=0, and falseval otherwise. If pred is an array, then it must have the same size as trueval, zeroval, and falseval, and performs the conditioning element-wise.