Estimate derivatives by simply passing in a complex number to your function!
Recall, the centered-difference approximation is a fairly accurate method for
approximating derivatives of a univariate function \(f\), which only requires two
function evaluations. A similar derivation, based on the Taylor series expansion
with a complex perturbation, gives us a similarly-accurate approximation with a
single (complex) function evaluation instead of two (real-valued) function
evaluations. Note: \(f\) must support complex inputs (in frameworks, such as numpy
or matlab, this often requires no modification to source code).
This post is based on Martins+'03.
Derivation: Start with the Taylor series approximation:
Take the imaginary part of both sides and solve for \(f'(x)\). Note: the \(f\) and
\(f''\) term disappear because \(i^0\) and \(i^2\) are real-valued.
As usual, using a small \(\varepsilon\) let's us throw out higher-order
terms. And, we arrive at the following approximation:
If instead, we take the real part and solve for \(f(x)\), we get an approximation
to the function's value at \(x\):
In other words, a single (complex) function evaluations computes both the
function's value and the derivative.
Code:
def complex_step(f, eps=1e-10):
"""
Higher-order function takes univariate function which computes a value and
returns a function which returns value-derivative pair approximation.
"""
def f1(x):
y = f(complex(x, eps)) # convert input to complex number
return (y.real, y.imag / eps) # return function value and gradient
return f1
A simple test:
f = lambda x: exp(x)+cos(x)+10 # function
g = lambda x: exp(x)-sin(x) # gradient
x = 1.0
print (f(x), g(x))
print complex_step(f)(x)
Other comments
-
Using the complex-step method to estimate the gradients of multivariate functions requires independent approximations for each dimension of the input.
-
Although the complex-step approximation only requires a single function evaluation, it's unlikely faster than performing two function evaluations because operations on complex numbers are generally much slower than on floats or doubles.
Code: Check out the gist for this post.