Exp Will Solve All Your Problems

October 24, 2024

Towards the end of my previous article, I made the wild claim that $e^{\frac{\mathrm d}{\mathrm dt}}f(t) = f(t + 1)$, or in prose, that the exponential of the derivative operator is the shift operator. If you're interested in knowing why that's the case, I recommend giving it a read. But in short:

The differential equation $y' = Ky$, which says "at position $y$, move in the direction $Ky$", has the solution $y(t) = e^{Kt}y(0)$. So $e^{\frac{\mathrm d}{\mathrm dt}}f(t)$ tells us "starting at $f(t)$, move in the direction $\frac{\mathrm df}{\mathrm dt}$ for one second", thus landing us at $f(t + 1)$.

More generally, we have that $e^{a\frac{\mathrm d}{\mathrm dt}} = T^a$, where $T^a$ shifts a function over by $a$ units. All of this can be verified using Taylor series, as demonstrated here (and sanity-checked just below), although I personally don't find such an algebraic proof convincing. Instead, I think both you and I would prefer to see the applications of this identity. And surprisingly, they're quite plentiful. In this article, we'll explore applications in

  1. Dynamical systems
  2. Fourier analysis
  3. Partial differential equations

which, hopefully, you'll find convincing enough!
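Before diving in, here's a quick numerical sanity check of the identity itself: a minimal sympy sketch (my own, assuming sympy is available) that applies the truncated Taylor series $e^{\frac{\mathrm d}{\mathrm dt}}f = \sum_k \frac{f^{(k)}}{k!}$ to a polynomial. Since the series terminates for polynomials, the agreement is exact.

```python
import sympy as sp

t = sp.symbols("t")
f = t**3 - 2*t + 5  # any polynomial works: its Taylor series terminates

# Apply exp(d/dt) as the series sum_k f^(k)(t) / k!
shifted = sum(sp.diff(f, t, k) / sp.factorial(k) for k in range(5))

print(sp.expand(shifted))           # t**3 + 3*t**2 + t + 4
print(sp.expand(f.subs(t, t + 1)))  # the same polynomial: f(t + 1)
```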

1. Dynamical Systems

Any dynamical systems course will cover linear dynamical systems, which come in two flavors: continuous and discrete. Continuous linear systems have the form

$$\frac{\mathrm d}{\mathrm dt}y = Ky$$

for $y \in \mathbb R^n$ and a square matrix $K \in \mathbb R^{n \times n}$, whereas discrete linear systems have the form

$$y(t + 1) = My(t)$$

for a similar vector $y$ and matrix $M$. It would seem that these two types of systems have something in common, but it's hard to say exactly how they're related. But if we turn to the exponential, all becomes clear. In the continuous case, our system is telling us that $\frac{\mathrm d}{\mathrm dt}$ and $K$ do the same thing to $y$, so it stands to reason that $e^{\frac{\mathrm d}{\mathrm dt}}$ and $e^K$ would also do the same thing to $y$, that is,

$$e^{\frac{\mathrm d}{\mathrm dt}}y = e^Ky$$

But we already know what $e^{\frac{\mathrm d}{\mathrm dt}}$ does: it's the shift operator! So we can rewrite the above equation as

$$y(t + 1) = e^Ky(t)$$

and if we let $M = e^K$, we get the exact form of a discrete system. Like magic, the exponential has turned a continuous object into a discrete one.
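To make that concrete, here's a small numerical check (a sketch of my own, assuming numpy and scipy): stepping the discrete system with $M = e^K$ lands exactly on the continuous solution $y(t) = e^{Kt}y(0)$ at integer times.

```python
import numpy as np
from scipy.linalg import expm

K = np.array([[-0.5,  1.0],
              [-1.0, -0.5]])  # an arbitrary stable K
y0 = np.array([1.0, 0.0])

M = expm(K)  # the matrix of the equivalent discrete system

# Continuous solution at t = 3 ...
y_continuous = expm(3 * K) @ y0
# ... versus three steps of the discrete system
y_discrete = M @ M @ M @ y0

print(np.allclose(y_continuous, y_discrete))  # True
```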

But there's more! After introducing linear dynamical systems, stability conditions are usually discussed. A continuous linear system is (asymptotically) stable precisely when the eigenvalues of $K$ all have a negative real part. A discrete linear system is stable precisely when the eigenvalues of $M$ all have a magnitude less than $1$.

[Figure: the two stability regions, the left half-plane and the unit disc]

It's common for students to be frustrated by these seemingly unrelated conditions, but our exponential analysis reveals the connection: if a complex number $z$ has a negative real part, then the magnitude of $e^z$ is less than one, since $|e^z| = e^{\operatorname{Re} z} < 1$. In other words, the exponential of the left half-plane is the unit disc. When we used the exponential to go from continuous to discrete, we brought the eigenvalues along with us!
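We can watch the eigenvalues make that trip (another sketch of mine, assuming numpy and scipy):

```python
import numpy as np
from scipy.linalg import expm

K = np.array([[-1.0,  2.0],
              [-2.0, -1.0]])  # eigenvalues -1 ± 2i, in the left half-plane

lam = np.linalg.eigvals(K)       # continuous eigenvalues
mu = np.linalg.eigvals(expm(K))  # discrete eigenvalues

print(np.exp(lam))               # the same set of values as mu below
print(mu)                        # eigenvalues of e^K are exp of eigenvalues of K
print(np.all(np.abs(mu) < 1))    # True: they landed inside the unit disc
```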

More generally, $\exp$ brings us from the continuous world to the discrete world. $\frac{\mathrm d}{\mathrm dt}f$ tells us how to move "a little bit", while $Tf$ tells us how to move a discrete amount of time. It's a lot like integration in that respect.

2. Fourier Analysis

The Fourier Transform

The Fourier transform $\hat f$ of a function $f$, defined as

$$\hat f(\omega) = \int_{-\infty}^\infty e^{-i\omega t}f(t)\,\mathrm dt$$

says how to represent a function as a sum/integral of complex exponentials. More specifically, $\hat f(\omega)$ indicates how strongly the frequency $\omega$ contributes to $f$. The inverse relationship is

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^\infty e^{i\omega t}\hat f(\omega)\,\mathrm d\omega$$

which says that $f$ is an infinite linear combination of exponentials, weighted by $\hat f$ (the $\frac{1}{2\pi}$ is just the normalization this convention requires).

These exponentials are "eigenfunctions" of differentiation, that is, they simply get scaled when you differentiate them.

$$\frac{\mathrm d}{\mathrm dt}e^{i\omega t} = i\omega e^{i\omega t}$$

so it's no surprise that the Fourier transform, which writes $f$ as a linear combination of these exponentials, turns differentiation into the same frequency-by-frequency scaling:

$$\widehat{f'}(\omega) = i\omega\hat f(\omega)$$

This property has many names, but we'll call it the differentiation property of the Fourier transform. The differentiation property is so easy to prove that it's rarely called a theorem; it's treated as just a basic fact about the transform. But there's another, slightly more subtle result called the shift theorem:

$$\widehat{T^af}(\omega) = e^{i\omega a}\hat f(\omega)$$

which says that the Fourier transform of a function shifted by $a$ units is the same as its transform multiplied by $e^{i\omega a}$. If you're comfortable with Fourier analysis, this should be intuitive, since $e^{i\omega a}$ represents a phase shift by $a$ at frequency $\omega$. But of course, there's another way to view this; let's take another look at the differentiation property. Essentially, it tells us that, at a given frequency $\omega$, differentiating is the same as multiplying by $i\omega$. So, like last time, it stands to reason that $e^{a\frac{\mathrm d}{\mathrm dt}} = e^{i\omega a}$, i.e. $T^a = e^{i\omega a}$. With respect to the Fourier transform, the shift $T^a$ and the exponential $e^{i\omega a}$ are exactly the same operation.
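The discrete analogue is easy to test with the FFT (a little sketch of mine, assuming numpy; `np.fft.fft` uses the frequencies $\omega_k = 2\pi k/N$, and `np.roll` with a negative shift plays the role of $T^a$):

```python
import numpy as np

N, a = 64, 5
rng = np.random.default_rng(1)
x = rng.standard_normal(N)

# T^a x: the signal shifted over by a samples (periodically)
shifted = np.roll(x, -a)  # shifted[n] == x[(n + a) % N]

omega = 2 * np.pi * np.arange(N) / N  # DFT frequencies

lhs = np.fft.fft(shifted)
rhs = np.exp(1j * omega * a) * np.fft.fft(x)

print(np.allclose(lhs, rhs))  # True: shifting multiplies the spectrum by e^{i omega a}
```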

3. Partial Differential Equations

But maybe even the previous material isn't applied enough for you. You want to see some numerical, scientific computing done with $e^{\frac{\mathrm d}{\mathrm dt}}$. In that case, you're probably familiar with the discrete derivative matrix, but let's review it anyway!

Suppose we have a function $u$ sampled at $u_0, u_1, u_2$ (we write $u_x = u(x)$ to ease notation), and we want to approximate the derivative at $u_1$. Is there a "best" way to do this? As it turns out, yes! The central difference $\frac{u_2 - u_0}{2}$ gives us the best answer for the derivative at $u_1$. Try it on a line $u(x) = mx + b$: you recover the slope $m$ exactly. If we instead have a vector of samples

$$u = \begin{pmatrix}u_0 \\ u_1 \\ u_2 \\ \vdots \\ u_N\end{pmatrix}$$

we can compress our so-called "discrete derivative" into a matrix $D$

$$D = \begin{pmatrix} 0 & \frac12 & 0 & \cdots & -\frac12 \\ -\frac12 & 0 & \frac12 & & \\ 0 & -\frac12 & 0 & \ddots & \\ \vdots & & \ddots & \ddots & \frac12 \\ \frac12 & & & -\frac12 & 0 \end{pmatrix}$$

so that, as we hope, $Du$ gives us our estimated values for $u'(x)$:

$$Du = \begin{pmatrix}u'_0 \\ u'_1 \\ u'_2 \\ \vdots \\ u'_N\end{pmatrix}$$

We should expect that the derivative can be encapsulated by a matrix, since it's a linear operator and all finite-dimensional linear operators can be represented by matrices.
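To make this concrete, here's one way to build $D$ and check it (a numpy sketch of my own, generalized to an arbitrary sample spacing $h$; the wrap-around construction assumes the periodic boundary conditions discussed in the footnote):

```python
import numpy as np

N = 100
x = 2 * np.pi * np.arange(N) / N  # N evenly spaced samples on [0, 2*pi)
h = x[1] - x[0]                   # sample spacing

# Central-difference matrix with periodic (wrap-around) boundaries:
# (Du)_i = (u_{i+1} - u_{i-1}) / (2h)
I = np.eye(N)
D = (np.roll(I, -1, axis=0) - np.roll(I, 1, axis=0)) / (2 * h)

u = np.sin(x)
print(np.max(np.abs(D @ u - np.cos(x))))  # ~1e-3: a good derivative estimate
```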

Now, onto the PDEs! The first PDE you study in depth is typically the 1D heat equation

$$\frac{\partial}{\partial t}u = \frac{\partial^2}{\partial x^2}u$$

where $u(x, t)$ equals the temperature at position $x$ at time $t$. When we're solving this computationally, we usually have a starting vector $u(x, 0)$ of temperatures, and we use the heat equation to advance it in time. At this point in the article, you're probably already thinking what I'm thinking: if we view the equation as

$$\frac{\mathrm d}{\mathrm dt}u = D^2u$$

then we transform it into a first order linear ODE, which has the exact solution

$$u(x, t) = e^{D^2t}u(x, 0)$$

And we've done it again: we've exponentiated the derivative operator! Except now that we've discretized it, we can get numerical results. Here's what the evolution looks like, starting with randomly initialized data.[1]

The code for the above simulation is available here. Heat simulations are nothing new, but what's fascinating here is that this exponential method gives a formula for seeing any point in the future. Most simulations require you to slowly advance time to the point you're interested in. With this method, if we want to know how the simulation looks in exactly 85 seconds, we can see it with a single computation.
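Here's a minimal sketch of that one-shot computation (my own illustration, not the linked code, assuming numpy and scipy and reusing the periodic $D$ from before):

```python
import numpy as np
from scipy.linalg import expm

N = 100
x = 2 * np.pi * np.arange(N) / N
h = x[1] - x[0]

I = np.eye(N)
D = (np.roll(I, -1, axis=0) - np.roll(I, 1, axis=0)) / (2 * h)

rng = np.random.default_rng(0)
u0 = rng.standard_normal(N)   # random initial temperatures

t = 85.0
u_t = expm(t * (D @ D)) @ u0  # u(x, t) = e^{D^2 t} u(x, 0), no time-stepping
```

Changing `t` costs exactly one more matrix exponential; no intermediate timesteps are ever computed.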

We can easily do the same thing with the wave equation, which has the form

$$\frac{\partial^2}{\partial t^2}u = \frac{\partial^2}{\partial x^2}u$$

Since the time derivative is second-order, we'll need to transform this into a first-order equation, which is pretty straightforward. Simply let $v = \frac{\mathrm du}{\mathrm dt}$, so we have

$$\frac{\mathrm d}{\mathrm dt}\begin{pmatrix}u \\ v\end{pmatrix} = \begin{pmatrix}0 & I \\ D^2 & 0\end{pmatrix}\begin{pmatrix}u \\ v\end{pmatrix}$$

Note that we've replaced $\frac{\partial^2}{\partial x^2}$ with $D^2$ like last time. This gives us the explicit solution

$$\begin{pmatrix}u(x, t) \\ v(x, t)\end{pmatrix} = \exp\begin{pmatrix}0 & It \\ D^2t & 0\end{pmatrix}\begin{pmatrix}u(x, 0) \\ v(x, 0)\end{pmatrix}$$

Here's a visualization:

The code for the above simulation is available here. This method also has the advantage of being timestep invariant: most simulations lose accuracy if you advance time in steps that are too large, but this one is exact in time no matter how far you jump (the only error left comes from the spatial discretization). It's exciting what exponentials can do for us!
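And here's the wave-equation version of the earlier sketch (again my own illustration, not the linked code; the block matrix is assembled with `np.block`):

```python
import numpy as np
from scipy.linalg import expm

N = 100
x = 2 * np.pi * np.arange(N) / N
h = x[1] - x[0]

I = np.eye(N)
D = (np.roll(I, -1, axis=0) - np.roll(I, 1, axis=0)) / (2 * h)

# First-order system: d/dt (u, v) = A (u, v) with A = [[0, I], [D^2, 0]]
Z = np.zeros((N, N))
A = np.block([[Z, I],
              [D @ D, Z]])

u0 = np.exp(-10 * (x - np.pi) ** 2)  # initial displacement: a bump
v0 = np.zeros(N)                     # starting at rest

t = 85.0
state = expm(t * A) @ np.concatenate([u0, v0])
u_t, v_t = state[:N], state[N:]      # displacement and velocity at time t
```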

  1. This simulation has periodic boundary conditions, which is captured by the fact that the entries of $D$ wrap around at the edges. You can use the forward- and backward-difference formulas if you want different boundary conditions.