Exp Will Solve All Your Problems

October 24, 2024

Towards the end of my previous article, I made the wild claim that $e^{\frac{\mathrm d}{\mathrm dt}}f(t) = f(t + 1)$, or in prose, that the exponential of the derivative operator is the shift operator. If you're interested in knowing why that's the case, I recommend giving it a read. But in short:

The differential equation $y' = Ky$, which says "at position $y$, move in the direction $Ky$", has the solution $y(t) = e^{Kt}y(0)$. So $e^{\frac{\mathrm d}{\mathrm dt}}f(t)$ tells us "starting at $f(t)$, move in the direction $\frac{\mathrm df}{\mathrm dt}$ for one second", thus landing us at $f(t + 1)$.

More generally, we have that $e^{a\frac{\mathrm d}{\mathrm dt}} = T^a$, where $T^a$ shifts a function over by $a$ units. All of this can be verified using Taylor series, as demonstrated here (and sanity-checked just below), although I personally don't find such an algebraic proof convincing. Instead, I think both you and I would prefer to see the applications of this identity. And surprisingly, they're quite plentiful. In this article, we'll explore applications in

  1. Dynamical systems
  2. Fourier analysis
  3. Partial differential equations

which, hopefully, you'll find convincing enough!
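Before diving in, here's a quick numerical sanity check of the identity itself: a minimal sympy sketch (my own, assuming sympy is available) that applies the truncated Taylor series $e^{\frac{\mathrm d}{\mathrm dt}}f = \sum_k \frac{f^{(k)}}{k!}$ to a polynomial. Since the series terminates for polynomials, the agreement is exact.

```python
import sympy as sp

t = sp.symbols("t")
f = t**3 - 2*t + 5  # any polynomial works: its Taylor series terminates

# Apply exp(d/dt) as the series sum_k f^(k)(t) / k!
shifted = sum(sp.diff(f, t, k) / sp.factorial(k) for k in range(5))

print(sp.expand(shifted))           # t**3 + 3*t**2 + t + 4
print(sp.expand(f.subs(t, t + 1)))  # the same polynomial: f(t + 1)
```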

1. Dynamical Systems

Any dynamical systems course will cover linear dynamical systems, which come in two flavors: continuous and discrete. Continuous linear systems have the form

$$\frac{\mathrm d}{\mathrm dt}y = Ky$$

for $y \in \mathbb R^n$ and a square matrix $K \in \mathbb R^{n \times n}$, whereas discrete linear systems have the form

$$y(t + 1) = My(t)$$

for a similar vector $y$ and matrix $M$. It would seem that these two types of systems have something in common, but it's hard to say exactly how they're related. But if we turn to the exponential, all becomes clear. In the continuous case, our system is telling us that $\frac{\mathrm d}{\mathrm dt}$ and $K$ do the same thing to $y$, so it stands to reason that $e^{\frac{\mathrm d}{\mathrm dt}}$ and $e^K$ would also do the same thing to $y$, that is,

$$e^{\frac{\mathrm d}{\mathrm dt}}y = e^Ky$$

But we already know what $e^{\frac{\mathrm d}{\mathrm dt}}$ does: it's the shift operator! So we can rewrite the above equation as

$$y(t + 1) = e^Ky(t)$$

and if we let $M = e^K$, we get the exact form of a discrete system. Like magic, the exponential has turned a continuous object into a discrete one.
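To make that concrete, here's a small numerical check (a sketch of my own, assuming numpy and scipy): stepping the discrete system with $M = e^K$ lands exactly on the continuous solution $y(t) = e^{Kt}y(0)$ at integer times.

```python
import numpy as np
from scipy.linalg import expm

K = np.array([[-0.5,  1.0],
              [-1.0, -0.5]])  # an arbitrary stable K
y0 = np.array([1.0, 0.0])

M = expm(K)  # the matrix of the equivalent discrete system

# Continuous solution at t = 3 ...
y_continuous = expm(3 * K) @ y0
# ... versus three steps of the discrete system
y_discrete = M @ M @ M @ y0

print(np.allclose(y_continuous, y_discrete))  # True
```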

But there's more! After introducing linear dynamical systems, stability conditions are usually discussed. A continuous linear system is (asymptotically) stable precisely when the eigenvalues of $K$ all have a negative real part. A discrete linear system is stable precisely when the eigenvalues of $M$ all have a magnitude less than $1$.

[Figure: the two stability regions, the left half-plane and the unit disc]

It's common for students to be frustrated by these seemingly unrelated conditions, but our exponential analysis reveals the connection: if a complex number $z$ has a negative real part, then the magnitude of $e^z$ is less than one, since $|e^z| = e^{\operatorname{Re} z} < 1$. In other words, the exponential of the left half-plane is the unit disc. When we used the exponential to go from continuous to discrete, we brought the eigenvalues along with us!
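We can watch the eigenvalues make that trip (another sketch of mine, assuming numpy and scipy):

```python
import numpy as np
from scipy.linalg import expm

K = np.array([[-1.0,  2.0],
              [-2.0, -1.0]])  # eigenvalues -1 ± 2i, in the left half-plane

lam = np.linalg.eigvals(K)       # continuous eigenvalues
mu = np.linalg.eigvals(expm(K))  # discrete eigenvalues

print(np.exp(lam))               # the same set of values as mu below
print(mu)                        # eigenvalues of e^K are exp of eigenvalues of K
print(np.all(np.abs(mu) < 1))    # True: they landed inside the unit disc
```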

More generally, $\exp$ brings us from the continuous world to the discrete world. $\frac{\mathrm d}{\mathrm dt}f$ tells us how to move "a little bit", while $Tf$ tells us how to move a discrete amount of time. It's a lot like integration in that respect.

2. Fourier Analysis

The Fourier Transform

The Fourier transform $\hat f$ of a function $f$, defined as

$$\hat f(\omega) = \int_{-\infty}^\infty e^{-i\omega t}f(t)\,\mathrm dt$$

says how to represent a function as a sum/integral of complex exponentials. More specifically, $\hat f(\omega)$ indicates how strongly the frequency $\omega$ contributes to $f$. The inverse relationship is

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^\infty e^{i\omega t}\hat f(\omega)\,\mathrm d\omega$$

which says that $f$ is an infinite linear combination of exponentials, weighted by $\hat f$ (the $\frac{1}{2\pi}$ is just the normalization this convention requires).

These exponentials are "eigenfunctions" of differentiation, that is, they simply get scaled when you differentiate them.

$$\frac{\mathrm d}{\mathrm dt}e^{i\omega t} = i\omega e^{i\omega t}$$

so it's no surprise that the Fourier transform, which writes $f$ as a linear combination of these exponentials, turns differentiation into the same frequency-by-frequency scaling:

$$\widehat{f'}(\omega) = i\omega\hat f(\omega)$$

This property has many names, but we'll call it the differentiation property of the Fourier transform. The differentiation property is so easy to prove that it's rarely called a theorem; it's treated as just a basic fact about the transform. But there's another, slightly more subtle result called the shift theorem:

$$\widehat{T^af}(\omega) = e^{i\omega a}\hat f(\omega)$$

which says that the Fourier transform of a function shifted by $a$ units is the same as its transform multiplied by $e^{i\omega a}$. If you're comfortable with Fourier analysis, this should be intuitive, since $e^{i\omega a}$ represents a phase shift by $a$ at frequency $\omega$. But of course, there's another way to view this; let's take another look at the differentiation property. Essentially, it tells us that, at a given frequency $\omega$, differentiating is the same as multiplying by $i\omega$. So, like last time, it stands to reason that $e^{a\frac{\mathrm d}{\mathrm dt}} = e^{i\omega a}$, i.e. $T^a = e^{i\omega a}$. With respect to the Fourier transform, the shift $T^a$ and the exponential $e^{i\omega a}$ are exactly the same operation.
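The discrete analogue is easy to test with the FFT (a little sketch of mine, assuming numpy; `np.fft.fft` uses the frequencies $\omega_k = 2\pi k/N$, and `np.roll` with a negative shift plays the role of $T^a$):

```python
import numpy as np

N, a = 64, 5
rng = np.random.default_rng(1)
x = rng.standard_normal(N)

# T^a x: the signal shifted over by a samples (periodically)
shifted = np.roll(x, -a)  # shifted[n] == x[(n + a) % N]

omega = 2 * np.pi * np.arange(N) / N  # DFT frequencies

lhs = np.fft.fft(shifted)
rhs = np.exp(1j * omega * a) * np.fft.fft(x)

print(np.allclose(lhs, rhs))  # True: shifting multiplies the spectrum by e^{i omega a}
```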

3. Partial Differential Equations

But maybe even the previous material isn't applied enough for you. You want to see some numerical, scientific computing done with $e^{\frac{\mathrm d}{\mathrm dt}}$. In that case, you're probably familiar with the discrete derivative matrix, but let's review it anyway!

Suppose we have a function $u$ sampled at $u_0, u_1, u_2$ (we write $u_x = u(x)$ to ease notation), and we want to approximate the derivative at $u_1$. Is there a "best" way to do this? As it turns out, yes! The central difference $\frac{u_2 - u_0}{2}$ gives us the best answer for the derivative at $u_1$. Try it on a line $u(x) = mx + b$: you recover the slope $m$ exactly. If we instead have a vector of samples

$$u = \begin{pmatrix}u_0 \\ u_1 \\ u_2 \\ \vdots \\ u_N\end{pmatrix}$$

we can compress our so-called "discrete derivative" into a matrix $D$

$$D = \begin{pmatrix} 0 & \frac12 & 0 & \cdots & -\frac12 \\ -\frac12 & 0 & \frac12 & & \\ 0 & -\frac12 & 0 & \ddots & \\ \vdots & & \ddots & \ddots & \frac12 \\ \frac12 & & & -\frac12 & 0 \end{pmatrix}$$

so that, as we hope, $Du$ gives us our estimated values for $u'(x)$:

$$Du = \begin{pmatrix}u'_0 \\ u'_1 \\ u'_2 \\ \vdots \\ u'_N\end{pmatrix}$$

We should expect that the derivative can be encapsulated by a matrix, since it's a linear operator and all finite-dimensional linear operators can be represented by matrices.
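To make this concrete, here's one way to build $D$ and check it (a numpy sketch of my own, generalized to an arbitrary sample spacing $h$; the wrap-around construction assumes the periodic boundary conditions discussed in the footnote):

```python
import numpy as np

N = 100
x = 2 * np.pi * np.arange(N) / N  # N evenly spaced samples on [0, 2*pi)
h = x[1] - x[0]                   # sample spacing

# Central-difference matrix with periodic (wrap-around) boundaries:
# (Du)_i = (u_{i+1} - u_{i-1}) / (2h)
I = np.eye(N)
D = (np.roll(I, -1, axis=0) - np.roll(I, 1, axis=0)) / (2 * h)

u = np.sin(x)
print(np.max(np.abs(D @ u - np.cos(x))))  # ~1e-3: a good derivative estimate
```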

Now, onto the PDEs! The first PDE you study in depth is typically the 1D heat equation

$$\frac{\partial}{\partial t}u = \frac{\partial^2}{\partial x^2}u$$

where $u(x, t)$ equals the temperature at position $x$ at time $t$. When we're solving this computationally, we usually have a starting vector $u(x, 0)$ of temperatures, and we use the heat equation to advance it in time. At this point in the article, you're probably already thinking what I'm thinking: if we view the equation as

$$\frac{\mathrm d}{\mathrm dt}u = D^2u$$

then we transform it into a first order linear ODE, which has the exact solution

$$u(x, t) = e^{D^2t}u(x, 0)$$

And we've done it again: we've exponentiated the derivative operator! Except now that we've discretized it, we can get numerical results. Here's what the evolution looks like, starting with randomly initialized data.[1]

The code for the above simulation is available here. Heat simulations are nothing new, but what's fascinating here is that this exponential method gives a formula for seeing any point in the future. Most simulations require you to slowly advance time to the point you're interested in. With this method, if we want to know how the simulation looks in exactly 85 seconds, we can see it with a single computation.
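Here's a minimal sketch of that one-shot computation (my own illustration, not the linked code, assuming numpy and scipy and reusing the periodic $D$ from before):

```python
import numpy as np
from scipy.linalg import expm

N = 100
x = 2 * np.pi * np.arange(N) / N
h = x[1] - x[0]

I = np.eye(N)
D = (np.roll(I, -1, axis=0) - np.roll(I, 1, axis=0)) / (2 * h)

rng = np.random.default_rng(0)
u0 = rng.standard_normal(N)   # random initial temperatures

t = 85.0
u_t = expm(t * (D @ D)) @ u0  # u(x, t) = e^{D^2 t} u(x, 0), no time-stepping
```

Changing `t` costs exactly one more matrix exponential; no intermediate timesteps are ever computed.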

We can easily do the same thing with the wave equation, which has the form

$$\frac{\partial^2}{\partial t^2}u = \frac{\partial^2}{\partial x^2}u$$

Since the time derivative is second-order, we'll need to transform this into a first-order equation, which is pretty straightforward. Simply let $v = \frac{\mathrm du}{\mathrm dt}$, so we have

$$\frac{\mathrm d}{\mathrm dt}\begin{pmatrix}u \\ v\end{pmatrix} = \begin{pmatrix}0 & I \\ D^2 & 0\end{pmatrix}\begin{pmatrix}u \\ v\end{pmatrix}$$

Note that we've replaced $\frac{\partial^2}{\partial x^2}$ with $D^2$ like last time. This gives us the explicit solution

$$\begin{pmatrix}u(x, t) \\ v(x, t)\end{pmatrix} = \exp\begin{pmatrix}0 & It \\ D^2t & 0\end{pmatrix}\begin{pmatrix}u(x, 0) \\ v(x, 0)\end{pmatrix}$$

Here's a visualization:

The code for the above simulation is available here. This method also has the advantage of being timestep invariant: most simulations lose accuracy if you advance time in steps that are too large, but this one is exact in time no matter how far you jump (the only error left comes from the spatial discretization). It's exciting what exponentials can do for us!
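And here's the wave-equation version of the earlier sketch (again my own illustration, not the linked code; the block matrix is assembled with `np.block`):

```python
import numpy as np
from scipy.linalg import expm

N = 100
x = 2 * np.pi * np.arange(N) / N
h = x[1] - x[0]

I = np.eye(N)
D = (np.roll(I, -1, axis=0) - np.roll(I, 1, axis=0)) / (2 * h)

# First-order system: d/dt (u, v) = A (u, v) with A = [[0, I], [D^2, 0]]
Z = np.zeros((N, N))
A = np.block([[Z, I],
              [D @ D, Z]])

u0 = np.exp(-10 * (x - np.pi) ** 2)  # initial displacement: a bump
v0 = np.zeros(N)                     # starting at rest

t = 85.0
state = expm(t * A) @ np.concatenate([u0, v0])
u_t, v_t = state[:N], state[N:]      # displacement and velocity at time t
```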

  1. This simulation has periodic boundary conditions, which is captured by the fact that the entries of $D$ wrap around at the edges. You can use the forward- and backward-difference formulas if you want different boundary conditions.