In addition to further articles on image processing, I plan to write a number of articles on *signal processing*. Of course, image processing *is* signal processing, but I’ll put one-dimensional signals such as audio and radio in a separate category, since they can be quite different.

To kick off the series, I’ll start with a very practical problem: How to find out how a lion sounds in a concert hall. I’ll deal with more theoretical subjects in later articles. Smuggling an actual lion unseen into a concert hall seems unlikely to succeed, and might be dangerous. The question is, can you capture the “sound” of a concert hall in some way, and combine it with the sound of the lion? It turns out that you can. Below is the sound of the lion.

**Lion Growl**

Now, I know that you can just use any standard audio software and tell it to add a *large concert hall* effect or something, but that doesn’t tell you how it’s done, which is where I come in.

## Impulse Response

The concept you need to compute how this lion would sound in a given concert hall is the *impulse response* of that concert hall. Impulse response is a concept from signal processing (including image processing, where it is often called the *point spread function*). A *linear, time-invariant* (LTI) system is characterized completely by its impulse response. This is quite technical, but assuming that your system is LTI makes your “signal processing life” a lot easier. Practical systems are often assumed to be LTI, even though this is never exactly true. I’ll explain all this in more detail in a future article.
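To make “linear” and “time-invariant” a bit more tangible, here is a minimal sketch in Python with NumPy. The system `echo_system` is a made-up example (a signal plus one half-strength echo), not a model of a real hall, and the helper `delay` is likewise only for illustration. The sketch checks both properties numerically and then reads off the impulse response by feeding in a unit impulse.

```python
import numpy as np

def echo_system(x):
    """Hypothetical LTI system: the input plus a half-strength echo 3 samples later."""
    x = np.asarray(x, dtype=float)
    y = x.copy()
    y[3:] += 0.5 * x[:-3]
    return y

def delay(x, k):
    """Shift a signal k samples to the right, padding the start with zeros."""
    x = np.asarray(x, dtype=float)
    return np.concatenate([np.zeros(k), x[:len(x) - k]])

rng = np.random.default_rng(0)
x1 = rng.standard_normal(16)
x2 = rng.standard_normal(16)
a, b = 2.0, -0.7

# Linearity: a weighted sum of inputs gives the same weighted sum of outputs.
is_linear = np.allclose(echo_system(a * x1 + b * x2),
                        a * echo_system(x1) + b * echo_system(x2))

# Time invariance: delaying the input only delays the output.
is_time_invariant = np.allclose(echo_system(delay(x1, 2)),
                                delay(echo_system(x1), 2))

# Impulse response: the output for a single 1 followed by zeros.
delta = np.zeros(8)
delta[0] = 1.0
h = echo_system(delta)

print(is_linear, is_time_invariant)
print(h)
```

For this toy system the impulse response is simply the direct sound followed by the echo, which is the same kind of information the concert hall recording carries, only with far more reflections.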

The sound below is the impulse response of the Promenadikeskus concert hall in Pori, Finland. It may be a bit unexpected that you can hear the impulse response as an actual sound, since the original impulse is a theoretical construct that you cannot hear (to begin with, it has a duration of zero). Also, this impulse response was *not* recorded by producing an *approximate impulse* and recording the result; it’s more complicated than that. I’ll also explain that in more detail in a future article.

**Impulse Response of Concert Hall**

## Convolution

If the system is LTI, combining a sound with the impulse response of that system (a concert hall, an amplifier, etc.) is done by means of *convolution*. Convolution is so common in image and signal processing that it almost seems to be a third basic arithmetic operation there, next to addition and multiplication. This article focuses on *digital* signal processing, so I’ll define convolution using sums instead of integrals. The convolution of two signals is defined (in the one-dimensional case) by an infinite sum,

\[(x*y)[n]\equiv\sum_{m=-\infty}^\infty\!x[m]\,y[n-m].\]

Often, \(x\) is the original signal (an audio stream, a radio signal, etc.), and \(y\) represents some manipulation of that signal. In this definition, \(x[n]\) and \(y[n]\) are sequences of numbers that are a *sampled* version of the real signal. Of course, the lengths of \(x\) and \(y\) are not infinite in practice, and the sum simply means that each point of each signal is combined with each point of the other signal. Computing a convolution by following this definition closely is therefore very expensive, since the number of multiplications grows with the product of the two lengths. The ubiquitous use of convolution is enabled in practice by the *Fast Fourier Transform* (FFT) algorithm, which reduces the cost to roughly \(N\log N\) for signals of length \(N\). When the lion growl is *convolved* with the impulse response of the concert hall, the result is the following.

**Lion Growl in Concert Hall**
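For readers who want to see the definition at work, here is a small sketch in Python with NumPy. It is not necessarily how the audio above was produced. The function names `convolve_direct` and `convolve_fft` are my own; the first follows the sum for finite signals literally, the second takes the FFT route by zero-padding both signals to the full output length, multiplying their spectra, and transforming back.

```python
import numpy as np

def convolve_direct(x, y):
    """Convolution straight from the definition, for finite signals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    out = np.zeros(len(x) + len(y) - 1)
    # Each sample x[m] contributes a copy of y, scaled by x[m] and shifted by m.
    for m, xm in enumerate(x):
        out[m:m + len(y)] += xm * y
    return out

def convolve_fft(x, y):
    """The same result via the FFT: multiply the spectra, then transform back."""
    n = len(x) + len(y) - 1          # full output length avoids wrap-around
    X = np.fft.rfft(x, n)            # rfft zero-pads the input to length n
    Y = np.fft.rfft(y, n)
    return np.fft.irfft(X * Y, n)

x = [1.0, 2.0, 3.0]
y = [1.0, 1.0]
direct = convolve_direct(x, y)
fast = convolve_fft(x, y)
print(direct)                        # [1. 3. 5. 3.]
print(np.allclose(direct, fast))
```

Both functions agree with NumPy’s own `np.convolve` up to rounding error. For signals as long as a few seconds of audio, the FFT version is the one you would actually want to run.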

With this convolution, we have determined how the lion will sound in the Promenadikeskus concert hall (to the extent that the hall behaves as an LTI system), and no one got hurt. As you would expect from a large concert hall, the sound takes a long time to “die out”. This also follows from the definition of convolution, since the combination of a 1 s (1 second) signal with a 4 s signal must result in a signal of about 5 s (one sample short of it, to be precise), because each sample of \(x\) is spread out over 4 s by the sum over all \(m\). In the impulse response itself, only the initial “click” is audible, but the remainder does influence the rest of the lion growl, which is audible for almost the full 5 s.
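The length argument is easy to check numerically. The sketch below uses synthetic stand-ins rather than the real recordings: a 1 s burst of noise plays the role of the growl, and 4 s of exponentially decaying noise plays the role of the hall. The sample rate of 8000 Hz and the decay constant are arbitrary assumptions chosen to keep the example fast.

```python
import numpy as np

sample_rate = 8000                       # samples per second (assumed)
rng = np.random.default_rng(1)

# Stand-in for the lion growl: 1 s of noise.
growl = rng.standard_normal(1 * sample_rate)

# Stand-in for the hall: 4 s of noise that decays exponentially.
t = np.arange(4 * sample_rate) / sample_rate
hall = rng.standard_normal(len(t)) * np.exp(-1.5 * t)

# Convolve the "dry" sound with the impulse response.
wet = np.convolve(growl, hall)

duration = len(wet) / sample_rate
print(len(growl), len(hall), len(wet))   # 8000 32000 39999
print(duration)                          # just under 5 seconds
```

The output has `len(growl) + len(hall) - 1` samples, so a 1 s signal convolved with a 4 s impulse response comes out one sample short of 5 s.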

*Sources: The impulse response of the concert hall is courtesy of the Laboratory of Acoustics and Audio Signal Processing, Helsinki University of Technology (see Concert Hall Impulse Responses on their website). The lion growl is from freeSFX.co.uk.*

Thanks for getting me started on convolution. My teacher has been building on the mathematical rigor of LTI systems for the past 3 weeks, and by the time we started studying convolution, I was lost. Back on track again :)

That is an excellent article. Thank you
