Basic Python Scripts to Save and Load Audio Files

Each time I work with audio, one-dimensional signals in general, or images, I have to dig through my collection of Python scripts to remember which library to use for saving and loading them.

To make this easier, I’ve created a collection of scripts that contain the basic save and load operations. In this article, I list the basic audio scripts. As always, I assume that you want to use NumPy and SciPy for processing your data. I also assume that you use floating-point numbers for all processing, and convert to integers just before saving to file. Note that .wav files do support different bit depths and even floating-point samples. However, for maximum compatibility with other software, you’ll probably want to save the final output using 16-bit integers.
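As a minimal sketch of the floating-point option mentioned above: scipy.io.wavfile.write stores the samples using the dtype of the array that it is given, so a float32 array should result in a 32-bit floating-point .wav file and an int16 array in a 16-bit one. The helper name roundtrip_dtype is made up for this illustration, and an in-memory buffer stands in for a file on disk.

```python
import io

import numpy as np
from scipy.io.wavfile import read, write


def roundtrip_dtype(samples, rate=44100):
    """Write samples to an in-memory .wav file and return the dtype read back.

    Hypothetical helper, for illustration only.
    """
    buf = io.BytesIO()
    write(buf, rate, samples)
    buf.seek(0)
    _, data = read(buf)
    return data.dtype


# Short 1 kHz sine with range [-1, 1].
s = np.sin(2 * np.pi * 1000 * np.arange(441) / 44100)

# 32-bit floating point: the samples are stored without scaling.
float_dtype = roundtrip_dtype(s.astype(np.float32))

# 16-bit integers: scale to the integer range first.
int_dtype = roundtrip_dtype(np.round(s * 32767).astype(np.int16))

print(float_dtype, int_dtype)
```

Not every player or editor handles floating-point .wav files, which is why the scripts in this article stick to 16-bit integers for the final output.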

Create an Audio File

The following script produces one second of a 1 kHz sine at a sampling rate of 44.1 kHz, and saves it to a .wav file.

from __future__ import division
 
import numpy as np
from scipy.io.wavfile import write
 
# Create waveform.
rate = 44100  # Sampling rate [samples/s].
n = 44100     # Length [samples].
f = 1000      # Frequency of the sine [Hz].
t = np.linspace(0, n / rate, n, endpoint=False)
s = np.sin(2 * np.pi * f * t)
 
# Save audio file (range of s is [-1, 1]).
write('sine.wav', rate, np.round(s * 32767).astype(np.int16))

Edit an Audio File

The following script loads a .wav file (the one produced by the previous script), edits it, and writes the result to a new .wav file. For fun, I amplitude modulate the sine in exactly the same way as in the example from the article "How Does Amplitude Modulation Work?"

from __future__ import division
 
import numpy as np
from scipy.io.wavfile import read, write
 
# Load audio file.
rate, s = read('sine.wav')
s = s.astype(np.double) / 32767  # Or 32768.
 
# Add amplitude modulation.
m1 = 3  # Message sine 1 [Hz].
m2 = 7  # Message sine 2 [Hz].
n = len(s)
t = np.linspace(0, n / rate, n, endpoint=False)
m = np.sin(2 * np.pi * m1 * t)  # Message.
m += np.sin(2 * np.pi * m2 * t)
m /= 2
s *= (1 + m)
s /= 2
 
# Save audio file (range of s is [-1, 1]).
write('am-sine.wav', rate, np.round(s * 32767).astype(np.int16))

As noted in a comment in the code, you need to divide by 32768 instead of 32767 to guarantee that the input ends up in the range [-1, 1], unless you are sure that no sample has the value -32768.
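A minimal sketch of a conversion helper that always stays within [-1, 1] could look as follows. The name to_float is hypothetical, and the handling of 8-bit (unsigned), 32-bit, and floating-point input goes beyond what the scripts in this article need; it is included on the assumption that read may return those dtypes for other files.

```python
import numpy as np


def to_float(x):
    """Convert samples, as returned by read, to floats in [-1, 1].

    Hypothetical helper, for illustration only.
    """
    x = np.asarray(x)
    if x.dtype.kind == 'f':
        return x.astype(np.float64)  # Already floating point.
    if x.dtype == np.uint8:
        # 8-bit .wav data is unsigned, with 128 as the zero level.
        return (x.astype(np.float64) - 128) / 128
    # Signed integers: divide by 2**(bits - 1), e.g., 32768 for int16.
    scale = -float(np.iinfo(x.dtype).min)
    return x.astype(np.float64) / scale


x = np.array([-32768, 0, 32767], dtype=np.int16)
print(to_float(x))                    # Stays within [-1, 1].
print(x.astype(np.float64) / 32767)   # First value is slightly below -1.
```

The price of dividing by 32768 is that the largest positive sample maps to a value slightly below 1, which is rarely a problem in practice.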

Also, as already mentioned in the introduction, if you need to do several operations on a signal, do all of them in floating point, and convert to integers only once, just before writing the final output file.
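One thing to watch out for when chaining several operations is that the result can end up outside [-1, 1], and the cast to int16 is not guaranteed to saturate in that case. The following is a minimal sketch of a conversion helper that clips first. The name float_to_int16 is hypothetical, and the clipping is an addition of this sketch; the scripts in this article keep the signal in range by construction.

```python
import numpy as np


def float_to_int16(s):
    """Convert a float signal with nominal range [-1, 1] to 16-bit integers.

    Hypothetical helper, for illustration only.
    """
    s = np.asarray(s, dtype=np.float64)
    s = np.clip(s, -1.0, 1.0)  # Saturate anything that went out of range.
    return np.round(s * 32767).astype(np.int16)


s = np.array([0.0, 1.0, -1.0, 1.5, -2.0])
print(float_to_int16(s))
```

Clipping distorts the signal, so if it happens often, it is probably better to scale the signal down before converting.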

Create a Stereo Audio File

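The last script creates a stereo .wav file in which a 1 kHz sine fades out in the left channel while it fades in in the right channel. The samples are passed to write as a 2D array with one column per channel. The number of channels is not restricted to two, so you can also create multichannel .wav files in this way.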

from __future__ import division
 
import numpy as np
from scipy.io.wavfile import write
 
# Create waveform.
rate = 44100  # Sampling rate [samples/s].
n = 44100     # Length [samples].
f = 1000      # Frequency of the sine [Hz].
t = np.linspace(0, n / rate, n, endpoint=False)
left = np.sin(2 * np.pi * f * t)
left *= np.linspace(1, 0, n)   # Fade out.
right = np.sin(2 * np.pi * f * t)
right *= np.linspace(0, 1, n)  # Fade in.
s = np.vstack((left, right)).transpose()  # Shape (n, 2), one column per channel.
 
# Save audio file (range of s is [-1, 1]).
write('stereo-sine.wav', rate, np.round(s * 32767).astype(np.int16))
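
The same approach can be sketched for more than two channels, together with reading the file back and extracting a single channel. The three frequencies are arbitrary choices for this illustration, and an in-memory buffer stands in for a file name such as 'multi.wav' (a made-up name).

```python
import io

import numpy as np
from scipy.io.wavfile import read, write

rate = 44100  # Sampling rate [samples/s].
n = 4410      # Length [samples] (0.1 s).
t = np.arange(n) / rate

# One column per channel; three channels with different frequencies.
freqs = [440, 880, 1320]
s = np.column_stack([np.sin(2 * np.pi * f * t) for f in freqs])

# Save (range of s is [-1, 1]).
buf = io.BytesIO()  # Stands in for a file name such as 'multi.wav'.
write(buf, rate, np.round(s * 32767).astype(np.int16))

# Load again; data has shape (samples, channels).
buf.seek(0)
rate_in, data = read(buf)
first = data[:, 0].astype(np.double) / 32767  # Extract a single channel.
print(data.shape)
```

Whether other software can open a .wav file with more than two channels depends on that software, so it may be worth checking before relying on it.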

I'll also write an article with scripts like this for images. Stay tuned!