The idea of normalization is to raise the level of audio without changing the dynamic range or adding distortion. Several approaches exist, and not all are the same.
Almost any digital signal processing step will introduce some change in the overall audio. That is a general consequence of working in the digital domain. This includes EQ, normalization, compression, limiting, and most other similar processing options.
Samples are little more than a volume level at a point in time. The question is how to process them, given what you have to start with and what you want to end up with.
Using a simple example can sometimes help. Suppose the audio is on a scale from 0 to 10, and there are four samples. Suppose the sample values are 2, 0, 6, and 5. The max value is 6 on a scale of 10. So you decide to set the third sample to a value of 10 (ignoring for the moment that most normalization approaches actually shoot for a little less than the actual max). That is a gain factor of 10/6, or an increase of about 67%. That means that the 2 value now needs to be about 67% higher (we’re using a linear scale here, again for simplicity). But that is a value of 3 1/3, which is not a point on the 0 to 10 scale. So the nearest point is used, which is 3. That’s where an adjustment occurs: you have approximated the overall result a little. The same thing happens to the 5, which becomes 8 1/3 and is stored as 8. So when the dust settles, you end up with values of 3, 0, 10, 8.
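Here is that toy example as a small Python sketch. The function name `normalize` and the `full_scale` parameter are just names I picked for illustration; it assumes a linear scale and rounding to the nearest whole step, as in the example above.

```python
def normalize(samples, full_scale=10):
    """Scale samples so the largest magnitude lands on full_scale.

    Results are rounded to the nearest whole step, which is where the
    small approximation described above comes from.
    """
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # pure silence: nothing to scale
    # Multiply first, then divide, so the peak sample maps exactly to full_scale.
    return [round(s * full_scale / peak) for s in samples]


print(normalize([2, 0, 6, 5]))  # [3, 0, 10, 8]
```

The 2 and the 5 would ideally become 3 1/3 and 8 1/3, so each one is off by a third of a step after rounding.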
The good news is that modern audio is usually sampled at 44.1 kHz with 16-bit resolution. This means that the scale is much larger than 0 to 10: it runs from -32768 to +32767. When digital processing is done, the steps on the scale are many times smaller, and the adjustments, like the one in the simple example, are tiny in comparison to the overall level. If it weren’t this way, the mixdown process itself would destroy the result, as it has to adjust the volume of every sample as they are combined based on the level settings of the mixer.
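To put a number on "tiny", here is a rough sketch of the same idea on a 16-bit scale. The test tone (a quiet 440 Hz sine) and the names `normalize_16bit` and `FULL_SCALE` are my own assumptions for illustration, not anything from a particular tool.

```python
import math

FULL_SCALE = 32767  # largest positive 16-bit sample value


def normalize_16bit(samples, target=FULL_SCALE):
    """Return (normalized samples, worst rounding error in steps)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples), 0.0
    gain = target / peak
    out = [round(s * gain) for s in samples]
    # Rounding to the nearest step can never be off by more than half a step.
    worst = max(abs(s * gain - o) for s, o in zip(samples, out))
    return out, worst


# A quiet 440 Hz tone, 10 ms long at 44.1 kHz (made-up test data).
quiet = [round(8000 * math.sin(2 * math.pi * 440 * n / 44100)) for n in range(441)]
loud, worst = normalize_16bit(quiet)

print(max(abs(s) for s in loud))  # 32767
print(worst <= 0.5)               # True
```

Half a step out of 32767 is roughly 0.0015% of full scale, compared with a third of a step out of 10 (about 3%) in the toy example.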
One of the most common uses of normalization is for audio that represents telephone voice mail messages. Many people speak softly, and many connections are not good. The result is quite often an audio file that is very hard to hear, even with the speakers on max. Once the data is normalized, it becomes much easier to hear, as the character of the original is effectively not changed, but the volume is much louder. The key in developing normalization algorithms for telephone audio is knowing what to ignore. For example, virtually all such audio contains a large blip at the front (basically the off-hook sound) and a similar event near the end, usually a combination of touch tone input to stop the recording and the hang-up event. Good normalization algorithms will ignore these peaks and work with the data in between, which represents the true caller audio.
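A minimal sketch of the "know what to ignore" idea follows. It assumes the blips sit inside a fixed number of samples at each end, which is a simplification; a real algorithm would more likely detect them. The names `normalize_ignoring_edges`, `skip_start`, and `skip_end` are hypothetical.

```python
def normalize_ignoring_edges(samples, skip_start, skip_end, target=32767):
    """Normalize using only the interior of the recording to find the peak.

    skip_start / skip_end are how many samples to ignore at each end when
    measuring (assumed here to cover the off-hook and hang-up blips).
    """
    interior = samples[skip_start:len(samples) - skip_end]
    peak = max((abs(s) for s in interior), default=0)
    if peak == 0:
        return list(samples)  # nothing usable inside: leave the audio alone
    gain = target / peak
    # The ignored blips may now exceed full scale, so clamp every sample.
    return [max(-target, min(target, round(s * gain))) for s in samples]


# Made-up data: a loud blip at each end around a quiet voice.
message = [20000, 100, -200, 150, 18000]
result = normalize_ignoring_edges(message, skip_start=1, skip_end=1)
print(result[2])  # -32767
```

If the peak were measured over the whole file, the 20000 blip would limit the gain to about 1.6x and the voice would stay quiet. Measuring only the interior gives the voice the full gain, and the blips are simply clamped.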
Normalization is a highly useful process. Like most other forms of digital signal processing, it still boils down to using the tools best suited to the need at hand. There is no “One size fits all” when it comes to digital signal processing.
When I first get an audio WAV in the mix phase, I look at it in a WAV edit tool. That way I can quickly get a visual picture of what is coming in. Things like peaks and silent periods are immediately obvious, as is noise at the start or end that needs to be removed right away.
If there are a lot of peaks, then normalization may buy you little or nothing. If the WAV is just recorded low, then it works wonders. Having normalization algorithms that offer the audio engineer some parameter control is also very useful. Things like auto gain adjustment and peak limiting can sometimes be very handy for working a difficult WAV file.
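One quick way to tell which case you are in is to measure how far the loudest sample sits below full scale. This is only a sketch for 16-bit data; `headroom_db` is a name I made up, and it uses the usual 20 * log10 conversion from an amplitude ratio to decibels.

```python
import math


def headroom_db(samples, full_scale=32767):
    """Gain in dB that peak normalization could add before hitting full scale."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return math.inf  # silence: no peak to limit the gain
    return 20 * math.log10(full_scale / peak)


print(headroom_db([32767, -12000, 400]))         # 0.0
print(round(headroom_db([16384, -100, 900]), 2))  # 6.02
```

A file with peaks already at full scale shows 0 dB, so normalization has nothing to give. A file recorded at half level shows about 6 dB available.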
I am a software engineer by profession, which is what I have done for 35 years. One of the areas I have spent some time on over the last few years is audio processing of WAV data. This is real-time stuff, and separating the wheat from the chaff is the name of the game. While somewhat different from Hi-Fi CD audio, the digital concepts are quite similar. You have this stream of digital data, and your job is to modify it according to some goals and constraints.
It was well said by “dave in Toledo”. There is a lot of room for opinion, judgment and guess work in the audio mixing world. While there’s a lot of science behind it, there’s a lot of art in the process and room for a lot of views.
Ed