Interface Latency vs. Output Monitoring Latency

LevelSongD

New member
Hi, I’m going to be putting together a PC for digital audio production, and since a big chunk of my work involves playing softsynths with a MIDI controller (through FL Studio), I’ve been reading about latency for the past few days to try to get my head around the concept. I think I understand the basics, and while the following two statements are bound to be oversimplifications, I want to make sure I didn’t make any mistakes in my reasoning. So, please correct me if I’m wrong:

1 - The sound card/audio interface (and the cables going into it) are largely what determine the latency between the instrument and the computer.

2 - Dropouts are on the computer’s end. As long as the drivers are good and the operation isn’t CPU-intensive (e.g., multi-track audio input, realtime effects, playing layered samplers/VSTs), you should be able to set a low buffer size, get low latency, and hear what you’re playing in near-real-time.

Is that correct? If it is, I’ll be focusing mainly on getting a good audio interface (and drivers), and won’t be worrying as much about getting a blazing-fast CPU with tons of RAM, as I’m content to record the MIDI data via lo-fi virtual instruments, then switch over to higher-quality instruments and add effects in post. The first 15 years of my music career were spent playing live, so I just need to be sure I can lay down that MIDI without lag!
 
1 - The sound card/audio interface (and the cables going into it) are largely what determine the latency between the instrument and the computer.

In large part, yes.

The interface (in both directions) effectively looks like a loop of rope wrapped around two flagpoles. Along the rope are little buckets. Two people stand along the rope: one writes notes on slips of paper and stuffs them into the buckets, and the other reads the notes. Each person does this precisely once per second (or some other interval, so long as it’s the same for both), then slides the rope along to the next bucket.

The packet size is the number of letters the writer puts on each piece of paper. If the writer writes one letter and then slides it down to the reader, the reader is only one letter behind the writer. If the writer writes two letters, then by the time the reader gets them, he or she is two letters behind what the writer is writing. And so on. That discrepancy is the packet latency.
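
To put rough numbers on that discrepancy: packet latency is just packet size divided by sample rate. A quick sketch in Python (the 48 kHz rate and packet sizes are example values, not tied to FL Studio or any particular interface):

```python
# Packet (buffer) latency is packet_size / sample_rate.
def packet_latency_ms(packet_size_frames: int, sample_rate_hz: int) -> float:
    return 1000.0 * packet_size_frames / sample_rate_hz

for frames in (64, 128, 256, 512):
    ms = packet_latency_ms(frames, 48_000)
    print(f"{frames:4d} frames @ 48 kHz -> {ms:.2f} ms")
# 64 -> 1.33 ms, 128 -> 2.67 ms, 256 -> 5.33 ms, 512 -> 10.67 ms
```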

Also, the person doing the reading occasionally has to pause to scold his or her kids. That's the processor doing other tasks. While it is busy, the writer could get ahead. When this happens, the reader must take a step forward and start reading from a bucket farther away. This means that the total latency may be greater than the theoretical packet latency.

Unfortunately, the reader is also copying those notes and putting them into another set of buckets (for the audio output), and every time the input reader gets behind, the output reader doesn't get a note that cycle. This is an output glitch caused by an input glitch. The input doesn't actually lose anything because the reader can always step forward (until they wrap all the way around and bump into the writer, at which point they do lose data), but the output glitches because the data wasn't ready when it was needed.
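
If it helps to see the rope-and-buckets picture as code: it’s an ordinary ring buffer. Here is a toy, single-threaded sketch of the same bookkeeping (not how a real driver is written):

```python
class RingBuffer:
    """Toy model of the rope of buckets, not real driver code."""

    def __init__(self, slots: int):
        self.buf = [None] * slots
        self.read = 0    # the reader's current bucket
        self.write = 0   # the writer's current bucket
        self.count = 0   # notes currently sitting in buckets

    def put(self, note) -> bool:
        # The writer has lapped the reader: the ring is full and
        # this note is lost (an input overrun).
        if self.count == len(self.buf):
            return False
        self.buf[self.write] = note
        self.write = (self.write + 1) % len(self.buf)
        self.count += 1
        return True

    def get(self):
        # Nothing is ready when the output needs it: an underrun,
        # i.e. an audible glitch.
        if self.count == 0:
            return None
        note = self.buf[self.read]
        self.read = (self.read + 1) % len(self.buf)
        self.count -= 1
        return note
```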

In reality, however, there are typically several additional rings between the input reader and the output writer. These rings don’t run once per second; they run as quickly as they can, but only when there is data available. They are equivalent to the various bits of plumbing in the operating system and your audio interface, and each one introduces a small, and potentially highly variable, amount of latency.
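
One way to picture those intermediate rings is as stages that sleep until data shows up and then pass it along immediately. A minimal sketch, assuming a thread-and-queue model (the queue names and depths are illustrative):

```python
import queue
import threading

stage_in: queue.Queue = queue.Queue(maxsize=8)   # illustrative depth
stage_out: queue.Queue = queue.Queue(maxsize=8)

def pump() -> None:
    # Runs as fast as it can, but only when data is available:
    # get() blocks until the upstream ring has something for us.
    while True:
        packet = stage_in.get()
        stage_out.put(packet)  # blocks if the downstream ring is full

threading.Thread(target=pump, daemon=True).start()

stage_in.put("packet-1")
print(stage_out.get())  # "packet-1", once the stage has pumped it through
```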

To prevent glitches, the OS intentionally introduces a small bit of latency (a fraction of a packet long) going into the output so that variations in the speed of this plumbing (and even variations in the input rate) don't affect the output (within reason). In effect, this means that the writer in the output loop moves back and forth along the rope as needed, moving closer to the reader when there isn't enough data and moving farther away when there is too much. It all averages out.
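
That cushion buys stability at the cost of a little extra latency, and the cost is easy to quantify. The half-packet fraction below is an illustrative guess; the actual reserve varies by OS and driver:

```python
def safety_latency_ms(packet_size_frames: int, sample_rate_hz: int,
                      fraction: float = 0.5) -> float:
    # Extra output latency added by holding back a fraction of a packet.
    return 1000.0 * packet_size_frames * fraction / sample_rate_hz

print(f"{safety_latency_ms(256, 48_000):.2f} ms")  # ~2.67 ms of cushion
```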

Because the hardware packet size contributes the largest single chunk of delay, and because packet latency is the only component that doesn’t depend on the speed of the computer and its ability to execute code, it is generally considered the largest factor in round-trip latency.
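
Putting the pieces together, a back-of-the-envelope round-trip estimate might look like this. Everything besides the packet math is a guess; the plumbing term in particular varies wildly by OS, driver, and load:

```python
SAMPLE_RATE = 48_000
PACKET = 256                                 # frames, as set in the driver panel
packet_ms = 1000.0 * PACKET / SAMPLE_RATE    # one packet of delay each way

input_ms = packet_ms          # hardware input (ADC) side
output_ms = packet_ms         # hardware output (DAC) side
safety_ms = 0.5 * packet_ms   # OS safety cushion (fraction is a guess)
plumbing_ms = 1.0             # OS/driver hops; highly variable

total = input_ms + output_ms + safety_ms + plumbing_ms
print(f"~{total:.1f} ms round trip at {PACKET} frames / {SAMPLE_RATE} Hz")
# ~14.3 ms with these example numbers
```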


2 - Dropouts are on the computer’s end. As long as the drivers are good and the operation isn’t CPU-intensive (e.g., multi-track audio input, realtime effects, playing layered samplers/VSTs), you should be able to set a low buffer size, get low latency, and hear what you’re playing in near-real-time.

Generally, yeah. And by buffer size, again, you really mean packet size.
 