JoeNovice said:
Which A/D/C have you used and what did you use them with? That's a very bold statement without much fact for backing it up. Have you ever used the Apogee BigBen as a master house clock?
A bold statement, yes, but since I can back it up with simple math, the question of real-world proof is moot.
Proof I: Nyquist proof
For jitter to be audible, the timing error of a sample must be large enough to produce a significant error in a wave within the range of human hearing. An error of about a tenth of a wavelength at the top of that range is roughly the minimum drift that could result in audible artifacts, unless you want to argue that the Nyquist theorem is wrong. If you don't believe me, the math below backs that statement up.
At a 44.1 kHz sample rate, a tenth of a 20 kHz wavelength works out to roughly 20% of a sample period, which would require an insanely bad clock. Even things like S/PDIF or AES/EBU only drift half that much at most. At 88.2/96 kHz, it would require an error of almost half a sample. I don't think there's an audio interface clock on the planet with that much drift, even in really early hardware from decades back.
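If you want to sanity-check that arithmetic yourself, here's a quick Python sketch. It only assumes 20 kHz as the top of human hearing and the one-tenth-of-a-wavelength threshold from above; nothing in it is specific to any particular converter.

```python
# Rough sketch of the Proof I arithmetic. Assumes 20 kHz as the top of human
# hearing and 1/10 of a wavelength as the audibility threshold from above.
top_of_hearing_hz = 20_000.0
threshold_s = (1.0 / top_of_hearing_hz) / 10.0   # a tenth of a 20 kHz cycle = 5 microseconds

for sample_rate_hz in (44_100.0, 96_000.0):
    sample_period_s = 1.0 / sample_rate_hz
    fraction_of_sample = threshold_s / sample_period_s
    print(f"{sample_rate_hz/1000:.1f} kHz: threshold = {fraction_of_sample:.2f} of a sample "
          f"({threshold_s*1e6:.1f} us vs {sample_period_s*1e6:.1f} us per sample)")
```

That prints roughly 0.22 of a sample at 44.1 kHz and 0.48 of a sample at 96 kHz, which is where the "20% of a sample" and "almost half a sample" figures come from.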
Proof II: Decibel proof
For a real-world example, we'll use a 16 kHz tone, right at the top limit of the average person's hearing, sampled at 96 kHz. (Note: I picked 16 because it divides evenly into 96, which makes the math easier. You can redo the math for 22 kHz if you really want to, but it won't make any difference.)
Let's use a standard that allows relatively high clock jitter in its spec: AES/EBU permits 20 ns of jitter. That's about as bad as jitter gets, since a modern internal clock typically jitters in the low picosecond range, so 20 nanoseconds is huge by today's standards. Keep in mind throughout the math that follows that you will almost certainly never see jitter this bad in any real-world hardware.
Before I do the math, let me also add that a 20 nanosecond period corresponds to 50 MHz. That's megahertz, as in 2,500 times the top of the human hearing range. This alone should give you a pretty good idea of why discussions of jitter are silly with modern electronics.
The maximum error occurs where the sine wave is moving at its maximum speed, which is at the zero crossing. An error of 20 nanoseconds on a 16 kHz tone represents 1/3125th of a cycle (50 million / 16 thousand). Multiply by 2 pi (the slope of a full-scale sine at the zero crossing) and the amplitude error is about +/-0.002 of full scale, or 0.0174 dB. You need a level difference of about 1 dB, and arguably more like 3 dB, before it is even audible. That means it is impossible for jitter to be audible within the human hearing range, even with jitter that is huge by modern standards. And note that this is the absolute maximum possible error at any point on the curve for a 16 kHz tone sampled at 96 kHz.
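Here's that calculation as a small Python sketch, using the worst-case slope of a full-scale sine (2*pi*f at the zero crossing); the 16 kHz tone and the 20 ns jitter figure are the ones from above.

```python
import math

# Worst-case error from clock jitter on a full-scale sine: the signal moves
# fastest at the zero crossing, where the slope of sin(2*pi*f*t) is 2*pi*f.
f_hz = 16_000.0          # test tone from the example above
jitter_s = 20e-9         # AES/EBU worst-case jitter from the example above

cycle_fraction = jitter_s * f_hz                 # 1/3125 of a cycle
amplitude_error = 2 * math.pi * cycle_fraction   # ~0.002 of full scale
error_db = 20 * math.log10(1 + amplitude_error)  # ~0.0174 dB

print(f"fraction of a cycle: 1/{1/cycle_fraction:.0f}")
print(f"amplitude error:     {amplitude_error:.4f} of full scale")
print(f"level error:         {error_db:.4f} dB")
```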
It gets better. In reality you're only sampling at six points per cycle of that 16 kHz tone. Depending on where the peak falls between samples, a sample can land as much as 1/12th of a wavelength away from it, which knocks the measured level at that sample down by about 1.25 dB. So the timing error from jitter is far smaller than the timing error inherent in the sampling itself, more than 250 times smaller.
It gets still better. Do that with a more typical error of 2 ns (for a clock recovered from ADAT), and the jitter results in an error of 1/31250th of a cycle, or about 0.0002 of full scale, or about 0.00174 dB, roughly 1/2500th of the timing error caused by the sampling itself! Not only can you not hear an error that small, odds are you can't even measure it on an oscilloscope; the difference is that small. At +4 dBu line level (1.228 V RMS), that comes out to an error of roughly 0.2 mV, or a fifth of the distance between divisions on a good oscilloscope at its maximum precision.
Now do this for a 2 picosecond internal clock jitter. What is the maximum error in decibels? Hint: a nanosecond is 1000 picoseconds. Did you answer about 0.00000174 dB? If you did, you're right!
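And if you'd rather see all three jitter figures side by side, here's the same math in a loop, along with the comparison against the 1/12-cycle worst-case sampling offset. The labels are just my shorthand for the three cases discussed above.

```python
import math

f_hz = 16_000.0                  # test tone
sampling_offset_cycles = 1 / 12  # worst-case distance of a 96 kHz sample from the 16 kHz peak

for label, jitter_s in [("AES/EBU (20 ns)", 20e-9),
                        ("recovered ADAT (2 ns)", 2e-9),
                        ("internal crystal (2 ps)", 2e-12)]:
    cycle_fraction = jitter_s * f_hz
    amplitude_error = 2 * math.pi * cycle_fraction
    error_db = 20 * math.log10(1 + amplitude_error)
    vs_sampling = sampling_offset_cycles / cycle_fraction
    print(f"{label:24s} {error_db:.8f} dB  (sampling timing error is ~{vs_sampling:,.0f}x larger)")
```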
Proof III: Interface design proof
Mathematical theory aside, the way audio interfaces work removes jitter from the picture by its very nature. Audio interfaces use a phase-locked loop (PLL) to create a stable internal clock from an external clock. Each cycle of the local clock is nudged toward the incoming clock, so the local clock ends up tracking a smoothed average of the incoming clock rather than following every wobble of every edge.
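To make that averaging idea concrete, here's a toy Python sketch. It is not how any real interface's PLL is implemented (a real PLL tracks phase, and the loop bandwidth here is made up); it only illustrates how nudging the local clock a small fraction of the way toward each incoming edge smooths out the incoming jitter.

```python
import random

# Toy illustration of the averaging idea: a first-order loop nudges the local
# clock's period toward the incoming (jittery) clock instead of following each
# edge exactly. All numbers are made up for illustration only.
random.seed(1)

nominal_period = 1 / 44_100   # seconds per incoming word-clock cycle
incoming_jitter = 20e-9       # +/- jitter on each incoming edge
bandwidth = 0.01              # loop "gain": how hard we chase each incoming edge

local_period = nominal_period
worst_deviation = 0.0
for _ in range(100_000):
    incoming_period = nominal_period + random.uniform(-incoming_jitter, incoming_jitter)
    # Move a small fraction of the way toward the incoming period (the averaging).
    local_period += bandwidth * (incoming_period - local_period)
    worst_deviation = max(worst_deviation, abs(local_period - nominal_period))

print(f"incoming jitter: +/-{incoming_jitter*1e9:.0f} ns")
print(f"worst local deviation after smoothing: +/-{worst_deviation*1e9:.2f} ns")
```

The smoothed clock wanders by only a small fraction of the incoming +/-20 ns, which is the whole point of clocking through a PLL.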
Even with a modestly jittery clock signal coming from an outside source, the PLL output should still be relatively solid, but it will exhibit some tracking jitter even if the incoming clock is absolutely perfect. The catch is that even a theoretically perfect clock will not be perfect by the time it reaches the audio interface: passing through the cable, the amplifiers on output, the amplifiers on input, and so on degrades the pulse edges.
The kicker is this: because of that degradation, even with good clock recovery, an internal clock derived from a recovered external clock will always exhibit more jitter than a clock driven directly by a nearby crystal. Thus, an external clock automatically makes jitter worse, regardless of whose clock it is, regardless of whose interface it is, regardless of whether it is sent as word clock, S/PDIF, AES/EBU, or ADAT, and regardless of all other factors.
Now, the argument the audiophiles give is "phase distortion," and technically, they are correct. It does produce phase distortion. But as you can see plainly from the math, the error at any given point in time is inaudible by many, many orders of magnitude, and thus the error is inaudible, period.
The only argument that is even plausible is the case where you are recording multiple audio streams with different interfaces, at least one of which is externally clocked. In that case, you could have a very slight phase offset that might be improved by slaving both interfaces to a single external clock rather than to each other. However, the phase offset from clock recovery will likely pale next to the multi-interface timing discrepancies inherent in the data path after the samples leave your interfaces, due to time-stamping precision (or lack thereof), and thus this error can be discounted just as easily.
In short, external clocks serve a purpose. They produce a single stable clock signal in multiple formats to drive different devices. If your purpose for using an external clock is to make your recordings somehow sound better, though, repeat after me: it's all in your head.