Soundfield Mic Discussion

  • Thread starter: Treeline

What's a Soundfield Type mic?

From YuriK:

What is a "Soundfield Type" mic? I'm only aware of 3 versions of the soundfield system. There are pseudo-surround systems from Schoeps, but I do not know of too many others, other than the SPL Area5.1.
http://www.spl-usa.com/Area_51/in_short.html
http://www.soundperformancelab.com/Atmos/in_detail.html (much better)

Would be very interested to find out about other systems. Please let me know. Thanks in advance



From Arcanemethods:

A microphone producing four A-format outputs from capsules on the faces of an imaginary tetrahedron. Usually accompanied by a box that can encode these signals to B-format Ambisonics or derive logical cardioid-type mic signals pointing in arbitrary directions.
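
A minimal sketch of the "logical mic pointing in an arbitrary direction" idea, for anyone who wants to see it as arithmetic. It assumes idealized B-format in which W is plain pressure (no -3 dB scaling) and X, Y, Z are unit-gain figure 8 signals; the function names and the `p` parameter are made up for illustration and are not taken from any Soundfield product.

```python
import math


def encode_plane_wave(s, azimuth_deg, elevation_deg):
    """Idealized B-format sample for a source of amplitude s from a direction.

    Assumption: W is unscaled pressure; X points front, Y left, Z up.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (
        s,
        s * math.cos(az) * math.cos(el),
        s * math.sin(az) * math.cos(el),
        s * math.sin(el),
    )


def virtual_mic(w, x, y, z, azimuth_deg, elevation_deg, p=0.5):
    """Steer a first-order virtual mic from one B-format sample.

    p is the omni fraction: 1.0 = omni, 0.5 = cardioid, 0.0 = figure 8.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    # Unit vector along the direction the virtual mic is aimed.
    dx = math.cos(az) * math.cos(el)
    dy = math.sin(az) * math.cos(el)
    dz = math.sin(el)
    return p * w + (1.0 - p) * (dx * x + dy * y + dz * z)


if __name__ == "__main__":
    # A source straight ahead, picked up by virtual cardioids aimed around the circle.
    w, x, y, z = encode_plane_wave(1.0, 0.0, 0.0)
    for aim in (0, 90, 180):
        print(aim, round(virtual_mic(w, x, y, z, aim, 0.0), 6))
```

Changing `p` after the fact is what lets one recording be "re-miked" with a different pattern later.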

(Designing mics)
From Arcanemethods:


...
I figure I'm a few weeks from the prototype (if I can stay buckled down) but from there to a production unit is uncharted territory for me. Based on the performance of the prototype I have a tentative partnership with someone who knows those ropes.

Low cost doesn't imply lower quality in this case. The encoding technology I've developed compensates naturally for large imperfections in capsule response and, to a somewhat lesser degree, in the pattern. After encoding, the mic will be very flat and its directional sensitivity surface very smooth. It also gives near perfect coincidence compensation, a problem that has been solved to date with waves of the hand and which drastically affects HF performance. I will be using lower cost OEM capsules (can you guess from where?) and the only concession to quality might be in self noise. There are conflicting requirements on compactness of the array and low self noise.

The problem with this kind of mic is that it is particularly sensitive to self noise because it typically operates in fairly low level sound fields.

The technology is applicable to any gradient capsules that contain an omni component so within reason I can ignore frequency response and pay for low noise which should make even the highest quality achievable at lower cost than we've seen to date.

My first efforts are aimed at the lowest cost product that can demonstrate the technology and still provide a usable SNR.

...



From YuriK:

Very intriguing. It probably should have its own thread. Keep us posted. I'm sure I am not the only one who is interested in achieving 5.1 without spending $$$$$$$$$. The results with Soundfield I have heard to date were very impressive. SPL is more of the good thing. Knowing that there is a more economical alternative is certainly good news.


From the TransAmerica / SoundFieldUSA website:

[Image: sfarray.gif]




This oughta get a little discussion going. Fascinating stuff! :cool:
 
I toured a studio last year and the owner wanted to show off some things to me. One of them was the Soundfield mic. It was a jaw dropping experience hearing piano tracked through one.
 
Thanks, Treeline. Ambisonic mics are indeed a fascinating study.

FWIW, and to make sure this post contains something more than inane comments, Ambisonics is an example of what is called a first order microphone. One way of interpreting that is that it is sensitive to the zeroth order pressure, which is a scalar, and the first derivative of pressure with respect to spatial distance, which is a vector. An Ambisonic setup gives four signals, the W which is the scalar pressure and X,Y,Z which contain the components of the first derivative vector. The surround community is already bored with what they can do with first order and are clamoring for a second or higher order microphone. That is really difficult to build for all kinds of reasons and no one has accomplished it yet. Describing its outputs is difficult without an understanding of the mathematics of spherical harmonics which are used to encode the information.

A company called ImmersiveSound is developing an array that they claim can measure higher order components but it hasn't seen the commercial light of day yet.

http://www.immersivesound.com/index.php


Bob
 
arcanemethods said:
One way of interpreting that is that it is sensitive to the zeroth order pressure, which is a scalar, and the first derivative of pressure with respect to spatial distance, which is a vector.
Bob

Sorry, but that is incorrect. What is output on X,Y,Z is not the derivative of pressure with respect to spatial distance as a function of time but rather, in the ideal case, the integral of _that_ with respect to time. This turns out to be proportional to, and in the same direction as, the bulk velocity of the air as a function of time. So the ideal Ambisonic output is the pressure on the W channel and the three components of bulk air velocity on X, Y, Z, at the center of the array (if properly compensated to be logically coincident), all as a function of time.
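
For the math oriented, the relation can be written in one line. This is the standard linearized (small-signal) momentum equation of acoustics, not anything specific to a particular mic; the symbols are introduced here for illustration, with p the pressure, v the bulk air velocity and ρ0 the ambient air density.

```latex
\rho_0 \frac{\partial \mathbf{v}}{\partial t} = -\nabla p
\qquad\Longrightarrow\qquad
\mathbf{v}(t) = -\frac{1}{\rho_0} \int_{-\infty}^{t} \nabla p(\tau)\, d\tau
```

So, up to a constant gain, the ideal W channel follows p and the ideal X, Y, Z channels follow the three components of v.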

Sorry, too, to be so damn technical but that's just the way it is with these things. Without some serious math it is really hard to even talk about what higher order Ambisonics is beyond the broad brush I gave above. I have an entire book on the theory of spherical harmonics! In essence they are a way analogous to Fourier analysis that allow one to represent arbitrarily complex closed surfaces that have a center from which an outward pointing arrow will only cross the surface once. How this relates to 3D microphone theory is left as an exercise. :-)

The main motivation for higher order is not only higher precision in reproduction directivity but in enlarging the sweet spot which for a first order Ambisonic mic is just a single point in space. In practice that's not as bad as it might seem but you can't move far from there and expect any realism. There are methods for decoding W,X,Y and Z to speaker feeds that enlarge the sweet spot at the expense of overall accuracy, including transforms that map it to 5.1 with a loss of the height information.


Bob
 
Hey, who started this thread without me!!!

Hi Bob

Did I understand correctly that Soundfield is a first order Ambisonic mic? Where the sweet spot is just a single point in space? If you look at the mic as a camera taking a snapshot, its location would be a reference point, i.e. a single point in space. This should provide a realistic snapshot, right? But the higher order allows multiple references? I'm having a hard time visualising this. This then will be a number of microphones at different locations going back to a coding matrix. Is this too simplistic?

Yuri
 
YuriK said:
Hey, who started this thread without me!!!

Hi Bob

Did I understand correctly that Soundfield is a first order Ambisonic mic? Where the sweet spot is just a single point in space? If you look at the mic as a camera taking a snapshot, its location would be a reference point, i.e. a single point in space. This should provide a realistic snapshot, right? But the higher order allows multiple references? I'm having a hard time visualising this. This then will be a number of microphones at different locations going back to a coding matrix. Is this too simplistic?

Yuri

I'm really not the one to carry this deeper into a discussion of higher order Ambisonics. I just mentioned it to prime the thread with food for thought, which may not have been a good idea since it is sort of a tangent. I only know enough to be dangerous and probably mislead people. The reason I said the application of spherical harmonics to 3D audio was left as an exercise is so that someone might explain it better to me. :-)

When I say that it offers the potential of a larger sweet spot while being more accurate, I am just parroting what I hear the people who really do understand it say.

There is more going on at a point in space than just the pressure and velocity of the air (time integral of the rate of change in pressure as you move an infinitesimal distance). There is also information present in the higher derivatives of pressure with respect to distance, for example the rate of change of the rate of change in pressure as you move an infinitesimal distance, not to mention the rate of change of the rate of change of the rate of change, etc. As to how that information, or rather its encoding into spherical harmonics, can be used to give a better image, I'm still in the dark, but the real experts say it can be and even have speaker feed transformations based on it ready to go when the data arrives.

My understanding of the first order recording system (as opposed to the really nasty problem of the playback system) is pretty complete but that's only because it is relatively simple and I chose to stop with what is practical today. If, as I believe, I have solved the problem of an optimal first order Ambisonic mic I hope to turn my attention next to the real life playback systems and seek optimal solutions using a similar methodology.

I can't claim priority on the methodology; it turns out it was effectively published in '97 as a solution to a different problem, but I arrived at it independently (working from a concise statement of the Ambisonic problem given by Angelo Farina) and with some theoretical improvements. Needless to say, since I am working on a commercial product I won't be spelling it out in any detail. All I will say is that it is a mathematical procedure to calculate the optimal A-format to B-format transformation based on real measurement of the array. It matches the reality to the theory in an optimal way.


Bob
 
Bob

Over in
https://homerecording.com/bbs/showthread.php?t=127461

Harvey and DJL are attempting a similar discussion. Do you mind jumping in there and clarifying a few things? Your explanations here are very technical and Harvey is trying to be simplistic; perhaps you can find a middle ground through the other thread.

Thanks

Yuri
 
Thanks for posting a link here to the other thread... I did the same over there.

Now the two threads are linked together... which will make it easier finding both using the search function. ;)
 
For the Soundfield mic, and I guess all first-order Ambisonic mics, is the inside distance between the capsules (in other words the diameter of the "mic ball" as Harvey put it in the other thread) integral to the sound? Could someone just take four regular condensers, put them in a sphere configuration (all pointed outwards) and run them through a decoder? I assume the decoder would probably have to be adjusted for the distances between the capsules and such but could it be done? And if so, could a new decca-tree-of-sorts come on to the scene to help create our own first-order ambisonic mics with four matched regular mics? Just a thought...
 
OneRoomStudios said:
Could someone just take four regular condensers, put them in a sphere configuration (all pointed outwards) and run them through a decoder? I assume the decoder would probably have to be adjusted for the distances between the capsules and such but could it be done? And if so, could a new decca-tree-of-sorts come on to the scene to help create our own first-order ambisonic mics with four matched regular mics? Just a thought...

Here is the problem that I see with this. Height and spacing are very important when dealing with this kind of application. Trying to adjust for differences between the capsules would affect frequency. Time and distance and their effect on frequency are known as the Doppler Effect. Trying to control this from a decoder without it being controlled at the capsules as well is probably too much trouble.
 
alanhyatt said:
Here is the problem that I see with this. Height and spacing are very important when dealing with this kind of application. Trying to adjust for differences between the capsules would affect frequency. Time and distance and their effect on frequency are known as the Doppler Effect. Trying to control this from a decoder without it being controlled at the capsules as well is probably too much trouble.
OneRoomStudios said:
Could someone just take four regular condensers, put them in a sphere configuration (all pointed outwards) and run them through a decoder? I assume the decoder would probably have to be adjusted for the distances between the capsules and such but could it be done? And if so, could a new decca-tree-of-sorts come on to the scene to help create our own first-order ambisonic mics with four matched regular mics? Just a thought...
The secret is in the decoder, but the math for the algorithms depends on the precise spacing of the diaphragms. Without knowing more about how these beasts really work, I suspect Alan is absolutely correct - it would be a real pain in the ass to set it up with four mics and get absolute phase coherence.

In simpler terms, for this thing to work, the algorithms have to know exactly where everything is.
 
YuriK said:
Bob

Over in
https://homerecording.com/bbs/showthread.php?t=127461

Harvey and DJL are attempting a similar discussion. Do you mind jumping in there and clarifying a few things? Your explanations here are very technical and Harvey is trying to be simplistic; perhaps you can find a middle ground through the other thread.

Thanks

Yuri
Now that we are cross linked I think I'll attempt a brief and less technical explanation of the first order Ambisonic mic here, rather than there, in accordance with the principle of the conservation of threads. :-)

For anyone that is math oriented, my bible for understanding this theory is at:

http://www.personal.rdg.ac.uk/~shr97psc/Thesis.html

It covers both first and second order concepts in excellent detail. I haven't yet spent the necessary time with the second order part of it.

The ideal first order Ambisonic mic is comprised of four cardioid type capsules. These ideal capsules can be decomposed into two logical capsules summed internally. One of the logical capsules is an ideal omni (0th order) and the other is an ideal figure 8 (1st order). The relative gain of these two logical components determines whether the real capsule is cardioid, sub-cardioid, hyper-cardioid or anything in between. For the Ambisonic array, it doesn't really matter, as it turns out, which of these is used. The end result only differs by a gain factor between the W component and the XYZ components that can be easily compensated. There is no theoretical advantage in which cardioid type is used if they are ideal. In practice, there may be advantages in that whatever of these can be found that has a pattern closest to its ideal form across the spectrum will work the best.
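
The omni-plus-figure-8 decomposition is easy to see numerically. This is a minimal sketch for ideal capsules only; the name `omni_fraction` and the specific sub-cardioid and hyper-cardioid mix values are conventional textbook choices assumed here, not numbers from any particular capsule.

```python
import math

# Omni share of each ideal pattern; the remainder is the figure 8 share.
# The 0.7 and 0.25 entries are assumed textbook values for illustration.
PATTERNS = {
    "omni": 1.0,
    "sub-cardioid": 0.7,
    "cardioid": 0.5,
    "hyper-cardioid": 0.25,
    "figure 8": 0.0,
}


def first_order_gain(theta_deg, omni_fraction):
    """Gain of an ideal first-order capsule at theta degrees off its axis.

    The capsule is modeled as omni_fraction * omni + (1 - omni_fraction) * figure 8,
    where the figure 8 part has gain cos(theta).
    """
    figure8 = math.cos(math.radians(theta_deg))
    return omni_fraction + (1.0 - omni_fraction) * figure8


if __name__ == "__main__":
    for name, a in PATTERNS.items():
        gains = [round(first_order_gain(t, a), 3) for t in (0, 90, 180)]
        print(f"{name:15s} front/side/rear gain: {gains}")
```

Every pattern in the family has unit gain on axis; only the side and rear pickup change as the mix changes.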

Now to the encoding of the outputs of four ideal capsules. The W channel is just the sum of all four. It turns out that, because of the angles at which they are arranged on the faces of a tetrahedron, all of the figure 8 components of the capsules cancel out and one is left with four times the omni component of each capsule.
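
A quick numerical check of that cancellation, as a sketch only. It assumes four ideal capsules that are perfectly coincident (which the next paragraph explains is not true of a real array) and uses one common choice of tetrahedral axes; the names are illustrative.

```python
import math

S = 1.0 / math.sqrt(3.0)
# Unit vectors through the faces of a regular tetrahedron (assumed orientation).
CAPSULE_AXES = [(S, S, S), (S, -S, -S), (-S, S, -S), (-S, -S, S)]


def capsule_outputs(source_dir, omni_fraction=0.5):
    """Outputs of four ideal coincident capsules for a unit plane wave.

    source_dir is a unit vector pointing from the array toward the source.
    """
    outputs = []
    for axis in CAPSULE_AXES:
        cos_angle = sum(a * u for a, u in zip(axis, source_dir))
        outputs.append(omni_fraction + (1.0 - omni_fraction) * cos_angle)
    return outputs


def w_channel(outputs):
    """W is simply the sum of the four capsule signals."""
    return sum(outputs)


if __name__ == "__main__":
    # The four axes sum to the zero vector, so the figure 8 parts cancel
    # and W comes out the same for every arrival direction.
    for direction in [(1, 0, 0), (0, 1, 0), (0, 0, -1), (S, S, S)]:
        print(direction, round(w_channel(capsule_outputs(direction)), 6))
```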

Note that this would only really be true if the capsules occupied the same point in space instead of being on the faces of an imaginary tetrahedron. This is the coincidence compensation problem and Michael Gerzon, who discovered these properties of a tetrahedral arrangement, offered in his patent on it an equation of two linear filters to approximate this and he did so without justifying it. It is still not clear how he derived this or from my reading, how close it is to ideal.

The X, Y and Z "velocity" components are formed by summing the four capsules in the three combinations that exist where two of them appear negatively in each sum. Since two are positive and two are negative in each sum, the omni components, being non-directional, simply cancel. The summing of the figure eight components has to take into account the direction from which sound arrives at the mic and it turns out that, due to the angles involved, the result of each sum is another figure 8 pattern such that each lobe has two positive and two negative capsules on each side of it and the axis of the lobe goes straight between them. This gives three figure 8 patterns, each of which is orthogonal (at right angles) to the other two, and which can be aligned to up/down, left/right and front/back.
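
The sums described above can be sketched as a few lines of code. This assumes ideal, coincident capsules and leaves out any gain normalization or coincidence filtering; the capsule labels (front-left-up and so on) are the usual shorthand and are an assumption here, as is the axis orientation.

```python
import math

S = 1.0 / math.sqrt(3.0)
# Assumed capsule axes: x = front, y = left, z = up.
AXES = {
    "FLU": (S, S, S),     # front-left-up
    "FRD": (S, -S, -S),   # front-right-down
    "BLD": (-S, S, -S),   # back-left-down
    "BRU": (-S, -S, S),   # back-right-up
}


def ideal_capsules(source_dir, omni_fraction=0.5):
    """A-format outputs (FLU, FRD, BLD, BRU) for a unit plane wave."""
    return tuple(
        omni_fraction
        + (1.0 - omni_fraction) * sum(a * u for a, u in zip(axis, source_dir))
        for axis in AXES.values()
    )


def a_to_b(flu, frd, bld, bru):
    """Unscaled A-format to B-format sums; each of X, Y, Z has two minus signs."""
    w = flu + frd + bld + bru
    x = flu + frd - bld - bru   # front minus back
    y = flu - frd + bld - bru   # left minus right
    z = flu - frd - bld + bru   # up minus down
    return w, x, y, z


if __name__ == "__main__":
    for name, direction in [("front", (1, 0, 0)), ("left", (0, 1, 0)), ("up", (0, 0, 1))]:
        w, x, y, z = a_to_b(*ideal_capsules(direction))
        print(name, [round(v, 4) for v in (w, x, y, z)])
```

A source on one axis shows up only in W and in that axis's channel, which is the figure 8 behaviour described above.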

That's it in a nutshell and without all that annoying math. Problems arise, of course, because in reality the patterns of the capsules aren't as simple as an ideal omni summed with an ideal figure 8. In fact, the directivity pattern of the mic varies considerably over the full spectrum, as does its overall frequency response. Also, the coincidence compensation is an approximation that affects all frequencies with quarter wavelengths larger than the diameter of the array. The result of any inaccuracies here is a kind of comb filtering in the B-format encoding. Nonetheless, it proves to give a very satisfying result for spatial recording.


Bob
 
alanhyatt said:
Here is the problem that I see with this. Height and spacing are very important when dealing with this kind of application. Trying to adjust for differences between the capsules would affect frequency. Time and distance and their effect on frequency are known as the Doppler Effect. Trying to control this from a decoder without it being controlled at the capsules as well is probably too much trouble.

Nah, with the right DSP it's a piece of cake. Using larger capsules to get lower noise increases the distance which must be compensated but that's not actually a problem as it turns out. The problem with larger capsules is that their high frequency response patterns get weird and may lose the conditions required for extracting the directional information. One of the things I will be investigating eventually is just how serious a problem that really is when you can calculate and apply optimal linear compensations which are arbitrary. My hunch module tells me that it will yield nicely to compensation and allow for much quieter Ambisonic mics than we have seen.

BTW, the Doppler effect pertains to moving sound sources like, ostensibly, loudspeaker cones and, if it exists, is a non-linear effect. With microphones, if uncompensated, time/distance factors result in comb filtering, a linear effect.
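
To put rough numbers on the comb filtering, here is a minimal sketch of the simplest case: two identical signals summed with a path difference between them. The 343 m/s speed of sound and the 2 cm spacing are assumed example values, not measurements of any real array.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def summed_gain(freq_hz, path_difference_m):
    """Magnitude of a signal summed with a delayed copy of itself.

    The delay is the extra travel time over path_difference_m. The result
    is 2 where the copies are in phase and 0 where they cancel.
    """
    delay_s = path_difference_m / SPEED_OF_SOUND
    return abs(1.0 + cmath.exp(-2j * math.pi * freq_hz * delay_s))


def first_notch_hz(path_difference_m):
    """Lowest frequency at which the two copies are half a cycle apart."""
    return SPEED_OF_SOUND / (2.0 * path_difference_m)


if __name__ == "__main__":
    spacing = 0.02  # 2 cm, illustrative only
    print("first notch (Hz):", first_notch_hz(spacing))
    for f in (100, 1000, 4000, 8000, 12000, 16000):
        print(f, "Hz ->", round(summed_gain(f, spacing), 3))
```

The operation is a fixed linear filter, so in principle it can be undone or compensated by another linear filter; nothing in it shifts the frequency of the signal.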


Bob
 
OK, I guess I'm gonna hafta do this myself.

arcanemethods said:
Now that we are cross linked I think I'll attempt a brief and less technical explanation of the first order Ambisonic mic here, rather than there, in accordance with the principle of the conservation of threads. :-)

For anyone that is math oriented, my bible for understanding this theory is at:

http://www.personal.rdg.ac.uk/~shr97psc/Thesis.html

It covers both first and second order concepts in excellent detail. I haven't yet spent the necessary time with the second order part of it.

The ideal first order Ambisonic mic is comprised of four cardioid type capsules. These ideal capsules can be decomposed into two logical capsules summed internally. One of the logical capsules is an ideal omni (0th order) and the other is an ideal figure 8 (1st order). The relative gain of these two logical components determines whether the real capsule is cardioid, sub-cardioid, hyper-cardioid or anything in between. For the Ambisonic array, it doesn't really matter, as it turns out, which of these is used. The end result only differs by a gain factor between the W component and the XYZ components that can be easily compensated. There is no theoretical advantage in which cardioid type is used if they are ideal. In practice, there may be advantages in that whatever of these can be found that has a pattern closest to its ideal form across the spectrum will work the best.
Translation:
You need to use four really good cardioid capsules. (Cardioids are just an equal combination of figure 8 and omni.) What you use is unimportant as long as they're good cardioids.

Now to the encoding of the outputs of four ideal capsules. The W channel is just the sum of all four. It turns out that, because of the angles at which they are arranged on the faces of a tetrahedron, all of the figure 8 components of the capsules cancel out and one is left with four times the omni component of each capsule.
Translation: Okay, combine the output of all four mics and that's called the "W Channel". Because of the way they're pointing, it's equal to four omnis.

Note that this would only really be true if the capsules occupied the same point in space instead of being on the faces of an imaginary tetrahedron. This is the coincidence compensation problem and Michael Gerzon, who discovered these properties of a tetrahedral arrangement, offered in his patent on it an equation of two linear filters to approximate this and he did so without justifying it. It is still not clear how he derived this or from my reading, how close it is to ideal.
Translation: It ain't exactly perfect. And even the guy that came up with the idea never explained it very well.

The X, Y and Z "velocity" components are formed by summing the four capsules in the three combinations that exist where two of them appear negatively in each sum. Since two are positive and two are negative in each sum, the omni components, being non-directional, simply cancel. The summing of the figure eight components has to take into account the direction from which sound arrives at the mic and it turns out that, due to the angles involved, the result of each sum is another figure 8 pattern such that each lobe has two positive and two negative capsules on each side of it and the axis of the lobe goes straight between them. This gives three figure 8 patterns, each of which is orthogonal (at right angles) to the other two, and which can be aligned to up/down, left/right and front/back.
Translation: Other combinations of capsules put back the directional stuff, but it's tricky shit.

That's it in a nutshell and without all that annoying math. Problems arise, of course, because in reality the patterns of the capsules aren't as simple as an ideal omni summed with an ideal figure 8. In fact, the directivity pattern of the mic varies considerably over the full spectrum, as does its overall frequency response. Also, the coincidence compensation is an approximation that affects all frequencies with quarter wavelengths larger than the diameter of the array. The result of any inaccuracies here is a kind of comb filtering in the B-format encoding. Nonetheless, it proves to give a very satisfying result for spatial recording.
Translation: It's tricky shit that works pretty good, but it has a few problems.
 
Harvey Gerst said:
OK, I guess I'm gonna hafta do this myself.

Whatever. I was trying to address people that want to know how it works not just that it does.

Translation:
You need to use four really good cardioid capsules. (Cardioids are just an equal combination of figure 8 and omni.) What you use is unimportant as long as they're good cardioids.

That's the problem with oversimplification. It's often wrong. Read it again.


Bob
 
arcanemethods...

Here's an idea... do it every way. Oversimplification, technical, translations, and corrections... that way everyone can learn.

Oh, and if anyone posts something incorrect ... please don't make us or them guess, just correct it for everyone... and let's learn more.

:)

Thanks for taking the time to share/teach.
Don
 
OneRoomStudios said:
For the Soundfield mic, and I guess all first-order Ambisonic mics, is the inside distance between the capsules (in other words the diameter of the "mic ball" as Harvey put it in the other thread) integral to the sound? Could someone just take four regular condensers, put them in a sphere configuration (all pointed outwards) and run them through a decoder? I assume the decoder would probably have to be adjusted for the distances between the capsules and such but could it be done? And if so, could a new decca-tree-of-sorts come on to the scene to help create our own first-order ambisonic mics with four matched regular mics? Just a thought...
I think that would be cool... and with interchangeable capsules for different tones and etc... oh, and a cool pre-adjusted soundfield quad mic holder. :D
 