Dr. Hans W. Gierlich On The Future of Audio Quality Testing: More Than Just Frequency Response
May 2, 2023 at 9:51 AM Post #46 of 71
Neutrality can be measured, so theoretically we can have a perfectly correct system with regard to the DAC and amplification. Headphones, though, like speakers and the room, are the Achilles heel in the chain......and will probably never be perfect.


....over the years I've seen many gear their system, or at least try to, to perform the same with all recordings so that everything, regardless of its origin, sounds homogenized to their preferences. Voices have to be intimate, everything needs to be smooth (lifeless in my books), no harshness, always deep bass etc, etc.....EQing (Audio Photoshopping) the hell out of the original recording....no wonder they're constantly chasing systems.
(EQ definitely has its place, but that is for another discussion and has nothing to do with the MDAQS algorithm.)


That will be a godsend, but FR is probably still the starting block, and I was thinking about this the other day with regard to the HD820. I've quite recently come to really like this headphone, which many (myself included) originally dismissed right at the starting block, just from the published FR graphs.
Seriously though, the answer is right in front of us, just look at the headphone. How do we measure, with a standard FR measuring system, something that is physically rather different and has been purposely designed, inside and out, to create a non-standard headphone experience by manipulating and redirecting the soundwaves before they reach our inner ear? If one relies on the Harman Curve, "the optimal sound signature that most people prefer in their headphones", to make you happy, you are out of luck!....personally I prefer Diffuse field equalization.....but maybe, possibly, incorporating the MDAQS algorithm (if it's feasible) to show timbre, distortion and immersiveness, all of which IMHO the HD820 does rather well, will.
Neutrality is an actual agreed upon thing? Really? Even if it were something many agree upon (which they don’t), amplified or digitally decoded material is all over the road because there is no standardized monitoring for recording studio playback, so it would not matter in the first place. The reason is that the actual material being decoded or amplified may not be neutral. Maybe it is close to neutral, maybe not; there is no way to tell, as the original event is lost. This applies at all amplifier/DAC price points.

Perfectly neutral amplification or DAC, hmmm they are all different, some closer to the same others not, regardless of price.

The speakers in the room? Right the recording studio monitors, they are different, getting a different response all across the world, that is one reason every album sounds slightly different. So why would you even consider getting a standardized amplifier or DAC to playback a spectrum of recordings, some cool, some warm....whatever?

This part about the HD820 I agree with you. I have never heard the HD820 but that is an example of the flock style of consumerism that at times takes place. When anyone brave enough or stupid enough goes against the grain, then at times rewards are obtained. Why? Well maybe some perceived the headphone as one way...but..........

There are a plethora of reasons why they would sound different to different listeners. The sound is from the entire signal chain. The style of listener may have a preference for a certain style of tune. The cup fitting, the way the headphones are made, the way they resonate. The cool part is that the same headphone is never made quite the same each time. Similar but not the same. What this new MDAQS process does is maybe gets closer to understanding why we have an affinity to certain headphones.
 
May 2, 2023 at 12:47 PM Post #47 of 71
Neutrality is an actual agreed upon thing? Really?
Lol...I know what you mean.
Old school...we used to use the proverbial term "straight wire with gain"...at least for the pre-amp and amplification .....and my personal sound systems are still basically that, or at least I try.....but you are right...what is neutral?
So why would you even consider getting a standardized amplifier or DAC to playback a spectrum of recordings, some cool, some warm....whatever?
IMHO, theoretically a neutral system should in fact be very "chameleon" in nature so that the sound reproduced would in fact have those exact traits of the recording recreated.
Which is quite different from those who want all their recordings to sound warm or bright and lively...or "whatever".. their systems in fact veer from neutrality and never the twain shall meet.
The headphone hobby in recent years with the introduction of computers in the chain, allowing anyone to change the 1's and 0's of the original source material (the music), has exponentially created a smorgasbord of subjectivity......we may in fact be beyond the point of no return to ever finding neutrality. :triportsad:

Again I digress...
What this new MDAQS process does is maybe gets closer to understanding why we have an affinity to certain headphones
:thumbsup:
 
May 2, 2023 at 12:48 PM Post #48 of 71
I agree with the guide part in use of a FR graph understanding, except I have to disagree with different (music) listeners responding to timbre or scale replay? Everyone simply has preferences, which come out as different choices and opinions. Maybe certain music utilizes different aspects of playback, but?

Somehow some think there is a perfect DAC/Amplifier/headphone tone which is correct, when it is irrelevant due to recordings not being standardized.
DACs and amplifiers are very different from HPs, as the laws governing these electronic devices and the constraints of the components inside them are well understood. We have high precision modelling of these devices and their purposes are well defined: Reproduction and amplification of the original signal with least amount of artefacts. How far we deviate from the perfect reproduction and amplification can also be assessed with measurement methods. The nature of the next chained components is also well modeled, and there is a better standardisation of the interfaces. This is unlike the HPs, where there is a moving part, that has to interact with the human auditory system and the head which varies for every person.

So for the DAC and amplifier part we are much more in control and, as long as the engineers know what they are doing, very affordable solutions exist. I don't want to dive too deep into this discussion, but you are welcome to join in some other thread I started some time ago here, and what I think about DACs costing multiple thousands to achieve a well understood task but still failing to do it is here (but let's discuss it in another thread).

So, there is a definition for a perfect DAC and amplifier, which also drops the "three unknowns" in the chain (HP, DAC, amplifier) to one (HP), which makes it easier to cope with the imperfections in the recordings. As you said, there is no perfect recording and there is no perfect HP to cope with them all, but at least no sane and experienced mix engineer will apply a wiggling EQ curve of 10 dB peak-to-peak in the mids that we have to cope with. For my listening habits and music (and other physical features), I have a close to perfect HP (DCA Stealth). If there is still an issue, I use the PEQ on my ADI-2.
But the best part of this technology is they are making FR almost secondary to the individual's comprehension of timbre. And even with timbre being joined at the hip of FR, it may be the key to why we like different FR playbacks?
I am not sure if they are making anything other than just suggesting a way to assess a feature like timbre? I have to admit, I did not read their papers or watch all their videos (except the posted one), but I don't see how FR can be of secondary importance. The ASA definition:

[Attached image: 1683043292327.png]
 
May 2, 2023 at 3:37 PM Post #50 of 71
Once again, if there were a formula for this reproduction success in linearity, which you believe exists (which is not agreed upon), then all DACs and amplifiers would have been made to sound exactly the same in the early 1980s. But there is no definition of such a thing; in the end we win, if you understand that “color” can be a great thing. Still, because you're feeding the DAC/amplifier various tones that have been recorded (free from a standardized method), we are still not arriving at neutrality? It simply depends on what you believe artifacts are. I myself find a subtle introduction of warmth to be acceptable, yet other amplifiers are cooler and that is acceptable too, but at times goes with different headphones. You see, we have choices of these tonal artifacts; we actually can win in the end, if you know how to access such departures from linearity. So are you saying all the manufacturers that build an amp that sounds one way are wrong? I’m curious here? Or are you saying all amplifiers/DACs sound identical, or at least should sound identical to each other?
 
May 2, 2023 at 3:58 PM Post #51 of 71
Right, I never said it is of secondary importance. I simply said maybe; if you read my exact words, I used the word almost, which does not state secondary importance.

Please don’t misrepresent what I am writing.

It is just that they are maybe finding a way to test why people like headphones, beyond simply testing FR. Of course FR and timbre are locked together to a point. The FR starts to somehow find its way to being even, complete and correct in replay (to a point); then add the ability to reproduce realistic timbre and we are getting somewhere. But what their tests show is that some headphones that produce a skewed FR possibly still produce a desired tone through timbre response. We know the two features of timbre and FR are related, but not totally and not always. That’s what this whole thing is about. This is all about trying to understand how and why people can actually love different headphone response character, different FR, yet have timbre. I don’t even know if it’s always totally accurate timbre, as at times successful timbre is off-color and still gorgeous, within certain parameters.
 
May 2, 2023 at 4:02 PM Post #52 of 71
I really don't want this thread to turn into another subjective vs objective or sterile vs non-sterile thread. If you like, we can take this discussion to somewhere else. :wink:
 
May 2, 2023 at 4:10 PM Post #53 of 71
I really don't want this thread to turn into another subjective vs objective or sterile vs non-sterile thread. If you like, we can take this discussion to somewhere else. :wink:
Why, we have come so far? And I never said objective subjective, sterile non-sterile, if you can’t answer my questions as to amp/DAC character that’s fine. :)
 
May 2, 2023 at 4:29 PM Post #54 of 71
Why, we have come so far? And I never said objective subjective, sterile non-sterile, if you can’t answer my questions as to amp character that’s fine. :)
OK, briefly. :) I spend my money on objectively good devices. The RME ADI-2 Pro I have is my endgame device (also due to its rich feature set) and there is no DAC and/or amplifier on the market that could replace it, at any price. It is clean with no coloration, which is what I prefer, as I do my tweaking elsewhere. If I want color, I use its great PEQ.

I don't say I am right, but I am content with what I have, and I came to the conclusion that it is the right way to go for me. I am happy not just with what I have, but also because the chase is over for me and I can concentrate on listening to music. :wink:
 
May 2, 2023 at 4:31 PM Post #55 of 71
Congratulations!
 
May 2, 2023 at 5:15 PM Post #56 of 71
The idea that there isn't one sure target for an entire system is a poor argument in support of letting every part of the system be chaos. The best for fidelity is not less fidelity with more unpredictable variables added for fun. DACs do not need to sound different because it solves none of the issues we have with the audio circle of confusion you brought up.

Also, the product presented in this thread claims good prediction results because it is tasked with correlating often massive differences in headphone measurements with the just as massive consequences on the subjective experience from those headphones (I'd like to know more about how those listening tests were conducted BTW). With DACs, once you go with scientific listening tests, I don't expect much will remain to build a distinct subjective sound profile. DACs in general are just that much more accurate and stable compared to transducers.
As a curious nerd, I like learning about how they work. But as a listener, I've completely lost interest in DACs.
 
May 5, 2023 at 2:27 PM Post #57 of 71
For the HEAD acoustics guys.
In the second or third video, we see results for, I think, 3 products with... was it speakers, a pair of headphones, and maybe something in a car? Maybe I'm mixing things up in my head, I watched quite a few sped up videos in one sitting and somehow ended up needing Till Papenfus to tell me all about hammers and acoustic analysis. It was just one example anyway, but I'm curious. Were there actual trials with listeners having to A/B and rate headphones against speakers (or IEM vs car stereo, or...), and if so, what were the trends for global ratings? Or did the trials remain within the same playback system types, and it's your own SkynetGPT that decided it could use the rating basis for all sound systems?

Hey @castleofargh, sorry for the late response. I'll try to elaborate a bit on the auditory tests.

The philosophy behind the MDAQS algorithm is that we all listen to music binaurally, regardless of the playback device (car stereo, IEMs, BT speakers, etc.). And the goal was to create an algorithm that could better correlate the measured binaural response of an audio device output to the subjective impressions.

As a participant in the auditory tests we conducted, you could have been exposed to audio files originating from many different audio devices. Headphones were of course included, but so were car stereos, BT speakers, loudspeaker pairs. However, to be clear, some auditory tests did only include products from one category. That was part of the randomization of the testing.

The participants in the auditory tests would be asked to compare the same short music sample (~4-6sec in length) played back from system A vs. System B, and record their preference scores for Timbre, Distortion, Immersiveness and Overall Quality, before moving on to the next comparison. Do that a total of 90 times (6 different music tracks, multiple different device comparisons) and the test is over.
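As a rough illustration of the test format described above, here is a minimal sketch of how a randomized A/B schedule could be generated. The count of six devices is purely an assumption chosen so that every device pair on every track gives 90 trials (6 tracks x 15 pairs); the names `build_schedule`, `track_N` and `device_N` are hypothetical and this is not the actual test software.

```python
import random
from itertools import combinations

def build_schedule(tracks, devices, seed=None):
    """Return a shuffled list of A/B trials: every device pair on every track."""
    rng = random.Random(seed)
    trials = []
    for track in tracks:
        for first, second in combinations(devices, 2):
            pair = [first, second]
            rng.shuffle(pair)  # randomize which device is presented as "A"
            trials.append({"track": track, "A": pair[0], "B": pair[1]})
    rng.shuffle(trials)  # randomize presentation order across the session
    return trials

tracks = [f"track_{i}" for i in range(1, 7)]    # 6 music excerpts
devices = [f"device_{i}" for i in range(1, 7)]  # assumed: 6 playback systems
schedule = build_schedule(tracks, devices, seed=0)
print(len(schedule))  # 90 trials
```

Each trial would then collect four preference scores (Timbre, Distortion, Immersiveness, Overall Quality) before moving on to the next one.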

(Bonus, this is a great way to weed out what Magnus calls "Human Random Number Generators", since we can easily catch people who rate A>B>C>A, and then discard all their scores as useless)
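The A>B>C>A check can be sketched in a few lines. This is only an illustration under assumptions: each listener's answers are stored as hypothetical (winner, loser) pairs for a single attribute, and a single circular triad is enough to discard that listener, which may be stricter than what was done in practice.

```python
from itertools import permutations

def has_circular_triad(preferences):
    """True if the (winner, loser) pairs contain a cycle such as A>B, B>C, C>A."""
    beats = set(preferences)
    items = {name for pair in beats for name in pair}
    return any(
        (a, b) in beats and (b, c) in beats and (c, a) in beats
        for a, b, c in permutations(items, 3)
    )

def keep_consistent_listeners(ratings):
    """Drop every listener whose preferences contain a circular triad."""
    return {
        listener: prefs
        for listener, prefs in ratings.items()
        if not has_circular_triad(prefs)
    }

ratings = {
    "listener_1": [("A", "B"), ("B", "C"), ("A", "C")],  # consistent ordering
    "listener_2": [("A", "B"), ("B", "C"), ("C", "A")],  # A>B>C>A, discarded
}
kept = keep_consistent_listeners(ratings)
print(sorted(kept))  # ['listener_1']
```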

Anyway, if you do that with enough people, you end up with enough data for building the model and supplying it with training AND validation data.
Hope that helps.

As for your questions about trends, (unsurprisingly) high-end car stereos and headphones were rated the highest. And this might not go over well in this forum, but I believe a car stereo was the highest overall (think 20+ channels of juicy audio). :grimacing:

In terms of skynetGPT, it's always a balance between data size and niche size.
If we only focused on the auditory data from headphone-to-headphone comparisons, our sample size would be heavily reduced. MAYBE it becomes more tailored to headphones, but we also lose something in the process. For proper Deep Learning ML implementations we would ideally want 10,000+ samples. And since the initial goal was to have an algorithm that just evaluates binaural audio signals (like humans do), we wanted to create something that fit all general audio applications.


PS. Glad you enjoyed the Till Papenfus video 🙃 He's got some great videos out there.
 
May 5, 2023 at 5:32 PM Post #58 of 71
Thank you for the clear reply. Exactly what I was curious about. It is a reassuring and pretty smart (IMO) way to approach this.

[Attached image: CpIlMEdVUAQlhIW.jpg]
 
May 6, 2023 at 8:43 AM Post #59 of 71
Thank you @Mr.Jacob for the interesting research and your effort to bring that experience also to our headphones community, and to @jude for bringing this to our attention. :) After lots of bla bla in my previous posts, at last I found the time to watch at least your linked presentations. I have several questions.

1. I can maybe guess the reason behind the smaller distortion coefficient in the simplified model here. My question is, how can inexperienced listeners who don't fully understand a subjectively vague concept like distortion make a rating about it, especially when the volume is low or the distortion manifests as an oddness in FR (assuming it is audible)? Does it even make sense for headphones?

2. Have you also tried it with experienced critical listeners, especially those who can relate the subjective experience to objective facts, and did you get similar subjective results? Or did they just become part of the same data used to train your DL network? Was there a study to evaluate them separately? Personally, I am not the same person I was only a few years ago, and what I would have rated positively back then might be a hard fail today.

3. Is the intention to make an industry standardization? Can we, in the future, expect different manufacturers making "from the factory" evaluation and rating of their, for example, headphones? For the automotive industry this is of course a different story, as standardization in that industry is a must.

4. There is a lot of resistance against quantified data (even if it has a foot in the subjective feedback) in this community. You can see traces of it also in this thread, full of hope that your work will prove that frequency response is meaningless. Unfortunately, many vendors, actively or passively, go along with those claims so as not to spread negative or unwanted (I use the term snake oil for it) information about their products. Do you think the community can be convinced without honest efforts from the vendors? My guess is HeadFi is a small community within a much, much larger population of casual HP listeners around the world, so we don't really matter much. :p

Lastly my opinion, I value the effort to bring this quantitative approach to our headphone community and hope that it will also honestly be supported by the vendors. I personally am not too convinced about a single number evaluation rating as I value, for example, timbre over everything else, and would prefer to look at individual ratings, maybe in addition to the frequency response, but still a valuable contribution that supports us on the way to become more informed consumers.

Thanks!
 
May 7, 2023 at 2:44 PM Post #60 of 71
While we're at it, I wonder if on the measurement side, you look at distortions beyond the classic and not often useful THD (not often useful regarding subjective preference).
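For reference, the classic THD figure mentioned here is the RMS sum of the harmonic amplitudes divided by the amplitude of the fundamental. The sketch below shows that calculation on a synthetic signal with made-up harmonic levels (1 % second and 0.5 % third harmonic); the function names are hypothetical and it says nothing about which metrics the MDAQS measurements actually use.

```python
import math

def tone_amplitude(samples, cycles):
    """Amplitude of the component that completes `cycles` periods in the window."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * cycles * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * cycles * i / n) for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n

def thd(samples, fundamental_cycles, max_harmonic=5):
    """THD as RMS sum of harmonics 2..max_harmonic relative to the fundamental."""
    fundamental = tone_amplitude(samples, fundamental_cycles)
    harmonics = [
        tone_amplitude(samples, k * fundamental_cycles)
        for k in range(2, max_harmonic + 1)
    ]
    return math.sqrt(sum(h * h for h in harmonics)) / fundamental

# Synthetic test tone: fundamental plus assumed 1 % H2 and 0.5 % H3.
n = 4800
signal = [
    math.sin(2 * math.pi * 100 * i / n)
    + 0.01 * math.sin(2 * math.pi * 200 * i / n)
    + 0.005 * math.sin(2 * math.pi * 300 * i / n)
    for i in range(n)
]
thd_ratio = thd(signal, 100)
print(f"THD = {100 * thd_ratio:.2f} %")  # THD = 1.12 %
```

A single number like this folds all harmonics together regardless of their order or audibility, which is one reason it often correlates poorly with preference.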
 
