To me, "study" implies a scientific, objective methodology to ascertain something measurable/quantifiable. Your post title says "how many people can hear the difference," so we have to assume there's a physical, measurable thing being studied. I think what is maddening for all of us is how this kind of topic always turns into a tree-circling waste of time, as the goalposts keep getting moved.
I posted one study related to lossy audio, and the posts by some participants exposed some of the test's faults. But many of those faults are the kind you cannot rectify in a test of this kind of thing, where the simple matter is that some things don't matter in the same way to all people. That is, even if their perceptive ability might make it possible to discern differences, it could be that this is not an area they have trained themselves to recognize, or they just don't care. That's only "subjective" in the sense that some of the subjects think about things differently, not that there's a subjective (vs. objective) part of the hypothesis or test method. I can think of all kinds of things you might do to mitigate a lot of the variables in testing (pre-test hearing exams, controlled listening environments, etc.), but then what do you have? Results from a test that have no basis in application in the real world. Nobody wants to pay for that. (A subjective, unfounded IMHO.)