I'm always very aware of what's going to happen when people put on headphones, or worse sometimes, in-ears. Tuning suffers in about 50% of cases. People who sing perfectly in tune can be sharp or flat and not notice! I have terrible trouble with this on stage playing bass. I absolutely cannot play my fretless on in-ears at all, yet I'm fine using a floor wedge. Stick it in my ears and I can be a semitone flat or sharp and not notice - and worse still, it's a consistent flatness - so I can actually play a few bars perfectly out of tune until I play an octave up, when something seems to kick in and I panic. Oddly, I can sing in tune up top but play out of tune on my bass at the same time. For people I'm recording, if they sing out of tune, I'll suggest they try one ear only - which usually cures it.
Other people seem to need a very precise mix in their headphones or they lose it - on speakers they seem to hear what they need better.
As for voice quality and tone - any adjustment decisions made on headphones will be wrong, and I think it's because we don't just hear with our ears, but through bone conduction too. I've noticed that when you hand untrained or inexperienced people a microphone, the unfamiliar voice they suddenly hear sometimes makes them stop speaking or singing. I suspect we are just not used to what comes out of the monitor speakers being a representation of what we really sound like - the sound everyone else hears, but you never do yourself.
I don't know if you've ever tried the delay trick. We used to do it with off-tape monitoring in the analogue reel-to-reel days, but it's easy to simulate with an electronic delay. Feed somebody NO direct sound through headphones, and 100% delayed sound. It only has to be a short delay - sometimes not even half a second - and they cannot speak. When they try, they sound drunk, and slur their words terribly. We used to use this to demonstrate how hearing helps you form words properly. A doctor friend suggested that this is also why people who have had strokes speak strangely - they hear their voice via different neural pathways, and after years of hearing it one way, the tiny delays wreck speech. Deaf people don't have this feedback and their speech is sometimes remarkably similar to what happens with delays.
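If you want to see what the delay trick amounts to in software, here is a minimal sketch, not a definitive implementation. It works offline on a list of samples and leaves out live audio input and output entirely. The function names, the 48 kHz sample rate and the 200 ms delay are all my own assumptions for illustration; the only thing taken from the description above is that the headphone feed is 100% delayed signal with no direct sound.

```python
from collections import deque


def ms_to_samples(delay_ms, sample_rate):
    """Convert a delay in milliseconds to a whole number of samples."""
    return int(round(sample_rate * delay_ms / 1000))


def delayed_only(samples, delay_samples):
    """Return the headphone feed: 100% delayed signal, no direct sound.

    The first `delay_samples` outputs are silence, after which the input
    reappears unchanged but late. (Hypothetical helper, offline only.)
    """
    # Pre-fill the delay line with silence.
    line = deque([0.0] * delay_samples)
    out = []
    for s in samples:
        line.append(s)              # newest sample goes in at the back
        out.append(line.popleft())  # oldest sample comes out to the headphones
    return out


# Assumed values: 48 kHz audio and a 200 ms delay.
delay = ms_to_samples(200, 48000)
feed = delayed_only([0.1, 0.2, 0.3, 0.4], 2)
```

The point of the design is that nothing from the current input sample ever reaches the output directly, which is what distinguishes this from an ordinary echo effect where the dry signal is mixed back in.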
Some people can still sing properly with both ears covered, some people can manage one-eared, but a few cannot cope with near field monitoring at all if it blots out the room.
I've always had a rule. Mix on headphones if the audience will listen on headphones; if they will use speakers - even bad ones - mix on speakers. On speakers, pan helps spatial location by linking to time - so delay and reverb effects have a big impact, and one that is different on headphones.