It’s not a DC offset. That would make the resting point the center of the waveform rather than the center of the display. It’s pretty clearly returning to rest at actual zero.
Voices, especially male voices, are asymmetrical. One of the big tricks radio announcers use to get that larger-than-life sound is phase rotation. It’s basically an allpass filter or four, centered somewhere in the meat of the voice, that shifts the phase of each harmonic by a different amount without changing its level. That scrambles the phase relationship between the harmonics and redistributes the energy so the waveform becomes much more symmetrical. On its own, this is a very subtle effect. You might not even hear the difference, but it does help compressors, limiters, and other non-linear processing work more consistently and effectively.
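If you want to see the idea in numbers, here is a rough sketch, not anybody's actual broadcast processor. It builds a deliberately lopsided test tone (a 100 Hz fundamental with 20 harmonics all in phase, standing in for a voiced sound) and runs it through four first-order allpass stages centered at 200 Hz. The test tone, the 200 Hz center, the stage count, and the function names are all my own assumptions for illustration. The filter is applied in the frequency domain, which only gives the steady-state result for a periodic signal; a real unit would run the filters sample by sample.

```python
import numpy as np

def allpass_response(freqs_hz, fc_hz, fs_hz):
    """Response of a first-order digital allpass: unity gain, -90 degrees at fc_hz."""
    t = np.tan(np.pi * fc_hz / fs_hz)
    c = (t - 1.0) / (t + 1.0)
    z_inv = np.exp(-2j * np.pi * freqs_hz / fs_hz)
    return (c + z_inv) / (1.0 + c * z_inv)

def phase_rotate(x, fs_hz, fc_hz=200.0, stages=4):
    """Apply a cascade of identical allpass stages (circular, steady-state only)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs_hz)
    spectrum = spectrum * allpass_response(freqs, fc_hz, fs_hz) ** stages
    return np.fft.irfft(spectrum, n=len(x))

def asymmetry(x):
    """Positive peak divided by negative peak magnitude (1.0 means symmetrical)."""
    return np.max(x) / -np.min(x)

def crest_factor(x):
    """Absolute peak divided by RMS level."""
    return np.max(np.abs(x)) / np.sqrt(np.mean(x ** 2))

# Assumed test signal: ten periods of a pulse-like tone, harmonics in phase.
fs = 48000
n = np.arange(4800)
f0 = 100.0
x = sum(np.cos(2 * np.pi * k * f0 * n / fs) / k for k in range(1, 21))

y = phase_rotate(x, fs)

print("asymmetry    before/after:", asymmetry(x), asymmetry(y))
print("crest factor before/after:", crest_factor(x), crest_factor(y))
```

The harmonic levels and the RMS come out the same before and after, which is why the effect is so hard to hear. What changes is the shape of the waveform and its peak level, and the peaks are what a compressor or limiter reacts to. How much improvement you get on a real voice depends on the voice and on where the filters are tuned.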