It's an idea more likely to be horrible than good. The reason we all use editors is, at the simplest, to trim the start and finish of the video, so while you're doing that, add the audio there too, because you can adjust its level up and down. One of the most common complaints in YouTube comments is that the sound is poor, or the music drowns out the speaker, or the speaker is too loud. Picture quality comments are rare. Everyone is an audio expert.
Analogue video was low quality, 240-line resolution at best, and linear editing resulted in a one-generation loss, which was just about tolerable. Then we invented digital, and copying no longer loses quality. Copies are clones. Going into an editor is not the issue. Quality changes, as said, come from conversions between standards. Going in as .mov then outputting as .mp4 (which usually means a re-encode), or converting interlaced to progressive - these things change quality, but the question is, can you even see it? The answer is sometimes, but that usually means you were too radical. Squashing it too much to get a small file size, maybe?
With your idea here, the quality of the recorder is the key. Voice recorders are rarely designed for maximum quality.