This is a very interesting example of how things aren't always as they seem.
Everybody will agree that the right channel sounds "louder" or "more present" than the left channel. Yet look at the numbers (as measured in Sound Forge):
PEAK LEVEL (dBFS)
L: -9.0
R: -12.8
RMS LEVEL (dBFS)
L: -21.0
R: -22.5
If you went by the numbers, the left channel would definitely be the louder of the two, as it leads in both peak and RMS values. Yet clearly this isn't the case.
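For anyone wondering what those two readings actually measure, here is a rough sketch in Python. It assumes samples are floats where full scale is 1.0, and that RMS is referenced so a full-scale sine reads about -3 dBFS; Sound Forge may use a different reference or windowing, so treat the function names and conventions here as my assumptions, not its actual method.

```python
import math

def peak_dbfs(samples):
    """Level of the single largest sample, in dB relative to full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")

def rms_dbfs(samples):
    """Average power of the whole selection, in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")

# One cycle of a sine wave at half of full scale, as a test signal.
n = 1000
sine = [0.5 * math.sin(2 * math.pi * i / n) for i in range(n)]

print(f"peak: {peak_dbfs(sine):.2f} dBFS")
print(f"rms:  {rms_dbfs(sine):.2f} dBFS")
```

Peak only looks at one sample and RMS is one average over the whole selection, so neither figure says anything about how the energy is spread across frequency or time.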
What's happening here is that the signal coming through the vamp on the right is much denser than the miked signal on the left. See the attached blow-up of a section of the waveforms (L on top, R on bottom) to see the difference visually; the right channel just looks denser. And it sounds that way too, because there is actually more sonic information, even if it isn't "louder".
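One rough way to put a number on "denser" is the gap between peak and RMS, often called the crest factor. The sketch below just does that subtraction on the readings posted above; the variable and function names are mine, and a smaller gap is only consistent with a more compressed signal, not proof of it or a measure of loudness.

```python
# Readings from the post, in dBFS.
levels = {
    "L": {"peak": -9.0, "rms": -21.0},
    "R": {"peak": -12.8, "rms": -22.5},
}

def crest_factor_db(peak_db, rms_db):
    """Peak-to-RMS gap in dB; a smaller gap means the peaks sit closer to the average."""
    return peak_db - rms_db

crest = {ch: round(crest_factor_db(v["peak"], v["rms"]), 1) for ch, v in levels.items()}

for ch, value in crest.items():
    print(f"{ch}: {value} dB between peak and RMS")
```

The left works out to 12.0 dB and the right to 9.7 dB, so the right channel's peaks sit about 2.3 dB closer to its average level, which fits what the waveforms show.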
Why is this happening? It sounds (and looks) to me as though the vamp is compressing the signal, and in doing so is also giving it a bit of a high-end boost. This boost is what synkotron erroneously heard as an octave modulation. Both channels are playing the same thing, but the right channel is brighter and more present.
How do you fix this? The simplest way is to just increase the gain on the left channel. So what if they don't look or read out at the same level? Nobody cares how the song looks or reads. If both sides sound the same volume, that's all that matters. As is always said here, mix with your ears.
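If you want to see what a gain change does to the samples themselves, here is a minimal sketch, again assuming float samples with full scale at 1.0. The function names are hypothetical and the 6 dB figure is only an example; the right amount is whatever sounds matched.

```python
import math

def apply_gain_db(samples, gain_db):
    """Scale every sample by a gain given in dB (+6 dB is roughly double)."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]

def headroom_db(samples):
    """How many dB of gain can be added before the largest sample hits full scale."""
    peak = max(abs(s) for s in samples)
    return -20 * math.log10(peak) if peak > 0 else float("inf")

left = [0.5, -0.25, 0.1]          # made-up samples, not the actual track
boosted = apply_gain_db(left, 6.0)

print(f"headroom before: {headroom_db(left):.2f} dB")
print(f"headroom after:  {headroom_db(boosted):.2f} dB")
```

With the left channel peaking at -9.0 dBFS there is about 9 dB of room before clipping, so a few dB of boost by ear is safe.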
Or you can try some compression on the left side to bring the densities in line, and then EQ a little to taste so it sits with the right side how you wish.
Try to remember this the next time the discussion of RMS and volume comes up: they don't necessarily mean the same thing.
G.