Just to clarify, is everything going through the M-Audio interface, including playback? Do sample rates match for everything, from interface to samples to session?
Going out of sync over a few measures sounds more like a sample rate mismatch than a clocking error. Or it could be dropping massive...
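
To put rough numbers on the sample rate vs. clocking point, here is a quick back-of-the-envelope sketch. The tempo (120 bpm, 4/4), the 44.1 kHz vs 48 kHz pairing, and the 50 ppm clock tolerance are all assumptions picked for illustration, not details from this thread, and the function names are made up.

```python
def measures_to_seconds(measures, bpm=120, beats_per_measure=4):
    """Length of a number of measures in seconds (assumed tempo and meter)."""
    return measures * beats_per_measure * 60.0 / bpm


def rate_mismatch_drift_ms(file_rate_hz, playback_rate_hz, elapsed_s):
    """Offset in ms after elapsed_s seconds when audio recorded at
    file_rate_hz is clocked out at playback_rate_hz."""
    # The file is consumed playback_rate/file_rate times faster than intended,
    # so it gets ahead (or behind) by that ratio minus one, per second.
    return elapsed_s * (playback_rate_hz / file_rate_hz - 1.0) * 1000.0


def clock_drift_ms(ppm, elapsed_s):
    """Offset in ms after elapsed_s seconds for two clocks that differ by
    ppm parts per million (assumed tolerance, not a measured value)."""
    return elapsed_s * ppm * 1e-6 * 1000.0


if __name__ == "__main__":
    t = measures_to_seconds(4)  # four measures at the assumed tempo
    print("sample rate mismatch:", rate_mismatch_drift_ms(44100, 48000, t), "ms")
    print("clock drift at 50 ppm:", clock_drift_ms(50, t), "ms")
```

With those assumed numbers, four measures is 8 seconds: a 44.1 kHz file played at 48 kHz ends up roughly 707 ms ahead (and audibly sharp), while two clocks 50 ppm apart differ by well under a millisecond. That is why drift you can hear within a few measures points to a rate mismatch rather than clocking.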