I don't actually use Cubase.
The "Disable audio app use of Monitor mixer and Patchbay/router" option should be checked, especially for Cubase, which apparently changes faders and pans for no good reason if it isn't.
Direct Monitoring puts extra load on the system, and it makes no sense to use it with high latency. Direct or Input Monitoring is found in most DAW programs; it echoes the incoming audio of any tracks armed for record so you can hear the effect of any FX plugins and EQ in the tracks. The soundcard buffer latency must be very low for this to work, otherwise what you hear is hopelessly delayed by at least twice the buffer latency time (the sound has to go in, through the track FX and back out). If you don't need to monitor track FX while recording, turn Direct input monitoring off.
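To put rough numbers on that, here's a quick sketch in Python. It's my own illustration, not anything taken from Cubase or the soundcard driver. It assumes the delay you hear is simply two buffer lengths (one in, one out) and ignores converter and plugin delays, so treat the result as a lower bound. The function names are made up for the example.

```python
def buffer_latency_ms(buffer_frames: int, sample_rate_hz: int) -> float:
    """Time taken to fill one soundcard buffer, in milliseconds."""
    return 1000.0 * buffer_frames / sample_rate_hz


def min_monitor_delay_ms(buffer_frames: int, sample_rate_hz: int) -> float:
    """Lower bound on software monitoring delay.

    Assumption: one buffer on the way in plus one on the way out,
    with nothing added for converters, drivers or plugins.
    """
    return 2 * buffer_latency_ms(buffer_frames, sample_rate_hz)


if __name__ == "__main__":
    # Compare a few common buffer sizes at 44.1 kHz.
    for frames in (64, 128, 256, 512, 1024):
        delay = min_monitor_delay_ms(frames, 44100)
        print(f"{frames:5d} samples -> at least {delay:.1f} ms")
```

With these assumptions a 512-sample buffer at 44.1 kHz gives a bit over 23 ms there and back, while a 64-sample buffer stays under 3 ms.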
With the M-Audio cards there is a "monitor mix" output option in the control panel patchbay, which lets you hear the inputs and outputs over the soundcard's main stereo output, at the levels set by the control panel mixer's faders. You won't hear any Cubase track FX on inputs this way though.
It's important not to use the card's monitor mix together with your software's direct input monitoring, or you'll hear the audio with two different delays and it'll sound like all the upper mids and highs are missing!
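A likely explanation for that hollow sound is comb filtering: when a signal is mixed with a delayed copy of itself, it cancels at every frequency where the delay difference equals an odd number of half-cycles. The Python sketch below is only my illustration of that arithmetic. It assumes both copies arrive at equal level, and the function name and the example delay are hypothetical.

```python
def comb_notches_hz(delay_difference_ms: float, max_hz: float = 20000.0) -> list[float]:
    """Frequencies cancelled when a signal is summed with a delayed copy.

    Assumption: both copies have equal level, so cancellation is complete
    at f = (2k + 1) / (2 * d) for k = 0, 1, 2, ...
    """
    d = delay_difference_ms / 1000.0  # delay difference in seconds
    notches = []
    k = 0
    while True:
        f = (2 * k + 1) / (2 * d)
        if f > max_hz:
            break
        notches.append(f)
        k += 1
    return notches


if __name__ == "__main__":
    # Example: the two monitoring paths differ by 5 ms (made-up figure).
    notches = comb_notches_hz(5.0)
    print(f"{len(notches)} notches below 20 kHz, first at {notches[0]:.0f} Hz")
```

The longer the delay difference, the more closely the notches are packed, so a difference of several milliseconds puts a large number of them across the mids and highs.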
Another fix that occurs to me is to turn down the graphics card's share of bus time. The setting here is called PCI latency (or, less accurately, IRQ latency). It matters even if the graphics card is in an AGP slot, but it cannot be tweaked if it's a PCI Express card. It controls how long the device can hog the computer's internal paths (buses). Nvidia and ATI graphics cards are very greedy in this respect, taking 248 or 255 clocks per access while the soundcard only asks for 32.
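For a feel of what those clock counts mean, here's a rough Python sketch that turns a latency setting into the longest time a device could hold the bus in one go. The big assumption is a conventional 33.33 MHz PCI clock; your bus may be clocked differently, so only the ratios between devices should be taken seriously. The names are invented for the example.

```python
PCI_CLOCK_HZ = 33.33e6  # assumption: conventional 33 MHz PCI bus clock


def max_bus_hold_us(latency_clocks: int, clock_hz: float = PCI_CLOCK_HZ) -> float:
    """Longest time a device may keep the bus per access, in microseconds."""
    return 1e6 * latency_clocks / clock_hz


if __name__ == "__main__":
    # Latency settings mentioned in this thread.
    for name, clocks in (("soundcard", 32), ("tweaked graphics", 64),
                         ("tweaked graphics", 128), ("greedy graphics", 248)):
        print(f"{name:17s} {clocks:3d} clocks -> {max_bus_hold_us(clocks):.2f} us")
```

Whatever the real clock rate, a card set to 248 can hold the bus nearly eight times longer per access than one set to 32.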
If you do a Google search for "pci latency tool download" you should find a generic tool for changing this, but for ATI cards I found one made just for those...
http://downloads.guru3d.com/download.php?det=733
...though I'd prefer the generic tool as it'll use fewer resources itself.
A latency setting of 128 or 64 might cure the performance problem for you.
I use a Matrox G550 AGP card in my DAW; these have a PCI latency of just 64, which is much more reasonable and less prone to cause glitches in soundcards.
Absolute crap for 3D gaming though.