Sampling rate and bit depth describe the resolution at which your digitized audio signal is stored. I like to think of them as something like the number of dots in a picture, or the number of pixels on a computer screen: the more dots, the better the quality of the picture. Same with sampling rate and bit depth. Higher values generally give better sound quality because they represent the analog signal more accurately. Sampling rate determines the highest frequency that can be captured, while bit depth determines how finely the amplitude of each sample is measured. CD quality is 44,100 samples per second at 16 bits per sample.
Dither is essentially low-level noise that is added to an audio signal when you reduce the bit depth (say, from 24 bits to 16 bits). It is designed to make the result more "musical sounding" than simply truncating the signal (i.e., just chopping off the extra bits) would.
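To make the difference between truncating and dithering concrete, here is a rough Python sketch. The function names are made up for illustration, and the triangular noise of about one 16-bit step is just one common choice of dither, not necessarily what any particular program uses.

```python
import random

def truncate_24_to_16(sample):
    """Drop the low 8 bits of a signed 24-bit sample (plain truncation)."""
    return sample >> 8

def dither_24_to_16(sample, rng=random):
    """Add triangular noise of roughly +/-1 16-bit step, then round to 16 bits.

    Assumption: triangular (two summed uniform) noise; other dither shapes exist.
    """
    step = 256  # one 16-bit step, measured in 24-bit units
    noise = rng.randint(0, step - 1) + rng.randint(0, step - 1) - (step - 1)
    value = (sample + noise + step // 2) >> 8  # round to the nearest 16-bit step
    return max(-32768, min(32767, value))      # keep the result in 16-bit range

# A very quiet 24-bit sample, smaller than one 16-bit step.
quiet = 100
rng = random.Random(0)
dithered = [dither_24_to_16(quiet, rng) for _ in range(10000)]

print(truncate_24_to_16(quiet))        # truncation always loses it
print(sum(dithered) / len(dithered))   # the average stays near 100 / 256
```

With truncation the quiet sample simply disappears. With dither, each individual output is noisy, but on average the low-level detail survives, which is why the result tends to sound more natural.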
Here are some more precise definitions taken from the Sonar manual:
Sampling rate: the rate at which the computer saves measurements of the signal strength. It is a known fact of physics that you must measure, or sample, the signal at a rate at least twice that of the highest frequency you wish to capture. For example, suppose you want to record a moderately high note on a violin--say the A whose fundamental frequency is 440 Hz and all overtones up to five times the fundamental. The highest frequency you want to capture is 2,200 Hz, so you need to measure the electrical signal from the microphone at least 4,400 times per second.
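The arithmetic in the violin example can be sketched in a few lines of Python. The function names are made up for illustration, and the folding formula for what happens to a tone above half the sampling rate is a standard result that is not part of the manual text.

```python
def min_sampling_rate(fundamental_hz, highest_overtone_multiple):
    """Return (highest frequency to capture, minimum sampling rate), both in Hz."""
    highest_hz = fundamental_hz * highest_overtone_multiple
    return highest_hz, 2 * highest_hz

def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a pure tone appears after sampling.

    Tones below half the sampling rate come back unchanged; higher tones
    fold back down to a lower, wrong frequency.
    """
    return abs(tone_hz - sample_rate_hz * round(tone_hz / sample_rate_hz))

print(min_sampling_rate(440, 5))    # (2200, 4400)
print(alias_frequency(1000, 4400))  # 1000, captured correctly
print(alias_frequency(3000, 4400))  # 1400, too high for this rate
```

So at 4,400 samples per second the 2,200 Hz overtone is right at the limit, and anything higher would show up as a different pitch rather than just being lost.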
Since humans can hear frequencies well above 10 kHz, most sound cards and digital recording systems are capable of sampling at much higher rates than that. Typical sampling rates used by modern musicians and audio engineers are 22 kHz, 44.1 kHz, and 48 kHz. The 44.1 kHz rate is called CD-quality, since it is the rate used by audio compact discs.
Sampling resolution determines how accurately the amplitude of each sample is measured. At present, the music industry has settled on a system that provides 65,536 different values to assign to the amplitude of a waveform at any given instant. Thus, each sample saved by your computer requires 2 bytes (16 bits) to store, since it takes 2 bytes to store a number from -32,768 to 32,767. The scaling of the electrical input signal level to amplitude value is determined by your audio hardware and by the position of your input level control.
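The numbers above are easy to check for yourself. This is a small Python sketch with hypothetical helper names; the two-channel (stereo) figure at the end is an assumption added to show how quickly CD-quality audio adds up.

```python
import struct

def sample_range(bits):
    """Return (number of levels, lowest value, highest value) for signed samples."""
    levels = 2 ** bits
    return levels, -(levels // 2), levels // 2 - 1

def bytes_per_second(sample_rate, bits, channels):
    """Storage needed for one second of uncompressed audio."""
    return sample_rate * (bits // 8) * channels

print(sample_range(16))                  # (65536, -32768, 32767)
print(len(struct.pack("<h", 32767)))     # 2 bytes for one 16-bit sample
print(bytes_per_second(44100, 16, 2))    # 176400 bytes per second in stereo
```

For comparison, sample_range(24) gives 16,777,216 levels, which is why going from 24 bits down to 16 bits throws away real detail.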