Most software lists the buffer size in samples. I don't know why Audacity uses milliseconds, but that's what was listed.
USB works like a bucket brigade. As data is sent, it collects in a bucket. The computer checks whether there is data in the bucket; if there is, it pulls the data, processes it, and moves on. In the settings I mentioned above, each bucket holds 128 samples of data. If you set the buffer too low and the computer doesn't get there in time, the bucket overflows and you lose samples. This shows up as clicks, pops, and dropouts.
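To compare a buffer given in samples with one given in milliseconds, you can convert between the two. Here is a rough Python sketch. The sample rates (44100 and 48000 Hz) are my assumption, since the rate depends on your project and device settings, and the function names are just made up for illustration.

```python
def buffer_ms(samples: int, sample_rate_hz: int) -> float:
    """Duration of a buffer of `samples` frames, in milliseconds."""
    return samples / sample_rate_hz * 1000.0


def buffer_samples(ms: float, sample_rate_hz: int) -> int:
    """Number of frames that fit in a buffer of `ms` milliseconds."""
    return round(ms / 1000.0 * sample_rate_hz)


# Assumed sample rates; substitute whatever your device actually runs at.
for rate in (44100, 48000):
    print(f"{rate} Hz: 128 samples = {buffer_ms(128, rate):.2f} ms, "
          f"20 ms = {buffer_samples(20, rate)} samples")
```

So a 128-sample bucket is only about 3 ms of audio at either rate, which is far smaller than a buffer measured in tens of milliseconds.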
Audacity appears to start out with a very big buffer, so there's no chance of data being lost. The problem is that a bigger buffer causes a larger delay. There is an excellent video on YouTube that explains the process, but I can't remember where it is right now.
I really don't know how Ffine sets up the audio transfer. Perhaps someone who has used the Ffine knows more about its settings.
Try changing the buffer to something more workable, like 20 ms, and see if it works better. You can't break anything. I'm guessing that the extra 30 ms tacked on for the compensation is an estimate of how long Audacity takes to process the audio itself (converting from 16-bit to 32-bit floating point, storing the number, then plotting it on the waveform). Your method of recording a speaker playback is OK. Make sure that you have the mic right at the front of the speaker. Changing the distance will change the delay! Remember that sound moves at about 0.34 meters/ms. Record a click track and use that.
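To get a feel for how much the mic distance matters, here is a small Python sketch. It assumes sound travels at 343 m/s (dry air at roughly 20 °C). The function names and the click offset in the example are hypothetical numbers for illustration, not measurements from your setup.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumption: dry air at about 20 degrees C


def acoustic_delay_ms(distance_m: float) -> float:
    """Time for sound to travel from the speaker to the mic, in ms."""
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1000.0


def latency_from_click_ms(offset_samples: int, sample_rate_hz: int,
                          mic_distance_m: float) -> float:
    """Delay between the original click and the recorded click,
    with the speaker-to-mic travel time subtracted out."""
    measured_ms = offset_samples / sample_rate_hz * 1000.0
    return measured_ms - acoustic_delay_ms(mic_distance_m)


# How much delay the air gap alone adds at a few distances.
for d in (0.01, 0.1, 0.5, 1.0):
    print(f"{d:.2f} m -> {acoustic_delay_ms(d):.2f} ms")

# Hypothetical example: recorded click lands 2205 samples late at 44100 Hz,
# with the mic 10 cm from the speaker.
print(f"{latency_from_click_ms(2205, 44100, 0.1):.2f} ms")
```

To use it, zoom in on the click track and the recorded track, read off how many samples apart the two clicks are, and plug that in along with your sample rate and mic distance. With the mic right against the speaker, the air gap adds well under a millisecond, so it only starts to matter once the mic is a foot or more away.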