Onsets of audio beeps differ by multiples of 10 ms instead of being synchronized

Hi everyone, I’ve got a very weird problem and would appreciate it a lot if someone could help me with it.
I have a simple audio experiment in which the numbers 1, 2, 3, 4 are multiplied by 250 ms to set the sound durations. I’ve also put a 50 ms gap between sounds to keep them separate.
Today I recorded the sounds and looked at them in Audacity. When I align the starts of the first sounds across sequences, the second sounds do not start at the same time. For example, the start times of the second beeps are 0.581, 0.571, 0.591, and 0.561 s, so they differ by multiples of 10 ms. The same holds for the third and fourth beeps. The gap between the sounds also varies and is not 50 ms! I want to know what causes this problem and how I can solve it. I’ve put the experiment here if you want to look at it.
audioseqq.psyexp (24.7 KB)
This is a screenshot from Audacity where I compare the sounds. You can see that the onset times don’t match and that the gap is longer than 50 ms.

Thank you so much in advance for your help.

Have you tried different sound backends? For some of them, a certain amount of latency and variability seems to be expected; see “Sound - for audio playback” in the PsychoPy v2022.2.4 documentation.

Yes, I tried different sound libraries, including PTB, which is supposed to have the smallest latencies, and also different latency priorities. The onsets still shift back and forth in steps of 10 ms. What I can’t understand is why the shifts are exact multiples of 10 ms?!
I have used more or less the same code for rhythmic visual tasks, and now I’m wondering whether the visual stimuli are out of phase too!!
Is there a way to check whether the visual stimuli are out of phase or not?

Tomorrow I will be back at my PC and will try to replicate this to see whether the 10 ms effect is robust. If timing precision is very important to you (and you always have the same number of sounds in a sequence), you could also check whether it helps to put the different sounds into separate components of the same routine. That way, fewer other things (looping, code) would have to happen between the sounds. Maybe that makes a difference…

I think the question of whether this is even a (solvable) problem for visual stimuli depends on the framerate of the screen you are using. You could test this the same way as with audacity, by recording your screen and counting frames.
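To make the limits of frame counting concrete, here is a small sketch of how much a duration can be off when it is measured by counting frames in a recording. It assumes that each recorded frame is an instantaneous snapshot and that the recording runs at a constant rate; the function name is made up for illustration.

```python
def duration_bounds_from_frame_count(n_frames, recording_fps):
    """Bounds (in seconds) on the true duration of a stimulus that is
    visible in n_frames consecutive frames of a recording.

    Assumption: every recorded frame is an instantaneous snapshot taken
    at a constant rate, so onset and offset are each only known to
    within one recorded frame.
    """
    if n_frames < 1:
        raise ValueError("the stimulus must be visible in at least one frame")
    frame_s = 1.0 / recording_fps
    shortest = max(0.0, (n_frames - 1) * frame_s)
    longest = (n_frames + 1) * frame_s
    return shortest, longest


# A square seen in 30 frames of a 120 fps recording:
low, high = duration_bounds_from_frame_count(30, 120)
print(f"true duration is between {low * 1000:.1f} and {high * 1000:.1f} ms")
```

So at 120 fps a nominal 250 ms stimulus can legitimately show up as 29, 30, or 31 frames, and a difference of one frame between trials does not by itself prove that the stimulus timing is off.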


Seems to be replicable…


If you really need accurate / reproducible timings, wouldn’t it be possible to pre-record (or pre-generate in code) the audio sequences, including the gaps? You could then play them as one instead of several distinct sounds, and the gaps’ duration would be guaranteed.
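As a rough sketch of the pre-generation idea, the whole sequence (beeps plus silent gaps) can be built as a single array of samples, so the gaps are fixed by sample counts rather than by when each sound happens to start. The 250 ms unit and the 50 ms gap come from the original post; the sample rate, the 440 Hz tone, the 5 ms fade, and all function names are my own assumptions.

```python
import numpy as np

SAMPLE_RATE = 44100  # Hz; assumed, should match the audio device setting


def make_beep(duration_s, freq_hz=440.0, sample_rate=SAMPLE_RATE, ramp_s=0.005):
    """Sine beep with short linear fades to avoid clicks (assumed values)."""
    n = int(round(duration_s * sample_rate))
    t = np.arange(n) / sample_rate
    tone = np.sin(2 * np.pi * freq_hz * t)
    n_ramp = min(int(round(ramp_s * sample_rate)), n // 2)
    if n_ramp > 0:
        ramp = np.linspace(0.0, 1.0, n_ramp)
        tone[:n_ramp] *= ramp            # fade in
        tone[n - n_ramp:] *= ramp[::-1]  # fade out
    return tone


def make_sequence(multipliers, unit_s=0.250, gap_s=0.050, sample_rate=SAMPLE_RATE):
    """Concatenate beeps of multiplier * unit_s, separated by silent gaps.

    Returns the samples and the onset time (in seconds) of every beep.
    """
    gap = np.zeros(int(round(gap_s * sample_rate)))
    parts, onsets, position = [], [], 0
    for i, multiplier in enumerate(multipliers):
        if i > 0:
            parts.append(gap)
            position += gap.size
        beep = make_beep(multiplier * unit_s, sample_rate=sample_rate)
        onsets.append(position / sample_rate)
        parts.append(beep)
        position += beep.size
    return np.concatenate(parts), onsets


samples, onsets = make_sequence([1, 2, 3, 4])
print(len(samples) / SAMPLE_RATE, onsets)
```

The resulting array could then be saved as a WAV file or handed to the sound library as one stimulus; I have not checked how this fits into your particular experiment.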

(I have no means to open the experiment file in PP GUI, so I may have misunderstood your code - I’m sorry if that’s the case.)

Thank you very much for your answer. In fact, I did that, and as you said, the timing was very accurate. But the problem is that I have a visual version of the same code too. There, instead of sounds, a square is shown on the screen for a certain amount of time. As this is a timing project, the durations need to be very exact, but it seems they are not!
I recorded the screen at 120 frames per second while my visual task was running, then I counted the number of frames it takes for one square to be replaced by the next. Unfortunately, the timing is not very precise.
Do you think my method (recording the screen and counting frames) might be the problem, or are the stimuli actually imprecise? If the latter is true, how can I solve it?
It’s worth mentioning that I even tried the experiment with no custom code, using only routines, but the result is the same.

@Zhaleh, I am not familiar with your experiment setup in detail, so again, I’m only talking about things in general. If you need exact timing, both your experiment and your verification methods need to be accurate. What I mean is you probably need dedicated hardware (with low latency and jitter) to be able to say with confidence that your measurement was correct. Something like what Black Box Toolkit offers (this is no attempt at marketing, we just happen to have that).

Did you time your visual stimulus by frames or by (milli)seconds? In my experience, PP is quite precise when it comes to showing something for a certain number of screen frames; it gives you precise control of what appears on each frame. As long as you think of time in frames (and not in seconds), and you don’t perform any processing task that might cause you to miss a flip, it should be accurate.
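To illustrate what thinking in frames means, here is a small sketch that converts a requested duration into a whole number of frames and reports the duration you would actually get. The refresh rates are example values and the function name is made up; it only does the arithmetic and does not touch PsychoPy itself.

```python
def frames_for_duration(duration_s, refresh_hz):
    """Nearest whole number of frames for a requested duration, plus the
    duration that this many frames actually lasts on the given monitor."""
    n_frames = max(1, round(duration_s * refresh_hz))
    return n_frames, n_frames / refresh_hz


# Example refresh rates (assumed); durations are the ones from this thread.
for refresh_hz in (60, 144):
    for requested_s in (0.250, 0.050):
        n, actual_s = frames_for_duration(requested_s, refresh_hz)
        print(f"{refresh_hz} Hz: {requested_s * 1000:.0f} ms -> "
              f"{n} frames = {actual_s * 1000:.2f} ms")
```

On a 60 Hz monitor both 250 ms and 50 ms are whole numbers of frames (15 and 3), while on a 144 Hz monitor 50 ms falls between 7 and 8 frames, so it cannot be shown exactly. If you set the duration in frames rather than in seconds, you at least know which of the two you are getting.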