Voice Key RT issues

OS: Win11
PsychoPy version: v2022.2.5
Standard Standalone: y

What are you trying to achieve?:

I would like a reliable reaction time from the microphone - effectively onset of voice.

What did you try to make it work?:
I’ve built an experiment and used the Microphone component to record sound. Recording works and I get sound onset and offset times, but the onset times are far too early - I think they just reflect the microphone initializing rather than the start of speech.

What specifically went wrong when you tried that?:
There’s no error message - the experiment runs smoothly but reaction times aren’t what I need.

I’ve found a number of posts on here that seem to describe the same problem, but they are old and the solutions don’t work when I try to implement them - sadly my coding skills aren’t enough to figure out why.

(this post looked promising but I can’t get the code to run: Recording vocal RT using the microphone routine in Builder)

I have also seen a number of posts that link to and discuss a “word_naming” demo experiment, but the link is always dead and I can find no such experiment on the PsychoPy GitHub.

Any help would be very much appreciated - thanks!



Hi @jonathan_silas,

This is an issue I am also trying to figure out. I don’t have a solution yet, but here are links to a few of the demos I have found that use the microphone:

In PsychoPy, you can save the demos to an easy-to-access folder by clicking Demos > unpack demos and then choosing a folder to save the demos.

In v2022.2.5 the demos have the following behaviour:

  • offline voice transcription works if you select built-in
  • microphone component onset/offset are recorded (the time when the microphone component starts and ends)
  • speaking start and stop times are not recorded in the csv even if the box is checked

I hope that helps a bit, I’ll keep digging around for a real solution.
Please update if you figure out anything as well!

Edit: also found this coder demo demos/coder/input/latencyFromTone.py (accessed June 22, 2023), which might record voice timing but is throwing errors in v2022.2.5

Thanks for the demos and advice.

If you find any more on using voice onset as an RT please do let me know.

Hi Jon,

The recent PsychoPy update 2023.2.1 added a plugin called psychopy-whisper for voice transcription using OpenAI’s Whisper tool (you can find it under Tools > plug-ins/package manager). The plug-in is promising, but it isn’t working quite right for me yet.

I haven’t found much documentation for the plug-in because it is quite new, so I am not sure what the recommended set-up and expected behaviours are.

My Current Test Set-up

  • PsychoPy Version: 2023.2.1
  • Install psychopy-whisper through plug-ins/package manager
  • Update typing-extensions with pip install typing-extensions --upgrade (there is a terminal in the plugin/package manager window that you can use for this step)
  • in the microphone component, the following options are set:
    • transcribe audio
      • select Whisper as transcription backend
    • save speaking start/stop times
  • Testing Details:
    • Test environment:
      • open area with some background noise (not your ideal testing environment)
      • USB microphone
      • audio-jack microphone
    • Test words:
      • short words (e.g., dog, cat, snake)
      • long words (e.g., Supercalifragilisticexpialidocious)
    • Test speech style:
      • speaking as quickly as possible
      • delaying speech until near the end of the trial

Observed Test Behaviour

  • Transcription: Transcription with psychopy-whisper appears to be working; the software does a reasonable job of identifying the spoken words.
  • Save Speaking Start Time: The column mic.speechStart is created, but the value is almost always 0.0, which does not seem correct. This could be because my microphone was picking up background noise.
  • Save Speaking Stop Time: The column mic.speechEnd is created and a time is recorded. I am not sure whether these times are relative to the start of the routine, the start of the mic component, or mic.speechStart; this will be important to figure out.

I am not confident that speechStart and speechEnd are recording times accurately. This could be an issue with the plug-in, but could also be due to the microphones I was using, or the background noise of my environment.

Example trials

mic.speechStart times    mic.speechEnd times
0.0                      0.38
0.0                      1.5
0.0                      2.12
0.0                      4.5
1.5800000000000005       3.96
2.2400000000000007       4.86
Hope this helps you, let me know if you end up making further progress.