
Trouble formatting data for SPSS


I’m in builder mode. My experiment is within-subjects. The design is that each participant hears five stimuli and makes a rating at four different time-points within each stimulus, for a total of 20 ratings recorded from each participant.

Screenshot 1 shows my Conditions file given as input to the loop. The songs are the stimuli; tp1-tp4 represent the four different time-points at which the four rating scales pop up inside each stimulus, where the participant has 15 seconds to make a rating of that part of the song as the song plays. (tp1Type-tp4Type represent the categories of each of the time-points, which will be important later in the analysis in SPSS, but don’t affect PsychoPy.)

Unless I’m mistaken, SPSS prefers the data in the format of one participant per row.
However, currently PsychoPy is outputting the data with one condition (song) per row. For example, when Joe Participant takes the experiment, a single .csv is generated that has one row for each song; the four ratings made for each song are presented within the row for that song.

What I want instead is all 20 of Joe’s ratings in a single row, with the column headings being Song1Rating1, Song1Rating2, Song1Rating3, Song1Rating4, Song2Rating1, Song2Rating2…Song5Rating4. Does this make sense?

Am I going about this the wrong way?

Also, is there any way to get PsychoPy to output a cumulative .csv rather than exporting one for each Participant, so it can be readily copy-pasted into SPSS once data collection is complete?



Welcome to data analysis. PsychoPy gives you the data in the most granular form (usually one row per trial). It is up to you to get it into the form required for a given analysis, but PsychoPy is providing it in a useful ‘lowest common denominator’ form. Different software, and different analyses within the same software, expect different arrangements of the same data. PsychoPy is outputting a format that can be wrangled into whatever shape you need, but that wrangling is really up to you.

Personally I recommend the dplyr and tidyr libraries within the R statistical environment for doing this, simply, easily and powerfully. But if you aren’t keen on a coding approach, pivot tables in Excel are actually also remarkably effective for reshaping data in a graphical way.

I don’t know if SPSS itself provides any mechanisms for data reshaping; I’ll leave that to others to comment on. But you are not in a unique position in needing to either re-shape or summarise data to suit it.

Hello, thanks for your advice. I haven’t yet learned R. Is there a way to pre-emptively get PsychoPy to output data with one participant per line, without using an external program like R or Excel?

Certainly. PsychoPy is built on Python, a full programming language that can be used to re-shape data in any way needed (although perhaps not as simply as dplyr/tidyr). But that may not interest you if you are reluctant to even use Excel…

Thanks. If I want to have the PsychoPy software itself output the experimental data with one participant per line, are there specific suggestions for how to do this? Is there a feature inside the builder? The coder? Any tutorials recommended? I’ve looked online but just wondering if there’s a particular built-in function in PsychoPy that can make it output one line per participant. Given that this is a common situation researchers encounter in SPSS (from what I understand), might there be an easy way to do this from within Psychopy, rather than applying it to the data files after the fact in R, Excel, or an external Python script?


I think we’re talking at cross-purposes. Yes, it is common to need to re-shape data from long to wide format, but the process of doing that depends on the individual design structure. That would be difficult to automate in a function you haven’t written yourself, as knowledge of your particular data structure is required (in that you have four dependent variables per trial, of five trial types (songs)).
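To make the design dependence concrete, here is a minimal sketch in Python with pandas of the long-to-wide step for one participant. The column names `participant`, `song` and `rating1` to `rating4` are assumptions for illustration, not names taken from your actual output file, and the numbers are made up so that each value encodes its own song and rating index.

```python
import pandas as pd

# Hypothetical column names; check the header row of your own .csv.
RATING_COLS = ["rating1", "rating2", "rating3", "rating4"]

def one_row_per_participant(long_df):
    """Collapse a one-row-per-song table into a single wide row."""
    # Sort by stimulus so Song1 is the same song for everyone,
    # even if the loop presented the songs in random order.
    long_df = long_df.sort_values("song").reset_index(drop=True)
    wide = {"participant": long_df["participant"].iloc[0]}
    for song_idx, (_, row) in enumerate(long_df.iterrows(), start=1):
        for rating_idx, col in enumerate(RATING_COLS, start=1):
            wide[f"Song{song_idx}Rating{rating_idx}"] = row[col]
    return pd.DataFrame([wide])

# Made-up data standing in for one participant's output file.
demo = pd.DataFrame({
    "participant": ["joe"] * 5,
    "song": ["song3.wav", "song1.wav", "song5.wav", "song2.wav", "song4.wav"],
    "rating1": [31, 11, 51, 21, 41],
    "rating2": [32, 12, 52, 22, 42],
    "rating3": [33, 13, 53, 23, 43],
    "rating4": [34, 14, 54, 24, 44],
})
wide = one_row_per_participant(demo)
print(list(wide.columns))
```

Note that the function has to know which columns hold ratings and which column identifies the stimulus. That is exactly the design-specific knowledge a generic built-in function would lack.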

If the code was implemented in Python, then yes, it could run at the end of the experiment from within PsychoPy. But that function would need to be specifically crafted for your data. Whatever way you approach it, it will take work from you, as data analysis usually does, and can’t be automagical.
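As a rough sketch of what such a crafted script might look like, the following reads every per-participant .csv in a folder and writes one cumulative file with one participant per row. It assumes each file contains only the five song rows, with hypothetical columns `participant`, `song` and `rating1` to `rating4`; the function names and the output file name `all_participants.csv` are also made up for illustration. The demo at the bottom writes two fake participants to a temporary folder so the example can be run without real data.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical column names; adjust to match your own data files.
RATING_COLS = ["rating1", "rating2", "rating3", "rating4"]

def widen(long_df):
    """Return one participant's ratings as a single dict (one wide row)."""
    long_df = long_df.sort_values("song")  # fixed song order across participants
    values = long_df[RATING_COLS].to_numpy().ravel()  # song by song, rating by rating
    names = [f"Song{s}Rating{r}"
             for s in range(1, len(long_df) + 1)
             for r in range(1, len(RATING_COLS) + 1)]
    return {"participant": long_df["participant"].iloc[0], **dict(zip(names, values))}

def build_cumulative(data_dir, out_name="all_participants.csv"):
    """Combine every per-participant .csv in data_dir into one wide file."""
    data_dir = Path(data_dir)
    out_path = data_dir / out_name
    files = [p for p in sorted(data_dir.glob("*.csv")) if p != out_path]
    combined = pd.DataFrame([widen(pd.read_csv(p)) for p in files])
    combined.to_csv(out_path, index=False)
    return combined

# Demo with two made-up participants in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    for name, offset in [("joe", 0), ("ann", 100)]:
        fake = pd.DataFrame({
            "participant": name,
            "song": [f"song{i}.wav" for i in range(1, 6)],
            **{col: [offset + 10 * s + r for s in range(1, 6)]
               for r, col in enumerate(RATING_COLS, start=1)},
        })
        fake.to_csv(Path(tmp) / f"{name}.csv", index=False)
    combined = build_cumulative(tmp)

print(combined.shape)
```

Running something like this as a separate script once data collection is complete, pointed at your data folder, would give a single file to open in SPSS. Real PsychoPy output often contains extra columns and rows, so expect to filter those out first.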

Maybe you should look at it from the SPSS end if you are more comfortable with that: