
Order of columns in ExperimentHandler

Hey there! We rely on the ExperimentHandler and TrialHandler classes for result data management in most of our experiments. Recently, we have been experiencing problems with the output files.

Depending on the computer on which some of our scripts are run, the order of columns in the output files changes. The column names and the number of columns are identical; only the order of the columns differs. That makes it impossible to, e.g., export data to an Excel file which relies on a specific order of columns.

My question: is it possible to have the columns in an output file ordered alphabetically or in any other predetermined manner?

Kind regards, Malte

Issue 1: By “depending on the computer”, do you actually mean “depending on the version of PsychoPy”?

Issue 2:

Solution 1: Coding your analysis rather than doing manual imports into an Excel file scales much better. For example, this R script would concatenate multiple .csv files into a single data frame, matching columns by name rather than position:

library(tidyverse) # for the %>% operator & all functions below except list.files():

# note: pattern is a regular expression, not a shell glob
dat = list.files(path = 'data/', pattern = '\\.csv$', full.names = TRUE) %>%
  map(read_csv) %>%
  reduce(bind_rows)
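
For anyone who would rather stay in Python, the same match-by-name idea can be sketched with only the standard library. This is a minimal illustration, not part of PsychoPy: the function name `concat_by_name` and the two inline CSV strings are made up for the example, and real code would read from files instead.

```python
import csv
import io

def concat_by_name(csv_texts):
    """Concatenate CSV tables, aligning columns by header name, not position."""
    rows, columns = [], []
    for text in csv_texts:
        reader = csv.DictReader(io.StringIO(text))
        # Collect header names in order of first appearance across all inputs.
        for name in reader.fieldnames or []:
            if name not in columns:
                columns.append(name)
        # Each row is a dict keyed by column name, so position no longer matters.
        rows.extend(reader)
    return columns, rows

# Two hypothetical exports with the same columns in a different order.
a = "rt,response\n0.51,left\n"
b = "response,rt\nright,0.62\n"
columns, rows = concat_by_name([a, b])
```

Because every row is keyed by name, `rows[1]["rt"]` gives the reaction time from the second file even though it sits in a different column there.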

Solution 2: You can monkey-patch any Python function. That is, your script can overwrite the relevant ExperimentHandler or TrialHandler export function with a version that you’ve modified to fit your requirements.
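
To show what monkey-patching looks like in general, here is a minimal sketch on a stand-in class. `Handler` and its `header()` method are hypothetical and only mimic the shape of the problem; they are not PsychoPy's actual classes or method names, so the real export function would need to be looked up in the installed version.

```python
class Handler:
    """Hypothetical stand-in for a data handler; not PsychoPy's class."""

    def __init__(self):
        self.columns = []

    def add(self, name):
        if name not in self.columns:
            self.columns.append(name)

    def header(self):
        # Original behaviour: columns in whatever order they were added.
        return list(self.columns)

# Keep a reference to the original method, then replace it on the class.
_original_header = Handler.header

def sorted_header(self):
    # Patched behaviour: same columns, always in alphabetical order.
    return sorted(_original_header(self))

Handler.header = sorted_header

h = Handler()
for name in ["rt", "block", "response"]:
    h.add(name)
patched = h.header()
```

The patch lives in your own script, so it has to be re-checked whenever the patched library is updated.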

Issue 1: By “depending on the computer”, do you actually mean “depending on the version of PsychoPy”?

It might be dependent on the PsychoPy version. However, since the ExperimentHandler class uses Python dictionaries to store information, the PsychoPy version should not be an issue.

Solution 1: Coding your analysis rather than doing manual imports into an Excel file scales much better.

You are completely correct that an integrated path of analysis would be preferable. However, I don’t think that is the issue here. There are myriad use cases where data is written to a csv/tsv file and then analyzed by third-party software which might rely on a specific order of columns, regardless of there being better options.

IMHO, it’s problematic for any software to produce different results with the same input data if no random process is deliberately involved.

Solution 2: You can monkey-patch any Python function.

Yes, although I’d always prefer a more thorough solution which does not involve local forks of an ever-updating environment. Which is why…

… there is a quick update. A new parameter “sortColumns” has been added to the master branch of PsychoPy. It’s a simple boolean flag which allows for alphabetic sorting of the output columns by header name.
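
For files that were already written without that flag, a similar result can be had after the fact. The following is only a sketch using the standard library: `sort_columns` is a name invented here, the inline CSV is made up, and plain `sorted()` is case-sensitive, which may or may not match how PsychoPy itself sorts.

```python
import csv
import io

def sort_columns(csv_text):
    """Return the same CSV data with columns ordered alphabetically by header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = sorted(reader.fieldnames or [])
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=header, lineterminator="\n")
    writer.writeheader()
    # Rows are dicts keyed by name, so the writer reorders the values for us.
    writer.writerows(reader)
    return out.getvalue()

result = sort_columns("rt,block,response\n0.51,1,left\n")
```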

Great, good to know.

But for future reference for your own code development:

Prior to Python v 3.6, standard dictionaries didn’t have a defined key ordering, so the order could never be relied upon from one dictionary to the next. In Python 3.6, that changed so that the keys are kept in order of insertion (an implementation detail of CPython 3.6, guaranteed by the language from 3.7 onwards). But even then, if the underlying PsychoPy code is altered so that the order of insertion changes, then necessarily so does the order of keys within a dictionary. And PsychoPy is still available in Python 2.7 where all bets are off, so versions (of PsychoPy and of Python itself) do really matter when it comes to dictionary key ordering.
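
A small illustration of that point, assuming Python 3.7 or later: two dictionaries holding identical data compare equal, yet iterate in different orders if their keys were inserted in a different sequence. The key names are made up for the example.

```python
first = {}
first["rt"] = 0.51
first["response"] = "left"

second = {}
second["response"] = "left"   # same keys, inserted in the opposite order
second["rt"] = 0.51

same_content = first == second             # equality ignores key order
same_order = list(first) == list(second)   # iteration follows insertion order
stable = sorted(first) == sorted(second)   # sorting removes the dependence
```

So a change in the order in which a library adds entries is enough to reorder the exported columns, even when the data are otherwise identical.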

Prior to Python v 3.6, standard dictionaries didn’t have a defined key ordering

True, I was fully aware of that fact. However, no “defined” key ordering does not necessarily mean “random” key ordering. Prior to 3.6, when adding the same keys in the same order to a Python dictionary, the key ordering would be deterministic. This is why we were very surprised to find key ordering change between experimental runs.

Not wanting to belabour the point, but the issue as reported isn’t between experimental runs, but between different installations. Presumably, the order is constant between runs within each installation. If those installations have different versions of PsychoPy, then yes, the underlying code might differ (due to bug fixes or new features, etc), with the effect that variables might be being written in a different order, or even that entirely new variables are added.

So our strong recommendation is to never change PsychoPy versions during data collection for a given study, precisely because ongoing development can introduce unexpected changes, in either performance or output. Even if those changes improve performance, the inconsistency is undesirable. In your case, it sounds like you might have two different installations running in parallel, which is equivalent to changing version in series.

Not sure if you’re aware of this (I’ve never done it myself, and perhaps this is the approach you’re already taking), but the .psydat files are basically freeze-dried representations of your experiment handler objects. So if you want consistent data output despite the differing versions used, (I think) you could iterate over all of the .psydat files, import each into an ExperimentHandler, and then call .saveAsWideText() on it. This should give you the same ordering for the columns regardless of the version the task was run under, because the exporting is being run consistently from just one version of the PsychoPy library. This should work whether or not you force the output to be alphabetical (which is probably not a desirable arrangement to work with practically).

Again, maybe you’re already pursuing this approach, but if not, there is an example of the technique in the csvFromPsydat.py file available under the Demos menu in the Coder view of the PsychoPy app. In essence, wrap this in a loop:

from psychopy.tools.filetools import fromFile

# read in the experiment session from the psydat file:
exp = fromFile('some_file.psydat')

# save it as csv:
exp.saveAsWideText('some_file.csv')

Hopefully that works for you?

EDIT: Wait a moment, maybe that is bollocks, as the exported key order might still depend on the pickled representations of each individual dictionary, which might vary across the different .psydat files. So I guess you might still need to force the alphabetical ordering.
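
The concern in the edit can be checked with plain dictionaries, assuming Python 3.7 or later. In this sketch the two dictionaries stand in for data stored in two different .psydat files (the keys and values are invented); each one comes back from pickling with its own original key order, so only an explicit sort yields a header that is the same for every file.

```python
import pickle

# Hypothetical per-run data, built with different insertion orders.
run_a = {"rt": 0.51, "response": "left"}
run_b = {"response": "right", "rt": 0.62, "block": 2}

# Round-trip each dict through pickle, as saving and reloading a file would.
restored = [pickle.loads(pickle.dumps(d)) for d in (run_a, run_b)]

# Each restored dict keeps the order it was pickled with.
orders = [list(d) for d in restored]

# A shared, order-independent header: the sorted union of all keys.
header = sorted(set().union(*restored))
```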