Reading in Excel files

Hi everyone,

I’m supposed to create an experiment for a group of students. I need to set it up and hand them the *.psyepx file so they can use it.
I need to read in (and later write to) a spreadsheet which contains several lists ( of strings and numerics). At first I thought *.csv would do the trick, but it turned out that MS Excel uses a different character encoding for csv than Open Office (non english country). Furthermore Excel uses a different delimiter when writing csv.
And since I don’t know which program for editing the spreadsheet will be used in the future, this behavior introduces problems I’m not able to solve. So I decided to use *.xlsx instead. Since I’m not fluent in Python (and programming in general) myself, I do need to ask, though: Is there a “gold standard” for reading and writing xlsx files?

I tried pandas - which works great. I do wonder, though: is using pandas “overkill” for just reading in an xlsx file?

import pandas as pd

# ...

setupFilename = "setup.xlsx"

try:
    pdSetupFile = pd.read_excel(setupFilename)

    listStimulus_1 = pdSetupFile['stimulus_1'].tolist()
    listStimulus_2 = pdSetupFile['stimulus_2'].tolist()
    listAffectPosition = pdSetupFile['affekt_position'].tolist()
    listDotPosition = pdSetupFile['dot_position'].tolist()
    listTimes = pdSetupFile['zeit'].tolist()
    keyPos_1 = pdSetupFile['taste_position_1'].tolist()[0]
    keyPos_2 = pdSetupFile['taste_position_2'].tolist()[0]
except IOError:
    msg_error = "Fehler beim Öffnen der Datei '{}': Datei möglicherweise falsch benannt oder nicht im gleichen Verzeichnis?".format(setupFilename)
    logging.log(level=logging.ERROR, msg=msg_error)
    quit()
except KeyError as e:
    error = str(e)
    msg_error = "Fehler beim lesen der Datei '{}': Es konnte keine Spalte mit dem Titel '{}' gefunden werden.".format(file_name_in, error)
    logging.log(level=logging.ERROR, msg=msg_error)
    quit()

Furthermore: Can I assume that this code will work on anyone’s machine that has PsychoPy installed or might there be a problem with missing imports (I dont know how imports work in python)?

PsychoPy bundles the third-party pandas library, so yes, anyone running the PsychoPy standalone distribution will be able to import pandas.

You might be better off using PsychoPy’s built-in functions and classes to do this. e.g. It seems like you are using the Builder interface, as you are passing on a .psyexp file. So you could just connect a loop in the flow panel to your Excel file, which will automatically read in all of the columns and make the values in each row available on each trial. At the end of the experiment, Builder will save a .csv data file which includes all of the columns in that conditions file, as well as all of the variables (e.g. responses, reaction times, etc) that are measured during the experiment.

If needed, I guess you could also just include a function at the end to save that file in .xlsx format if required.

So I think at this stage you might be doing a bit more work than needed… let PsychoPy do the heavy lifting for you. This applies whether you are coding the experiment from scratch (in which case, you should use the TrialHandler or ExperimentHandler classes to take care of reading in conditions files, saving data, and iterating over trials: psychopy.data - functions for storing/saving/analysing data — PsychoPy v2023.2.3

But even for experienced programmers, we tend to advise still using the Builder interface where possible. It will produce “best practice” code that will avoid a lot of timing issues that might not be obvious otherwise. The advantage of going this route is that Builder will take care of the drudgery, while you still have the freedom to insert “code components” ( a GUI dialog containing txt fields for code snippets that will be injected into the generated Python script to run at various points within the experiment). These code component s give you the best of both worlds: most of the experiment will be written for you, but you still get to inject just tiny snippets of code that add extra functionality (e.g. in your case, saving the final data format in .xlsx as well as .csv).

Thanks for your great reply!
Yes, you are right: I’m using PsychoPy’s Builder with code components. It is pretty convenient.
Regarding the xlsx and loop: The guys wanting me to build this experiment need(?) it to be as easy to use as possible while still being highly “flexible”. This includes a) minimizing “contact” with PsychoPy’s Builder interface and b) being able to change key bindings, etc “on the fly”.
So the xlsx is an frankenstein-ish bridge between actual conditions and some kind of *.ini file, if you will. I am not a entirely sure, but I think it is not possible to use the built-in classes for this kind of initial setup. I might be wrong, though. I will look into it.

Furthermore I’m supposed to create a custom output file, which is limited to just a few variables. Using builder, I could not find a way to have such level of control over the standard output.
That is why I’m reading and writing files by hand. If there is a more streamlined way of doing it I’d be glad, though.

OK, sounds like you are on the right track.

Do make sure the full Builder-generated data files are kept though - your users might kick themselves if/when they realise at the analysis/peer review stage that they need more variables than they thought they did in the customised output…

Good Point!

There’s a follow up question, though:
As mentioned, I read in the user xlsx, remove any of the setup variables (key bindings, etc), save it as a temporary file “conditions.xlsx”, which I then feed to a loop in PsychoPy’s Builder.

However, when entering “Null” as a string into the xlsx, PsychoPy does not recognize it as a string. It thinks “NULL” is an empty cell in the xlsx. Is there a way around that?
The only thing I can think of is looping through all the entries, checking if it equals “Null” and append a space or something. But that sounds like a silly idea.