Random missing columns in participants' data files, occasionally scrambled rows

OS (e.g. Win10): Win10
PsychoPy version (e.g. 1.84.x): 2022.2.5
Standard Standalone? (y/n) If not then what?: y

I have an issue that, in my 10 years of using PsychoPy, I have never encountered (and that, in fact, seems to violate everything I know about it). I ran a simple rating experiment with ~100 participants on 5 university computers. I tested it on every single computer about 10 times in a row, and every time the datafiles came out identical (350 columns, 350 rows, everything in its own place).

Now I have started analyzing the participants' datafiles and realized that NONE of them have 350 columns, even though the exact same experiment file was used every time. All of them are missing 1-7 columns (most of them in a particular exit survey, where it would randomly not record the start/end time of components, reaction times to components, or correct responses, even though the checkbox is checked in Builder). Even more weirdly, some of them (maybe 3-5%) have scrambled ROWS (I am attaching two screenshots, one with correct rows 106-109 and the other with scrambled junk in rows 107-108).

I rushed to campus, ran the experiment myself again on every computer, and got all the correct columns and rows. I am absolutely losing my mind about what could possibly be happening here. I am willing to share some datafiles and the experiment file if anyone is willing to take a look. I have not been able to find a similar problem reported by anyone. Any ideas?


How are you getting the data files? Might some of them be from online experiments and some of them from local? Could some of them have been opened and resaved? The messed up lines sound like they could be due to a particular character in the data and/or because they have been opened a different way (e.g. using a version of Excel set to European definitions of commas and points in numbers).

No, it was all local and nobody opened the datafiles before I got them (to the best of my knowledge anyway, very few people have access to lab computers and nobody would want to sabotage me that way).

In that case, please could you upload an original csv that comes out corrupted and I’ll take a look tomorrow.

For sure! test.csv is what the data is supposed to look like (when I myself do it on any computer I have access to). p4 and p20 are just two examples of scrambled files (but they all miss columns, except for, again, when I test it)

And thank you!
test_ME_2023-10-25_16h59.08.610.csv (272.9 KB)
p4_ME_2023-10-04_11h54.04.135.csv (271.5 KB)
p20_ME_2023-10-05_13h52.31.373.csv (270.0 KB)

I think there is a character in ans1.text causing a new line (possibly the return character being recorded and not removed)

p4 had Type here and press ENTER to submit: The world belched" seemed to be used as an alternative word for “kick out” or “expel”.

p20 had Type here and press ENTER to submit: This person used the word slid" because he tried to explain how fast the conversation ended.

I think p20 also put the same character into ans2.text

I think you need to “sanitise” the text responses (using text.replace) so they don’t start new lines in the data file.
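As a rough sketch of what I mean (not tested inside your experiment), something like the function below would strip line breaks and tabs from a typed response. The name `sanitise_text` is just something I made up for illustration; only `str.replace` and `str.strip` are doing the work.

```python
def sanitise_text(raw):
    """Replace line breaks and tabs in a typed response with single spaces."""
    cleaned = raw
    # Handle Windows (\r\n) first so it becomes one space, then lone \r, \n and tabs.
    for ch in ("\r\n", "\r", "\n", "\t"):
        cleaned = cleaned.replace(ch, " ")
    # Drop any leading/trailing whitespace left behind by the replacements.
    return cleaned.strip()


print(repr(sanitise_text("\nThe word belched\r\n")))
```

In Builder you could probably call this from a code component in End Routine, e.g. `thisExp.addData('ans1_clean', sanitise_text(ans1.text))`, where the column name `ans1_clean` is hypothetical. That would add a cleaned column next to the original one rather than replace it.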

You are definitely right and now I feel dumb! So weird it happened though because pressing “Enter” just submits the response, and also this scramble happened to only 4 out of ~100 participants. But yes, thanks so much for your keen eye!

My best guess is that the 4 participants had international keyboards and the return key got coded slightly differently in the data file. When I open the three files in my text editor I can’t see a difference between the returns that break the lines in Excel and the ones that don’t, though I did notice that the test one seemed to have a return before your typed response as well as after.
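Since a text editor can hide the difference between return characters, a small script might be a better way to check. This is only a sketch under my own assumptions: `count_line_endings` and `find_embedded_newlines` are made-up helper names, and the second one only finds line breaks that sit inside properly quoted CSV fields.

```python
import csv
import io


def count_line_endings(raw_bytes):
    """Count CRLF, lone CR and lone LF sequences in the raw file contents."""
    crlf = raw_bytes.count(b"\r\n")
    return {
        "CRLF": crlf,
        "lone CR": raw_bytes.count(b"\r") - crlf,
        "lone LF": raw_bytes.count(b"\n") - crlf,
    }


def find_embedded_newlines(csv_text):
    """Return (row_number, column_index, value) for fields containing a line break."""
    hits = []
    # newline="" keeps the original line endings so the csv module sees them as written.
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    for row_number, row in enumerate(reader, start=1):
        for column_index, value in enumerate(row):
            if "\n" in value or "\r" in value:
                hits.append((row_number, column_index, value))
    return hits


# Made-up sample data standing in for a data file with one multi-line response.
sample = 'participant,ans1.text\r\n4,"first line\nsecond line"\r\n20,ok\r\n'
print(count_line_endings(sample.encode("utf-8")))
print(find_embedded_newlines(sample))
```

To run it on a real file, read the bytes with `open(path, "rb").read()` for the first function and decode them (I am assuming UTF-8) for the second.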

I did run the entire experiment on five university computers, so the keyboard was the same for every participant. I guess it’s going to stay a mystery.