Alternative to pandas for reading csv/excel files online

aisa2 · May 29, 2020, 5:00pm

Hello PsychoPy community,

I have an experiment which uses the pandas package to import an Excel file with lots of stimuli and associated distracter words. During the experiment I’m also using pandas-specific methods like .loc[] and .sample() to shuffle and check my trials/words/conditions/etc. Since we can’t import extra packages for our online experiments, what alternatives would you recommend for these functions?

wakecarter · May 30, 2020, 6:08am

What do those functions do? I’ve not used pandas.

I often read variables into arrays from Excel in one (or more) loops which precede the trials loop so I can easily do things like interleaving (prospective memory) and repetition (n-back).

Best wishes,

Wakefield

aisa2 · May 31, 2020, 4:35am

@wakecarter pandas imports an Excel file into a dataframe that can hold text as well as numbers. Each column becomes a variable named after its header, which you can access by name, similar to R.
For example, by running:

wordList = pd.read_excel(<filename>, sheet_name=<sheetname>)

you could now access the second column of that sheet by its name (wordList.column2label for example), and the rows could be accessed by their index using loc.

Since I have a free recall task, I need to look up words that participants typed, match them with a word in my original dataframe, and then present some other trials based on which words they recalled. Thus, I can’t predict ahead of time what the trials will be, and there are some additional complexities like accounting for homophones. Plus, these are all strings and not numbers, so simple arrays won’t work well… pandas has been useful while coding offline, but I’m looking for something with similar functionalities that I can use in Pavlovia.

wakecarter · May 31, 2020, 6:28am

Read the information into a multi-dimensional array.

code_JS

Array.prototype.append = [].push;

Begin Experiment

wordList=[]

Begin Routine (within a loop pointing at the Excel file)

wordList.append([Item,Valence,Answer])

You could then loop through this list to add the items they type to a second one.
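
A minimal Python sketch of that idea, assuming three columns named Item, Valence and Answer as in the snippet above. The example rows and the `find_row` helper are hypothetical and not part of PsychoPy:

```python
# Rows as they might accumulate from the loop over the Excel file.
# The values are made up for illustration.
word_list = [
    ["apple", "positive", "fruit"],
    ["storm", "negative", "weather"],
    ["chair", "neutral", "furniture"],
]

def find_row(rows, typed):
    """Return the first row whose Item matches the typed word, else None."""
    cleaned = typed.strip().lower()
    for row in rows:
        if row[0].lower() == cleaned:
            return row
    return None

# Collect the rows for the words a participant recalled.
recalled = []
for typed_word in ["Storm", " apple ", "table"]:
    match = find_row(word_list, typed_word)
    if match is not None:
        recalled.append(match)
```

Because every row is itself a list, strings and numbers can sit side by side, and the second list (`recalled`) keeps the full row for each matched word.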

aisa2 · May 31, 2020, 2:24pm

Thanks, I will look more into multidimensional arrays, this seems to be what I need.

Cheers!

wakecarter · May 31, 2020, 2:52pm

Refer to elements as wordList[0][0]

Shuffle works based on the first index
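
As a rough local illustration (plain Python with the standard `random` module rather than PsychoPy's own shuffle, and with invented rows), shuffling a list of rows reorders whole rows and keeps each row's columns together:

```python
import random

word_list = [
    ["apple", "positive", "fruit"],
    ["storm", "negative", "weather"],
    ["chair", "neutral", "furniture"],
]

# Keep a copy so the shuffled result can be compared with the original.
original_rows = [list(row) for row in word_list]

random.seed(0)             # fixed seed only so the example is repeatable
random.shuffle(word_list)  # reorders the rows in place

first_item = word_list[0][0]     # Item column of the first row after shuffling
first_valence = word_list[0][1]  # Valence column of that same row
```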

aisa2 · June 1, 2020, 4:28pm

@wakecarter does it work to simply use open() on a csv file?
I keep running into an error trying to import that file:

readWords = open('distracters_EN_NH.csv', encoding = 'utf-8')
headers = readWords.readline()
print(headers)

UnicodeEncodeError: 'ascii' codec can't encode character '\ufeff' in position 0: ordinal not in range(128)
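
The `\ufeff` in that message is a byte order mark (BOM) at the start of the file, which the `utf-8` codec keeps as part of the first line. A minimal local sketch, assuming the file was saved as UTF-8 with a BOM: Python's `utf-8-sig` codec strips the mark on read. The temporary file and its contents below are made up so the example is self-contained, and this applies to local Python only, not to the online (JavaScript) version.

```python
import os
import tempfile

# Create a small CSV that starts with a UTF-8 byte order mark (BOM),
# standing in for a file saved with a BOM.
path = os.path.join(tempfile.mkdtemp(), "example_words.csv")
with open(path, "wb") as f:
    f.write(b"\xef\xbb\xbfword,distracter\ncat,cot\n")

# Plain 'utf-8' keeps the BOM as the first character of the header line.
with open(path, encoding="utf-8") as f:
    raw_header = f.readline()

# 'utf-8-sig' drops the BOM if it is present.
with open(path, encoding="utf-8-sig") as f:
    clean_header = f.readline()
```

This only removes the character the error complains about; printing other non-ASCII text to an ASCII-only console could still fail.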

wakecarter · June 1, 2020, 4:44pm

Sorry, that’s not a technique I’ve tried online.

My Python programming skill has been learnt through PsychoPy Builder code components and my other languages are MATLAB (rusty), PHP/MySQL and originally Commodore BASIC… I therefore let Builder handle files and loops since my only experiments that open or save custom files won’t work online.

aisa2 · June 1, 2020, 5:02pm

Okay, thanks anyway!

aisa2 · June 1, 2020, 8:40pm

After a lot of sleuthing, I found something nifty that doesn’t use any extra packages. I haven’t yet ported everything online to test it, but I thought I would put this here in case anyone else is curious.

The method uses data.importConditions, which imports the csv as a list of dictionaries (one for each row in the csv file), with keys that correspond to the column headers at the top of the csv. Then, all we have to do is a little loop to meld all of these dictionaries together, and voilà, the dictionary is ready to use in the script.

# initialize the big dictionary...
expVars = {}

# import csv file
csvData = data.importConditions('my_file.csv')

# meld all dictionaries together and store in the big one
for k in csvData[0]:
    expVars[k] = [d[k] for d in csvData]

You can now access the dictionary by name and list index; e.g.

expVars['var1'][33]
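
To see what the melding step produces without PsychoPy installed, here is a small stand-alone sketch. The `csv_data` list is a hand-written stand-in for the list of row dictionaries described above, with invented column names and values:

```python
# Stand-in for the list of per-row dictionaries (one dict per CSV row).
csv_data = [
    {"word": "cat", "distracter": "cot"},
    {"word": "dog", "distracter": "dig"},
    {"word": "sun", "distracter": "son"},
]

# Turn the list of row dicts into one dict of column lists.
exp_vars = {}
for key in csv_data[0]:
    exp_vars[key] = [row[key] for row in csv_data]

# Look up a row position by value, then read another column at that index.
idx = exp_vars["word"].index("dog")
paired = exp_vars["distracter"][idx]
```

The last two lines show a lookup of the kind pandas would handle with a row filter: find where a value sits in one column, then read the same position in another.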

aisa2 · June 5, 2020, 10:50pm

@wakecarter I just got online with this… unfortunately, it ends up throwing a
ReferenceError: "data is not defined"
Would I need to use psychopy.data instead? Or will it just not work?

wakecarter · June 6, 2020, 6:13am

Try just using importConditions first.

Does that work locally?

aisa2 · June 6, 2020, 2:46pm

@wakecarter thanks for taking the time; importConditions does not work locally unfortunately.

aisa2 · June 8, 2020, 2:24am

Update: I’ve used a short Python-only code component at the beginning of the experiment to make importConditions work:

importConditions = data.importConditions

Now, I can use importConditions in my code. This line no longer produces an error online, but it does throw an error immediately afterwards when I try to access the dictionary:

TypeError: _pj_a is undefined

Here is the portion of the js file this error is coming from:

for (var k, _pj_c = 0, _pj_a = distListData[0], _pj_b = _pj_a.length; (_pj_c < _pj_b); _pj_c += 1) {
    k = _pj_a[_pj_c];
    distList[k] = function () {
        var _pj_d = [], _pj_e = distListData;
        for (var _pj_f = 0, _pj_g = _pj_e.length; (_pj_f < _pj_g); _pj_f += 1) {
            var d = _pj_e[_pj_f];
            _pj_d.push(d[k]);
        }
        return _pj_d;
    }
    .call(this);
}

I believe that’s just the JavaScript version of:

# meld all dictionaries together and store in the big one
for k in csvData[0]:
    expVars[k] = [d[k] for d in csvData]

LukasPsy · June 11, 2020, 9:55am

Is the post marked as the solution for this thread really the solution? Two replies after the accepted solution, @aisa2 wrote that it does not work in the way that was initially proposed.

@aisa2, how did you eventually solve the issue?

aisa2 · June 11, 2020, 2:13pm

Hi LukasPsy,

You’re right to ask, I’ve unmarked it for now. The “sequel” to the story is here.
I wanted it in a different thread because I’m still not sure if it’s just me who is having problems online, or if it’s a more general problem. But it’s probably good not to have anything marked here.

In that new post, I think the problem is iterating over dictionaries and not the importConditions function itself (for what it’s worth).
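
If iterating over the dictionary is indeed the sticking point, one possible workaround is to loop over an explicit list of column names by index instead. This is only a sketch under that assumption and has not been tested on Pavlovia; `column_names` and the sample rows are hypothetical:

```python
# Stand-in for the imported rows (one dict per CSV row).
dist_list_data = [
    {"word": "cat", "distracter": "cot"},
    {"word": "dog", "distracter": "dig"},
]

# Column headers written out by hand, so no dictionary is iterated.
column_names = ["word", "distracter"]

dist_list = {}
for i in range(len(column_names)):
    name = column_names[i]
    column = []
    for j in range(len(dist_list_data)):
        column.append(dist_list_data[j][name])
    dist_list[name] = column
```

The trade-off is that the column names have to be kept in sync with the csv headers by hand.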

aisa2 · June 12, 2020, 4:39pm

Okay, the answer has been found on this thread:
