UTF-8 encoded .csv being read as ISO-8859-1

URL of experiment: https://pavlovia.org/naomilee/beam-number-abb

Description of the problem: My experiment has some text components that rely on a UTF-8 encoded .csv file. When I pilot the experiment on Pavlovia, the text is being read as ISO-8859-1. Most notably, the single right apostrophe ’ is showing up as â€ (the ™ symbol that usually shows up with ’ with this kind of encoding issue isn’t appearing).

Should I change my .csv file encoding, or is there a setting I can change on Pavlovia to fix this issue?

Ah, yes, this did crop up: https://github.com/psychopy/psychopy/issues/2299. I think the issue is that the datafile reader we’re using (the JavaScript XLSX library) uses whatever default encoding it finds (probably set by the browser, which gets it from the operating system).
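
For anyone wanting to see the symptom outside the browser, here is a minimal Python sketch of what probably happens to the apostrophe. It assumes the fallback is Windows-1252 or ISO-8859-1 (Latin-1); which one the browser actually picks is not confirmed in this thread.

```python
# The right single quotation mark (U+2019) as it is stored in a UTF-8 file.
apostrophe = "\u2019"
raw = apostrophe.encode("utf-8")  # three bytes: E2 80 99

# Assumption: the reader falls back to a single-byte encoding.
# Windows-1252 maps all three bytes to visible characters.
as_cp1252 = raw.decode("cp1252")

# Strict ISO-8859-1 maps the last two bytes to invisible C1 control
# characters, so only the first character is visible.
as_latin1 = raw.decode("latin-1")

print(repr(as_cp1252))
print(repr(as_latin1))
```

Either way, one character in the file turns into three after decoding, which is why the text looks garbled rather than simply missing.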

I believe we can change that to use UTF-8 by default rather than the browser’s default. Two workaround options until we make that change are:

  • encode as utf-8-bom (adding a Byte Order Mark at the start of the file to indicate encoding) instead of plain utf-8 (which does nothing to indicate its format)
  • save as an xlsx file instead (which certainly always labels its encoding)
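
The first workaround can also be scripted. This is a rough sketch in Python, assuming the existing file is already valid UTF-8; the function name `add_utf8_bom` and the file names are made up for illustration and are not part of PsychoPy.

```python
import codecs


def add_utf8_bom(src_path, dst_path):
    """Copy a UTF-8 file, prepending a byte order mark if it lacks one."""
    with open(src_path, "rb") as src:
        data = src.read()

    # Fail early (UnicodeDecodeError) if the input is not valid UTF-8.
    data.decode("utf-8")

    # Work on raw bytes so line endings and content are left untouched.
    if not data.startswith(codecs.BOM_UTF8):
        data = codecs.BOM_UTF8 + data

    with open(dst_path, "wb") as dst:
        dst.write(data)


# Hypothetical usage:
# add_utf8_bom("conditions.csv", "conditions_bom.csv")
```

Running it on a file that already has a BOM leaves the content unchanged, so it should be safe to apply more than once.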

Perfect! Thank you so much for the prompt and helpful reply.

I reencoded my file as utf-8 with BOM and the issue went away.
(For those using Mac Numbers and frustrated with its inability to export with a byte order mark, I ended up just reopening the exported .csv file in Sublime Text and using Save with Encoding > UTF-8 with BOM.)
