Precise Chinese Characters Arrangement

andrewsilva19 · May 15, 2020, 4:10pm

I need to present 20 Chinese characters in a precise grid. This works pretty well with TextStim if the characters come from the English alphabet, but I can’t make anything work for Chinese characters. Please see below for my code:

from psychopy import visual, core
import pandas, random, numpy, itertools


characterList = pandas.read_csv('letters.csv', header = None)
characterList = characterList.values.flatten()

win = visual.Window( units = 'pix')


pixelsWidthRadius = 50

numHorizontalChoices = 5
numVerticalChoices = 4
xLocations = numpy.linspace(-pixelsWidthRadius,pixelsWidthRadius,numHorizontalChoices)
yLocations = numpy.linspace(-pixelsWidthRadius,pixelsWidthRadius, numVerticalChoices)
choiceLocations = list(itertools.product(xLocations, yLocations))


for i in range(10):
    
    random.shuffle(characterList)
    allStimuli = characterList[:20]

    # Create list of 20 individual random characters
    choiceList = [None]*len(allStimuli)
    for index in range(len(allStimuli)):

        choiceList[index] = visual.TextStim(win, text=characterList[index],
                              height = 20,
                              alignText = 'center',
                              anchorHoriz = 'center',
                              anchorVert = 'center',
                              pos = choiceLocations[index],
                              font = 'SimHei')
        
    # display all 20 characters    
    for character in choiceList:
        character.draw()
    win.flip()
    core.wait(1)

Screenshot:

But the problem occurs when, instead of the CSV with English characters, I use a CSV with Chinese characters. If I change nothing else in the code, the locations of the Chinese characters are very imprecise. This is a screenshot when I run the exact same code above, except I load chineseWords.csv instead of letters.csv:

The spaces in between characters are more variable, and some even overlap.

Additional details:

Changing the font doesn’t appear to do anything with Chinese characters. I’ve tried Arial, SimHei, Songti SC, and Consolas.
When I open the chineseWords.csv file directly in Notepad or WordPad, all of the characters line up beautifully in a column.
I’ve also tried replacing the TextStim objects with TextBox objects. Again it works for English text, but it does not work at all with my chineseWords.csv, as the glyphs are simply invisible. The TextBox code is:

        choiceList[index] = visual.TextBox(win,
                      text=allStimuli[index],
                      font_size=21,
                      font_color= [-1,-1,-1], 
                      textgrid_shape=[1, 1],
                      pos=choiceLocations[index], 
                      grid_horz_justification='center',
                      units='pix',
                      grid_color=[1,-1,-1,0.5], 
                      grid_stroke_width=1)

And when I try to pass a font_name in the above code, regardless of language, I get the following error:

Traceback (most recent call last):
  File "C:\Users\andre\Desktop\vision span\VisionSpanTextbox.py", line 39, in <module>
    font_name = 'Arial')
  File "C:\Program Files\PsychoPy3\lib\site-packages\psychopy\contrib\lazy_import.py", line 120, in __call__
    return obj(*args, **kwargs)
  File "C:\Program Files\PsychoPy3\lib\site-packages\psychopy\visual\textbox\__init__.py", line 384, in __init__
    self._font_name, self._font_size, self._bold, self._italic, self._dpi)
  File "C:\Program Files\PsychoPy3\lib\site-packages\psychopy\visual\textbox\fontmanager.py", line 212, in getGLFont
    if len(font_infos) == 0:
TypeError: object of type 'NoneType' has no len()

Ultimately, I need the Chinese characters in a precise grid, where each position can be precisely known and controlled. I’ve been working at this for a while, and though everything is straightforward in English, I haven’t found any solution for the Chinese characters. Thank you!
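Editorial sketch of the grid layout itself, independent of any rendering problem. In the posted code, numpy.linspace over the same ±50 px range gives 25 px between the 5 columns but about 33.3 px between the 4 rows. The helper below, `grid_positions`, is a hypothetical name (not part of PsychoPy) that takes one explicit spacing value so each cell centre is known in advance. It only makes the intended positions explicit and does not by itself change how glyphs are drawn.

```python
def grid_positions(n_cols, n_rows, spacing):
    """Return (x, y) centres of an n_cols x n_rows grid centred on (0, 0).

    Positions are listed row by row, starting at the top-left cell.
    `spacing` is the distance in pixels between neighbouring centres.
    """
    x_offset = (n_cols - 1) * spacing / 2
    y_offset = (n_rows - 1) * spacing / 2
    return [
        (col * spacing - x_offset, y_offset - row * spacing)
        for row in range(n_rows)
        for col in range(n_cols)
    ]


# 5 columns x 4 rows, 30 px apart (the spacing value is an assumption)
positions = grid_positions(n_cols=5, n_rows=4, spacing=30)
print(len(positions), positions[0], positions[-1])
```

Each tuple could be passed as the pos argument of a text stimulus in a window that uses 'pix' units.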

chineseWords.csv (13.1 KB)

letters.csv (81 Bytes)

Michael · May 16, 2020, 4:11am

When we see these issues with Chinese characters, changing the font is usually the solution. It’s not clear to me whether you’ve actually done this, as lower down, you mention that errors occur when you try to specify a font?

Regardless, have you tried using a monospaced font in particular (I guess Consolas is)? That could mean that you could even do this in a single text stimulus, depending on whether or not you actually need to control the precise coordinates of each letter as opposed to their indexed location in a grid.
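Editorial sketch of the single-stimulus idea mentioned above: the characters are joined into one multi-line string, so a character's place is its row and column index rather than a pixel coordinate. `grid_text` and the sample characters are hypothetical, chosen only for illustration. Whether the columns actually line up on screen depends on the font being truly fixed-width for these glyphs, which this sketch cannot guarantee.

```python
def grid_text(characters, n_cols, sep=" "):
    """Lay characters out row by row as a single multi-line string."""
    rows = [characters[i:i + n_cols] for i in range(0, len(characters), n_cols)]
    return "\n".join(sep.join(row) for row in rows)


# 20 placeholder characters, for illustration only
sample = list("天地人日月山川水火木金土大小上下左右中心")

block = grid_text(sample, n_cols=5)
print(block)
```

The resulting string could then be given as the text of one text stimulus instead of creating 20 separate ones.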

andrewsilva19 · May 16, 2020, 4:50am

Unfortunately, I do need precise control over the actual characters.

Yes, I have tried different monospace fonts with TextStim as well. I have noticed that the trend seems to be that if I use the correct font, then it should work correctly. However, I haven’t had any luck yet, even when changing to monospace fonts.

For example, this exact code:

from psychopy import visual, core
import pandas, random, numpy, itertools


characterList = pandas.read_csv('chineseWords.csv', header = None)
characterList = characterList.values.flatten()

win = visual.Window( units = 'pix')


pixelsWidthRadius = 50

numHorizontalChoices = 5
numVerticalChoices = 4
xLocations = numpy.linspace(-pixelsWidthRadius,pixelsWidthRadius,numHorizontalChoices)
yLocations = numpy.linspace(-pixelsWidthRadius,pixelsWidthRadius, numVerticalChoices)
choiceLocations = list(itertools.product(xLocations, yLocations))


for i in range(10):
    
    random.shuffle(characterList)
    allStimuli = characterList[:20]
    
    ## Create list of 20 individual characters
    choiceList = [None]*len(allStimuli)
    for index in range(len(allStimuli)):

        choiceList[index] = visual.TextStim(win, text=characterList[index],
                              height = 20,
                              alignText = 'center',
                              anchorHoriz = 'center',
                              anchorVert = 'center',
                              pos = choiceLocations[index],
                              font = 'Consolas')
        
            
    for character in choiceList:
        character.draw()
    win.flip()
    core.wait(1)

Yields this for me:

I’ve also tried replacing “Consolas” with various purportedly monospace fonts listed on this page that should support Chinese characters, and have gotten the same issues.

The other text option, using TextBox, isn’t working for me either, as specifying a font_name crashes the program. The following code (using either English or Chinese text):

from psychopy import visual, core
import pandas, random, numpy, itertools

characterList = pandas.read_csv('letters.csv', header = None)
characterList = characterList.values.flatten()
character = characterList[10]

win = visual.Window( units = 'pix')

text = visual.TextBox(win, text=character, font_size=21, font_color= [-1,-1,-1],  textgrid_shape=[1, 1], 
              grid_horz_justification='center', units='pix', grid_color=[1,-1,-1,0.5], grid_stroke_width=1,
              font_name = 'Consolas')

text.draw()
win.flip()
core.wait(1)

Results in the following error:

Traceback (most recent call last):
  File "C:\Users\andre\Desktop\vision span\VisionSpanTextboxSimple.py", line 19, in <module>
    font_name = 'Consolas')
  File "C:\Program Files\PsychoPy3\lib\site-packages\psychopy\contrib\lazy_import.py", line 120, in __call__
    return obj(*args, **kwargs)
  File "C:\Program Files\PsychoPy3\lib\site-packages\psychopy\visual\textbox\__init__.py", line 384, in __init__
    self._font_name, self._font_size, self._bold, self._italic, self._dpi)
  File "C:\Program Files\PsychoPy3\lib\site-packages\psychopy\visual\textbox\fontmanager.py", line 212, in getGLFont
    if len(font_infos) == 0:
TypeError: object of type 'NoneType' has no len()

It does work fine with English text if font_name is not specified (with Chinese text, the glyphs are invisible but the bounding box still shows up and no errors or warnings appear).

letters.csv (81 Bytes) chineseWords.csv (13.1 KB)

wakecarter · May 16, 2020, 7:00am

You could use a separate text component for each character.

Michael · May 18, 2020, 7:29am

At the moment, we rely on the third-party library pyglet for drawing text, and so PsychoPy doesn’t have too much control over what happens with exact placement of glyphs.

However, @jon just happens to be working on a much improved TextBox2, where PsychoPy code itself is responsible for much more of the task, such as precise placement of individual characters (and even including allowing for editable text). We were recently discussing ways to check on issues with non-Latin scripts. It is useful to have a test case like this to see how it goes.

So I modified your code as follows:

from psychopy.visual.textbox2 import TextBox2, allFonts

and:

choiceList[index] = TextBox2(win, 
                             text=characterList[index],
                             pos = choiceLocations[index],
                             font = 'Arial Unicode MS', 
                             color=(1.0, 1.0, 1.0, 1.0))

(Not sure why it needs a four-element colour specified, but it is a work-in-progress).

I ran it under the current developer version from Github, and it generates output like this:

[Image: Screen Shot 2020-05-18 at 19.24.47]

which seems to be what you are after?

I’m not sure when @jon plans on including TextBox2 in a public release, but if you can’t wait, you could clone the Github repo and start testing with the code while it is still in development.

PS From a performance point of view, you should avoid creating objects from scratch unless it is necessary. i.e. in this case, you create 20 text stimuli on each iteration of your loop. What you should ideally do is create that list of 20 stimuli just once, at the start of your code. Then in the loop, just cycle through those existing stimuli and update each of their .text attributes. i.e. it is faster to update the attributes of an existing object than to go through all of the overhead of creating it from scratch.

e.g.

# enumerate() is a bit more pythonic than range(len()).
# choiceList is the list of stimulus objects, created once before the loop:
for index, stimulus in enumerate(choiceList):
    stimulus.text = characterList[index]
    stimulus.pos = choiceLocations[index]

Actually the positions are constant I guess, so that step also only needs to happen once.
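Editorial sketch of the create-once, update-in-the-loop pattern described above. To keep it runnable without a display, `FakeStim` is a hypothetical stand-in for a PsychoPy text stimulus (it only has pos and text attributes), and `assign_characters` and the placeholder character pool are likewise assumptions for illustration, not part of the original script.

```python
import random
from dataclasses import dataclass


@dataclass
class FakeStim:
    """Hypothetical stand-in for a text stimulus, so no window is needed."""
    pos: tuple
    text: str = ""


def assign_characters(stimuli, characters):
    """Shuffle a copy of the pool and write new text into the existing stimuli."""
    pool = list(characters)
    random.shuffle(pool)
    for stimulus, character in zip(stimuli, pool):
        stimulus.text = character
    return stimuli


# Positions are constant, so they are computed once (5 columns x 4 rows).
choice_locations = [(x, y) for x in range(-50, 51, 25) for y in range(-45, 46, 30)]

# The 20 stimuli are also created once, before the trial loop.
stimuli = [FakeStim(pos=location) for location in choice_locations]

# Placeholder pool of 40 CJK code points, standing in for the CSV contents.
character_pool = [chr(code) for code in range(0x4E00, 0x4E00 + 40)]

for trial in range(10):
    assign_characters(stimuli, character_pool)
    # In a real script: draw each stimulus, flip the window, then wait.
```

In the real script the same loop body would set the text attribute of each existing stimulus, and the positions would be passed only at creation time.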

andrewsilva19 · May 18, 2020, 7:48am

This looks fantastic - exactly what I was after. I guess it’s reassuring to know that my off-center grid wasn’t the result of some mistake on my part. Thank you!

I’ll use the dev version for now, then, and will switch when TextBox2 becomes official!

Thank you for this - yes, I absolutely will take this advice and stop creating and destroying 20 new objects every loop iteration.

Michael · May 18, 2020, 7:59am

That’s OK - it’s a pleasure to work with an issue when someone provides minimal, working, reproducible example code, and illustrations of the desired and erroneous output. There is an art to asking a question that allows it to be answered…
