Selecting random rows and repeating without replacement

Hi,
PsychoPy Builder uses the random functions provided by the numpy library (beware: these are not the random functions provided by the Python standard library itself). This is the relevant line you’ll see at the start of all Builder-generated scripts:

from numpy.random import random, randint, normal, shuffle

It’s important that you understand what these functions do before using them. For example, look up the numpy docs online:
https://docs.scipy.org/doc/numpy/reference/routines.random.html

You’ll see that random() (i.e. numpy.random.random()) returns “random floats in the half-open interval [0.0, 1.0)”. If you specify a size (as you do), then you get an array of random floating point numbers. Each of those numbers is independent of the others. And because you are sampling from a continuous distribution, the notion of “sampling without replacement” doesn’t even make sense, as there is effectively an infinite number of floating point numbers to choose from.
But note that even if you were directly sampling from integers, you would still have the same problem: it’s not the rounding that is causing your issue, it is that these numpy sampling functions inherently return values that are independent of each other. So “that is solved already” is not true.
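
A quick way to convince yourself of this (a minimal sketch, not something Builder generates; it uses numpy's randint(), whose upper bound is exclusive): draw 41 integers from the 40 values 0 to 39. Without replacement that would be impossible, but randint() returns them anyway, so at least one value must repeat.

```python
from numpy.random import randint

# 41 independent draws from the 40 integers 0..39 (high is exclusive)
draws = randint(low=0, high=40, size=41)

n_unique = len(set(draws.tolist()))
print(len(draws), "draws,", n_unique, "unique values")

# with more draws than possible values, repeats are guaranteed
has_repeats = n_unique < len(draws)
print("repeats present:", has_repeats)
```

With a smaller sample (say 20 draws) repeats are no longer guaranteed, but they remain possible on every run.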

Two things to take from this:

(1) Be careful to use the appropriate functions. If you want a sample of random integers, use the randint() function rather than random(). Note that the upper bound of randint() is exclusive, so to draw from the numbers 0 to 39 you need e.g. randint(low=0, high=40, size=20)

(2) The above still doesn’t give you sampling without replacement, as each number is still independent of the others (i.e. repeated values are possible). In programming, “sampling without replacement” is more usefully thought of as:
– creating a finite set (e.g. the numbers 0 to 39)
– shuffling them to create a random order
– taking a subset of the first (or last, or whatever) n values.

e.g.

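# create initial list once (list() is needed in Python 3, where range() is not a list):
row_numbers = list(range(40))

# do this each time a new sample without replacement is needed:
shuffle(row_numbers)  # numpy's shuffle reorders the list in place
row_sample = row_numbers[0:20]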
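
If you would rather not manage the shuffled list yourself, numpy can do the same thing in one call. This is only a sketch of an alternative, not what Builder generates: the helper name sample_rows and the sizes 40 and 20 are illustrative, and choice() would need to be imported in addition to the functions Builder imports by default.

```python
from numpy.random import choice

def sample_rows(n_rows, n_sample):
    """Return n_sample distinct row indices drawn from 0..n_rows-1."""
    # replace=False means no index can be picked twice within one sample
    return choice(n_rows, size=n_sample, replace=False).tolist()

# each call gives a fresh sample; values never repeat within a sample,
# but the same row can appear again in a later sample
first_sample = sample_rows(40, 20)
second_sample = sample_rows(40, 20)
print(first_sample)
print(second_sample)
```

If rows must also never repeat across successive samples, the shuffle-and-slice approach is easier to adapt: shuffle once and take consecutive slices.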