Yet another topic on installation & dependencies

This topic is somewhat related to the discussion Require only hard dependencies in pip install - Development - PsychoPy, but it’s about improving the installation process rather than reducing the number of dependencies.

<Update: rewrote the whole thing to incorporate several PS remarks and make it more readable and to-the-point.>

I’m trying to set up PsychoPy, and my impression is that the current state of the installation process is a bit messy. I am a user of both Conda and a Pyenv/Pip/Poetry stack, so I’m comparing both.

For starters, there’s confusion about which Python version works. The downloads page recommends 3.6; here in the discussions I saw that 3.8 was the target; the PyPI package shows it’s OK up to 3.10; and conda-forge has packages up to 3.9 (as of 5 April 2022). In practice, for a pip-based install (e.g. Poetry), the dependency PocketSphinx has wheels only up to 3.6, and other packages (pyo, pywinhook etc.) are similarly limited to 3.8. Irritatingly, these packages don’t declare clear Python version limits, so, lacking recent wheels, pip attempts (and fails) to compile them from source. Compiling doesn’t work most of the time, and shouldn’t be expected to work on a normal user’s computer anyway. It also takes an eternity to resolve the dependency tree because of all the possible combinations (10+ minutes on Poetry). On the other hand, while conda is happy to install PP on 3.9, that installation has other issues: it’s missing dependencies, and if those dependencies are required, some of them don’t actually support 3.9 (e.g. tobii-research). It can be argued that none of this is PP’s fault, but that’s like turning a blind eye to reality - it’s not possible to change the entire messy Python ecosystem, but it’s not that difficult to make it work by specifying package versions that work.
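A minimal sketch of what “specifying package versions that work” could mean in practice. The version ceilings below are just the figures quoted in this post, not verified limits, and `MAX_PYTHON` and `blocked_packages` are made-up names for illustration:

```python
import sys

# Highest Python version with usable wheels, as quoted in this post
# (April 2022). Illustrative only; check PyPI before relying on them.
MAX_PYTHON = {
    "pocketsphinx": (3, 6),
    "pyo": (3, 8),
    "pywinhook": (3, 8),
}


def blocked_packages(python_version=None, ceilings=MAX_PYTHON):
    """Return the packages whose newest wheels are too old for python_version."""
    version = tuple(python_version or sys.version_info[:2])
    return sorted(name for name, newest in ceilings.items() if version > newest)


if __name__ == "__main__":
    current = "{}.{}".format(*sys.version_info[:2])
    for name in blocked_packages():
        print(f"{name}: no wheel reported for Python {current}")
```

In `setup.cfg` the same limits could be expressed with PEP 508 environment markers, e.g. `pyo; python_version < "3.9"`, so that pip skips a package on newer interpreters instead of trying to compile it.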

Then, there’s the fact that the Conda and PyPI 2022.1.2 versions have different dependencies. E.g., pyo, python-vlc, the Google API packages, tobii-research and others are not required on conda. Even though I haven’t tested it, I suspect that this comes with a difference in functionality, e.g. the vlc-based moviestim version won’t work with conda out of the box, will it? Furthermore, skipping some packages seems to go against the principle of a monolithic install anyway. If conda can be slimmed

In the referred discussion, the point is made that the monolithic installation is currently preferred. I can accept this so I won’t argue in favour of a modular PP (even though I do think some parts could be easily broken off, such as the Builder). Yet even with a monolithic package, the documentation could be clearer, conda and pypi versions could be aligned, and version specifications could be narrowed down so that releases are simple and straightforward to install with pip. If some neglected dependency doesn’t actually work on anything but 3.6, then, even though core PP can run on 3.10 in theory, requirements for releases should be limited to a version that works.

So far, I’ve found that git clone’ing the repo and changing setup.cfg to my liking works best, but that way is hardly feasible for every user.

3 Likes

I’d also love to see a psychopy package that is pip installable. Due to the vast number of dependencies, I think this is only possible if the package defines some core (essential) dependencies and makes the rest optional. For many users, the standalone package might be fine, but I would like to integrate PsychoPy into my regular workflow, which makes use of pip and venv (so pretty much the default tools).

I also think that this “mess” is certainly not entirely due to PsychoPy. It’s rather related to the general packaging state of Python, which unfortunately does not have a single method that works for everything (xkcd: Python Environment). In particular, conda is an external package manager that I do not like to support, because the number one priority should be pip, the official Python package manager. This should be the baseline, so if I’d put any effort into improving packaging I’d try to make it work with pip first.

2 Likes

While I agree with you in general, my point is not that the number of PP dependencies should be reduced. That has been discussed before (see the reference at the top), and it was clearly stated that this was not planned, so let’s not propose again something that’s already been rejected.

My point is that PP installation should be more consistent throughout various platforms (conda, pip), and requirements for release packages should be stricter to allow successful pip installs. This is doable with a little testing and care, as opposed to the huge effort that’d be required to break up PP.

2 Likes

It will be immensely easier with fewer dependencies. But I’m happy with any improvement that can be made to make pip install psychopy work on any platform with the latest Python version.

1 Like

I should’ve been more specific with the title. Something like “please make installs deterministic” or “fix dependency specifications” would’ve been more fitting.

@cbrnr, I’m grateful for your replies, since you seem to be the only one interested in this topic. However, I think we’re talking about different things. Making PP work on the latest Python versions isn’t possible because of the limitations of dependencies, which require older versions. Having a large number of dependencies seems to be a design decision, which I don’t want to argue with.

On the other hand, the current installation process could be improved with little work, if the developers were interested. In a large project that relies on dozens of external packages, like PP, having deterministic installation (at least for product releases) can avoid a lot of problems. Problems with installation (see my original post) can be minimized; debugging is also much simpler if everyone has the same packages.
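As one illustration of a deterministic release, the versions known to work in a tested environment could be written out as exact pins. This is only a sketch using the standard library’s `importlib.metadata`; the helper name `frozen_requirements` and the package list are hypothetical, and `pip freeze` already does something similar for a whole environment:

```python
from importlib import metadata


def frozen_requirements(names):
    """Return 'name==version' pins for the installed packages in names.

    Packages that are not installed are skipped, so the result only
    describes combinations that actually exist in this environment.
    """
    pins = []
    for name in sorted(names, key=str.lower):
        try:
            pins.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            continue
    return pins


if __name__ == "__main__":
    # Illustrative subset; the real dependency list lives in setup.cfg.
    print("\n".join(frozen_requirements(["numpy", "pyglet", "wxPython"])))
```

The printed lines could be saved as a requirements file and shipped with a release, so that everyone installing that release gets the same set of packages.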

1 Like

I think we both would like to see pip install psychopy working. You’re right that we talk about different ways to achieve this, but if you think this can be fixed by restricting package versions I’m sure the developers will be happy to discuss this in a PR. After all, this currently doesn’t work, and pretty much the only way to use PP is to use the standalone installers (which do not exist for Linux).

Well, you can get it to work by editing the dependencies manually - that way it’s possible to slim down the dependencies, which is basically what you’re asking for. Just clone the repo from git, comment out the dependencies in setup.cfg that you don’t need, and install from source. I use poetry for the last step, and it’s as simple as ‘poetry add ./psychopy’, provided the PP source is located in a sub-folder of your project. While poetry is unable to come up with a working solution for all dependencies within an acceptable amount of time (once I let it run for 20+ hours), it works well with a reduced number.

1 Like

I usually do pip install psychopy --no-deps and then add the essential dependencies.

This is an open-source project. You clearly have a strong level of technical knowledge, so you could become one of the developers, simply by cloning the repository and issuing pull requests.

This includes updating errors you note in the documentation - the website content itself is under git control.

Sorry for the delay. I was on leave last week and there are lots of parts to this, lots of issues you point out, but I want to make sure everyone’s clear about the directions and the reasons.

By the way, I think it’s a touch unfair to say nobody is interested in this problem - I’ve spent literally thousands of hours trying to get this right and battling to package standalone installers that are probably quite a big part of PsychoPy’s popularity. Yes, installing with pip and conda doesn’t always work, especially with newer versions of Python, but that’s mostly due to problems with dependencies.

Which Python versions do/should we support?

We aim for PsychoPy’s own code (which is what we can ultimately control) to support all the current versions of Python for which there is support and I believe we are currently achieving that. Whether that means you can use it on a particular version of Python would depend on what features you need for your experiment. For instance, if you don’t need precise response timing (so you don’t need iohub, psychtoolbox) and you don’t need eye-tracking then I believe you should be able to get a minimal installation working as @cbrnr points out, using pip install --no-deps and then manually installing what you need. I think that is possible using Python 3.10.x for instance.

What Python versions should we claim to support?

As @kxKkPbRawwdwEu points out the documentation lists Python 3.6 as recommended (needs updating to say 3.8) and the setup.cfg lists all versions 3.6-3.10. That’s based on the fact that our own code works with all those versions to the best of our knowledge. If not we’d call it a bug and try to fix it. Should we change that to say it’s only compatible up to versions where ALL dependencies work? I don’t think so. Python 3.10 would be fine for a user that doesn’t need hardware interfaces like eyetrackers. (By the way a wheel for pywinhook is available for py3.10)

Should we make sure that all dependencies are available for a particular Python version before saying that version is supported? I personally think that’s a step too far. The main issues with other versions are around hardware, like eyetrackers, and that’s the thing that our user

What packaging systems should we support?

It’s hard to support all things, but contributors can add support for the things they care about. Current status of the options:

  • PsychoPy Standalone : This is the recommended system because we know exactly what’s packaged, including dylibs
  • pip install is next best if you really need your own environment
  • conda often seems to have issues with clashing or broken dependencies but fine if someone else wants to work on it. Volunteers?
  • poetry would be great to support - I’ve heard good things - but I don’t know enough yet to add support myself. Volunteers?

What packages are “required”?

If we know what dependencies are required we could just specify those and it would make pip install psychopy work on a wider range of platforms.

The problem is that it depends on what fraction of PsychoPy’s very large functionality you intend to use, and that’s the issue that cropped up in previous discussions. Examples of non-strict dependencies, focusing on those that are sometimes a problem but important to most users:

  • wxpython is only required if you want to use Builder. Most people do, but not everybody. Do we put that in “minimal”?
  • at least one of iohub or psychtoolbox is required if you want sub-ms precision in responses. One specific issue in the original post above is about pywinhook which is only needed by iohub in order to get high-precision key timing on windows. Not all users need sub-ms precision, but many do and just expect it with no tweaking, so is sub-millisecond “required”?

There are other packages that are now not really needed; pyo, for example, is effectively obsolete if psychtoolbox can be installed. Those should probably be moved to a legacy_requirements or similar.

Specifying dependency sets according to the features

We could break the optional dependencies into chunks like “minimal”, “precision”, “microphone”, “eyetracking”… We could have one called “standard” with most of the features we think people use. There are a couple of ways to provide the installation options and it kind of comes down to what we want the simple pip install psychopy option to do. Should we make it easiest for people wanting to install the bulk and a bit hard to install minimal, or should we make it easier to install minimal by default and extra steps to add features?

Multiple requirements files: We could provide some additional requirements files for things like below? This could install “standard”

pip install psychopy

This could install just psychopy and fewer dependencies, according to what you need

pip install psychopy --no-deps
pip install -r requirements_standard
pip install -r requirements_microphone

So this system is good if most users want everything, less good for users that want subsets of functionality.

Specify ‘extras’ in setup.cfg or in the poetry installer

This system is rarely used but it ends up being the reverse. The simple install would be very much minimal, and then you’d need to specify extra args to get things like high-precision:

pip install psychopy # would install almost nothing
pip install psychopy[precision]  # would be the additional deps for that
pip install psychopy[eyetracking]  # etc  

In poetry it would use the extras dependencies system and look like this:

poetry install  # to get very little
poetry install --extras "precision eyetracking"
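For illustration only, here is roughly how such extras might be declared and resolved. `[options.extras_require]` is the setuptools section for extras; the extra names are the ones used above, while the package lists and the helper functions `load_extras` and `requirements_for` are assumptions, not PsychoPy’s actual configuration:

```python
import configparser

# Hypothetical setup.cfg fragment: the package lists are placeholders,
# not PsychoPy's real metadata.
SETUP_CFG = """
[options.extras_require]
precision =
    psychtoolbox
eyetracking =
    tobii-research
"""


def load_extras(cfg_text):
    """Parse [options.extras_require] into {extra: [requirements]}."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    section = parser["options.extras_require"]
    return {
        name: [line.strip() for line in value.splitlines() if line.strip()]
        for name, value in section.items()
    }


def requirements_for(extras, selected):
    """Union of the requirements of the selected extras, without duplicates."""
    wanted = []
    for name in selected:
        for requirement in extras[name]:
            if requirement not in wanted:
                wanted.append(requirement)
    return wanted


if __name__ == "__main__":
    extras = load_extras(SETUP_CFG)
    print(requirements_for(extras, ["precision", "eyetracking"]))
```

Extras are purely additive: selecting one can only add requirements to the base install, never remove any.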

Plugin system

For info, we are keen ultimately to move to a plug-in approach to PsychoPy so that functionality can be broken down into the chunks and that should allow a better way of specifying the dependencies you need. Then you could add the eye-tracking plugin, for instance, and then it becomes more clear what we should include as a minimal install. The issue with this approach is that it involves quite a bit of coding effort and testing to make sure that users can easily install the relevant plugins

4 Likes

Thanks @jon for your detailed response! I really appreciate the effort you put into PsychoPy! Just a short comment, you can also combine “extras”, for example pip install psychopy[precision,eyetracking]. I think this would be a really nice way to install only what you need. There could also be a shortcut such as pip install psychopy[all].

I agree that pip should be the next best option after the standalone installers. After all, it is the standard (official) tool to install packages (and not conda).

Regarding what should be the default (minimal or maximal), I’d go with minimal and then let users add their extras. If this is documented, I don’t think it should be a problem.

Well, just to be explicit, the slight issue is that, whichever user we optimise for (the minimalist or the give-me-everything-I-might-want) they tend not to read the documentation. People tend to go to the terminal and just type pip install psychopy without checking whether there are any useful options to that. It cuts both ways:

  • install-everything users get annoyed if they do pip install psychopy but their experiment doesn’t work (because their study used a feature outside the minimal set)
  • minimalist users get annoyed if install fails on a dependency that isn’t needed for their use-case

Possibly this can be solved with minimalist install by default and clear logging like:

  • “Your experiment uses keyboard responses but the low-latency psychopy extension is not installed. We recommend you run pip install psychopy[low-latency] or pip install psychopy[all]”
  • “Your experiment uses eyetracking but the eyetracking extension is not installed. Please run pip install psychopy[eyetracking] or pip install psychopy[all]”
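A rough sketch of how such a message could be produced at the moment a feature is first used. The `FEATURES` table, the module names in it and the function names are assumptions for illustration; only `importlib.util.find_spec` and `warnings.warn` are real standard-library calls:

```python
import importlib.util
import warnings

# Hypothetical mapping: feature -> (module it needs, extra that provides it).
FEATURES = {
    "low-latency": ("psychtoolbox", "low-latency"),
    "eyetracking": ("tobii_research", "eyetracking"),
}


def missing_extra_message(feature, features=FEATURES):
    """Return an install hint if the feature's module is absent, else None."""
    module, extra = features[feature]
    if importlib.util.find_spec(module) is not None:
        return None
    return (
        f"Your experiment uses {feature} but the {extra} extension is not "
        f"installed. Please run pip install psychopy[{extra}] "
        f"or pip install psychopy[all]"
    )


def warn_if_missing(feature, features=FEATURES):
    """Emit the hint as a warning; return True when the feature is usable."""
    message = missing_extra_message(feature, features)
    if message is not None:
        warnings.warn(message, RuntimeWarning, stacklevel=2)
    return message is None
```

Checking with `find_spec` avoids importing the optional module just to find out whether it exists, so the check stays cheap even when it runs every time a feature is used.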

So it seems like it is easier to soften the blow for minimalist default than it is the other way around. And I guess the other argument in favour of minimalist installs is that this is more often the preference for “advanced” users. Newbie users will/should go with Standalone anyway

1 Like

Exactly this! I also think that the standalone installers already provide the “give-me-everything-I-might-want” approach.

@jon, thanks for such a detailed response. I assumed that since this topic has been discussed before, it wouldn’t get much attention so I was too hasty to add that remark about “nobody being interested”. I am sorry, it was not appropriate at all. I’m glad to see so many replies. PsychoPy is an amazing project and I’m happy that it’s available for free - thank you @jon.

I agree with most of what’s been posted here and in no way am I trying to say that any of it is incorrect. I also understand the difficulties and complexity of moving forward. However, I think that while you’re discussing similar points that I mentioned in my original post, the perspective / emphasis is slightly different. The way this discussion is going is what I’d call developer-focused as opposed to user-centric. In the end, the conclusions may be the same or very similar, but I think looking at it from another perspective is useful.

My user experience

PP is a bit like configuring things on Linux – either it works out of the box or you’ll spend the next day digging around source code, researching documentation (if you’re lucky) or trying to get information from online discussions. Let’s say I’m a mid-level user: I know a bit of programming, won’t shy away from reading some docs, and I even enjoy tinkering with PP’s dependencies up to a point. On the other hand, I don’t necessarily want to know everything about PsychoPy’s dependencies just to be able to install it. I know the standalone version won’t work for me, and I don’t need most of the extra functionality, like the Builder, speech recognition etc. Yet I had to research most of the dependencies to get a feeling for whether I’d need them or not; even then I had to guess what’s needed, and it still took several attempts (and a lot of time) to get a working combination. For example, I removed bidi and the arabic-something packages because I don’t need that functionality, yet psychopy.visual wouldn’t load without them. Figuring out which Python version would work took several attempts as well. And I didn’t even think of looking for unofficial builds outside of PyPI.

@jon gave a few examples of some packages that are or aren’t necessary depending on what the user wants to do with PsychoPy: wxpython, iohub, and that pywinhook has recent, albeit unofficial wheels. These examples actually support my point. Which package is necessary for what function, and which package works with which Python version, is something developers probably know, but how is a user supposed to have all that information?

There’s a lot that can be done about this, and several possible ways to go about it have already been mentioned here:

  1. Improve the documentation; detailed description of what functionality needs which dependency.

  2. When PP is run, check missing dependencies and print out a (long and detailed) message about what will not work and details about dependencies (such as in the previous point). After that, things can break the same way they break now – the user has been warned. This way the user is confronted with the information each time PP is run. I think this is not that difficult to implement.

  3. The previous point can be done in different ways, e.g. as mentioned, a modular approach would warn about missing dependencies when an attempt to use the respective function is made. However, this is probably more complicated to implement.

  4. Provide a release package (e.g. psychopy-release) with dependencies “frozen” to versions that are tested to be working; similar to the standalone install but working with package managers. This would be very quick and easy to install with package managers as no resolution process is required.

  5. Lean core package + modules / deps grouped into functionality-based packages

  6. Combinations of the above.
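Point 2 could look something like the sketch below, run once when PP starts. The `OPTIONAL_DEPS` table is hypothetical and only reuses examples mentioned in this thread; it is not taken from PsychoPy’s code:

```python
from importlib import metadata

# Hypothetical table: distribution name -> what stops working without it.
OPTIONAL_DEPS = {
    "wxPython": "the Builder app",
    "psychtoolbox": "sub-millisecond response timing",
    "tobii-research": "Tobii eye tracking",
}


def dependency_report(optional=OPTIONAL_DEPS):
    """Return (installed, missing).

    installed maps name -> installed version; missing maps name -> the
    functionality that will be unavailable.
    """
    installed, missing = {}, {}
    for name, purpose in optional.items():
        try:
            installed[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            missing[name] = purpose
    return installed, missing


def format_report(missing):
    """Turn the missing table into a message for the user ('' if none)."""
    if not missing:
        return ""
    lines = [
        f"  - {name}: {purpose} will not work"
        for name, purpose in sorted(missing.items())
    ]
    return "Missing optional dependencies:\n" + "\n".join(lines)


if __name__ == "__main__":
    print(format_report(dependency_report()[1]))
```

The same table could also feed the documentation in point 1, so that the two don’t drift apart.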

The solution I favour, and which I believe doesn’t take a lot of effort to implement, is to have a lean core install and better info about dependencies (possibly points 1 and 2 both), but this might not work for most casual users.

Package managers

I stopped using conda a while ago because of differences in the availability of packages and, more importantly, because the same package might be available under different names, or the same package name would include different content compared with PyPI. PsychoPy belongs to the last group. I wonder what the reason is for those differences: why does the conda package install different requirements than the pip one?

In any case, pip has known issues with package conflicts, and, last time I checked, even their latest (better) dependency resolver is documented to introduce conflicts if packages are added to an existing environment as opposed to installing said packages together. I think Conda is better in this respect, although having a separate ecosystem of packages often leads to problems. I’ve read that poetry has a reliable resolver, and in my experience, it works reasonably well. I don’t think poetry needs special care to be supported – it works through pip, after all. However, the other day I tried to install PP with all dependencies with poetry and it ran for 20+ hours without being able to find a solution (I believe it was targeting Python 3.6), so it’s not that useful in every situation. All of the above managers work much better if the dependencies are fixed or at least more strictly versioned, though.

All in all, I understand that improving on the current situation can involve quite a bit of extra work. My skills and time are limited, but, in principle, I’d be willing to help out if I can.

In general, I do try very much to think about this from the user perspective. It’s just that there are many types of users with different perspectives! You’re optimising for your own type - mid-level programmer wanting minimal install (on Py3.10?). Some of the actions you suggest work for all types (e.g. the docs were incorrect, which I’ll fix today) but others improve your experience at a cost to others (e.g. moving to the lean core distribution) and then we have to work out which is best for most users

To answer some specific Qs:

That’s a tricky one, but could be interesting to work on. It should presumably have some sort of validating code to make sure it’s currently accurate. It’s the sort of thing that could easily slip out of date.

The conda installation was contributed by users that wanted it but I think they’re not using PsychoPy these days or haven’t got around to updating it. Also they had to use some workarounds for a couple of packages that were hard to get into the conda ecosphere or were incompatible with other parts of conda packaging.

This might have been due to an issue that was recently fixed in our setup.cfg where conflicting versions of python-vlc were being requested. It’s now fixed in the release branch so hopefully the next release will work better for this (2022.1.3 which is imminent)

This is an interesting approach and might also be used to solve the issue of minimalist users versus those that prefer omnibus packages. pip install psychopy could be omnibus as now, and pip install psychopy-release could be the minimalist installer

Kind of. What’s best for users is something you have a better view on; my wish is merely to have a working distribution (lean or otherwise) that can be installed with a package manager, and information about how to customise things. I cannot think of any downsides to these for users (more work for the devs, probably). My preference is to have a lean install, but that’s optional.

I can imagine that the testing etc. required to validate this kind of information would be an extra burden. However, once a release is out, it won’t change (much) so this information will not necessarily go stale either, will it?

I was aware of this problem and had already fixed all labels in setup.cfg that could “confuse” Poetry. I only did this to see if PP would install at all with all dependencies (which means it had to be Python 3.6). Poetry had no trouble with PP on Python 3.8 with a few less dependencies.

I’m currently looking at the options for packaging subsets of PsychoPy dependencies

The problem (as described in this thread and others):

  • we have lots of dependencies
  • some don’t install nicely or are large
  • some aren’t needed, depending on what you do
    • if you don’t do eyetracking you don’t need those libs
    • if you don’t use the app you don’t even need wxPython(!)

We want a way that easily handles the following scenarios/user-types:

  1. I only want the lib, not the app, but I do want serial/parallel ports
  2. I want the app but not much more
  3. I want everything in one go

Ways we can specify sub-packages or soft dependencies:

  • Extras: poetry and pip both support ‘extras’ so we can do
    • poetry add "psychopy[hardware]" # add extra dependencies
    • pip install psychopy[hardware]
    • BUT no way to exclude something. You can’t use this to say psychopy has app by default but that can be excluded
  • Groups: this is a pure poetry thing but looks great for one day:
    • poetry install --with hardware --without app
    • BUT new. Not supported on pip and requires poetry 1.2+
  • Multiple different release objects (different github/pypi entries)
    • pip install psychopy # (most common)
    • pip install psychopy-lib # (smallest set of depends)
    • pip install psychopy-app # (superset of psychopy-lib)
    • pip install psychopy-app psychopy-hardware
    • pip install psychopy-superpack # install most things
2 Likes

@jon

Thanks for continuing to pursue this subject.

Regarding your point about how to “specify sub-packages”, I think the most feasible way would be the 3rd option, i.e. separate pip packages. Relying on a feature in Poetry (that not everyone uses) in a development branch (1.2) doesn’t sound so good. I don’t know much about extras, but unless it allows one to specify several extras, I don’t see how it would be able to support the multiple options that I also think are necessary (basically as you listed in the 3rd option).

Currently going through the python dependency hell. Very nice to read your response from Aug. 2022. There is hope! poetry add psychopy fails very hard on Debian 11. I’ve been using pyenv to try all versions from 3.8.13 onwards. No luck.

poetry groups are fantastic. But yeah, too new. Plus poetry is already proving to be slow at resolving large dependency trees, and imo will be replaced by something faster somewhere down the line. I’d definitely say packaging for pip is much more future-proof. That said, for now, I do use poetry and love the reproducible builds it offers.

So progress report:

  • PsychoPy release v2023.1 now has a plugin mechanism and the ability to install both plugins and standard PyPI packages from within the GUI. That solves some of the problem even for Standalone users and we’ve added about 20 plugins already focussing for now on moving particular hardware extensions out. We will be moving additional parts of the code, like different movie backends, different audio backends, out to plugins over the next year
  • I’m currently adding support for pyproject.toml dependency listings using pdm (I tried poetry as well, but that one annoyed me - pdm is very similar but works better!). That allows us to add/remove a dependency more flexibly and track versions for clashes etc. It’s also pretty much critical for the next step…
  • allowing subset packages (like the experimental psychopy-lib) as independent projects. psychopy-lib is for people that essentially want a psychopy experiment run-time without creating experiments in the app and doesn’t need wxpython, for instance. Creating these subset packages is going to be a bit painful to work with - we can’t just specify different dependency sets in a single repo, unfortunately, so I’m investigating doing it this way with entire separate repos. The problem with these will be that their history won’t nicely map onto the main repo, and I’m concerned they’ll break when automatically updating, but we’ll see how they go
3 Likes