Hack to enable hardware accelerated video decoding in moviepy aka moviestim3

Some users in this forum have complained about not being able to present high-resolution, high-fps videos at the intended frame rate. Processing large frames takes time, so there are limits to what can be achieved, but in some cases hardware-accelerated decoding can improve the situation. I fooled around in moviepy’s code (used by MovieStim3) and found an easy way to enable hardware-accelerated decoding, at least for Nvidia GPUs (I don’t use AMD or Intel, but there should be a way for those too).

While this isn’t strictly a psychopy topic, I think some adventurous users could benefit from this - at least it’s worth giving it a try. I can’t promise huge improvements, because on my i7-8700k at work, CPU decoding is still faster even at 1440p, while on my old home rig GPU decoding did help even at 720p. This hack is far from optimal, because each frame is transferred several times between the CPU and GPU - that’s moviepy’s limitation, which would need extensive modifications to overcome.

The hack itself is very simple: ffmpeg_reader.py in the moviepy package takes care of loading and decoding movies by launching ffmpeg in a separate process and reading each decoded frame back through a pipe (I’m ignoring audio for now). Add '-c:v h264_cuvid' or a similar parameter to ffmpeg’s command line before the input file, and hardware decoding is enabled. Of course, this is for Nvidia GPUs only, and only for those that support it. AMD / Intel GPUs should be able to use another decoder, but I’m not sure which one and haven’t tried (OpenCL?).
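
Before editing anything, it may be worth checking whether your ffmpeg build lists the CUVID decoder at all. Here is a minimal sketch, assuming ffmpeg is on the PATH; the helper names has_decoder and ffmpeg_has_decoder are my own and not part of moviepy or ffmpeg:

```python
import subprocess


def has_decoder(decoders_output, name):
    """Return True if `name` is listed in the text printed by `ffmpeg -decoders`."""
    for line in decoders_output.splitlines():
        fields = line.split()
        # Decoder rows start with a flags column followed by the decoder name,
        # e.g. " V..... h264_cuvid   Nvidia CUVID H264 decoder (codec h264)".
        if len(fields) >= 2 and fields[1] == name:
            return True
    return False


def ffmpeg_has_decoder(name, ffmpeg_binary="ffmpeg"):
    """Run `ffmpeg -decoders` and look for `name` (binary path is an assumption)."""
    result = subprocess.run(
        [ffmpeg_binary, "-hide_banner", "-decoders"],
        capture_output=True, text=True, check=True,
    )
    return has_decoder(result.stdout, name)
```

Calling ffmpeg_has_decoder("h264_cuvid") only tells you that the decoder was compiled into ffmpeg. The GPU and driver still have to support it, so a real test with one of your videos is the only reliable check.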

Warning: this modifies your installed moviepy package. With the decoder hard-coded, moviepy might stop working for videos encoded with a different codec than the one specified by the added parameter.

Code (ffmpeg_reader.py lines 89-95):

cmd = ([get_setting("FFMPEG_BINARY")] + i_arg +
               ['-loglevel', 'error',
                '-f', 'image2pipe',
                '-vf', 'scale=%d:%d' % tuple(self.size),
                '-sws_flags', self.resize_algo,
                "-pix_fmt", self.pix_fmt,
                '-vcodec', 'rawvideo', '-'])

modified to

cmd = ([get_setting("FFMPEG_BINARY"), '-c:v', 'h264_cuvid'] + i_arg +
               ['-loglevel', 'error',
                '-f', 'image2pipe',
                '-vf', 'scale=%d:%d' % tuple(self.size),
                '-sws_flags', self.resize_algo,
                "-pix_fmt", self.pix_fmt,
                '-vcodec', 'rawvideo', '-'])

That’s it. This can be used for videos encoded in h264; HEVC encoded videos need hevc_cuvid.
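
To soften the problem mentioned in the warning above (a hard-coded decoder breaking videos in other formats), the decoder could be chosen per file instead. This is a rough sketch that I have not tested inside moviepy; CUVID_DECODERS, probe_codec and hw_decoder_args are hypothetical names of mine, and it assumes ffprobe is installed alongside ffmpeg:

```python
import subprocess

# Codec name as reported by ffprobe -> Nvidia CUVID decoder.
# Only the two decoders discussed in this post are listed.
CUVID_DECODERS = {"h264": "h264_cuvid", "hevc": "hevc_cuvid"}


def probe_codec(filename, ffprobe_binary="ffprobe"):
    """Ask ffprobe for the codec name of the first video stream."""
    result = subprocess.run(
        [ffprobe_binary, "-v", "error",
         "-select_streams", "v:0",
         "-show_entries", "stream=codec_name",
         "-of", "default=noprint_wrappers=1:nokey=1",
         filename],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def hw_decoder_args(codec_name):
    """Return the extra ffmpeg input arguments, or [] to keep CPU decoding."""
    decoder = CUVID_DECODERS.get(codec_name)
    return ["-c:v", decoder] if decoder else []
```

In ffmpeg_reader.py the command would then start with [get_setting("FFMPEG_BINARY")] + hw_decoder_args(probe_codec(self.filename)) + i_arg, assuming the reader object keeps the path in self.filename. Unknown codecs get an empty list, so they fall back to the normal software decoder.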

As I said above, this might not help much at all. YMMV.

Side note: do not enable '-hwaccel' (such as '-hwaccel cuvid'), because it will crash moviepy. '-hwaccel' could possibly improve decoding times a lot by avoiding CPU-GPU transfer delays, but moviepy is not prepared to use frames left on the GPU.