The file size is probably not the issue; in the end a combined file would be similar to or slightly smaller than the separate video and audio files put together. The issue is that the way PsychoJS interfaces with cameras and microphones just doesn’t allow it.
It actually works almost exactly the same way in both cases: it creates a media buffer of the specified format (video or audio) and then saves it as a webm file (yes, the microphone recordings are also webm files). The problem is just that it doesn’t have a way of combining those two buffers into a single file. You can combine them after downloading the data without too much trouble; you just can’t do it in Pavlovia itself, as far as I can tell. Even with custom code it would be very tricky, because you’d have to access the buffers before they were saved and uploaded, and I’m not sure the encoding system would allow it anyway.
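For the "after downloading" route, here is a rough sketch of one way to do it in Python by calling ffmpeg. This is not something PsychoJS or Pavlovia provides. It assumes ffmpeg is installed and on your PATH, the function names and file names are made up for illustration, and it assumes the two recordings started at the same moment (if they didn’t, the audio and video will be out of sync and you would need to add an offset).

```python
import shutil
import subprocess
from pathlib import Path


def build_mux_command(video: Path, audio: Path, output: Path) -> list[str]:
    """Build an ffmpeg command that puts one video and one audio stream in one file.

    Hypothetical helper for illustration only.
    """
    return [
        "ffmpeg",
        "-i", str(video),   # input 0: the camera recording
        "-i", str(audio),   # input 1: the microphone recording
        "-map", "0:v:0",    # take the first video stream from input 0
        "-map", "1:a:0",    # take the first audio stream from input 1
        "-c", "copy",       # copy the streams as-is, without re-encoding
        str(output),
    ]


def mux_recordings(video: Path, audio: Path, output: Path) -> None:
    """Run ffmpeg to combine the two downloaded recordings."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    # check=True raises CalledProcessError if ffmpeg exits with an error
    subprocess.run(build_mux_command(video, audio, output), check=True)
```

You would call it with something like `mux_recordings(Path("camera.webm"), Path("mic.webm"), Path("combined.webm"))`, where those file names are placeholders for whatever your downloaded files are called. Because the streams are copied rather than re-encoded, it should be quick and shouldn’t lose any quality.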
I think that the PsychoJS version of the camera component is still somewhat under development (right now it’s basically a copy/paste of the microphone component with a few tweaks), so I expect this is something that may change in the future, but I’m afraid that for right now this is just a hard limitation.