MIT-SHM security

First things first. This is not about today's rumor of a security vulnerability in OpenSSH. As it happens, I know nothing about it. No. I want to mention another problem, which has a lot to do with the X11 protocol, and almost nothing to do with the Secure SHell one.

Over the last few months, I have been porting the X11 support in VLC to the new, or rather not-so-old, X C Bindings library, a.k.a. XCB (more on this another time, maybe). Among other things, I re-implemented support for the MIT-SHM extension of X11.

Originally, X11 transmitted images as pixmaps over a TCP/IP connection. Back then, this was cool because it enabled remote desktop or "export display". However, pixmaps tend to be rather large, meaning transmission was slow. So when using the local display, X11 was made to run on local (Unix) connections instead. But even that was wasteful, as the pixmaps were copied from the X11 application to the kernel, and then from the kernel to the X11 display server, i.e. two useless copies (or at least one, depending on how the kernel implements Unix sockets). People at MIT invented an extension whereby pixmaps would be stored in memory shared between applications and servers, using what is now known as System V IPC shared memory segments.

Back to our matter... The use of MIT-SHM is particularly popular with video players, as it avoids a lot of large memory copies, which waste both CPU time and memory bandwidth. However, the Xlib documentation for MIT-SHM recommends:

(...) on many systems for security reasons, the X server will only accept to attach to the shared memory segment if it’s readable and writeable by ‘‘other’’. (...)

Obedient media players, and an ever increasing number of speed-conscious X11 graphical user interface toolkits, consequently create their SHM segments with 0777 permissions, i.e. world-readable, world-writable (and world-executable, but I will neglect that today).
And there is the problem... anyone on the system can attach to the segments and read the raw pixmap content. In other words, anyone on the system can access the images that media players send to the hardware. And they can even tamper with them! For this, they need to determine the segment ID, the pixmap format and the dimensions, at the right time.
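To make the exposure concrete, here is a minimal sketch of what reading someone else's segment could look like with the plain System V calls. The function name snoop_segment is made up for illustration, and the sketch assumes the segment ID is already known; guessing it, along with the pixmap format and dimensions, is the hard part.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper: copy up to `len` bytes out of the System V segment
 * `shmid`. Returns the number of bytes copied, or -1 if the segment cannot
 * be inspected or attached (e.g. its permission bits deny access). */
ssize_t snoop_segment(int shmid, void *out, size_t len)
{
    struct shmid_ds ds;

    assert(out != NULL);
    if (shmctl(shmid, IPC_STAT, &ds) == -1)
        return -1;
    if (len > ds.shm_segsz)
        len = ds.shm_segsz;           /* never read past the segment */

    void *p = shmat(shmid, NULL, SHM_RDONLY);
    if (p == (void *)-1)
        return -1;
    memcpy(out, p, len);              /* raw pixels, whatever the format is */
    shmdt(p);
    return (ssize_t)len;
}
```

With a world-readable segment, the call succeeds for any user on the machine. With owner-only permissions, it should fail with EACCES for everybody but the owner (and root).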

The solution to this problem is obvious. In fact, it is in the same paragraph of the Xlib documentation:

On systems where the X server is able to determine the uid of the X client over a local transport, the shared memory segment can be readable and writeable only by the uid of the client.

In other words, it is as simple as creating segments with a 0600 access permission mask. Over non-local transport, MIT-SHM does not work anyway. The degenerate case of non-local transport on a local system is down to plain dumb X server misconfiguration.
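As a rough sketch, creating such an owner-only segment with the System V API could look like the following. The helper names create_private_segment and segment_mode are invented for this example; the resulting ID would then be handed to the X server through the usual MIT-SHM attach request.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/stat.h>

/* Hypothetical helper: create a new segment that only the owner can read
 * and write (mode 0600). Returns the segment ID, or -1 on error. */
int create_private_segment(size_t size)
{
    assert(size > 0);
    return shmget(IPC_PRIVATE, size, IPC_CREAT | S_IRUSR | S_IWUSR);
}

/* Hypothetical helper: return the permission bits of a segment,
 * or -1 if it cannot be inspected. */
int segment_mode(int shmid)
{
    struct shmid_ds ds;

    if (shmctl(shmid, IPC_STAT, &ds) == -1)
        return -1;
    return ds.shm_perm.mode & 0777;
}
```

A cautious application could fall back to non-shared image transfers if the server refuses to attach to such a segment, rather than widening the permissions.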

What about OpenSSH?

At this point, it is fair to wonder how this relates to OpenSSH. Indeed, OpenSSH is all about non-local access, whereby MIT-SHM does not normally work in the first place.

Problem is, the OpenSSH X11 export display feature does pretend to support MIT-SHM. Or more precisely, it fails to account for the fact that the proxied X server cannot support MIT-SHM when used through OpenSSH.
In principle, the X server should not offer the MIT-SHM extension to remote clients or, more generally, to any client for which it knows the extension will not work. Unfortunately, OpenSSH proxies the X11 server in such a way that remote X11 client applications (running on the remote SSH server) look like local X11 clients to the local X11 server (running on the local SSH client machine). Namely, it creates a fake local connection between the ssh client and the user's DISPLAY, which usually points to a local X server. To be fair, OpenSSH has no other option. X11 over TCP is disabled by default on most modern systems, TCP would incur needlessly higher overhead, and I am not even sure whether X servers know to disable MIT-SHM for TCP (as I never used X11 over TCP)...

Older versions of VLC media player were known to crash because of this, as they did not handle MIT-SHM failures correctly. This was fixed a long time ago.
But theoretically, there is still a possibility for an evil local user to leverage MIT-SHM to display content at will. Again, the segment ID on the remote system, the pixmap format and the dimensions must be determined, at the right time. On Linux, segment IDs follow a static, linear assignment algorithm, so predicting them should be feasible. The other parameters were already mentioned earlier.
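One way to get a feel for how predictable the IDs are on a given machine is to allocate a few segments in a row and look at the numbers. The sketch below does just that; sample_segment_ids is a made-up name, and the exact pattern depends on the kernel version, so no particular output is assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper: create `n` small owner-only segments, record their
 * IDs in `ids`, then remove them all. Returns how many were created. */
int sample_segment_ids(int *ids, int n)
{
    int created = 0;

    assert(ids != NULL && n >= 0);
    while (created < n) {
        int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
        if (id == -1)
            break;                    /* out of segments, or not permitted */
        ids[created++] = id;
    }
    /* Clean up only after all IDs were collected, so none is reused. */
    for (int i = 0; i < created; i++)
        shmctl(ids[i], IPC_RMID, NULL);
    return created;
}
```

Printing the collected IDs (for instance with printf) and comparing the gaps between consecutive values shows whether the next ID can be guessed on that system.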

The potential impact is quite minimal (hijacked video content), as is the exposure (local attack). This is definitely not a doomsday scenario for OpenSSH... Still, I wish it would strip MIT-SHM from the extension list during X11 setup. The proxied X11 sessions would have to be partially parsed by SSH, which is certainly why the bug was never fixed.
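To give an idea of the parsing involved, here is a minimal sketch of the two building blocks such a filter would need, based on the core protocol encoding of QueryExtension (opcode 98). The function names are invented for illustration and this is not OpenSSH code. A real filter would also have to track sequence numbers to match each reply to its request, and handle ListExtensions as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define X_QUERY_EXTENSION 98  /* core protocol opcode of QueryExtension */

/* Read a CARD16 in the byte order the client chose at connection setup. */
static uint16_t get_card16(const uint8_t *p, bool msb_first)
{
    return msb_first ? (uint16_t)((p[0] << 8) | p[1])
                     : (uint16_t)((p[1] << 8) | p[0]);
}

/* Hypothetical helper: true if `req` starts with one complete
 * QueryExtension request asking for "MIT-SHM". */
bool is_mit_shm_query(const uint8_t *req, size_t len, bool msb_first)
{
    static const char name[] = "MIT-SHM";
    const size_t name_len = sizeof(name) - 1;

    assert(req != NULL);
    if (len < 8 || req[0] != X_QUERY_EXTENSION)
        return false;

    size_t req_bytes = (size_t)get_card16(req + 2, msb_first) * 4;
    size_t n = get_card16(req + 4, msb_first);   /* length of the name */
    if (req_bytes > len || 8 + n > req_bytes)
        return false;                            /* truncated or malformed */
    return n == name_len && memcmp(req + 8, name, name_len) == 0;
}

/* Hypothetical helper: patch a 32-byte QueryExtension reply so that the
 * extension looks absent to the client. */
void mark_extension_absent(uint8_t reply[32])
{
    reply[8] = 0;   /* present */
    reply[9] = 0;   /* major-opcode */
    reply[10] = 0;  /* first-event */
    reply[11] = 0;  /* first-error */
}
```

The detection runs on client-to-server traffic and the patching on the matching server-to-client reply, which is exactly the kind of stateful inspection a plain byte-forwarding proxy does not do.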