When it comes to LED matrices, I’m sort of famous for not actually having a lab setup (see: posts tagged with RGBMatrixEmulator and RGBMatrixEmulator itself) and prefer to emulate in software. However, this week, I got to play with a real Raspberry Pi again, and it wasn’t just the usual “flash an SD card, download some software, push some pixels”. I got to dig into the guts of Python, Cython, and how the two are connected after Pillow (the popular Python imaging library) removed low-level access to its image array. I found it interesting, so let’s dive in!
The Scenario
Earlier this year, the MLB LED Scoreboard team was tracking a Pillow deprecation: unsafe access to the image array backing PIL.Image instances was scheduled for removal in version 12.0.0. We traced our use of it to the rpi-rgb-led-matrix SetImage() call, which the scoreboard uses in limited fashion. As a reminder, rpi-rgb-led-matrix is the software driver that sits between a Raspberry Pi and HUB75 LED matrices. SetImage() takes a Pillow Image object and draws all pixels in that image to the matrix. For our project, it is only used to temporarily display the MLB logo on script boot while data is fetched from MLB’s APIs. Losing this wasn’t a huge issue, as we can just blank the matrix during this time instead. Regardless, we opened an issue to track it upstream.
Fast forward to a couple weeks ago, and Pillow 12.0.0 is imminent, so I decided to take a closer look as the issue wasn’t fixed yet.
For all the code snippets below, we’ll be looking at rpi-rgb-led-matrix bindings/python/rgbmatrix/core.pyx on commit a18883, prior to fixing the issue.
Here’s the original code from SetImage():
def SetImage(self, image, int offset_x = 0, int offset_y = 0, unsafe=True):
    if (image.mode != "RGB"):
        raise Exception("Currently, only RGB mode is supported for SetImage(). Please create images with mode 'RGB' or convert first with image = image.convert('RGB'). Pull requests to support more modes natively are also welcome :)")
    if unsafe:
        #In unsafe mode we directly access the underlying PIL image array
        #in cython, which is considered unsafe pointer accecss,
        #however it's super fast and seems to work fine
        #https://groups.google.com/forum/#!topic/cython-users/Dc1ft5W6KM4
        img_width, img_height = image.size
        self.SetPixelsPillow(offset_x, offset_y, img_width, img_height, image)
    else:
        # First implementation of a SetImage(). OPTIMIZE_ME: A more native
        # implementation that directly reads the buffer and calls the underlying
        # C functions can certainly be faster.
        img_width, img_height = image.size
        pixels = image.load()
        for x in range(max(0, -offset_x), min(img_width, self.width - offset_x)):
            for y in range(max(0, -offset_y), min(img_height, self.height - offset_y)):
                (r, g, b) = pixels[x, y]
                self.SetPixel(x + offset_x, y + offset_y, r, g, b)
Nothing too funky yet – we’re in a .pyx which means we can use Cython to interface with C, but for now that’s not being used here. We can see what we’re trying to do here is default the implementation to use a faster, “unsafe” access. “Unsafe” typically implies we’re not making guarantees that the code won’t break, so one wonders why this is the default implementation to use… It’s at least good to know we can use unsafe=False to get around this deprecation, even if it’s slower.
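The nested range() bounds in the safe branch are what keep the drawing inside the matrix when the image is larger than the panel or offset off its edge. Here is a minimal pure-Python sketch of that arithmetic for one axis. The helper name visible_range is hypothetical and not part of rpi-rgb-led-matrix.

```python
def visible_range(image_size: int, matrix_size: int, offset: int) -> range:
    """Image coordinates along one axis that actually land on the matrix.

    Mirrors range(max(0, -offset), min(image_size, matrix_size - offset))
    from SetImage(). The helper name itself is hypothetical.
    """
    return range(max(0, -offset), min(image_size, matrix_size - offset))


# A 64-pixel-wide image shifted 10 pixels to the left of a 32-pixel-wide
# matrix: image columns 10..41 map onto matrix columns 0..31.
cols = visible_range(image_size=64, matrix_size=32, offset=-10)
print(cols.start, cols.stop)

# An image pushed entirely past the right edge yields an empty range,
# so the loop body never runs.
print(len(visible_range(image_size=64, matrix_size=32, offset=40)))
```

The same expression appears again in SetPixelsPillow(), so both the safe and unsafe paths clip identically.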
Moving on to examine that unsafe function SetPixelsPillow():
@cython.boundscheck(False)
@cython.wraparound(False)
def SetPixelsPillow(self, int xstart, int ystart, int width, int height, image):
    cdef cppinc.FrameCanvas* my_canvas = <cppinc.FrameCanvas*>self._getCanvas()
    cdef int frame_width = my_canvas.width()
    cdef int frame_height = my_canvas.height()
    cdef int row, col
    cdef uint8_t r, g, b
    cdef uint32_t **image_ptr
    cdef uint32_t pixel
    image.load()
    ptr_tmp = dict(image.im.unsafe_ptrs)['image32']
    image_ptr = (<uint32_t **>(<uintptr_t>ptr_tmp))
    for col in range(max(0, -xstart), min(width, frame_width - xstart)):
        for row in range(max(0, -ystart), min(height, frame_height - ystart)):
            pixel = image_ptr[row][col]
            r = (pixel      ) & 0xFF
            g = (pixel >>  8) & 0xFF
            b = (pixel >> 16) & 0xFF
            my_canvas.SetPixel(xstart+col, ystart+row, r, g, b)
Alright! Now we’re in some interesting functionality. We’re definitely playing with low-level stuff now, and that means we need to cdef all the variables we’ll use at the top of the function (we can’t mix and match C and Python), so we’re defining a canvas object (which is native to rpi-rgb-led-matrix), setting up some variables for dimensions, position, color values, and most importantly image_ptr.
It should be pretty obvious that this pointer is a 2D buffer containing the pixels in the image. We load its address into a temporary Python variable ptr_tmp using image.im.unsafe_ptrs – aha! our deprecated accessor! – before finally casting it to uint32_t**. The loop itself runs mostly as C, not Python, so it’s easy to imagine we’re getting pretty significant performance improvements with this method.
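The shift-and-mask lines inside the loop split each packed 32-bit pixel into its color channels. Here is a small Python sketch of the same arithmetic. It assumes, for illustration, that the four bytes of a pixel sit in memory as R, G, B, padding and are read as a little-endian integer; the function name unpack_rgb is made up for this example.

```python
import struct


def unpack_rgb(pixel: int) -> tuple:
    """Split a packed 32-bit pixel the same way SetPixelsPillow() does."""
    r = pixel & 0xFF
    g = (pixel >> 8) & 0xFF
    b = (pixel >> 16) & 0xFF
    return r, g, b


# Assumed byte layout: R, G, B, then one padding byte, read as a single
# little-endian unsigned 32-bit integer.
(pixel,) = struct.unpack("<I", bytes([0x11, 0x22, 0x33, 0xFF]))
print(hex(pixel), unpack_rgb(pixel))
```

The top byte is simply ignored, which is why only three masks are needed.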
Now we know the fix is that we’ll need to find a way to get to a 2D uint32 array of pixel data without using the removed unsafe_ptrs call. It’s not worth looking at that call directly as it’s more or less just a Python wrapper returning a reference to the buffer that we just found. Instead, let’s look at the C header in Pillow, which shows where that buffer is located.
Here’s src/libImaging/Imaging.h @ Pillow 11.3.0
struct ImagingMemoryInstance {
    /* Format */
    ModeID mode; /* Image mode (IMAGING_MODE_*) */
    int type; /* Data type (IMAGING_TYPE_*) */
    int depth; /* Depth (ignored in this version) */
    int bands; /* Number of bands (1, 2, 3, or 4) */
    int xsize; /* Image dimension. */
    int ysize;
    /* Colour palette (for "P" images only) */
    ImagingPalette palette;
    /* Data pointers */
    UINT8 **image8; /* Set for 8-bit images (pixelsize=1). */
    INT32 **image32; /* Set for 32-bit images (pixelsize=4). */
    // More things ...
};
Pretty straightforward stuff, but keep this in mind later as we’ll come back to it.
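To make the INT32 ** shape concrete: image32 is a pointer to an array of row pointers, which is why the drawing loop indexes it as [row][col]. The ctypes sketch below models that shape with a tiny made-up 3x2 image. It only illustrates the pointer-to-pointer layout and does not touch Pillow's real memory.

```python
import ctypes

WIDTH, HEIGHT = 3, 2

# Each row is its own contiguous array of 32-bit pixels.
Row = ctypes.c_int32 * WIDTH
rows = [Row(*(10 * r + c for c in range(WIDTH))) for r in range(HEIGHT)]

# Model image32 as an array of row pointers (INT32 ** in C).
RowPtr = ctypes.POINTER(ctypes.c_int32)
row_ptrs = (RowPtr * HEIGHT)(*(ctypes.cast(row, RowPtr) for row in rows))
image32 = ctypes.cast(row_ptrs, ctypes.POINTER(RowPtr))

# Indexing is [row][col], i.e. [y][x], matching image_ptr[row][col].
print(image32[1][2])
```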
Implementing a Fix
Now that I knew where to look and what behavior to replicate, I started looking at how to actually do it. The deprecation warning noted that we could likely replace image.im.unsafe_ptrs with image.getim(), so I started there.
This function signature isn’t super helpful…
def getim(self) -> CapsuleType:
    """
    Returns a capsule that points to the internal image memory.
    :returns: A capsule object.
    """
    self.load()
    return self.im.ptr
I had to dig into what a CapsuleType represents since I’ve never worked with it before… Essentially, in this case it’s a PyCapsule, which is more or less a container for a C pointer, and this capsule contained a pointer to the ImagingMemoryInstance we saw before. That sounds promising! I thought this would be pretty straightforward after finding this.
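For anyone else who hasn't met a capsule before, here is a rough sketch of the round trip using only ctypes and the CPython capsule API: wrap a C pointer under a name, then get it back by asking the capsule for its name first. The capsule name and payload are made up for this example and have nothing to do with the name Pillow uses.

```python
import ctypes

api = ctypes.pythonapi

# Declare the three CPython capsule functions used below.
api.PyCapsule_New.restype = ctypes.py_object
api.PyCapsule_New.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]
api.PyCapsule_GetName.restype = ctypes.c_char_p
api.PyCapsule_GetName.argtypes = [ctypes.py_object]
api.PyCapsule_GetPointer.restype = ctypes.c_void_p
api.PyCapsule_GetPointer.argtypes = [ctypes.py_object, ctypes.c_char_p]

# Hypothetical name and payload. Both must outlive the capsule, because the
# capsule stores raw pointers to them rather than copies.
CAPSULE_NAME = b"demo.payload"
payload = ctypes.c_uint32(0x00332211)

capsule = api.PyCapsule_New(ctypes.addressof(payload), CAPSULE_NAME, None)

# The pointer is only handed back if the name passed in matches.
name = api.PyCapsule_GetName(capsule)
address = api.PyCapsule_GetPointer(capsule, name)
recovered = ctypes.c_uint32.from_address(address)
print(name, hex(recovered.value))
```

Passing a name that doesn't match makes PyCapsule_GetPointer fail, which is the small safety check capsules give you over handing around a bare integer address.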
Instead, I hit a fairly major roadblock. My strategy was pretty straightforward:
- Import the ImagingMemoryInstance struct from Pillow into the Python script
- Read the pointer from the capsule and cast it to the struct
- Read image.image32 directly
What actually happened was Cython totally choked reading the header I needed:
# doesn't work because Cython can't handle typedef *
cdef extern from "Imaging.h":
    cdef struct ImagingMemoryInstance:
        pass # Filled out from header
Cython is a compiler, and Pillow has some quirky old C code. It seems like it was blowing up on typedefs like the following, where a struct tag and a pointer typedef share the same name, but I’m not totally sure. Either way, it wasn’t working, so I needed another solution.
typedef struct ImagingMemoryArena {
    // members
} *ImagingMemoryArena;
Without that struct that I need, I can’t get the byte offset to the buffer that SetImage() is expecting. And I really don’t want to hardcode the offset, as that would be quite brittle. I briefly explored that option, but noted that the implementation of the first member (mode) hasn’t changed in probably a decade – right up until it did within the last few months.
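To show why a hardcoded offset is so fragile, here is a ctypes sketch with two simplified layouts of the struct that differ only in the type of the first member. The assumption, purely for illustration, is that mode went from a 7-byte char array to an int-sized enum. The real struct has more members, so the real offsets will differ from these.

```python
import ctypes


def make_layout(mode_type):
    """Build a simplified stand-in for ImagingMemoryInstance (hypothetical)."""

    class Layout(ctypes.Structure):
        _fields_ = [
            ("mode", mode_type),
            ("type", ctypes.c_int),
            ("depth", ctypes.c_int),
            ("bands", ctypes.c_int),
            ("xsize", ctypes.c_int),
            ("ysize", ctypes.c_int),
            ("palette", ctypes.c_void_p),
            ("image8", ctypes.c_void_p),
            ("image32", ctypes.c_void_p),
        ]

    return Layout


# Assumed "before" and "after" types for the first member.
OldLayout = make_layout(ctypes.c_char * 7)
NewLayout = make_layout(ctypes.c_int)

old_offset = OldLayout.image32.offset
new_offset = NewLayout.image32.offset

# A byte offset baked in against one layout reads the wrong memory in the other.
print(old_offset, new_offset)
```

Accessing the member by name against the real header lets the C compiler recompute the offset on every build, and that is what the shim in the next section relies on.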
The Workaround
Rather than use Cython to try to handle the import, I wrote a really simple shim that does this for us. The header sets up a named struct and declares a function get_image32() – note that it accepts a raw pointer!
#ifndef SHIMS_PIL_H
#define SHIMS_PIL_H
#ifdef __cplusplus
extern "C" {
#endif
typedef struct ImagingMemoryInstance ImagingMemoryInstance;
int** get_image32(void* im);
#ifdef __cplusplus
}
#endif
#endif
But now, the function definition accepts the pointer to the ImagingMemoryInstance we need, and we pull in Imaging.h from Pillow, which should contain exactly the struct needed!
#include "Imaging.h"
#include "pillow.h"
int** get_image32(void* im) {
    ImagingMemoryInstance* image = (ImagingMemoryInstance*) im;
    return image->image32;
}
And it turns out this works pretty well! There’s a little bit left to glue together at this point, mainly just getting the pointer from the capsule (which you must do by name). I also wrote a helper method to pull out the buffer. Like I mentioned before, in functions that interface with both C and Python, you can’t intermingle Python variables before C variables, so these helpers are useful to make the code a bit more manageable.
cdef extern from "Python.h":
    void* PyCapsule_GetPointer(object capsule, const char* name)
    const char* PyCapsule_GetName(object capsule)
cdef extern from "shims/pillow.h":
    cdef int** get_image32(void* im)
@cython.boundscheck(False)
@cython.wraparound(False)
cdef int** get_pillow_buffer(object capsule):
    cdef void *image
    image = PyCapsule_GetPointer(capsule, PyCapsule_GetName(capsule))
    return get_image32(image)
And finally, the last bit is to simply drop the new buffer lookup into SetPixelsPillow(), the unsafe path behind SetImage(). As expected, this part is really straightforward. Also of note here is that the function now takes the capsule directly, rather than reading it from the Pillow image object. This is, again, because image.getim() is a Python call and thus needs to happen after the C declarations. By passing the capsule in as a parameter, we don’t have to declare it at all, simplifying things.
@cython.boundscheck(False)
@cython.wraparound(False)
def SetPixelsPillow(self, int xstart, int ystart, int width, int height, object image_capsule):
    cdef cppinc.FrameCanvas* my_canvas = <cppinc.FrameCanvas*>self._getCanvas()
    cdef int frame_width = my_canvas.width()
    cdef int frame_height = my_canvas.height()
    cdef int row, col
    cdef uint8_t r, g, b
    cdef int **buffer
    cdef int pixel
    buffer = get_pillow_buffer(image_capsule)
    for col in range(max(0, -xstart), min(width, frame_width - xstart)):
        for row in range(max(0, -ystart), min(height, frame_height - ystart)):
            pixel = buffer[row][col]
            r = (pixel      ) & 0xFF
            g = (pixel >>  8) & 0xFF
            b = (pixel >> 16) & 0xFF
            my_canvas.SetPixel(xstart+col, ystart+row, r, g, b)
Wrap Up
Here’s the final merged PR for reference. I hope you found this as interesting as I did – I did not expect to need to write a C shim to get this working, but after it’s all said and done I’m happy with the way it turned out and several others have independently verified the work.
I feel like I have a better understanding (although certainly not complete) of how Python can interface with C/C++ code. It’s a little cumbersome to get used to, but given the performance improvements it can drive, it’s totally worth it.
That’s all for now. Thanks for reading!