Building a Custom Markup Language for Raspberry Pi LED Matrices

April 9th, 2022

I’m starting to write a new markup language for Raspberry Pi RGB LED matrices to simplify building custom layouts. I’m calling it MatrixML. Creating a layout without MatrixML is pretty fiddly: the developer has to wrangle attributes like color, font, and position for each line of text, each shape, or possibly even each pixel. That effort adds up quickly, and it’s compounded by the fact that everything has to go through a fairly low-level API. When things need to be animated, it means managing internal timers on top of all of this.

My goal is therefore to write MatrixML and an associated rendering engine that abstracts the lower-level details of the rpi-rgb-led-matrix library away from the developer, much like HTML is the markup language that the browser’s rendering engine turns into webpages.

Building the Parser

When I was writing Hyde, one of the first steps in creating the language was to build a tokenizer and parser for it. MatrixML needs the same components. The easiest way to get this off the ground was to reach for the html.parser module included with Python.

This is actually a little confusing for two reasons. First, it doesn’t parse only HTML, which is great because MatrixML won’t use the tags from the HTML specification that you’re used to, like <div> or <span>; instead I’ll be defining my own. Second, HTMLParser isn’t really a parser; it’s more of a tokenizer. It handles reading start/end tags, attributes, and data from the template source, but that’s about it. There’s no conversion to any sort of intermediate representation within Python. It’s still very convenient to use, but I’ll need to do a lot of work to build up my own intermediate representation once HTMLParser has tokenized the template.
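
To make the "tokenizer, not parser" point concrete, here is a minimal sketch of what HTMLParser hands back for non-HTML tags. EventLogger and the color attribute are made up for illustration and are not part of MatrixML; the only real API used is html.parser.HTMLParser and its handler callbacks.

```python
from html.parser import HTMLParser


class EventLogger(HTMLParser):
    """Hypothetical helper: records every callback HTMLParser fires."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) tuples
        self.events.append(('start', tag, attrs))

    def handle_endtag(self, tag):
        self.events.append(('end', tag))

    def handle_data(self, data):
        # Skip whitespace-only runs between tags
        if data.strip():
            self.events.append(('data', data.strip()))


logger = EventLogger()
logger.feed('<row><text color="red">HELLO</text></row>')
logger.close()

for event in logger.events:
    print(event)
```

The result is a flat stream of start, data, and end events. Nothing in it says that the text element lives inside the row; recovering that nesting is the part that still has to be written by hand.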

Here’s my parser called MatrixTemplateParser (source @54102a8):

# MatrixML.parser

from html.parser import HTMLParser
from MatrixML.elements import *
from MatrixML.errors import InvalidTagType, ParseError


class MatrixTemplateParser(HTMLParser):

    ELEMENT_TYPES = {
        'row': RowElement,
        'scroll': ScrollElement,
        'text': TextElement
    }

    def load_template(self, template_path):
        with open(template_path, 'r') as f:
            self.__html_raw = f.read()

    def parse(self):
        self.tokens = []
        self.feed(self.__html_raw)

        return self.parse_tokens(self.tokens)

    def handle_starttag(self, tag, attrs):
        self.validate_tag(tag)
        self.tokens.append({
            'tag': tag,
            'type': 'start',
            'attrs': attrs
        })

    def handle_endtag(self, tag):
        self.validate_tag(tag)
        self.tokens.append({
            'tag': tag,
            'type': 'end',
            'attrs': []
        })

    def handle_data(self, data):
        if data.strip() == '':
            return

        self.tokens.append({
            'data': data.strip()
        })

    def validate_tag(self, tag):
        if tag not in self.ELEMENT_TYPES:
            raise InvalidTagType(f'Tag "{tag}" is not a supported tag. Supported tags are: {", ".join(self.ELEMENT_TYPES)}.')

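    def parse_tokens(self, tokens):
        stack = []   # start tags that are currently open
        output = []  # elements completed at this nesting level

        start = 0    # index of the start tag of the element being collected

        # Base case: a lone data token is the text content of its parent element.
        if len(tokens) == 1:
            return tokens[0].get('data')

        for i, token in enumerate(tokens):
            if token.get('type') == 'start':
                stack.append(token)

            if token.get('type') == 'end':
                prev_token = stack.pop()

                # An end tag must close the most recently opened start tag.
                if prev_token.get('tag') != token.get('tag'):
                    raise ParseError(f'Mismatched tag {token.get("tag")}')

            # An empty stack means a top-level element just closed: recurse on
            # the tokens between its start and end tags to build its contents.
            if len(stack) == 0:
                subset = tokens[start + 1:i]
                element_type = self.fetch_element_type(token.get('tag'))
                element = element_type(self.parse_tokens(subset))
                output.append(element)
                start = i + 1

        return output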

    def fetch_element_type(self, token_type):
        element = self.ELEMENT_TYPES.get(token_type)

        if element is None:
            raise ParseError(f'Invalid element type for {token_type}')

        return element

First, I’ve defined the ELEMENT_TYPES my parser will support. Right now there are no self-closing tags (like <br />, <hr />, or <img />). This dict also maps each tag to the internal element that will eventually be used to render the data. A couple of helpers rely on this dict, fetch_element_type() and validate_tag(), to make sure the template makes sense. These raise an error and halt execution if you make a mistake writing a MatrixML template.
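
As a rough, standalone sketch of that lookup-and-validate pattern: the classes below are empty stand-ins for the real ones in MatrixML.elements and MatrixML.errors (their actual constructors and behavior are assumptions here), and validate_tag() is written as a plain function rather than a method.

```python
class InvalidTagType(Exception):
    """Stand-in for MatrixML.errors.InvalidTagType."""


class Element:
    """Stand-in base class; the real elements also know how to render."""

    def __init__(self, data):
        self.data = data


class RowElement(Element):
    pass


class ScrollElement(Element):
    pass


class TextElement(Element):
    pass


# Tag name -> element class, mirroring the dict in the parser
ELEMENT_TYPES = {
    'row': RowElement,
    'scroll': ScrollElement,
    'text': TextElement,
}


def validate_tag(tag):
    """Raise if the tag is not one of the supported element types."""
    if tag not in ELEMENT_TYPES:
        supported = ', '.join(ELEMENT_TYPES)
        raise InvalidTagType(
            f'Tag "{tag}" is not a supported tag. Supported tags are: {supported}.'
        )


validate_tag('row')  # a known tag passes silently

try:
    validate_tag('div')  # an HTML tag that MatrixML does not define
except InvalidTagType as error:
    message = str(error)
    print(message)
```

Keeping the supported tags in one dict means adding a new element type is a one-line change, and the error message stays in sync with it.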

Next we have utility functions for loading the template and parsing data. Notice that the parse() method calls self.feed(), which is the method HTMLParser uses to tokenize the tags. The other pieces that come from HTMLParser are handle_starttag(), handle_endtag(), and handle_data(), which I use to simply push tokens onto an internal list for later use.

The parse_tokens() method is where the fun happens. This is a recursive function that pushes and pops tokens from a stack. By comparing each end tag to the most recently opened start tag, it immediately knows whether the template is valid (this is kind of a riff on the classic brace validation algorithm). From there, it determines the internal element type for a given token, figures out which tokens belong inside the new element, and parses those before storing the result as the element’s data. Since it’s recursive, the output is a list of elements with nested lists of other elements that can eventually be used to render text and graphics on the screen.
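
The same idea can be sketched without any of the MatrixML classes. This is a simplified illustration, not the code from the repository: tokens are (kind, value) tuples instead of dicts, the result is nested (tag, contents) tuples instead of element objects, errors are plain ValueError, and the checks for unclosed tags and stray text are additions of this sketch. The name nest() is made up.

```python
def nest(tokens):
    """Group a flat token list into nested (tag, contents) pairs."""
    # Leaf case: a lone data token is the text content of its parent
    if len(tokens) == 1 and tokens[0][0] == 'data':
        return tokens[0][1]

    stack = []   # tags that are currently open
    output = []  # completed elements at this nesting level
    start = 0    # index of the start tag of the element being collected

    for i, (kind, value) in enumerate(tokens):
        if kind == 'start':
            stack.append(value)
        elif kind == 'end':
            # Brace-matching check: an end tag must close the newest open tag
            if not stack or stack.pop() != value:
                raise ValueError(f'Mismatched tag {value}')
            if not stack:
                # A top-level element just closed: recurse on what is inside it
                output.append((value, nest(tokens[start + 1:i])))
                start = i + 1
        elif not stack:
            # Text next to (rather than inside) an element is not handled here
            raise ValueError('Text outside of an element')

    if stack:
        raise ValueError(f'Unclosed tag {stack[-1]}')

    return output


tokens = [
    ('start', 'row'), ('start', 'text'), ('data', 'ROW'), ('end', 'text'), ('end', 'row'),
    ('start', 'text'), ('data', 'TEXT'), ('end', 'text'),
]

tree = nest(tokens)
print(tree)
```

For this token list the row ends up holding a list with one text entry, while the second text entry sits at the top level next to it. Swapping the tuples for element classes gives the nested element lists described above.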

Writing a MatrixML Template

Writing MatrixML is pretty simple! Write it using the valid tags that are defined in the parser just like you would standard HTML:

<text>
  TEXT
</text>
<row>
  <text>
    ROW
  </text>
</row>
<scroll>
  <text>
    SCROLL SCROLL
  </text>
</scroll>

From there, the parser builds an internal representation of our template that can be walked and printed with a little code:

parser = MatrixTemplateParser()
parser.load_template('test.matrix.html')
elements = parser.parse()  # parse() returns the list of top-level elements

def print_elements(elements, level=0):
    for element in elements:
        if isinstance(element.data, list):
            # Container element: print its children one level deeper
            print_elements(element.data, level + 1)
        else:
            print("  " * level + element.data)

print_elements(elements)

And the output is:

TEXT
  ROW
  SCROLL SCROLL

What’s Next?

This really only scratches the surface of what’s already been done for MatrixML. There’s a lot of code written to actually render the elements to the matrix, the specifics of which I’ve already covered in other posts about RGBMatrixEmulator and the rpi-rgb-led-matrix driver library (read those here and here). I’ll probably do another writeup as MatrixML matures, covering these topics in more detail within the context of this new rendering engine.

In the meantime, I’ll leave a sample of the above MatrixML template running:

MatrixML

That’s all for now. Thanks for reading!

ty-porter

Applying lessons from my toy language to simplify building custom LED templates.