Background

I've always wanted to take some old software, figure out how it works, and update it to work in the modern day with modern protocols and formats. I had previously wanted to start a consulting company that would bid on software renovation projects to fix things like 40-year-old COBOL unemployment systems. However, I realized that these are massive undertakings with not-so-great budgets and very little eagerness on the part of younger programmers. Most importantly, the call for bids usually makes it clear that the project is to modernize rather than do a full overhaul and remake, due to cost considerations. Ultimately I still wanted to do some kind of retrofit project and get some old code working again, which leads to the premise of today's blog post.

I found the perfect crux of issues in a project on milk.com, which is owned by Dan Bornstein, who was instrumental in the early days of Android for creating the Dalvik VM. There's a UPC/EAN barcode generator he made back in 1994, and when you try to run it you get the infamous broken image icon. He acknowledges that this occurs because it outputs using the XBM format, which no modern browser supports, but which is a very simple B/W image encoding (one bit per pixel) that is easy to output as a text stream. He includes the source code on his website, with a permissive license that just requires attribution back to his website and no commercial use unless otherwise negotiated. He politely requested that people send him an email were they to do something cool with his code.

I had been doing a lot of other projects with barcodes, one of which was called PDFTiks (I may publish this later, but I'm still on the fence), which used libqrencode, libdmtx, and pdf417lib to draw squares to create tickets from a database I made with MySQL in C++. For my work I regularly wrestle with lower-level older code, and I'm very familiar with the vagaries that C/C++ code presents with data structures. When you use the website you can tell at first glance that it's a Common Gateway Interface (CGI) server. CGI programs were the first form of web application and were typically written in C, which is how the barcode server was programmed.

Program Dissection

When we open the tar.gz file that's given to us we have the following files:

  • barcode-image.cgi
  • barcode-request.cgi
  • barcode.c
  • index.html

The CGI files are basic Bash scripts which route to the barcode binary, and the index.html file is what you see on the advanced page of the website. So let's focus mainly on the barcode.c file, but we'll get back to the index.html file later. This file is actually very well commented, totalling 2231 lines, of which I'd say a third are comments. The program is very interesting. The first thing I noticed was a "password feature" which was apparently put in place to prevent hotlinking to the site, as the password was changed hourly. I do remember how controversial hotlinking was when I was a kid, so I think this is a holdover from that time. There's also a "words to ponder" message which is shown to people who request an image without the correct password, as a polite way to discourage the hotlinkers. However, these things can all be disabled by CLI flags, and you can use the barcode binary independently of the CGI server, which was also awesome for quickly working on this.

At the top of the file there's a struct that we can clearly see is ultimately what we'll be having to work with:

/* simple bitmap structure */
typedef struct
{
    int width;
    int height;
    int widthBytes;
    unsigned char *buf;
}
Bitmap;

This is somewhat similar to libqrencode's output, which is also a character array with a width integer (since QR codes are always square, the width doubles as the height). However, widthBytes gives us some additional insight into the pixel packing. The character array for libqrencode is 1 byte per pixel (the least significant bit being whether it's black or white), while here it's one bit per pixel. This is similar to the pdf417lib buffer, which I struggled with because I had my bit-shifting code going in the wrong direction. The Datamatrix library also has an option known as 1BppK, where I inferred that K meant "Key", the black in the CMYK pigment set. But when you try to use it you get assert errors, and if you remove the assert you get a segmentation fault; I eventually found out that it was included as an option but not supported. So I knew this was going to be a little bit more challenging, but since I had previous experience I was still confident I could work it out.
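To make the one-bit-per-pixel packing concrete, here is a minimal sketch of how such a buffer can be addressed. It is not the code from barcode.c: the PackedBitmap name and the packedInit, packedSet, and packedGet helpers are made up for illustration, and it assumes that widthBytes is the width rounded up to whole bytes and that bit 0 of each byte is the leftmost pixel.

```c
#include <assert.h>
#include <stdlib.h>

/* Same fields as the Bitmap struct above; renamed to mark it as a sketch. */
typedef struct
{
    int width;
    int height;
    int widthBytes;     /* ASSUMPTION: bytes per row, (width + 7) / 8 */
    unsigned char *buf; /* height * widthBytes bytes, one bit per pixel */
} PackedBitmap;

/* Hypothetical helper: allocate an all-zero bitmap; returns 0 on success. */
static int packedInit(PackedBitmap *b, int width, int height)
{
    assert(b != NULL && width > 0 && height > 0);
    b->width = width;
    b->height = height;
    b->widthBytes = (width + 7) / 8;
    b->buf = calloc((size_t)height * (size_t)b->widthBytes, 1);
    return b->buf ? 0 : -1;
}

/* ASSUMPTION: bit 0 (least significant) is the leftmost pixel in a byte. */
static void packedSet(PackedBitmap *b, int x, int y, int on)
{
    assert(x >= 0 && x < b->width && y >= 0 && y < b->height);
    unsigned char *byte = &b->buf[y * b->widthBytes + x / 8];
    unsigned char mask = (unsigned char)(1u << (x % 8));
    if (on)
        *byte |= mask;
    else
        *byte &= (unsigned char)~mask;
}

/* Returns 1 if the pixel at (x, y) is set, 0 otherwise. */
static int packedGet(const PackedBitmap *b, int x, int y)
{
    assert(x >= 0 && x < b->width && y >= 0 && y < b->height);
    return (b->buf[y * b->widthBytes + x / 8] >> (x % 8)) & 1;
}
```

With a 10-pixel-wide bitmap, for example, each row takes two bytes and the last six bits of the second byte are padding. If a library packs the other way around (bit 7 leftmost), only the shift changes, which is exactly the kind of mismatch that bit me with pdf417lib.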

Now we gotta take a look at our functions to do a good retrofit; hopefully we can find one that does all the drawing, which we can then replace. Bornstein mentions PNG as an "up-and-coming" image format, unencumbered by patents like GIF and lossless unlike JPEG. So I started to look into libpng, which seems to be fully featured and well documented. They have a book online with a chapter solely dedicated to writing PNGs, but theirs writes to a file, while we need to output to STDOUT. It looked promising, though, so I installed the development headers for it.

Retrofitting

On a random Monday afternoon (much like this one), after reading over the chapters of the libpng book many times, I decided to finally give it a go, and I found that the function I needed was bitmapPrintXBM. This takes the bitmap struct and, using the character array, just writes a simple header with some metadata and prints the bits in the correct format, making the XBM image. Man, what ingenuity. So I made a function called bitmapPrintPNG that would be a drop-in replacement.
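For anyone who hasn't seen XBM before, the format is literally a fragment of C source: two #define lines for the size and an array of bytes, where a set bit is a black pixel and bit 0 of each byte is the leftmost pixel. The sketch below is my own illustration of that idea, not the original bitmapPrintXBM; the writeXBM name and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write a packed 1-bit buffer as XBM text.
 * Rows are padded to whole bytes, as the XBM format expects.
 * Returns 0 on success, -1 if the stream reported an error. */
static int writeXBM(FILE *out, const char *name, int width, int height,
                    const unsigned char *bits)
{
    assert(out != NULL && name != NULL && bits != NULL);
    assert(strlen(name) > 0 && width > 0 && height > 0);

    int widthBytes = (width + 7) / 8;
    int total = widthBytes * height;

    fprintf(out, "#define %s_width %d\n", name, width);
    fprintf(out, "#define %s_height %d\n", name, height);
    fprintf(out, "static unsigned char %s_bits[] = {\n", name);
    for (int i = 0; i < total; i++)
    {
        /* comma after every byte except the last one */
        fprintf(out, "0x%02x%s", bits[i], (i + 1 < total) ? "," : "");
        /* break the line every 12 bytes and after the final byte */
        if ((i + 1) % 12 == 0 || i + 1 == total)
            fputc('\n', out);
    }
    fprintf(out, "};\n");
    return ferror(out) ? -1 : 0;
}
```

Calling writeXBM(stdout, "barcode", b->width, b->height, b->buf) would print the whole image as plain text, which is why it was such a convenient format for a CGI program in 1994.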

Metadata

First let's focus on making this as close to equivalent as possible, starting with the metadata present in the image. The function signature is (Bitmap *b, const char *comment, const char *name, int httpHeader), so we need to make sure to integrate the comment and name into the PNG. Libpng provides png_text structs which are used for the metadata, and they have Title, Comment, and Software keywords available, which is perfect for our use. Now let's move on to the "meat and potatoes" of actually writing the image. First let's take care of how we need to print out the image for the final CGI server. I'll skip over the details of setting it up, but this article is a good primer and the resource I used for setting it up.

Writing to STDOUT

LibPNG gives an interface that allows for custom writing of the image to something other than a file. We need to get this printed out to stdout, which is file descriptor 1. However, through the CGI service there's no real way to do this directly; you instead need to print out the contents of the PNG. When you try this with %s, though, the image ends abruptly, which reminded me of an issue I had on a project for the Cryptography I course on Coursera. There I wrote a "hash chain" program in C++ that would take blocks of 1024 bytes and make a chain back to the beginning, so they could be progressively verified as the video was read in. It seemed to cut off suddenly and not continue, and I figured out that I had a null byte in the data stream, since it was binary. Functions such as strcpy stop at the first null byte, so instead you need to use memcpy and supply the exact size you wish to copy.

I realized I could not use %s because it is similarly string-based and so null-terminated. Instead I would iterate through the given character array up to its length, printing out every single character. Hacky, but it worked perfectly, and it gave us an end-to-end solution to continue developing from. The code is shown below.

void pngPrint(png_structp png, png_bytep data, png_size_t length)
{
    // null bytes in the image don't play nice with %s,
    // so print exactly `length` bytes one at a time
    for (png_size_t i = 0; i < length; i++) {
        printf("%c", data[i]);
    }
}
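In hindsight, the less hacky way to push binary data to a stream is fwrite, which takes an explicit length and doesn't care about zero bytes. Here is a minimal sketch of that alternative, together with the header a CGI program prints before the body. The writeBinary and writeCgiResponse names are hypothetical and this is not what the retrofit actually shipped with; a libpng write callback (the kind you register with png_set_write_fn) could simply forward its data and length to something like writeBinary.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write exactly `length` bytes, zero bytes included.
 * Returns 0 on success, -1 on a short write. */
static int writeBinary(FILE *out, const unsigned char *data, size_t length)
{
    assert(out != NULL && data != NULL);
    return fwrite(data, 1, length, out) == length ? 0 : -1;
}

/* Hypothetical CGI-style response: one header line, a blank line,
 * then the raw body bytes. */
static int writeCgiResponse(FILE *out, const char *mimeType,
                            const unsigned char *body, size_t length)
{
    assert(mimeType != NULL && strlen(mimeType) > 0);
    if (fprintf(out, "Content-Type: %s\r\n\r\n", mimeType) < 0)
        return -1;
    return writeBinary(out, body, length);
}
```

The difference from %s is that nothing here ever scans for a terminator: the byte count comes from the caller, the same way memcpy differs from strcpy.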

Pixel Buffer Writing

The library defines the pixel buffer as an array of row_pointers, and each row pointer is an array of png_byte. So we can reuse the for loop already present in bitmapPrintXBM, which iterates over the rows using the height variable of the struct and then goes byte by byte, outputting eight pixels at a time. Here's our basic loop:

for (int y = 0; y < b->height; y++)
{
    // each packed row needs widthBytes bytes, not just one
    uint8_t *row = png_malloc(png, (png_alloc_size_t)b->widthBytes);
    row_pointers[y] = (png_byte *)row;
    for (int xbyte = 0; xbyte < b->widthBytes; xbyte++)
    {
        // bitmap is packed as 1-bit pixels, 8 per byte; copy each byte as-is for now
        *row++ = bitmapGetByte(b, xbyte, y);
    }
}

And so if we encode "12345678" this is what we get with this code:

 

So there's something very obviously wrong with this image. What mainly stands out to me is that it's inverted, which is an easy fix. To invert a bit we just need to XOR it with 1, and a mask that covers all eight bits of a byte is "1111 1111", or 0xFF. So we modify the line to read:

*row++ = bitmapGetByte(b, xbyte, y) ^ 0xFF;

So when we look at the barcode now we'll see that we're at least writing white and black as we want, but it still doesn't look right... there's still something very off about this image. Why is it scalloped? It took me a few seconds of looking at it, and then it hit me. The endianness! There's byte endianness, which is how the bytes are ordered. But there's also bit endianness, which is obviously mismatched here. I could try to swap the endianness manually, but let's take a look at the libpng API and see what it offers us first. If you look at the libpng manual you'll find the following line:

png_set_packswap(png);
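For comparison, this is roughly what doing both fixes by hand would look like: a sketch of inverting a packed row and reversing the bit order inside each byte. The reverseBits and convertRow names are mine and nothing like this is in the retrofit, since libpng does the swap for us.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the bit order of one byte: bit 0 <-> bit 7, bit 1 <-> bit 6, ... */
static uint8_t reverseBits(uint8_t v)
{
    v = (uint8_t)(((v & 0xF0u) >> 4) | ((v & 0x0Fu) << 4)); /* swap nibbles */
    v = (uint8_t)(((v & 0xCCu) >> 2) | ((v & 0x33u) << 2)); /* swap bit pairs */
    v = (uint8_t)(((v & 0xAAu) >> 1) | ((v & 0x55u) << 1)); /* swap neighbours */
    return v;
}

/* Hypothetical helper: invert every pixel of a packed row and flip the
 * bit order of each byte, in place. */
static void convertRow(uint8_t *row, size_t widthBytes)
{
    assert(row != NULL);
    for (size_t i = 0; i < widthBytes; i++)
        row[i] = reverseBits((uint8_t)(row[i] ^ 0xFFu));
}
```

Note that only the bits inside each byte move; the bytes themselves stay in the same order, which is why the broken image looked scalloped in 8-pixel chunks rather than mirrored as a whole.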

And with that, we have this final barcode! Yeah, it's done, and it wasn't that hard, was it? I made sure to test everything in the CGI server I stood up, and once I was satisfied it was a good replacement I also generated a patch file to show exactly what changes were made. Mr. Bornstein did say he was very busy with his regular full-time programming job, so I wanted to make it as easy as possible to verify the changes. I wrote up a little email saying all he had to do now when compiling the code was add -lpng and have the development headers installed. He replied the next day and said this:

Hi Mr. Lopez! Thanks for reaching out.
As you noted, the Barcode Server code is indeed in need of image-output modernization. I've been meaning to update it for a while, and I appreciate your effort. Thanks!
That said, my plan has actually been to make it be fully client-side (a serverless Barcode "Server") by porting the code to JS and drawing in a <canvas> or <svg> element. CGI on milk.com's HTTP server is kinda on life support and not likely to survive the next major upgrade of the system.

Redoing it in WebAssembly

I was feeling super awesome about getting the project knocked out. His mention of making it serverless probably meant porting the code to JavaScript directly, but I knew that WebAssembly would also fit the bill. I also didn't know a thing about it, so I thought it would be another perfect learning opportunity. Going to the Emscripten website, it was simple enough to get started: you clone the git repo and then run the ./emsdk install latest command. Then, since I didn't want to gunk up my PATH variable, I also ran source ./emsdk_env.sh so that we can run emcc.

Wasteful Deviation

Here I lost some steam, and I didn't really want to continue with the project because it was difficult to get a page that didn't have to be served from a server (thanks, CORS). However, I found that there was a specific flag, namely -sSINGLE_FILE, which would incorporate all the JavaScript and WebAssembly in one file to prevent that issue. Then I saw that Emscripten supported libpng straight out of the box, but there was no easy way to display the result on the screen. The beginner examples do show a basic demo with SDL 1.2, which made me stupidly assume it supported only that version and not libSDL2.

If you're not familiar with libSDL2, it is the Simple DirectMedia Layer, which provides a platform-agnostic way to copy pixel buffers to the screen, get input from keyboard, mouse, joystick, and even touch screen, and use audio, amongst other things. I started playing with it after attending a talk where Steve Wozniak geeked out about writing Breakout with very low-level code. Sadly the moderator cut him off, saying "never ask an engineer how to build a computer", which was the biggest travesty I had been a live witness to; he was a wealth of knowledge. I really wanted to be able to do something like that as well (and I may write other articles about my SDL projects). However, libSDL1.2 is even harder to use than 2, because many of the little conveniences, such as the SDL2 renderer calls that fill an SDL_Rect for you, are simply not present; you're given a pixel buffer and expected to write everything with that.

Here I was able to easily get the code ported to present the regular bitmap I had before, which was as easy as copying each bit over to the pixel buffer and swapping the bit endianness myself. So those two little tricks from before (the XOR and the bit swap) were still relevant. However, I realized that, unlike the original PNG, which you could download and resize using nearest neighbor to get a good scaling of the barcode, there was literally no way to zoom in or expand the canvas element on the page. So I made it a requirement that it be scalable using nearest neighbor so the user could see the barcode more easily, since the barcodes were admittedly kinda tiny.
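On paper, integer nearest-neighbor scaling is simple: every output pixel looks up the source pixel at its own coordinates divided by the scale factor, in both directions. Here is a minimal sketch of that idea, expanding a packed 1-bit bitmap into an 8-bit grayscale buffer. The scaleNearest name is hypothetical, it has nothing to do with SDL, and it assumes bit 0 is the leftmost pixel and that a set bit means black.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: expand a packed 1-bit bitmap into an 8-bit grayscale
 * buffer, enlarged by an integer factor using nearest neighbor.
 * ASSUMPTIONS: bit 0 is the leftmost pixel in a byte; a set bit is black.
 * Returns a malloc'd buffer of (width*scale) * (height*scale) bytes that the
 * caller must free, or NULL if allocation fails. */
static uint8_t *scaleNearest(const unsigned char *bits, int width, int height,
                             int widthBytes, int scale)
{
    assert(bits != NULL && width > 0 && height > 0 && scale > 0);
    int outW = width * scale;
    int outH = height * scale;
    uint8_t *out = malloc((size_t)outW * (size_t)outH);
    if (out == NULL)
        return NULL;

    for (int oy = 0; oy < outH; oy++)
    {
        int sy = oy / scale; /* source row for this output row */
        for (int ox = 0; ox < outW; ox++)
        {
            int sx = ox / scale; /* source column for this output column */
            int on = (bits[sy * widthBytes + sx / 8] >> (sx % 8)) & 1;
            out[(size_t)oy * (size_t)outW + (size_t)ox] = on ? 0 : 255;
        }
    }
    return out;
}
```

The part that is easy to get wrong is the row stride: the output index has to use the scaled width, while the source index uses widthBytes, and mixing the two up is one way to end up with a striped image.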

However, this proved almost impossible for me. I was able to easily get it scaled in the width dimension but could not get the height accounted for correctly; the image would striate, and I just about gave up on it. I also still needed to write some JavaScript so that the options could be passed into the WebAssembly version. I later found out that libSDL2 was available, but I was still kinda burned out and had very little time to dedicate to this passion project, until I was in the trajectory of Hurricane Ian. That gave me a lot of free time where I'd otherwise be doing errands, exercising, or socializing, and stuck me inside for a few days.

Rectangles, Rectangles Everywhere

Here is where I finally knocked out the project. I wrote another replacement for the function I had before, called bitmapPrintSDL; however, I didn't make this a retrofit anymore, since the CGI server requirement was lifted. Instead I got rid of all the cruft of the HTTP header and the other nonsense and made a function that would just find the black pixels and draw each one to the screen as a square, which was much easier since I could work in both dimensions easily.

void bitmapPrintSDL(Bitmap *b, char *comment, char *name)
{
    int scale = 3;
    SDL_Init(SDL_INIT_VIDEO);
    SDL_Window *window;
    SDL_Renderer *renderer;
    SDL_CreateWindowAndRenderer(b->width * scale, b->height * scale, 0, &window, &renderer);

#ifdef TEST_SDL_LOCK_OPTS
    EM_ASM("SDL.defaults.copyOnLock = false; SDL.defaults.discardOnLock = true; SDL.defaults.opaqueFrontBuffer = false;");
#endif

    SDL_Rect rect;
    SDL_SetRenderDrawColor(renderer, 255, 255, 255, 255);
    SDL_RenderClear(renderer);
    SDL_SetRenderDrawColor(renderer, 0, 0, 0, 255);
    
    for (int y = 0; y < b->height; y++)
    {
        for (int xbyte = 0; xbyte < b->widthBytes; xbyte++)
        {
            for(int bit = 0; bit < 8; bit++){
                if(bitmapGet(b, xbyte * 8 + bit, y)){
                    rect.w = scale;
                    rect.h = scale;
                    rect.y = y * scale;
                    rect.x = (xbyte * 8 + bit) * scale;
                    SDL_RenderFillRect(renderer, &rect);
                }
            }
        }
    }
    SDL_RenderPresent(renderer);
}

I kept the PNG code and just ripped out the pngPrint function and made it output to barcode.png. To call the function from the page, I used Module.ccall, which allows parameters to be passed into the function. I would just pull whatever value the user picked from the drop-down instead of doing it over a GET request like before. However, there was a little hitch with the canvas code when I tried to integrate it with the barcode.html page we got from the original, which I was incorporating using the --shell-file CLI argument. Frustratingly enough, the Emscripten web shell includes some code that needs to be initialized before your WebAssembly can use the canvas. I just ripped that code out of the original shell and then made sure that my {{{ SCRIPT }}} directive was beneath it, along with my JavaScript shim.

Then I used the FS module to read the file from the virtual filesystem, and I had to wrap it in a Blob and ensure the MIME type was "image/png". Then we use a FileReader to read the data back as a "data URL", which contains a base64-encoded version of the PNG file. We then add a second button that says "Save it" so that users can download the PNG in addition to seeing it on their screens (as long as they allow popups).

if(download){
    let image = FS.readFile("/barcode.png", {encoding: "binary", flags: "r"});
    var blob = new Blob([image.buffer], {type: 'image/png'}); // if you try to use the UInt8Array directly it'll encode it to numbers
    var reader = new FileReader();
    reader.onloadend = function(e) {
      window.open(reader.result);
    }
    reader.readAsDataURL(blob); // gives us a base64 encoded URL, make sure to enable popups!
}

So then finally a serverless barcode "server"!

I sent the new version over to Mr. Bornstein and he took a little longer to reply but ultimately said this:

Hi! That's quite impressive!
I encourage you to put it up on your own site. As for me, I know it's still stale, but my intention remains to replace it on my site with a JS version (that I made myself).

And to be honest, I can't fault him there. I would probably do the same; there's a certain sense of pride in having done it yourself. I did send it to many of my friends to show off though. I hope this article inspires you to try to retrofit some software, even if it's inconsequential. The act of trying and failing and learning is well worth it.
 
Update 2023-06-29: It seems that Dan Bornstein has updated the website to have a new JavaScript HTML5 Canvas-based barcode generator. It seems pretty slick, but when I tried it, it only worked in Chrome and not in Firefox. It's probably much lighter than the WebAssembly version I created here, but this was still a fun exercise regardless. I wonder if my submission was a gentle nudge to get that updated, or whether it was just the CGI server finally getting replaced.
