solving a base64 mystery nobody asked for

2024-02-18 @ 22:46 UTC
filed under: tech nerd sniping

yesterday i came across an interesting tumblr post. (the original has been removed, but it lives on in reblogs)

tumblr post of screenshots that start with 'me in a relationship', followed by a long base64 string

this wormed itself into my mind and i ended up spending far, far too much time on it.

for those unaware, the spaghetti wall of letters and numbers is a base64-encoded JPEG image (and not a URL as some guessed). [ed note: multiple people have argued that it is indeed a URL, and the MDN and WHATWG agree with you. however, i still argue that URI is more apt since it is not locating an internet resource, and that just because your browser's address bar can parse something doesn't necessarily make it a URL. semantics ;) ] this could happen in certain cases when you tried to insert/paste an image into what’s ostensibly a text-only box.

the thing that’s bugging me however is that there’s image data there. we have a fairly clear (albeit with JPEG artifacts) screenshot of text, and thanks to how Windows ClearType renders text, every instance of a given character looks identical. that is to say, an uppercase Q will look more or less pixel-perfect the same each time, meaning we don’t have to guess what a Q looks like, we simply have to match it pixel for pixel.

closeup of uppercase Qs highlighted to show pixel similarities
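to make the idea concrete, here is a toy sketch of pixel matching. it is not the actual recognition script. it assumes the screenshot and the reference glyph are grayscale pixel grids stored as nested lists, and the names `glyph_distance`, `find_glyph` and the `tolerance` parameter are made up for illustration. the tolerance is there because JPEG artifacts mean a "pixel-perfect" match is rarely exactly perfect.

```python
def glyph_distance(image, glyph, top, left):
    """Sum of absolute pixel differences between the glyph and the
    same-sized patch of the image whose top-left corner is (top, left)."""
    total = 0
    for dy, row in enumerate(glyph):
        for dx, value in enumerate(row):
            total += abs(image[top + dy][left + dx] - value)
    return total


def find_glyph(image, glyph, tolerance=0):
    """Return every (top, left) position where the glyph matches the image
    with a total pixel difference of at most `tolerance`."""
    glyph_h, glyph_w = len(glyph), len(glyph[0])
    image_h, image_w = len(image), len(image[0])
    hits = []
    for top in range(image_h - glyph_h + 1):
        for left in range(image_w - glyph_w + 1):
            if glyph_distance(image, glyph, top, left) <= tolerance:
                hits.append((top, left))
    return hits


if __name__ == "__main__":
    # a tiny fake "screenshot" containing the same 2x2 glyph twice
    image = [
        [0, 0, 0, 0, 0, 0],
        [0, 9, 0, 0, 9, 0],
        [0, 9, 9, 0, 9, 9],
        [0, 0, 0, 0, 0, 0],
    ]
    glyph = [
        [9, 0],
        [9, 9],
    ]
    print(find_glyph(image, glyph))
```

a tolerance of 0 only accepts exact copies. raising it lets slightly smudged glyphs through, at the cost of more false matches between similar-looking letters.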

as an aside, this is why regular OCR struggles so much with this kind of data retrieval (code, for example), even when the source is clearer than a physical paper scan. ordinarily, OCR will try to best-guess every single letter because it expects each instance of a letter to look slightly different (as is the unpredictable nature of a scanned document), and on top of that most OCR today will try to autocorrect because it expects the scanned text to contain words in some human language.

so, all we have to do is make a program to recognize each character and piece back together the whole base64 string, right? well…

first i stitched all 7 images back into a single block of text, observing the consistency of the line spacing. some of the screenshots have little bits of the previous one sticking out of it, which helps with alignment and to make sure they’re in the right order.

closeup of the base64 screenshots showing edge bits that align with each other

after that i had to sample every single letter off this file. this means going around the file and finding one example of each different character we’re trying to identify, saving it as its own separate file so that the program can load them as references to compare against in the full image. for base64, the alphabet consists of a-z, A-Z, 0-9, +, / and =. once i had the initial code in place…

result of the recognized text data recreated by pasting the isolated letters. there are gaps where a match was not identified

…close! but oh so far. if any one single character in a base64 string is wrong or missing, the resulting decode will be wrong. the issues i was having were mostly with the lowercase r and j because of how the kerning affected the pixels around those letters. i was also getting false matches for r where there should be an m. what followed was grueling hours of tweaking the matching code and my known font set to better fit the original image and get as close as possible to a 100% match. here is the resulting code, maybe it’ll be useful for someone and this won’t have been a complete waste of time.
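to illustrate why a single bad character matters (this is a separate toy example, not the recognition code mentioned above): base64 packs 3 bytes into every 4 characters, so a wrong character damages the bytes in its own group, while a missing character shifts every group after it. the helper names `swap_char`, `drop_char` and `differing_bytes` are invented for this sketch, and the 12 sample bytes stand in for real image data.

```python
import base64


def swap_char(encoded, index, replacement):
    """Simulate a misrecognized character (say, an r where an m should be)."""
    return encoded[:index] + replacement + encoded[index + 1:]


def drop_char(encoded, index):
    """Simulate a character the matcher failed to find at all.
    The result is trimmed to whole 4-character groups so it still decodes."""
    shorter = encoded[:index] + encoded[index + 1:]
    return shorter[: len(shorter) - len(shorter) % 4]


def differing_bytes(expected, actual):
    """Indexes (within the overlapping length) where the two byte strings differ."""
    return [i for i, (a, b) in enumerate(zip(expected, actual)) if a != b]


if __name__ == "__main__":
    original = bytes(range(12))
    encoded = base64.b64encode(original).decode("ascii")

    wrong = base64.b64decode(swap_char(encoded, 5, "x"))
    missing = base64.b64decode(drop_char(encoded, 5))

    print("wrong character changes bytes at:", differing_bytes(original, wrong))
    print("missing character changes bytes at:", differing_bytes(original, missing))
```

in raw bytes a single swap stays local, but JPEG data is compressed, so even one damaged byte can garble everything the decoder draws after that point.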

once i was confident through the verification image that i had all characters recognized, i put it through a base64 to JPEG decoder. i actually did this several times as i improved the recognition and what follows is the best result that came out of it yet. i suspect some of the data might be missing (perhaps a line or block of text got lost in between screenshots), or i have a wrong character somewhere resulting in a wrong value. this is the image extracted from the base64 string:

a corrupted picture of a person wearing an oversized backpack
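for reference, the decode step itself can be sketched in a few lines. this assumes the recognized text is a plain string that may contain line breaks, and the function names `decode_recognized_text` and `jpeg_markers` are hypothetical. checking for the JPEG start and end markers is a cheap hint about whether data went missing: every JPEG file begins with the bytes FF D8 and a complete one ends with FF D9.

```python
import base64

JPEG_START = b"\xff\xd8"  # SOI marker, the first two bytes of any JPEG file
JPEG_END = b"\xff\xd9"    # EOI marker, the last two bytes of a complete JPEG file


def decode_recognized_text(text):
    """Strip whitespace, keep only whole 4-character groups, and decode.
    Trimming is a lenient choice for this sketch: an incomplete trailing
    group is dropped instead of raising a padding error."""
    compact = "".join(text.split())
    usable = len(compact) - len(compact) % 4
    return base64.b64decode(compact[:usable])


def jpeg_markers(data):
    """Return (has_start_marker, has_end_marker) for the decoded bytes."""
    return data.startswith(JPEG_START), data.endswith(JPEG_END)


if __name__ == "__main__":
    # stand-in data shaped like a JPEG, not a real image
    fake_jpeg = b"\xff\xd8\xff\xe0" + bytes(range(20)) + b"\xff\xd9"
    text = base64.b64encode(fake_jpeg).decode("ascii")
    data = decode_recognized_text(text)
    print(jpeg_markers(data))
    # to view a real result, write the bytes out in binary mode:
    # with open("recovered.jpg", "wb") as handle:
    #     handle.write(data)
```

a missing end marker would fit the theory that a line or block got lost at the tail, though it can't say anything about gaps in the middle.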

this is where i'd stopped when i made my initial post about this on tumblr. however, shortly afterwards (because i can't let sleeping dogs lie) i found that i still had one letter wrong:

the original text image followed by the diff with a recreated image. the difference shows that an m was mismatched as an r

with this it still doesn’t parse as fully valid base64 in strict mode so i think there’s still another letter in there that’s wrong, but i couldn’t find it. however this gives us a better look:

a less corrupted picture of a person wearing an oversized backpack
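a note on what "strict mode" can and can't catch, with a small sketch using Python's standard `base64` module (the `strict_problems` helper is invented for this example). strict decoding checks structure: the alphabet, the length, the padding. it can't notice a letter that is valid base64 but simply the wrong letter, so a strict-mode failure more likely points at a missing character or a stray symbol than at another m/r style swap. that is a guess about this particular string, not a diagnosis.

```python
import base64
import binascii
import string

# the 64 base64 digits; '=' is only allowed as padding at the very end
ALPHABET = set(string.ascii_letters + string.digits + "+/")


def strict_problems(text):
    """Return a list of human-readable reasons the text fails strict base64
    validation. An empty list means it is structurally valid."""
    problems = []
    if len(text) % 4:
        problems.append(f"length {len(text)} is not a multiple of 4")
    body = text.rstrip("=")
    if len(text) - len(body) > 2:
        problems.append("more than two '=' padding characters")
    for index, char in enumerate(body):
        if char not in ALPHABET:
            problems.append(f"unexpected character {char!r} at index {index}")
    try:
        # validate=True makes the decoder reject non-alphabet characters
        base64.b64decode(text, validate=True)
    except binascii.Error as error:
        problems.append(f"decoder says: {error}")
    return problems


if __name__ == "__main__":
    for sample in ("AAECAwQF", "AAFCAwQF", "AAECAwQ", "AA\nCAwQF"):
        print(repr(sample), strict_problems(sample))
```

the second sample has one letter swapped for another valid letter and passes cleanly, which is why a diff against the original image is still the only way to catch that kind of error.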

and this is finally enough to do a reverse image search. i present to you, the HD version of our intrepid massive backpacker:

a complete, higher resolution picture of a person wearing an oversized backpack

still have no idea what they meant by “me in a relationship” with that, though.
