I’m a fan of the DRY principle: Don’t Repeat Yourself. One could extend this acronym to a more accurate Don’t Repeat Yourself Any More Than Necessary (DRYAMTN – dry-am-tin, catchy, right?), meaning: it’s not that you put on a different pair of shoes every day (ahem I don’t) or go home to a different house, it’s that you’ve found the most efficient way to put on shoes or drive home. You still repeat yourself, day in and day out, just not any more than necessary.
Part 4. Slaying the beast, escape from hell
Note. There are many excellent resources that describe database cleanup processes similar to this one, in much more detail than I wish to go into for this blog, so please check those out if you find yourself in this situation. Links will follow this post.
Mojibake (文字化け), from the Japanese 文字 (moji) “character” + 化け (bake) “transform”, is the presentation of incorrect, unreadable characters when software fails to render text correctly according to its associated character encoding.
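To make that definition concrete, here is a minimal sketch in Python of how mojibake comes about: text is written out as bytes in one encoding and then read back as if it were another. The helper names `garble` and `repair` are made up for illustration, and the UTF-8 / Latin-1 pairing is only an assumption chosen because it is a common mix-up, not necessarily the combination behind any particular broken database.

```python
def garble(text: str, written_as: str = "utf-8", read_as: str = "latin-1") -> str:
    """Encode text with one encoding, then decode the bytes with another."""
    return text.encode(written_as).decode(read_as)


def repair(garbled: str, written_as: str = "utf-8", read_as: str = "latin-1") -> str:
    """Undo garble() by reversing the wrong decode, then decoding correctly."""
    return garbled.encode(read_as).decode(written_as)


original = "naïve café"
broken = garble(original)

print(broken)                      # naÃ¯ve cafÃ©
print(repair(broken) == original)  # True
```

The repair only works here because every byte maps to some Latin-1 character, so nothing is lost on the way in. Real-world damage is often layered or lossy, and a clean round trip like this should not be taken for granted.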
Part 3. Revealing the beast, bit by bit
“What do we know … of the world and the universe about us? Our means of receiving impressions are absurdly few, and our notions of surrounding objects infinitely narrow. We see things only as we are constructed to see them, and can gain no idea of their absolute nature.”
I wallowed a little bit, but then I got fired up again and hit the interbooks.
Unfortunately for me, most searches only partially described the problem I was facing—a piece here, a piece there—there was no great, cathartic reveal. So I gathered the pieces that did fit and put them in a big ugly pile in front of me. After shuffling them around and using my thinkin’ muscles, a picture began to form:
Part 2. Into the pitch black depths
“Every happy database is alike, every munged database is munged in its own way.”
I would describe the period that followed the meeting with my new boss as a rapid transition from doe-eyed optimism to desperately headbutting reality. I was groping around in the dark for something, anything, to hold on to after several weak attempts at solving the problem proved problematic at best. Here is some of what I encountered:
This story is by no means original: a web search for ‘mixed character sets’ or ‘mysql character set hell’ will return many relevant results (including the inspiration for the title), harrowing tales from developers and DBAs who have gone through the wringer at some point in their careers. This is mine.
Part 1. The beginning
Oh, to be young and naïve again…
…I think this every time the memories from the first days at my first tech job spring up and inevitably, fatalistically, and cruelly guide me toward one moment in particular: an insidiously short conversation I had with my new boss one day before lunch.
The title says it all. I am abandoning WordPress and trying something a little more ‘baked in’, that is to say, HTML-based and not database-based. I like it!