Solving rush-hour

For those who’ve been inspired by last programming challenge, I thought I’d give annotations on a programming exercise I’ve created.

The challenge: code an AI to solve Rush Hour, you can play Rush Hour online if you aren’t familiar with how the game works. If you want to follow along, you can find my solution on github.

The game of Rush Hour
The game of Rush Hour(tm) – The objective is to get the red car to the exit. The above picture illustrates one possible board.

Let me start with the high-level thought process.

How to solve it?

The most naive approach is a brute-force solution, trying every move, filtering out backtracking. Some simple math will let us rule this possibility in or out.

The only state in Rush Hour the positions of the cars. Thus the number of possible game states is equal to the number of possible arrangements of cars. We can approximate an upper bound of the latter by multiplying the number of possible spaces each car could move to (this ignores cars being on top of each other, hence upper-bound). In the above board we see 8 vehicles, and each vehicle can only ever occupy 4 or 5 squares [they can’t turn ever] depending if their size). Some boards have more vehicles, so if we estimate 10 vehicles at 5 positions that’s 5^10 < 10 million.  So yes, brute force is in.

The general solution:

Because “brute force” is such a general mechanism, there really is a lot of reusable code in the solution. In my solution I’ve pulled out the reusable logic into genericSearch.js. The idea is that a whole host of puzzles follow the same form: Given an Initial state try to get to End State.

The docblock below is the interface to genericSearch that is used by the solver, but this same search function could be used by any number of puzzles (triangle peg game, sudoku, etc).

Screenshot_2
Any brute-force can be represented with only these params.

Thus, to write the program (after having written genericSearch.js) all I must do is to make those 5 parameters [and any helpers to go along].

Initial state – Brain Dead Simple

I know a lot of engineers who would be tempted to represent the game of rush hour with at least a half dozen classes (“so we have vehicle base class, which has car or truck subclasses, and we’d need a class for the board and legal moves”).

And that is a valid way to go about it. I went with a minimalist solution– a nested array 36 characters. Brain dead simple.

 

 

 

It seems that perfection is attained, not when there is nothing more to add, but when there is nothing more to take away. — Antoine de Saint Exupéry

Because every car is at least 2 in size, this string notation suffices to convey both the position and bearing of vehicles, everything we need. Since the exit is always in the same spot, that needn’t be represented in state. Adjacent characters of the same value represent the same car.

We have our first parameter, initial state. However, we have skimped in our state, by not defining what a vehicle is or orientation in the state object we’ll have to define that logic later.

Some of the advantages of representing the board state as an array of characters:

  • It’s trivial to serialize. Serialization is key to making sure we don’t backtrack when searching for solutions.
  • It’s very easy to input.
  • It’s very easy to output and debug. At the end to see the solution found I can simply “play” the states and watch the characters move.
  • At any given moment it’s trivial to determine if a space is vacant of any vehicles.
  • Certain impossible board-states are prevented automatically (e.g. a misconfigured initial state could not result in overlapping vehicles).

Some of the disadvantages:

  • We must determine every car on the board and which direction it’s facing. This is non-trivial (let’s call “trivial” something that I code correctly on my first try).
  • Moving a vehicle is kind of “dirty.” By moving vehicles by overwriting the board state it’d be easy make bugs that create weird board states (vehicle with size one).
  • Certain impossible board-states are prevented automatically (e.g. a misconfigured board state could not result in a vehicle with no bearing)
Parameter 2 – Possible moves

Each vehicle may move forward, or backward (assuming its path is unobstructed). To keep things simple, we can say a vehicle can only move 1 square at a time and represent longer travels as multiple moves.

Get all possible moves from this state
Get all possible moves from this state

This definition introduces a few more functions (getVehicles, canGoForward, canGoBackward), which I won’t put into the post. See the full code for that. The reason I exclude them is because I don’t have any particularly elegant solution to those tasks.

Parameter 3 – Apply move

Again I have no magic to work on this one so I won’t show the 15 lines. In fact, it feels like the least dry piece of the code to me, and it makes me want to refactor the most. The number of occurrences of -1 are too numerous (6) and speak to casual coding.

So don’t get me wrong, I don’t want to say anybody should hold this code up as an amazing example of good code. What I hoped to illustrate is how a problem that seems daunting can actually be broken down into 19 functions, the longest being 23 lines.

The Gimmes

I have two functions left. One is how to uniquely represent the state as a string. The other is win condition. The former is easy: JSON.stringify. The latter is easy as well.

Screenshot_6

And we’re done! Now I’ve left our the genericSearch.js, which honestly was perhaps the most fun, but this post is long enough. If your thirst isn’t quenched try playing around with it yourself.

Takeaways
  • Keep the pure & reusable code in a separate file (or separate function) from your 1-time, implementation-dependent code
  • Maybe 2 classes would have come in handy. Despite being small, the code doesn’t seem as friendly as I’d want. If I had kept state as an object of objects I would never have to define string -> state mapping.
  • Unit tests would have been a good way to allow such a refactoring to be less painful.

Don’t be an Architecture Hipster – A guide

One of my more popular HN comments was a teardown of an all-too-common engineering subculture. The article in question sought to teach “Software Architecture,” but ultimately annoyed me and many other HN readers.

Tl; dr:
Software architecture discussions are so polluted by software-hipsters that the majority of software engineers are disinclined to discuss and study architecture at all.

Know-it-all / hipster

  • Probably studied philosophy, english, film-studies, or engineering.
  • Lurks around discussions, hoping to overhear a mistake, any mistake. 
  • Uses an unnecessarily complex vocabulary
  • Has a self-righteous and loud tone of voice, especially when correcting people.
  • Enjoys correcting people, even on minor/irrelevant points.
  • If asked him a question you can be sure of two things:
    1) he will never admit he doesn’t know
    2) he will answer with so many niche terms that you will end up more confused than when you began
  • He may be likened to Dwight Shrute, or the comic-book guy.

Architecture  Hipster

  • Loves UML making diagrams. Gladly presents them without a legend to people who don’t know how to read UML diagrams.
  • Love creating new unintuitive names for very simple ideas (e.g “Broker topology”). Proceeds to use these niche names in discussions and judges other people for not knowing what a “Broker Topology” is.
  • Gets really into programming fads that are hailed as panaceas but later get tossed out for a new panacea (e.g. microservices, Layered Architecture, REST).
  • Gets very emotionally attached to programming decisions, even if they will have no effect on the user.
  • Loves engineering around unlikely “what ifs…”.
  • Prefers creating abstraction through classes instead of functions, even if no state is needed.
  • Is bored by “practical details” like logging, debugging, convenience, fail-safes, and delivering quickly.
  • Maximizes the number and obscurity of patterns used, leading to classes named things like EventLoopListenerFactory.
  • Only cares about performance if obscure (e.g. tail-call optimization)
  • Isn’t actually very capable creating software (hasn’t won any coding competitions, has trouble with git reset, hasn’t written any software that people enjoy using)

Thus:

Know-it-all / hipster
 = Architecture  Hipster 
+ software architecture 

I don’t dislike architecture. Architecture is a beautiful study. However, the complexity of the discipline makes an fertile ground for phonies; a space for phonies to use miscommunication as a tool to create an illusion of their own competence.

The “Hello” World

Have you ever been searching for a song on your favorite music service and scrolled through all the songs that matched your search. All of them?

For example, top of the pop billboards at the time of this writing is Adele’s “Hello.” I scrolled for a while and got to at least 1,962 songs named “Hello,” before I stopped scrolling.

It feels like looking into the grand canyon.

A few things are immediately apparent-

  1. All comes to pass. To illustrate, Madonna too is in the Hello list, but her Hello didn’t last; I can’t see why Adele’s would.
  2. The top dog takes it all. Adele’s variant has 250 million spotify listens. Most of the Hellos have < 2,000 listens. Adele’s Hello very well may have more (spotify) listens than all of the other 2 thousand combined.
  3. It’s  harder than it seems. When all the songs you know of are big hits, it can be hard to realize just how many unpopular songs are out there.
  4. This goes deeper than “Hello.” I just picked the top song on the billboard at the moment, but it could have been any songs. Or any poem, or book, blog, or famous person.

And a few things are less obvious-

  1. Why does the top dog take it all? Why do 99% of songs never reach the radio? Are most of these songs just bad? Is it simply that it’s ten times easier to write a bad song than a good one? Or is it that the music industry builds pop celebrities for profit, and radios buy in? Or is it that audiences don’t want so much choice, that we only like a song the 3rd time we hear it so we focus on a few new ones?
  2. So two thousand people chose the same word for their title. Is unique art only a fantasy? From the pool of a million english words, two thousand artists all picked the same one as the title of their song. I know this because I searched by title. How many of these songs share the same key chords? How many of these songs are about the same thing? Artists, like the rest of us, like to think they are doing the unique, but maybe the pool of possibilities split among all of humanity isn’t big enough to allow us each a distinctly unique idea, song title, or life story.
  3. So where do they all end up? There must be at least 80 hours of “Hello,” on spotify. Which makes me think there must be enough music on spotify that I couldn’t listen to it all if I dedicated the rest of my waking life to it. And people are still writing music. Is it just a never-ending cycle of new genres with a small fraction surviving into each new generation with the vast musical history resting in peace at the bottom of our searches? Or do we someday exhaust the unique musical possibilities?

 

My programming challenge

Check it out

Excerpt from the intro screen:

Zennish

A programming puzzle inspired by a programming career

Professional programming involves balance between the opposing philosophies of architecture (GRASP, SOLID) and pragmatism (KISS, YAGNI). Over-architected code front-loads difficulty to solve problems that won’t necessarily ever come. Under-architected code causes exponentially more pain with each new requirement, like a house on a bad foundation. Finding the middle ground consistently, our job, is hard. 

But in the real world the changes you have to make over your code are so spaced out and forgotten. Thus the feedback loop on your coding such decisions may be months or years. Welcome to the garden of practice. When you make choices in this low-stakes sandbox it’ll only take minutes to reap what you have sewn. 

I also see this an alternative to traditional coding challenges (e.g. codility, google code jam) that challenges the traditional valuation of engineers on academic math-intensive skills. Expert engineering is not about plugging a sequence of numbers into the OEIS; it’s creative composition. 

Consider the following code samples.

// Specimen A 

if ($forceClose !== ' ' && $forceClose !== NULL && $issues['forcedClose'] && $forceClose > 0) 
     { 
     //ONLY SAVE IF 1, NEVER TOGGLE BACK
     $this->saveAttrForUser($user, 'forced_close', $forceClose);
     if ($forceClose == '1') 
         {
         Log::saveWrite("App killed");
         }
    }
// Specimen B
// note the prefix 'b' indicates type Boolean, the prefix 's' indicates type String
$bGotValidMessageFromClient = !($sForceClose !== ' ' && $sForceClose !== NULL);
$bAppPassedExitMessage = $aIssues['forcedClose']; 
$bExitWasForceful = ( (int)$sForceClose > 0);
$bAppWasForceClosed = $bExitWasForceful && $bGotValidMessageFromClient && $bAppPassedExistMessage;

if ($bAppWasForceClosed) 
     { 
     // A comment here that would mean something...
     $this->saveAttrForUser($sUserId, 'forced_close', $sForceClose);
     $bForceCloseWasAKill = ($sForceClose == '1');
     if ($bForceCloseWasAKill) 
         {
         Log::saveWrite("App killed");
         }
    }
These two samples contain the exact same logic. The former is more direct, essentially it inlines all of the variables in the latter. And that is why it is worse. A variable serves as a unit of abstraction, and a well-named variable is a comment; thus having more variables is better. In fact, all if conditions should the form if ($bSomeVariable) {, that is, never put a complex expression within the parentheses of the if. If the logic is particularly complex then use multiple variables, as seen in the sample above.

All if conditions should the form if ($bSomeVariable) {

At first glance, the bottom example is longer and may look intimidating. But it is superior because it exposes its internal logic. Try deciding what the top code does, and what ‘1’ indicates in that code. Then consider the bottom code. The top is an actual code sample.

And while we’re adding in these extra variables, let’s name our variables clearly. Programmers seem to have some unspoken assumption that variable names need to be less than 10 characters and not have prepositions… Maybe this made sense when you had an 80 character wide monochromatic screen with no form of autocomplete, but technology has made that trend obsolete.  I suspect most people would try to turn the variable name $sLinkToUniversalDownloadPage into something like $sUniversalDownloadLink which is inferior because it is ambiguous (is the link universal or the download universal?).

Or, for example, what would a function named “OpenFileLock” do? Is it opening a lock on a file? Is it filing a lock (in some kind of lock registry)? Is it locking an open file? Each of the three words in the title can be read as a multiple parts of speech (i.e. verb, noun, adjective, etc). Or perhaps FileLock is an existing class in this codebase. Prepositions here would make all of the difference.

Name everything; name it well.

Death By Metrics

‘A little knowledge is a dangerous thing’ – Alexander Pope

 

TL;DR – Graphs, dashboards, and metrics give a psychological illusion of control and understanding which is usually false and can be worse than no information at all.

Why is it so easy to fall subject to crappy metrics? Let’s consider a story. You’re a project manager and you just got yelled at because the engineering project you manage is already 2 sprints behind. You feel helpless. You asks the engineers why everything is taking longer and they give technical explanations about unknowns that you can’t see at all. You really don’t know anything.

Then something magical happens: you discover you can look at the engineers’ pull requests, and see how many PRs they make each day and how many diffs are in them. You start to look at these and start to notice patterns. Maybe less PRs get done on work from home Mondays… are they slacking off? When all you have is a nail, everything looks like a hammer.

And you just ruined the company. Suddenly you’re wondering why Bob has the fewest commits, why there are fewer PRs on work-from-home Mondays, why there has been a decrease in number of PRs last month. The answers respectively are: Bob is the only one to comment  his code so everybody except him is sped up because of it, sprints start on Mondays and fewer PRs have 1 day turn-around so this biases your resulth. Everything your one metric might lead you to assume has been entirely incorrect.

Or worse, you decide to evaluate engineers based on their Jira tickets. You have engineers estimate their tickets and then see if they can meet their goals on time. Suddenly you have incentivized your engineers to overestimate tickets, play “hot potato” with responsibilities (“Not really a bug! Ticket Closed!”), not help each other (teaching time just makes you look slower), and not invest in long-term architectural improvements that won’t show immediate benefits.

Or worse yet, you decide to judge the success of your company based on Monthly Active Users. You have your awesome facebook game and have 30 million MAUs. You force all of your users to send out invites to all of their friends and Viola 31 million MAUs! You change your code so that every one of your users must send invites to friends every week. Now you have 32 MAUs. The graphs look pretty convincing, you get promoted. And you wrecked the company. A year later facebook changes the rights to your app so that you can’t spam its users, your users have been pressured by their friends to stop spamming invites, your entire company has become a laughing stock in the gaming industry and is now lampooned everywhere for the rest of its existence.

The common pattern here is  individuals who don’t have industry turning to oversimplified numbers. It is people with finance degrees changing videogame experiences, MBAs insisting on multiple A/B tests to change an icon, or a non-technical hiring manager not calling a Bill Gates for an interview because he doesn’t have a CS degree.

WHY I love PHP Storm

It’s rare I go out of my way to just celebrate something, I usually try to have a balance and nuanced view and don’t like the mindset of attachment. However, I wish somebody had introduced me to this tool sooner. Thus this article is on WHY I love php storm (not that I love php storm).

Saying good things about the IDE is worthless to me if you don’t explain why, and nobody has ever explained WHY php storm is so much better than the alternatives.

Just to help you make an informed decision on how much/little my word is worth, these are the IDEs I’ve used enough to compare against for PHP&JS development:

Aptana, DreamWeaver, Eclipse, Netbeans, Notepad++, Zend Studio

 

The why:

  • Go to file quickly (without using the mouse), using partial search with wildcards, with an instantaneous dropdown. So I can hit command-shift-N, type public/*user*.js and see a list of matching files, hit down to the correct one, press enter, and have it open, all with negligible latency.
  • Other shortcuts are very useful, easily customizable, like back and forward.
  • I can search my workspace without using the mouse, and it’s very quick, and intelligently stops if it finds >1000 matches.
  • Spell-checks w/ the ability to add to dictionary. Because I’m a good speller this functions as both an error-catch (mistyped a variable name?) as well as a way to make team interactions better (how annoying to use somebody else’s misnamed function every time).
  • The refactors are good.
  • It nails both PHP and also Javascript, in one IDE.
  • It doesn’t crash. I leave it running for weeks, literally. I don’t close the app when I put my computer to sleep, even over the weekend often. Even after weeks it uses less than a gig of ram (no memory leaks), and is still fast.
  • Intelligently looks at doc blocks to infer return type.
  • Can reliably “go to definition” (by keyboard shortcut too) across documents in both PHP and JS.Javascript:
  • Validates JS. Can understand strict mode and ensure your code follows it if used.
  • Understands JSON correctly and validates on the fly.
  • Monitors for global variables and unused variables.
  • Perfectly understand prototypal inheritance for its autocomplete etc.

php_storm

“Distinguish Navigation From Content”

In my UI manifesto I mentioned a rule of good design: “distinguish navigation from content.” That means that for all information your application gives, it should be immediately and unmistakably apparent whether the information is situational data (i.e. content such as a particular webpage) or a application functionality (i.e. navigation, like the address bar). This should be apparent from the very first use without having to try it out.

Metaphorically speaking, if you run Safeway, don’t make an employee uniform that is simply a red shirt; if you do that then anybody who walks in with a red shirt could be mistaken as an employee. Make the distinguishing feature something that cannot be mimicked.

Screenshot from youtube, yellow circle added obviously.
Screenshot from youtube, yellow circle added obviously.
Screenshot from youtube
Notice how there is no visual change whatsoever as the mouse covers the x button.

 

 

A real life example from just this week. Youtube failed to do this. Notice this particular ad shows as “x” button in the uppper-right that indicates the ad can be closed. How do I, as a user, know that this is a real X button made by youtube, and not a drawn x in the ad image itself that will trick me into clicking the ad? I don’t, because the the X acts identical to the rest of the ad, the hover icon is the same, no css changes, the two possibilities are indistinguishable until I actually try it out.

How could you fix this? Make the X button partially leave the ad box in the upper right (an area that I can infer the ad doesn’t have permission to draw to) and have it highlight its border yellow when it is hovered (this assumes that the youtube ads are not mouse-responsive, which is true at the time of writing).

How I propose youtube mid-reel ads should behave
How I propose youtube mid-reel ads should behave

What’s the big deal?

So, what is the big deal? Afterall, clicking the X in the youtube ad does end up closing it, seems pretty minor.

It’s a big deal because it’s a systemic problem following a very basic formula that is easy to fix. The general principle behind “distinguish navigation from content” is that the user always needs to be aware of who they are communicating with. 

An old virus trick, back when viruses spread across file-sharing networks, was the someMusic.exe trick, where a trojan would be crafted that would have the default icon of windows media player. Hence, if a user had file-extensions turned off on a PC computer (the default), the file would appear exactly like a music file and would be double-clicked readily. This is again an example of a blurring between communication; the operating system isn’t communicating unambiguously if the icon you are seeing is a certification of file type or is content picked by the application’s creator.

Can you figure out what's going on here? It's all about distinguishing the source of the content.
Can you figure out what’s going on here? It’s all about distinguishing the source of the content.