The Programming Interview

I’m not a huge fan of the programming interview–on either side of the table. The signal-to-noise ratio for identifying successful candidates is frustrating. The programming questions that you’re asked to solve haven’t been real Computer Science questions for decades, and since Boyer-Moore nobody in their right mind has re-implemented strstr.

Even more fun is when these coding exercises are expected to be written on a whiteboard. For example, a question I was asked looked something like this: “Given matcher(“abba”, “red blue blue red”); implement function matcher(pattern, teststring) and return true if the provided string follows the pattern and false otherwise.”

You might, upon seeing this, immediately think it’s easy: you’ve got a space-delimited pattern. Split it up and compare!


// Space delimited pattern matching.
function matcher(pattern, teststring) {
if (pattern.length > teststring.length) { return false; }

// Convert pattern and teststring to an array.
pattern = [].slice.call(pattern);
var testarray = teststring.split(' ');

var identified = Object.create(null);

return pattern.every(function(patternkey, index) {
if (identified[patternkey] !== undefined && identified[patternkey] !== testarray[index]) {
return false;
}

identified[patternkey] = testarray[index];
return true;
});
}

That works, and then immediately you’re peppered with questions about how else you could do it, or what the complexity is. Calculating the complexity and speed is in most cases a distant third to the goals of “making it work,” and “making it maintainable.” The only relevant question, especially on the front end, is “does the JS frame in which this code is running yield in under 16ms for your use case?” The followup question is, “is this the tallest tentpole in that JS frame?” I’ll trade algorithmic complexity for ease of maintenance every time. We can talk about the “named” algorithms in that space once we realize it’s a performance bottleneck, but in building user interfaces with a budget of 16ms (60Hz refresh rate) that’s rarely necessary.

So maybe instead of taking advantage of the space-delimited matching, you recognize that this question is nothing more than a pretty interface to a regular expression. In order to meet the API design goals, however, you have to generate the regular expression, and then run it against the provided test string. You could cache the regular expression (the new RegExp(); call is the slowest piece of this code) but that is cheating. A simple version looks like this:


// RegEx version.
function matcher(pattern, teststring) {
if (pattern.length > teststring.length) { return false; }

// Convert pattern to an array.
pattern = [].slice.call(pattern);

var patterninfo = Object.create(null);
var norepeat = true;

var backrefcount = 1;
var regex = "";

pattern.forEach(function(patternkey, index) {
if (patterninfo[patternkey] !== undefined) {
norepeat = false;
regex += "\\" + patterninfo[patternkey];
} else {
patterninfo[patternkey] = backrefcount++;
regex += "(.+)"
}
});

if (norepeat) { return true; }

return new RegExp(regex).test(teststring);
}

At this point you’ve managed to come up with two solutions for the problem, with varying performance characteristics. The regular expression builder you’ve written makes it easy to handle the next question you were inevitably going to be asked, “how would you do it if the test string weren’t space delimited?” But what if you came up with the regular expression solution first? Well, clearly you’re going to be asked to implement it without using a regular expression. The amusing thing about asking that question is that it is likely that the interviewer themselves has not implemented it without a regular expression. Much less have they come up with a solution on a whiteboard. But say you’re a masochist and have nothing better to do on your Saturday night in San Francisco with interviews bookending the weekend. You might write something like this in incredibly tiny print on the whiteboard in your hotel room:


// Non-delimited pattern matching without regex.
function matcher(pattern, teststring) {
var teststringlength = teststring.length;
var patternlength = pattern.length;
if (patternlength > teststringlength) { return false; }

// Convert pattern to an array.
pattern = [].slice.call(pattern);

var patterninfo = Object.create(null);
var patterninfokeys = []; // Exists because Object.keys isn't available for dicts.

pattern.forEach(function(patternkey, index) {
if (patterninfo[patternkey] !== undefined) {
var occurrences = patterninfo[patternkey].indices.push(index);
patterninfo[patternkey].max = Math.floor((teststringlength - (patternlength - occurrences))/occurrences);
} else {
patterninfokeys.push(patternkey);
patterninfo[patternkey] = {
indices: [index],
min: 1,
max: teststringlength - (pattern.length - 1)
}
}
});

// Not repeating means it will default match everything.
if (patterninfokeys.length === patternlength) { return true; }

// "Even" patterns will only match "even" test strings.
// Eliminates 1/4 of the problem space.
var even = patterninfokeys.every(function(patternkey) {
return patterninfo[patternkey].length % 2 === 0;
});
if (even && teststringlength % 2 !== 0) { return false; }

// Adjacent single pattern keys can be coalesced.
// NOTE: this raises the minimum matching pattern length.
// NOTE: They're guaranteed to be adjacent in the data structure due to how they're added.
var previouspatternkey;
var offset = 0;
patterninfokeys = patterninfokeys.filter(function(patternkey) {
// First time through we need to simply skip.
if (!previouspatternkey) {
previouspatternkey = patternkey;
return true;
}

var occurrences = patterninfo[patternkey].indices.length;
var previousoccurrences = patterninfo[previouspatternkey].indices.length;

if (occurrences === 1 && previousoccurrences === 1) {
// Update the previous element.
patterninfo[previouspatternkey].min++;
patterninfo[previouspatternkey].max = Math.floor((teststringlength - (patternlength - patterninfo[previouspatternkey].min))/occurrences);

// Splice the pattern key out of the pattern array.
// Count how many we've removed, that will be our offset for future iterations.
// (Each removal results in the index being off by one more.)
pattern.splice(patterninfo[patternkey].indices[0]-offset++,1);

// This patternkey will be filtered out.
return false;
} else {
// We can move on to the next pattern.
previouspatternkey = patternkey;
return true;
}
});

// Now we check to see what the possible remaining solutions look like.
var stack = [];
var answer = permute(patterninfokeys);

// Unused, could be done instead of maintaining the offset.
function reduceIndices(compareIndex) {
patterninfokeys.forEach(function(patternkey) {
patterninfo[patternkey].indices = patterninfo[patternkey].indices.map(function(index) {
return index > compareIndex ? --index : index;
});
});
}

function permute(patterninfokeys) {
var answer = false;
var patternkey = patterninfokeys[0];

for (var value = patterninfo[patternkey].min; value <= patterninfo[patternkey].max; value++) { // Add the current value. stack.push(value); // Check to see if that pushed us over the limit. // If so, bail out. The rest of this tree is dead to us. var patternlength = calcpatternlength(stack); if (patternlength > teststringlength) {
stack.pop();
break;
}

if (patterninfokeys.length > 1) {
// We've not made it all the way to the leaf node.
answer = permute(patterninfokeys.slice(1));
} else if (patternlength === teststringlength) {
answer = check(stack, teststring);
}

if (answer) {
// We're matched, we're done.
break;
} else {
// No good, try again.
stack.pop();
}
}

return answer;
}

function calcpatternlength(lengths) {
var tocheck = patterninfokeys.slice(0, lengths.length);

return tocheck.reduce(function(previous, patternkey, index) {
return previous + patterninfo[patternkey].indices.length * lengths[index];
}, 0);
}

function check(lengths, teststring) {
// Look up length using the patternkey.
var lengthsbypatternkey = Object.create(null);
var samplesbypatternkey = Object.create(null);

patterninfokeys.forEach(function(patternkey, index) {
lengthsbypatternkey[patternkey] = lengths[index];
});
return pattern.every(function(patternkey) {
var sample = teststring.substr(0, lengthsbypatternkey[patternkey]);
teststring = teststring.substr(lengthsbypatternkey[patternkey]);

// Force the pattern to match.
if (samplesbypatternkey[patternkey] && samplesbypatternkey[patternkey] !== sample) {
return false;
}

// Force unique.
for (var x in samplesbypatternkey) {
if (patternkey == x) { continue; }
if (samplesbypatternkey[x] == sample) { return false; }
}

// Save our sample.
samplesbypatternkey[patternkey] = sample;

// Continue to the next piece of the pattern.
return true;
});
}

return answer;
}

And then you’ve got a solution that nobody can complain about, right? But realistically, that’s not the case. If you had somehow managed to write all of this code inside your single hour-long interview you’ll immediately be asked, “what are the remaining optimizations you could make to improve the performance? How could you make it more maintainable?” Easy: teach the original developer how to write regular expressions.


// Were you expecting more?
/(.+)?(.+)?([.]{3,})?\2\1/.test("abcdeba");

Here’s a performance comparison on JSPerf of these approaches. Amusingly, at a minimum of 78,000 operations per second, you’d be far more suited to optimize for function length over anything else if it’s being served as JavaScript to the client.

I don’t mean to pick on this question too much, but it pretty clearly demonstrates why nobody in their right mind would ever create a function like that. And I’m probably the only interviewee who has ever gone back and finished the problem as posed as an exercise in thinking through problems.

Toward Better Interviews

Please make your questions appropriate to the interview and to the job. Feel free to steal my favorite interview question for front end developers: “Implement debouncing in JavaScript.” It tests all sorts of wonderful things, all of which you use on a nearly daily basis:

  • Scoping.
  • Understanding of this.
  • Asynchronous behavior.
  • Prototype hijacking.
  • arguments “array.”
  • Binding.
  • “Currying.”


function debounce(myfunction, ms) {
var timeout;
return function() {
var self = this;
var origargs = arguments;
if (timeout != undefined) { clearTimeout(timeout); }
timeout = setTimeout(function() {
myfunction.apply(self, Array.prototype.slice.call(origargs));
}, ms);
}
}

There’s a lot to discuss, and best of all it easily fits on a whiteboard at about 8 lines of code.

It takes more work to identify questions which directly apply to the work that you do while demonstrating mastery of skills. Invest time and energy into coming up with interview questions which strongly correlate with the how well an interviewee will do in the position.

I hope to ruin function matcher(pattern, teststring) as an interview question with this post. The first two answers are trivial, the last needlessly complex. The best part about the debouncing question is that, even if you can rote reproduce it, without knowing every one of the constituent components you can’t field questions on it. Go ahead and prepare for that question if I’m interviewing you–you’ll end up brushing up on a lot of the things that I’ll want to know you’re capable of in your day-to-day job. It’s a win-win.