Jacob Kaplan-Moss

Work Sample Tests: Coding “Homework”

Welcome back to my series on work sample tests. This post describes one of the kinds of work sample test I use (and suggest others use as well). For some theory and background, see the first few entries in the series: introduction, the tradeoff between inclusivity and predictive value, and my “rules of the road” for good work sample tests.

Background

I use the term “homework” to refer to any exercise done asynchronously – an assignment given to a candidate to complete on their own time. Thus: “coding homework” is any sort of programming exercise that the candidate works on and turns in on their schedule.

Who this is for: any role where writing code is a major part of the job. Coding homework is a broadly useful technique, suitable for nearly any engineering role; I just wouldn’t use it for roles where writing code is less than about 30% of the job.

What it measures: ability to write and reason about software. If you construct the exercise carefully (more on that below) it can accurately simulate actual coding problems the candidate is likely to face at work.

Coding homework is my default work sample test: I use it for all engineering roles unless it’s obvious that another kind of exercise is better. There are good reasons to make homework-style work sample tests the default: they’re relatively easy to construct, they scale reasonably well to large hiring rounds, they’re accurate simulations of real work, and they’re easier than most other kinds of tests to construct in a way that maximizes inclusivity.


The Exercise

The exercise needs to resemble a problem someone would be likely to encounter in this job (rule #1). I usually go with some variation of “parse this file and answer some questions about it”; nearly every programming role encounters some variant of this problem. Ad Hoc’s “SLCSP” exercise is a great example (and one I use frequently):

Problem

You’ve been asked to determine the second lowest-cost silver plan (SLCSP) for a group of ZIP codes.

Task

You’ve been given a CSV file, slcsp.csv, which contains the ZIP codes in the first column. Fill in the second column with the rate (see below) of the corresponding SLCSP and emit the answer on stdout using the same CSV format as the input. Write your code in your best programming language.
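
To give a sense of the shape of a solution, here’s a minimal sketch of my own (not part of Ad Hoc’s exercise): it echoes the input CSV to stdout with the rate column filled in where possible. The lookup_slcsp helper is a placeholder for the actual rate computation, which depends on data the excerpt above doesn’t include.

    import csv
    import sys

    def lookup_slcsp(zipcode):
        """Placeholder: the real rate computation depends on additional
        data files that aren't reproduced in the excerpt above."""
        return None

    def main():
        with open("slcsp.csv", newline="") as f:
            rows = list(csv.reader(f))

        writer = csv.writer(sys.stdout)
        writer.writerow(rows[0])  # echo the header row unchanged
        for zipcode, *_ in rows[1:]:
            rate = lookup_slcsp(zipcode)
            # Emit a blank rate when the SLCSP can't be determined.
            writer.writerow([zipcode, f"{rate:.2f}" if rate is not None else ""])

    if __name__ == "__main__":
        main()

Even a skeleton like this surfaces most of what you’ll end up discussing in the follow-up interview: input handling, edge cases, and output formatting.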

See below for more discussion on constructing an exercise. Whatever the exercise is, though, you should offer some instruction to candidates (rule #6):

  • Be clear about which programming language(s) are acceptable. If there’s a specific language the candidate will need to know on Day 1 of the job, you should use that language. But if you can allow choice – and usually you can – you should (rule #4).

  • Similarly, be clear about any environmental constraints. I tend to ask people to minimize their use of external dependencies (Gems, NPM, PyPI, etc.), but I don’t outright forbid it. It’s easier to follow code that only uses the lowest common denominator of a language’s standard library, but I also don’t want to prevent a candidate from using an appropriate tool for the job. I’ve found I don’t need a strict rule here: almost every candidate seems to understand that the point is for me to look at their code. I’ve never seen such a heavy reliance on dependencies as to render the test invalid. Telling candidates to “minimize dependencies” seems to strike a good balance: candidates sometimes use external libraries for features tangential to the problem (e.g. CLI option parsing, UI of some sort, etc.), but the bulk of the code is theirs.

  • Along with the code, I suggest asking for tests and documentation (something like a README that explains how to run the tests and the code). This more closely simulates a real job, where writing tests and documentation is part of calling something “done”.

  • Clearly set the expectation of a maximum of 3 hours of work (rule #2), but give at least a week between the assignment and the submission deadline (rule #3).

  • Make sure to mention the follow-up interview, and give candidates an idea of what they’ll be asked about.

I put all this in a briefing email that gets sent to candidates. For an example of what that email might look like, see the candidate instructions 18F uses¹.

Post-exercise interview

After the candidate writes and submits the code, they’ll have an interview with someone on the team who’s read their code. This ensures that the work sample test is the start of a discussion, not a pass/fail gate.

This interview usually doesn’t need a full hour; 30–45 minutes is typically sufficient. Sometimes I’ll combine the two: a single one-hour interview that’s half a review of the coding exercise and half behavioral interview questions.

I’ll start the coding interview by asking the candidate to walk me through their code and how it works, and as the discussion progresses I’ll ask follow-ups like²:

  • How does this part (point to a tricky bit) work?
  • Are you happy with your solution? Why or why not?
  • What would you do differently if you got to do this over again?
  • Did you get stuck anywhere? How’d you get unstuck?
  • What was your testing strategy?
    • Did you write tests before coding? After? Switch back and forth?
    • Is that usually how you write test code or did you do something differently here?
    • Are you happy with your testing strategy? Why or why not?
  • I see you used ${LANGUAGE}…
    • … are you happy with that choice? Why or why not?
    • … do you think it’s a good fit for this sort of exercise? Why or why not?
  • What other languages might you have used?
    • What would be better/worse/different about solving this problem in ${OTHER_LANGUAGE} instead?
  • I see you used [some third party module] to solve this problem. How would you solve it without that module?
  • What would you do differently if you had much longer to work on this problem?
  • If your data file was several orders of magnitude larger, say 1 TB, would your code still work?
    • Why or why not?
    • What would you do differently if you needed to handle files that large?

That last one’s my favorite follow-up, and a great illustration of using the interview to learn things about a candidate’s skills that the code alone won’t tell you.
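
To make the 1 TB question concrete: a solution that slurps the whole file into memory stops working long before the input gets that big, while one that streams a row at a time keeps going. Here’s a quick sketch of the difference (the row-counting task is just an arbitrary stand-in, and the file format is assumed to be CSV):

    import csv

    def count_rows_eager(path):
        # Reads everything into memory at once: fine for a
        # homework-sized file, hopeless at 1 TB.
        with open(path, newline="") as f:
            return len(list(csv.reader(f)))

    def count_rows_streaming(path):
        # Processes one row at a time: memory use stays flat no
        # matter how large the file grows.
        with open(path, newline="") as f:
            return sum(1 for _ in csv.reader(f))

A candidate who can explain this distinction, and point to where their own solution falls on it, is telling you something the code alone wouldn’t.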

Behaviors to look for

  • Is the code they produced reasonable? Programming is a team sport, so perfection isn’t required; you’re looking for competence. The quality bar I usually aim for is “draft pull request”: there might be bugs or some rough spots, but the general ideas should be there. With a bit of polish, it should be acceptable quality for your codebase.
  • Can they explain the code – how it works, what it’s doing – and their process for writing it? Can they explain the design decisions and tradeoffs they made along the way?

Positive Signs

  • 👍 Clear explanations of what the code is doing and how it works. Bonus points if they talk about the principles or patterns behind why they wrote the code the way they did.
  • 👍 Clean code: well-factored, obvious variable names, comments/docstrings where needed, etc. Their code is unlikely to match any company-specific style guides or norms, so don’t ding a candidate if they use camelCase where you prefer under_scores. Those kinds of stylistic differences are trivial. Instead, look for generally-good readability and internal consistency.
  • 👍 Good tests and documentation. I want to work with engineers who write both, so I always ask for tests and docs as part of the exercise, and I look to see that they’re as well-written as the rest of the code.
  • 👍 Reasonable decisions around tradeoffs: e.g., choosing to use a parsing library vs. some DIY parsing with regular expressions. A candidate doesn’t necessarily have to have made the same decision you might have, but they should have a reason why they made the choice they did, and that reason should be, er, reasonable.
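
To make that last tradeoff concrete, here’s the classic case where DIY splitting falls over and a parsing library doesn’t. This snippet is mine, not from any candidate:

    import csv

    line = '"Doe, Jane","123 Main St, Apt 4",02134\n'

    # DIY parsing: str.split can't tell a field separator from a
    # comma inside a quoted field, so this yields five mangled pieces.
    naive = line.strip().split(",")

    # Library parsing: the csv module understands quoting and
    # returns the three intended fields.
    proper = next(csv.reader([line]))

    print(naive)   # ['"Doe', ' Jane"', '"123 Main St', ' Apt 4"', '02134']
    print(proper)  # ['Doe, Jane', '123 Main St, Apt 4', '02134']

Either choice can be defensible; what you’re listening for is whether the candidate knew cases like this exist and weighed them.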

Red Flags

  • 🚩 The code doesn’t work in some sense: it won’t run, or runs but produces the wrong output, or is incomplete (e.g., lacking documentation or tests), or tests fail, etc. These aren’t necessarily dealbreakers — in the real world, not all code works on the first try — but the candidate needs to be able to articulate why they fell short, and it should be a reason that’s related to the test itself (i.e., something that wouldn’t be replicated in the real job). It is a serious red flag if the code doesn’t work but the candidate thinks it does, or can’t explain how they got stuck.
  • 🚩 They’re unable to explain what the code is doing or how it works, or their explanation doesn’t match what’s in the code.

Discussion

You’ll notice that this technique isn’t as simple as “here’s some coding homework”; there’s a lot more wrapped around the relatively simple assignment. All this additional “stuff” beyond “write some code” is to make the exercise respect the rules of good work sample tests.

This structure also frees us from needing to keep tests secret. We’re not trying to make this a test in the sense of those silly pop quizzes my high school math teacher liked to spring on us. Because the coding exercise is only one part of the overall test, there’s no need for anything obnoxious like requiring candidates to sign NDAs before being given the test³.

Quite the opposite: a well-structured “homework”-style work sample test can be entirely public, like the ones by Ad Hoc I’ll cover below. I’ve started putting the details of work sample tests in the job ads I send out so candidates can see exactly what they’ll be asked to do before they apply.

Alternate versions of this exercise

There’s a nearly endless variety of exercises that could fit into this “homework” archetype. You can probably come up with a variant of this exercise to match any software development job. This flexibility is why I use it so often, and why it’s the first kind of exercise I’m covering. It can take some care to come up with an exercise that follows the rules of good work sample tests (in particular, scoping down to fit the narrow time limit is challenging), but it’s nearly always doable.

To get you started, here are some other questions or types of questions that you might use (or use as starting points):

  • Ad Hoc’s homework exercises: the “SLCSP” example I’ve used comes from Ad Hoc, and the rest of their exercises are equally good.
  • As I mentioned above, my go-to exercise archetype is “parse this file and answer some questions about it”; there’s a nearly endless variety of file formats and data-processing tasks you could build this style of test around.
  • A twist on “parse this file” is “interact with this API”: you provide an API endpoint and some documentation, and ask candidates to perform some task against that API.
  • Or, flip that around: ask the candidate to implement an API against which some code you provide runs.
  • A variant of “write some code” is “fix a bug in this code”: you provide candidates with code that’s broken, and ask them to find and fix the bug. This can be great for roles where debugging skills are critical, like software testing or QA (a toy sketch follows this list).
  • Another option is “finish this code”: give candidates code that’s incomplete, and ask them to finish it off in some way. This is great for exercises that might be too large in scope by themselves.
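
As a sketch of the “fix a bug” flavor mentioned above, here’s a toy example of my own invention; in a real exercise you’d strip the comment that gives the bug away:

    def moving_average(values, window):
        """Return the average of each window-length slice of values."""
        averages = []
        # Bug for the candidate to find: this range stops one window
        # early, so the final average is silently dropped.
        for i in range(len(values) - window):  # should be "- window + 1"
            chunk = values[i : i + window]
            averages.append(sum(chunk) / window)
        return averages

    # Once fixed: moving_average([1, 2, 3, 4], 2) == [1.5, 2.5, 3.5]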

Again, the possibilities are endless. The framework I provided can help you produce a good test out of nearly anything.

Questions?

If you have questions about this example or anything in the series, send me an email or tweet at me. If I get some good questions, I’ll answer them in a series wrap-up.


  1. I wrote the first version of these instructions when I worked at 18F. They’ve been updated some since, but are still mostly my words. ↩︎

  2. These are adapted from 18F’s hiring guide. Like the instructions above, I wrote the first version of these while I was at 18F. ↩︎

  3. I wish this were hypothetical. ↩︎