Hiring - finding the needle in a haystack

Update 2022-01-03

I’ve done my fair share of hiring at Bird Eats Bug, and can now attest to the fact that the conclusions I reach below applied to Yara, but don’t quite cut it for Bird Eats Bug.

Learning: The right hiring practices vary widely, depending on the company! I’ll update this post with details in due time.


I’ve hired a couple of people and gone through a hundred-something CVs. Hiring is incredibly costly. From the hiring side:

Information gathering

The core problem of hiring is information gathering. Neither the candidate nor the employer has enough (of the right) information about the other to be entirely sure whether a “yes” or a “no” is the right decision.

Hurdles to getting the right information

Gathering information is hindered by three factors:

Tools to get information on the hiring side

To gather information about a candidate, here’s a list of typical tools. I list each with the downsides I noticed when hiring web developers with a JavaScript focus:

Scanning the CV

Predictive relevance for a good fit: low (hard skills), - (soft skills)
Predictive relevance for a bad fit: medium (hard skills), - (soft skills)

Time investment: 10 min
Subjectivity: high
Filters out disproportionally: juniors (lack of practical experience is visible)
Automation: there are dozens of “AI powered” CV filters; I have no practical experience with them and don’t know whether they are trustworthy

Reading a CV is the default first step in most companies. There’s a reason: it doesn’t take much time and some aspects can be automated (e.g. filtering out CVs which do not contain certain key phrases). Looking at CVs does have value as a first filter if there are a lot of applications. Unfortunately, the CV doesn’t really say much about the person. Whereas I learned in school that a single typo would destroy any chance of getting the job, I’ve rarely seen CVs without minor to medium-sized issues: typos, incorrect grammar, bad formatting, bad data linkage (links are plain text instead of hyperlinks), missing basic information (like email address or place of residence), a lack of structure. A bad CV is a weak argument against a candidate; I’ve spoken to plenty of candidates with bad CVs who turned out to be great. Even though one would think that with one of the trillion CV templates out there no one would get the basics wrong, there seem to be enough software developers who just don’t care enough about the formalities. Conversely, there are also candidates who are skilled at overselling themselves on paper. In conclusion, I don’t think the CV has much predictive relevance. It only allows filtering out candidates who are obviously lacking relevant education or practical experience.
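To illustrate the kind of key-phrase filtering mentioned above, here is a minimal sketch; the keyword lists, the pass logic, and the function name are invented for this example and are not how any particular “AI powered” CV tool works:

```js
// Minimal sketch of a key-phrase CV pre-filter (illustrative only).
const mustHave = ["javascript", "react", "node"];
const niceToHave = ["typescript", "graphql", "aws"];

function preFilterCv(cvText) {
  const text = cvText.toLowerCase();
  const missingMustHaves = mustHave.filter((kw) => !text.includes(kw));
  const bonusHits = niceToHave.filter((kw) => text.includes(kw)).length;
  return {
    passes: missingMustHaves.length === 0, // reject only on missing must-haves
    missingMustHaves,
    bonusHits, // useful for ordering the pile, not for rejecting
  };
}

// Usage: rank the remaining CVs by bonus hits, but still read them yourself.
console.log(preFilterCv("Senior JavaScript developer, React, Node.js, AWS"));
// { passes: true, missingMustHaves: [], bonusHits: 1 }
```

Such a filter only catches the obvious mismatches; as argued above, it says nothing about whether the person behind the CV is any good.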

Judging the portfolio / GitHub profile

Predictive relevance for a good fit: high (hard skills), - (soft skills)
Predictive relevance for a bad fit: low (hard skills), - (soft skills)

Time investment: 10-30 min
Subjectivity: high
Filters out disproportionally: candidates with a small portfolio (often juniors)
Automation: -

Having a look at a candidate’s existing code can uncover talented, creative candidates with spare time; they tend to have awesome repositories. I’ve never seen an applicant with strong OSS contributions so far, though. For most candidates, the picture is somewhat more muddled. There’s a lack of context: Is this project badly documented and the git commit history lacking because it’s a weekend hackathon project? How much time was put into it? What was the goal, and do the tech decisions make sense given the goal?

Then there are plenty of people who don’t have a filled portfolio at all (like yours truly). Developers spend the whole day programming, usually on closed-source code they can’t share, and they might not be able to invest private time; when they do, they might want to play around without the pressure of having public results to show for it. A middling or absent portfolio can’t be taken as a signal either way. Only truly horrific code should count against a candidate.

Logical thinking / cognitive aptitude test (async)

Predictive relevance for a good fit: low (hard skills), - (soft skills)
Predictive relevance for a bad fit: - (hard skills), - (soft skills)

Time investment: -
Subjectivity: -
Filters out disproportionally: -
Automation: automated through providers like testdome

I’m critical of such tests. Even though test providers cite scientific studies, my anecdotal experience speaks against them: I’ve worked with a couple of talented people who failed such tests. I can also think of plenty of developer positions where pure logical thinking isn’t even such a crucial capability.

Coding challenge (async)

Predictive relevance for a good fit: low (hard skills), - (soft skills)
Predictive relevance for a bad fit: medium (hard skills), - (soft skills)

Time investment: 30 min
Subjectivity: high (due to missing context)
Filters out disproportionally: juniors (when requiring them to start a project from scratch or with a time limit), people with less time (often seniors), creatives (when setting too many rules), people who need precise instructions (when not giving enough guidance)
Automation: potentially high with tools like devskiller’s automated scoring; given my experience with how differently different team members interpret code, I don’t trust that stuff at all

Letting the candidate write some code at home seems like a great way to determine hard skills. If only. Sigh. In practice, a coding challenge can’t take too much time (candidates will drop off), it can’t be too complex (candidates will misunderstand instructions and deliver the wrong thing, and it’s hard to judge whether that’s their fault or yours), and it can’t be too specialised (otherwise you’ll miss the big picture). The resulting generic task(s) of a couple of hours will not give you a good picture of the candidate’s actual hard skills. Even with reduced scope you’ll lose candidates who are not motivated to invest multiple hours without any gain on their side: some seniors think they don’t need to prove themselves, and other candidates have a family, are moving, or have other interviews ongoing. If you can afford to lose candidates and want to find the most motivated ones (often juniors), enforcing the coding challenge might be a great tool. Besides being a time-hog for the candidate, the hiring team needs to review the results quickly so as not to drag out the application process. The team therefore needs to be ready to “jump” at any candidate who submits a result, interrupting daily work. Properly reviewing a single challenge takes roughly half an hour.

Often enough there were cases where the first reviewing team member asked a second person to review, to verify a judgement (is this code slightly above or below the acceptable threshold?). For senior developers, the definition of “good code” is narrow, while there’s a wide area of what is considered “bad code” (and large parts of both judgements are subjective). But even with two reviews, we ran into enough candidates who apparently could deliver code that looked okay at first sight, but who, when asked in detail, couldn’t explain how certain parts worked or why other parts didn’t make much sense. Nowadays it’s too easy to copy-paste together a dozen Stack Overflow answers and Medium posts.

There are also plenty of challenge results which seem obviously bad (why did the candidate add a huge library to solve this small-scale problem?), where the candidate’s answers led me to question our initial assessment. For the example question above, it was answers like: “Yes, I wouldn’t normally use that library for such a small project, but I know that your actual product is more complex, and so I thought that you’d want to see that I can deal with such more complex code.” or “Yes, this library isn’t optimal for the coding challenge, but I’ve been using it for a couple of years on other projects where it does make sense, and I used it here to speed up finishing the coding challenge, because I assumed you’d look at the time I took.” Do you know whether the candidate submitted a sloppy git history because they felt time pressure and thought no one would care about the git history? Did the candidate only add the tests because they thought they would be rejected otherwise, but would never write tests normally? Is the absence of tests a sign that the candidate is against testing, or just a sign of the candidate’s prioritisation to focus on different things when solving the challenge? Did the candidate use that pattern to show that they know it, or because they thought it was the best solution for the problem?

Without follow-up I don’t trust my own initial review anymore. Without follow-up it’s hard to identify people with dangerous half-knowledge, and it’s hard to know whether a solution is objectively bad or just subjectively bad because of different priorities, experiences, and assumptions. Of course you can state a great many rules and preferences in the coding challenge’s README, but trust me, people don’t read it thoroughly, or they don’t believe the instructions, or they think they are too smart to follow the same rules as everyone else. Limiting the solution space through instructions also inherently limits the candidate pool to people who are attuned to such instructions, filtering out a certain “creative problem solver” type.

One question is whether the challenge should leave the candidate to set up as much as possible from scratch, or mirror the job’s project in a simplified manner. Giving full freedom will lead to strange choices on the candidate’s side, which are hard to judge. Candidates also tend to spend too much time setting things up from scratch, not leaving enough non-setup code to judge. On the other hand, a senior should be able to make reasonable decisions, and some creative candidates shine when they are allowed to do things their way. Setting up a base project for the candidates allows for a shorter coding challenge and might mirror the work environment more closely (the team has already decided on library x). This benefits anyone who would get hung up on setting new things up (which is not an everyday task in many jobs) and is more junior-friendly, but it mostly benefits those who have already worked in a similar environment. It also makes it harder to identify developers with special creative strengths.

Then there’s the matter of time limits: setting a time limit on solving a coding challenge will have half of the candidates panic and underperform, whether they are capable or not. In reality, software developers rarely have to deliver against the clock, at least not in an unfamiliar problem space without colleagues around to help them. Without a time limit though, nearly all candidates will ignore the time suggestion and spend whatever amount of time they are willing to invest to keep their chance at the job. Again, this is great for finding motivated people with plenty of free time, but it makes it harder to compare results.

All in all, the results of coding challenges end up surprisingly different and therefore hard to compare.

Ok, so why bother doing a coding challenge still?

I’ve run across a couple of “near perfect” results: people who made reasonable tech stack decisions under time pressure, while maintaining a clean git history, writing clean code, and implementing the acceptance criteria. The developers behind those challenges turned out to possess those abilities in real life too. So a coding challenge is good for uncovering diamonds!

Besides that, the coding challenge lets you get a glimpse of a person’s true priority stack: Does that person like to push out features like crazy, or are they thorough? Do they prefer building UI or logic? This is a weak signal though, as a candidate will be strongly influenced by what they think the hiring team wants to see.

A coding challenge can also uncover some tell-tale signs of seniority. Even if a senior is sloppy, they’ll still follow a minimum of clean code principles and patterns. Less experienced candidates’ challenge results will often miss a certain “last finishing touch”.

As established earlier, due to the time investment required on the candidate’s side, the coding challenge can be used to filter for motivated people. If you struggle with too many candidates in the pipeline, forcing them to do the coding challenge before further interviewing will shorten the list. Just beware that the basic programming language snippet coding described below turned out to be a better indicator of hard skills for the same time investment on the hiring team’s side.

If you want to hire someone who can jump straight into your codebase and be productive with it, for example when hiring while the team is under delivery pressure2, or if the tech stack is much more specific than just “JS with a selection of the most popular libraries”, the challenge gives you another signal: high productivity or familiarity can be inferred from good progress with appropriate quality.

If you decide to ask a candidate to do a coding challenge: let the candidate know that the challenge result will be the starting point of the technical interviews. This makes the coding challenge more meaningful and reassures the candidate that they will get to explain their choices. In a later interview, ask the candidate to do a quick walk-through of the result. The candidate should answer high-level questions like: What would you have done next if you had had more time? What would you have written tests for if this were production code? Which areas would you have improved / refactored and why? Why did you add those libraries? If you can get your HR department to allow you to handle some feedback via a PR, there’s an even better feedback loop.

So this was a lot of opinions on coding challenges. Overall, I never managed to get the coding challenge to yield enough unskewed, trustworthy, valuable information to be worth the hassle. Your mileage might differ.

Basic programming language snippet coding (synchronous)

Predictive relevance for a good fit: low (hard skills), medium (soft skills)
Predictive relevance for a bad fit: high (hard skills), high (soft skills)

Time investment: 45 min
Subjectivity: low (clear cut-off threshold, high comparability of candidates)
Filters out disproportionally: juniors, candidates who fear delivering code live
Automation: -

Every job focuses on one or two programming languages, and there’s a minimum knowledge level of those language(s) that you expect from a new team member. Setting up a couple of super basic tasks which can be solved with small code snippets3 and having a live call with the candidate to go over them is a perfect way to ensure that the candidate possesses that minimum knowledge.
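For illustration, here are two hypothetical snippet tasks of the kind footnote 3 hints at (asynchronous operations and object mutation / cloning); the concrete tasks are my own invention, not a prescribed set:

```js
// Task 1 (asynchronous operations): what does this print, in which order, and why?
function task1() {
  console.log("a");
  setTimeout(() => console.log("b"), 0);
  Promise.resolve().then(() => console.log("c"));
  console.log("d");
}
task1(); // a, d, c, b: microtasks (promises) run before macrotasks (setTimeout)

// Task 2 (object mutation / cloning): why does original.address.city change,
// and how would you avoid it?
const original = { name: "Ada", address: { city: "London" } };
const shallowCopy = { ...original };
shallowCopy.address.city = "Paris";
console.log(original.address.city); // "Paris": spread only copies one level deep
// A candidate should be able to name a fix, e.g. structuredClone(original).
```

A senior should be able to talk through the event loop and shallow vs. deep copies without hesitation; a junior might need a nudge but should still arrive at a correct explanation.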

Even if the tasks are basic, they help to evaluate seniors: seniors will fly through the tasks, point out multiple options (or reason about why they chose the option they typed down), and be comfortable explaining the deeper language concepts behind the snippets. Contrary to the coding challenge, the synchronous nature allows you to gain immediate context.

In contrast to all the tools above, the synchronous call allows you to understand soft skills. You’ll immediately notice how comfortable a candidate is explaining their code.

The downside is that candidates who are uncomfortable coding on the spot will show a performance that is worse than their capabilities; they might be filtered out disproportionally. This effect is not too large though, if the snippets are small and isolated enough and candidates are allowed to skip a snippet when they get stuck.

Doing this asynchronously is not recommended, because you want to know where a candidate was not fluent and had to rely on outside resources for help.

Judging the hard skills of a candidate on this call is fairly unbiased: there are clear criteria you can define upfront. For example: a senior candidate should complete all code snippets within 30 min; a medium or junior candidate could take 40 min; for a junior candidate it might be acceptable if there’s an incorrect / unsatisfying answer to one of the snippets.
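A minimal sketch of such an upfront rubric, written down as data so that every interviewer applies the same cut-offs; the numbers simply mirror the example above and the helper function is hypothetical:

```js
// Sketch of an upfront scoring rubric for the live snippet session.
// The thresholds mirror the example in the text; adjust them to your own bar.
const rubric = {
  senior: { maxMinutes: 30, maxUnsatisfactoryAnswers: 0 },
  medium: { maxMinutes: 40, maxUnsatisfactoryAnswers: 0 },
  junior: { maxMinutes: 40, maxUnsatisfactoryAnswers: 1 },
};

function meetsBar(level, minutesTaken, unsatisfactoryAnswers) {
  const { maxMinutes, maxUnsatisfactoryAnswers } = rubric[level];
  return minutesTaken <= maxMinutes && unsatisfactoryAnswers <= maxUnsatisfactoryAnswers;
}

console.log(meetsBar("senior", 28, 0)); // true
console.log(meetsBar("junior", 38, 2)); // false: too many weak answers
```

Writing the bar down before the first call keeps the judgement comparable across candidates and interviewers.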

In my estimation, this live snippet coding has a good information gain / cost ratio. It doesn’t replace a longer hard skill assessment, because the code snippets should be so small that they don’t give the full picture of whether the candidate can build a bigger, useful product.

Code pairing session (synchronous)

Predictive relevance for a good fit: medium (hard skills), high (soft skills)
Predictive relevance for a bad fit: high (hard skills), high (soft skills)

Time investment: 1-2 h
Subjectivity: high
Filters out disproportionally: candidates with weak pairing skills and fear of delivering code live
Automation: -

A live code pairing session can be considered a good addition to the basic synchronous snippet coding. Whereas the former checks for basic language understanding and remains comparable and objective, the pairing session lets you dig deeper into building actual functionality and focus even more on the soft skill side, and it is therefore much more subjective and less comparable. I found a live pairing session to give all the context that the at-home coding challenge is lacking, while building up more of a personal connection. One hour of pairing is enough to finish a small, focused task, even on the actual product. Allow the candidate to use their own computer for the pairing; using a provided computer turned out to distract candidates, especially senior people, who tend to customise their environment.

Of course the live code pairing session suffers from the same issues as all synchronous tools: developers with low self-confidence, weak communication skills, or on the introverted side of the spectrum will be stressed and underperform under the pressure of delivering in real time in an unknown environment.

Pairing rarely fails to produce a clear opinion for or against a candidate, and it feels indicative of job performance. It’s especially easy to filter out bad fits.

The reason you might not want to do the pairing with all candidates is that it is a noteworthy time investment, which also needs to be scheduled, since it is synchronous. It’s a great tool as one of the final filters for a well-calibrated list of late stage candidates.

System design whiteboard session (synchronous)

Predictive relevance for a good fit: medium (hard skills), high (soft skills)
Predictive relevance for a bad fit: high (hard skills), high (soft skills)

Time investment: 45 min - 1.5 h
Subjectivity: high
Filters out disproportionally: candidates with a specialised skill-set (non full-stack); candidates who are not used to thinking in larger system designs
Automation: -

If you require “builders”, let the candidate draw up a rough sketch of a larger hypothetical system, or re-engineer an existing complex part of your application. This requires the interviewer to do a good job communicating expectations and use-cases, as well as providing enough information for good trade-off decisions. If done well, you’ll identify candidates who are comfortable with the full stack, who are able to ideate new greenfield systems, or who could bring existing systems to a new level. It’s easy to turn the whiteboard session into a longer exchange of opinions and experiences on a wide variety of technical topics: What’s the right database (schema) for this? Is this a use-case where REST makes more sense, or GraphQL? How can we deal with caching? How do we optimise the workload for costs, scaling, or stability? Would we go the microservice route or with a majestic monolith? Go serverless? Which programming languages and libraries make sense here? Which cloud provider would be best suited to run the workload? A junior might have fewer experiences to share, but can stand out with creative ideas.

This tool is useful at a late stage of the application process. It comes with the benefit of requiring a deep technical discussion without getting bogged down in implementation or language details. The soft skills are as important as the result.

One could argue that this interviewing step is a bad fit for jobs where bigger-picture thinking is not regularly required. For a small team I’d argue against that: Any engineer should be able to come up with an initial proposal for a complete use-case of an application. Otherwise I’d consider the candidate to be too specialised.

Besides the point that any developer should be able to draft a system, this interview tool lets me gauge how quickly someone grasps real-world problems and solves user needs: Is the candidate asking good questions? How quickly do they identify the challenging parts? Are they finding the edge cases? Are they willing to make judgement calls to solve them, or do they fall back on the interviewer? The same goes for personality: Is the candidate confident and optimistic about the outlook of designing something new? Are they keen to throw out unfinished ideas and iron out details later, or do they take time to think things through before proposing a solution? How much is the candidate involving the interviewer?

If you have separate people working on front-end and back-end development, this interviewing tool is a great way to “bridge the gap” and get opinions on a candidate from both angles.

Interview Conversation

Predictive relevance for a good fit: low (hard skills), high (soft skills)
Predictive relevance for a bad fit: medium (hard skills), high (soft skills)

Time investment: 2 h
Subjectivity: high
Filters out disproportionally: candidates lacking soft skills or self-marketing skills
Automation: -

To gauge soft skills, there’s no way around having conversations. I’m not a fan of throwing unrelated technical questions at people without a clear thread. For assessing technical prowess, most tools described above are better fits; those tools, however, lack a clear intent of grasping the personality. Sure, any synchronous technical conversation contains pointers to character, but it’s only that. Whereas hard skills can be learned given the will and resources, it’s much harder to re-shape a person’s behaviour and character. So if you hire someone who doesn’t fit snugly like a glove on day one, weigh soft skills much higher than hard skills. Focus the interviewing on finding out as much as possible about the person(al traits) and character relating to the desired profile. The only other thing that should be touched on is red flags4.

Ask situational questions that give insights into the candidate’s traits: Tell me about the last time you had a heavy disagreement with a colleague! Go deeper: You mentioned that you brought the topic to your manager instead of resolving it with the colleague. Why did you decide that way? And so on. Stories about past behaviour tend to show a more honest view of a person’s character than hypothetical questions. However, don’t overuse situational questions either: they take a lot of time to explore in full depth, and people might draw a blank trying to find a good situation to tell about.

Besides the fit of the candidate’s character with the position, the reverse is just as important: Can the company honestly offer an attractive outlook for the candidate? Don’t hire someone knowing that your workplace is not suited for them to succeed and grow! It’s pretty easy to understand the motivation of a candidate and what they are looking for. Cover the basics with the following questions:

Not everyone can interview in a way that feels like a natural conversation while keeping the topics on point, especially when it comes to more personal questions. Train the people in the team best suited for this. Don’t expose a candidate to more than two interviewers at a time: it will be unnatural, intimidating, and make it harder to keep the conversation flowing coherently. Rather, split up the questions you’d like to have answered among your team and let everyone have a slice. No slice should be shorter than half an hour, otherwise the conversation feels too chunked and requires the candidate to adapt to too many people. By splitting up interviewing duties, you’ll get a diverse set of subjective views on the candidate’s soft skills and fit.

All subjective inputs on judging a candidate are a double-edged sword. As team lead, I learned over time to trust my gut saying “no” more than other team members’ “yes”es5. In return, the team lead’s “yes” should not overpower the team’s “no”s either. Beware of the bias to like people who are similar to yourself. And always make sure to understand whether the candidate belongs in the camp of people who can sell themselves well (discount your feeling of their expected skill a bit) vs. the more introverted candidates who leave a worse impression than their actual performance would warrant.

As stated, there’s no way around spending time in conversation with a candidate to understand the soft skills and fit. It’s messy, time-intensive, subjective, hard to compare, bias-laden, and it takes energy. As a software developer it’s easy to discount the importance of irrational or unmeasurable stuff like soft skills. As pointed out above, they are the more important ones though, because they are harder to teach than hard skills. In the end, you should only hire candidates you would be excited to work with, who’ll make your team stronger, people you’ll be able to learn from (or whom you’ll be able to develop to that point). If you don’t feel good energy after an interview, don’t bother.

Phew! It depends on your specific situation, but it’s definitely a combination of the tools outlined above. If you are not overloaded with candidates (or have strict pre-filters), I’d consider skipping the at-home coding challenge and replacing it with the more expensive, but information-rich, synchronous equivalents (pairing).

Keep in mind that all of the tools have the trade-offs I outlined before. Often you’ll realise that you are filtering candidates too strongly in one phase. Once you lower your bar in that interview step, you may find that you overcorrected, and that the following interview phase needs to become stricter so that too many candidates don’t reach the phase after that. The better you get at filtering candidates with the tools that are cheap for you as the person responsible for hiring, the more time you can invest in the candidates who make it through.

If you asked me for my preferred interviewing set-up for a tiny team, I’d answer:

Footnotes

  1. The degree to which the job market is a buyer’s or seller’s market of course plays a crucial role.

  2. Well, in this case you are fucked anyway, because hiring eats shit-tons of time, and people will only be available weeks or months later. It would make more sense to hire freelancers without much interviewing (and to fire them quickly if they turn out to be duds).

  3. In the case of JS e.g.: asynchronous operations, object mutation / cloning, equality checks, and some data wrangling.

  4. E.g.: Why did you switch employer 5 times in the last 5 years?

  5. Maybe it’s because the manager knows about the increased responsibility and will have to deal with people problems. Maybe it’s also because the team lead’s intuition is slightly more trained due to increased exposure to different candidates. Who knows? 🤷‍♂️

  6. That is challenging in some bureaucratic countries (🇩🇪 ahem).