Fabulous Adventures In Coding
Eric Lippert is a principal developer on the C# compiler team.
Interviewing job-seeking candidates is probably the most impactful thing that I do at Microsoft as far as our business is concerned. Sure, the day-to-day work of implementing the compiler is of course what I am specifically there to do. But ultimately nothing impacts the bottom line of our division more than preventing bad hires and encouraging good hires. The dozens of people that I’ve interviewed who got hired will collectively deliver much more value (or do much more damage!) than I can alone. So I think about it a lot.
I find it interesting to notice the common threads that show up in the surprisingly disparate group that is Microsoft interview candidates. A big red flag for me that I see fairly often I jokingly characterize as a form of magical thinking. Now, as someone who values diversity and cherishes the inalienable human right to believe in any old crazy thing you want, I of course do not care at all if candidates engage in magical thinking on their own time. But magical thinking about software concerns me greatly. For example, I should never be allowed anywhere near network drivers, as my beliefs about routing are clearly magical in nature.
The trouble is that I occasionally see this sort of thing in people who are not making silly jokes about it.
The technical interview question I usually ask is a deliberately vague and abstract version of a real problem that I had to solve back when I was working on the VBScript engine. The details aren’t germane to this particular discussion, but suffice to say that every candidate quickly figures out that a necessary step in solving the problem is the automatic generation of a guaranteed-to-be-unique “cookie” value of some sort. The “cookie” is used to track and control the progress of a “task” being performed by a “server”.
You can learn a lot about a candidate from their approach to this sub-problem. Candidates, sensibly enough, always try to solve the unique cookie problem by using an off-the-shelf tool that they are familiar with, like:
There are pros and cons to all of these approaches; none of them is necessarily “right” or “wrong”. (And we explore the pros and cons as the interview continues.) Where I start to see magical thinking though is when I ask the candidate to assume that their chosen solution is simply not yet implemented on their platform. Rather, they are the developer who needs to implement the tool if they want to use it in the overall solution.
I fondly remember the moment of dawning comprehension when I asked a particular candidate “But suppose you had to write the code in the database implementation that auto-generates the primary key on the table you are using to solve this problem. How would that code work?” The candidate was completely taken aback, and just stared at me for a moment before saying “wow, I never before thought about the fact that someone had to write code that does that.” Apparently in his world creating primary keys is done by the primary key pixies who live in the b-tree forest. :-)
Turns out a lot of people think that GUIDs are generated by the GUID goblins, that random numbers are created by the RNG ogres, and so on.
In reality, all of these tools I mentioned were implemented by writing code. And therefore someone had to analyze the problem space, weigh the costs and benefits of possible approaches, design a solution, implement a solution, test it, document it and ship it to customers. No magic involved!
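To make that concrete, here is a minimal sketch of what the code behind an auto-generated key might look like. It is written in Python purely for brevity, and the name `CookieIssuer` and its method are hypothetical, not taken from any real database or from the VBScript engine. It assumes a single process and does nothing about persistence across restarts, which a real implementation would have to address.

```python
import itertools
import threading


class CookieIssuer:
    """Hands out integer cookies that are unique for the lifetime of this object.

    Hypothetical illustration only: a real key generator must also survive
    restarts, crashes, and multiple machines.
    """

    def __init__(self, start=1):
        self._counter = itertools.count(start)  # yields start, start + 1, ...
        self._lock = threading.Lock()           # serialize access across threads

    def next_cookie(self):
        # Taking the lock ensures two threads can never be handed the same value.
        with self._lock:
            return next(self._counter)


issuer = CookieIssuer()
first = issuer.next_cookie()
second = issuer.next_cookie()
```

Even this toy version forces the kind of questions the interview is after: what happens when the counter overflows, when the process restarts, or when a second server starts issuing cookies too.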
Of course this is not to say that we shouldn’t treat abstractions as abstractions. You don’t want to be relying upon the implementation details of an abstraction, and you don’t want to waste mental cycles understanding precisely how every tool does its job right down to the movement of the electrons. Moreover, there’s nothing wrong at all with being a developer who takes existing tools and strings them together to make something new; we all do that. But what I want to know is can the candidate make their own tools? Because that’s the business my team is in: making tools.
Don't forget the old axiom: "Any sufficiently advanced technology is indistinguishable from magic."
That's why I want to hire magicians, because they obviously can create sufficiently advanced technology.
Ah, but don't forget the slightly less old axiom "Any sufficiently arcane magic is indistinguishable from technology." -- Eric
My favorite question to ask is "describe a linked list." You would not believe how many BA/BS-level CS graduates cannot even begin to answer the question. How sad and pathetic is that?
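For readers who want a reference point, a singly linked list can be sketched in a few lines. This is a minimal illustration in Python with hypothetical names (`Node`, `prepend`, `to_list`), not a claim about what answer the commenter expects.

```python
class Node:
    """One cell of a singly linked list: a value plus a reference to the next cell."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def prepend(head, value):
    """Return a new head whose next cell is the old head. Constant time."""
    return Node(value, head)


def to_list(head):
    """Walk the chain from head to the end, collecting values in order."""
    values = []
    while head is not None:
        values.append(head.value)
        head = head.next
    return values


head = None  # the empty list
for v in (3, 2, 1):
    head = prepend(head, v)
```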
Philosophically, could it be better to, say, call an API *without* knowing or inferring the implementation -- that one should design their application to the presented interface, and not anticipate the implementation? Even if they knew the details of API call implementation, it seems a bit of a violation of the unspoken contract between caller and implementer for the caller to possess prescience or insight into what should be a black box -- to winkingly circumvent the trust factor that requires that the caller believe that the API delivers ALL OF and ONLY WHAT is advertised by its signature (and supporting documentation). If that trust factor doesn't exist, don't we run the risk of becoming lazy in our implementations, and expecting our callers to anticipate our implementations? What happens when we change the implementation? For example, routing will always be routing, but what happens when we give the fairies blue wings instead of green ones? Does the guy with the magic network wand REALLY need to know? But when his wand's magic lightning is calibrated/optimized for green-winged fairies (he got a trip to the fairy circle as part of his wizard MVP program, to torture the metaphor beyond all reason), suddenly, we -- as the fairy masters -- have reneged on an IMPLICIT contract between (a) caller and implementer, even if the EXPLICIT broader one remains unchanged.
Hey, just asking. I'm not saying ignorance is bliss, but it arguably has its benefits.
There was a Stack Overflow question similar to this that I asked a few months ago:
I embarrassed myself with my limited knowledge of probability, but got a moment of dawning comprehension with finding out about base 62.
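For anyone unfamiliar with base 62, the idea is to write an integer using the 62 characters 0-9, A-Z and a-z, which keeps an identifier short and free of punctuation. The sketch below is one possible encoding in Python. The alphabet order and the function names are assumptions made for illustration, and the linked question may have used a different convention.

```python
import string

# 10 digits + 26 uppercase + 26 lowercase = 62 symbols (this ordering is an assumption).
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
BASE = len(ALPHABET)


def to_base62(n):
    """Encode a non-negative integer as a base-62 string."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, remainder = divmod(n, BASE)
        digits.append(ALPHABET[remainder])  # least significant digit first
    return "".join(reversed(digits))


def from_base62(text):
    """Decode a base-62 string produced by to_base62 back into an integer."""
    n = 0
    for ch in text:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

Each character carries a little under six bits (log2 of 62 is about 5.95), so a 64-bit number fits in at most eleven characters.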
I think that every computer science student should be required to write a script interpreter that supports recursion. I have written several over the years just for lack of anything better to do, and each time I decide to write one I learn a lot more about how code really works and therefore gain insight for general coding. If you are going to code, it is very good to have a real understanding of how your code runs so that you can write your code to run more efficiently: creating 'virtual memory stacks', managing pointers to functions and passing values. I would like to see more people that are capable of 'understanding the magic' as well. We need more wizards!
I remain unconvinced that certain aspects of software development cannot be lumped under the 'magical thinking' or 'it's too hard let's go shopping' mindset. Examples that come to mind include cryptographic modules and (somewhat related) RNG or hash generators. I believe that the VAST majority of developers and programmers are best served by using tried and tested implementations instead of having to come up with their own, and here's why:
1. Unless you're IN the crypto business, there's a significantly huge chance that your 'improved algorithm' will stuff things up. Heck, even if you're in the crypto business you can bugger this up easily.
2. Even if you are working off a known algorithm (e.g. AES/Rijndael) you could very easily introduce subtle bugs that mess up your crypto module.
I mind me the time when Debian included a slightly less than useful RNG in their distros simply because someone thought it would be a good idea to monkey around with the standard implementation (and don't let facts get in the way of a good story, right?) - I imagine anyone proposing a new implementation would have to jump through flaming hoops and swim through a pool filled with sharks and alligators to prove his/her code correctness before proceeding one step further.
In these cases, I suspect it would be better to port a working, tested code module and iron out the wrinkles instead of having to implement it from scratch.
Absolutely it is a terrible idea to try to write your own Security-with-a-capital-S code unless you are a professional with many years training and experience in that field, and the support of other professionals who know what they are talking about. I agree strongly with your point in that respect, and have made that same point frequently in this blog.
But there is an important difference between understanding the fact that the implementation details of certain algorithms are so fraught with difficulties that most mortals ought not to attempt them, and the inability to talk about what basic techniques you might use to build that algorithm were you able to do so correctly.
One is the recognition that a little knowledge is a dangerous thing. The other is a lack of understanding of the basic parts that underlie any nontrivial algorithm. If a candidate has difficulty determining the relationship between number of bits in a timestamp and the time granularity of that timestamp, or has no ideas about how to efficiently remember that a given unique number has been used before, then they're unlikely to be successful implementing any algorithm in the compiler. -- Eric
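The two examples in that reply can be sketched briefly. The following Python is an illustration under stated assumptions, not the interview's expected answer: the function and class names are hypothetical, the timestamp is assumed to be a plain unsigned tick counter that must not wrap within a chosen time span, and "efficiently remember" is interpreted as a hash set with average constant-time lookup.

```python
def timestamp_granularity_seconds(bits, span_seconds):
    """Smallest tick length such that 2**bits ticks cover span_seconds without wrapping.

    More bits buy finer granularity over the same span, or a longer span
    at the same granularity.
    """
    return span_seconds / (2 ** bits)


class UsedNumbers:
    """Remembers which numbers have already been handed out."""

    def __init__(self):
        self._seen = set()  # hash set: average O(1) membership test and insert

    def claim(self, n):
        """Return True if n was unused (and record it), False if it was already used."""
        if n in self._seen:
            return False
        self._seen.add(n)
        return True
```

For example, a 32-bit field that must cover 2**32 seconds (roughly 136 years) can tick no finer than once per second, while a 16-bit field covering a single day ticks about every 1.3 seconds.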
You make a very good point about the complex algorithms involved in cryptography and why in many cases it is best to use a proven algorithm. I think that everyone here would agree with you that you don't have to know how a known algorithm you are using works. But even when using a proven algorithm implementation or something in the framework it is not good to think of it as "Magical."
It's not that you have to know how the algorithm works. I think the point here is that it is good to know that something works because someone else made it work. It is good to recognize that the API or algorithm you are using is subject to the same computational constraints as your code, and is not simply "Magic."
Ah... the light dawns. Yes, I agree that people need to know that software is pretty much written by people, and that *someone* wrote the code, or implemented the algorithm. The OP as much says so, and that part of it I quibble with not in the least.
It's just that I see in the example provided two algorithms I would not want to implement on the spot, as it were - the GUID and random number. I don't recall which factors go into calculating random numbers offhand, you see, and I certainly couldn't tell you which pseudo-random number generating algorithm is the best for any given circumstance. Alas, I was suckered by the example and not the greater principle. Mea culpa.