One of my favorite Apple references on the Simpsons comes from the following clip. During a school assembly, Kearney tells his buddy, "...take a memo on your Newton: Beat up Martin." He scribbles away only to see the Newton's handwriting recognition interpret his memo as "Eat up Martha." With my Simpsons-geek friends, I still sometimes use this quote in completely inane contexts.
Well, fast forward (or is it rewind?) to yesterday. It was late at night and I had to stop by a bank to deposit some checks. Now, these days I rarely use checks (or to the Canadians reading this – cheques) but from time to time people give me money and a che(ck|que) is what they give me.
I drove up to the bank and walked up to the ATM machine. My plan was to put the che(que|ck)s in the envelope and deposit them into my account. However, I could see that the little slots for dispensing envelopes were sealed off, never to be opened. “Great,” I thought. “Do I have to wait until tomorrow when they open it up again? Why would they close this off? I’m sure I’ve deposited at night before!”
But I continued to examine the ATM. There was a Deposit button on the front of it, so I figured I’d try pressing it first before giving up and heading home. I inserted my card and entered my PIN, then pressed the Deposit button.
A set of instructions popped up. It said “Put all of your checks and cash in one pile and insert into the machine.” I did not have to separate them, I just put them all together. Some of my che(ck|que)s were of different sizes and all of them were of different values, but I followed the instructions and put them in. I figured I’d have to enter in the amount later. After all, that’s how you normally deposit money into an ATM – you put the money into an envelope, insert it into the machine and then type in how much everything is worth, cash and checks.
But here’s the thing that caught me off-guard.
I watched as it separated and scanned all of the checks and then displayed the images to me on-screen. “Hmm, that was quick.” Then, it told me the full amount of the checks I deposited!
“Holy sh*t!” I said.
Why did I say that? The ATM had scanned the checks and read the values – in human handwriting – and interpreted them as actual numerical values. It added them up and displayed the correct value.
I was very surprised to see that. Machines are okay at reading in text and then OCR’ing it into digital text that a computer can manipulate. However, in my experience, OCR is pretty buggy and only mildly reliable. That’s why we have reCAPTCHA – because computers are not that good at reading text with bits of random crud in it. Humans are needed to do that difficult grunt work.
Yet here was a service that not only is reliable enough to OCR text, it is reliable enough to OCR human handwriting... so reliable that the bank feels good enough to use it in their ATM machines where it will be used by many, many customers who will be providing feedback on this new feature.
If a large business like a bank is willing to OCR handwriting, then how difficult is it to do reliable OCR on CAPTCHAs, anyhow? Handwriting is variable because each of us does it differently. Most of us print, while a few of us use cursive, but it all contains random crud. If recognizing handwriting is not all that difficult, then recognizing text in an image can’t be much more difficult, either.
We’ve known for a long time that CAPTCHAs are broken, but what we mean when we say that is that machines can interpret a CAPTCHA something like 20% of the time (or a bit more, or a bit less). And if you automate it, you can do it as much as you want. Of course, rate limiting can make random guessing a lot less productive.
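To put some rough numbers on that, here is a minimal Python sketch of how quickly automated retries pile up. It takes the ballpark 20% figure from above at face value and assumes every attempt is independent, which a real CAPTCHA service may not allow; the function name and the attempt counts are made up for illustration.

```python
def chance_of_at_least_one_success(p: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent tries succeeds.

    p is the per-attempt success rate (0.20 here is only the rough
    figure quoted in the post, not a measured value).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    # Complement rule: 1 minus the chance that every single try fails.
    return 1.0 - (1.0 - p) ** attempts


if __name__ == "__main__":
    # A rate limit simply caps how large `attempts` can get in a given window.
    for attempts in (1, 5, 10, 20):
        chance = chance_of_at_least_one_success(0.20, attempts)
        print(f"{attempts:>2} attempts -> {chance:.1%} chance of at least one solve")
```

The curve climbs fast, which is why capping the number of attempts matters more than shaving a few points off the per-attempt rate.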
But what if random guessing became a lot less random? If an ATM can recognize the handwriting of five different people, then I am supposing that a hacker could probably figure out how to beat some of the more common CAPTCHAs (what’s funny is that for many websites that use these, I cannot figure out about 1/3 of them).
But on the other hand, maybe the problem of figuring out handwriting isn’t a big deal. All an ATM has to do is interpret numbers (in the value field on the checks); it doesn’t have to actually read letters, only numerals. Perhaps that makes this a lot easier and the barrier to the next level of reading letters (and hence, breaking CAPTCHAs on the first try) is still pretty daunting. You only need to recognize 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9. That’s not that many.
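One crude way to see the gap between "digits only" and "letters too" is simply to count what the recognizer has to choose between. The Python sketch below does only that; it says nothing about how accurate any real recognizer is. The six-character length and the mixed-case alphanumeric alphabet are assumptions picked for illustration, not something any particular CAPTCHA is known to use.

```python
import string

DIGITS = string.digits                               # what a check amount needs
ALPHANUMERIC = string.ascii_letters + string.digits  # assumed CAPTCHA alphabet


def possible_strings(alphabet: str, length: int) -> int:
    """Number of distinct strings of `length` characters drawn from `alphabet`."""
    return len(alphabet) ** length


if __name__ == "__main__":
    length = 6  # hypothetical length, chosen only for the comparison
    for name, alphabet in (("digits", DIGITS), ("alphanumeric", ALPHANUMERIC)):
        print(f"{name}: {len(alphabet)} classes per character, "
              f"{possible_strings(alphabet, length):,} possible strings")
```

Going from 10 classes to 62 per character is already a big jump, and that is before any deliberate distortion gets added.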
Still, the ability to read handwriting is pretty cool. Hopefully this is not another step in machines’ quest to take over the world.
Technology marches on and all, but I do think there's at least one substantive difference between a che(que|ck) and a CAPTCHA. Though quality of handwriting is definitely variable, as you mention, I don't think most people slur their block letters (and/or Arabic numerals) into one another enough to produce the sort of ambiguity that's present in a CAPTCHA image. An extra bar drawn through an entire word can turn an "F" into an "E" or an "O" into a "Q", which the CAPTCHA-cracker is going to have to try to work out. The che(que|ck) parser can operate under the assumption that the user isn't trying to screw with it and go one letter at a time, without considering whole-word distortions and outside context. After all, you WANT your check to be accepted!
All this to say that I think the ATM is receiving far more comfortable input than a CAPTCHA-cracker does. It might be interesting to see if you could get the ATM's parser to fail by, say, taking a thin marker and swiping a random line or two through your name and the dollar amounts.
I want to try this now just to see if the ATM can recognize my atrocious handwriting.
Also, I like how you bend over backward to please folks with check/cheque, but then write "ATM Machine".