the1's WebLog

Programming Challenge: Phraser

Want to prove that you are the best programmer money can buy?  (OK, I know you are not for sale, but your boss may need a friendly reminder that it's time for your next big raise.)  Here's your chance:

 

On a telephone keypad, the number keys 2 -- 9 also carry letters on them (see the picture below).  Your task is to write a program that finds all possible ways to phrase a phone number.  By "phrase" I mean using a mixture of numbers and English words to spell out the number.  For example, the number 642-394-6369 can be phrased as nice-window, nice-wind-ox, nice-win-fox, nice-9-in-fox, etc.

 

 

Your input includes a list of English words, as well as the phone number as a sequence of digits.  For testing your program, you can use a basic English word list I found here.

 

The merit of an entry will be judged mainly by how clear the code is, not necessarily the efficiency.  You can choose your favorite programming language, as long as it gets the job done (If it's not a widely known language, you probably want to make sure the code is self-documenting, so other people can understand why your code is so cool).

 

I will post my own solution in several days.  (In case you are wondering, it is not in C++.)

 

Enjoy!

Published Monday, March 29, 2004 10:11 PM by the1

Comments

 

Joe Lisp User said:

This is pretty similar to the programming problem that was used to compare Java to C/C++ (and later to Lisp) a few years ago:

http://www.norvig.com/java-lisp.html
March 29, 2004 11:33 PM
 

Anonymous said:

I didn't see you mentioning payment or a prize? Do you expect someone doing your job for free? :-)
March 29, 2004 11:58 PM
 

the1 said:

Dear Joe,

Thanks for the pointer! I wasn't aware that there was a similar contest. But we shouldn't let that take away the fun, should we? Let's try to beat Norvig!

Dear Anonymous,

By winning the contest, you gain unlimited bragging right. Didn't I also mention your boss's reaction? Come on, there is no excuse for not doing it. :-)
March 30, 2004 1:17 AM
 

senkwe said:

Hey, isn't including numbers in the phrase a cop out? eg "nice-9-in-fox". Shoudn't there be a few constraints on how many numbers are allowed in the phrase?
March 30, 2004 8:31 AM
 

jaybaz_MS's WebLog said:

March 30, 2004 1:08 PM
 

the1 said:

Dear Senkwe,

Let's keep it simple. For now I don't care how many numbers you use in the phrase. The idea is to get ALL different ways to phrase your phone number. Additional constraints (e.g. 3 digits allowed at most) can aways be added later to filter the result. The principle of modular programming says we should do only ONE thing (and do it well) in one module.
March 30, 2004 1:10 PM
 

B.Y. said:

OK, I have a question. When you say "how clear the code is", does the word code mean code only, or code and comment ? If I have the following two entries:

*entry A: simple code without comment.
*entry B: slightly more complex code than A, but it has a ton of comments to make it easier to understand.

which entry is considered better in this contest ?



March 30, 2004 2:09 PM
 

the1 said:

Dear B.Y.,

You are really serious about this. :-) This is a contest for clear code. While good comments are important, they are not the focus. (Actually often I don't trust comments, because I've seen so many cases when they are out of sync with what the code really does -- programmers always update the code and forget to change the comments accordingly).

Good code should be clear on itself and require few comments. Bad comments are worse than no comments at all. I hate people writing comments just for comments' sake, e.g.

a++; // increase the value of a by 1

In your example, A will be preferred -- as long as it can be easily understood without comment.

Finally, I don't want to pretend being an authority on code clarity. There is no judge for this contest. :-) The whole purpose of this chanllenge is to have some fun and learn from each others' different angles of solving problems. (OK, I lied about the big raise, but if we learn something from this, it can be applied to get us closer to the raise.)
March 30, 2004 3:38 PM
 

Jay Bazuzi [MS] said:

Here are some thoughts on comments: http://weblogs.asp.net/jaybaz_ms/archive/2004/03/22/94066.aspx

If your code can't be understood without comments, then either add them or simplify. (Doing neither is bad!)
March 30, 2004 4:57 PM
 

CalvinH [MS] said:

My boss forwarded me this blog<g>

I have a 100 line solution (with comments) not written in C++. It uses a 53000 word dictionary and does not have much optimization, although I have an optimized version too.

How should I submit this entry? Should I just post it here?

thanks
March 31, 2004 1:28 PM
 

the1 said:

Dear Calvin,

Wow, the first entry! Congratulations!

You can either post the code here, or put it on your own blog (if you have one or want to create one) and provide a link.

If you post it here, the format and indentation will be lost. :-( Let's see how it looks, and if necessary, you can email me (Zhanyong Wan) the code and I'll post it in its original format.

Also, can you provide a link to your dictionary so others can also use it? Thanks!
March 31, 2004 1:37 PM
 

VS DATA Team's WebLog said:

March 31, 2004 8:19 PM
 

CalvinH [MS] said:

Here's a link to the programming challenge solution on my blog. It also includes a link to the dictionary I used.

http://blogs.msdn.com/vsdata/archive/2004/03/31/105159.aspx
March 31, 2004 8:23 PM
 

VS DATA Team's WebLog said:

March 31, 2004 10:29 PM
 

the1's WebLog said:

April 1, 2004 4:43 PM
 

Justin Rogers said:

I made some changes to the algorithm by cutting out single character responses. I don't bother placing the dashes in since I don't think they are uber important, but I could put that feature back in if necessary. The code is hosted on.

http://weblogs.asp.net/justin_rogers/articles/105916.aspx

I'm using the cheapo dictionary for now, and I've done 0 optimizations. The only true optimization is in the switch table.

One thing to note. You can really optimize the algorithm in the case that a 1 or 2 somehow bisects the numbers. When you do this, you only have to process two substrings, reinserting the splitting character where necessary. This can result in a huge performance gain. However, a good deal of the numbers you would process don't contain a 1 or a 0, and so the perf gain is lost.
April 1, 2004 5:45 PM
 

Justin Rogers said:

And my algorithm isn't properly convergent either, so I'm not yet completed.
April 1, 2004 5:55 PM
 

the1's WebLog said:

April 1, 2004 7:28 PM
 

the1's WebLog said:

April 2, 2004 5:11 AM
 

the1's WebLog said:

April 2, 2004 4:00 PM
 

Ken Robertson's Blog said:

April 2, 2004 4:46 PM
 

Ken Robertson's Blog said:

April 2, 2004 4:47 PM
 

the1's WebLog said:

April 6, 2004 7:50 PM
 

the1's WebLog said:

April 6, 2004 7:50 PM
 

the1's WebLog said:

April 6, 2004 7:55 PM
 

the1's WebLog said:

April 6, 2004 8:48 PM
 

Frans Bouma's blog said:

April 7, 2004 5:31 AM
 

Nick Berardi said:

"Nice" isn't in the dictionary list that you provided.
April 7, 2004 7:56 AM
 

Nick Berardi said:

neither is "in", "win", or "fox". just to let everybody know for testing purposes.
April 7, 2004 8:05 AM
 

Nick Berardi's Blog said:

April 7, 2004 1:27 PM
 

Nick Berardi's Blog said:

April 7, 2004 1:40 PM
 

Nick Berardi said:

Here is my submission for the contest. You can find the code at http://blog.zigamorph.com/Default.aspx/archive/2004/04/07/158.aspx

In addition you just need to take the word is in this blog and add "nice", "in", "win", and "fox". Then name it "Dictionary.txt" and put it in the same directory as your program.

Have fun and if you find any bugs please submit it to my blog for the code.

For all of you that want to know how long this too, it took about 3 hours to wrap my mind around the problem and code it. So please forgive spelling errors.

Thanks, Nick
April 7, 2004 10:48 AM
 

Nick Berardi said:

Oh one more thing. In you original post you mentioned that there was 160 solutions for the number 6423946369 but my program returns 228 possible solutions. I have verified all the data am I missing something in my program? Anybody see an error, or was there an error with the original dataset?
April 7, 2004 11:03 AM
 

the1 said:

Nick,

Your result list is bigger because your word list contains "ox", which mine doesn't have.
April 7, 2004 1:06 PM
 

Mike said:

April 7, 2004 11:03 PM
 

Nick Berardi said:

Ah..

That makes more sense, I had to add that to get one of the 4 results you had listed above.
April 8, 2004 9:29 AM
 

Nick Berardi said:

Hey Mike,

Nice easy solution to look at. But what dictionary did you use. Because the one I got off the site above doesn't seem to work. Well at least there seems to be a loop that never ends?
April 8, 2004 9:40 AM
 

Brian C Robinson said:

Ok, here's my beautifully innefficient but correct and clear solution:

http://www.herassmygod.org/bcr19374/programming/Phraser/Phraser.py

Tests available here:

http://www.herassmygod.org/bcr19374/programming/Phraser/PhraserTest.py

And an example of using the class (assuming words.txt in the current directory is a text file with one word per line, as the example stated in the problem):

http://www.herassmygod.org/bcr19374/programming/Phraser/PhraserRun.py
April 9, 2004 10:24 AM
 

the1's WebLog said:

August 20, 2004 5:31 PM
 

historic black colleges in atlanta ga said:

historic black colleges in atlanta ga

September 11, 2007 5:35 PM
 

foxpro.catalyst ?? » Blog Archive » NF . Channel9 said:

November 11, 2007 8:02 PM
 

the1 s WebLog Programming Challenge Phraser | Paid Surveys said:

June 2, 2009 2:58 AM
 

the1 s WebLog Programming Challenge Phraser | debt solutions said:

June 15, 2009 8:56 PM
 

the1 s WebLog Programming Challenge Phraser | low cost car insurance said:

June 17, 2009 12:50 AM
Anonymous comments are disabled

© 2009 Microsoft Corporation. All rights reserved. Terms of Use  |  Trademarks  |  Privacy Statement
Microsoft
Page view tracker