I came across the RegEx Side blog this week (hat tip to Jason Haley), where Brendan asks the question "Computer Science is not teaching regular expressions?" I have to admit that, except when I taught an independent study using Perl, I didn't teach regular expressions. I learned them (a little) in my own education, but with all the CS credits I took across my undergraduate and graduate programs, you would hope so.
I never used them much because, frankly, without supporting software they were pretty much something one learned in order to understand parsing. They were not something one used much unless one was doing real compiler development. That's all changed today, though. Many modern languages have classes, methods, or other routines that make it much easier for programmers to use regular expressions. The .NET Framework and the standard Java library both include regular expression tools. Languages like Perl are pretty much all about the regular expressions.
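To give a feel for how little code this takes now, here is a minimal sketch using Python's standard `re` module. Python is my choice for illustration only; the date pattern, the `find_dates` helper, and the sample sentence are all made up for this example.

```python
import re

# Hypothetical pattern for illustration: ISO-style dates such as 2007-03-15.
# The parentheses capture year, month, and day as separate groups.
DATE_PATTERN = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def find_dates(text):
    """Return a (year, month, day) tuple for every date-like match in text."""
    return [
        tuple(int(part) for part in match.groups())
        for match in DATE_PATTERN.finditer(text)
    ]


sample = "Assignments are due 2007-03-15 and 2007-04-01."
print(find_dates(sample))  # [(2007, 3, 15), (2007, 4, 1)]
```

Note that the pattern only checks the shape of the text. It would happily accept 2007-99-99, so real date validation still needs ordinary code.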
It's probably time to find some room to cover regular expressions earlier in CS education. I'm just not sure where. There is clearly no more room in the AP CS curriculum. That course may be too full already. Does it belong in a first programming course? I'm not so sure about that. It's a good thing for a serious CS student or future professional developer to know but it can be a little bit much for a brand new programmer to get their head around. So where does it fit? I'm open to suggestions.
If students do not learn it at the high school level I sure hope they learn it in college though. Regular expressions are a powerful tool.
A few thoughts (from a very non-teacher :)):
1) If someone doesn't know much about the performance implications, they may well use regular expressions for every little task. Many books recommend deep knowledge of the underlying engine, including micro-optimizations for complex expressions and the somewhat esoteric differences between engine types (NFA, DFA, hybrid, etc.).
2) Likewise, regular expressions are usually not a viable substitute for localization routines.
3) Because of how regular expressions are designed, it can be difficult (I recall one instance where someone said it was technically impossible) to write something like a standards-conforming e-mail address parser.
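To make point 3 a bit more concrete, here is a rough sketch in Python of the kind of "good enough" e-mail pattern people tend to write. The pattern and the `looks_like_email` name are my own illustration, not taken from any standard or library. It handles everyday addresses but disagrees with RFC 5322 in both directions.

```python
import re

# A deliberately simple, hypothetical pattern: word characters, dots, plus
# signs, or hyphens, then "@", then a domain made of dot-separated labels.
SIMPLE_EMAIL = re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+")


def looks_like_email(address):
    """Return True if the whole string matches the simple pattern."""
    return SIMPLE_EMAIL.fullmatch(address) is not None


# An everyday address is accepted.
print(looks_like_email("student@example.edu"))     # True

# RFC 5322 allows a quoted local part containing spaces; the pattern rejects it.
print(looks_like_email('"john doe"@example.com'))  # False

# RFC 5322 does not allow consecutive dots in an unquoted local part;
# the pattern accepts it anyway.
print(looks_like_email("a..b@example.com"))        # True
```

Tightening the pattern to close these gaps is where the expression starts to become very hard to read and maintain.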
If one teaches it in high school, I have a feeling it is yet another of those things they would have to "re-learn" later in college when they (hopefully) learn the practical side of things.
Then again, it would make a lot of those early programming assignments a lot easier :D:D.