Syntax, Semantics, Micronesian cults and Novice Programmers

I've had this idea in me for a long time now that I've been struggling with getting out into the blog space.  It has to do with the future of programming, declarative languages, Microsoft's language and tools strategy, pedagogic factors for novice and experienced programmers, and a bunch of other stuff.  All these things are interrelated in some fairly complex ways.  I've come to the realization that I simply do not have time to organize these thoughts into one enormous essay that all hangs together and makes sense.  I'm going to do what blogs do best -- write a bunch of (comparatively!) short articles each exploring one aspect of this idea.  If I'm redundant and prolix, so be it.

Today I want to blog a bit about novice programmers.  In future essays, I'll try to tie that into some ideas about the future of pedagogic languages and languages in general. 

Novice programmers reading this: I'd appreciate your feedback on whether this makes sense or it's a bunch of useless theoretical posturing.

Experienced programmers reading this:  I'd appreciate your feedback on what you think are the vital concepts that you had to grasp when you were learning to program, and what you stress when you mentor new programmers.

An intern at another company wrote me recently to say "I am working on a project for an internship that has lead me to some scripting in vbscript.  Basically I don't know what I am doing and I was hoping you could help."  The writer then included a chunk of script and a feature request.  I've gotten requests like this many times over the years; there are a lot of novice programmers who use script, for the obvious reason that we designed it to be appealing to novices.

Well, as I wrote last Thursday, there are times when you want to teach an intern to fish, and times when you want to give them a fish.  I could give you the line of code that implements the feature you want.  And then I could become the feature request server for every intern who doesn't know what they're doing…  nope.  Not going to happen.  Sorry.  Down that road lies cargo cult programming, and believe me, you want to avoid that road.

What's cargo cult programming?  Let me digress for a moment.  The idea comes from a true story, which I will briefly summarize:

During the Second World War, the Americans set up airstrips on various tiny islands in the Pacific.  After the war was over and the Americans went home, the natives did a perfectly sensible thing -- they dressed themselves up as ground traffic controllers and waved those sticks around.  They mistook cause and effect -- they assumed that the guys waving the sticks were the ones making the planes full of supplies appear, and that if only they could get it right, they could pull the same trick.  From our perspective, we know that it's the other way around -- the guys with the sticks are there because the planes need them to land.  No planes, no guys. 

The cargo cultists had the unimportant surface elements right, but did not see enough of the whole picture to succeed. They understood the form but not the content.  There are lots of cargo cult programmers -- programmers who understand what the code does, but not how it does it.  Therefore, they cannot make meaningful changes to the program.  They tend to proceed by making random changes, testing, and changing again until they manage to come up with something that works. 

(Incidentally, Richard Feynman wrote a great essay on cargo cult science.  Do a web search, you'll find it.)

Beginner programmers: do not go there! Programming courses for beginners often concentrate heavily on getting the syntax right.  By "syntax" I mean the actual letters and numbers that make up the program, as opposed to "semantics", which is the meaning of the program.  As an analogy, "syntax" is the set of grammar and spelling rules of English, "semantics" is what the sentences mean.  Now, obviously, you have to learn the syntax of the language -- unsyntactic programs simply do not run. But what they don't stress in these courses is that the syntax is the easy part.  The cargo cultists had the syntax -- the formal outward appearance -- of an airstrip down cold, but they sure got the semantics wrong.

To make some more analogies, it's like playing chess.  Anyone can learn how the pieces legally move.  Playing a game where the strategy makes sense is the hard (and interesting) part.  You need to have a very clear idea of the semantics of the problem you're trying to solve, then carefully implement those semantics.

Every VBScript statement has a meaning.  Understand what the meaning is.  Passing the right arguments in the right order will come with practice, but getting the meaning right requires thought.  You will eventually find that some programming languages have nice syntax and some have irritating syntax, but that it is largely irrelevant.  It doesn't matter whether I'm writing a program in VBScript, C, Modula3 or Algol68 -- all these languages have different syntaxes, but very similar semantics.  The semantics are the program.
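
To make that concrete, here is a small sketch of a script whose syntax is fine but whose semantics are wrong.  The variable names are invented for illustration, and it assumes the intent was to average two numbers:

```vbscript
a = 10
b = 20

' Runs without complaint, but division binds more tightly than
' addition, so this computes a + (b / 2), not the average.
wrongAverage = a + b / 2

' The parentheses make the statement mean what was intended.
rightAverage = (a + b) / 2

MsgBox wrongAverage   ' displays 20
MsgBox rightAverage   ' displays 15
```

Both assignments are perfectly legal VBScript.  Only someone who understands what each statement means can tell which one is the bug.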

You also need to understand and use abstraction.  High-level languages like VBScript already give you a huge amount of abstraction away from the underlying hardware and make it easy to do even more abstract things.

Beginner programmers often do not understand what abstraction is.  Here's a silly example.  Suppose you needed for some reason to compute 1 + 2 + 3 + ... + n for some integer n.  You could write a program like this:

n = InputBox("Enter an integer")

Sum = 0
For i = 1 To n
      Sum = Sum + i
Next

MsgBox Sum

Now suppose you wanted to do this calculation many times.  You could replicate the middle four lines over and over again in your program, or you could abstract the lines into a named routine:

Function Sum(n)
      Sum = 0
      For i = 1 To n
            Sum = Sum + i
      Next
End Function

n = InputBox("Enter an integer")
MsgBox Sum(n)

That is convenient -- you can write up routines that make your code look cleaner because you have less duplication.  But convenience is not the real power of abstraction.  The power of abstraction is that the implementation is now irrelevant to the caller.  One day you realize that your sum function is inefficient, and you can use Gauss's formula instead.  You throw away your old implementation and replace it with the much faster:

Function Sum(n)
      Sum = n * (n + 1) / 2
End Function

The code which calls the function doesn't need to be changed.  If you had not abstracted this operation away, you'd have to change all the places in your code that used the old algorithm.
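
One caveat: the swap is only safe if the new implementation means the same thing as the old one.  Here's a rough sketch of how you might check that.  The names SumByLoop and SumByFormula are made up so that both versions can live in one script, and the range 0 to 100 is an arbitrary choice:

```vbscript
' Old implementation: add the numbers up one at a time.
Function SumByLoop(n)
      Dim i
      SumByLoop = 0
      For i = 1 To n
            SumByLoop = SumByLoop + i
      Next
End Function

' New implementation: Gauss's formula.
Function SumByFormula(n)
      SumByFormula = n * (n + 1) / 2
End Function

' Count the inputs on which the two versions disagree.
Dim k, mismatches
mismatches = 0
For k = 0 To 100
      If SumByLoop(k) <> SumByFormula(k) Then mismatches = mismatches + 1
Next

MsgBox "Mismatches: " & mismatches   ' displays Mismatches: 0
```

For these inputs the two versions agree.  They do not agree for negative n, though: the loop never runs and returns 0, while the formula returns n * (n + 1) / 2, which is 3 when n is -3.  Whether that matters depends on what callers are allowed to pass in, which is exactly the kind of semantic question the syntax won't answer for you.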

A study of the history of programming languages reveals that we've been moving steadily towards languages which support more and more powerful abstractions.  Machine language abstracts the electrical signals in the machine, allowing you to program with numbers.  Assembly language abstracts the numbers into instructions.  C abstracts the instructions into higher concepts like variables, functions and loops.  C++ abstracts even further by allowing variables to refer to classes which contain both data and functions that act on the data.  XAML abstracts away the notion of a class by providing a declarative syntax for object relationships.

To sum up, Eric's advice for novice programmers is:

  • Don't be a cargo cultist -- understand the meaning and purpose of every line of code before you try to change it.
  • Understand abstraction, and use it appropriately.

The rest is just practice.

  • Thomas and Jonathan, interesting points. Thomas first:

    >No, this is him being too lazy to read the
    >documentation. Especially ADO.NET is trivial
    >as object model. ASP.NET is not really
    >complex either, but it is always an advantage
    >to know what happens there - means: for
    >ADO.NET it really helps a lot to know what
    >HTML is and how the web works (it is
    >interesting how many people working with
    >ASP.NET have no clue about how http does
    >work).

    All true (well, the "trivial" part is up for discussion). But I think you are illustrating an interesting issue: in order to be effective with ASP.NET, you need to know HTML. But suppose I don't really know HTML that well, but nonetheless I need to get this one thing working right now. In theory, I could go away and study HTML, etc. until I felt I really understood what's going on. But I don't have time! I need to get this thing done right now!

    ASP.NET + HTML is probably not the best example, but perhaps XML + XSLT is. You could spend months studying XSLT before you felt like you "understood" it, but in fact, you can get stuff working pretty rapidly by copy-and-tweak. Which leads to the kind of situation I was describing. Note that the person who is doing this IS learning XSLT, just on an incremental and JIT basis. It simply takes time to absorb all of these things, and it's not always (in fact, rarely) realistic to spend a long time studying the background on something before you start working with it.

    Thomas: interesting point. I can tell you that different types of documentation are written in different ways. The example you cite is from a reference topic (description of a class or member). That type of documentation is traditionally written tersely because reference docs are to some extent written for the already-experienced programmer who (it is theorized) does not need a bunch of background information repeated when all they wanted was the syntax of method such-and-this. (You'll note that almost all reference material for all languages -- JScript, C#, Java, T-SQL -- follows this philosophy.) The idea is that if you don't already know what an object is or whatever, you should be able to read about that in a conceptual topic, which in previous years would have been the "programmer's guide." Good reference material will cross-reference to the conceptual documentation that provides the background required to understand the reference material. A weak analogy, I suppose, is a dictionary. It defines the words, but it doesn't tell you each time what a noun or adjective or verb is; it's assumed you already know that or can look it up elsewhere.

    More germane to this discussion, reference documentation is not meant to be tutorial in nature. It's not where a beginner should be learning the semantics of the language. (Again, that is the theory.) To be clear, I think it's possible to write good and bad reference documentation. I favor verbosity myself (obviously).
  • We've each restated many of the same things using other helpful analogies here about the educational process of becoming a programmer.

    Regarding "Mort" or the middle-level "experimenter/copy & paster" coder...certainly this person exists: the fact is that there is the spectrum from the Parroting Cargo/Cult Novice Coder to the Polished Pro-grammer (who can steal just about any code and know full well what to expect of it). But tread very carefully in the cut & paste world.

    It's as if there are different "heuristic" as well as abstraction levels. A "Mort" will C&P and test/break the code. A Polished Pro knows this and anticipates doing this BUT is cautious in the "breaking"; "constants that vary; the variables that don't" can cause all hell to break loose even in the experimental phase.

    Someone else said in a previous response "...learn to learn...": that is the key. Learning music, learning airplane maintenance, learning whatever...if you don't start from principles (abstraction), you're a Cargo Cult programmer. But learning by doing (heuristics) is part of the learning process...i.e. seeing principles AT WORK.
  • Programming and Personas
  • mike, I think you meant to reply to my post in the 2nd part of your post.

    Judging by your comment "[reference documentation] is traditionally written tersely" it seems I totally failed to illustrate my point with that example.
    What I wanted to say is that I find that sample topic to have, if anything, too much information. Too much syntactic garbage: what's the benefit of repeating on every reference page that to call a method you need to put a dot to the right of an object value? Particularly when instead of well-defined words such as 'instance', they use 'object name' which is totally wrong and ill-defined.

    Once you get rid of that garbage, there is ample room to specify the details that really make the difference such as the data types of the arguments.

    And I disagree that reference material for all languages (I was actually talking about API, not language, references but I see what you mean) is written in the same way. Look at the CLR or PSDK docs, which essentially target non-scripting programmers; while they have their own share of problems, they have at least the advantage of actually being _reference_ texts, in that they precisely and concisely define a specification for an API. The scripting runtime docs just leave too much to the imagination.

    Hoping this is clearer...

  • I'm in your "expert" category, having been programming for 20 years and done everything from device drivers to business systems for Fortune 50 companies. On the side, I teach (C & Perl) and write books (Perl & Web), and I'd teach a lot more if I had the time.

    I think you're right on the mark. When teaching (and writing) I've had tremendous success teaching in 4 steps to beginner and intermediate programmers.

    First, give an overview of a small problem (in plain English) with lots of handwaving and other gestures -- but no code. Walk through how it'd be solved.

    Second, give a piece of code to solve the problem directly. Apply the walkthrough to the code, explaining at each step what's going on (but not syntax!). If there's a danger of Cargo-Culting, it's here. What I do to combat this is vary the problem a bit, and show them ever-changing variations on the same theme. This re-inforces how the code is solving the problem, rather than what's being solved.

    Third, either as an exercise or with a class solve a similar-but-not-quite-a-trivial-variation of the problem. That further re-inforces the problem solving pattern.

    Fourth (in class), take wrong examples that students have worked on or given up on and work them out in front of the group using classic beginner debugging techniques: lots of "print" statements, divide and conquer, simplification by hard-wiring logic, etc.

    The thing that makes some of this difficult is that some languages are more prone to "idiomatic" structures than others. The idioms have to be taught one way or the other, but I'd rather re-construct the idioms by hand and have the student (or reader) pleasantly surprised later to find that their coding technique is canon.
  • I am a novice 'Programmer', having decided to formally take a programming course for the first time. (It's C++, by the way.)

    <snip>
    # Don't be a cargo cultist -- understand the meaning and purpose of every line of code before you try to change it.
    # Understand abstraction, and use it appropriately.
    <End snip>

    This article applies to many facets of life/work...

    It could as easily be restated..

    Don't be a CC - understand every LAW before you try to change it.

    Don't be a CC - understand every <Medical Procedure on the Foot>|<type of carburetor to put on an Edelbrock>|<flavor of ice cream available on the market>|.... before you try to change it.

    This blog speaks to professionalism and the apparent lack of professionalism in the programming /IT world as perceived by the non-programming/IT world.

    Businesses look for and expect results - cut-n-paste with the resultant hope it works - is accepted/required. Produce code or lose your position.

    Is the cargo-cultist approach the reason beginning / basic programming is being 'global-sourced' / 'out-sourced' / 'off-shored'?

    Which came first - the chicken or the egg?

    The business world apparently perceives a lack of value in a professional programmer ( for the most part the companies want websiteX or embedded productY to just work ).

    Mind you I am not against free-trade - Micro|Macro Econ should be required before allowing anyone to graduate from University.

    Back to the blog - I perceive that some amount of the population desires to know WHY - or HOW things work. A fair group, however, do not.

    Excellent article and definitely thought-provoking. It is rare that I read anything where I am required to look up a definition of a word prior to finishing the first paragraph.
    (prolix - tediously prolonged or tending to speak or write at great length).

    Thank you - I have just added a word to my vocabulary (as I am guilty of it as well).

    -Cheers
  • Might as well add "sesquipedalian" while you're at it.

    There's no doubt that poor understanding of the nature of software engineering is behind many poor management decisions -- but it is also the case that software engineering is not a mature enough discipline yet to consistently give good data to management!

    That's a huge topic in itself, which I know little about.
  • I think your entry has good merits for the novices in the audience. Definitely, at some point assertion of random code mimicry will halt progress toward some new conceivable goal. Much like trying to find harmony on a piano by random sounding of keys.

    While I still consider myself an intermediate programmer, IMO programming (learning to program) is just like trying to write about unknown subjects or speak a foreign language. It boils down to a knowledge transfer problem. Of course knowledge transfer is a deep abstraction in and of itself and it requires many data exchanges between the communicating parties to begin a meaningful discussion and is the subject here.

    What novice programmers need is a primer that enables reflexive understanding of what programming is so that they can keep evolving their understanding of programming.

    By primer I mean something like the concept of numbers and a set of axioms in math; or an alphabet and the parts of speech in English; or notes and modality in music. I know of no such great reference manual for programming. Maybe the tao "The art of programming", but I think that is rather complicated by the mathematical notation. And besides, novices want practical advice, not conceptual advice.

    But beyond this primer, what also is needed is a good short book on how to solve computer problems similar to G. Polya's "How to solve it" for math problems.

    The book "Design patterns" comes close here, but again it bogs down any novice with notation and syntax yet again. The book "Refactoring" might also be good but I believe has an even greater symbolic requirement.

    Alas this goal may be unreachable, because of the complexities associated with programming both mathematical and linguistical.

    I think what is needed is something like a very concise compendium of "Design patterns", "Refactoring", "Feature Modeling", "Aspect Modeling" and excerpts of "The 'New' Turing Omnibus", combined with a practical backgrounder on syntax, grammar, and semantics in order to bootstrap a novice. But that is too much to ask any new programmer!

    So... are we doomed to suffer the wrath of cargo culture programmers indefinitely? I think not. This is just a normal step in the advancement of how we communicate with each other. Computers are just a rich communication medium which allow us to interact and express ourselves. Like natural languages, programming is just a new and very young dialect yet to be codified into a succinct and intuitive manner like music, math or language.
  • Excellent post, Eric! :)

    Just thought I'd mention that we've got a good thread about learning to program going on at the moment:
    http://www.sitepoint.com/forums/showthread.php?t=156802
  • As a "novice" web developer for 4+ years, I am VERY guilty of cargo cult programming. Even after picking my way through the bits of code I have either written, or cut-and-pasted, I don't feel any closer to actually understanding the semantics.

    In order to improve my ability to semanticize(?), I have searched high and low for any information regarding how do you break a problem down and apply the tools any given language has, to solve the problem? When do you know the problem can't be broken down any farther? It's that part that I need help with, but I think I am in the minority.

    And quite frankly, the example you gave regarding abstraction, is SO far over my head, I'm still spinning, but I will keep coming back, because it was a great article. Thanks.
  • i've been a developer for 15 years coding all kinds of projects and i've gotta say there's a fine line between abstraction and cargo cult programming.

    i'm guilty of ccp. often times you get a snippet, you know what it's supposed to do, you can look at it and have a rough idea of what it's doing where, but in reality, the deepest details of WHAT'S REALLY GOING ON (tm), are totally magical.

    anybody that's worked as a knowledgeable novice on a crunch project with high level APIs for occasionally deep and dark stuff like say heavy direct3d programming probably knows where i'm coming from. it's due in a week. nobody has a clue even on the newsgroups. "you're trying to do what?!?!?" but nevertheless, you're the crazy with the duct tape and debug stream waiting to see what happens.

    in the realm of rapid development, where you're stuck with alpha or beta (if you're lucky) APIs to produce the latest greatest feature, waving the mystical palm fronds and offering sacrifices is sometimes the best and only chance you have.

    yet some people might calmly think of that as black boxing. widget A draws a perfect circle and whatever it does underneath the hood is its business. to channel arthur c clarke, if you're comfortable with it, it's technology, if you're not, it's magic.

    after a generation or two of cargo culting, when nothing keeps showing up, you'd have to realize that what you're doing isn't working. its magic is gone. if you do actually have success, you'll repeat that magic, but like a witch doctor or grandma, that home remedy really turns into technology.

    frankly, i wouldn't recommend staring into the dark of ccp for new programmers. leave that to the old, fringe weirdo on the edge of town. he or she is mostly useless except for the most dire of situations. (aka wrangling windows security hassles and legacy programming insanity vs immediate deadlines.)
    m.
  • This reminds me a lot of what I have tended to refer to in the past as Voodoo Programming, especially in maintenance situations:
    [Me]"No, don't leave it at that"
    [Dev#2] "Why not, I don't know why but it seems to work"
    [Me] "Because it doesn't make any bloody sense! Find out what's *really* going on".

    As for the vexed problem of the care and feeding of new programmers, there is a dirty little secret that nobody (especially in CS departments) wants to admit in public: you can take many perfectly normal, intelligent people and teach them how to write syntactically correct code, you can teach them design patterns up the wazoo so they can write code that handles many commonly occurring situations, but only a certain proportion of those people will ever really "get it" (the rest will generally end up as their managers and tell the others how to do what they never really understood themselves). That is because programming is only partly a matter of education, while it is also very importantly a matter of aptitude.
    We pretend otherwise because
    a) It helps justify academic CS budgets
    b) It suits corporate executives who think programming can and should be reduced to a factory activity.
  • Kevin, you are right. There are different types of people in the world. I've met super-intelligent guys (actually had them tested at interviews) who simply weren't able to think in abstract terms. This is all well and normal; they are not stupid for that, just differently wired.

    Furthermore some people learn by induction, others by deduction -- someone needs to see 1,000 examples before realizing how to go about it, while others have to understand the big picture first.

    <provocation>
    That being said, Eric isn't your post a cargo cult type post: talking of mechanics of teaching the trade, but not understanding the big picture of how people learn?
    </provocation>
  • > Eric isn't your post a cargo cult type post: talking of mechanics of teaching the trade, but not understanding the big picture of how people learn?

    Why do you think that? The post isn't about how people learn at all, it's about a pitfall that people fall into: confusing the form of a program with its content.

    I could just as easily have written the piece in the context of writing sonnets, not programs. Yes, you have to learn what the structure of a sonnet is before you can write one, but just because you have the structure right doesn't mean you've written a GOOD sonnet, any more than just because you've moved the pieces legally means you can win at chess.

    That people have different learning styles is undoubtedly true, but I fail to see the relevance to this particular post.