Let me preface everything in this post with the statement that SPSS is a remarkable tool for data analysis and that I’m not an SPSS expert – I used to have passing familiarity with SAS (different beast) and some college-level statistics from nearly twenty years ago.

Over the weekend I spent a couple of hours working with SPSS 14 on a data set from some 700 phone surveys with customers. The data set was big (not huge, but 700 people times something like 40 questions is roughly 28,000 answers) and unfamiliar (I didn’t know what any of the questions really meant when I went into the data). Even so, I have to say that SPSS is a complex product that is surprisingly navigable for a newbie.

My main purpose was to find correlations in the data set so I could reduce the number of distinct questions I needed to identify a characteristic. So, for example, if I found that age had a strong correlation with gray hair, I could use just one of the two as a variable to help me identify a particular group of people, since the other would tell me little that was new. By contrast, I also wanted to find variables that each add independent information and so usefully separate groups of people: say, age and height, which aren’t highly correlated in most sample populations. (Understand that my data set was targeted at people who write code, so things like “height” weren’t on the question list.)
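The idea can be sketched outside SPSS in a few lines of Python. This is only an illustration under assumptions: the column names, the numbers, the `redundant_pairs` helper, and the 0.8 cutoff are all made up for the example and are not from my survey data or from anything SPSS does internally.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists.

    Assumes neither list is constant (a constant list has zero variance).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def redundant_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for each pair of columns with |r| >= threshold."""
    pairs = []
    for a, b in combinations(sorted(columns), 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            pairs.append((a, b, round(r, 3)))
    return pairs


# Hypothetical answers from six respondents, invented for illustration.
survey = {
    "age": [25, 32, 41, 48, 55, 63],
    "gray_hair": [0, 1, 2, 3, 4, 5],  # 0 = none ... 5 = fully gray
    "height_cm": [180, 165, 175, 160, 182, 170],
}

# Pairs listed here are candidates for keeping only one of the two questions.
print(redundant_pairs(survey))
```

With these invented numbers, only age and gray hair cross the cutoff, so one of them could be dropped, while height stays because it is nearly uncorrelated with both.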

To start with, I have to be connected to corpnet over remote access (RAS) for the Microsoft license to work. Annoying. Second, the product has a very slow startup (license verification, package loading, etc.). But once it was running, I could run my correlations by pulling down the Analyze menu and selecting Correlate. Basically everything I wanted to do was about that simple. I needed to refer either to the documentation or to the Web for information on the different methods used for things like correlations, but in the end the defaults always seemed to be the most useful.
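To make "different methods" a little more concrete: Pearson's coefficient measures linear association, while Spearman's applies the same formula to the ranks of the values, so it rewards any consistently increasing relationship. Both are among the coefficient choices for bivariate correlations in SPSS. The sketch below is plain Python rather than SPSS, and the variable names and numbers are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient (assumes neither list is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical data: always increasing, but far from a straight line.
years_coding = [1, 2, 3, 4, 5, 6]
lines_per_week = [10, 20, 40, 80, 160, 320]

print("Pearson: ", pearson(years_coding, lines_per_week))
print("Spearman:", spearman(years_coding, lines_per_week))
```

On data like this, the rank-based coefficient comes out higher than the linear one, which is the kind of difference I had to look up before deciding the default was fine for my purposes.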

Overall, I have to say that for a complex topic (statistics) and a big product, SPSS is pretty navigable.