The concept of data-driven testing (DDT) has been around for quite a long time, and I think it's worthwhile to review how it is applied, because it can be extremely powerful for automated testing. At the same time, it's also beneficial to know that there are situations where trying to 'force feed' DDT into the tests does more harm than good. The thing I like about DDT is that the basic concept is so simple, yet not many testers know when and how to apply it. And when they find success with it, they tend to overuse it. This practice eventually leads to an inflexible and broken design in their automation.

Let’s go over the main advantages of DDT and when we should use it to improve our automation. Data-driven tests provide their biggest bang for the buck when there are numerous tests that are permutations of one another, for example, tests of an application input or of API parameters. Scenario-based tests can also be data-driven, as long as the general execution steps within the scenario remain unchanged. Recently, I happened to come across some legacy automation on our team, and I noticed that roughly 60-70% of the test script content is simply data. One major pitfall of coupling test data with the code is that it makes the code fragile. Any time there’s a change in requirements, test data may need to be added or updated. As a result, the whole script needs to be edited, compiled, and linked (for non-interpreted languages, that is). So why don’t we just save ourselves the trouble and abstract the data out of the test code? The two have no business being together in the first place.
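
To make the separation concrete, here is a minimal Python sketch of the idea. Everything in it is an assumption made up for illustration: `is_valid_username` stands in for whatever input or API parameter is under test, and the column names `case_id`, `username`, and `expected` are just one possible layout. The data is inlined as a string only to keep the sketch self-contained; the point is that adding or changing a case means touching a data row, not the driver.

```python
import csv
import io

# Hypothetical system under test: stands in for an application input check.
def is_valid_username(name: str) -> bool:
    return 3 <= len(name) <= 12 and name.isalnum()

# Test data lives apart from the driver. It is inlined here only to keep the
# sketch self-contained; in practice it would sit in its own file.
TEST_DATA = """case_id,username,expected
short,ab,False
typical,alice01,True
too_long,abcdefghijklmnop,False
punctuation,bob!,False
"""

def load_cases(text):
    """Parse CSV rows into (case_id, username, expected) tuples."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["case_id"], row["username"], row["expected"] == "True")
            for row in reader]

def run_cases(cases):
    """Run the same steps for every row and collect the ids of failing cases."""
    failures = []
    for case_id, username, expected in cases:
        if is_valid_username(username) != expected:
            failures.append(case_id)
    return failures

if __name__ == "__main__":
    print("failing cases:", run_cases(load_cases(TEST_DATA)))
```

With this split, a new requirement such as a different maximum length would be covered by adding rows to the data, while `run_cases` stays untouched.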

IMO, the most difficult aspect of DDT is recognizing from the get-go that the tests can be data-driven. Testers can get so absorbed in coding up the automation and creating a bunch of tests in the shortest amount of time that they miss it. By simply taking a step back and looking at the test cases at hand, one can quickly see the DDT pattern and how to apply it correctly. This white paper by Keith Zambelich (note it's a Word doc) does an excellent job of detailing the different pieces of DDT and how to go about implementing them.

Let’s take a look at some common pitfalls with the DDT approach, which I must admit I have fallen into not once but a few times.

  • Not planning ahead. For example, abstracting out the actual execution steps as part of the test data is just asking for trouble. Here's a crude example to illustrate what I'm talking about: say you are testing a username/password dialog box. For argument's sake, say most of your test cases simply validate different username/password combinations. Then there is a test case that enters a username/password, clicks Cancel, and enters a different username/password. As you can see, there is an extra step of hitting the Cancel button. By nature, this step ought to be in the code. However, your code now contains a one-off branch just to handle this single test case. Now imagine there are a bunch of test cases with different execution steps. Your code could end up not being generalized at all, and instead be full of hacks put in place to handle all the different scenarios. To avoid this, look at the overall picture of all the test cases and design both the test data file "schema" and the driver code accordingly. Keep in mind that sometimes you just can't apply DDT to every test case.
  • Over-using data-driven techniques. This is somewhat an extension of the previous pitfall. I truly believe that using data as a mechanism for logic control is simply a bad idea. I have seen some automation that uses the data file to control test execution (in addition to holding data). The code becomes overly generalized and contains almost no logic. Everything is transferred into the data file. I call this "force feeding" DDT. The actual test execution steps (i.e. the main program driver) should be in the code. Only the values passed into those steps should be abstracted out.
  • Not being consistent with your data abstraction or code generalization. Some test data is left hardcoded and some is in the test data file. Inconsistency always leads to confusion and undermines the integrity of the code.
  • Not thinking about proper reporting. This probably applies to most test automation in general. Make sure that the report is precise, clear, and easy to grasp, especially when there are failures. This way you know right away if bugs are found.
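
One way to steer clear of several of these pitfalls at once is sketched below in Python, using the username/password dialog from the first pitfall. All names here are hypothetical: `FakeLoginDialog` is a stand-in for the real dialog, and `LOGIN_CASES`, `run_login_cases`, `check_cancel_then_reenter`, and `format_report` are made up for illustration. The data holds values only, the steps stay in the driver, the Cancel scenario is written as its own coded test instead of being squeezed into the data, and the report gives one line per case.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the username/password dialog under test.
class FakeLoginDialog:
    VALID = {("alice", "s3cret")}

    def __init__(self):
        self.username = ""
        self.password = ""

    def enter(self, username, password):
        self.username, self.password = username, password

    def cancel(self):
        self.username, self.password = "", ""

    def submit(self):
        return (self.username, self.password) in self.VALID

# Data holds values only: no step names, no flags that steer execution.
LOGIN_CASES = [
    {"id": "valid_pair", "username": "alice", "password": "s3cret", "expected": True},
    {"id": "wrong_password", "username": "alice", "password": "oops", "expected": False},
    {"id": "empty_fields", "username": "", "password": "", "expected": False},
]

@dataclass
class Result:
    case_id: str
    passed: bool
    detail: str

def run_login_cases(cases):
    """Driver: the steps are fixed in code; only the values come from data."""
    results = []
    for case in cases:
        dialog = FakeLoginDialog()
        dialog.enter(case["username"], case["password"])
        actual = dialog.submit()
        results.append(Result(case["id"], actual == case["expected"],
                              f"expected={case['expected']} actual={actual}"))
    return results

def check_cancel_then_reenter():
    """One-off scenario with an extra Cancel step, kept as its own coded test."""
    dialog = FakeLoginDialog()
    dialog.enter("alice", "oops")
    dialog.cancel()
    dialog.enter("alice", "s3cret")
    actual = dialog.submit()
    return Result("cancel_then_reenter", actual is True,
                  f"expected=True actual={actual}")

def format_report(results):
    """One line per case plus a summary, so failures are easy to spot."""
    lines = [f"{'PASS' if r.passed else 'FAIL'}  {r.case_id}  {r.detail}"
             for r in results]
    failed = sum(not r.passed for r in results)
    lines.append(f"{len(results)} run, {failed} failed")
    return "\n".join(lines)

if __name__ == "__main__":
    print(format_report(run_login_cases(LOGIN_CASES) + [check_cancel_then_reenter()]))
```

The design choice worth noting is that the driver has no branch for the Cancel case. If more scenarios with different steps show up, each gets its own small function rather than a new column in the data file.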
I'm sure there are more, but those are the most common ones off the top of my head at the moment. Data-driven testing is very flexible and powerful when a tester knows how to use it appropriately.