Eric writes:

[I]t seems the main problem [in automating software testing] is getting the test data - two steps before you can even think about automation. When talking to developers and QA folks it's a major roadblock. Any firsthand insight on this?

"Getting the test data" is indeed a problem. More so because it comes with a meta-problem: what does "getting the test data" mean?

"Getting the test data" might mean "How do we figure out what it is we're supposed to be testing for?" Testing that an application matches its spec is different than testing that it solves an actual customer problem is different than testing that it doesn't corrupt the user's system is different than testing how it handles resource unavailability is different than testing its performance.

"Getting the test data" might mean "How do we prioritize the infinite set of possible test cases?"

"Getting the test data" might mean "How do we determine what the application is supposed to do?"

"Getting the test data" might mean "How do we determine how the system should react to particular data values?"

[How many more "might mean"s can you come up with?]

I have one answer for each of these questions: talk with your team. Talk with your feature team. Talk with your management. Talk with your customers. Talk with your help desk.

I find that talking about these questions with my feature team drives discussions about "I dunno what should happen in that case" and "Oh yeah, I had better handle that scenario" and "I don't think we need to test that deeply" and "What do you mean you think a surface pass of that area is sufficient?"

I find that talking about these questions with my management drives discussions about "We don't have that much time" and "That's a good idea, let's make sure everybody is doing that" and "You want to spend two weeks doing what?"

I find that talking about these questions with my customers drives discussions about "This feature working correctly is the make-or-break decision point for whether I upgrade" and "I don't care about that feature at all" and "I didn't know the application could do that" and "I want this functionality yesterday, will you please ship it already!"

I find that talking about these questions with my help desk drives discussions about "The top five calls we receive are all in Feature H" and "When that works it will really help diagnose problems, but when it fails it will require a customer site visit to resolve" and "The last five releases have all had this major issue, will you please fix it already!"

One of the benefits my team is seeing from our automation stack is that using it inherently drives these discussions. As our feature teams discuss what a feature is and how it works, we also discuss what the LFM will look like. This surfaces questions about what the user actions (i.e., benefits) really are, how much verification we want to do and how we will do it, and which test cases we really care about. Because our test cases don't contain UI and other gory details, management, customers, and help desk technicians review and comment on them. The ability to add and drop execution paths, data values, and verification without affecting (most) test cases lets the answers to these questions change.
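To make that last point concrete, here is a minimal Python sketch of the general idea. It is an illustration only, not the automation stack described above: the toy `Document`, the `LogicalModel` class, the two save paths, and the `verify_saved` check are all hypothetical names invented for this example. What it tries to show is a test case written purely in terms of user actions, with the execution paths, data values, and verification supplied from outside.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, List, Tuple


@dataclass
class Document:
    """Toy application state standing in for the product under test (hypothetical)."""
    text: str = ""
    saved: bool = False


# Execution paths: two hypothetical ways of performing the same user action.
def save_via_menu(doc: Document) -> None:
    doc.saved = True


def save_via_shortcut(doc: Document) -> None:
    doc.saved = True


# Verification: a check that can be added or dropped without editing test cases.
def verify_saved(doc: Document) -> None:
    assert doc.saved, "document should be marked as saved"


class LogicalModel:
    """Exposes user actions; hides which path runs and what gets verified."""

    def __init__(self, save_path: Callable[[Document], None],
                 verifiers: List[Callable[[Document], None]]) -> None:
        self.save_path = save_path
        self.verifiers = list(verifiers)

    def type_text(self, doc: Document, text: str) -> None:
        doc.text += text

    def save(self, doc: Document) -> None:
        self.save_path(doc)
        for verify in self.verifiers:
            verify(doc)


def typed_text_survives_save(model: LogicalModel, text: str) -> Document:
    """The test case itself: user actions only, no UI details."""
    doc = Document()
    model.type_text(doc, text)
    model.save(doc)
    return doc


def run_all(save_paths, data_values, verifiers) -> List[Tuple[str, str, bool]]:
    """Run the one test case across every combination of path and data value."""
    results = []
    for path, text in product(save_paths, data_values):
        model = LogicalModel(path, verifiers)
        doc = typed_text_survives_save(model, text)
        results.append((path.__name__, text, doc.saved))
    return results


results = run_all([save_via_menu, save_via_shortcut], ["", "hello"], [verify_saved])
```

In this sketch, adding a third save path or another data value means changing only the lists passed to `run_all`; `typed_text_survives_save` stays exactly as written, which is also why someone outside the test team could read and comment on it.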

Our automation stack is an enabler, but the discussions are the key, and they are discussions that any tester can start having with any part of their team. Doing so may be difficult, especially if the people on your team have little trust in each other just now. Showing some respect can help there. Reading and acting upon Patrick Lencioni's The Five Dysfunctions of a Team can help as well. Getting started can be scary and difficult. I think you will find, though, as I have, that it is well worth the effort.