The Forever Wait
Many thanks to Patrick for this reminder about another practice an SDET should avoid (actually, most SDEs should, as well- but they have their own masters and I'll leave that job to them). Not that it's easy to pick the "right" amount of time. In this much older post of mine, I mention a test case that took an incredible amount of time when we turned Driver Verifier on. Well, that test case has been a rascal in many ways (it even found some unusual and rare bugs in Driver Verifier itself). I was initially using the same wait time on all of my tests, but I had to stop. I set the wait on this one out to many minutes, but special-cased it so the others wait only a few seconds. If I didn't, a hardcore hang would potentially push runtime out to hours, which is not good for automation.
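The special-casing described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual harness: the test name, the 5-second default, and the 20-minute override are all hypothetical stand-ins for "a few seconds" and "many minutes".

```python
import subprocess
import sys

# Hypothetical values: the post only says "a few seconds" and "many minutes".
DEFAULT_TIMEOUT_S = 5.0
TIMEOUT_OVERRIDES_S = {"slow_allocation_test": 20 * 60.0}  # hypothetical test name


def timeout_for(test_name):
    """Return the wait budget for a test: its override if any, else the default."""
    return TIMEOUT_OVERRIDES_S.get(test_name, DEFAULT_TIMEOUT_S)


def run_test(test_name, argv, timeout_s=None):
    """Run one test process and report 'pass', 'fail', or 'timeout'."""
    if timeout_s is None:
        timeout_s = timeout_for(test_name)
    try:
        # subprocess.run kills the child and raises if the budget is exceeded,
        # so a hard hang costs at most timeout_s rather than hours.
        completed = subprocess.run(argv, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "timeout"
    return "pass" if completed.returncode == 0 else "fail"


if __name__ == "__main__":
    # A child that sleeps longer than its budget is reported as a timeout.
    hang = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_test("hypothetical_hang", hang, timeout_s=0.5))
```

Keeping the long budget in a per-test override means one slow test cannot quietly raise the ceiling for every other test in the run.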
Well, I ran it today, and the test case AFTER it failed because it timed out. [I'm spending time I shouldn't be on trying to streamline our automation: in this case, making the installation aspects of it smart enough to find their own content, instead of having people type an ever-changing set of locations into WTT as parameters every time we have to make a run, which is, after all, every day or more. At the same time I removed the copying of things we no longer use, or are getting from insecure places when there are secure places, etc. I'm running all of our tests to make sure I didn't prune out the collection of something we need.] I already know the timeout is because the cleanup from all those allocations is hogging resources (I've even hit the "too long at dispatch level" break on machines with way too much memory), preventing me from getting time to get that next instance spawned off my test bus (serves me right for trying to squeeze out some test performance by spinning up the next test while the first was spinning down).
The forever wait would make life easier in this case- but it wouldn't be near as much fun.
Now if any of you are thinking- "wait- why aren't you even talking about breaking and forcing people to look at the hang when it times out like this?"- well, we still have an opening or two for more SDETs on our team (and our close buddies in Storage have plenty of good opportunities for you as well), and you're obviously beginning to think like one! The best answer I have at the moment is that I serve multiple masters, one of which is regular automated checking, for which recording a failure and moving on is OK (as long as it isn't new, anyway). I've been thinking over the last few days about ways to accommodate this [break conditionally, but not all of the time, in a way that's smart enough to suit our needs], but haven't hit upon that most desirable solution yet. Not that I can't think of ways- but I need ways I can achieve with what's at hand. All the aggressive work we've done on hardening the test code so it doesn't break more often than the product it's testing has been working, so I've at least got time to move from the "Stop all the bugchecks" phase to the "what about attempt rates and failure rates" phase. Actually, it's not just that- it's that Patrick and Wei have been stepping up and helping with some of that fixing and the never-ending job of triaging failures in our test labs.
But, thanks to Patrick, that "things to do next time you crack that DDI test code" list now has another item:
Things to do the next time you crack that moldy old test code
- Set USER_C_FLAGS to /TP
- Set the warning level to 4 (convert warnings to error is already on)
- PFD green before you're done, period.
- Look for suspicious coding patterns where failures are being masked or ignored- at the very least in the places the above changes have forced you to look.
- Examine KeWait... calls for NULL timeout and WaitFor... calls for INFINITE. Take them out with the rest of the trash!
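The last item on the list can be illustrated away from the kernel. The sketch below is an analogy in Python, not the KeWait... or WaitFor... APIs themselves: a `threading.Event` stands in for the object being waited on, and the helper name `wait_bounded` is made up for this example. Calling `event.wait()` with no argument plays the role of the NULL or INFINITE wait; passing a finite bound lets the caller notice the hang and record a failure.

```python
import threading


def wait_bounded(event, timeout_s):
    """Wait for event at most timeout_s seconds; return 'signaled' or 'timeout'."""
    # event.wait() with no argument would be the forever wait.
    # With a finite bound it returns False when the time runs out.
    return "signaled" if event.wait(timeout_s) else "timeout"


if __name__ == "__main__":
    done = threading.Event()
    # Hypothetical worker that signals completion after 0.1 seconds.
    threading.Timer(0.1, done.set).start()
    print(wait_bounded(done, 2.0))

    never = threading.Event()  # nothing ever sets this one
    print(wait_bounded(never, 0.2))
```

The point of returning a status instead of blocking is that the caller gets to decide what a timeout means: log and move on, retry, or break.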
[Edit 5/24] Sorry, I just realized I revealed someone's email address without permission [it made a fine pun, but still not the right thing to do]- stats say it's not likely anyone noticed it yet, but mea culpa, anyway...
Just had to add this one
The title of this post keeps reminding me of The Forever War- a science fiction novel that earned both Nebula and Hugo awards. The awards probably speak for themselves- I'm quite pleased to have a copy of it in my personal library (I guess I'm pleased to have a personal library, but I leave that sort of thing to my personal blog).
As for the music- happens to be another bit of mellow fineness from Field of View (sorry, I didn't come up with a more useful link, but I'm short on time, as always) at the moment- I may have paid what seemed an outrageous fee to import this CD set (not the one I linked to, that would take me a while to find, I'm afraid) from Japan, but it was soooooo worth it!