I've always been suspicious of SQL queries that are automagically generated by some framework. And when I read this article on lengthy SQL queries it certainly was another gallon of gasoline on the fire. Sure, premature optimization is the root of all evil and all that, but there is another important rule in software development: don't do obviously stupid things. If you want to use a framework for data access, which is very common for productivity reasons, be sure to design your software so that it is easy to replace the generic framework with something specific.
If, on the other hand, you end up with really large queries when you write the queries yourself (I'm a stored procedure guy, so I have a hard time even making up what kind of SQL query would end up being that large), the solution is obvious: a stored procedure.
I recently had to find a neat way to remove all empty directories recursively on a Unix machine. In the world of Unix you can expect to find a way to do things like this pretty easily. When I started to search for a neat way to do it (rather than reading a bunch of man pages) I came across a really funny story on The Old New Thing. Windows users are so used to needing an application for simple things like this that they forget about the scripting possibilities. I guess that will change with PowerShell.
However, this was about how to do it on Unix. Well, this is my solution:
find "$1" -type d | sort -r |
while read -r D
do
    ls -l "$D" | grep -q 'total 0' && rmdir "$D" 2>/dev/null
done
That script takes one argument: a directory you want to remove if it and all its sub-directories are empty. Any directories encountered that contain files are preserved.
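For comparison, here is a minimal sketch of the same bottom-up idea in Python. It is not a translation of the script above, just the same approach under one assumption: instead of inspecting the listing, it simply tries to remove every directory and lets the operating system refuse the non-empty ones. The function name remove_empty_dirs is made up for this example.

```python
import os


def remove_empty_dirs(root):
    """Remove root and its sub-directories wherever they end up empty.

    Returns the list of directories that were removed.
    """
    removed = []
    # topdown=False yields the deepest directories first, so a parent is
    # visited only after its children have already been tried.
    for dirpath, _dirnames, _filenames in os.walk(root, topdown=False):
        try:
            os.rmdir(dirpath)  # succeeds only when the directory is empty
        except OSError:
            continue  # not empty (or not removable): keep it
        removed.append(dirpath)
    return removed
```

Visiting the deepest directories first plays the same role as the sort -r in the shell version: a parent can only become empty once its children are gone.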
This is a fair recommendation and actually makes a lot of sense. Especially if you're implementing TDD in a team where people are a little bit skeptical. But there are a few dangers with allowing the use of this rule.
Personally I think that if something is considered so simple that it does not need a test, it must be really simple to test. And if it is so simple to test, why shouldn't you test it? The relative cost might be high, but the absolute cost of adding a really simple test for some really simple functionality is worth it in my book, since it removes the decision from the developer. "No new functionality unless you have failing tests" is much simpler to follow and remember than "no new functionality unless you have failing tests or you consider the functionality to be really simple". Also consider the code coverage issues.
In "agile projects" it is common to use user stories to describe what has to be done. But it is also common to use constraints to describe things that cannot be described in a user story. This can be things like:
Constraints are things that should always be considered during every user story implemented.
However sometimes people come along with things that sound like constraints but they're not. For example:
This is a bad constraint since it has no goal. You might think you improve the constraint if you say:
This is still a bad constraint since you probably have a release date. What happens if the constrained time is not enough to achieve the goal?
I think that if there is a known problem (for example some responses take more than one second) you should add user stories (or bugs) for that, and prioritize and plan the fixes for those problems just like anything else. And if you have no known problems you should just set up constraints telling the team what kind of quality you want. If it takes the team 50% of the velocity to pass all the constraints you should let them spend it, since otherwise you'll have to let them do it later anyway. Either way you'll notice that the team is struggling to pass all the constraints, and that maybe they are too strict. But if you time box them from the start you will notice this later than if you force the team to always deliver more or less release quality.
Measuring code coverage is often perceived as a good measure of test quality. It is not. Good tests find problems when changes are made to the code. But if you just want high code coverage you can easily write a number of tests that call all your methods without checking the results. The only thing high code coverage values really tell you is that the code is at least not crashing with the given input.
If you are using BDD/TDD, however, code coverage values might be of interest. For example, if you do not have 100% function coverage (i.e. not 100% of the functions are called) then you aren't really using BDD/TDD, are you? Because how could you write a method that is not called by anyone? Well, actually you might have created methods that are never called by your tests. Many TDD practitioners use a "too simple to test" rule, which applies to really simple, property-like methods. I don't really like that philosophy, but more on that in a later post.
So now you think that with 100% function coverage you will also have 100% line coverage with BDD/TDD, right? Well, yes and no. Typically you won't, since you will add error handling with logging that never runs because the errors never occur in your tests. It might be that you open a file and, if that fails, you log an error and exit your application. This never happens in the tests since you always manage to open the file in your tests. So how did those lines get in there if they are never called? Well, one other rule often used by TDD practitioners is that you "should not do stupid things". If a system call may fail, you check the result regardless of whether a test demands it. With dependency injection you'll probably get close to 100%, but there is no point in bending over backwards in order to achieve high code coverage.
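As an illustration of how dependency injection lets a test reach such an error path, here is a minimal sketch in Python. The names load_config, opener and failing_opener are invented for this example, and to keep it short the function returns None after logging rather than exiting the application.

```python
import logging

log = logging.getLogger("example")


def load_config(path, opener=open):
    """Return the file's text, or None after logging if it cannot be opened.

    `opener` is injected so that a test can substitute one that fails.
    """
    try:
        with opener(path) as handle:
            return handle.read()
    except OSError as error:
        log.error("could not open %s: %s", path, error)
        return None


def failing_opener(path):
    # Test double standing in for the built-in open().
    raise FileNotFoundError(path)
```

Production code calls load_config(path) and gets the real open, while a test can call load_config("any name", opener=failing_opener) to execute the logging branch without touching the file system.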
Does this mean there is no point in measuring code coverage? I think it is great to measure code coverage if you use the result correctly. You should not add more tests just to increase coverage, since tests added only to increase coverage tend to just exercise code without really testing anything interesting. But low code coverage when using BDD/TDD is definitely a warning signal that something is wrong: the team is not using the methodology correctly. So what is considered an OK coverage level? From personal experience I think anything below the levels listed in the table below (per coverage type, without and with dependency injection) should be considered bad, since you should have no problem at all achieving the given values.
But sometimes there is someone (usually a manager) who thinks coverage should be above some level. So even though you know those tests will not really be useful, you have to add more tests for coverage. What do you do? You can try to ignore the coverage number and just add more tests that test interesting things. Or you could try using Pex. Pex is an automated exploratory testing tool from Microsoft Research. It can be used to make a small test suite with high code coverage, and after only a few simple examples I get the impression that it is quite good at finding border cases in your code. This will not, however, replace the traditional tests/specifications written as part of your TDD/BDD process. But it can help you test some cases you did not think of, and that way increase code coverage even more without any extra effort from you. At least it is better than adding coverage tests by hand.
And if you listened to me and started to write nice looking SQL, maybe you wanna look at making your C# code look nice too...
I read this article on SQL Code Layout and Beautification and can only agree with the author that other people's SQL is often hard to understand, because I'm so used to how I write my own SQL statements. One of the links is to an online tool that formats your SQL according to a number of rules. I was happy to find that my way of writing SQL was supported:
SELECT a, b, c AS d FROM x, y, z AS w WHERE a = 2 AND b IN (3,4,7)
And another example:
SELECT obj.run, obj.camcol, STR(obj.field,3) AS field, STR(obj.rowc,6,1) AS rowc, STR(obj.colc,6,1) AS colc, STR(dbo.FOBJ(obj.objid),4) AS id, STR(obj.psfmag_g - 0 * obj.extinction_g,6,3) AS g, STR(obj.psfmag_r - 0 * obj.extinction_r,6,3) AS r, STR(obj.psfmag_i - 0 * obj.extinction_i,6,3) AS i, STR(obj.psfmag_z - 0 * obj.extinction_z,6,3) AS z, STR(60 * distance,3,1) AS d, dbo.FFIELD(neighborobjid) AS nfield, STR(dbo.FOBJ(neighborobjid),4) AS nid, 'new' AS 'new' FROM (SELECT obj.objid, run, camcol, field, rowc, colc, psfmag_u, extinction_u, psfmag_g, extinction_g, psfmag_r, extinction_r, psfmag_i, extinction_i, psfmag_z, extinction_z, nn.neighborobjid, nn.distance FROM photoobj AS obj JOIN neighbors AS nn ON obj.objid = nn.objid WHERE 60 * nn.distance BETWEEN 0 AND 15 AND nn.mode = 1 AND nn.neighbormode = 1 AND run = 756 AND camcol = 5 AND obj.TYPE = 6 AND (obj.flags & 0x40006) = 0 AND nchild = 0 AND obj.psfmag_i < 20 AND (g - r BETWEEN 0.3 AND 1.1 AND r - i BETWEEN -0.1 AND 0.6)) AS obj JOIN photoobj AS nobj ON nobj.objid = obj.neighborobjid WHERE nobj.run = obj.run AND (ABS(obj.psfmag_g - nobj.psfmag_g) < 0.5 OR ABS(obj.psfmag_r - nobj.psfmag_r) < 0.5 OR ABS(obj.psfmag_i - nobj.psfmag_i) < 0.5) ORDER BY obj.run, obj.camcol, obj.field
I don't know what's worse: making crosswords with only SQL-related questions (requires free registration to access) or actually trying to solve them. This is one of many things that currently puzzle me. One other thing that scares me is that I get annoyed when I cannot solve these crosswords right away...