When I got to work this morning, I looked into a folder on a file server for the results of an automation tool that runs overnight. The folder was filled with dozens of files reporting potential problems, so I jumped in to investigate.

Each report happens to have three files. They have this naming convention:

Myfile.dmp - a memory dump of the state OneNote was in. Megabytes in size.
Myfile.txt - a short text file with basic information. 1-2K in size.
Myfile_verbose.txt - a much more detailed file. Roughly 30K in size.
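
As a minimal sketch of that convention, the helper below takes the path of any one of the three files and works out the other two. The names ReportNaming and SiblingFiles are made up for illustration and are not part of the tool. It assumes "_verbose" only ever appears as the suffix of the detailed text file, so a report whose own name ended in "_verbose" would be misread.

```csharp
using System;
using System.IO;

public static class ReportNaming
{
    private const string VerboseSuffix = "_verbose";

    // Given the path of any one file in a report, return all three paths
    // in the order: memory dump, short text file, verbose text file.
    public static string[] SiblingFiles(string anyReportFile)
    {
        // ChangeExtension(path, null) strips the extension, dot included.
        string stem = Path.ChangeExtension(anyReportFile, null);

        // Assumption: "_verbose" marks the detailed text file and nothing else.
        if (stem.EndsWith(VerboseSuffix, StringComparison.OrdinalIgnoreCase))
        {
            stem = stem.Substring(0, stem.Length - VerboseSuffix.Length);
        }

        return new[] { stem + ".dmp", stem + ".txt", stem + VerboseSuffix + ".txt" };
    }
}
```

Calling ReportNaming.SiblingFiles on either Myfile.dmp or Myfile_verbose.txt should give back the same three names listed above.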

It quickly started to look like there was only one real error and tons of duplicate reports of it. Not wanting to open each file individually, I decided to knock out a quick tool to remove the duplicates so I could see what was left. Since the memory dumps are a little time-consuming to open, I figured a few minutes of tool work would save me time later on today.

I actually did not need to write much code, and after a quick review on MSDN, here's the "down and dirty" routine I came up with.

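// Requires: using System; using System.IO;
const string verboseSuffix = "_verbose";
string[] textFiles = Directory.GetFiles(@"\\servername\folder", "*.txt", SearchOption.TopDirectoryOnly);
foreach (string path in textFiles)
{
    // The list was built up front, so an earlier pass may already have deleted this file.
    if (!File.Exists(path))
        continue;

    string contents = File.ReadAllText(path);
    if (!contents.Contains("SearchString"))
        continue;

    Console.WriteLine(path + " has SearchString in it");

    // Strip the extension (and the "_verbose" suffix, if present) to get the shared base name.
    // Splitting the whole path on '.' would break on a server or folder name containing a dot.
    string baseName = Path.ChangeExtension(path, null);
    if (baseName.EndsWith(verboseSuffix, StringComparison.OrdinalIgnoreCase))
        baseName = baseName.Substring(0, baseName.Length - verboseSuffix.Length);

    string[] reportFiles = { baseName + verboseSuffix + ".txt", baseName + ".dmp", baseName + ".txt" };
    foreach (string f in reportFiles)
    {
        Console.WriteLine("Deleting file " + f);
        File.Delete(f);
    }
}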

So the code looks in the folder and gets a list of all the text files. I could have limited this routine to only use the "verbose" text files since that would be the only file of the three with the search text, but like I said, this is down and dirty. It then iterates through the text files to see which ones contain the key phrase I wanted. If a file has the phrase, I can take advantage of our naming convention to generate the names of all three files in that report and delete them.
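
For anyone as nervous about deleting files as I was, here is a rough sketch of a more cautious variant. It only scans the "verbose" text files and returns the list of files it would delete, without deleting anything. The class and method names (DuplicateReports, FindFilesToDelete) are invented for this example, and it assumes every report follows the three-file naming convention described earlier.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DuplicateReports
{
    // Returns the files belonging to every report whose verbose text file
    // contains searchString. Nothing is deleted here; the caller decides.
    public static List<string> FindFilesToDelete(string folder, string searchString)
    {
        const string verboseTail = "_verbose.txt";
        var toDelete = new List<string>();

        foreach (string verbosePath in Directory.GetFiles(folder, "*" + verboseTail, SearchOption.TopDirectoryOnly))
        {
            if (!File.ReadAllText(verbosePath).Contains(searchString))
            {
                continue;
            }

            // Drop "_verbose.txt" from the end to get the shared base name.
            string stem = verbosePath.Substring(0, verbosePath.Length - verboseTail.Length);
            toDelete.Add(stem + ".dmp");
            toDelete.Add(stem + ".txt");
            toDelete.Add(verbosePath);
        }

        return toDelete;
    }
}
```

The idea is to print the returned list, look it over, and only then loop over it with File.Delete. File.Delete does not throw when the file is already gone, so a report with a missing dump should not trip it up.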

Nothing rocket science about this. It took about 20 minutes to write, 5 minutes to test (I was nervous about deleting potentially needed files), and a few seconds to run. If each of the files needed 5 minutes of manual work to check whether this text was buried inside, which is a reasonable estimate, I probably saved around 3-5 hours of work.

A nice little savings here.

Questions, comments, concerns and criticisms always welcome,

John