Oftentimes, whether due to a misconfiguration, a bug, a solar eclipse or otherwise, customers call into Microsoft Product Support Services complaining that their Exchange server is churning out transaction logfiles at an alarming rate. For every instance of this symptom, there are at least a dozen possible reasons why it is happening. Regardless, there's never been a good way to parse the transaction logs and extract any useful patterns. Rather than rolling up my sleeves and actually writing code to accomplish such a task, I've slapped together a bunch of utilities that will do the job. Ugly? Sure. Useful? You bet. Having used it against many customer issues, I can attest that this actually works, and works quite well.
1. Download the "Unix for Win32" utilities from http://downloads.sourceforge.net/unxutils/UnxUtils.zip?modtime=1172730504&big_mirror=0
2. Extract all files from the UnxUtils\usr\local\wbin subdirectory to C:\UNIX
3. Download strings.exe from http://live.sysinternals.com/strings.exe, and place strings.exe into C:\UNIX
4. Make a C:\TMP directory (Unix tools need a Win32 equivalent of /tmp)
5. Make a directory for all your transaction log files (e.g. D:\customers\test), and place all the logs in this dir
6. From a cmd prompt, navigate to your C:\UNIX dir
7. Run the following command:
strings -q -n 16 D:\customers\test\*.log | cut -f3 -d: | sort | uniq -c | sort | tee c:\log-output.wri
What this is doing:
· Identifies all strings in the logs that are at least 16 chars long
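· Keeps only the third colon-delimited field of each line, which removes the D:\customers\test\E00xxxx.log: prefix from the output (anything after a later colon in the string itself is dropped too)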
· Sorts the output
· Finds all duplicate records, and retains a count
· Sorts the final output (ending with the largest # of occurrences)
· Writes all the output to c:\log-output.wri (use WordPad / write.exe to open; notepad.exe mangles the output)
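For anyone who wants to see the same idea without the Unix utilities, here is a minimal Python sketch of the steps above. It is an approximation, not a drop-in replacement: it only looks for runs of single-byte printable ASCII (as I understand it, the Sysinternals tool also reports Unicode strings by default), it keeps each whole string rather than cutting at colons, and the function names are made up for illustration.

```python
import re
from collections import Counter
from pathlib import Path

# Runs of 16 or more printable ASCII bytes, mirroring `strings -n 16`.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{16,}")


def count_strings(data_chunks):
    """Count every printable run found across an iterable of bytes objects."""
    counts = Counter()
    for data in data_chunks:
        for match in PRINTABLE_RUN.finditer(data):
            counts[match.group().decode("ascii")] += 1
    return counts


def ranked(counts):
    """Return (count, string) pairs, least frequent first, like the final sort."""
    return sorted((n, s) for s, n in counts.items())


def scan_directory(log_dir):
    """Read every *.log file in log_dir and return the ranked string counts."""
    chunks = (path.read_bytes() for path in sorted(Path(log_dir).glob("*.log")))
    return ranked(count_strings(chunks))
```

Calling scan_directory(r"D:\customers\test") and printing the last few entries gives roughly the same view as scrolling to the bottom of log-output.wri.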
If you're running this on Windows 7 or above, you'll have to modify the output directory as follows (as it won't let you write directly to the root of the C: drive) ...
strings -q -n 16 D:\customers\test\*.log | cut -f3 -d: | sort | uniq -c | sort | tee c:\users\yourname\log-output.wri
The output will be sorted from the least number of repeating occurrences to greatest, so crack open that log-output.wri file, scroll to the bottom, and commence spelunking!
I've run a 30 minute ExMon trace and have dumped the results to a CSV file so I can manipulate the data in Excel.
What column of data would be the best one to focus on for the "highest amount of activity" by a user?
I believe the column you're after is "Total Bytes" (or similar) ... we want to identify who is sending the most data.
Wasn't sure if I should focus on "Packets", "Bytes Out" or "Log Bytes"...
Ah ... "Log Bytes." That's the one.
So what is the "Log Bytes" counter? I did not find it in the documentation.
Use Exchange User Monitor (ExMon) on the server side to determine whether a specific user is causing the log growth. Sort on CPU (%) and look at the top 5 users consuming the most CPU inside the Store process. Check the Log Bytes column for those users to see whether any of them accounts for the log growth. If that does not point to a user, sort on the Log Bytes column to look for any users who could be contributing to the log growth.
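If the ExMon results are already in a CSV file, the "sort on Log Bytes" step can also be done outside Excel. The sketch below is only an illustration: the header names "User" and "Log Bytes" are assumptions about how the export labels its columns, so adjust them to match your file, and top_by_column is a hypothetical helper name.

```python
import csv
import io


def top_by_column(csv_text, column="Log Bytes", user_column="User", limit=5):
    """Return the `limit` largest (value, user) pairs for a numeric CSV column.

    The default column names are assumptions about the ExMon export headers.
    Rows whose value is missing or not numeric are skipped.
    """
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            # Tolerate thousands separators such as "1,200".
            value = float(row[column].replace(",", ""))
        except (KeyError, ValueError, AttributeError):
            continue
        rows.append((value, row.get(user_column, "")))
    rows.sort(reverse=True)
    return rows[:limit]
```

Reading the exported file into a string and passing it to top_by_column lists the heaviest log writers first; passing column="CPU (%)" (again, an assumed header) gives the CPU view described above.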
Hope this helps,
Thanks for sharing the blog, Scott. So I must run this on my workstation after copying all of the Exchange transaction logs to my local hard drive?
Yes, this would be performed locally on a workstation (off the Exchange Server). While you don't have to copy *all* the logfiles to your workstation, the more data that these tools run against, the more relevant the result set becomes.
I am not sure if you can help me, but I did the steps outlined in your blog and I got strings that contain only a single letter, e.g. " DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD", which filled up to a page in WordPad. Do you know what could be the problem? I already located the user causing the problem and placed him on a separate database.
thank you in advance :)
One of the techniques I've found helpful when this approach doesn't produce useful output is to reduce the number of commands until you identify the one that is causing issues.
For example, try this:
strings -q -n 16 D:\customers\test\*.log | cut -f3 -d: | sort | uniq -c | sort
If this still is producing garbage, try this:
strings -q -n 16 D:\customers\test\*.log | cut -f3 -d: | sort | uniq -c
If the same, then this:
strings -q -n 16 D:\customers\test\*.log
Hope this helps locate where the "garbled output" is coming from.
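The same diagnostic idea can be sketched in Python by running the pipeline one stage at a time and looking at a small preview after each stage. This is a rough model rather than the real tools: the stage functions below only imitate cut, sort and uniq -c, the names are hypothetical, and the input is assumed to be lines shaped like the "path: string" output shown earlier.

```python
from collections import Counter


def cut_field3(lines):
    """Imitate `cut -f3 -d:`; lines without any colon pass through unchanged."""
    for line in lines:
        parts = line.split(":")
        if len(parts) == 1:
            yield line
        else:
            yield parts[2] if len(parts) > 2 else ""


def uniq_count(lines):
    """Imitate `uniq -c` on sorted input: a right-aligned count, then the line."""
    counts = Counter(lines)
    return [f"{n:7d} {s}" for s, n in sorted(counts.items())]


# Stage names follow the commands in the original pipeline.
STAGES = [
    ("strings", list),
    ("cut", cut_field3),
    ("sort", sorted),
    ("uniq -c", uniq_count),
    ("sort", sorted),
]


def run_stages(lines, stages, preview=3):
    """Apply stages in order, recording (name, line count, first few lines)."""
    report = []
    current = list(lines)
    for name, stage in stages:
        current = list(stage(current))
        report.append((name, len(current), current[:preview]))
    return report
```

Printing each entry of run_stages(lines, STAGES) shows the first stage at which the output stops looking sensible.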
Thanks very much for posting this, it's been a great help. I used it today as our transaction logs are filling in bursts at a rapid rate, filling about 50 logs per minute for 5 minutes then doing it again about 30 mins later.
From the info in the output file, I found that every log in one of these bursts seems to have the same email about 10 times, so I think there is a stuck message or a message bouncing around in Exchange.
We have 3 Exchange servers, each with 2 Storage Groups. The transaction logs are filling only for some of the SGs, but for at least one on each of the 3 servers. I can see the email addresses of the users in the email, and they are on 2 of the 3 servers. The stores on the servers that those 2 users are on have been dismounted, but the other server still gets the message in its transaction logs.
Do you have any idea how I can find out where the stuck message is and remove it?
I ran ExMon but couldn't see any particular user thrashing the server.
Sorry to respond with more questions ...
- Do these same users show up with the same level of frequency in the Message Tracking logs? If so, is the Message-ID consistent?
- For these users, does OWA show the same view of the Outbox as Outlook (thinking: stuck message repeatedly trying to send)? You could also log in to these "suspect" mailboxes with either Outlook or MFCMAPI (http://mfcmapi.codeplex.com) in online mode to double-check, too.
- Finally, what server & client versions are in use here? The reason I ask is that in older builds, Exchange would allow Outlook to repeatedly try & submit messages even if the user was over their send/receive quota. Also, I fixed a bug in the store many moons ago where a VSAPI-based virus scanner would induce a loop if circular server-side rules were present ... see support.microsoft.com/.../923799.
Thanks for this post. Does this same technique apply to Exchange 2010 as well? I tried it and almost always end with the following in the top spot. ESE Super ECCXORChecksum. I searched on this and cannot find any information about it.
A few clarification questions if I may ...
1) I'm presuming you have "Enable background database maintenance (24 x 7 ESE scanning)" enabled, correct?
2) Does the high logfile generation only reproduce during Online Maintenance?
3) Also, what is your Maintenance Schedule set to? Any changes to it recently?
Why wasn't PowerShell used to accomplish this?