
Microsoft Excel

The team blog for Microsoft Excel and Excel Services.
Fun With Conditional Formatting

Yesterday I wrote about some work I was doing with Tables recently.  Today I want to do the same for conditional formatting – specifically, using colour scales.  (For a refresher, or for those that are new to this blog, you can read up on changes to conditional formatting in this series of blog posts, and you can read specifically about colour scales here).

In this case, I was looking at a table that contained the results of a set of tests.  The table looked like this (again, I made up the data for this blog entry).



Essentially, we run a bunch of scenarios against a series of tests periodically and look at the results.  0 is ok, positive numbers are good (the higher the better), and negative numbers are bad (lower means worse).  Seemed like an ideal candidate for a colour scale.  Using a button that I had added to my Quick Access Toolbar, I applied a colour scale to the table.  Here is the result.



As you can see, there are a couple of problems.  The biggest one is that a few outliers on the low end of the scale make everything else look good (remember that, in the default application, the 3-colour scale uses the lowest value, the 50 percent mark, and the highest value to assign colours … if there is an outlier on either end of the data, it can lead to results like this).  You can really see the problem if you click on the image above.  So, to fix that, I made some adjustments to the rule.  Using the manage rules dialog …



… I made a couple of changes to how the rule was going to be evaluated.  First, I changed the midpoint from 50% to a number, and set that number to 0 (since above 0 was good and below was bad).  Second, I used a formula to determine the value for the minimum (lower bound) colour to eliminate the effect of the outliers.  Specifically, I changed the minimum to be of type “formula” and entered this formula, which simply returns the value of the 5th percentile: =PERCENTILE($C$3:$I$52, 0.05).  (Choosing the 5th percentile is not terribly scientific, but I had seen this report enough to know that there were never more than one or two outliers, so using the 5th percentile should handle them.)
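
To make the effect of the percentile formula concrete, here is a minimal Python sketch of the calculation outside Excel. It assumes PERCENTILE interpolates linearly between the closest ranks with both endpoints included, which is my understanding of the function. The name percentile_inc and the sample data are made up for illustration; they are not the data from the table above.

```python
def percentile_inc(values, k):
    """Return the k-th percentile (0 <= k <= 1) of values.

    Uses linear interpolation between the closest ranks, endpoints included.
    Assumption: this mirrors how Excel's PERCENTILE treats its inputs.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= k <= 1:
        raise ValueError("k must be between 0 and 1")
    ordered = sorted(values)
    position = k * (len(ordered) - 1)          # fractional zero-based index
    lower = int(position)                      # index at or just below position
    upper = min(lower + 1, len(ordered) - 1)   # neighbour above, clamped to the end
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# Hypothetical test results: two extreme negative outliers plus 98 ordinary values.
results = [-40.0, -35.0] + [x / 10 for x in range(-20, 78)]

low_bound = percentile_inc(results, 0.05)
print("true minimum:", min(results))
print("5th percentile used as the lower bound:", low_bound)
```

With data shaped like this, the true minimum is dominated by the outliers, while the 5th percentile lands among the ordinary values, which is why it makes a more useful anchor for the deep red end of the scale.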



So now my rule ran something like this – “turn -1.36 (what the formula calculates to in this data set) and everything below it deep red.  Everything between -1.36 and 0 is a gradient fill between red and yellow.  Everything between 0 and the highest value is a gradient fill between yellow and green.”  The results look something like this:



That was better, but still not what I wanted, since the effect was pretty busy.  So, I decided to make one more change.  Once again I opened up the Edit Rule dialog, and this time I changed the colour for the midpoint from yellow to white.



Once I hit OK a couple of times, the result looked like this …



… which is pretty much what I wanted.  Values close to 0 are mostly white, so they are not called out as good or bad visually, the outliers on the negative side (probably measurement problems) are not skewing the results, and there is good gradation on both sides, allowing me to quickly eyeball around for things to investigate.
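
For readers who want to see the logic of the finished rule spelled out, here is a rough Python sketch of a 3-colour scale with a clamped lower bound, a midpoint at 0, and a white middle colour. The RGB values and the straight-line blending between colours are assumptions on my part, not a statement of how Excel computes its fills, and the function names are hypothetical.

```python
# Assumed colours for the sketch; the real rule uses whatever is picked in the dialog.
RED = (248, 105, 107)
WHITE = (255, 255, 255)
GREEN = (99, 190, 123)


def lerp_colour(start, end, t):
    """Blend two (r, g, b) tuples; t=0 gives start, t=1 gives end."""
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))


def three_colour_scale(value, low, mid, high):
    """Map a value to a fill colour, assuming low < mid < high.

    Values at or below low are solid red, values at or above high are solid
    green, and everything in between fades through white at mid.
    """
    if value <= low:
        return RED
    if value >= high:
        return GREEN
    if value < mid:
        return lerp_colour(RED, WHITE, (value - low) / (mid - low))
    return lerp_colour(WHITE, GREEN, (value - mid) / (high - mid))


# low comes from the percentile formula (-1.36 in the post); high here is made up.
for sample in (-25.0, -1.36, -0.5, 0.0, 1.5, 4.0):
    print(sample, three_colour_scale(sample, low=-1.36, mid=0.0, high=4.0))
```

The clamp at low is what stops the outliers from stretching the red half of the scale, and placing white at the midpoint is what keeps values near 0 visually quiet.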

Hopefully that was interesting – the point being that by tweaking the rules themselves, there is no end of possibilities available with conditional formatting.

Posted: Wednesday, July 19, 2006 6:06 AM by David Gainer

Comments

Superphilipp said:

Huh? What has happened to your Ribbon(tm)? I seem to remember that it was about three times as high.
# July 19, 2006 9:35 AM

ZivC said:

Assume I have tabular data, and I want to paint in red an entire row when one cell (say, in the first column) of that row meets some condition. How do I do that? AFAICS, conditional formatting is per cell only.
# July 19, 2006 10:23 AM

Mario Goebbels said:

Seeing that huge quickstart bar, I take it the own food doesn't taste good enough? ;)
# July 19, 2006 10:28 AM

joe said:

ZivC - you use the "Formula Is" function of conditional formatting, available in all conditional formatting (or at least back to Office 2000, that I know of).

For all the cells you want to format based on one cell, you would use "Formula is" and return TRUE or FALSE based on that formula, using an IF, or OR or AND, etc.

e.g.
highlight several cells in a row to be colored based on column A and apply this "Formula Is" conditional formatting to all cells
=IF($A1=1,TRUE,FALSE)
you can drag or copy/paste this formatting down all the rows, and all the rows will change format according to the values in column A
# July 19, 2006 10:32 AM
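
The reason joe's formula formats a whole row is the mixed reference $A1: the column is fixed to A while the row follows each cell the rule is applied to. A small Python sketch of that evaluation, using a made-up table and a hypothetical function name:

```python
def cell_formats(table, condition):
    """Return a grid of True/False flags, one per cell.

    Every cell in a row gets the result of testing that row's first column,
    which is what a rule written against $A1 does when applied across a range.
    """
    return [[condition(row[0])] * len(row) for row in table]


# Hypothetical data: column A holds a status code, the other columns hold details.
table = [
    [1, "north", 120],
    [2, "south", 95],
    [1, "east", 143],
]

# Equivalent in spirit to the rule =$A1=1
for row, flags in zip(table, cell_formats(table, lambda a: a == 1)):
    print(row, "-> highlight" if flags[0] else "-> leave as is")
```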

Kdbertel said:

That still looks a little busy to me, even with the white. I like the bar-graph backgrounds a bit more.

On that note, is there a way to apply custom formatting to a cell based on the value of another cell?

And, I guess you prefer a large quickstart bar to using the ribbon? Any particular reason? How convoluted is doing all this with the ribbon, instead of the quicklaunch?
# July 19, 2006 11:27 AM

Mark D said:

I don't understand why all the border types aren't available in conditional formatting. As well, why can't things like text be set?

joe, a normal =$a1=1 would work as well because it evaluates to true/false by default. the if becomes redundant
# July 19, 2006 12:19 PM

Harlan Grove said:

One set of gotta have QAT icons for CF, another set for external data, another set for pivot table operations, another set for drawing, . . .

Since the QAT can't extend to multiple rows of icons, kinda looks like there's still some apparent need for old style toolbars.

Maybe the ribbon is a better menu, but that's all it is. Toolbars served a different need. Too bad Microsoft decided to fix something that wasn't broken.

As for the example images, you've redefined UNprofessional. How about an icon of a dog food dish for the dog's lunch formatting option?
# July 19, 2006 12:42 PM

Brent said:

When you conditional format a range with the icons, can a legend be displayed automatically that shows what each icon represents?
# July 19, 2006 1:06 PM

Mike Staunton said:

It's the ad-hoc choices that you're making about the boundaries for the different conditional formats that worry me

If you have some knowledge about the distribution of the underlying data points - such as Normal - then it would be much better to calculate the mean and standard deviation of the sample points and then normalize the data points before applying conditional formats
# July 19, 2006 1:12 PM
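
Mike's suggestion can be sketched in a few lines of Python. This assumes the sample standard deviation is the one intended (the comment does not say which), and the function name and data are hypothetical.

```python
from statistics import mean, stdev


def z_scores(values):
    """Rescale values to have mean 0 and sample standard deviation 1."""
    centre = mean(values)
    spread = stdev(values)  # sample standard deviation; an assumption here
    if spread == 0:
        raise ValueError("all values are identical, so there is nothing to scale")
    return [(v - centre) / spread for v in values]


# Hypothetical test results, including one suspicious reading.
results = [0.4, -0.2, 1.1, 0.0, -6.5, 0.7]
for raw, z in zip(results, z_scores(results)):
    print(f"{raw:6.2f} -> {z:6.2f}")
```

Once the data is normalized this way, the colour scale boundaries can be expressed in standard deviations (for example, deep red at -2 and deep green at +2) rather than as ad-hoc numbers.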

Leta said:

Mike, I think David was showing ad-hoc analysis of data, so that was the whole point.

Harlan, David stated that he is a heavy keyboard user. What is your problem with that? (beside trolling)
# July 19, 2006 1:56 PM

Harlan Grove said:

Leta, how is the size of the QAT related to keyboard use? Or do you mean that David's heavy keyboard use explains the ugly conditional formatting?

If Excel had 3D contour plots, color-coded CF as shown above would be unnecessary.
# July 19, 2006 4:24 PM

David Gainer said:

Superphilipp, Mario, Kdbertel, what you see in the post with respect to UI is the fact that I added some buttons to the Quick Access Toolbar and have collapsed the ribbon.  Since I use mostly keyboard shortcuts, I find that for most work, the shortcuts and a number of items on the Toolbar are sufficient, and for the odd time where I need the Ribbon, it is one keystroke (CTRL + F1) away.  This setup also gives me more space for cells on my monitor than Excel 2003.  None of this is a comment on the Ribbon – simply that this is the way that I work.

To answer the other question asked in this area, things are no harder or easier with the Toolbar relative to the Ribbon; it is simply a matter of personal preference.

ZivC, I think Joe answered your question.

Kdbertel, yes, you can now apply number formats as part of conditional formatting, which includes custom number formats.

MarkD, we enabled custom number formats this time around.  If there is a lot of demand for borders, we will take  a look.

Brent, you would have to add one manually (add the icons to some other cells and type text).  Good feature idea for the future.

Mike, I wholeheartedly agree.
# July 19, 2006 7:17 PM

sam said:

"To answer the other question asked in this area, things are no harder or easier with the Toolbar relative to the Ribbon; it is simply a matter of personal preference. "

Well how about MS respecting this "Personal Preference" and giving me the "choice" to use the Classic UI or the Nasty Ribbon

Sam

# July 20, 2006 3:19 AM

ZivC said:

David: Yes, Joe and Mike answered my question (thanks guys). I'm disappointed with the answer, however. Isn't the whole point of the ribbon to make such features easy to find? In my personal experience, it's far from obvious.
# July 20, 2006 12:10 PM

Joe said:

ZivC - i was answering how to use "Formula Is" in Excel 2003 and below. i haven't tried to use it in Excel 2007 yet.

I agree that the "Formula Is" functionality is hidden, and hard to use - it usually takes me a couple tries before I actually get what I want. i  have no idea if this particular feature is improved or more visual in Excel 2007. I don't think the ribbon addresses this issue - maybe the ribbon will get you to use Conditional Formatting better, but other design features will have to be implemented better in order to show people how to use CF more effectively.
# July 20, 2006 3:22 PM

David Gainer said:

ZivC, we considered advertising this at the top level in the ribbon, but based on our research, the new conditional formatting rules we added covered by far the majority of the cases that people use formulas for in current versions of Excel.  We will continue to think about this area going forward.
# July 20, 2006 5:49 PM

Andrew said:

How do I programmatically add a formatting rule? For example, I would like to create a macro that creates a number of formatting rules that can be used later by the user or write an Add-In that generates formatting rules. Does the Excel Object Model include these rules?

Thanks,
Andrew
# July 21, 2006 12:12 PM

David Gainer said:

Andrew, check out this post for some VBA examples:

http://blogs.msdn.com/excel/archive/2005/10/14/481237.aspx
# July 21, 2006 4:08 PM

Biff said:

"the "Formula Is" functionality is hidden, and hard to use"

I don't know about Excel 2007, but in earlier versions it's not hidden at all and is exactly where you would think it should be, under Format. It's "easy" to use. In fact, I use "Formula Is" exclusively. I never use "Cell Value Is".

"use "Formula is" and return TRUE or FALSE based on that formula"

Using a logical comparison that returns a boolean is the most common method, however, you can use any formula expression that returns a numeric value as well. Any numeric value other than 0 will be evaluated as TRUE and the format will be applied. A numeric return of 0 will be evaluated as FALSE and the format will not be applied. A very simple example would be:

=COUNTIF(A$1:A$10,"x")

If the result was 0 no format would be applied.

If the result was 7 the format would be applied.  
# July 22, 2006 12:05 AM
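
Biff's point about numeric results can be illustrated with a short Python sketch. The helper names are hypothetical, and the count is case-sensitive here, which is a simplification of how COUNTIF matches text.

```python
def countif_equals(cells, target):
    """Count the cells equal to target (a simplified stand-in for COUNTIF)."""
    return sum(1 for cell in cells if cell == target)


def format_applies(formula_result):
    """Treat any non-zero number as TRUE and zero as FALSE."""
    return formula_result != 0


column = ["x", "", "x", "y", "x", "", "x", "x", "x", "x"]
count = countif_equals(column, "x")
print(count, format_applies(count))                               # non-zero, so the format is applied
print(format_applies(countif_equals(["a", "b"], "x")))            # zero, so it is not
```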

Colin Banfield said:

Biff, the “Formula Is” equivalent is also easy to find in Excel 2007. You select New Rule from the Conditional Formatting menu (which displays after you click the Conditional Formatting button on the Ribbon). In the New Formatting Rule dialog box, there's a list of rule types and one of these is "Use a formula to determine which cells to format."

I suspect that the real concern being alluded to is contained in the "...and hard to use" part. Except for providing examples in a help window, I don't know how the UI could make using specific formulas in conditional formatting more "discoverable." For example, how can the UI expose the fact that a formula like =COUNTIF($A$1:$A$10,A1)>1 highlights duplicates in the selected range A1 to A10? From this perspective, formulas would appear "hard to use." However, what you can do is figure out the most common conditions that folks use conditional formatting formulas for and expose these conditions as highlighting options in the UI, thus making these specific conditions “discoverable.”

In this respect, Excel 2007 does a reasonably good job. For example, on the Conditional Formatting menu you can choose to highlight cells containing duplicates, unique values, top N items, top N %, bottom N items, bottom N %, above average, below average and 1 to 3 standard dev above or below average.  You can highlight cells with dates occurring today, yesterday, tomorrow, last 7 days, next month and so on.  You can highlight cells with text containing, not containing, beginning with & ending with.  You can highlight cells with blanks, non-blanks, errors or no errors.  Finally, there’s a column comparison option available to Tables.  To highlight cells with any of the preceding conditions in earlier versions of Excel, you'd need to use formulas, and this is where discoverability becomes an issue for most folks.  In Excel 2007, there should be far less need to use formulas at all.

On a related note, it’s unfortunate that Data Validation wasn’t a similar beneficiary of any new conditions in Excel 2007.  For example, the ability to use UDFs directly in data validation formulas (the workaround to reference a UDF in a worksheet cell is lame and doesn’t work if you want to apply the UDF to a bunch of input cells). Also, I’d have liked to see a “like” operator, so that you could validate a specific input string format.
# July 22, 2006 1:09 PM
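
The duplicate-highlighting formula Colin mentions, =COUNTIF($A$1:$A$10,A1)>1, asks of each cell "does my value occur more than once in the whole range?". A minimal Python sketch of the same test, with a hypothetical function name and sample data (comparison here is case-sensitive, which is a simplification):

```python
from collections import Counter


def duplicate_flags(values):
    """Flag every value that occurs more than once, including its first occurrence."""
    counts = Counter(values)
    return [counts[value] > 1 for value in values]


names = ["Smith", "Jones", "Smith", "Patel", "Jones", "Nguyen"]
for name, is_duplicate in zip(names, duplicate_flags(names)):
    print(name, "-> highlight" if is_duplicate else "")
```

Because the count runs over the whole range rather than only the cells above, every occurrence of a repeated value is flagged, not just the second and later ones.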

Biff said:

How does Excel 2007 conditional formatting handle these:

** highlight cells containing duplicates

Is the first instance of a value considered a duplicate?

A1 = Smith
A2 = Smith

Are both cells highlighted or is just A2 highlighted?

** Top/bottom N lists

Are "ties" accounted for?

A1 = 1
A2 = 1
A3 = 1
A4 = 2
A5 = 3

Would the bottom 3 be A1:A3 or A1:A5 ?

Or:

A1 = 1
A2 = 1
A3 = 1
A4 = 1
A5 = 3

Would the bottom 3 be A1:A4 ?
# July 22, 2006 6:09 PM

Colin Banfield said:

Is the first instance of a value considered a duplicate?

A1 = Smith
A2 = Smith

Are both cells highlighted or is just A2 highlighted?

<<<Both Cells are highlighted.>>>

A1 = 1
A2 = 1
A3 = 1
A4 = 2
A5 = 3

Would the bottom 3 be A1:A3 or A1:A5 ?

<<<A1:A3>>>

A1 = 1
A2 = 1
A3 = 1
A4 = 1
A5 = 3

Would the bottom 3 be A1:A4 ?

<<<Yes.>>>
# July 22, 2006 6:23 PM
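
One rule that is consistent with both of Colin's answers is "highlight every value that is less than or equal to the Nth smallest value". Whether Excel implements it this way is an assumption; the Python sketch below only shows that this rule reproduces the two cases discussed, and the function name is hypothetical.

```python
def bottom_n_flags(values, n):
    """Flag values that are <= the n-th smallest value, so ties are all included."""
    if not 1 <= n <= len(values):
        raise ValueError("n must be between 1 and the number of values")
    threshold = sorted(values)[n - 1]  # the n-th smallest value
    return [value <= threshold for value in values]


print(bottom_n_flags([1, 1, 1, 2, 3], 3))  # first case from the comment
print(bottom_n_flags([1, 1, 1, 1, 3], 3))  # second case, with a four-way tie
```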

Biff said:

The only gripe I have about CF Formula Is, is that teenie tiny microscopic little box for the formula. Argh!
# July 22, 2006 6:31 PM

David Gainer said:

Interesting discussion.  Colin, thanks for the explanation.  One note – I am sorry to say the “column comparison” is going to be gone in the beta refresh (and for 2007).  We turned up some pretty significant bugs recently, and we are past the point where we can do the work required to address them, so that is going to have to wait for the next version.  Thanks for the data validation items.

Biff, I agree with you about the smaller refedits.  That’s something I would like to try and solve all over the place in the future.
# July 23, 2006 1:33 AM

Biff said:

I would upgrade just for that one improvement!
# July 23, 2006 1:56 AM

Gareth Horton said:

David,

We develop a product that exports to Excel.  In the past, we have not pushed too much of our metadata over to the Excel side, but with the new Open XML formats, we are in the position to do this much more.  One of the main areas for this is Conditional Formatting.  We are able to define conditional formatting with our product and now push it over to Excel Conditional Formatting rules, rather than just applying the formatting on the cells/ranges themselves.

Our product is capable of dealing with quite a lot of data, as well as having fairly complex formatting rules.  What we found when we started to do this with XLSX is that as with Excel 2003, Excel 2007 remains very buggy in all areas of conditional formatting: display and configuration.

1. On a fairly up to date machine (18 mths old middle end PC, 1GB RAM) XLS (using BIFF8) or XLSX files with more than 2000-3000 conditional formatting 'rules' seems to be the tipping point to poor display performance. We can exceed that very easily with an export from our product.  

2. Whereas OpenOffice.org 2.0 will show similar problems when opening an XLS file with large numbers of conditional formats, once opened, it will outperform Excel 2007 in display performance.

3. In our efforts to produce creative formulas in XLS to overcome the 3-rule limit, we discovered that the boolean operator OR() will evaluate both of its expressions, even though the first evaluates to true, (as we later confirmed in the Excel 2003 help). Could this be fixed in 2007?

Seeing as the opening up of the Excel file format to developers (only true Excel dependents such as ourselves tackled native BIFF8) will allow many more applications to create rich, fully featured Excel files, now potentially up to 1 million rows and numerous columns in size, could you review the way Excel handles conditional formatting from a performance perspective.

The legacy way that conditional formatting is implemented in openxml, might benefit from a different approach, such as using formatting IDs as with traditional xfids, allowing cells to be a member of a cell formatting group, as opposed to a post-display evaluation

I can certainly generate very complex files for you, if you need them.

Thanks in advance

Gareth

gareth_horton@datawatch.com

# July 25, 2006 1:59 PM
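
Point 3 in Gareth's comment describes the difference between eager and short-circuit evaluation. The Python sketch below contrasts the two; it illustrates the general idea only and makes no claim about Excel's internals. All names are hypothetical.

```python
calls = []  # records which conditions were actually evaluated


def check(name, result):
    """Stand-in for an expensive sub-expression; logs that it ran."""
    calls.append(name)
    return result


def eager_or(*conditions):
    """Behaves like a function call: every argument is computed before this runs."""
    return any(conditions)


calls.clear()
eager_or(check("first", True), check("second", False))
print("eager:", calls)          # both conditions were evaluated

calls.clear()
check("first", True) or check("second", False)
print("short-circuit:", calls)  # evaluation stopped after the first True
```

When a rule's second expression is expensive, the eager form pays that cost for every cell even when the first expression has already decided the outcome, which is one reason workbooks with thousands of such rules can be slow to display.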