Welcome to MSDN Blogs Sign in | Join | Help

New Change to the Excel blog

I’d like to take this opportunity to announce a new feature of the blog, which some of you might have noticed already. Our Content Publishing team has begun collecting “Power Tips” from members of the Excel community, including the product team, Excel MVPs, user group members, and other Excel enthusiasts. Power Tips are intermediate-level “How-to” tips that you can use to enhance your spreadsheets. We hope they’ll be valuable to those of you who feel pretty proficient with Excel, but who don’t consider yourselves seasoned spreadsheet "developers" (yet).

We have already tagged a number of past posts with the Power Tips tag, so you can start following this feature today by clicking the tag on the right side of this page. You can even set up an RSS feed for just that tag, if you want.

Enjoy!

Week Numbers in Excel

Today’s author is, Ron de Bruin, an Excel MVP. You can find more useful tips from Ron and links to Excel add-ins at his website: http://www.rondebruin.nl/

There are four primary week numbering systems in use worldwide. Each system has subtle differences that you should be aware of. Excel can work with any of these systems:

1)  ISO Week number: The International Organization for Standardization (ISO) ISO8601:2000 Standard.
All weeks begin on a Monday. Week one starts on Monday of the first week of the calendar year with a Thursday.
2)  Excel WEEKNUM function with an optional second argument of 1 (default).
Week one begins on January 1st; week two begins on the following Sunday.
3)  Excel WEEKNUM function with an optional second argument of 2.
Week one begins on January 1st; week two begins on the following Monday.
4)  Simple week numbering.
Week one begins on January 1st, week two begins on January 8th, and week 53 has only one or two days (for leap years).

Note: Excel does not have a standard worksheet function for the ISO week number and simple week numbering system.

Worksheet Formulas for Week Numbers

The following sections assume that you have a date in cell B4 for testing the week number formulas.

ISO Week Numbers

There is no built-in worksheet function for ISO week numbers in Excel. Copy the following formula and paste it in a worksheet cell to return an ISO week number:

=INT((B4-DATE(YEAR(B4-WEEKDAY(B4-1)+4),1,3)+WEEKDAY(DATE(YEAR(B4-WEEKDAY(B4-1)+4),1,3))+5)/7)

Alternatively, you can open the Visual Basic editor, click Module on the Insert menu, and then copy this user-defined function (UDF) into the module. After adding this UDF to your workbook, you can use it like a built-in function =IsoWeekNumber(B4)

Public Function IsoWeekNumber(d1 As Date) As Integer
' Attributed to Daniel Maher
    Dim d2 As Long
    d2 = DateSerial(Year(d1 - WeekDay(d1 - 1) + 4), 1, 3)
    IsoWeekNumber = Int((d1 - d2 + WeekDay(d2) + 5) / 7)
End Function

You can find more information about ISO dates and week numbers on this page: http://www.rondebruin.nl/isodate.htm

The Excel WEEKNUM Function

Reliance on the Analysis Toolpak in Excel versions before Excel 2007 can create problems because the add-in may not
be installed by users of your spreadsheets (a default Excel installation has it unchecked in setup). Also, there are difficulties for international users when you use Analysis Toolpak formulas because these formulas are not translated by Excel if you open the workbook in a different Excel language version.

Note: In Excel 2007 WEEKNUM is a standard built-in worksheet function, so you will not have the problems above if you share your workbook between different Excel 2007 language versions.

Important: If you do not have Office 2007 SP2 installed, read the information on this page: http://www.rondebruin.nl/atp.htm

You can use these two replacement functions from Daniel Maher to avoid problems.

Replace =WEEKNUM(B4,1) with:

=1+INT((B4-(DATE(YEAR(B4),1,2)-WEEKDAY(DATE(YEAR(B4),1,1))))/7)

Replace =WEEKNUM(B4,2) with:

=1+INT((B4-(DATE(YEAR(B4),1,2)-WEEKDAY(DATE(YEAR(B4),1,0))))/7)

Simple Week Numbers

There is no built-in worksheet function for simple week numbering in Excel. Copy the following function and paste it in a worksheet cell to return simple week numbers:

=INT((B4-DATE(YEAR(B4),1,1))/7)+1

The Week Calendar File

The week calendar file shows you all the dates and week numbers from a certain year on one printable page. If you want to have a week calendar from another year you only have to change one cell (the year). There is a separate sheet for each of the four week numbering systems listed at the start of this post.

Download the Calendar file
Week Numbers Calendar.zip (File date : 27-Feb-2005)

Acknowledgements

I would like to acknowledge general reference on all date issues to:
Chip Pearson: http://www.cpearson.com/excel/topic.aspx
Dave McRitchie: http://www.mvps.org/dmcritchie/excel/xlindex.htm
Daniel Maher has also published numerous simplifications of date formulas some of which were used or adapted in producing the calendar.
The late Frank Kabel Frank created the ISO week number worksheet function on this page.
The base formula used for the ISO year start in the calendar file were derived from a UDF written by John Green, Sydney.

More Information
ISO Date Representatation and Week Numbering:
http://www.rondebruin.nl/isodate.htm
You should refer to Chip Pearson's web site for an exposition on Week Number implementation:
http://www.cpearson.com/excel/weeknum.htm
Implementing Week-Numbering Systems and Date/Time Representations:
http://msdn.microsoft.com/en-us/library/bb277364.aspx

Formula to Access a List of Values Interspersed with Zeros or Blanks

Today’s author is Bob Umlas, an Excel MVP since 1994.

This tip is a formula which enables you to access a list of values interspersed with zeros or blanks and pick up only the non-zero values in the same sequence they’re listed. It’s better to illustrate. Suppose you have this list in A1:A14:

image01

…and you want to produce this list:

image02

The following formula entered in E1 and filled down to E6 does the trick. It’s an array formula which means that you must press Ctrl+Shift+Enter after entering the formula instead of just Enter:

=INDEX($A$1:$A$14,SMALL(IF($A$1:$A$14<>0,ROW($1:$14),""),ROW(A1)))

Let’s take the formula apart and see how it works. The inner IF-statement, IF($A$1:$A$14<>0,ROW($1:$14),""), checks for non-zeros, and if it is a non-zero, it returns the row number; otherwise it returns blanks. If you select that portion of the formula in E1 and press the F9 key, you’ll see this:

image03

…which means that rows 1,4,7,9,10 and 14 do not contain zeros or blanks. ROW(A1) returns the value 1 and is used instead of simply the number 1 because this formula is being filled down, and in the row below it will become ROW(A2), returning a 2, etc. So, the SMALL function now returns the smallest value from that list, or 1. In the row below, we have ROW(A2) or 2, and the 2nd smallest value is 4. So each formula returns the row number for the non-zero cells. That in turn is passed to the INDEX function, and we effectively have these formulas:

=INDEX(A1:A14,1)
=INDEX(A1:A14,4)
=INDEX(A1:A14,7)
and so on, which returns the values we need for the list of non-zero values.

As an additional tip, because it may not be clear how far down to fill, you can use conditional formatting to hide the potential errors if you were to drag down too far. That is, suppose you initially dragged the formula down to E8:

image04

You can hide these this way. Select all of column E, then click Home > Conditional Formatting > New Rule > Use a formula to determine which cells to format, and enter this formula:

=ISERROR(E1)

image05

Then click the Format button, the Font tab, and select a white font:

image06

Then, when you click OK all the way out, the #NUM! errors won’t show.

The reason this formula must be entered as an array-formula is because the IF(A1:A14<>0… portion of the formula requires it. The IF-statement normally takes one value to test for true/false, not an array of values as we did here. If you don’t use Ctrl+Shift+Enter to create an array formula, you will get #VALUE! and #NUM! errors.

Microsoft Office 2010 Technical Preview

Today, Microsoft is announcing a limited, invitation only Technical Preview program for Microsoft Office 2010 that will kick off in July. The program will provide you with the opportunity to experience early, pre-release versions of Office 2010 which will include the following applications:  Word 2010, Excel 2010, Outlook 2010, PowerPoint 2010, OneNote 2010, Access 2010, InfoPath 2010 and Publisher 2010. Check out www.office2010themovie.com for signing up to be considered for the Technical Preview Program.

VBA Coding Contest

I'm passing along some information I think will be interesting to many of the readers of this blog. 

Test your VBA coding skills and win!

MSDN® is sponsoring a coding contest for Office 2007, named OfficePalooza! This sweepstakes will run two weeks beginning April 20, 2009, and features ten fun Visual Basic of Applications™ (VBA) coding challenges in the form of puzzles and games. Each entrant will earn a chance to win one of hundreds of available prizes, determined by a random drawing at the end of the contest.

In conjunction with this contest, an Advanced Business User theme will run on Microsoft Office Online from mid-April to mid-May, and will showcase the automation and extensibility aspects of Office 2007 through macros, custom VBA coding, the Fluent UI, and Office Open XML.  This collaborative effort will also highlight existing and newly-created content on the MSDN Microsoft Office Developer Center.

 

Modifying Conditional Formatting Color Ranges in Excel 2007

Today’s authors, Amit Velingkar, a Program Manager on the Excel team, and Bob Silverstein, a User Experience Researcher on the Excel team, talk about creating professional looking Color Ranges in Excel.

Introduction:

My fellow PM on Excel, Robin Wakefield, wrote a blog article on how to create professional looking charts in Excel 2007. Continuing with the same theme, we will investigate the idea of tweaking Color Ranges in Excel 2007. We will start with a default Conditional Formatting Color Range and modify it to give it more pastel look. Also, this would be an ideal forum to give us your feedback on existing conditional formatting colors used in Excel 2007.

As a part of my preparation for this article, I did two things – researched customer files that use conditional formatting and explored various options with Microsoft designers.

Looking at some customer files, I quickly observed a trend. I found out that many users wanted to highlight a small portion of their data. They were achieving this by changing some of the colors in their color ranges to white. Also, some customers preferred a pastel shade of the same color.

I discussed this trend with Microsoft designers and they taught me some neat tricks about manipulating colors. First, they told me that choosing the correct shade of white was critical while mixing colors. And last, an intuitive way of changing shades of colors was to use the HSL (Hue, Saturation and Luminosity) model.

My blog article attempts to bring all these lessons together. We will change some of the colors to white and tweak the colors shades to pastel. Finally, we will introduce a VBA macro that lets you quickly try out new color shades.

Procedure:

1. I started out with applying three of our default color combinations to some data. I choose the following color ranges: Red-Yellow-Green, Red-Yellow-Blue and Yellow-Green.

clip_image002

2. I replaced the yellow color with the appropriate shades of “white”.

Changing the yellow colors to pure whites created a high contrast. For creating whites with a good spectrum of colors, we crank up the RGB values all the way to 255 creating a pure white – Remember RGB (255,255,255) is white. Then, remove just a little amount of the participating color to create “target” white.

As an example while mixing Red and Green, we set our white to (250,250,255) by removing some red and green. The end result is a color range containing pure shades of Red and Green.

Color 1

Color 2

White

Notes

Red (255,0,0)

Green (0,255,0)

(250,250,255)

Remove some red and green

Red (255,0,0)

Blue (0,0,255)

(250,255,250)

Remove some red and blue

Red(255,0,0)

-

(250,255,255)

Remove some red

Green (0,255,0)

-

(255,250,255)

Remove some green


Select the color range. On the Home tab, click on Conditional Formatting, then Manage Rules. In the Conditional Formatting Rules Manager, select the Conditional Format and click on Edit Rule. In the Edit Formatting Rule dialog change the yellow color to white.

clip_image002[5]

3. Using HSL Model to change non-white colors

We can use the HSL model to manipulate color. HSL defines the color spectrum based on Hue (base color from the rainbow spectrum), Luminosity (brightness) and Saturation (purity of the color).

Select the color range. On the Home tab, click on Conditional Formatting, then Manage Rules. In the Conditional Formatting Rules Manager, select the Conditional Format and click on Edit Rule. In the Edit Formatting Rule dialog, click on the color dropdown and choose More Colors. Click on the Custom tab and choose HSL in the Color model dropdown.

In this example, we will change colors using the HSL models as follows:

o Hue: Unchanged, this leaves the base color unchanged. Hence our Reds stay Red.

o Saturation: Decrease the Saturation to 150. This makes the colors more pastel.

o Luminosity: Increase the luminosity to 200. The goal is to increase the brightness to make the color more visible and showcase the text behind the color ranges.

clip_image002[7]

Result:

By using shades of white and by tweaking the luminosity and saturation of colors, we changed the color ranges as follows:

image

HSL Color Manipulation using VBA:

Since I wanted to try out a lot of combinations of Luminosity and Saturation values to tweak color ranges, I decided to write a VBA macro to do this. I quickly learnt that while Excel allows HSL manipulation through the UI, it does not support HSL manipulation through VBA. I looked around for ways to manipulate HSL and finally found a blog post by Tony Jollans that did something similar. Although most of the code used is directly from Tony Jollans’s blog post, I wrote some additional subroutines, added some comments and re-factored some of the subroutines for my purpose.

The entry function for my macro is called ChangeColorRangeColors().

Sub ChangeColorRangeColors()
   ...

   For Each clrScale In Selection.FormatConditions
      For Each criteria In clrScale.ColorScaleCriteria
         ' get clr
         clr = criteria.FormatColor.Color

         ' split RGB
         SplitRGB clr, R, G, B

        'skip whites - when all RGB componenst are above 240
         If R < 240 Or G < 240 Or B < 240 Then

            ' get HSL
            RGBtoHSL clr, H, S, L

            ' increase luminosity
            L = (iLuminosity / 255) ' from 0.0 to 1.0

            ' decrease saturation
            S = (iSaturation / 255) ' from 0.0 to 1.0

            ' get RGB
            HSLtoRGB H, S, L, clr

            ' set new clr
            criteria.FormatColor.Color = clr

         End If
      Next
   Next
End Sub

At a high level, my HSL manipulator code works in the following ways:

  1. Get the RGB encoded value from the selected color ranges.
  2. Split the RGB encoded values into the Red, Green and Blue components.
  3. Skip the white colors – we do not want to change these.
  4. Covert RGB into HSL.
  5. Modify the Saturation and Luminosity, leaving Hue unchanged.
  6. Convert the modified HSL values back to encoded RGB.
  7. Apply the new RGB value back to the color range.

I won’t go into the details of converting between RGB and HSL values. These conversion routines are well documented and can be freely found on the internet.

However, another tricky part was trying to split the encoded color value that is used by Excel into individual RGB components. VBA provides a function, aptly named RGB(), that returns an encoded long value when provided with the R,G and B components. I have used this function in my macro to set the Color Range colors. However, VBA does not provide a reverse function to split this encoded long value. I have used a custom function called SplitRGB to do this.

Sub SplitRGB(RGB As Long, R As Double, G As Double, B As Double)
  
Dim HexString As String

   ' convert the long to Hex - bb:gg:rr
  
HexString = Hex(RGB)

   ' in order to get r,g,b components out of the string,
   ' we have to make it is atleast 6 characters long - bb:gg:rr
  
HexString = Right(String$(5, "0") & HexString, 6)

   ' get each individual color and convert to an double (range: 0 to 255)
  
R = CDbl("&H" & Mid$(HexString, 5, 2))
  
G = CDbl("&H" & Mid$(HexString, 3, 2))
  
B = CDbl("&H" & Mid$(HexString, 1, 2))
End Sub

Since the R, G, B values span from 0 to 255 and can be accommodated in 1 byte. Excel uses a 4-Byte long to encode the RGB values. The first byte is used to store the red value, the second byte stores the green value and the third byte stores the blue values. An easy way to observe these values is to convert this encoded long into a hexadecimal string. We then split this string using the Mid function to isolate the R, G, B components. Finally, we use a conveniently provided CDbl function to convert hexadecimal text to a number. Observe that appending the “&H” enables the CDbl function to identify the text as hexadecimal.

To recap, We can create professional looking Conditional Formatting color ranges by choosing the correct shades of whites and using the HSL model to tweak Saturation and Luminosity values. I am hoping that by using these techniques and subroutines, you will be able to automate the look and feel of your Excel Spreadsheets.

The attached workbook contains these color ranges and the VBA macros. We would love to hear your feedback on these colors.

A Robust Way To Reference Multiple Columns In a Table

I’m sure many of you have built a spreadsheet like this before: you’ve got a table of data; one of the columns in this table contains a formula which references a span of columns in the same table.  The number of columns you need to reference may change over time as requirements change.  How do you you build this table in such a way that the number of columns can be changed without breaking the formula that references them?

Let’s walk through a somewhat contrived but simple example to demonstrate the issue.  Say I’m a real estate investor and I’m tracking a list of houses I’m interested in purchasing.  I’m a demanding and detail oriented buyer so I want to know the square footage of the individual rooms in the dwelling.  Based on this information, I want to calculate the count of rooms in the dwelling as well as the total square feet of the rooms combined.  The table might look something like this:

image

The formula for the “#Rooms” column looks something like this:

=COUNT( Table1[[#This Row],[SqFt-Room1]:[SqFt-Room3]] )

Now, let’s say I want to add a new house, but this new house has four rooms.  Simple, just add a column, right?  Not so fast; depending on how we add that column our formulas for “#Rooms” and “Ttl Sq. ft.” may not update as expected.

Or let’s say I’ve changed focus to small condominiums and no longer have a need for the “Room3” column.  If I attempt to delete it, then my formulas will break.  How do we avoid this?

Here's one trick I use in such situations.  It may not be the best answer, but I’ll share it with you here and if others have suggestions to offer you can add them to the comments.

Solution: add an extra “dummy” column before and after the span as end caps, and refer to those columns in your formulas.  Using our real estate example, I would add a column before “SqFt-Room1” called “RoomsStart” and one after “SqFt-Room3” called “RoomsEnd”, like so:

image

I would keep these columns empty and never put data in them.  This is important because otherwise my COUNT and SUM formulas may return the wrong results.  Then I’d highlight the columns a different shade (this is totally optional but I personally like the visual effect of marking off the start and end of the span) and resize them to something very small so they are out of the way for the most part, like so:

image

Alternatively, you can hide the columns if you so desire.  Then I’d update my formulas so that they referenced these columns instead, like so:

image

Now, when I want to add a fourth room, I select the “RoomsEnd” column and right-click \ Insert and I get a new column which I can name “SqFt-Room4”.  This new column will automatically be included in my COUNT and SUM calculations.  Similarly, if I ever get rid of “SqFt-Room4”, I don’t have to worry about breaking any of my formulas.  It may not be the most elegant solution, but it gives me a virtually worry-free way to reference a changing number of columns in my table.

Analyzing Data: Functions or PivotTables

Today’s author, Monica Poinescu, a Software Developer in Test on the Excel team, discusses two different approaches to analyzing data in Excel.

Edit: I've attached a file at the bottom of this blog that contains spreadsheets of the examples discussed in this post.

My earlier blog on the new Excel 2007 function SUMIFS spawned a very interesting discussion (thanks to everyone who posted comments there): when trying to analyze/aggregate data in a table, how do we decide whether to use functions versus PivotTables?

This blog outlines reasons to use one option or another. To better illustrate the two alternatives I’ll consider a real estate inspired example: let’s say I have list of homes for sale and their corresponding characteristics:

clip_image002

and I’m trying to find the average price for those homes which have at least 3 bedrooms, a garage and are between 5 and 10 years old. Just by looking at the table, we see that only house3 and house4 satisfy all conditions and the average of their prices is 312,500.

Here is a functions based solution:

=AVERAGEIFS(G2:G6,C2:C6,">2",E2:E6,"yes",F2:F6,">1999",F2:F6,"<2004")

which returns $ 312,500.

To build a corresponding PivotTable, one can use several filters:

clip_image004

Advantages of PivotTables:

  1. While AVERAGEIFS is limited to using at most 127 conditions, PivotTables can handle more than 127 conditions.
  2. The elements of the set that fulfill all conditions are listed in the resulting PivotTable.
  3. PivotTables have a lot of flexibility: the ease of use offered by the new UI allows for a very quick detailed analysis of different available options. Nested layers in a PivotTable offer added results visualization.
  4. Several different approaches are possible: one can construct different PivotTables that answer the same question above.
  5. Particularly useful for large data sources: when relying on an external data source, you don’t need to bring all the data in Excel and one could, for example, use OLAP databases.

Advantages of using functions:

  1. It’s easier to see in one glance all the conditions being used either by looking in the formula bar, or by listing all criteria in separate ranges.
  2. The result updates immediately when adding rows to the source table, while PivotTables need to be refreshed.
  3. The criteria, when referenced in a cell, could be as well the result of another formula, while value filters in PivotTables can only use constants.
    For example, in the formula above, one could replace
    =AVERAGEIFS(G2:G6,C2:C6,">2",…
    with
    =AVERAGEIFS(G2:G6,C2:C6,I4,…
    where I4 contains another calculation ( e.g. =">"&FIXED(SUM(1,1),0)).
    The image below shows a corresponding PivotTable filter:
    clip_image006
    The last field will not accept formulas, only numbers.
  4. Formulas take little space and are easy to move around in a sheet.

Note that in both cases you can use the wildcard characters to define criteria. Also both solutions deal in a similar manner with missing data or errors in the range.

I don't know if I can be fully objective on this question because I'm more of a formula person myself. If I've missed a reason you should pick one approach over another, feel free to let me know the reason you use formulas or PivotTables for summarizing data by leaving a comment.

Excel VBA Performance Coding Best Practices

Today’s author, Chad Rothschiller, a Program Manager on the Excel team, is back with a follow up from his previous post on VBA and Excel performance.

I want to start off this post by thanking everyone who sent in their examples in response to my January request. It is incredibly helpful to be able to look at what you all are doing with Excel! Not only did I see a huge variety in how Excel is being used, you also pointed out various tips and tricks for writing fast VBA code in Excel.

In this post I'm going to share with you the most important performance tips I know about. There are tons of sites, pages, and people who are experts as well on this subject, have performed their own tests, and shared their results and ideas. If you think I missed an important concept for how to optimize Excel VBA performance, or if you've got a valuable comment or link to share, please feel free to post here so everyone can benefit. Thanks!

Turn Off Everything But the Essentials While Your Code is Running

This optimization explicitly turns off Excel functionality you don't need to happen (over and over and over) while your code runs. Note that in the code sample below we grab the current state of these properties, turn them off, and then restore them at the end of code execution.

One reason this helps is that if you're updating (via VBA) several different ranges with new values, or copy / pasting from several ranges to create a consolidated table of data, you likely do not want to have Excel taking time and resources to recalculate formulas, display paste progress, or even redraw the grid, especially after every single operation (even more so if your code uses loops). Just one recalculation and one redraw at the end of your code execution is enough to get the workbook current with all your changes.

Here's some sample code that shows how and what to shut off while your code runs. Doing this should help improve the performance of your code:

'Get current state of various Excel settings; put this at the beginning of your code

screenUpdateState = Application.ScreenUpdating

statusBarState = Application.DisplayStatusBar

calcState = Application.Calculation

eventsState = Application.EnableEvents

displayPageBreakState = ActiveSheet.DisplayPageBreaks 'note this is a sheet-level setting

'turn off some Excel functionality so your code runs faster

Application.ScreenUpdating = False

Application.DisplayStatusBar = False

Application.Calculation = xlCalculationManual

Application.EnableEvents = False

ActiveSheet.DisplayPageBreaks = False 'note this is a sheet-level setting

'>>your code goes here<<

'after your code runs, restore state; put this at the end of your code

Application.ScreenUpdating = screenUpdateState

Application.DisplayStatusBar = statusBarState

Application.Calculation = calcState

Application.EnableEvents = eventsState

ActiveSheet.DisplayPageBreaks = displayPageBreaksState 'note this is a sheet-level setting

Here's a quick description for each of these settings:

Application.ScreenUpdating: This setting tells Excel to not redraw the screen while False. The benefit here is that you probably don't need Excel using up resources trying to draw the screen since it's changing faster than the user can perceive. Since it requires lots of resources to draw the screen so frequently, just turn off drawing the screen until the end of your code execution. Be sure to turn it back on right before your code ends.

Application.DisplayStatusBar: This setting tells Excel to stop showing status while False. For example, if you use VBA to copy/paste a range, while the paste is completing Excel will show the progress of that operation on the status bar. Turning off screen updating is separate from turning off the status bar display so that you can disable screen updating but still provide feedback to the user, if desired. Again, turn it back on right before your code ends execution.

Application.Calculation: This setting allows you to programmatically set Excel's calculation mode. "Manual" (xlCalculationManual) mode means Excel waits for the user (or your code) to explicitly initiate calculation. "Automatic" is the default and means that Excel decides when to recalculate the workbook (e.g. when you enter a new formula on the sheet). Since recalculating your workbook can be time and resource intensive, you might not want Excel triggering a recalc every time you change a cell value. Turn off calculation while your code executes, then set the mode back. Note: setting the mode back to “Automatic” (xlCalculationAutomatic) will trigger a recalc.

Application.EnableEvents: This setting tells Excel to not fire events while False. While looking into Excel VBA performance issues I learned that some desktop search tools implement event listeners (probably to better track document contents as it changes). You might not want Excel firing an event for every cell you're changing via code, and turning off events will speed up your VBA code performance if there is a COM Add-In listening in on Excel events. (Thanks to Doug Jenkins for pointing this out in my earlier post).

ActiveSheet.DisplayPageBreaks: A good description of this setting already exists: http://support.microsoft.com/kb/199505 (Thanks to David McRitchie for pointing this out).

Read/Write Large Blocks of Cells in a Single Operation

This optimization explicitly reduces the number of times data is transferred between Excel and your code. Instead of looping through cells one at a time and getting or setting a value, do the same operation over the whole range in one line, using an array variable to store values as needed.

For each of the code examples below, I had put random values (not formulas) into cells A1:C10000.

Here's a slow, looping method:

Dim DataRange as Range
Dim Irow as Long
Dim Icol as Integer
Dim MyVar as Double
Set DataRange=Range("A1:C10000")

For Irow=1 to 10000
  For icol=1 to 3
    MyVar=DataRange(Irow,Icol)  'Read values from the Excel grid 30K times
    If MyVar > 0 then 
      MyVar=MyVar*Myvar ' Change the value 
      DataRange(Irow,Icol)=MyVar  'Write values back into the Excel grid 30K times
    End If 
  Next Icol
Next Irow

Here's the fast version of that code:

Dim DataRange As Variant
Dim Irow As Long
Dim Icol As Integer
Dim MyVar As Double
DataRange = Range("A1:C10000").Value ' read all the values at once from the Excel grid, put into an array

For Irow = 1 To 10000
  For Icol = 1 To 3
  MyVar = DataRange(Irow, Icol)
  If MyVar > 0 Then
    MyVar=MyVar*Myvar ' Change the values in the array
    DataRange(Irow, Icol) = MyVar
  End If
Next Icol
Next Irow
Range("A1:C10000").Value = DataRange ' writes all the results back to the range at once

Note: I first learned of this concept by reading a web page by John Walkenbach found here: http://www.dailydoseofexcel.com/archives/2006/12/04/writing-to-a-range-using-vba/

A previous Excel blog entry by Dany Hoter also compares these two methods, along with a selection / offset method as well: http://blogs.msdn.com/excel/archive/2008/10/03/what-is-the-fastest-way-to-scan-a-large-range-in-excel.aspx

...which leads me to my next point.

Avoid Selecting / Activating Objects

Notice that in the above-referenced blog post, the selection method of updating a range was the slowest. This next optimization minimizes how frequently Excel has to respond to the selection changing in the workbook by minimizing the selection changing as much as possible.

Range Example: Again, see the Excel blog post quoted above. It demonstrates that using selection is the slowest of the 3 methods discussed for reading and writing to ranges.

Shapes Example: Setup: I have 40 shapes on a sheet, and I want to write "Hello" in each of them.

Using the slower "selection" method, the code looks like this:

For i = 0 To ActiveSheet.Shapes.Count

   ActiveSheet.Shapes(i).Select

   Selection.Text = "Hello"

Next i

The much faster method is to avoid selection completely and directly reference the shape:

For i = 0 To ActiveSheet.Shapes.Count

   ActiveSheet.Shapes(i).TextEffect.Text = "Hello"

Next i

The concepts illustrated by the examples above can also be applied to objects other than Ranges and Shapes.

Note: I first learned of this concept, in the context of shapes, by reading a web page by Ron de Bruin found here: http://www.rondebruin.nl/shape.htm

Related Performance Paper

See the "Improving Performance in Excel 2007" paper on MSDN: http://msdn.microsoft.com/en-us/library/aa730921.aspx

This is a fairly detailed and comprehensive paper that introduces the bigger grid and increased limits in Excel 2007, and primarily focuses on Excel calculation performance and debugging calculation performance bottlenecks. There's also a short section on how to write faster VBA macros.

Other Performance Optimizations

While the above optimizations are what I consider the most important, there are a few other "honorable mention" optimizations I will mention briefly for you to consider.

Consider the performance gains by implementing your code's functionality via XLL / C-API. An overview and supporting materials for the SDK can be found here: http://msdn.microsoft.com/en-us/library/bb687827.aspx .

Declare variables with explicit types to avoid the overhead of determining the data type (repetitively if used in a loop) during code execution.

For simple functions used by your code in high frequency, implement them yourself in VBA instead of using the WorksheetFunction object.

Use Range.SpecialCells() to scope down the number of cells your code needs to work with.

Creating Dynamic Validation Ranges

Today’s author, Dany Hoter, a Product Planner for the Excel team, shares a neat trick he learned recently for creating dynamic data validation ranges.  The sample file used for this blog post can be found in the attachments at the bottom of this post.

The problem: Validating data entry based on hierarchical (parent child) data. The example used is regions and countries but it could be countries and cities, product categories and sub-categories, class and student name, etc.

You want to enter a region from a list of regions and in the next cell you want to select a country but only from the countries which belong to that region.

How do you define the list of countries to validate against? The trick is basing the country validation list on an expression which will point to a different range based on the region value.

Follow the instructions in the file itself (attached below) and see how it works.

Points to notice:

  1. In Step 2 named ranges are created automatically based on a selected range.
  2. The named range used for country validation uses relative reference to point to the list of values. In this way the same mechanism will work anywhere in the spreadsheet. The validation of the cell with country is based on the region value to the left of country. The actual expression is =INDIRECT(B13). For example the named range for countries in Europe is Europe. The indirect function uses the value of the region to point the relevant named range. B13 happens to be the cell to the left of the current cell when defining the name.
  3. The validation rules are propagated automatically to each new row in the table.
  4. The lists of regions and countries do not have to be in the same worksheet as the input cells.

How to Create a Professional Chart using Excel 2007

Today's author, Robin Wakefield, a Program Manager on the Excel team, discusses charts.

As a new program manager in Excel one of my first tasks was to understand what problems users encounter when building professional charts in Excel. I defined professional as following the principles of Edward Tufte by reducing the amount of non-data ink on a chart. On an earlier post, there were a lot of comments about this particular aspect and I agree we could do better in helping users accomplish this goal. I also looked at blogs and publications which were known to produce professional looking charts to determine what the salient aspects were. As I researched this further I found there were a lot of remarks on the web that pertained to Excel 2003 charting so I thought it would be worthwhile to share what I found; specifically what changes you can make to a default chart in Excel 2007 to make it more professional looking. Your feedback on this would be greatly appreciated as we are considering how we can build this into the product going forward.

In order to illustrate this let’s try to recreate a chart I found in the Wall Street Journal.

image

I manually created the data so it may be a little off but when I insert a column chart Excel creates the following default chart for me.

image

In order to make this chart look more professional like the Wall Street Journal you can make the following modifications.

Step 1: Update Formatting Settings

With the chart selected click the Formatting tab on the Ribbon. In the left corner you can choose the chart object you would like to format and click "Format Selection" to get the formatting dialog for the object.

clip_image004

Chart Area

  • Set Border Color = No Line.

Vertical Value Axis

  • In order to keep your scale the same regardless of changes in the size of chart set the minimum, maximum and major unit options to fixed amounts.
    • Minimum = –.1
    • Maximum = .08
    • Major Unit = .02
  • Set Line Color = No Line

Vertical (Value) Axis Major Gridlines

  • Set Line Style = Dash type to deemphasize the gridlines as they are helpful in understanding the data but not the emphasis of your chart.

Horizontal Category Axis

  • Set Axis Labels = Low
  • Set Major Tick Mark = None
  • Label distance from axis = 0
  • Uncheck Multi-Level Category Labels

Series 1

  • Set Gap Width ≈ 34%
  • Fill = Solid Color and choose a red color to match the WSJ chart

Step 2: Update Font Size

Select the chart and on your home tab select a font size which is large enough that you can still read the words but they are not the emphasis of your chart. In this example I set the font size to 8.

Step 3: Remove Legend

Given the example only has one data series the legend is unnecessary in this case.

Step 4: Add Title

Add a title to the chart to tell your reader what the chart represents. In order to mimic the Wall Street Journal chart you need to move your title to the upper left hand corner and set the alignment to left. You also need to format the text so the title is in a larger font and bold whereas the subtitle is in a smaller font.

Step 5: Add captions

In order to add the citation you need to manually add a text box with this information to your chart.

Step 6: Add a Vertical Axis Title

In order to display the units of the vertical axis without showing a % next to each number on the axis the best way to do this is to add a vertical axis title and enter %. You can then move this title to line up with first number on the vertical axis.

image

As you can see I was able to get pretty close to the original chart by changing formatting settings and adding some additional text. The thing I was not able to accomplish without a hacky workaround was to fill only the bottom portion of the plot area with a gray color.

Avoid Unessential Formatting

Throughout this article I have focused on what changes you can make to the default chart but I would like to take a moment to discuss the principle of avoiding unessential formatting. As part of Office 2007 the concept of Themes was introduced. This was an Office wide effort and is shared across Word, PowerPoint, and Excel. For more information about themes and how they work in Excel see this post.

Within each Theme there are multiple Chart Styles available which is described here. The way this works is the chart styles are predefined to map to accent colors, fills, and effects etc which are defined in the Theme. When a new theme is added the chart styles are automatically generated based on the Theme definition. Although this was a great way to produce many choices for users it had an unfortunate side effect in that some of the effects/fills are not optimized for charting as they are also used on shapes, smart art etc.

See the styles matrix below for an example of the styles that are created based on the Office Theme definition.

image

In order to reduce the amount of non-data ink I would recommend sticking to the first row of styles as they do not have any effects applied to them. As you move down the rows, effects get more intense and in some themes the effects make the chart unusable such as the Metro theme which adds a glow around the columns.

Within one row we provide a variety of color options as well. The first option is always a grayscale and this will be your best choice to reduce the amount of non-data ink. The second option is all 6 accent colors in the Theme and options 3-8 produce a monochrome chart based on one of the accent colors.

If you would like to add color to your chart the following themes are good options for color variety where the colors are differentiated but no one color stands out from the others.

  • Office
  • Civic
  • Median
  • Paper
  • Technic
  • Trek
  • Urban
  • Origin

Another great resource to help choose the right chart type for your data and programmatically determine when some of the above changes should be applied is the Chart Advisor. Check out the post here.

Posted by Joseph Chirilov | 35 Comments
Filed under: ,

Simple Expense Tracking With New Excel 2007 Functions

Today’s author, Monica Poinescu, a Software Developer in Test on the Excel team, gives an overview of some new functions in Excel 2007 and along the way shows us how to do some simple expense tracking.

Considering the current economic environment, one may desire to better keep track of personal expenses. The new functions introduced in Office Excel 2007 simplify this effort. Here is how: given a list of expenses sorted by dates and categories (food, travel, clothing, etc.), how do I find how much was spent for each category, per month? And what’s the average amount spent? The functions SUMIFS, AVERAGEIFS and COUNTIFS perform selective calculations: they take into account only those values which fulfill several criteria. The values could be in a range spanning several rows and columns.

Let’s assume I entered the expenses in a table:

image

The solution below uses the new Excel 2007 feature structured references; see 'Tables Part 3: Using Formulas with Tables' for more details on how to simplify your formulas based on tables.

I’m also using the function SUMIFS which extends the functionality of SUMIF: given a range to be summed, it will add only those values that fulfill all the given criteria (one could use between 1 and 127 criteria applied to corresponding ranges).

The syntax is SUMIFS( the_range_to_sum, range_to_apply_criteria1, criteria1, [range_to_apply_criteria2, criteria2], … ) where the first parameter tells Excel what range we want to sum and the following pairs of arguments tell Excel on which range to apply the respective condition.

In visual terms, I think of each pair (range_to_apply_criteria1, criteria1) as a punch card
with some (0 or more) perforations, in the places where the criteria is TRUE. Stack all these cards on top of each other, the_range_to_sum being on the bottom, and SUMIFS adds only the values that remain visible. 

Then start building your reports table by entering in F1:I3 the categories and the months:

image

and in G2 enter:

=SUMIFS( Table1[Amount], Table1[Category], G$1, Table1[Date], $F2 )

which will evaluate the ‘home’ expenses for January (235.35).

It should look like this:

image

Similarly, in H2 enter:

=SUMIFS( Table1[Amount], Table1[Category], H$1, Table1[Date], $F2 )

this will evaluate the ‘food’ expenses for January .

And in I2 enter:

=SUMIFS( Table1[Amount], Table1[Category], I$1, Table1[Date], $F2 )

to get your travel related amount.

You can copy/paste these formulas down as you add more months. This will automatically take into account new entries in your expense list. Also, in case you need to figure the average expense per category, within each month, you can use:

=AVERAGEIFS( Table1[Amount], Table1[Category], "home", Table1[Date], "January" ) - this returns 117.675

Quick and Easy Data Access with Excel Services

Today’s author, John Campbell, a Program Manager on the Excel Services team, shows us the quick and easy way to get external data access working with your Excel Services spreadsheets.

I have seen a lot of great blog posts that tell folks how to do great things with refreshed data in an Excel spreadsheet using Excel Services. I’ve seen posts about slicing dashboard pages, about using .odc files to manage data, and more. One thing that continues to be an issue is how to get connected to that data in the first place, in way that will work on Excel Services. The key is all about ‘telling the server’ what user identity to use when connecting to your data source. I know a number of people have gotten over this hump, but there are a lot of others who struggle here. In this posting I will shed some light on why the server works the way it does here, and show you the easiest way to get this up and running. I’m abbreviating in this post – so I’m not going to cover every last detail, option, and scenario here. If you want a lot more detail, and want to go deeper on why, and other options I don’t talk about here, then see the whitepaper I wrote on the topic: http://technet.microsoft.com/en-us/library/cc262899.aspx.

First, some quick terminology and background:

Credentials: The log in name/password of a user. These are used to establish your identity to the larger network/system.

Authentication: ‘Who you are’ – this is how the system verifies your identity.

Authorization: ‘What you can do’ – this is the process of the system determining what rights or permissions you have.

Connection String: This is a string, typically stored in a workbook or .odc file, which has the basic information needed to connect to a data source. It contains things like what server to use, how to authenticate, along with any special parameters that are needed to form the raw connection to the database.

Your goal is to get the user, who is viewing your spreadsheet in their browser, authenticated and authorized against the data source. (After all, your cool report showing sales figures sliced by time won’t be worth much if you can’t refresh the data to see the current numbers.) To be more specific, I am talking about getting an identity to use for authentication/authorization from the Excel Calculation Server (ECS) to the data source (circled in the picture):

clip_image002

Connection strings will typically contain information about how to authenticate to the data source, but, for various reasons, Excel Services can’t parse them to determine how to perform the authentication. So the user must explicitly set how the authentication is to be performed. There are three options for this: Windows (delegate your domain credentials to each box on the way to, and including, the data source), SSO (lookup a username/password combo to use on your behalf out of an Single Sign On database in SharePoint), or None (no specific options are set – just take the connection string and give it a try). By default, the Windows authentication is attempted because A) it is the most secure, and B) it has the best chance of working, assuming Kerberos is configured in a typical environment, without somebody (the administrator of the server or database) needing to do something special to make it work.

Ok, enough with the background info. So, you say you can’t setup Kerberos and don’t want to spend a bunch of time getting into details of authentication - so what’s the easiest thing to do? The rest of this blog is about how to configure the None option, and related server switches, to get external data refresh working on Excel Services.

Step 1: Configure the unattended account in the Excel Services administration page.

The unattended account is really just a set of credentials that Excel Services impersonates before certain types of external data connections are made. Think of it as a dummy user account that Excel Services keeps around for connecting to data. By default, many connection strings (like connections to SQL or Analysis Services), are configured to use whatever identity is available in the process at the time the connection is made. What all that boils down to is, if the connection string specifies that integrated security is used, then the identity of the unattended account will be used at the data source for authentication/authorization.

So here is how to do it:

On the server box, go to the Excel Services admin page: From the taskbar, click Start, point to All Programs, point to Microsoft Office Server, and then click SharePoint 3.0 Central Administration. Select the name of the SSP from the left-hand navigation bar. For example, SharedServices1 as shown.

clip_image003

Then, click Edit Excel Service Settings. Then scroll down and find the Unattended Account settings.

clip_image004

Enter in the name and password of an account that has access to the data sources you are planning to access, and click OK.

A few things to pay special attention to:

  • If any other users want to use this account for their data refresh, they will need to make sure it has access to their data sources.
  • Security is important! You should never use an account that has access to the SQL databases that are running your SharePoint farm. If the unattended account does have access to SharePoint, then it becomes possible for users to load a workbook on the server that connects directly to the SharePoint databases, bypassing most SharePoint security.

Step 2: Configure the data connection in workbook

From the Excel Data ribbon, in the Connections section, click Connections.

clip_image005

In the Workbook Connections dialog box, which shows all the external data connections that are currently being used by the workbook, select the connection that needs to be changed, and then click Properties.

clip_image006

In the Connection Properties dialog box, which allows many properties of the connection to be changed, click the Definition tab.

clip_image007

Click the Authentication Settings button.

From the Excel Services Authentication dialog box, select the None option.

clip_image009

Now just click OK on the various dialogs until you are back to your spreadsheet.

That’s all you need to do. Also note that with this None setting, if you use something like SQL authentication where a username and password are saved in a connection string, then SQL (or other data providers) should use those credentials to connect to the data. I say should, because Excel and Excel Services doesn’t control this – whether that username/password gets used is completely up to the specific implementation of the data provider.

If you want more information about why authentication works this way, more details about the other options, more information about managing data connections, or prescriptive guidance and Q&A’s in this area - then see the whitepaper, where I go much deeper, at http://technet.microsoft.com/en-us/library/cc262899.aspx .

New Case Study Involving Excel Services

A new case study has been published that describes how Microsoft IT used Excel Services to build a portal for finance data.  Here's a snippet from TechNet:

Controller Workspace: Connecting With Excel Services in Microsoft Office SharePoint Server 2007
Describes how Microsoft IT created a centralized user interface platform, called Controller Workspace, that enables users to view reports, key performance indicators (KPIs), and other finance data. Out-of-box features of Microsoft Office 2007 Excel Services in Office SharePoint Server 2007 provide the foundation for the centralized portal.

The case study can be found here.  There is a video demonstration at the link as well.

Posted by Joseph Chirilov | 1 Comments
Filed under:

Hidden and Invisible Objects

Today’s author, Ben Rampson, a Program Manager on the Excel team, talks about cleaning up spreadsheets.

I often receive files demonstrating issues customers are experiencing with Excel.  Recently I have noticed a common problem in some of these files that impacts the file’s performance and size: hidden and invisible objects.

When gathering data in Excel it is common to start with information found in a variety of sources, frequently lists or tables located on the Internet.  Customers often copy and paste this data into Excel from the original source, but unfortunately this can also unintentionally paste many additional objects into the spreadsheet besides the data.  These objects (shapes, text boxes, controls, etc.) are often not noticeable to the user after the paste, but can result in slower performance and larger file sizes.  One step customers can take to avoid unwanted, hidden, and invisible objects in their document is to find and remove the objects after the paste operation as part of their data cleansing process.

Below is an image of part of a data set I copied and pasted into Excel 2007 from a website. The data set had 35 rows of data, but only the first few are shown.

After pasting my data (and extra objects) into Excel. 

After pasting my data (and extra objects) into Excel.

Looking at this image it is evident that some additional shapes were copied into Excel; an icon is shown in cell A1 and there are checkboxes in many of the rows in column B.  I could select these visible objects and delete them, however, I would end up missing some of the objects that have been pasted into my sheet.

The following steps will help you find and delete all of the additional objects in your Excel 2007 worksheet. 

The first thing you want to do is verify that you have additional objects on your spreadsheet.  The easiest way to view a sheet’s objects is to turn on the on the Selection Pane (on the Home Tab go to the Editing Chunk > click the Find and Select Dropdown > select the Selection Pane option making the pane visible).  Looking at the selection pane you will see a list of all objects on the current sheet and an indication if the objects are visible or hidden.  If the selection pane is blank then you do not have extra objects on the sheet.

The top of the Selection and Visibility Pane for my sheet.  Notice that 100 objects were pasted into my document. I have seen customer files with thousands of hidden objects.

The top of the Selection and Visibility Pane for my sheet.  Notice that 100 objects were pasted into my document. I have seen customer files with thousands of hidden objects.

Once you have the selection pane open and verified that you have objects on the sheet, the next step is to put your spreadsheet in Design Mode if it is supported on your current workbook.  Design Mode will only be enabled in Excel if your sheet contains certain types of controls.  Turning on Design Mode will allow you to select all controls, not just the basic shapes and form controls.  To enter Design Mode select the Design Mode button in the Controls Chunk on the Developer Tab. If this button is disabled in the Ribbon then your sheet does not contain the types of controls that require this step.  (Note: If you do not have the Developer Tab visible in the Ribbon you can enable it with the following steps: Office Button > Excel Options > check the “Show Developer tab in the Ribbon” checkbox on the default Popular tab). 

You now are ready to select objects using the Go To Special dialog.  To select all objects in the sheet: Ctrl+G to open the Go To dialog > select the Special button > select Object > click OK.  The objects will then be selected and their selection handles are visible on the sheet. 

My sheet with all objects selected.  You can see that in addition to the previously visible objects I now also have selected some textboxes previously hidden on my sheet.

My sheet with all objects selected.  You can see that in addition to the previously visible objects I now also have selected some textboxes previously hidden on my sheet.

At this point you can simply hit the Delete key and remove all of these objects from the sheet.  You can also choose to be more selective about the objects you delete; if you have other objects already on your sheet you wish to keep, ctrl+clicking the objects in either the selection pane or on the sheet will remove them from the current selection prior to hitting delete.

With just a little data cleansing work to you can avoid having additional unwanted objects in your document, resulting in faster performance and smaller file size.   

More Posts Next page »
 
Page view tracker