Welcome to MSDN Blogs Sign in | Join | Help

Use the VBA SaveAs Method in Excel 2007

Today’s author is Ron de Bruin , an Excel MVP. You can find more useful tips and links to Excel add-ins at his website: http://www.rondebruin.nl/

You see a lot of old SaveAs code that does not specify the FileFormat
parameter. In Excel versions before Excel 2007, code without this parameter
will not cause too many problems because Excel will use the current FileFormat
of the existing file -- and the default FileFormat for new files is a normal workbook. But because there are so many new file formats in Excel 2007, you shouldn't
use code that doesn’t specify the FileFormat parameter.

In Excel 2007, the SaveAs method requires you to provide both the FileFormat parameter and the correct file extension.

For example, in Excel 2007 this line of code will fail if the ActiveWorkbook is not an .xlsm file:

ActiveWorkbook.SaveAs "C:\ron.xlsm"

But this code will always work:

ActiveWorkbook.SaveAs "C:\ron.xlsm", fileformat:=52 
' 52 = xlOpenXMLWorkbookMacroEnabled = xlsm (workbook with macro's in 2007)

These are the main file formats in Excel 2007:

51 = xlOpenXMLWorkbook (without macro's in 2007, .xlsx)
52 = xlOpenXMLWorkbookMacroEnabled (with or without macro's in 2007, .xlsm)
50 = xlExcel12 (Excel Binary Workbook in 2007 with or without macro's, .xlsb)
56 = xlExcel8 (97-2003 format in Excel 2007, .xls)

Note: I always use the FileFormat numbers instead of the defined constants
in my code so that it will compile OK when I copy the code into an Excel
97-2003 workbook. (For example, Excel 97-2003 won't know what the
xlOpenXMLWorkbookMacroEnabled constant is.)

Example

At the end of this section is a basic VBA code example for a macro named Copy_ActiveSheet_New_Workbook() that copies the ActiveSheet to a new Workbook and then saves it in a format that matches the file extension of the parent workbook.
Note: You can use this macro in Excel 97-2007.

If you run the code in Excel 2007, it will look at the FileFormat of the parent workbook and save the new file in that format. However, if the parent workbook is an .xlsm file and there is no VBA code in the new workbook, the code will save the new file as an .xlsx file.
If the parent workbook is not an .xlsx, .xlsm or .xls file, then it will be saved as an .xlsb file.

If you always want to save in a certain format you can replace this part of the macro:

Select Case Sourcewb.FileFormat
Case 51: FileExtStr = ".xlsx": FileFormatNum = 51
Case 52:
    If .HasVBProject Then
        FileExtStr = ".xlsm": FileFormatNum = 52
    Else
        FileExtStr = ".xlsx": FileFormatNum = 51
    End If
Case 56: FileExtStr = ".xls": FileFormatNum = 56
Case Else: FileExtStr = ".xlsb": FileFormatNum = 50
End Select
With the single line of code from this list for the format you want to use:
FileExtStr = ".xlsb": FileFormatNum = 50 
FileExtStr = ".xlsx": FileFormatNum = 51
FileExtStr = ".xlsm": FileFormatNum = 52
FileExtStr = ".xls": FileFormatNum = 56

Or maybe you want to save the one sheet workbook to .csv, .txt, or .prn. (you can use this also if you run the macro in Excel 97-2003)

FileExtStr = ".csv": FileFormatNum = 6
FileExtStr = ".txt": FileFormatNum = -4158
FileExtStr = ".prn": FileFormatNum = 36

Here’s the full code example.

Sub Copy_ActiveSheet_New_Workbook()
'Working in Excel 97-2007
    Dim FileExtStr As String
    Dim FileFormatNum As Long
    Dim Sourcewb As Workbook
    Dim Destwb As Workbook
    Dim TempFilePath As String
    Dim TempFileName As String

    With Application
        .ScreenUpdating = False
        .EnableEvents = False
    End With

    Set Sourcewb = ActiveWorkbook

    'Copy the sheet to a new workbook
    ActiveSheet.Copy
    Set Destwb = ActiveWorkbook

    'Determine the Excel version and file extension/format
    With Destwb
        If Val(Application.Version) < 12 Then
            'You use Excel 97-2003
            FileExtStr = ".xls": FileFormatNum = -4143
        Else
            'You use Excel 2007
            'We exit the sub when your answer is NO in the security dialog that you
            'only see when you copy a sheet from a xlsm file with macro's disabled.
            If Sourcewb.Name = .Name Then
                With Application
                    .ScreenUpdating = True
                    .EnableEvents = True
                End With
                MsgBox "Your answer is NO in the security dialog"
                Exit Sub
            Else
                Select Case Sourcewb.FileFormat
                Case 51: FileExtStr = ".xlsx": FileFormatNum = 51
                Case 52:
                    If .HasVBProject Then
                        FileExtStr = ".xlsm": FileFormatNum = 52
                    Else
                        FileExtStr = ".xlsx": FileFormatNum = 51
                    End If
                Case 56: FileExtStr = ".xls": FileFormatNum = 56
                Case Else: FileExtStr = ".xlsb": FileFormatNum = 50
                End Select
            End If
        End If
    End With

    '    'If you want to change all cells in the worksheet to values, uncomment these lines.
    '    With Destwb.Sheets(1).UsedRange
    '        .Cells.Copy
    '        .Cells.PasteSpecial xlPasteValues
    '        .Cells(1).Select
    '    End With
    '    Application.CutCopyMode = False

    'Save the new workbook and close it
    TempFilePath = Application.DefaultFilePath & "\"
    TempFileName = "Part of " & Sourcewb.Name & " " & Format(Now, "yyyy-mm-dd hh-mm-ss")

    With Destwb
        .SaveAs TempFilePath & TempFileName & FileExtStr, FileFormat:=FileFormatNum
        .Close SaveChanges:=False
    End With

    MsgBox "You can find the new file in " & TempFilePath

    With Application
        .ScreenUpdating = True
        .EnableEvents = True
    End With
End Sub

New Change to the Excel blog

I’d like to take this opportunity to announce a new feature of the blog, which some of you might have noticed already. Our Content Publishing team has begun collecting “Power Tips” from members of the Excel community, including the product team, Excel MVPs, user group members, and other Excel enthusiasts. Power Tips are intermediate-level “How-to” tips that you can use to enhance your spreadsheets. We hope they’ll be valuable to those of you who feel pretty proficient with Excel, but who don’t consider yourselves seasoned spreadsheet "developers" (yet).

We have already tagged a number of past posts with the Power Tips tag, so you can start following this feature today by clicking the tag on the right side of this page. You can even set up an RSS feed for just that tag, if you want.

Enjoy!

Week Numbers in Excel

Today’s author is, Ron de Bruin, an Excel MVP. You can find more useful tips from Ron and links to Excel add-ins at his website: http://www.rondebruin.nl/

There are four primary week numbering systems in use worldwide. Each system has subtle differences that you should be aware of. Excel can work with any of these systems:

1)  ISO Week number: The International Organization for Standardization (ISO) ISO8601:2000 Standard.
All weeks begin on a Monday. Week one starts on Monday of the first week of the calendar year with a Thursday.
2)  Excel WEEKNUM function with an optional second argument of 1 (default).
Week one begins on January 1st; week two begins on the following Sunday.
3)  Excel WEEKNUM function with an optional second argument of 2.
Week one begins on January 1st; week two begins on the following Monday.
4)  Simple week numbering.
Week one begins on January 1st, week two begins on January 8th, and week 53 has only one or two days (for leap years).

Note: Excel does not have a standard worksheet function for the ISO week number and simple week numbering system.

Worksheet Formulas for Week Numbers

The following sections assume that you have a date in cell B4 for testing the week number formulas.

ISO Week Numbers

There is no built-in worksheet function for ISO week numbers in Excel. Copy the following formula and paste it in a worksheet cell to return an ISO week number:

=INT((B4-DATE(YEAR(B4-WEEKDAY(B4-1)+4),1,3)+WEEKDAY(DATE(YEAR(B4-WEEKDAY(B4-1)+4),1,3))+5)/7)

Alternatively, you can open the Visual Basic editor, click Module on the Insert menu, and then copy this user-defined function (UDF) into the module. After adding this UDF to your workbook, you can use it like a built-in function =IsoWeekNumber(B4)

Public Function IsoWeekNumber(d1 As Date) As Integer
' Attributed to Daniel Maher
    Dim d2 As Long
    d2 = DateSerial(Year(d1 - WeekDay(d1 - 1) + 4), 1, 3)
    IsoWeekNumber = Int((d1 - d2 + WeekDay(d2) + 5) / 7)
End Function

You can find more information about ISO dates and week numbers on this page: http://www.rondebruin.nl/isodate.htm

The Excel WEEKNUM Function

Reliance on the Analysis Toolpak in Excel versions before Excel 2007 can create problems because the add-in may not
be installed by users of your spreadsheets (a default Excel installation has it unchecked in setup). Also, there are difficulties for international users when you use Analysis Toolpak formulas because these formulas are not translated by Excel if you open the workbook in a different Excel language version.

Note: In Excel 2007 WEEKNUM is a standard built-in worksheet function, so you will not have the problems above if you share your workbook between different Excel 2007 language versions.

Important: If you do not have Office 2007 SP2 installed, read the information on this page: http://www.rondebruin.nl/atp.htm

You can use these two replacement functions from Daniel Maher to avoid problems.

Replace =WEEKNUM(B4,1) with:

=1+INT((B4-(DATE(YEAR(B4),1,2)-WEEKDAY(DATE(YEAR(B4),1,1))))/7)

Replace =WEEKNUM(B4,2) with:

=1+INT((B4-(DATE(YEAR(B4),1,2)-WEEKDAY(DATE(YEAR(B4),1,0))))/7)

Simple Week Numbers

There is no built-in worksheet function for simple week numbering in Excel. Copy the following function and paste it in a worksheet cell to return simple week numbers:

=INT((B4-DATE(YEAR(B4),1,1))/7)+1

The Week Calendar File

The week calendar file shows you all the dates and week numbers from a certain year on one printable page. If you want to have a week calendar from another year you only have to change one cell (the year). There is a separate sheet for each of the four week numbering systems listed at the start of this post.

Download the Calendar file
Week Numbers Calendar.zip (File date : 27-Feb-2005)

Acknowledgements

I would like to acknowledge general reference on all date issues to:
Chip Pearson: http://www.cpearson.com/excel/topic.aspx
Dave McRitchie: http://www.mvps.org/dmcritchie/excel/xlindex.htm
Daniel Maher has also published numerous simplifications of date formulas some of which were used or adapted in producing the calendar.
The late Frank Kabel Frank created the ISO week number worksheet function on this page.
The base formula used for the ISO year start in the calendar file were derived from a UDF written by John Green, Sydney.

More Information
ISO Date Representatation and Week Numbering:
http://www.rondebruin.nl/isodate.htm
You should refer to Chip Pearson's web site for an exposition on Week Number implementation:
http://www.cpearson.com/excel/weeknum.htm
Implementing Week-Numbering Systems and Date/Time Representations:
http://msdn.microsoft.com/en-us/library/bb277364.aspx

Formula to Access a List of Values Interspersed with Zeros or Blanks

Today’s author is Bob Umlas, an Excel MVP since 1994.

This tip is a formula which enables you to access a list of values interspersed with zeros or blanks and pick up only the non-zero values in the same sequence they’re listed. It’s better to illustrate. Suppose you have this list in A1:A14:

image01

…and you want to produce this list:

image02

The following formula entered in E1 and filled down to E6 does the trick. It’s an array formula which means that you must press Ctrl+Shift+Enter after entering the formula instead of just Enter:

=INDEX($A$1:$A$14,SMALL(IF($A$1:$A$14<>0,ROW($1:$14),""),ROW(A1)))

Let’s take the formula apart and see how it works. The inner IF-statement, IF($A$1:$A$14<>0,ROW($1:$14),""), checks for non-zeros, and if it is a non-zero, it returns the row number; otherwise it returns blanks. If you select that portion of the formula in E1 and press the F9 key, you’ll see this:

image03

…which means that rows 1,4,7,9,10 and 14 do not contain zeros or blanks. ROW(A1) returns the value 1 and is used instead of simply the number 1 because this formula is being filled down, and in the row below it will become ROW(A2), returning a 2, etc. So, the SMALL function now returns the smallest value from that list, or 1. In the row below, we have ROW(A2) or 2, and the 2nd smallest value is 4. So each formula returns the row number for the non-zero cells. That in turn is passed to the INDEX function, and we effectively have these formulas:

=INDEX(A1:A14,1)
=INDEX(A1:A14,4)
=INDEX(A1:A14,7)
and so on, which returns the values we need for the list of non-zero values.

As an additional tip, because it may not be clear how far down to fill, you can use conditional formatting to hide the potential errors if you were to drag down too far. That is, suppose you initially dragged the formula down to E8:

image04

You can hide these this way. Select all of column E, then click Home > Conditional Formatting > New Rule > Use a formula to determine which cells to format, and enter this formula:

=ISERROR(E1)

image05

Then click the Format button, the Font tab, and select a white font:

image06

Then, when you click OK all the way out, the #NUM! errors won’t show.

The reason this formula must be entered as an array-formula is because the IF(A1:A14<>0… portion of the formula requires it. The IF-statement normally takes one value to test for true/false, not an array of values as we did here. If you don’t use Ctrl+Shift+Enter to create an array formula, you will get #VALUE! and #NUM! errors.

Microsoft Office 2010 Technical Preview

Today, Microsoft is announcing a limited, invitation only Technical Preview program for Microsoft Office 2010 that will kick off in July. The program will provide you with the opportunity to experience early, pre-release versions of Office 2010 which will include the following applications:  Word 2010, Excel 2010, Outlook 2010, PowerPoint 2010, OneNote 2010, Access 2010, InfoPath 2010 and Publisher 2010. Check out www.office2010themovie.com for signing up to be considered for the Technical Preview Program.

VBA Coding Contest

I'm passing along some information I think will be interesting to many of the readers of this blog. 

Test your VBA coding skills and win!

MSDN® is sponsoring a coding contest for Office 2007, named OfficePalooza! This sweepstakes will run two weeks beginning April 20, 2009, and features ten fun Visual Basic of Applications™ (VBA) coding challenges in the form of puzzles and games. Each entrant will earn a chance to win one of hundreds of available prizes, determined by a random drawing at the end of the contest.

In conjunction with this contest, an Advanced Business User theme will run on Microsoft Office Online from mid-April to mid-May, and will showcase the automation and extensibility aspects of Office 2007 through macros, custom VBA coding, the Fluent UI, and Office Open XML.  This collaborative effort will also highlight existing and newly-created content on the MSDN Microsoft Office Developer Center.

 

Modifying Conditional Formatting Color Ranges in Excel 2007

Today’s authors, Amit Velingkar, a Program Manager on the Excel team, and Bob Silverstein, a User Experience Researcher on the Excel team, talk about creating professional looking Color Ranges in Excel.

Introduction:

My fellow PM on Excel, Robin Wakefield, wrote a blog article on how to create professional looking charts in Excel 2007. Continuing with the same theme, we will investigate the idea of tweaking Color Ranges in Excel 2007. We will start with a default Conditional Formatting Color Range and modify it to give it more pastel look. Also, this would be an ideal forum to give us your feedback on existing conditional formatting colors used in Excel 2007.

As a part of my preparation for this article, I did two things – researched customer files that use conditional formatting and explored various options with Microsoft designers.

Looking at some customer files, I quickly observed a trend. I found out that many users wanted to highlight a small portion of their data. They were achieving this by changing some of the colors in their color ranges to white. Also, some customers preferred a pastel shade of the same color.

I discussed this trend with Microsoft designers and they taught me some neat tricks about manipulating colors. First, they told me that choosing the correct shade of white was critical while mixing colors. And last, an intuitive way of changing shades of colors was to use the HSL (Hue, Saturation and Luminosity) model.

My blog article attempts to bring all these lessons together. We will change some of the colors to white and tweak the colors shades to pastel. Finally, we will introduce a VBA macro that lets you quickly try out new color shades.

Procedure:

1. I started out with applying three of our default color combinations to some data. I choose the following color ranges: Red-Yellow-Green, Red-Yellow-Blue and Yellow-Green.

clip_image002

2. I replaced the yellow color with the appropriate shades of “white”.

Changing the yellow colors to pure whites created a high contrast. For creating whites with a good spectrum of colors, we crank up the RGB values all the way to 255 creating a pure white – Remember RGB (255,255,255) is white. Then, remove just a little amount of the participating color to create “target” white.

As an example while mixing Red and Green, we set our white to (250,250,255) by removing some red and green. The end result is a color range containing pure shades of Red and Green.

Color 1

Color 2

White

Notes

Red (255,0,0)

Green (0,255,0)

(250,250,255)

Remove some red and green

Red (255,0,0)

Blue (0,0,255)

(250,255,250)

Remove some red and blue

Red(255,0,0)

-

(250,255,255)

Remove some red

Green (0,255,0)

-

(255,250,255)

Remove some green


Select the color range. On the Home tab, click on Conditional Formatting, then Manage Rules. In the Conditional Formatting Rules Manager, select the Conditional Format and click on Edit Rule. In the Edit Formatting Rule dialog change the yellow color to white.

clip_image002[5]

3. Using HSL Model to change non-white colors

We can use the HSL model to manipulate color. HSL defines the color spectrum based on Hue (base color from the rainbow spectrum), Luminosity (brightness) and Saturation (purity of the color).

Select the color range. On the Home tab, click on Conditional Formatting, then Manage Rules. In the Conditional Formatting Rules Manager, select the Conditional Format and click on Edit Rule. In the Edit Formatting Rule dialog, click on the color dropdown and choose More Colors. Click on the Custom tab and choose HSL in the Color model dropdown.

In this example, we will change colors using the HSL models as follows:

o Hue: Unchanged, this leaves the base color unchanged. Hence our Reds stay Red.

o Saturation: Decrease the Saturation to 150. This makes the colors more pastel.

o Luminosity: Increase the luminosity to 200. The goal is to increase the brightness to make the color more visible and showcase the text behind the color ranges.

clip_image002[7]

Result:

By using shades of white and by tweaking the luminosity and saturation of colors, we changed the color ranges as follows:

image

HSL Color Manipulation using VBA:

Since I wanted to try out a lot of combinations of Luminosity and Saturation values to tweak color ranges, I decided to write a VBA macro to do this. I quickly learnt that while Excel allows HSL manipulation through the UI, it does not support HSL manipulation through VBA. I looked around for ways to manipulate HSL and finally found a blog post by Tony Jollans that did something similar. Although most of the code used is directly from Tony Jollans’s blog post, I wrote some additional subroutines, added some comments and re-factored some of the subroutines for my purpose.

The entry function for my macro is called ChangeColorRangeColors().

Sub ChangeColorRangeColors()
   ...

   For Each clrScale In Selection.FormatConditions
      For Each criteria In clrScale.ColorScaleCriteria
         ' get clr
         clr = criteria.FormatColor.Color

         ' split RGB
         SplitRGB clr, R, G, B

        'skip whites - when all RGB componenst are above 240
         If R < 240 Or G < 240 Or B < 240 Then

            ' get HSL
            RGBtoHSL clr, H, S, L

            ' increase luminosity
            L = (iLuminosity / 255) ' from 0.0 to 1.0

            ' decrease saturation
            S = (iSaturation / 255) ' from 0.0 to 1.0

            ' get RGB
            HSLtoRGB H, S, L, clr

            ' set new clr
            criteria.FormatColor.Color = clr

         End If
      Next
   Next
End Sub

At a high level, my HSL manipulator code works in the following ways:

  1. Get the RGB encoded value from the selected color ranges.
  2. Split the RGB encoded values into the Red, Green and Blue components.
  3. Skip the white colors – we do not want to change these.
  4. Covert RGB into HSL.
  5. Modify the Saturation and Luminosity, leaving Hue unchanged.
  6. Convert the modified HSL values back to encoded RGB.
  7. Apply the new RGB value back to the color range.

I won’t go into the details of converting between RGB and HSL values. These conversion routines are well documented and can be freely found on the internet.

However, another tricky part was trying to split the encoded color value that is used by Excel into individual RGB components. VBA provides a function, aptly named RGB(), that returns an encoded long value when provided with the R,G and B components. I have used this function in my macro to set the Color Range colors. However, VBA does not provide a reverse function to split this encoded long value. I have used a custom function called SplitRGB to do this.

Sub SplitRGB(RGB As Long, R As Double, G As Double, B As Double)
  
Dim HexString As String

   ' convert the long to Hex - bb:gg:rr
  
HexString = Hex(RGB)

   ' in order to get r,g,b components out of the string,
   ' we have to make it is atleast 6 characters long - bb:gg:rr
  
HexString = Right(String$(5, "0") & HexString, 6)

   ' get each individual color and convert to an double (range: 0 to 255)
  
R = CDbl("&H" & Mid$(HexString, 5, 2))
  
G = CDbl("&H" & Mid$(HexString, 3, 2))
  
B = CDbl("&H" & Mid$(HexString, 1, 2))
End Sub

Since the R, G, B values span from 0 to 255 and can be accommodated in 1 byte. Excel uses a 4-Byte long to encode the RGB values. The first byte is used to store the red value, the second byte stores the green value and the third byte stores the blue values. An easy way to observe these values is to convert this encoded long into a hexadecimal string. We then split this string using the Mid function to isolate the R, G, B components. Finally, we use a conveniently provided CDbl function to convert hexadecimal text to a number. Observe that appending the “&H” enables the CDbl function to identify the text as hexadecimal.

To recap, We can create professional looking Conditional Formatting color ranges by choosing the correct shades of whites and using the HSL model to tweak Saturation and Luminosity values. I am hoping that by using these techniques and subroutines, you will be able to automate the look and feel of your Excel Spreadsheets.

The attached workbook contains these color ranges and the VBA macros. We would love to hear your feedback on these colors.

A Robust Way To Reference Multiple Columns In a Table

I’m sure many of you have built a spreadsheet like this before: you’ve got a table of data; one of the columns in this table contains a formula which references a span of columns in the same table.  The number of columns you need to reference may change over time as requirements change.  How do you you build this table in such a way that the number of columns can be changed without breaking the formula that references them?

Let’s walk through a somewhat contrived but simple example to demonstrate the issue.  Say I’m a real estate investor and I’m tracking a list of houses I’m interested in purchasing.  I’m a demanding and detail oriented buyer so I want to know the square footage of the individual rooms in the dwelling.  Based on this information, I want to calculate the count of rooms in the dwelling as well as the total square feet of the rooms combined.  The table might look something like this:

image

The formula for the “#Rooms” column looks something like this:

=COUNT( Table1[[#This Row],[SqFt-Room1]:[SqFt-Room3]] )

Now, let’s say I want to add a new house, but this new house has four rooms.  Simple, just add a column, right?  Not so fast; depending on how we add that column our formulas for “#Rooms” and “Ttl Sq. ft.” may not update as expected.

Or let’s say I’ve changed focus to small condominiums and no longer have a need for the “Room3” column.  If I attempt to delete it, then my formulas will break.  How do we avoid this?

Here's one trick I use in such situations.  It may not be the best answer, but I’ll share it with you here and if others have suggestions to offer you can add them to the comments.

Solution: add an extra “dummy” column before and after the span as end caps, and refer to those columns in your formulas.  Using our real estate example, I would add a column before “SqFt-Room1” called “RoomsStart” and one after “SqFt-Room3” called “RoomsEnd”, like so:

image

I would keep these columns empty and never put data in them.  This is important because otherwise my COUNT and SUM formulas may return the wrong results.  Then I’d highlight the columns a different shade (this is totally optional but I personally like the visual effect of marking off the start and end of the span) and resize them to something very small so they are out of the way for the most part, like so:

image

Alternatively, you can hide the columns if you so desire.  Then I’d update my formulas so that they referenced these columns instead, like so:

image

Now, when I want to add a fourth room, I select the “RoomsEnd” column and right-click \ Insert and I get a new column which I can name “SqFt-Room4”.  This new column will automatically be included in my COUNT and SUM calculations.  Similarly, if I ever get rid of “SqFt-Room4”, I don’t have to worry about breaking any of my formulas.  It may not be the most elegant solution, but it gives me a virtually worry-free way to reference a changing number of columns in my table.

Analyzing Data: Functions or PivotTables

Today’s author, Monica Poinescu, a Software Developer in Test on the Excel team, discusses two different approaches to analyzing data in Excel.

Edit: I've attached a file at the bottom of this blog that contains spreadsheets of the examples discussed in this post.

My earlier blog on the new Excel 2007 function SUMIFS spawned a very interesting discussion (thanks to everyone who posted comments there): when trying to analyze/aggregate data in a table, how do we decide whether to use functions versus PivotTables?

This blog outlines reasons to use one option or another. To better illustrate the two alternatives I’ll consider a real estate inspired example: let’s say I have list of homes for sale and their corresponding characteristics:

clip_image002

and I’m trying to find the average price for those homes which have at least 3 bedrooms, a garage and are between 5 and 10 years old. Just by looking at the table, we see that only house3 and house4 satisfy all conditions and the average of their prices is 312,500.

Here is a functions based solution:

=AVERAGEIFS(G2:G6,C2:C6,">2",E2:E6,"yes",F2:F6,">1999",F2:F6,"<2004")

which returns $ 312,500.

To build a corresponding PivotTable, one can use several filters:

clip_image004

Advantages of PivotTables:

  1. While AVERAGEIFS is limited to using at most 127 conditions, PivotTables can handle more than 127 conditions.
  2. The elements of the set that fulfill all conditions are listed in the resulting PivotTable.
  3. PivotTables have a lot of flexibility: the ease of use offered by the new UI allows for a very quick detailed analysis of different available options. Nested layers in a PivotTable offer added results visualization.
  4. Several different approaches are possible: one can construct different PivotTables that answer the same question above.
  5. Particularly useful for large data sources: when relying on an external data source, you don’t need to bring all the data in Excel and one could, for example, use OLAP databases.

Advantages of using functions:

  1. It’s easier to see in one glance all the conditions being used either by looking in the formula bar, or by listing all criteria in separate ranges.
  2. The result updates immediately when adding rows to the source table, while PivotTables need to be refreshed.
  3. The criteria, when referenced in a cell, could be as well the result of another formula, while value filters in PivotTables can only use constants.
    For example, in the formula above, one could replace
    =AVERAGEIFS(G2:G6,C2:C6,">2",…
    with
    =AVERAGEIFS(G2:G6,C2:C6,I4,…
    where I4 contains another calculation ( e.g. =">"&FIXED(SUM(1,1),0)).
    The image below shows a corresponding PivotTable filter:
    clip_image006
    The last field will not accept formulas, only numbers.
  4. Formulas take little space and are easy to move around in a sheet.

Note that in both cases you can use the wildcard characters to define criteria. Also both solutions deal in a similar manner with missing data or errors in the range.

I don't know if I can be fully objective on this question because I'm more of a formula person myself. If I've missed a reason you should pick one approach over another, feel free to let me know the reason you use formulas or PivotTables for summarizing data by leaving a comment.

Excel VBA Performance Coding Best Practices

Today’s author, Chad Rothschiller, a Program Manager on the Excel team, is back with a follow up from his previous post on VBA and Excel performance.

I want to start off this post by thanking everyone who sent in their examples in response to my January request. It is incredibly helpful to be able to look at what you all are doing with Excel! Not only did I see a huge variety in how Excel is being used, you also pointed out various tips and tricks for writing fast VBA code in Excel.

In this post I'm going to share with you the most important performance tips I know about. There are tons of sites, pages, and people who are experts as well on this subject, have performed their own tests, and shared their results and ideas. If you think I missed an important concept for how to optimize Excel VBA performance, or if you've got a valuable comment or link to share, please feel free to post here so everyone can benefit. Thanks!

Turn Off Everything But the Essentials While Your Code is Running

This optimization explicitly turns off Excel functionality you don't need to happen (over and over and over) while your code runs. Note that in the code sample below we grab the current state of these properties, turn them off, and then restore them at the end of code execution.

One reason this helps is that if you're updating (via VBA) several different ranges with new values, or copy / pasting from several ranges to create a consolidated table of data, you likely do not want to have Excel taking time and resources to recalculate formulas, display paste progress, or even redraw the grid, especially after every single operation (even more so if your code uses loops). Just one recalculation and one redraw at the end of your code execution is enough to get the workbook current with all your changes.

Here's some sample code that shows how and what to shut off while your code runs. Doing this should help improve the performance of your code:

'Get current state of various Excel settings; put this at the beginning of your code

screenUpdateState = Application.ScreenUpdating

statusBarState = Application.DisplayStatusBar

calcState = Application.Calculation

eventsState = Application.EnableEvents

displayPageBreakState = ActiveSheet.DisplayPageBreaks 'note this is a sheet-level setting

'turn off some Excel functionality so your code runs faster

Application.ScreenUpdating = False

Application.DisplayStatusBar = False

Application.Calculation = xlCalculationManual

Application.EnableEvents = False

ActiveSheet.DisplayPageBreaks = False 'note this is a sheet-level setting

'>>your code goes here<<

'after your code runs, restore state; put this at the end of your code

Application.ScreenUpdating = screenUpdateState

Application.DisplayStatusBar = statusBarState

Application.Calculation = calcState

Application.EnableEvents = eventsState

ActiveSheet.DisplayPageBreaks = displayPageBreaksState 'note this is a sheet-level setting

Here's a quick description for each of these settings:

Application.ScreenUpdating: This setting tells Excel to not redraw the screen while False. The benefit here is that you probably don't need Excel using up resources trying to draw the screen since it's changing faster than the user can perceive. Since it requires lots of resources to draw the screen so frequently, just turn off drawing the screen until the end of your code execution. Be sure to turn it back on right before your code ends.

Application.DisplayStatusBar: This setting tells Excel to stop showing status while False. For example, if you use VBA to copy/paste a range, while the paste is completing Excel will show the progress of that operation on the status bar. Turning off screen updating is separate from turning off the status bar display so that you can disable screen updating but still provide feedback to the user, if desired. Again, turn it back on right before your code ends execution.

Application.Calculation: This setting allows you to programmatically set Excel's calculation mode. "Manual" (xlCalculationManual) mode means Excel waits for the user (or your code) to explicitly initiate calculation. "Automatic" is the default and means that Excel decides when to recalculate the workbook (e.g. when you enter a new formula on the sheet). Since recalculating your workbook can be time and resource intensive, you might not want Excel triggering a recalc every time you change a cell value. Turn off calculation while your code executes, then set the mode back. Note: setting the mode back to “Automatic” (xlCalculationAutomatic) will trigger a recalc.

Application.EnableEvents: This setting tells Excel to not fire events while False. While looking into Excel VBA performance issues I learned that some desktop search tools implement event listeners (probably to better track document contents as it changes). You might not want Excel firing an event for every cell you're changing via code, and turning off events will speed up your VBA code performance if there is a COM Add-In listening in on Excel events. (Thanks to Doug Jenkins for pointing this out in my earlier post).

ActiveSheet.DisplayPageBreaks: A good description of this setting already exists: http://support.microsoft.com/kb/199505 (Thanks to David McRitchie for pointing this out).

Read/Write Large Blocks of Cells in a Single Operation

This optimization explicitly reduces the number of times data is transferred between Excel and your code. Instead of looping through cells one at a time and getting or setting a value, do the same operation over the whole range in one line, using an array variable to store values as needed.

For each of the code examples below, I had put random values (not formulas) into cells A1:C10000.

Here's a slow, looping method:

Dim DataRange as Range
Dim Irow as Long
Dim Icol as Integer
Dim MyVar as Double
Set DataRange=Range("A1:C10000")

For Irow=1 to 10000
  For icol=1 to 3
    MyVar=DataRange(Irow,Icol)  'Read values from the Excel grid 30K times
    If MyVar > 0 then 
      MyVar=MyVar*Myvar ' Change the value 
      DataRange(Irow,Icol)=MyVar  'Write values back into the Excel grid 30K times
    End If 
  Next Icol
Next Irow

Here's the fast version of that code:

Dim DataRange As Variant
Dim Irow As Long
Dim Icol As Integer
Dim MyVar As Double
DataRange = Range("A1:C10000").Value ' read all the values at once from the Excel grid, put into an array

For Irow = 1 To 10000
  For Icol = 1 To 3
  MyVar = DataRange(Irow, Icol)
  If MyVar > 0 Then
    MyVar=MyVar*Myvar ' Change the values in the array
    DataRange(Irow, Icol) = MyVar
  End If
Next Icol
Next Irow
Range("A1:C10000").Value = DataRange ' writes all the results back to the range at once

Note: I first learned of this concept by reading a web page by John Walkenbach found here: http://www.dailydoseofexcel.com/archives/2006/12/04/writing-to-a-range-using-vba/

A previous Excel blog entry by Dany Hoter also compares these two methods, along with a selection / offset method as well: http://blogs.msdn.com/excel/archive/2008/10/03/what-is-the-fastest-way-to-scan-a-large-range-in-excel.aspx

...which leads me to my next point.

Avoid Selecting / Activating Objects

Notice that in the above-referenced blog post, the selection method of updating a range was the slowest. This next optimization minimizes how frequently Excel has to respond to the selection changing in the workbook by minimizing the selection changing as much as possible.

Range Example: Again, see the Excel blog post quoted above. It demonstrates that using selection is the slowest of the 3 methods discussed for reading and writing to ranges.

Shapes Example: Setup: I have 40 shapes on a sheet, and I want to write "Hello" in each of them.

Using the slower "selection" method, the code looks like this:

For i = 0 To ActiveSheet.Shapes.Count

   ActiveSheet.Shapes(i).Select

   Selection.Text = "Hello"

Next i

The much faster method is to avoid selection completely and directly reference the shape:

For i = 0 To ActiveSheet.Shapes.Count

   ActiveSheet.Shapes(i).TextEffect.Text = "Hello"

Next i

The concepts illustrated by the examples above can also be applied to objects other than Ranges and Shapes.

Note: I first learned of this concept, in the context of shapes, by reading a web page by Ron de Bruin found here: http://www.rondebruin.nl/shape.htm

Related Performance Paper

See the "Improving Performance in Excel 2007" paper on MSDN: http://msdn.microsoft.com/en-us/library/aa730921.aspx

This is a fairly detailed and comprehensive paper that introduces the bigger grid and increased limits in Excel 2007, and primarily focuses on Excel calculation performance and debugging calculation performance bottlenecks. There's also a short section on how to write faster VBA macros.

Other Performance Optimizations

While the above optimizations are what I consider the most important, there are a few other "honorable mention" optimizations I will mention briefly for you to consider.

Consider the performance gains by implementing your code's functionality via XLL / C-API. An overview and supporting materials for the SDK can be found here: http://msdn.microsoft.com/en-us/library/bb687827.aspx .

Declare variables with explicit types to avoid the overhead of determining the data type (repetitively if used in a loop) during code execution.

For simple functions used by your code in high frequency, implement them yourself in VBA instead of using the WorksheetFunction object.

Use Range.SpecialCells() to scope down the number of cells your code needs to work with.

Creating Dynamic Validation Ranges

Today’s author, Dany Hoter, a Product Planner for the Excel team, shares a neat trick he learned recently for creating dynamic data validation ranges.  The sample file used for this blog post can be found in the attachments at the bottom of this post.

The problem: Validating data entry based on hierarchical (parent child) data. The example used is regions and countries but it could be countries and cities, product categories and sub-categories, class and student name, etc.

You want to enter a region from a list of regions and in the next cell you want to select a country but only from the countries which belong to that region.

How do you define the list of countries to validate against? The trick is basing the country validation list on an expression which will point to a different range based on the region value.

Follow the instructions in the file itself (attached below) and see how it works.

Points to notice:

  1. In Step 2 named ranges are created automatically based on a selected range.
  2. The named range used for country validation uses relative reference to point to the list of values. In this way the same mechanism will work anywhere in the spreadsheet. The validation of the cell with country is based on the region value to the left of country. The actual expression is =INDIRECT(B13). For example the named range for countries in Europe is Europe. The indirect function uses the value of the region to point the relevant named range. B13 happens to be the cell to the left of the current cell when defining the name.
  3. The validation rules are propagated automatically to each new row in the table.
  4. The lists of regions and countries do not have to be in the same worksheet as the input cells.

How to Create a Professional Chart using Excel 2007

Today's author, Robin Wakefield, a Program Manager on the Excel team, discusses charts.

As a new program manager in Excel one of my first tasks was to understand what problems users encounter when building professional charts in Excel. I defined professional as following the principles of Edward Tufte by reducing the amount of non-data ink on a chart. On an earlier post, there were a lot of comments about this particular aspect and I agree we could do better in helping users accomplish this goal. I also looked at blogs and publications which were known to produce professional looking charts to determine what the salient aspects were. As I researched this further I found there were a lot of remarks on the web that pertained to Excel 2003 charting so I thought it would be worthwhile to share what I found; specifically what changes you can make to a default chart in Excel 2007 to make it more professional looking. Your feedback on this would be greatly appreciated as we are considering how we can build this into the product going forward.

In order to illustrate this let’s try to recreate a chart I found in the Wall Street Journal.

image

I manually created the data so it may be a little off but when I insert a column chart Excel creates the following default chart for me.

image

In order to make this chart look more professional like the Wall Street Journal you can make the following modifications.

Step 1: Update Formatting Settings

With the chart selected click the Formatting tab on the Ribbon. In the left corner you can choose the chart object you would like to format and click "Format Selection" to get the formatting dialog for the object.

clip_image004

Chart Area

  • Set Border Color = No Line.

Vertical Value Axis

  • In order to keep your scale the same regardless of changes in the size of chart set the minimum, maximum and major unit options to fixed amounts.
    • Minimum = –.1
    • Maximum = .08
    • Major Unit = .02
  • Set Line Color = No Line

Vertical (Value) Axis Major Gridlines

  • Set Line Style = Dash type to deemphasize the gridlines as they are helpful in understanding the data but not the emphasis of your chart.

Horizontal Category Axis

  • Set Axis Labels = Low
  • Set Major Tick Mark = None
  • Label distance from axis = 0
  • Uncheck Multi-Level Category Labels

Series 1

  • Set Gap Width ≈ 34%
  • Fill = Solid Color and choose a red color to match the WSJ chart

Step 2: Update Font Size

Select the chart and on your home tab select a font size which is large enough that you can still read the words but they are not the emphasis of your chart. In this example I set the font size to 8.

Step 3: Remove Legend

Given the example only has one data series the legend is unnecessary in this case.

Step 4: Add Title

Add a title to the chart to tell your reader what the chart represents. In order to mimic the Wall Street Journal chart you need to move your title to the upper left hand corner and set the alignment to left. You also need to format the text so the title is in a larger font and bold whereas the subtitle is in a smaller font.

Step 5: Add captions

In order to add the citation you need to manually add a text box with this information to your chart.

Step 6: Add a Vertical Axis Title

In order to display the units of the vertical axis without showing a % next to each number on the axis the best way to do this is to add a vertical axis title and enter %. You can then move this title to line up with first number on the vertical axis.

image

As you can see I was able to get pretty close to the original chart by changing formatting settings and adding some additional text. The thing I was not able to accomplish without a hacky workaround was to fill only the bottom portion of the plot area with a gray color.

Avoid Unessential Formatting

Throughout this article I have focused on what changes you can make to the default chart but I would like to take a moment to discuss the principle of avoiding unessential formatting. As part of Office 2007 the concept of Themes was introduced. This was an Office wide effort and is shared across Word, PowerPoint, and Excel. For more information about themes and how they work in Excel see this post.

Within each Theme there are multiple Chart Styles available which is described here. The way this works is the chart styles are predefined to map to accent colors, fills, and effects etc which are defined in the Theme. When a new theme is added the chart styles are automatically generated based on the Theme definition. Although this was a great way to produce many choices for users it had an unfortunate side effect in that some of the effects/fills are not optimized for charting as they are also used on shapes, smart art etc.

See the styles matrix below for an example of the styles that are created based on the Office Theme definition.

image

In order to reduce the amount of non-data ink I would recommend sticking to the first row of styles as they do not have any effects applied to them. As you move down the rows, effects get more intense and in some themes the effects make the chart unusable such as the Metro theme which adds a glow around the columns.

Within one row we provide a variety of color options as well. The first option is always a grayscale and this will be your best choice to reduce the amount of non-data ink. The second option is all 6 accent colors in the Theme and options 3-8 produce a monochrome chart based on one of the accent colors.

If you would like to add color to your chart the following themes are good options for color variety where the colors are differentiated but no one color stands out from the others.

  • Office
  • Civic
  • Median
  • Paper
  • Technic
  • Trek
  • Urban
  • Origin

Another great resource to help choose the right chart type for your data and programmatically determine when some of the above changes should be applied is the Chart Advisor. Check out the post here.

Posted by Joseph Chirilov | 35 Comments
Filed under: ,

Simple Expense Tracking With New Excel 2007 Functions

Today’s author, Monica Poinescu, a Software Developer in Test on the Excel team, gives an overview of some new functions in Excel 2007 and along the way shows us how to do some simple expense tracking.

Considering the current economic environment, one may desire to better keep track of personal expenses. The new functions introduced in Office Excel 2007 simplify this effort. Here is how: given a list of expenses sorted by dates and categories (food, travel, clothing, etc.), how do I find how much was spent for each category, per month? And what’s the average amount spent? The functions SUMIFS, AVERAGEIFS and COUNTIFS perform selective calculations: they take into account only those values which fulfill several criteria. The values could be in a range spanning several rows and columns.

Let’s assume I entered the expenses in a table:

image

The solution below uses the new Excel 2007 feature structured references; see 'Tables Part 3: Using Formulas with Tables' for more details on how to simplify your formulas based on tables.

I’m also using the function SUMIFS which extends the functionality of SUMIF: given a range to be summed, it will add only those values that fulfill all the given criteria (one could use between 1 and 127 criteria applied to corresponding ranges).

The syntax is SUMIFS( the_range_to_sum, range_to_apply_criteria1, criteria1, [range_to_apply_criteria2, criteria2], … ) where the first parameter tells Excel what range we want to sum and the following pairs of arguments tell Excel on which range to apply the respective condition.

In visual terms, I think of each pair (range_to_apply_criteria1, criteria1) as a punch card
with some (0 or more) perforations, in the places where the criteria is TRUE. Stack all these cards on top of each other, the_range_to_sum being on the bottom, and SUMIFS adds only the values that remain visible. 

Then start building your reports table by entering in F1:I3 the categories and the months:

image

and in G2 enter:

=SUMIFS( Table1[Amount], Table1[Category], G$1, Table1[Date], $F2 )

which will evaluate the ‘home’ expenses for January (235.35).

It should look like this:

image

Similarly, in H2 enter:

=SUMIFS( Table1[Amount], Table1[Category], H$1, Table1[Date], $F2 )

this will evaluate the ‘food’ expenses for January .

And in I2 enter:

=SUMIFS( Table1[Amount], Table1[Category], I$1, Table1[Date], $F2 )

to get your travel related amount.

You can copy/paste these formulas down as you add more months. This will automatically take into account new entries in your expense list. Also, in case you need to figure the average expense per category, within each month, you can use:

=AVERAGEIFS( Table1[Amount], Table1[Category], "home", Table1[Date], "January" ) - this returns 117.675

Quick and Easy Data Access with Excel Services

Today’s author, John Campbell, a Program Manager on the Excel Services team, shows us the quick and easy way to get external data access working with your Excel Services spreadsheets.

I have seen a lot of great blog posts that tell folks how to do great things with refreshed data in an Excel spreadsheet using Excel Services. I’ve seen posts about slicing dashboard pages, about using .odc files to manage data, and more. One thing that continues to be an issue is how to get connected to that data in the first place, in way that will work on Excel Services. The key is all about ‘telling the server’ what user identity to use when connecting to your data source. I know a number of people have gotten over this hump, but there are a lot of others who struggle here. In this posting I will shed some light on why the server works the way it does here, and show you the easiest way to get this up and running. I’m abbreviating in this post – so I’m not going to cover every last detail, option, and scenario here. If you want a lot more detail, and want to go deeper on why, and other options I don’t talk about here, then see the whitepaper I wrote on the topic: http://technet.microsoft.com/en-us/library/cc262899.aspx.

First, some quick terminology and background:

Credentials: The log in name/password of a user. These are used to establish your identity to the larger network/system.

Authentication: ‘Who you are’ – this is how the system verifies your identity.

Authorization: ‘What you can do’ – this is the process of the system determining what rights or permissions you have.

Connection String: This is a string, typically stored in a workbook or .odc file, which has the basic information needed to connect to a data source. It contains things like what server to use, how to authenticate, along with any special parameters that are needed to form the raw connection to the database.

Your goal is to get the user, who is viewing your spreadsheet in their browser, authenticated and authorized against the data source. (After all, your cool report showing sales figures sliced by time won’t be worth much if you can’t refresh the data to see the current numbers.) To be more specific, I am talking about getting an identity to use for authentication/authorization from the Excel Calculation Server (ECS) to the data source (circled in the picture):

clip_image002

Connection strings will typically contain information about how to authenticate to the data source, but, for various reasons, Excel Services can’t parse them to determine how to perform the authentication. So the user must explicitly set how the authentication is to be performed. There are three options for this: Windows (delegate your domain credentials to each box on the way to, and including, the data source), SSO (lookup a username/password combo to use on your behalf out of an Single Sign On database in SharePoint), or None (no specific options are set – just take the connection string and give it a try). By default, the Windows authentication is attempted because A) it is the most secure, and B) it has the best chance of working, assuming Kerberos is configured in a typical environment, without somebody (the administrator of the server or database) needing to do something special to make it work.

Ok, enough with the background info. So, you say you can’t setup Kerberos and don’t want to spend a bunch of time getting into details of authentication - so what’s the easiest thing to do? The rest of this blog is about how to configure the None option, and related server switches, to get external data refresh working on Excel Services.

Step 1: Configure the unattended account in the Excel Services administration page.

The unattended account is really just a set of credentials that Excel Services impersonates before certain types of external data connections are made. Think of it as a dummy user account that Excel Services keeps around for connecting to data. By default, many connection strings (like connections to SQL or Analysis Services), are configured to use whatever identity is available in the process at the time the connection is made. What all that boils down to is, if the connection string specifies that integrated security is used, then the identity of the unattended account will be used at the data source for authentication/authorization.

So here is how to do it:

On the server box, go to the Excel Services admin page: From the taskbar, click Start, point to All Programs, point to Microsoft Office Server, and then click SharePoint 3.0 Central Administration. Select the name of the SSP from the left-hand navigation bar. For example, SharedServices1 as shown.

clip_image003

Then, click Edit Excel Service Settings. Then scroll down and find the Unattended Account settings.

clip_image004

Enter in the name and password of an account that has access to the data sources you are planning to access, and click OK.

A few things to pay special attention to:

  • If any other users want to use this account for their data refresh, they will need to make sure it has access to their data sources.
  • Security is important! You should never use an account that has access to the SQL databases that are running your SharePoint farm. If the unattended account does have access to SharePoint, then it becomes possible for users to load a workbook on the server that connects directly to the SharePoint databases, bypassing most SharePoint security.

Step 2: Configure the data connection in workbook

From the Excel Data ribbon, in the Connections section, click Connections.

clip_image005

In the Workbook Connections dialog box, which shows all the external data connections that are currently being used by the workbook, select the connection that needs to be changed, and then click Properties.

clip_image006

In the Connection Properties dialog box, which allows many properties of the connection to be changed, click the Definition tab.

clip_image007

Click the Authentication Settings button.

From the Excel Services Authentication dialog box, select the None option.

clip_image009

Now just click OK on the various dialogs until you are back to your spreadsheet.

That’s all you need to do. Also note that with this None setting, if you use something like SQL authentication where a username and password are saved in a connection string, then SQL (or other data providers) should use those credentials to connect to the data. I say should, because Excel and Excel Services doesn’t control this – whether that username/password gets used is completely up to the specific implementation of the data provider.

If you want more information about why authentication works this way, more details about the other options, more information about managing data connections, or prescriptive guidance and Q&A’s in this area - then see the whitepaper, where I go much deeper, at http://technet.microsoft.com/en-us/library/cc262899.aspx .

New Case Study Involving Excel Services

A new case study has been published that describes how Microsoft IT used Excel Services to build a portal for finance data.  Here's a snippet from TechNet:

Controller Workspace: Connecting With Excel Services in Microsoft Office SharePoint Server 2007
Describes how Microsoft IT created a centralized user interface platform, called Controller Workspace, that enables users to view reports, key performance indicators (KPIs), and other finance data. Out-of-box features of Microsoft Office 2007 Excel Services in Office SharePoint Server 2007 provide the foundation for the centralized portal.

The case study can be found here.  There is a video demonstration at the link as well.

Posted by Joseph Chirilov | 1 Comments
Filed under:
More Posts Next page »
 
Page view tracker