Welcome to MSDN Blogs Sign in | Join | Help

Microsoft Excel

The team blog for Microsoft Excel and Excel Services.
Don't forget about the "A" in "VBA"!

Today's author, Stephane Viot, a Program Manager in the Office Global Experience Platform team, reflects on how leveraging application features could help speed up the development of your VBA solutions, while improving their performance.

Often -but hopefully not always- VBA developers forget about the "A" in "VBA". And by that I mean they might have the tendency of trying to do everything in VBA, like they would in VB, forgetting about what the host application has to offer. To illustrate that point -that is the main purpose of this post- let me share with you a practical example.

The problem

I often need to extract the list of unique values from a range -usually from a list or a table- and paste it into a mail as a comma-delimited string where all values are sorted in alphabetical order.

For example, given the following list, how would you pull out the list of unique states you are shipping products to, in alphabetical order, using VBA code -so that you can re-use it over and over?

image

The VBa-only approach

Before actually getting into the details of a better solution, let's see how one could have tackled the problem, using VBA code only. Even if variants exist, the basic steps would be to:

1. Loop through each cell in the current selection and construct an array of unique values.

2. Sort the array.

3. Build the output string using the list separator provided by the user.

4. Copy the output string to the Clipboard.

When working on very large lists, extracting the unique values from a range using the preceding technique might show its limits, and be quite slow. Sorting the values, too.

Now, think how the application could help you do such things. In other words, what steps would you go through if you had to do it "manually"?

The manual steps

This is what I would personally do.

  1. Copy the data from the Ship State column (D2:D49; skipping the header cell), and paste it into a new worksheet or workbook:
    image 
  2. Click the Data tab in the Ribbon, then Remove Duplicates (under the Data Tools group):
    image 
  3. In the Remove Duplicates dialog, uncheck My data has headers (provided you haven't copied the "Ship State" column header earlier), then click OK.
    image 
  4. The results of the preceding operation would look like the following, after having sorted the list in alphabetical order using the Sort button:
    image 
  5. Now that we have the list of unique values sorted in alphabetical order, how would you get the text "CA, CO, FL, ID, IL, NV, NY, OR, TN, UT, WA, WI" from that list? Again, do not think VBA code and how you could concatenate the content of each cell to build the output string. Think formulae instead. Here is one way of doing it:

    B1: =A1
    B2: =B1&", "&A2

    Then copy the formula in B2 down to B12:
    image 

    So the results of the calculation would be:
    image 
  6. As you can see, B12 now contains the string we are looking for. All we have to do next is select that cell and copy/paste it into the e-mail message -and get rid of that extraneous workbook or worksheet that is no longer needed.

On very large lists, I can guarantee that you will be amazed to see how fast Excel Remove Duplicates can be. And I would bet, without taking any risks, that it would be much faster than any VBA code anyone could write.

As you have probably guessed it by now, a better solution to the Vba-only approach would be to mimic those manual steps.

A better solution: The CopyUniqueValues Add-In

Rather than providing the full VBA code listing here (available for download at the end of this post), let's instead see how the add-in is organized.

The CopyUniqueValues add-in contains:

  • One worksheet, internally named shtTemp, for extracting the list of unique values;
  • One UserForm, frmCopyUniqueValues, for selecting the delimiter and sorting options for the list;
  • And one module, modCopyUniqueValues, for performing the "dirty work".

Below are the delimiter and sorting options users can select:

image

Note: The choice of the delimiter would be determined by the intended use. For example, for building a list...

  • ...to be pasted in a Word document, in place of a table -> select "Comma and space";
  • ...of e-mail addresses to send a message to -> select "Semi-colon";
  • ...that can be used as the source of a Data Validation -> select "Carriage return";
  • ...of column headers for a new table you are creating -> select "Tab"
  • Etc.

The GetUniqueStringsList function, in modCopyUniqueValues, is responsible for creating the sorted list of unique values. It relies heavily on Excel features exposed through its Object Model. It performs the following actions:

  1. Copy/paste append the cell values from the source range to column A in shtTemp, in order to build the complete list of values (still containing duplicates);
  2. Remove duplicates from that list by calling the RemoveDuplicates method.

    Note: The RemoveDuplicates method was first introduced in Excel 2007. In order to make the code work in earlier versions of Excel, you would need to use the AdvancedFilter method instead.
  3. Optionally, sort the list of unique values using the Sort method.
  4. Add the formulae discussed earlier for constructing the output string.
  5. Return the output string from the last cell in shtTemp to frmCopyUniqueValues, where it is copied to the Clipboard using the MSForms.DataObject SetText and PutInClipboard methods.

Conclusion

The intent of this article was to remind (as a piece of advice) VBA developers to always check how the host application could help them solve complex operations, before exercising their coding skills.

When writing Excel macros, I am indeed relying on the application even for the simplest tasks (such as the one below), saving me literally lines of VBA code. For example, I would often favor retrieving a SQL statement I want to use for querying a database from a cell rather than construct it programmatically:

  • Cell A1 would contain the SQL statement "template" e.g., "SELECT FirstName, LastName FROM Contacts WHERE EmailAddress='{0}' ORDER BY LastName, FirstName";
  • Cell A2, the value to use for the "{0}" parameter e.g., "johndoe" (supposedly provided by the user);
  • And cell A3 would have the formula "=SUBSTITUTE(A1,"{0}",A2)", returning "SELECT FirstName, LastName FROM Contacts WHERE EmailAddress='johndoe' ORDER BY LastName, FirstName".

That way, I would simply get the value of the SQL statement from cell A3 and be ready with it.

Posted: Friday, October 17, 2008 11:36 AM by Joseph Chirilov
Filed under: ,

Attachment(s): CopyUniqueValues.zip

Comments

Jon Peltier said:

This is a very good thing to remind programmers of. Unfortunately, many programmers are as unfamiliar with the parent application as they are with the business domain they are programming for.

Excel is a fine platform for application development, because it has an extensive calculation engine and a decent presentation layer. But I have seen Excel VBA "solutions" (I hesitate to use that word) which reproduce many built-in features of Excel, usually much less efficiently.

# October 17, 2008 4:25 PM

Mike Woodhouse said:

Very true. The "dive for VBA" thing is symptomatic of the user who's trying to make a little knowledge go too far.

Throw in a good understanding of indirection, offset, lookups, match/index and array formulae, and there's an astonishing amount one can achieve without ever getting to the VBE. And once that's mastered, I reckon the user will know when to go to code.

I have to confess to having written a VBA "RangeJoin" function, that wraps Join(), mind you.

# October 17, 2008 4:28 PM

dermotb said:

Stephane, VBA is faster than you may think. I can read in and sort unique values from 60,000 cells, in just 0.2 seconds.

This is because VBA can create unique lists extremely quickly using Collections, and readily available QuickSort code handles the sorting.

So VBA can replicate the functionality of the add-in without using any Excel functions. That doesn't mean it's a better solution, naturally, but let's not undersell VBA!

# October 17, 2008 7:53 PM

Stephane Viot said:

Thank you all for your comments.

Dermot, I guess my post is as much about performance as it is about not "reinventing the wheel". Your observation as well as Dany's excellent article ("What is the fastest way to scan a large range in Excel?" at  http://blogs.msdn.com/excel/archive/2008/10/03/what-is-the-fastest-way-to-scan-a-large-range-in-excel.aspx) clearly show that VBA can be extremely fast... when used properly. The other benefits of taking advantage of the application OM usually are more compact code, less complex code, and (in my case :-)) less buggy code. Think about the number of developers and testers behind the OM...

# October 20, 2008 11:11 AM

gkeramidas said:

please don't forget about the speed in excel 2003 and port it over to excel 2007 in sp2. excel code execution in excel 2007 is atrocious and you should be ashamed of yourselves.

# October 23, 2008 1:23 AM
New Comments to this post are disabled
Page view tracker