Welcome to MSDN Blogs Sign in | Join | Help

Microsoft Excel

The team blog for Microsoft Excel and Excel Services.
A Robust Way To Reference Multiple Columns In a Table

I’m sure many of you have built a spreadsheet like this before: you’ve got a table of data; one of the columns in this table contains a formula which references a span of columns in the same table.  The number of columns you need to reference may change over time as requirements change.  How do you you build this table in such a way that the number of columns can be changed without breaking the formula that references them?

Let’s walk through a somewhat contrived but simple example to demonstrate the issue.  Say I’m a real estate investor and I’m tracking a list of houses I’m interested in purchasing.  I’m a demanding and detail oriented buyer so I want to know the square footage of the individual rooms in the dwelling.  Based on this information, I want to calculate the count of rooms in the dwelling as well as the total square feet of the rooms combined.  The table might look something like this:

image

The formula for the “#Rooms” column looks something like this:

=COUNT( Table1[[#This Row],[SqFt-Room1]:[SqFt-Room3]] )

Now, let’s say I want to add a new house, but this new house has four rooms.  Simple, just add a column, right?  Not so fast; depending on how we add that column our formulas for “#Rooms” and “Ttl Sq. ft.” may not update as expected.

Or let’s say I’ve changed focus to small condominiums and no longer have a need for the “Room3” column.  If I attempt to delete it, then my formulas will break.  How do we avoid this?

Here's one trick I use in such situations.  It may not be the best answer, but I’ll share it with you here and if others have suggestions to offer you can add them to the comments.

Solution: add an extra “dummy” column before and after the span as end caps, and refer to those columns in your formulas.  Using our real estate example, I would add a column before “SqFt-Room1” called “RoomsStart” and one after “SqFt-Room3” called “RoomsEnd”, like so:

image

I would keep these columns empty and never put data in them.  This is important because otherwise my COUNT and SUM formulas may return the wrong results.  Then I’d highlight the columns a different shade (this is totally optional but I personally like the visual effect of marking off the start and end of the span) and resize them to something very small so they are out of the way for the most part, like so:

image

Alternatively, you can hide the columns if you so desire.  Then I’d update my formulas so that they referenced these columns instead, like so:

image

Now, when I want to add a fourth room, I select the “RoomsEnd” column and right-click \ Insert and I get a new column which I can name “SqFt-Room4”.  This new column will automatically be included in my COUNT and SUM calculations.  Similarly, if I ever get rid of “SqFt-Room4”, I don’t have to worry about breaking any of my formulas.  It may not be the most elegant solution, but it gives me a virtually worry-free way to reference a changing number of columns in my table.

Posted: Wednesday, April 08, 2009 11:45 PM by Joseph Chirilov

Comments

sam said:

Joseph,

Why cant we simply have a LastRow or LastCol Function which returns the value/Address of the LastRow or LastCol in the row or column.

So Sum(A1:LastRow(A)) would give me sum of all

cells from A1:Upto As much as there is data

=LastRow(A) gives me the value in the last row

Currently I have to do this through a UDF

# April 10, 2009 5:14 AM

Joseph Chirilov said:

sam: It's an interesting idea. It would work for the simple example above, but not so for a more complex table. For example, the idea for this post came as I was creating a scheduling spreadsheet, and I had a span of columns to represent individual developers, and another span of columns to represent individual testers. I needed to do calculations based on the entire set of developers and testers independently. A LastCol function would not have worked for me in this case.

# April 10, 2009 4:46 PM

Tim said:

This technique also works when referencing data in multiple tabs in an Excel workbook

example: sum(starttab:endtab!A1)

# April 13, 2009 6:04 PM

Joseph Chirilov said:

Good point, Tim.  It sure does.

# April 14, 2009 12:31 PM

Steve Edwards said:

Why not use the OFFSET function inside your SUM or COUNT to point to the changing range? The HEIGHT and WIDTH arguments can be used to refer to any area.

For example:

SUM(OFFSET(A1,,,1,4))

Returns the total of the range starting from A1 and continuing 4 columns across to the right along the same row (The HEIGHT is 1 and the WIDTH is 4) Enter the Width argument as a cell reference and by changing the cell value you can refer to any area.

# April 16, 2009 8:36 AM

Joseph Chirilov said:

Steve: thanks for the suggestion. That works too, but for my way of working with Excel it's a bit more cumbersome as I'd have to remember to update the cell that contians the width if I ever add/delete columns, which just increases the chances of making my spreadsheet incorrect.

Also, OFFSET is volatile, which introduces other considerations.

# April 17, 2009 1:54 AM

Kathy B said:

Can someone expand on the "lastrow" formula above and also what is a UDF

# April 17, 2009 8:21 PM

wolfewa said:

Sorry to say, but this is excel 101 and not something that warrants discussion.  sorry....but it's true.  

# April 18, 2009 1:28 AM

Paul Haywood said:

Would this solution work when the data itself is being pulled into Excel via ODBC?

Regards

# April 20, 2009 12:43 AM

Joseph Chirilov said:

Kathy: LASTROW is not a function that actually exists in Excel. Sam suggested it as a possible new feature. A UDF is a "user-defined function". Basically the idea is Excel allows you to write your own funtions that can be called in spreadsheet formulas just like any other function. UDFs can be created in a number of ways, the most common of which is in VBA (Visual Basic for Applications).  Sam was saying he wrote a UDF called LASTROW which tells him where the bottom of the data is on a given sheet.

wolfewa: it may not be an advanced topic but there are many Excel users out there who would benefit from this information. This blog site does not cater only to advanced users.

Paul: Yes this solution should work in that scenario.

# April 20, 2009 1:01 PM

Salvius said:

"I'd have to remember to update the cell that contians the width if I ever add/delete columns"

But you can also enter a COUNT() or COUNTA() function, to calculate the width/height based on what's actually there. For example (and just in case there are any differences, I'm using Excel 2003):

SUM(OFFSET($A$1,ROW()-1,3,1,COUNTA($1:$1)-3))

This formula (pasted into column C in each row) will correctly calculate total square footage in your first sample table, and additional rooms can be added simply by adding a "SqFt-RoomX" header in the header row and inputting the numbers. I suppose there is also the caution that the columns must be contiguous...

I use this trick all the time, usually by using the offset formula to define a named range, which can then be used in other formulas. It's great for things like dynamically resizing charts/graphs, which can use named ranges for data sources.

# May 6, 2009 6:59 PM
New Comments to this post are disabled
Page view tracker