Welcome to MSDN Blogs Sign in | Join | Help

Microsoft Excel

The team blog for Microsoft Excel and Excel Services.
Multi-threaded calculation in Excel, or “how calculation can become much faster in Excel 12”

One of the things I mentioned in my overview of Excel 12 post in September was that we had done some work to speed up calculation on modern hardware (multi-processor or dual-core chips).  I thought I would take a brief break from tables to summarize that work and to see if any readers are interested in beta testing this feature.

Multi-Threaded Calculation

To a large number of customers, Excel’s calculation speed is extremely important – perhaps the most important “feature” we ship.  When planning Excel 12, we started a small investigation to look at different ways we could make Excel calculate faster on computers that had multi-processor or dual-core chips.  The investigation turned out to be promising, so we continued the work, and the result is a very exciting feature that we refer to as multi-threaded calculation, or MTC.  (Note – this is another working name, not a final name.  Also note that our developers refer to this as MTR, or multi-threaded recalculation, but since most customers use the word calculation, I have decided to go that route for this post.)

In a nutshell, this feature enables Excel to spot formulas that can be calculated concurrently, and then run those formulas on multiple processors simultaneously.  The net effect is that a given spreadsheet finishes calculating in less time, improving Excel’s overall calculation performance.  Excel 12 can take advantage of as many processors (or cores, which to Excel appear as processors) as there are on a machine - when Excel loads a workbook, it asks the operating system how many processors are available, and it creates a thread for each processor.  In general, the more processors, the better the performance improvement.
 
Unsurprisingly, the design of the spreadsheet will have a direct impact on the size of any performance increase.  At one end of the spectrum, a spreadsheet that has a lot of completely independent calculations (like a Monte Carlo simulation) should see enormous benefit.  At the other end of the spectrum, a spreadsheet that has one completely linear set of calculations (where each formula depends on the previous formula, in linear fashion) will not see significant performance increases.  The majority of the spreadsheets we see are in between these two ends of the spectrum, so we think that most everyone will see some amount of benefit from this work.  Additionally, people who care about performance can tweak their spreadsheets to take advantage of this capability.

We think this is really exciting work – we are still profiling and testing performance and making tweaks, but based on what we have seen so far, things like Monte Carlo simulations (which are almost ideal test cases) see almost perfectly linear speedup - that is, using 2 threads on a dual-processor or dual-core machine cuts calc time nearly in half – and the majority of other workbooks also see noticeable reduction in calculation time.  This is especially exciting because of the increasing availability of dual-core chips in standard business desktop machines … simply installing Excel 12 on these machines should lead to calculation performance improvements for users.

I can hear developers asking “what about user-defined functions (UDFs)?”  The answer is that in Excel 12, we have enabled functions defined in XLLs to participate in multi-threaded calculation; VBA and automation add-in UDFs will not be multi-threaded (meaning, for example, that a VBA UDF and everything that depends on it cannot be processed simultaneously with other formulas).  We have updated the XLL interface to allow developers to advertise their XLL functions as thread-safe (and to reference the big grid and a few other things to be covered in a later post about XLLs).

A few other things about this feature:

  • Calculation results are not affected by MTC, just the speed with which they are executed
  • Users can turn the feature off if they so desire - with MTC turned off, Excel 12 will calculate the same way as it has in previous versions
  • The first time a workbook is calculated on a machine with a different number of processors, there is some overhead incurred while Excel examines dependencies, so the maximum performance increase will be seen on the second and subsequent calculations (though there will still be improvement on the first calculation)
  • Users can also manually specify the number of threads they want to run simultaneously, so if they want to run more than the number of processors on the machine, they can

Regarding that last point - you may be asking yourself why you would want to specify more threads than the number of processors - the reason we enabled running multiple threads per processor was to help out our customers who run workbooks that make a lot of external calls using XLL user-defined functions (UDFs).  In certain industries, we see lots of customers that have slow-calculating spreadsheets where the bulk of the calculation time is spent outside of Excel in UDFs … the UDFs often run on other machines.  For example, a customer may have a spreadsheet that makes 100 calls to a number of XLL UDFs on a server.  The UDFs perform data retrieval or intensive calculations.  Once all 100 UDFs have returned results to Excel, the spreadsheet then does some analysis on the results of those calls.  Given that Excel 2003 is single-threaded, if each call to an XLL UDF takes 30 seconds, the model takes almost an hour to run, even on a dual-processor machine.  If they ran the same spreadsheet on Excel 12, by default, the calculation time would be cut roughly in half, since Excel 12 would make 2 XLL UDF calls at a time – one for each processor.  If the user then manually set the number of threads to 8, Excel 12 could make 8 XLL UDF calls simultaneously, meaning the calc time would fall from close to an hour to a matter of minutes, assuming the server can return the data at the same rate in either case.

Would you like to help?

The Excel team has reserved a number of positions in our upcoming technical beta for individuals that have the interest, time, hardware, and spreadsheets to help test the work we have done in this area (and all the other work we have done, although we are looking for a few people that are particularly interested in calculation performance).  If you are interested in helping test this feature, please use the “Email” link above to contact me and answer the questions below.  Before you do so, however, please take a look at a few caveats:

  • Our technical beta is private, meaning you will be required to sign a non-disclosure agreement with Microsoft, and you will not be able to discuss the software with anyone outside the beta until the non-disclosure agreement is lifted at a later time 
  • You will need a dual-processor or dual-core machine running a recent MSFT OS to test this feature, and you will need to be comfortable installing beta software on the machine 
  • You will need to be able to devote time during the beta to running your spreadsheets and communicating issues/results to our development and support folks 
  • You will need a set of calculation-intensive spreadsheets to start with – the more variety, the better 
  • Since I have no idea what the response will be to this sort of request for help, we may end up with much more interest than we have room for, in which case I will pick folks randomly to participate … so sending me an email unfortunately does not guarantee we can give you a slot 
  • These positions are for individual testers, not companies.  No problem if you work for a company, but we are looking to sign up individuals at this point, not entire departments.

If you have reviewed all that and are interested, please send me an email with the following:

  • Some background on your company, your job/function, and how that involves Excel 
  • An idea of the hardware you have available for testing 
  • A description of the calculation-intensive spreadsheets that you have for testing – what do they do, what sort of features do they use, how long do they take to calculate, etc. 
  • An idea of how much time you could devote to testing per week over the course of a beta 
  • Whether you have already been nominated for Office 12 beta testing

Again, I can’t promise a spot to everyone that replies, so please don’t be disappointed if we don’t have space.  And thanks in advance to anyone that does contact me.

Next time, more on tables, I promise.

Posted: Thursday, November 03, 2005 2:21 PM by David Gainer

Comments

Harlan Grove said:

On single processor/single core machines would XL12 run faster with threading disabled since that'd presumably eliminate the new dependency checking (on initial recalc) and unavoidable thread overhead (on all recalcs)? If so, will XL12 installation enable or disable threading based on how many processors/cores are available during installation?

If a user had, say, two workbooks open at the same time with no interworkbook references between them, the formulas in each would be independent. Would XL12 assign separate recalculation threads to each workbook so the user could continue working on workbook A while workbook B continues to recalculate? That'd be useful on single processor/single core machines.

Getting back to dependency checking, will XL12 store metadata on formula dependencies in the new XML workbooks? If workbook X had last been recalculated and save on a dual core machine with threading enabled, then opened and saved on a machine with threading disabled (no matter how many processors/cores it might have), would the metadata be lost? How will this affect shared workbooks?

Will OFFSET, INDIRECT and INDEX(..):INDEX(..) defeat dependency checking, or will it work similar to circular reference detection?
# November 3, 2005 6:23 PM

Orion Adrian said:

Harlan - I would imagine that even single processor systems could gain benefit if there were some other external dependancy (e.g. the hard disk, the network) that was slower.

Also I'd wait on the requesting the exact methods for tuning since it's the kind of thing most likely to shift.
# November 4, 2005 9:38 AM

mschaef said:

Interesting stuff.

Can you comment more on how Excel handles forumula evaluation? At one point I thought I had heard it actually compiles worksheet formulas to machine code...
# November 4, 2005 10:21 AM

Mario Goebbels said:

I guess you're leveraging the code that checks for circular references to find all independent strands?

And if there'd be test spreadsheets being supplied, I could make them run on my old trusty dual P3-933 running Windows 2003. I have no idea if I'm nominated though; until now I've just received an invite and signed up.
# November 4, 2005 10:40 AM

Charles Williams said:

Sounds really good.
Whilst the subject is the calculation engine:

Have you added any user control of the granularity of the dependency tree? (pre excel 2002 worksheet level, 2002/3 global level). Multi-threaded recalculation would seem to naturally fit with a more granular approach, and whilst most workbooks calc faster with the global dependency tree there are some that calc slower.

Any exploitation of 64-bit processors?

Any improvements to Range.calculate (it got slowed down in XL 2002 when it was changed to handle within-range dependencies)?

have you added a workbook.Calculate method to complement range.calculate sheet.calculate and all_open_workbooks.calculate?
# November 4, 2005 11:04 AM

David Gainer said:

Hi folks,

Harlan – we will be smart and on single core/processor machines we will not do any of the work required to multi-thread, so users will not see a performance impact and they will not need to disable threading. The scenario you outline about working on workbook A while workbook B is calcing will not be supported in 12 – that would take more work than just the calculation engine. We are still finalizing designs around handling dependency information. With respect to OFFSET etc., no, it will not defeat dependency checking.

mschaef – what specifically are you interested in hearing about?

Mario – shoot me an email and we can discuss further.

Charles – we have not added any control over the granularity of the tree, though we do hear some requests. No specific exploitation of 64-bit processors right now. We have not changed range.calculate nor added a workbook.calculate method. I would be interested to hear the sorts of scenarios that give rise to these questions if you can share.
# November 4, 2005 2:58 PM

Charles Williams said:

David,

I think the Excel team have been absolutely correct to pursue two approaches to improving calculation speed: generic improvements for all users and controllability improvements for specialist users (typically traders & users of large financial /forecasting /engineering /decision support models).

Scenarios:

Range.calculate without dependencies. For cases where you need the fastest possible recalculation and can control the design of the critical block of formulae Range.calculate is still often the fastest way, but the introduction of the within-range dependency calc has slowed it down, often significantly.

Workbook.calculate: when you have a large number of workbooks with some volatility open (which happens more often as Excel's memory capacity increases and the price of RAM falls) you often want to recalc only the active workbook, or in more complex cases control the calculation sequence of the workbooks. Currently you can program this using a somewhat complex scheme with worksheet .enablecalculation, but this method has several drawbacks. Many users are surprised to find that F9/Automatic recalculates all open workbooks rather than just the active book. So it would be good to have an additional calculation option (Active workbook only) and the ability to calculate a specific workbook (workbook.calculate)

Tree granularity; Its difficult to give a specific scenario, but when you have a complex slow-calculating workbook you need all the help you can get: for some workbooks the old method is faster than the new method and you could tweak calculation speed by moving stuff between sheets. Probably also need workbook-level granularity for workbook.calculate.

64-bit: I guess the questions are: what performance improvement is possible and when will there be a sufficient number of 64-bit processors in the marketplace?
# November 5, 2005 6:40 AM

Wim Van de Velde said:

One could reasonably expect that Automatic Recalc only recalculates the active workbook and those workbooks in RAM that have dependencies on the active workbook.

For MTC : does that mean that DataTables could be executed in parallel ? That would be great. We rely on them a lot, as well as on array formulas, VBA UDFs (slow, too), and any combination of the above.
# November 7, 2005 4:51 AM

gkoehler said:

Not a high powered financial analyst, but I the quicker my pest forecast models run the better. 64bit Windows is here, but no use if the software doesn't run take advantage of native 64 bit mode. Combination of 64 bit and hyperthreading with an Athlon X2 is enticing. Would like to see speed tests with Excel 12 set to use these abilities. Plenty of 64 processors sold in last 2 years to justify supporint 64-bit.

According to article below, upgrading to Athlon X2 should be straight cpu swap


from: http://hardware.gamespot.com/Story-ST-17478-1945-x-x-x

If you already own an Athlon 64, AMD made
the upgrade path to the Athlon X2 very easy. Most existing socket 939 motherboards that support the Athlon 64 should support the Athlon X2. AMD claims that the vast majority of motherboards will be able to run Athlon X2s with nothing more than a BIOS update. Be sure to check your power supply though, as AMD recommends the use of a 550W power supply for use with the Athlon X2.

# November 7, 2005 10:34 AM

David Gainer said:

Thanks everyone for the feedback.

Wim - yes, Data Tables and array formulas will benefit from multiple procs/cores. VBA UDFs will not (as explained above).
# November 7, 2005 4:35 PM

Microsoft Excel 2007 (nee Excel 12) said:

In the past few months, I have written a couple of articles (big grid, multi-threaded calculation) that...
# May 3, 2006 9:34 AM

Office Rocker! said:

Some mention of this was made on the Excel blog about the performance improvements that can be derived...
# August 17, 2006 10:04 AM

Microsoft Excel 2007 (nee Excel 12) said:

Fairly recently, the Microsoft Windows Compute Cluster Server 2003 team publically released their product. ...
# September 21, 2006 6:36 PM
New Comments to this post are disabled
Page view tracker