Welcome to MSDN Blogs Sign in | Join | Help

Christian Stich on Excel Services: Financial option valuation implemented with Monte Carlo simulation using multithreaded User Defined Functions (UDFs)

Today’s author: Christian Stich, a program manager on the Excel Services team who likes to combine his finance and engineering backgrounds.  Christian is going to talk about building a multithreaded UDF for Monte Carlo simulations and using it for options pricing.

The example code and workbook in this post demonstrate the implementation of a European Put/Call Option valuation model using a high performance Monte Carlo simulation.   Readers with a finance background realize that European options can be valued using the Black-Scholes options pricing model – the sample workbook also includes a calculation of the options values using Black-Scholes for comparison.  This example is only meant to demonstrate the approach and implementation and does not intent to solve options valuation in the most efficient manner.   The same basic Monte Carlo approach can be extended to price all kinds of derivatives/options (including American and Asian options as well as exotic options); moreover, the underlying distributions are typically not normal/log-normal.  This is due to the fact that extreme price fluctuations are not as rare in reality as predicted by normal/log-normal distributions.  Financial options are extremely important in risk management and mitigation, hedging and arbitrage and thus accurate and rapid pricing of options is very desirable.

The source code for the UDF and the workbooks can be downloaded at:

http://officeblogs.net/excel/MonteCarlo-Source Code+Workbooks.zip

 

Why use Monte Carlo Simulations?

Monte Carlo simulations are extremely useful in those cases where no closed form solutions to a problem exist.  In some of those cases approximations to the solution exist; however, often the approximations do not have sufficient accuracy over the entire range.  The power of Monte Carlo simulations becomes obvious when the underlying distributions do not have an analytical solution.  Monte Carlo simulations are extensively used in the physical and life sciences, engineering, financial modeling and many other areas.    In fact, significant amounts of processing times on supercomputers across the globe are used for Monte Carlo simulations.

 

Advantages of using Excel and Excel Services for financial, engineering or scientific simulations

Excel and Excel Services provide a great platform for modeling of complex scenarios.  The UDF is used to generate the inverse cumulative distribution values which are used to calculate the options values using formulas in the Excel workbook.  Keeping the formulas for the option pricing model in Excel (rather than incorporating them into the UDF) provides a number of advantages.  It allows subject matter experts (in this case financial analysts) to easily and quickly modify the model without having to resort to coding in C# or other programming languages.   Moreover, the fact that the UDF source code does not need to be shared helps protect companies’ intellectual property.  This is also very important to 3rd party companies that provide plug-in solutions for Excel and Excel Services.  Similarly, the fact that access to the models in the Excel Services workbooks can be tightly controlled and is not required for the coding of the UDF also helps the companies’ safeguard intellectual property. 

 

Advantages of using UDFs with Excel Services

UDFs allow programmers to write code that is compiled into machine language and not interpreted/evaluated, resulting in potentially significant performance gains.  UDFs are well suited for implementations of Monte Carlo simulations as they are very computationally intensive.  Data transfer between UDFs and Excel Services is very fast – depending on the server hardware several hundred thousand cells in the workbook can be read and written by the UDF per second.   This allows creating models in Excel Services workbooks that exchange large amounts of data with the UDF without incurring a heavy penalty due to the data transfer. 

 

Advantages of multithreading

CPU manufacturers are rapidly moving toward multi-core CPU architectures, increasing performance through parallelism in addition to the continuing increases in processing speeds.  Today’s mainstream CPUs are dual- or quad-core with hyperthreading whereas a few years ago most CPUs contained a single non-hyperthreaded core.  The number of processing cores is expected to rise steeply in the near future – harnessing the processing power of those CPUs requires programs to be multithreaded to make efficient use of the parallelism.   Furthermore, servers are nowadays available with more than two CPUs, each of which contains multiple hyperthreaded cores itself.  The approach taken in this example Monte Carlo simulation is the essentially a non-recursive divide and conquer algorithm.  The simulation is divided into a number of chunks which are then calculated on the different processing cores in parallel.  This type of simulation is very well suited to be parallelized as each individual part of the simulation is independent.

 

The code

For additional information about using UDFs with Excel Services take a look at Shahar Prish’s How UDFs work in Excel Services - a Primer, Danny Khen’s posts on UDFs - Excel 2007 investments in UDFs #1 & Excel 2007 investments in UDFs #2: Existing UDFs, and Excel 2007 investments in UDFs #3: client/server solution using a core library.

Overview of the code

The code consists of a number of declarations and four methods.  The four methods are MonteCarloNormalDist,  CalculateDistribution, MultithreadedCalculateDistribution, and CalculateDistributionThread.  The MonteCarloNormalDist methods purpose is to act as the interface between Excel Services and the actual Monte Carlo simulation which resides in the remaining methods.  The CalculateDistribution method splits the simulation into a number of equal sized chunks and calls on the MultithreadedCalculateDistribution method.  The MultithreadedCalculateDistribution method itself calls the CalculateDistributionThread method which performs the actual Monte Carlo simulation. 

Declarations

First we create a new class which we call MonteCarlo.  We mark the class as a UDF class in order to make it available to Excel Services.   We also define a class called ThreadWorkItem – it contains the variables for the individual worker threads (more about those further down in the code).

 

using System;

using System.Threading;

using System.Collections.Generic;

using System.Text;

using Microsoft.Office.Excel.Server.Udf;

 

namespace MonteCarlo

{

    [UdfClass]

    public class MonteCarloClass

    {

        public class ThreadWorkItem

        {

            public double[] Result;

            public int thread;

            public int count;

            public int iter;

            public double mean;

            public double stdDev;

            public int distribution;

            public double minVal;

            public double maxVal;

            public ManualResetEvent ManualEvent;

        }

 

 

 

MonteCarloNormalDist

 

Next we implement the MonteCarloNormalDist method and mark it as a UDF method and as volatile.  This method supports array formulas as it determines the number of cells that are passed to the UDF.  It then calls the CalculateDistribution method with the number of cells, desired number of threads, and the parameters that define the normal or log-normal distribution.  If a cumulative distribution function is to be returned the method adds the values of the probability density function in order to arrive at the cumulative distribution function.  Furthermore, this method can also generate an inverse cumulative distribution distribution which is needed for our options pricing model.  Finally, the MonteCarloNormalDist method returns in calculated values to Excel Services.

 

 

        [UdfMethod(IsVolatile = true)]

        public object[,] MonteCarloNormalDist(

            object[,] cellArray,

            int iterations,

            int threads,

            double mean,

            double stdDev,

            int distribution,

            double minVal,

            double maxVal)

        {

            // get size of array

            int cellArrayCount = cellArray.GetLength(0);

            int i = 0;

            int j = 0;

 

            // get number of processor cores present in the system

            int processorCores = Environment.ProcessorCount;

      

            object[,] result = new object[cellArrayCount, 1];

 

            double[] tmp = new double[cellArrayCount];

            double cumulativeSum = 0.0;

            double stepVal = 0.0;

            double curVal = 0.0;

 

            if (iterations > 0)

            {

                // round UP to nearest even number

                if (iterations % 2 != 0)

                {

                    iterations++;

                }

 

                // run number of threads as specified upto number of 'cores'

                // use "if (threads == 0)" in order to allow for more threads than 'cores'

                if (threads == 0 || threads > processorCores)

                {

                    // 0 threads indicated default -> use number of 'cores'

                    threads = processorCores;

                }

                tmp = CalculateDistribution(

                       cellArrayCount,

                       iterations,

                       threads,

                       mean,

                       stdDev,

                       distribution,

                       minVal,

                       maxVal);

 

                // bit 0: 0 -> normal, 1 -> lognormal

                // bit 1: 0 -> probability, 1 -> cumulative

                // bit 2: 0 -> 'regular', 1 -> inverted cumulative

                for (i = 0; i < cellArrayCount; i++)                      

                {

                    if ((distribution & 0x6) != 0)        

                    {

                        // cumulative distribution function

                        cumulativeSum = cumulativeSum + tmp[i];

                        result[i, 0] = cumulativeSum / (double)iterations;      

                    }

                    else

                    {

                        // probability density function

                        // tmp[i] = tmp[i] / (double)iterations * (double)cellArrayCount / (maxVal - minVal); 

                        tmp[i] /= (double)iterations;

                        tmp[i] *= (double)cellArrayCount;

                        tmp[i] /= (maxVal - minVal);

 

                        result[i, 0] = tmp[i];              

                    }

                }

            }

 

            if ((distribution & 0x4) != 0)                                           

            {

                // inverse cumulative

                // divide the cumulative sum into cellArrayCount slices

                stepVal = (double)result[cellArrayCount-1,0] / (double)(cellArrayCount + 1);    

                curVal = stepVal / 2.0;

                j = 0;

                i = 0;

                tmp[j++] = 0.0;

 

                // invert the distribution

                while (i < cellArrayCount)

                {

                    if ((double)result[i, 0] < curVal)

                    {

                        i++;

                    }

                    else

                    {

                        j++;

 

                        if (j < cellArrayCount)

                        {

                            tmp[j] = i;

                        }

 

                        curVal = curVal + stepVal;

                    }

                }

 

                tmp[cellArrayCount - 1] = cellArrayCount;

 

                for (i = 0; i < cellArrayCount; i++)

                {

                    // tmp[i] = tmp[i] / (double)cellArrayCount * (maxVal - minVal) + minVal;

                    tmp[i] /= (double)cellArrayCount;

                    tmp[i] *= (maxVal - minVal);

                    tmp[i] += minVal;

 

                    result[i, 0] = tmp[i];

                }

 

            }

 

            return result;

        }

 

 

CalculateDistribution

 

The following section of the code contains the CalculateDistribution method.  Its purpose is to divide the simulation into equal sized chunks.  The ‘last’ thread is assigned a few more iterations if the number of iterations does not evenly divide by the number of threads.  The method also sets the variables for the individual worker threads prior to spawning the multiple threads.  Subsequently CalculateDistribution waits for all worker threads to complete and then aggregates the results from those threads.

 

 

        public double[] CalculateDistribution(

            int cellArraycount,

            int iterations,

            int threads,

            double mean,

            double stdDev,

            int distribution,

            double minVal,

            double maxVal)

        {

            int i = 0;

            int j = 0;

            int remainingIterations = iterations;

            int iterationsPerThread = 0;

 

            double[] tmp = new double[cellArraycount];

           

            ThreadWorkItem[] twi = new ThreadWorkItem[threads];

            ManualResetEvent[] mre = new ManualResetEvent[threads];

 

            // iterations divided by the number of threads

            iterationsPerThread = iterations / threads;

 

            // set variables for worker threads

            for (i = 0; i < threads; i++)

            {

                twi[i] = new ThreadWorkItem();

                twi[i].thread = i;

                twi[i].count = cellArraycount;

 

                if (i != threads - 1)

                {

                    if (remainingIterations >= iterationsPerThread)

                    {

                        // assign iterations to the all but the last thread

                        twi[i].iter = iterationsPerThread;

                    }

                    else

                    {   // threads will only get 0 iterations if the number of iterations is less than 2*threads

                        twi[i].iter = 0;

                    }

 

                    // subtract the assigned iterations from the remaining iterations

                    remainingIterations = remainingIterations - twi[i].iter;

                }

                else

                {

                    // assign "left over" iterations to the last thread

                    twi[i].iter = remainingIterations;                      

                }