The power of bits for historical event analysis
I guess my title sounds more like a subject for an academic paper, which I may actually write in time, but in any case, using bitmasks to represent events occurring over a fixed period of time has powerful implications for detailed correlation with minimal data storage. Previously I wrote about the use of periods to link detailed history data to summary data periods to help with analysis. Since then, I’ve evolved the design from an indexed view to a static period summary table, which I’ll talk about in a later post.
However, for this post, I will focus on the functionality possible with bitmasks within a fixed period of time. In our scenario, we are evaluating shorting/longing strategies by testing them against history. It is a robot-type implementation: the software doesn’t know what the actual outcome is, so it executes a trading algorithm with variable parameters and then records the final outcome. I don’t record the detailed results because the storage costs and processing time would be unrealistic. For example, even with a 200% ROI cutoff, using short and long strategies, there are over 12 million strategies meeting the criteria against a list of 5000 stocks on AMEX/NYSE across the 30 or so periods for the last year (periods being defined as rolling 1, 3, 6, and 12 months). Adding details outlining the execution of each transaction involved in the simulation would multiply this several times, since the most effective strategies have involved frequent trading.
But it would be very handy to be able to pull up the results of a simulation and actually see what dates were involved in the execution for drill-down reporting, especially if the equity is one of special interest and being profiled (more about that later also). However, the real business value comes from being able to do correlative parallel analysis: correlating at the detail level in order to establish latencies between events. For example, GM recently went broke (yes, Obama really isn’t going to bail out GM, at least not the “old one”), the current common stock is worth nothing, and their creditors may not be able to collect on the debts. So, we see a potential cascading effect between the stock action of a failing company and its creditors. Over the past months, this effect was evident in the market if you look at GM’s suppliers.
What would be interesting would be to see if these kinds of cascading events are really priced into the market, i.e., will the stock value of GM’s suppliers continue to drop even after GM is bankrupt? Is there a delay between an event occurring and its cascading effect? If so, one could adjust one’s stock strategy accordingly. On a more positive note, what if one company’s stock is accelerating and its product sales have an effect on related companies, maybe not even in the same sector?
Storing the time series within a fixed period gives us the capability to do this analysis and determine latencies. For example, suppose we store a bit pattern for a 30-day period showing when profitable trades were executed. A pattern of 00101000… with a period start of April 1, 2009 would represent events that occurred on April 3 and April 5 within the period. Let’s say company X had this pattern, but company Y exhibited the same pattern offset by 2 bits, i.e. 000010100…, representing April 5 and April 7. Using bit-shifting via a .NET CLR function, we could detect a parallel match. This is way over-simplified; in reality, overall market pressures would skew these masks somewhat so that the patterns are not exact. However, using “fuzzy” logic, one can check for the differences between the bitmasks and determine if there is a correlation based on statistical methods. For example, if the bitmask for Y, offset by 2, shows a density within 20–30% of the bitmask for X, then we can have a certain degree of confidence, based on the amount of data sampled per population (the confidence interval), that there is a meaningful pattern.
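Here is a minimal sketch of the shift-and-compare idea, written in Python purely for illustration rather than as the .NET CLR function mentioned above. The function names (mask_from_dates, to_pattern, similarity, best_lag), the 10-day maximum lag, and the use of Jaccard overlap as the “fuzzy” score are all assumptions made for this example, not the actual implementation. It stores day 1 of the period in the lowest bit and prints the pattern with day 1 on the left, to match the patterns shown above.

```python
from datetime import date

PERIOD_DAYS = 30  # assumed fixed period length, as in the 30-day example


def mask_from_dates(start, event_dates, period_days=PERIOD_DAYS):
    """Pack event dates into an int; bit i is set when an event fell on start + i days."""
    mask = 0
    for d in event_dates:
        offset = (d - start).days
        if 0 <= offset < period_days:  # ignore events outside the period
            mask |= 1 << offset
    return mask


def to_pattern(mask, period_days=PERIOD_DAYS):
    """Render the mask as a string with the first day of the period on the left."""
    return "".join("1" if mask >> i & 1 else "0" for i in range(period_days))


def similarity(a, b):
    """Jaccard overlap (an assumed fuzzy score): shared event days / days with any event."""
    union = bin(a | b).count("1")
    return bin(a & b).count("1") / union if union else 0.0


def best_lag(x, y, max_lag=10):
    """Shift y back by 0..max_lag days and return the (lag, score) that best matches x."""
    scores = [(lag, similarity(x, y >> lag)) for lag in range(max_lag + 1)]
    return max(scores, key=lambda s: s[1])


start = date(2009, 4, 1)
x = mask_from_dates(start, [date(2009, 4, 3), date(2009, 4, 5)])
y = mask_from_dates(start, [date(2009, 4, 5), date(2009, 4, 7)])

print(to_pattern(x)[:9])  # 001010000
print(to_pattern(y)[:9])  # 000010100
print(best_lag(x, y))     # (2, 1.0)
```

With real data the best score would rarely be a perfect 1.0, so in practice one would compare it against a threshold and judge significance against the sample size, as described above.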
This is a pretty deep topic that I should address in an academic paper, but hopefully that gives you an idea.
[That’s all the time I have now. Later, we’ll get into the code involved for actually accomplishing this.]
(Part 2 - http://blogs.msdn.com/microsoftbob/archive/2009/06/06/part-2-bits-for-event-patterns.aspx)