<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Luca Bolognese's WebLog : Statistics</title><link>http://blogs.msdn.com/lucabol/archive/tags/Statistics/default.aspx</link><description>Tags: Statistics</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>Bayesian inference in F# – Part IIb – Finding Maia underlying attitude</title><link>http://blogs.msdn.com/lucabol/archive/2009/01/19/bayesian-inference-in-f-part-iib-finding-maia-underlying-attitude.aspx</link><pubDate>Mon, 19 Jan 2009 19:48:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9340323</guid><dc:creator>lucabol</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/lucabol/comments/9340323.aspx</comments><wfw:commentRss>http://blogs.msdn.com/lucabol/commentrss.aspx?PostID=9340323</wfw:commentRss><description>&lt;P&gt;Other parts:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="http://blogs.msdn.com/lucabol/archive/2008/11/07/bayesian-inference-in-f-part-i-background.aspx" mce_href="http://blogs.msdn.com/lucabol/archive/2008/11/07/bayesian-inference-in-f-part-i-background.aspx"&gt;Part I – Background&lt;/A&gt; &lt;/LI&gt;
&lt;LI&gt;&lt;A href="http://blogs.msdn.com/lucabol/archive/2008/11/26/bayesian-inference-in-f-part-iia-a-simple-example-modeling-maia.aspx" mce_href="http://blogs.msdn.com/lucabol/archive/2008/11/26/bayesian-inference-in-f-part-iia-a-simple-example-modeling-maia.aspx"&gt;Part IIa – A simple example – modeling Maia&lt;/A&gt; &lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;The previous post ended on this note.&lt;/P&gt;&lt;PRE class=code&gt;&lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;MaiaJointProb attitude action =
    &lt;SPAN style="COLOR: blue"&gt;match &lt;/SPAN&gt;attitude &lt;SPAN style="COLOR: blue"&gt;with
    &lt;/SPAN&gt;| Happy     &lt;SPAN style="COLOR: blue"&gt;-&amp;gt; &lt;/SPAN&gt;happyActions |&amp;gt; List.assoc action
    | UnHappy   &lt;SPAN style="COLOR: blue"&gt;-&amp;gt; &lt;/SPAN&gt;unHappyActions |&amp;gt; List.assoc action
    | Quiet     &lt;SPAN style="COLOR: blue"&gt;-&amp;gt; &lt;/SPAN&gt;quietActions |&amp;gt; List.assoc action&lt;/PRE&gt;
&lt;P&gt;This is just a three by three matrix. It simply represents which probability is associated with an (attitude, action) tuple. It is useful to think about it in these terms, because it makes it easier to grasp the following function:&lt;/P&gt;&lt;PRE class=code&gt;&lt;SPAN style="COLOR: green"&gt;/// Likelihood of a mental state, given a particular observed action
&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;MaiaLikelihood action = &lt;SPAN style="COLOR: blue"&gt;fun &lt;/SPAN&gt;attitude &lt;SPAN style="COLOR: blue"&gt;-&amp;gt; &lt;/SPAN&gt;MaiaJointProb attitude action&lt;/PRE&gt;
&lt;P&gt;This is simply a row in the matrix. It answers the question: given that I observe a particular action, how probable is that action under each attitude Maia might have? This is called the “likelihood function” in statistics. Its general form is: given that I observe an outcome, how probable is that outcome if it was generated by a process with a particular parameter?&lt;/P&gt;
&lt;P&gt;A related question is then: what if I observe a sequence of independent actions? What is the probability that the baby has a certain attitude then? This is answered by the following:&lt;/P&gt;&lt;PRE class=code&gt;&lt;SPAN style="COLOR: green"&gt;/// Multiple applications of the previous conditional probabilities for a series of actions (multiplied)
&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;MaiaLikelihoods actions =
    &lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;composeLikelihoods previousLikelihood action  = &lt;SPAN style="COLOR: blue"&gt;fun &lt;/SPAN&gt;attitude &lt;SPAN style="COLOR: blue"&gt;-&amp;gt; &lt;/SPAN&gt;previousLikelihood attitude * MaiaLikelihood action attitude        
    actions |&amp;gt; Seq.fold composeLikelihoods (&lt;SPAN style="COLOR: blue"&gt;fun &lt;/SPAN&gt;attitude &lt;SPAN style="COLOR: blue"&gt;-&amp;gt; &lt;/SPAN&gt;1.)&lt;/PRE&gt;
&lt;P&gt;It is a trivial extension of the previous function (really), once you know that to combine likelihoods you multiply them.&lt;/P&gt;
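&lt;P&gt;To see the multiplication at work, here is a minimal, self-contained sketch for the sequence of observations used further down. The names actionProb and sequenceLikelihood are made up for this illustration, and the probabilities are the ones from the previous post written out as a pattern match; treat it as a sanity check rather than as part of the model.&lt;/P&gt;&lt;PRE class=code&gt;type Attitude = Happy | UnHappy | Quiet
type Action = Smile | Cry | LookSilly

/// Probability of an action under an attitude (numbers from the previous post)
let actionProb attitude action =
    match attitude, action with
    | Happy, Smile -&amp;gt; 0.6
    | Happy, _ -&amp;gt; 0.2
    | UnHappy, Cry -&amp;gt; 0.6
    | UnHappy, _ -&amp;gt; 0.2
    | Quiet, Smile -&amp;gt; 0.4
    | Quiet, _ -&amp;gt; 0.3

/// Likelihood of a whole sequence: the product of the single-action likelihoods
let sequenceLikelihood actions attitude =
    actions |&amp;gt; List.fold (fun acc action -&amp;gt; acc * actionProb attitude action) 1.

let observed = [ Smile; Smile; Cry; Smile; LookSilly ]
let happy = sequenceLikelihood observed Happy       // 0.6 * 0.6 * 0.2 * 0.6 * 0.2
let unHappy = sequenceLikelihood observed UnHappy   // 0.2 * 0.2 * 0.6 * 0.2 * 0.2
let quiet = sequenceLikelihood observed Quiet       // 0.4 * 0.4 * 0.3 * 0.4 * 0.3&lt;/PRE&gt;
&lt;P&gt;The three products come out at roughly 0.00864, 0.00096 and 0.00576. They don’t sum to one, which is fine: a likelihood is not a probability distribution over attitudes.&lt;/P&gt;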
&lt;P&gt;We now need to describe what our prior is. A prior is our preconceived notion about a particular parameter (in this case the baby’s attitude). You might be tempted to express that notion with a single value, but that would be inaccurate. You need to indicate how confident you are about it. In statistics you do that by choosing a distribution for your belief. This is one of the beauties of Bayesian statistics: everything is a probability distribution. In this case, we really don’t have any previous belief, so we pick the uniform distribution.&lt;/P&gt;&lt;PRE class=code&gt;&lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;MaiaUniformPrior attitude = 1. / 3.&lt;/PRE&gt;
&lt;P&gt;Think of this as: you haven’t read any baby-attitude-specific study or received any external information about the likely attitude of Maia, so you cannot prefer one attitude over another.&lt;/P&gt;
&lt;P&gt;We are almost done. Now we have to apply Bayes’ theorem and get the un-normalized posterior distribution. Forget about the word “un-normalized” for now. What is a posterior distribution? This is your output, your return value. It says: given my prior belief on the value of a parameter and given the outcomes that I observed, this is what I now believe the parameter to be. In this case it goes like: I had no opinion on Maia’s attitude to start with, but after I observed her behavior for a while, I now think she is Happy with probability X, UnHappy with probability Y and Quiet with probability Z.&lt;/P&gt;&lt;PRE class=code&gt;&lt;SPAN style="COLOR: green"&gt;/// Calculates the unNormalized posterior given prior and likelihood
&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;unNormalizedPosterior (prior:'a &lt;SPAN style="COLOR: blue"&gt;-&amp;gt; &lt;/SPAN&gt;float) likelihood =
    &lt;SPAN style="COLOR: blue"&gt;fun &lt;/SPAN&gt;theta &lt;SPAN style="COLOR: blue"&gt;-&amp;gt; &lt;/SPAN&gt;prior theta * likelihood theta&lt;/PRE&gt;
&lt;P&gt;We then need to normalize this thing (it doesn’t sum to one). The way to do it is to divide each probability by the sum of the probabilities for all the possible outcomes.&lt;/P&gt;&lt;PRE class=code&gt;&lt;SPAN style="COLOR: green"&gt;/// All possible values for the unobservable parameter (mental state)
&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;support = [Happy; UnHappy; Quiet]

&lt;SPAN style="COLOR: green"&gt;/// Normalize the posterior (it integrates to 1.)
&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;posterior prior likelihood =
    &lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;post = unNormalizedPosterior prior likelihood
    &lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;sum = support |&amp;gt; List.sum_by (&lt;SPAN style="COLOR: blue"&gt;fun &lt;/SPAN&gt;attitude &lt;SPAN style="COLOR: blue"&gt;-&amp;gt; &lt;/SPAN&gt;post attitude)
    &lt;SPAN style="COLOR: blue"&gt;fun &lt;/SPAN&gt;attitude &lt;SPAN style="COLOR: blue"&gt;-&amp;gt; &lt;/SPAN&gt;post attitude / sum&lt;/PRE&gt;
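&lt;P&gt;Here is a small sketch of the normalization step on its own, with the numbers plugged in by hand. It assumes a uniform prior of 1/3 and uses the products of the action probabilities from the previous post for the sequence [Smile;Smile;Cry;Smile;LookSilly]. The normalize helper and the string labels are hypothetical, chosen only to keep the snippet self-contained, and List.sumBy is, as far as I know, the current spelling of the List.sum_by used above.&lt;/P&gt;&lt;PRE class=code&gt;/// Divide each un-normalized weight by the total so that the results sum to one
let normalize (weights: (string * float) list) =
    let total = weights |&amp;gt; List.sumBy snd
    weights |&amp;gt; List.map (fun (label, w) -&amp;gt; label, w / total)

// Prior (1/3 each) times the likelihood of the whole sequence under each attitude
let unNormalized =
    [ "Happy", (1. / 3.) * 0.00864      // 0.6 * 0.6 * 0.2 * 0.6 * 0.2
      "UnHappy", (1. / 3.) * 0.00096    // 0.2 * 0.2 * 0.6 * 0.2 * 0.2
      "Quiet", (1. / 3.) * 0.00576 ]    // 0.4 * 0.4 * 0.3 * 0.4 * 0.3

let normalized = normalize unNormalized&lt;/PRE&gt;
&lt;P&gt;Notice that the uniform prior cancels out in the division: with no prior preference, the posterior is just the likelihood rescaled to sum to one.&lt;/P&gt;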
&lt;P&gt;We are done. Now we can start modeling scenarios. Let’s say that you observe [Smile;Smile;Cry;Smile;LookSilly]. What could the underlying attitude of Maia be?&lt;/P&gt;&lt;PRE class=code&gt;&lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;maiaIsANormalBaby = posterior MaiaUniformPrior (MaiaLikelihoods [Smile;Smile;Cry;Smile;LookSilly])&lt;/PRE&gt;
&lt;P&gt;We can then execute our little model:&lt;/P&gt;&lt;PRE class=code&gt;maiaIsANormalBaby Happy
maiaIsANormalBaby UnHappy
maiaIsANormalBaby Quiet&lt;/PRE&gt;
&lt;P&gt;And we get (0.5625, 0.0625, 0.375). So Maia is likely to be happy and unlikely to be unhappy. Let’s now model one extreme case:&lt;/P&gt;&lt;PRE class=code&gt;&lt;SPAN style="COLOR: green"&gt;/// Extreme cases
&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;maiaIsLikelyHappyDist = posterior MaiaUniformPrior (MaiaLikelihoods [Smile;Smile;Smile;Smile;Smile;Smile;Smile])

maiaIsLikelyHappyDist Happy
maiaIsLikelyHappyDist UnHappy
maiaIsLikelyHappyDist Quiet&lt;/PRE&gt;
&lt;P&gt;And we get (0.944, 0.000431, 0.05). Now Maia is almost certainly Happy. Notice that I can confidently make this statement because my end result, a probability for each attitude, is exactly what I was looking for when I started my quest. Using classical statistics, that wouldn’t be the case.&lt;/P&gt;
&lt;P&gt;A related question I might want to ask is: given the posterior distribution for attitude that I just found, what is the probability of observing a particular action? In other words, given the model that I built, what does it predict?&lt;/P&gt;&lt;PRE class=code&gt;&lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;posteriorPredictive jointProb posterior =
    &lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;composeProbs previousProbs attitude = &lt;SPAN style="COLOR: blue"&gt;fun &lt;/SPAN&gt;action &lt;SPAN style="COLOR: blue"&gt;-&amp;gt; &lt;/SPAN&gt;previousProbs action + jointProb attitude action * posterior attitude  
    support |&amp;gt; Seq.fold composeProbs (&lt;SPAN style="COLOR: blue"&gt;fun &lt;/SPAN&gt;action &lt;SPAN style="COLOR: blue"&gt;-&amp;gt; &lt;/SPAN&gt;0.)
    
&lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;nextLikelyUnknownActionDist = posteriorPredictive MaiaJointProb maiaIsLikelyHappyDist&lt;/PRE&gt;
&lt;P&gt;I don’t have the strength right now to explain the mathematical underpinning of this. In words, this says: considering that Maia can have one of the three possible attitudes with the probability calculated above, what is the probability that I observe a particular action? Notice that the signature for it is: (Action –&amp;gt; float), which is the compiler’s way of saying it.&lt;/P&gt;
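&lt;P&gt;For those who prefer to check the arithmetic, the following is a minimal sketch of the same idea written as a plain weighted sum: for each action, add up (probability of the action under an attitude) times (posterior probability of that attitude). Everything here is self-contained and the names actionProb, posteriorAfterSevenSmiles and predictive are hypothetical; the probabilities are the ones from the previous post and the posterior is the one for seven Smiles under a uniform prior.&lt;/P&gt;&lt;PRE class=code&gt;type Attitude = Happy | UnHappy | Quiet
type Action = Smile | Cry | LookSilly

/// Probability of an action under an attitude (numbers from the previous post)
let actionProb attitude action =
    match attitude, action with
    | Happy, Smile -&amp;gt; 0.6
    | Happy, _ -&amp;gt; 0.2
    | UnHappy, Cry -&amp;gt; 0.6
    | UnHappy, _ -&amp;gt; 0.2
    | Quiet, Smile -&amp;gt; 0.4
    | Quiet, _ -&amp;gt; 0.3

let support = [ Happy; UnHappy; Quiet ]

/// Posterior after seven Smiles; the uniform prior cancels out when normalizing
let posteriorAfterSevenSmiles =
    let weight attitude = actionProb attitude Smile ** 7.
    let total = support |&amp;gt; List.sumBy weight
    fun attitude -&amp;gt; weight attitude / total

/// Weighted average of the per-attitude action probabilities
let predictive action =
    support
    |&amp;gt; List.sumBy (fun attitude -&amp;gt; actionProb attitude action * posteriorAfterSevenSmiles attitude)

let pSmile = predictive Smile
let pCry = predictive Cry
let pLookSilly = predictive LookSilly&lt;/PRE&gt;
&lt;P&gt;The three values sum to one, as a distribution over actions should, and pSmile comes out a little below 0.6 because the posterior still leaves some weight on Quiet.&lt;/P&gt;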
&lt;P&gt;Now we can run the thing.&lt;/P&gt;&lt;PRE class=code&gt;nextLikelyUnknownActionDist Smile
nextLikelyUnknownActionDist Cry
nextLikelyUnknownActionDist LookSilly&lt;/PRE&gt;
&lt;P&gt;And we get (0.588, 0.2056, 0.2055). Why is that? We’ll talk about it in the next post.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9340323" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/lucabol/archive/tags/F_2300_/default.aspx">F#</category><category domain="http://blogs.msdn.com/lucabol/archive/tags/Statistics/default.aspx">Statistics</category></item><item><title>Bayesian inference in F# - Part IIa - A simple example - modeling Maia</title><link>http://blogs.msdn.com/lucabol/archive/2008/11/26/bayesian-inference-in-f-part-iia-a-simple-example-modeling-maia.aspx</link><pubDate>Wed, 26 Nov 2008 23:41:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9130414</guid><dc:creator>lucabol</dc:creator><slash:comments>2</slash:comments><comments>http://blogs.msdn.com/lucabol/comments/9130414.aspx</comments><wfw:commentRss>http://blogs.msdn.com/lucabol/commentrss.aspx?PostID=9130414</wfw:commentRss><description>&lt;P&gt;Other parts:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="http://blogs.msdn.com/lucabol/archive/2008/11/07/bayesian-inference-in-f-part-i-background.aspx" mce_href="http://blogs.msdn.com/lucabol/archive/2008/11/07/bayesian-inference-in-f-part-i-background.aspx"&gt;Part I - Background&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="http://blogs.msdn.com/lucabol/archive/2009/01/19/bayesian-inference-in-f-part-iib-finding-maia-underlying-attitude.aspx" mce_href="http://blogs.msdn.com/lucabol/archive/2009/01/19/bayesian-inference-in-f-part-iib-finding-maia-underlying-attitude.aspx"&gt;Part IIb - Finding Maia underlying attitude&lt;/A&gt;&amp;nbsp;&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;Let's start with a simple example: inferring the underlying attitude of a small baby by observing her actions. Let's call this particular small baby Maia. People always ask her father if she is a 'good' baby or not. Her father started to wonder how he could possibly know that. Being 'good' is not very clear, so he chooses to answer the related question of whether her attitude is generally happy, unhappy or simply quiet (a kind of middle ground).&lt;/P&gt;&lt;PRE class=code&gt;&lt;SPAN style="COLOR: green"&gt;/// Underlying unobservable, but assumed stationary, state of the process (baby). Theta.
&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;type &lt;/SPAN&gt;Attitude =
    | Happy
    | UnHappy
    | Quiet&lt;/PRE&gt;
&lt;P&gt;Her poor father doesn't have much to go on. He can only observe what she does. He decides, for the sake of simplifying things, to categorize her state at each particular moment as smiling, crying or looking silly (a kind of middle ground).&lt;/P&gt;&lt;PRE class=code&gt;&lt;SPAN style="COLOR: green"&gt;/// Observable data. y.
&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;type &lt;/SPAN&gt;Action =
    | Smile
    | Cry
    | LookSilly&lt;/PRE&gt;
&lt;P&gt;The father now has to decide what it means for Maia to have a happy attitude. Lacking a universal definition of happiness in terms of these actions, he makes one up. Maia would be considered happy if she smiles 60% of the time, cries 20% of the time and looks silly the remaining 20% of the time. He might as well have experimented with "clearly happy/unhappy" babies to come up with those numbers.&lt;/P&gt;&lt;PRE class=code&gt;&lt;SPAN style="COLOR: green"&gt;/// Data to model the underlying process (baby)
&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;happyActions = [ Smile, 0.6; Cry, 0.2; LookSilly, 0.2]
&lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;unHappyActions = [Smile, 0.2; Cry, 0.6; LookSilly, 0.2]
&lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;quietActions = [Smile, 0.4; Cry, 0.3; LookSilly, 0.3]&lt;/PRE&gt;
&lt;P&gt;What does it mean exactly? Well, this father would call his wife at random times during the day and ask her if Maia is smiling, crying or looking silly. He would then keep track of the numbers and then "somehow" decide what her attitude is. The general idea is simple; the "somehow" part is not.&lt;/P&gt;&lt;PRE class=code&gt;&lt;SPAN style="COLOR: green"&gt;/// Generates a new uniformly distributed number between 0 and 1
&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;random =
    &lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;rnd = &lt;SPAN style="COLOR: blue"&gt;new &lt;/SPAN&gt;System.Random()
    rnd.NextDouble&lt;/PRE&gt;
&lt;P&gt;We can now model Maia. We want our model to return a particular action depending on which attitude we assume Maia mostly has. For example, if we assume she is a happy baby, we want our model to return Smile about 60% of the time. In essence, we want to model what happens when the (poor) father calls his (even poorer) wife. What would his wife tell him (assuming a particular attitude)? The general idea is expressed by the following:&lt;/P&gt;&lt;PRE class=code&gt;&lt;SPAN style="COLOR: green"&gt;/// Process (baby) modeling. How she acts if she is fundamentally happy, unhappy or quiet
&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;MaiaSampleDistribution attitude =
    &lt;SPAN style="COLOR: blue"&gt;match &lt;/SPAN&gt;attitude &lt;SPAN style="COLOR: blue"&gt;with
    &lt;/SPAN&gt;| Happy     &lt;SPAN style="COLOR: blue"&gt;-&amp;gt; &lt;/SPAN&gt;pickOne happyActions
    | UnHappy   &lt;SPAN style="COLOR: blue"&gt;-&amp;gt; &lt;/SPAN&gt;pickOne unHappyActions
    | Quiet     &lt;SPAN style="COLOR: blue"&gt;-&amp;gt; &lt;/SPAN&gt;pickOne quietActions&lt;/PRE&gt;
&lt;P&gt;The 'pickOne' function simply picks an action according to its probability of being picked. The name 'sample distribution' is statistics lingo for 'what you observe', and indeed you can only observe Maia's actions, not her underlying attitude.&lt;/P&gt;
&lt;P&gt;The implementation of pickOne gets technical. You don't need to understand it to understand the rest of this post. This is the beauty of encapsulation. You can start reading from after the next code snippet if you want to.&lt;/P&gt;
&lt;P&gt;'pickOne' works by constructing the inverse cumulative distribution function for the probability distribution described by the happyActions/unHappyActions/quietActions lists. There is an &lt;A href="http://en.wikipedia.org/wiki/Inverse_transform_sampling" mce_href="http://en.wikipedia.org/wiki/Inverse_transform_sampling"&gt;entry on Wikipedia&lt;/A&gt; that describes how this works, and I don't wish to say more here beyond presenting the code.&lt;/P&gt;&lt;PRE class=code&gt;&lt;SPAN style="COLOR: green"&gt;/// Find the first value whose key is greater than or equal to a given key in a seq&amp;lt;'a * 'b&amp;gt;.
/// The seq is assumed to be sorted
&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;findByKey key aSeq =
    aSeq |&amp;gt; Seq.find (&lt;SPAN style="COLOR: blue"&gt;fun &lt;/SPAN&gt;(k, _) &lt;SPAN style="COLOR: blue"&gt;-&amp;gt; &lt;/SPAN&gt;k &amp;gt;= key) |&amp;gt; snd

&lt;SPAN style="COLOR: green"&gt;/// Simulate an inverse CDF given values and probabilities
&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;buildInvCdf valueProbs =
    &lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;cdfValues =
        valueProbs
        |&amp;gt; Seq.scan (&lt;SPAN style="COLOR: blue"&gt;fun &lt;/SPAN&gt;cd (_, p) &lt;SPAN style="COLOR: blue"&gt;-&amp;gt; &lt;/SPAN&gt;cd + p) 0. 
        |&amp;gt; Seq.skip 1 
    &lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;cdf =
        valueProbs
        |&amp;gt; Seq.map fst 
        |&amp;gt; Seq.zip cdfValues 
        |&amp;gt; Seq.cache     
    &lt;SPAN style="COLOR: blue"&gt;fun &lt;/SPAN&gt;x &lt;SPAN style="COLOR: blue"&gt;-&amp;gt; &lt;/SPAN&gt;cdf |&amp;gt; findByKey x

&lt;SPAN style="COLOR: green"&gt;/// Picks an 'a in a seq&amp;lt;'a * float&amp;gt; using float as the probability to pick a particular 'a
&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;pickOne probs =
    &lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;rnd = random ()
    &lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;picker = buildInvCdf probs
    picker rnd&lt;/PRE&gt;
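&lt;P&gt;If you did read the snippet, a worked lookup may help. The sketch below is a simplified, list-based variant of the same idea (not the code above): it builds the running totals for the happy baby and then looks up three hand-picked numbers instead of random ones. The names cumulative and invCdf are hypothetical.&lt;/P&gt;&lt;PRE class=code&gt;type Action = Smile | Cry | LookSilly

let happyActions = [ Smile, 0.6; Cry, 0.2; LookSilly, 0.2 ]

/// Running totals of the probabilities, each paired with the action it ends at
let cumulative valueProbs =
    let totals = valueProbs |&amp;gt; List.scan (fun acc (_, p) -&amp;gt; acc + p) 0. |&amp;gt; List.tail
    List.zip totals (valueProbs |&amp;gt; List.map fst)

/// First action whose running total reaches x
let invCdf valueProbs x =
    cumulative valueProbs |&amp;gt; List.find (fun (total, _) -&amp;gt; total &amp;gt;= x) |&amp;gt; snd

// Numbers up to 0.6 land on Smile, the next 0.2 on Cry, the rest on LookSilly
let picks = [ 0.5; 0.7; 0.95 ] |&amp;gt; List.map (invCdf happyActions)
// picks is [Smile; Cry; LookSilly]&lt;/PRE&gt;
&lt;P&gt;Feed it uniformly distributed numbers instead of hand-picked ones and each action comes out with a frequency close to its probability, which is all 'pickOne' does.&lt;/P&gt;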
&lt;P&gt;Another way to describe Maia is more mathematically convenient and will be used in the rest of the post. This second model answers the question: what is the probability of observing an action assuming a particular attitude? The distribution of both actions and attitudes (observable variable and parameter) is called the joint probability.&lt;/P&gt;&lt;PRE class=code&gt;&lt;SPAN style="COLOR: green"&gt;/// Another, mathematically more convenient, way to model the process (baby)
&lt;/SPAN&gt;&lt;SPAN style="COLOR: blue"&gt;let &lt;/SPAN&gt;MaiaJointProb attitude action =
    &lt;SPAN style="COLOR: blue"&gt;match &lt;/SPAN&gt;attitude &lt;SPAN style="COLOR: blue"&gt;with
    &lt;/SPAN&gt;| Happy     &lt;SPAN style="COLOR: blue"&gt;-&amp;gt; &lt;/SPAN&gt;happyActions |&amp;gt; List.assoc action
    | UnHappy   &lt;SPAN style="COLOR: blue"&gt;-&amp;gt; &lt;/SPAN&gt;unHappyActions |&amp;gt; List.assoc action
    | Quiet     &lt;SPAN style="COLOR: blue"&gt;-&amp;gt; &lt;/SPAN&gt;quietActions |&amp;gt; List.assoc action&lt;/PRE&gt;
&lt;P&gt;"List.assoc" returns the value associated with a key in a list containing (key, value) pairs. Notice that in general, if you are observing a process, you don't know what its joint distribution is. But you can approximate it by running the MaiaSampleDistribution function on known babies many times and keeping track of the result. So, in theory, if you have a way to experiment with many babies with known attitudes, you can create such a joint distribution.&lt;/P&gt;
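&lt;P&gt;As a rough illustration of that last point, here is a self-contained sketch that draws many actions from a baby known to be happy and counts how often each one shows up. The sample and estimate functions are hypothetical stand-ins for MaiaSampleDistribution, written so that the snippet runs on its own; the fixed seed and the number of draws are arbitrary choices.&lt;/P&gt;&lt;PRE class=code&gt;type Action = Smile | Cry | LookSilly

let happyActions = [ Smile, 0.6; Cry, 0.2; LookSilly, 0.2 ]

/// Draw one action: walk the list until the running total passes a random number
let sample (rnd: System.Random) valueProbs =
    let x = rnd.NextDouble()
    let rec walk acc items =
        match items with
        | [ (value, _) ] -&amp;gt; value                // the last entry absorbs any rounding error
        | (value, p) :: rest -&amp;gt; if x &amp;lt; acc + p then value else walk (acc + p) rest
        | [] -&amp;gt; failwith "empty distribution"
    walk 0. valueProbs

/// Relative frequency of each action over n draws
let estimate n valueProbs =
    let rnd = System.Random(42)                     // fixed seed, only for repeatability
    List.init n (fun _ -&amp;gt; sample rnd valueProbs)
    |&amp;gt; List.countBy id
    |&amp;gt; List.map (fun (action, count) -&amp;gt; action, float count / float n)

let frequencies = estimate 100000 happyActions&lt;/PRE&gt;
&lt;P&gt;With this many draws the frequencies should land close to 0.6, 0.2 and 0.2, which is one row of the joint distribution recovered from observations alone.&lt;/P&gt;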
&lt;P&gt;We now have modeled our problem, this is the creative part. From now on, it is just execution. We'll get to that.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9130414" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/lucabol/archive/tags/F_2300_/default.aspx">F#</category><category domain="http://blogs.msdn.com/lucabol/archive/tags/Statistics/default.aspx">Statistics</category></item><item><title>Bayesian inference in F# - Part I - Background</title><link>http://blogs.msdn.com/lucabol/archive/2008/11/07/bayesian-inference-in-f-part-i-background.aspx</link><pubDate>Fri, 07 Nov 2008 19:48:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9052523</guid><dc:creator>lucabol</dc:creator><slash:comments>8</slash:comments><comments>http://blogs.msdn.com/lucabol/comments/9052523.aspx</comments><wfw:commentRss>http://blogs.msdn.com/lucabol/commentrss.aspx?PostID=9052523</wfw:commentRss><description>&lt;P mce_keep="true"&gt;Other posts:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;DIV mce_keep="true"&gt;&lt;A href="http://blogs.msdn.com/lucabol/archive/2008/11/26/bayesian-inference-in-f-part-iia-a-simple-example-modeling-maia.aspx" mce_href="http://blogs.msdn.com/lucabol/archive/2008/11/26/bayesian-inference-in-f-part-iia-a-simple-example-modeling-maia.aspx"&gt;Part IIa - A simple example - modeling Maia&lt;/A&gt;&lt;/DIV&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;DIV mce_keep="true"&gt;&lt;A href="http://blogs.msdn.com/lucabol/archive/2009/01/19/bayesian-inference-in-f-part-iib-finding-maia-underlying-attitude.aspx" mce_href="http://blogs.msdn.com/lucabol/archive/2009/01/19/bayesian-inference-in-f-part-iib-finding-maia-underlying-attitude.aspx"&gt;Part IIb - Finding Maia underlying attitude&lt;/A&gt;&lt;/DIV&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;My interest in Bayesian inference comes from my dissatisfaction with 'classical' statistics. Whenever I want to know something, for example the probability that an unknown parameter is between two values, 'classical' statistics seems to answer a different and more convoluted question.&lt;/P&gt;
&lt;P&gt;Try asking someone what "the 95% confidence interval for X is (x1, x2)" means. Very likely they will tell you that it means that there is a 95% probability that X lies between x1 and x2. That is not the case in classical statistics. It is the case in Bayesian statistics. Also, all the funny business of defining a null hypothesis for the sake of proving its falseness always made my head spin. You don't need any of that in Bayesian statistics. More recently, my discovery that statistical significance is a harmful concept, instead of the bedrock of knowledge I always thought it to be, shook my confidence in 'classical' statistics even more.&lt;/P&gt;
&lt;P&gt;Admittedly, I'm not that smart. If I have a hard time getting an intuitive understanding of something, it tends to slip from my mind a couple of days after I've learned it. This happens all the time with 'classical' statistics. I feel like I have learned the thing ten times, because I continuously forget it. This doesn't happen with Bayesian statistics. It just makes intuitive sense.&lt;/P&gt;
&lt;P&gt;At this point you might be wondering what 'classical' statistics is. I use the term classical, but I really shouldn't. Classical statistics is normally just called 'statistics' and it is all you learn if you pick up almost any book on the topic (for example the otherwise excellent "&lt;A href="http://www.amazon.com/Introduction-Practice-Statistics-w-CD-ROM/dp/0716764008/ref=sr_1_1?ie=UTF8&amp;amp;s=books&amp;amp;qid=1225747881&amp;amp;sr=1-1" mce_href="http://www.amazon.com/Introduction-Practice-Statistics-w-CD-ROM/dp/0716764008/ref=sr_1_1?ie=UTF8&amp;amp;s=books&amp;amp;qid=1225747881&amp;amp;sr=1-1"&gt;Introduction to the Practice of Statistics&lt;/A&gt;"). Bayesian statistics is just a footnote in such books. This is a shame.&lt;/P&gt;
&lt;P&gt;Bayesian statistics provides a much clearer and more elegant framework for understanding the process of inferring knowledge from data. The underlying question that it answers is: "If I hold an opinion about something and I receive additional data on it, how should I rationally change my opinion?". This question of how to update your knowledge is at the very foundation of human learning and progress in general (for example the scientific method is based on it). We had better be sure that the way we answer it is sound.&lt;/P&gt;
&lt;P&gt;You might wonder how it is possible to go against something that is so widely accepted and taught everywhere as 'classical' statistics is. Well, very many things that most people believe are wrong. I always like to cite &lt;A href="http://en.wikipedia.org/wiki/Benjamin_Graham" mce_href="http://en.wikipedia.org/wiki/Benjamin_Graham"&gt;old Ben&lt;/A&gt; on this: "The fact that other people agree or disagree with you makes you neither right nor wrong. You will be right if your facts and your reasoning are correct." This little rule has always served me well.&lt;/P&gt;
&lt;P&gt;In this series of posts I will give examples of Bayesian statistics in F#. I am not a statistician, which makes me part of the very dangerous category of 'people who are not statisticians but talk about statistics'. To try to mitigate the problem I enlisted the help of &lt;A href="http://research.microsoft.com/~rherb/" mce_href="http://research.microsoft.com/~rherb/"&gt;Ralf Herbrich&lt;/A&gt;, who is a statistician and can catch my most blatant errors. Obviously I'll manage to hide some of my errors so cleverly that not even Ralf will spot them, in which case the fault is mine alone.&lt;/P&gt;
&lt;P&gt;In the next post we'll look at some F# code to model the Bayesian inference process.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9052523" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/lucabol/archive/tags/F_2300_/default.aspx">F#</category><category domain="http://blogs.msdn.com/lucabol/archive/tags/Statistics/default.aspx">Statistics</category></item></channel></rss>