Luke Hoban's Blog

  • LukeH's WebLog

    Taking LINQ to Objects to Extremes: A fully LINQified RayTracer

    • 51 Comments

    Not too long ago I blogged about a C# raytracer which took advantage of a lot of C#3.0 language constructs.  However, you may have noticed that it did not actually use LINQ query expressions all that much.  Well, after discussing this with a coworker on the PLINQ team at lunch one day - I was convinced that it should be possible to express the logic of the raytracer in a LINQ to Objects query.  The result is below - a single 60-line LINQ query which captures the complete raytracing algorithm.

[DISCLAIMER:  I am by no means suggesting it's a good idea to write code like this.  Do so only at your own risk!  Just as you *could* write your entire application in one method body, or decide to replace all if statements with virtual method dispatches, doing what I show below is a clear abuse of this language construct.  But it's an interesting example to show that there is a lot of expressiveness available in C#3.0 query expressions.]

    One Really Big Query Expression!

    var pixelsQuery =
        from y in Enumerable.Range(0, screenHeight)
        let recenterY = -(y - (screenHeight / 2.0)) / (2.0 * screenHeight)
        select from x in Enumerable.Range(0, screenWidth)
               let recenterX = (x - (screenWidth / 2.0)) / (2.0 * screenWidth)
               let point =
                   Vector.Norm(Vector.Plus(scene.Camera.Forward,
                                           Vector.Plus(Vector.Times(recenterX, scene.Camera.Right),
                                                       Vector.Times(recenterY, scene.Camera.Up))))
               let ray = new Ray() { Start = scene.Camera.Pos, Dir = point }
               let computeTraceRay = (Func<Func<TraceRayArgs, Color>, Func<TraceRayArgs, Color>>)
                (f => traceRayArgs =>
                 (from isect in
                      from thing in traceRayArgs.Scene.Things
                      select thing.Intersect(traceRayArgs.Ray)
                  where isect != null
                  orderby isect.Dist
                  let d = isect.Ray.Dir
                  let pos = Vector.Plus(Vector.Times(isect.Dist, isect.Ray.Dir), isect.Ray.Start)
                  let normal = isect.Thing.Normal(pos)
                  let reflectDir = Vector.Minus(d, Vector.Times(2 * Vector.Dot(normal, d), normal))
                  let naturalColors =
                      from light in traceRayArgs.Scene.Lights
                      let ldis = Vector.Minus(light.Pos, pos)
                      let livec = Vector.Norm(ldis)
                      let testRay = new Ray() { Start = pos, Dir = livec }
                      let testIsects = from inter in
                                           from thing in traceRayArgs.Scene.Things
                                           select thing.Intersect(testRay)
                                       where inter != null
                                       orderby inter.Dist
                                       select inter
                      let testIsect = testIsects.FirstOrDefault()
                      let neatIsect = testIsect == null ? 0 : testIsect.Dist
                      let isInShadow = !((neatIsect > Vector.Mag(ldis)) || (neatIsect == 0))
                      where !isInShadow
                      let illum = Vector.Dot(livec, normal)
                      let lcolor = illum > 0 ? Color.Times(illum, light.Color) : Color.Make(0, 0, 0)
                      let specular = Vector.Dot(livec, Vector.Norm(reflectDir))
                      let scolor = specular > 0
                                     ? Color.Times(Math.Pow(specular, isect.Thing.Surface.Roughness),
                                                   light.Color)
                                     : Color.Make(0, 0, 0)
                      select Color.Plus(Color.Times(isect.Thing.Surface.Diffuse(pos), lcolor),
                                        Color.Times(isect.Thing.Surface.Specular(pos), scolor))
                  let reflectPos = Vector.Plus(pos, Vector.Times(.001, reflectDir))
                  let reflectColor = traceRayArgs.Depth >= MaxDepth
                                      ? Color.Make(.5, .5, .5)
                                      : Color.Times(isect.Thing.Surface.Reflect(reflectPos),
                                                    f(new TraceRayArgs(new Ray()
                                                    {
                                                        Start = reflectPos,
                                                        Dir = reflectDir
                                                    },
                                                                       traceRayArgs.Scene,
                                                                       traceRayArgs.Depth + 1)))
                  select naturalColors.Aggregate(reflectColor,
                                                 (color, natColor) => Color.Plus(color, natColor))
                 ).DefaultIfEmpty(Color.Background).First())
               let traceRay = Y(computeTraceRay)
               select new { X = x, Y = y, Color = traceRay(new TraceRayArgs(ray, scene, 0)) };
    
    foreach (var row in pixelsQuery)
        foreach (var pixel in row)
            setPixel(pixel.X, pixel.Y, pixel.Color.ToDrawingColor());
    
    Let Expressions

    "let" query clauses allow you to introduce a variable into scope which can be used by the following query clauses.  Similar to local variables in a method body, this gives you a way to avoid evaluating a common expression multiple times by storing it in a variable.  This can be very useful even in much simpler queries.  Of course, in the query above - let is absolutely critical.

    Recursion

A raytracer is inherently recursive - whenever a ray hits a reflective or transparent surface, new rays have to be cast.  Thus the query that expresses the raytracer needs to be recursive.  But the let clause doesn't support declaring recursive functions.  Not to let that stop us (as it really should!) - we can use the Y combinator to introduce recursion by hand.  Mads wrote a very detailed description of recursive lambda expressions in C# recently - and the code example above uses my personal implementation of some of the same ideas he describes. 
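    For reference, a fixed-point helper compatible with the call Y(computeTraceRay) in the query above could look something like the sketch below.  This is one common C# formulation (it ties the knot through a mutable local rather than being a "pure" Y combinator) and is not necessarily the exact implementation in the attached source.

    // A sketch of a Y-style helper; the signature matches how it is used above:
    // it takes a function-building function and returns the recursive function.
    static Func<A, R> Y<A, R>(Func<Func<A, R>, Func<A, R>> f)
    {
        Func<A, R> self = null;
        self = f(args => self(args));   // each recursive call goes back through f
        return self;
    }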

    Expression-based control structures

When you are trying to write things in a single query expression - you don't have the luxury of using any statements.  Since most of the C# control structures are statement-based (if, switch, for, while, etc.) - this can make things a little trickier.  One really valuable expression-based control structure is the ternary operator ( x ? y : z ).  This is used in a couple of places above to provide the necessary branching logic.  The rest of the control structuring is accomplished with LINQ Standard Query Operators - Select, SelectMany, Where, OrderBy, Aggregate and DefaultIfEmpty.  As functional programmers have shown for years, the expressiveness of these very general list processing operations is impressive.
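    As a small illustration (hypothetical, not taken from the raytracer itself), the ternary operator and DefaultIfEmpty/First can stand in for the if/else statements you would otherwise reach for:

    using System.Linq;

    // Ternary operator instead of an if/else statement inside a query:
    var labels =
        from n in Enumerable.Range(0, 10)
        select (n % 2 == 0) ? "even" : "odd";

    // DefaultIfEmpty + First instead of "if the sequence is empty, use a fallback" -
    // the same pattern the query above uses to fall back to the background color:
    var firstBigOrZero = Enumerable.Range(0, 10)
                                   .Where(n => n > 100)
                                   .DefaultIfEmpty(0)
                                   .First();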

    Conclusion

In case you had any doubts - hopefully this convinces you that C#3.0 Query Expressions are really a very expressive language construct.  Of course - this expressiveness is probably best used with more moderation than shown here!

LINQRayTracer.cs


  • LukeH's WebLog

    A Ray Tracer in C#3.0

    • 33 Comments

Ray tracers are a lot of fun.  When I was in middle school, I discovered POV-Ray and was so excited about the cool graphics it could create that I would often leave my 286 on overnight rendering ray-traced scenes and movies.  When I was in high school, I discovered Computer Graphics: Principles and Practice and spent weeks working through its ray tracing algorithms trying to implement my own ray tracer in C++.  When I was in college, taking an introductory course which used Scheme, I re-wrote the raytracer in Scheme to show some friends that you really could write programs longer than 20 lines in Scheme!  And then when I joined the C# team 3 years ago, I of course had to re-write my raytracer in C#. 

    More recently, I took a pass through the code to update it to use C#3.0.  It's not a particularly fancy or efficient ray tracer - but then again, it's only 400 lines of code.  Below are a few of the interesting uses of C#3.0 from the ray tracer code.

    C#3.0 Without the Databases and XML

Although we often demo C#3.0 using databases and XML to show off LINQ - it turns out that the new language features are also great for applications which have little to do with querying.  Ray tracing, for example, is certainly not a prototypical scenario for query and transformation of data.  Nonetheless, I found quite a few places in the code where C#3.0 and LINQ to Objects really improved the code - making it easier to express what the program was doing.

    LINQ to Objects

    Here's one example of the method which computes the intersections of a ray with the objects in a scene.   

            private IEnumerable<ISect> Intersections(Ray ray, Scene scene)
            {
                return scene.Things
                            .Select(obj => obj.Intersect(ray))
                            .Where(inter => inter != null)
                            .OrderBy(inter => inter.Dist);
            }

    The method intersects each object in the scene with the ray, then throws away those that didn't intersect, and finally orders the rest by the distance from the source of the ray. The first object in this result is the closest object hit by the ray.  If there are no elements in the result, there were no intersections.  The LINQ to Objects query methods provide a nice way to describe this, and lambdas make it easy to write the code which processes the objects and intersections.
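    For example, a caller might consume it along these lines (a sketch; the actual shading code is in the attached RayTracer.cs):

            // The nearest hit, if any, is simply the first element of the ordered result.
            ISect closest = Intersections(ray, scene).FirstOrDefault();
            if (closest != null)
            {
                // shade using closest.Thing, closest.Dist, ...
            }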

    Object and Collection Initializers 

    Here's the code that describes the scene rendered in the image above.

            internal readonly Scene DefaultScene =
                new Scene() {
                        Things = new SceneObject[] { 
                                    new Plane() {
                                        Norm = Vector.Make(0,1,0),
                                        Offset = 0,
                                        Surface = Surfaces.CheckerBoard
                                    },
                                    new Sphere() {
                                        Center = Vector.Make(0,1,0),
                                        Radius = 1,
                                        Surface = Surfaces.Shiny
                                    },
                                    new Sphere() {
                                        Center = Vector.Make(-1,.5,1.5),
                                        Radius = .5,
                                        Surface = Surfaces.Shiny
                                    }},
                        Lights = new Light[] { 
                                    new Light() {
                                        Pos = Vector.Make(-2,2.5,0),
                                        Color = Color.Make(.49,.07,.07)
                                    },
                                    new Light() {
                                        Pos = Vector.Make(1.5,2.5,1.5),
                                        Color = Color.Make(.07,.07,.49)
                                    },
                                    new Light() {
                                        Pos = Vector.Make(1.5,2.5,-1.5),
                                        Color = Color.Make(.07,.49,.071)
                                    },
                                    new Light() {
                                        Pos = Vector.Make(0,3.5,0),
                                        Color = Color.Make(.21,.21,.35)
                                    }},
                        Camera = Camera.Create(Vector.Make(3,2,4), Vector.Make(-1,.5,0))
                    };

    The scene consists of a collection of things, a collection of lights, and a camera, all of which are initialized in one statement using object and collection initializers.  This makes it quite a bit easier to describe the scene.  This format also helps to make the structure of the scene clear.

    Lambdas

    Here's the code describing the two surface textures used in the image above:

    static class Surfaces {
        // Only works with X-Z plane.
        public static readonly Surface CheckerBoard =
            new Surface() {
                    Diffuse = pos => ((Math.Floor(pos.Z) + Math.Floor(pos.X)) % 2 != 0)
                                        ? Color.Make(1,1,1)
                                        : Color.Make(0,0,0),
                    Specular = pos => Color.Make(1,1,1),
                    Reflect = pos => ((Math.Floor(pos.Z) + Math.Floor(pos.X)) % 2 != 0)
                                        ? .1
                                        : .7,
                    Roughness = 150
                };

        public static readonly Surface Shiny =
            new Surface() {
                    Diffuse = pos => Color.Make(1,1,1),
                    Specular = pos => Color.Make(.5,.5,.5),
                    Reflect = pos => .6,
                    Roughness = 50
                };
    }

    These two static fields are initialized with the surface styles for the checkerboard and shiny textures on the objects in the scene above.  The surfaces are created with object initializers, and lambda expressions are used to provide the functions to compute diffuse, specular and reflection values based on the position on the surface.  The combination makes it possible to do something similar to prototype objects.
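    For context, the Surface type being initialized here presumably looks roughly like the sketch below - inferred from how it is used, rather than copied from the real definition in RayTracer.cs:

        class Surface
        {
            public Func<Vector, Color> Diffuse;    // diffuse color at a surface point
            public Func<Vector, Color> Specular;   // specular color at a surface point
            public Func<Vector, double> Reflect;   // reflectivity at a surface point
            public double Roughness;               // specular exponent
        }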

    Try Out the Code

    Want to try it out yourself?  Just compile RayTracer.cs with the Orcas C# compiler.  Then run.

RayTracer.cs


  • LukeH's WebLog

    .NET Framework Multitargeting in Visual Studio 2008 (aka Orcas)

    • 33 Comments

    One of the really great features I worked on for our upcoming release is ".NET Framework Multitargeting" for Visual Studio.  This allows you to build applications targeting any of these frameworks using Visual Studio 2008:

    • .NET Framework 2.0 - released with Visual Studio 2005
    • .NET Framework 3.0 - released with Windows Vista
    • .NET Framework 3.5 - will release with Visual Studio Orcas

    When targeting .NET Framework 2.0, for example, Visual Studio will try to protect you from using any features that are only available in a higher framework version.  This lets you confidently use Visual Studio 2008 for targeting any of these three platforms.  New projects can be created targeting any of these frameworks, and projects can later be changed to target a different framework.

    Why is this so important?

    We've heard over-and-over again from .NET developers about how much harder it is to move to the next framework version than to move to the next Visual Studio version.  Upgrading to a new Visual Studio means installing on a few developer machines - upgrading to target a new framework means ensuring that every client of the application has the new framework installed.  Because of this, we very often get asked about whether it's possible to target the older frameworks with the newer tools.  With Visual Studio 2008 - we're enabling just that.

There's another reason this is important for Visual Studio 2008 in particular.  With .NET 2.0, .NET 3.0 and .NET 3.5, we'll have released three framework versions in about 2 years - but only two Visual Studio versions.  Because of this - Visual Studio 2008 needs to be a great tool for targeting both .NET3.0 and .NET3.5, and multitargeting enables this.  In addition, we want developers and development organizations to be able to easily move from Visual Studio 2005 to Visual Studio 2008 - and support for multitargeting lowers the barrier to entry - you can continue working on your .NET 2.0 applications in Visual Studio 2008.

    A few details...

    New Project Dialog

    The New Project and New Website dialogs now have an additional UI element to choose the framework version to target.  Choosing a target will limit the set of available templates in the dialog.  Additionally, templates that are available for multiple targets will be generated with different default references and usings/imports based on the framework target.  Out of the box, the defaults are to target .NET3.5 - but, just like other options in the dialog, the choice is "sticky", so if you pick .NET2.0, it'll default to that in the future.

    Since some templates are unaffected by the .NET Framework target - the description of the template also mentions whether or not a specific .NET Framework target will be applied.

    Project Properties

    You can change the framework target of your project at any time through the Project->Properties dialog.  If you change the target to a higher framework version, you will be able to use the new API features available in that framework.  If you change the target to a lower framework version, any references in your project which are no longer allowed will be disabled (but left in your References list), and compiling your project will likely result in build failures due to the missing assemblies.  However, we've tried to ensure that changing to a lower framework is easily reversible.

    In C# projects, the drop-down to make these changes appears on the Application tab, for VB projects it appears in the Compiler->Advanced... tab and for C++ projects it appears in the Framework and References section. 

    Add Reference Dialog

    The real heart of multitargeting support is the Add Reference dialog (see "So, how does this work?" below for details).

    In the ".NET" tab, references which are not available for the target framework are grayed out and cannot be added.  A hover-tip message explains the reason why.  All the assemblies that ship as part of the higher frameworks will be grayed out, as will any other assemblies which themselves have references to assemblies in the higher frameworks.

    In the "Browse" tab, you can still browse to any assembly, but if that assembly requires a higher framework version than your project, a warning dialog will be presented.

    Add New Item Dialog

    Some of the items that can be added to a project are dependent on a minimum framework version.  For example the "Linq to SQL Classes" item template requires .NET 3.5.  The Add New Item dialog thus filters out these items that are not available for the project.  Custom or third party templates can also set the framework versions they want to be available in - so that they can integrate correctly into this filtering.

    Toolbox

    The toolbox provides controls from many different assemblies - some of which may not be available for the framework target of your project.  So any controls defined in assemblies which are not available on your target framework will be filtered out of the toolbox.  For example, the ElementHost WinForms control which allows WPF interop as part of .NET3.0 is not available in the toolbox when building a WinForms project targeting .NET 2.0.

    Deployment

At the end of the day, the .NET Framework target is all about deployment prerequisites.  So in the Prerequisites dialog we've provided a few options for which .NET Framework package to include with your deployment.

    Language

    Many of the new language features in the .NET languages can be used with any of the available .NET framework targets.  So Visual Studio 2008 allows you to use C#3.0, VB9, and VC9 for all projects, including those targeting .NET2.0 and .NET3.0.

    Websites are different in this regard - you cannot use the new language features in a web site targeting .NET 2.0 or .NET 3.0.  This restriction is in place because Web Sites need to compile on the web server, and this requires that the new compilers be installed on the Web Server.  Since these new compilers come as part of the .NET 3.5 framework, this framework is required on the web server to use new language features.

    Express

    The Visual Studio 2008 Express products have a limited version of multitargeting.  All of the infrastructure described above is available - except for the dropdown in the New Project Dialog.  This keeps the multitargeting features out of the way for beginning and hobbyist developers - but ensures that projects can continue to move between Express and full versions of Visual Studio as they could in Visual Studio 2005.

     

    So, how does this work?

The three frameworks supported by Visual Studio Orcas are each a superset of the previous one - instead of being replacements, as previous .NET Framework releases have been.  One benefit of this is that .NET Framework Multitargeting can be based just on the available set of references.  That is - when you target .NET3.0, the only things you should be prevented from using are those that depend on assemblies in .NET3.5 - and this can be done by ensuring that you don't accidentally reference these assemblies.

There is one caveat to this - called "red bits".  These are the changes in the framework that are part of .NET Framework 2.0 Service Pack 1 and .NET Framework 3.0 Service Pack 1, both of which will be released along with Visual Studio 2008.  These two service packs are required for .NET3.5 to be installed, and they include a few very targeted additions of new functionality to the .NET 2.0 and .NET 3.0 frameworks.  This means that when you target .NET 2.0 in Visual Studio Orcas, you are really targeting .NET 2.0 SP1.  This is similar to what happens when you are using Visual Studio 2003 after the .NET Framework 1.1 service pack is installed on your machine - so it's really nothing new.  But for those who want to be extra cautious about their framework dependencies, it's something to be aware of. 

    Try it Now!

    Orcas Beta1 is available now, and includes almost all of the multitargeting support described above.  Take a look and let us know what you think.  Orcas Beta2, which will be available later this summer, will include everything described here - along with a ton of other improvements and bug fixes across the rest of Visual Studio 2008 and the .NET 3.5 Framework.  As always, send us feedback through Connect.

  • LukeH's WebLog

    What is in Visual C# Express?

    • 26 Comments

    So far, the first two comments I’ve gotten on my blog have been asking essentially the same question, “What is C# Express?”  This seems like a good indication that I should try to explain what you are getting when you go and download C# Express.

     

    First off, Visual C# Express is only for writing Windows applications in C#.  For web applications, you can use Visual Web Developer Express, and for other languages, there are Visual Basic Express, Visual J# Express and Visual C++ Express. 

     

That said, here is a rundown of some of the things you’ll find in C# Express.  Please note that this is a beta, so by the time the Express products are released, there may be some new features added, and there may be some features trimmed down to enhance the experience.

     

    C# Editor

     

    C# Express includes almost all the cool editor features you’ll find in higher end Visual Studio products.  For instance:

    -         Support for the C# 2.0 language, including new language features like generics.

    -         Full IntelliSense, including all the new IntelliSense features that have been added in the 2005 version.

    -         Code Snippets, a new 2005 feature that lets you insert common blocks of code instantly, for instance property definitions.

-         Some Refactorings, such as intelligently renaming a field or extracting a method.  There are a few additional refactorings which are not available in C# Express.

    -         And so many more…

     

    Debugger

     

    The debugging environment has been streamlined a little to provide a simpler and easier to use debugger without all the windows that you are likely to never use. For instance, we have taken the disassembly window out, and cut back to having only one watch window instead of four.  C# Express does have all of the debugging features you use (or will use) all the time like:

    -         Breakpoints

    -         Watch window

    -         Call stack

    -         Data Tips (hover over your code while in debug mode to see what these are)

    -         Visualizers (click on the little magnifying glass in the watch window to see what these are)

    -         And many more…

     

    Windows Forms Designer

     

    The Windows Forms designer in C# Express is the fully functional designer from other versions of Visual Studio.  All of the form layout and databinding features are available in C# Express along with some of the new features in 2005 like:

    -         Snaplines

    -         Smart tags to help you do common tasks with your controls.

    -         A bunch of new controls

    -         Strongly typed resources

    -         Settings designer

    -         And many more…

     

    Data Connectivity

    With the included SQL Server Express, you can write applications that connect to a local database.  See the VSData team’s blog for details about this.  And with full support for consuming webservices, you can easily write applications that talk to Amazon, Google, or any of the other thousands of webservices out there.

     

    Some Things You Won’t Find in C# Express

    Although it is a very full featured environment, there are some things you won’t find in C# Express that you would find in other Visual Studio products. 

    -         Source control

    -         Add-ins and Macros

    -         Class Designer

    -         ClickOnce deployment

    -         Remote debugging

    -         Mobile development

    -         Unit testing

     

    If these are features you use a lot, C# Express probably won’t be the right development tool for you.  However, if you are looking for a great small tool for writing C# Windows applications, C# Express will be a great choice.

     

But why trust me?  Go download the beta and see for yourself!  Then come back here and let me know what you think.

  • LukeH's WebLog

    F#

    • 25 Comments

    A few weeks back, Soma blogged about an increased investment by the Microsoft Developer Division in the F# language.  Part of this increased investment has been the creation of a small team in Redmond to work with F#'s creator Don Syme to bring F# into the set of first class languages supported on .NET.  This provided a great opportunity for me to become one of the early members of this Redmond based F# team.  So, as of a few weeks ago, I've traded in my C# Compiler PM role for the newly created F# PM job!

As you can probably tell from some of my previous blog posts, I have a lot of interest in functional programming with .NET.  This makes F# a naturally exciting language for me to be involved in.  F# provides a language in which functional programming is easy and expressive, while at the same time practical and well-connected to the underlying .NET type-system and programming model.  An exciting language for the .NET platform indeed!

    Sample F# Code - Mandelbrot

    Here's a little example piece of F# code to hopefully pique your interest to learn more about the language.

    [Note: This code has been updated to work with .NET4, which now includes a built-in Complex type]

    open System.Numerics
    open System
    
    let maxIteration = 100
    
    let modSquared (c : Complex) = c.Real * c.Real + c.Imaginary * c.Imaginary
    
    type MandelbrotResult = 
        | DidNotEscape
        | Escaped of int
        
    let mandelbrot c =
        let rec mandelbrotInner z iterations =
            if(modSquared z >= 4.0) 
                then Escaped iterations
            elif iterations = maxIteration
                then DidNotEscape
            else mandelbrotInner ((z * z) + c) (iterations + 1)
        mandelbrotInner c 0
    
    for y in [-1.0..0.1..1.0] do
        for x in [-2.0..0.05..1.0] do
            match mandelbrot (Complex(x, y)) with
            | DidNotEscape -> Console.Write "#"
            | Escaped _ -> Console.Write " "
        Console.WriteLine () 
    
    Let

The let keyword is used to declare a new variable or function.  Note in particular the similarities between the declarations of maxIteration and modSquared.  Functions are treated much the same as values of any other type throughout F#. 

     

Let can be used both at the top level and also inside function definitions to declare a local variable or function.  For example, the mandelbrot function uses a local function mandelbrotInner which computes the result, and simply calls it with the initial values.  Note also that mandelbrotInner refers to the parameter c passed to the mandelbrot function - local functions are true closures.

Recursive functions can be defined using let rec.

    Types

    F# is built on the .NET type system, and provides access to any type defined in a .NET assembly.  Types can also be defined in F# using the keyword type.  There are many kinds of types that can be created in F#, for example, standard .NET types such as classes and interfaces can be defined, but F# also supports types such as records and discriminated unions. 

As an example of discriminated unions, in the code above, MandelbrotResult is defined to be a type whose values are either DidNotEscape or are Escaped with an integer.  This accurately captures the mathematical definition of the Mandelbrot set, which is defined in terms of points in the complex plane either escaping to infinity after a certain number of iterations, or remaining within a bounded region.

    Terse Syntax

    One of the most striking features of F# code is that it is very terse - ideas can typically be expressed with a small amount of code.  There are a few significant language features which contribute to this:

    1. Type Inference: F# is strongly typed, like C#, but instead of having to declare the type of variables, parameters and return types, F# uses type inference to determine these automatically.  When the types cannot be inferred, type annotations can be supplied in the code, such as in the definition of the modSquared function above.
2. Indentation-aware syntax: By default, F# allows code to omit begin...end keywords and some other tokens, and instead relies on indentation to indicate nesting.  This is similar to languages like Python, and enables the same kind of syntactic lightness that programs in these languages enjoy.
    3. Expressions: F# programs are built out of expressions, which can be composed very simply. For example, if is an expression in F#, as opposed to in C# where it is a statement.  This can make code simpler, while also enabling a high degree of flexibility.
    Libraries

F# code can use all of the existing .NET libraries, such as the Console class used in the code above.  But F# also has access to a rich set of F# libraries, providing types that are well suited to functional programming and F# in particular.  A few notable libraries:

1. Collections:  The standard .NET Framework collections can be used from F#, but there is also a fully-featured set of functional collections, including the ubiquitous immutable linked list (List<A>), an efficient immutable set built on binary trees (Set<A>) and an immutable dictionary (Map<Key,A>).
    2. Control:  High-level control structures, such as compositional eventing (IEvent<A>), laziness (Lazy<A>) and most importantly, asynchronous programming primitives (Async<A>).  The last of these in particular is a very exciting feature of F# for multi-threaded programming.
    Interactive

F# comes with an "F# Interactive" tool window for Visual Studio, and also a command line interactive shell (fsi.exe).  These are tremendously useful for prototyping and exploring, and can also be used as a testbed while working on larger projects.  As an example (see screenshot below) the code above can be pasted into an interactive shell to execute immediately.  If you want to make changes, just edit the code and paste into the interactive prompt again to run the new version.

    Summary

Now that I'm working full-time on F#, you can expect to see more blog posts here about F# in the future.  If you want to try out the code above, or any of the other great F# samples that are floating around the web, go to http://research.microsoft.com/fsharp/release.aspx and grab the most recent .msi download of the current Microsoft Research release of F#.

mandelbrot.fs

  • LukeH's WebLog

    Monadic Parser Combinators using C# 3.0

    • 19 Comments

    Parser combinators are an idea that I enjoy every time I go back and look at again.  They are an approach to building parsers by composing very simple atomic parsers into bigger and bigger units which can ultimately express real world grammars.  This idea has been particularly popular in functional languages where the parsers can naturally be thought of as functions from input strings to parse trees, and composition of parsers is just function composition.  This approach often leads to a simple syntax which makes the resulting parsers pleasantly declarative in that internal-DSL kind of way. 

    There's been plenty of great research and implementation of parser combinators over the last 15 years.  I can't describe it all, or even list all the references here - so let me just point you to a couple of fun papers co-authored by my Microsoft colleagues Erik Meijer and Daan Leijen.  These papers also include many references out to the broader set of research papers on the topic.

    Building a Parser Combinator Framework in C#3.0

The code below is some sample code I wrote shortly after getting ahold of one of the first C#3.0 prototype compilers a couple years ago.  It uses some of the new language features in interestingly unusual ways, which I'll highlight as I go along.  For reference, it is most directly influenced by the great parser combinator example given near the end of Scala by Example.

    Step 1: A Type for Representing Parsers
    using System;
    using System.Linq;
    using System.Text;
    
    // The result of a parse consists of a value and the unconsumed input.
    public class Result<TInput, TValue> {
        public readonly TValue Value;
        public readonly TInput Rest;
        public Result(TValue value, TInput rest) { Value = value; Rest = rest; }
    }
    
    // A Parser is a delegate which takes an input and returns a result.
    public delegate Result<TInput, TValue> Parser<TInput, TValue>(TInput input);
    

    Taking a cue from the functional language implementations, our type for representing parsers will be a delegate type.  This naturally captures the idea that a parser is a function which takes in some input, and returns a value as well as the unconsumed part of the input.
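    To make that concrete, here is a tiny hand-written parser of this delegate type (purely illustrative) that consumes one digit from the front of a string, with null signalling failure - the same convention the combinators below rely on:

    // TInput is string here; a successful parse returns the digit plus the rest
    // of the input, and a failed parse returns null.
    Parser<string, char> digit =
        input => input.Length > 0 && char.IsDigit(input[0])
                     ? new Result<string, char>(input[0], input.Substring(1))
                     : null;

    var r = digit("42 apples");   // r.Value == '4', r.Rest == "2 apples"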

    Step 2: Provide Infix AND and OR Operations on Parsers
    public static class ParserCombinatorExtensions {
        public static Parser<TInput, TValue> OR<TInput, TValue>(
                this Parser<TInput, TValue> parser1,
                Parser<TInput, TValue> parser2) {
            return input => parser1(input) ?? parser2(input);
        }

        public static Parser<TInput, TValue2> AND<TInput, TValue1, TValue2>(
                this Parser<TInput, TValue1> parser1,
                Parser<TInput, TValue2> parser2) {
            return input => parser2(parser1(input).Rest);
        }
    }

    Before building any actual parsers, we'll just come up with ways to combine them.  There are two obvious ways.  Try the first one, and if it fails, apply the second one.  This corresponds to alternate productions in a BNF grammar.  Or we could apply the first one, and then apply the second one to whatever input is left.  This corresponds to sequencing (juxtaposition) in BNF grammars. 

    Note that these are extension methods applied to the Parser type, which is a delegate.  This is something we couldn't have done in C# prior to C# 3.0, because there was no way to add methods to a delegate type.  The "infix" syntax we get by making these extension methods instead of plain static methods is very important for the readability of the resulting parsers, where ordering is critical, as the example at the end of this post will show.
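    In use, the infix style reads quite naturally.  For example, given two hypothetical parsers digit and letter of type Parser<string, char>:

    var digitOrLetter   = digit.OR(letter);    // try digit; if it fails, try letter
    var digitThenLetter = digit.AND(letter);   // parse a digit, then a letter,
                                               // keeping only the letter's result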

    Step 3: Support Composing Parsers using C# Query Expressions
    public static class ParserCombinatorsMonad {
        // By providing Select, Where and SelectMany methods on Parser<TInput, TValue> we make the
        // C# Query Expression syntax available for manipulating Parsers.
        public static Parser<TInput, TValue> Where<TInput, TValue>(
                this Parser<TInput, TValue> parser,
                Func<TValue, bool> pred) {
            return input => {
                var res = parser(input);
                if (res == null || !pred(res.Value)) return null;
                return res;
            };
        }

        public static Parser<TInput, TValue2> Select<TInput, TValue, TValue2>(
                this Parser<TInput, TValue> parser,
                Func<TValue, TValue2> selector) {
            return input => {
                var res = parser(input);
                if (res == null) return null;
                return new Result<TInput, TValue2>(selector(res.Value), res.Rest);
            };
        }

        public static Parser<TInput, TValue2> SelectMany<TInput, TValue, TIntermediate, TValue2>(
                this Parser<TInput, TValue> parser,
                Func<TValue, Parser<TInput, TIntermediate>> selector,
                Func<TValue, TIntermediate, TValue2> projector) {
            return input => {
                var res = parser(input);
                if (res == null) return null;
                var val = res.Value;
                var res2 = selector(val)(res.Rest);
                if (res2 == null) return null;
                return new Result<TInput, TValue2>(projector(val, res2.Value), res2.Rest);
            };
        }
    }

    This is where the fun really starts!  Any type which implements Select, SelectMany and Where methods supports (part of) the "query pattern" which means we can write C#3.0 queries including multiple froms, an optional where clause and a select clause to process objects of this type.  This can make it a *lot* easier to combine parsers, and makes the syntax more transparent.  That the parsers we are using here support these three operations is captured in the idea that this parser type is a monad - something very much in LINQ's blood - which I hope to write more about in a future post.
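    To see the translation at work, here is a small throwaway example (illustrative only) built on the Result and Parser types above - the compiler rewrites the query into a call to the SelectMany defined here:

    // Two trivial parsers over strings, each consuming a single character.
    Parser<string, char> first = s => s.Length > 0 ? new Result<string, char>(s[0], s.Substring(1)) : null;
    Parser<string, char> second = first;

    // Query syntax over parsers...
    var pair = from x in first
               from y in second
               select new { x, y };

    // ...is translated by the compiler into the SelectMany defined above:
    var pairDesugared = first.SelectMany(x => second, (x, y) => new { x, y });

    var res = pair("ab");   // res.Value is { x = 'a', y = 'b' }, res.Rest is ""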

    Step 4: The Fundamental Parser Combinator Building Blocks
    // Contains all the basic parsers that are independent of return type.
    public abstract class Parsers<TInput> {
        public Parser<TInput, TValue> Succeed<TValue>(TValue value) {
            return input => new Result<TInput, TValue>(value, input);
        }
        public Parser<TInput, TValue[]> Rep<TValue>(Parser<TInput, TValue> parser) {
            return Rep1(parser).OR(Succeed(new TValue[0]));
        }
        public Parser<TInput, TValue[]> Rep1<TValue>(Parser<TInput, TValue> parser) {
            return from x in parser 
                   from xs in Rep(parser)
                   select (new[] { x }).Concat(xs).ToArray();
        }
    }
    
    // Contains a few parsers that parse characters from an input stream
    public abstract class CharParsers<TInput> : Parsers<TInput> {
        public abstract Parser<TInput, char> AnyChar { get; }
        public Parser<TInput, char> Char(char ch) {
            return from c in AnyChar where c == ch select c;
        }
        public Parser<TInput, char> Char(Predicate<char> pred) {
            return from c in AnyChar where pred(c) select c;
        }
    }

So now we need to write some (almost) real parsers.  The Succeed parser always succeeds with the provided result without consuming any input.  Rep and Rep1 apply a parser repeatedly; they differ only in whether they require the parser to succeed at least once.  Note that Rep1 uses Query Expressions to capture the idea of parsing using "parser", then parsing using "Rep(parser)", then returning a result built out of the two individual results. 

The AnyChar and Char parsers parse actual characters out of the input stream.  AnyChar parses any single character, whereas the two Char overloads accept only a specific character or characters matching a predicate.  Note that AnyChar is defined abstract because it can only be implemented once we've decided what kind of input we're going to parse - but we want to be able to build on it without committing to a particular input type.
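    As a rough illustration of how these compose, a throwaway subclass (hypothetical - Step 6 below defines essentially the same string-based AnyChar) is enough to exercise Rep and Char:

    // A minimal concrete subclass just for experimenting with string input.
    class StringCharParsers : CharParsers<string> {
        public override Parser<string, char> AnyChar {
            get {
                return input => input.Length > 0
                                    ? new Result<string, char>(input[0], input.Substring(1))
                                    : null;
            }
        }
    }

    // Then, inside some method:
    //   var p = new StringCharParsers();
    //   var letters = p.Rep(p.Char(char.IsLetter));
    //   var parsed = letters("abc123");   // parsed.Value: 'a','b','c'   parsed.Rest: "123"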

    Step 5: An Example Parser for a MiniML Language
    // Term and its derived classes define the AST for terms in the MiniML language.
    public abstract class Term { }
    public class LambdaTerm : Term { public readonly string Ident; public readonly Term Term; public LambdaTerm(string i, Term t) { Ident = i; Term = t; } }
    public class LetTerm : Term { public readonly string Ident; public readonly Term Rhs; public Term Body; public LetTerm(string i, Term r, Term b) { Ident = i; Rhs = r; Body = b; } }
    public class AppTerm : Term { public readonly Term Func; public readonly Term[] Args; public AppTerm(Term func, Term[] args) { Func = func; Args = args; } }
    public class VarTerm : Term { public readonly string Ident; public VarTerm(string ident) { Ident = ident; } }
    
    // Provides a set of parsers for the MiniML Language defined above.  
    public abstract class MiniMLParsers<TInput> : CharParsers<TInput>{
        public MiniMLParsers() {
            Whitespace = Rep(Char(' ').OR(Char('\t').OR(Char('\n')).OR(Char('\r'))));
            WsChr =  chr => Whitespace.AND(Char(chr));
            Id =     from w in Whitespace
                     from c in Char(char.IsLetter)
                     from cs in Rep(Char(char.IsLetterOrDigit))
                     select cs.Aggregate(c.ToString(),(acc,ch) => acc+ch);
            Ident =  from s in Id where s != "let" && s != "in" select s;
            LetId =  from s in Id where s == "let" select s;
            InId =   from s in Id where s == "in" select s;
            Term1 = (from x in Ident 
                     select (Term)new VarTerm(x))
                    .OR(
                    (from u1 in WsChr('(') 
                     from t in Term 
                     from u2 in WsChr(')') 
                     select t));
            Term =  (from u1 in WsChr('\\')
                     from x in Ident
                     from u2 in WsChr('.')
                     from t in Term
                     select (Term)new LambdaTerm(x,t))
                    .OR(
                    (from letid in LetId
                     from x in Ident
                     from u1 in WsChr('=')
                     from t in Term
                     from inid in InId
                     from c in Term
                     select (Term)new LetTerm(x,t,c)))
                    .OR(
                    (from t in Term1
                     from ts in Rep(Term1)
                     select (Term)new AppTerm(t,ts)));
            All =    from t in Term from u in WsChr(';') select t;
        }
       
        public Parser<TInput, char[]> Whitespace;
        public Func<char, Parser<TInput, char>> WsChr;
        public Parser<TInput, string> Id;
        public Parser<TInput, string> Ident;
        public Parser<TInput, string> LetId;
        public Parser<TInput, string> InId;
        public Parser<TInput, Term> Term;
        public Parser<TInput, Term> Term1;
        public Parser<TInput, Term> All;
    }

Here we've used the parsers built so far to implement a parser for a (toy) language called MiniML.  You can immediately see that the parser is structured much like the BNF it implements - which is the internal-DSL concept I mentioned above.  You can also see how C# Query Expressions are used to help with this - in a context that is very different than their standard query usage.

    Step 6: Parse Something!
    public class MiniMLParserFromString: MiniMLParsers<string> {
        public override Parser<string, char> AnyChar {
            get { { return input => input.Length > 0 ? new Result<string,char>(input[0], input.Substring(1)) : null; } }
        }   
    }
    class Program {
        static void Main(string[] args) {
            MiniMLParserFromString parser = new MiniMLParserFromString();
            Result<string,Term> result = 
                parser.All(@"let true = \x.\y.x in 
                             let false = \x.\y.y in 
                             let if = \b.\l.\r.(b l) r in
                             if true then false else true;");
        }
    }

We finally have something we can execute!  MiniMLParserFromString makes everything concrete by describing how to parse characters from a string input.  And then in the Main method we see an example which parses a small example expression from the MiniML language.
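    If you want to see what came out, a couple of extra lines (illustrative only, not part of the original sample) could be appended to Main:

            if (result != null)
                Console.WriteLine("Parsed a {0}; unconsumed input: \"{1}\"",
                                  result.Value.GetType().Name, result.Rest);
            else
                Console.WriteLine("Parse failed.");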

    Conclusion

The little parser combinator framework here is very much a toy implementation.  It's horribly inefficient, doesn't do error reporting, and doesn't support any form of error recovery.  But it's a fun 150 lines of code!  If you are interested in more practical applications of parser combinators, the papers referenced earlier provide a lot of good ideas.

    And here's the code:

ParserCombinators.cs

  • LukeH's WebLog

    F12 - Go To Definition

    • 17 Comments

    A few months ago when Bill Gates was on the Daily Show, Jon Stewart asked him a particularly insightful software question:

    "So...what does the F12 button do?  Does it do anything?  Is it a joke button".  To which Bill replied jokingly - "I'd stay away from it if I were you." 

    If you use Visual Studio though - I recommend you ignore Bill's advice here.  F12 is a quite useful keyboard shortcut.

    Go To Definition

    F12 itself invokes the "Go To Definition" (GTD) command.  This works on types, members, locals, range variables, etc.  It uses all the compiler's internal understanding of the code to make sure it takes you to the right member - even when there are many overloads or if your code is using a lot of type inference.

    Metadata As Source

One of the additions to "Go To Definition" support in Visual Studio 2005 was something we call "Metadata as Source".  If you go to definition on a type which is defined in a referenced assembly - you will be presented with a "header file"-style view of the corresponding type and its members.  This keeps you in the editor - and gives a convenient view of the type.  You can continue to navigate other members/types directly from the Metadata As Source view by using GTD again on those tokens.

    Backward and Forward navigation

    If you use GTD a lot, you'll probably find yourself wanting to go "back" to where you invoked "go to definition" from.  It turns out there is a keybinding for this:

    • ctrl-shift-8: Takes you back in the GTD stack, to the last place you invoked GTD
    • ctrl-shift-7: Takes you forward in the GTD stack, to the place you navigated back from
    Code Definition Window

The Code Definition Window is a tool window which shows a live GTD of whatever symbol is currently under the cursor in the editor.  If you find yourself using GTD a lot, you may want to pop up the Code Definition Window with ctrl-w-d.

    BTW - If you are looking for more useful keyboard shortcuts using the C# Profile in Visual Studio - take a look at  the C# keybinding posters Karen blogged about a little while ago.

  • LukeH's WebLog

    Using LINQ to solve puzzles

    • 15 Comments

    A couple months ago, I had a great time participating in Microsoft's PuzzleHunt event along with our team "Cup<T>".  Normally, the puzzles in puzzle hunt are designed to limit the value of writing programs to help solve them.  But this year, I did end up writing some code to help with one of the puzzles - and it happened to be a simple LINQ query that helped.  Since this is an example of using LINQ in a problem domain that's pretty different than the usual querying examples, I thought it might be worth sharing.

    The Puzzle

Here's a puzzle similar to the one in the puzzle hunt.  The diagram below is a bunch of weights (A-M) hanging from a system of bars.  Each weight has an integer value between 1 and 13, and the goal is to figure out what each weight must be for the diagram below to balance correctly as shown: 

                              |
                              |
                  +--+--+--+--+--+--+--+
                  |                    |
                  |                    |
               +--+--+--+--+--+        |
               |     L        M        |
               |                       |
      +--+--+--+--+--+--+     +--+--+--+--+--+
      H              |  I     |  J        K  |
                     |        |              |
            +--+--+--+--+--+  |     +--+--+--+--+--+
            E              F  |     G              |
                              |                    |
                  +--+--+--+--+--+  +--+--+--+--+--+--+
                  A              B  C                 D
    

The rules for this kind of puzzle are: (1) The weights on either side of a given pivot point must be equal, when weighted by the distance from the pivot, and (2) a bar hanging beneath another contributes its total weight as though it were a single weight.  For instance, the bar on the bottom right must have 5*C=D, and the one above it must have 3*G=2*(C+D).

    A First Solution

    Here's what I tried first.  Note that it's not really much different than a bunch of for loops with an if statement inside:

    using System;
    using System.Linq;
    
    class WeighsAndMeans
    {
      static void Main(string[] args)
      {
        var solveForWeights =
          from a in Enumerable.Range(1, 13)
          from b in Enumerable.Range(1, 13)
          from c in Enumerable.Range(1, 13)
          from d in Enumerable.Range(1, 13)
          from e in Enumerable.Range(1, 13)
          from f in Enumerable.Range(1, 13) 
          from g in Enumerable.Range(1, 13)
          from h in Enumerable.Range(1, 13)
          from i in Enumerable.Range(1, 13) 
          from j in Enumerable.Range(1, 13)
          from k in Enumerable.Range(1, 13) 
          from l in Enumerable.Range(1, 13)
          from m in Enumerable.Range(1, 13) 
          where (4 * a == b)
             && (5 * c == d)
             && (3 * e == 2 * f)
             && (3 * g == 2 * (c + d))
             && (3 * (a + b) + 2 * j == k + 2 * (g + c + d))
             && (3 * h == 2 * (e + f) + 3 * i)
             && ((h + i + e + f) == l + 4 * m)
             && (4 * (l + m + h + i + e + f) == 3 * (j + k + g + a + b + c + d))
          select new { a, b, c, d, e, f, g, h, i, j, k, l, m,
                       Total = a + b + c + d + e + f + g + h + i + j + k + l + m };

        solveForWeights.ToList().ForEach(result => Console.WriteLine(result));
      }
    }

    A More Efficient Solution

    Part way through writing the code above, I recognized that it wasn't going to be very fast.  It's not too hard to see that it'll execute the body of the where clause 13^13 times.  That's more than 300 trillion times!  Turns out this is exactly what join can help with, and in contrast to the for loop case, it's pretty easy to turn the query above into one which uses joins.  The following code solves the puzzle right away:

    using System;
    using System.Linq;
    
    class WeighsAndMeans
    {
      static void Main(string[] args)
      {
        var solveForWeights =
          from a in Enumerable.Range(1, 13)
          join b in Enumerable.Range(1, 13) on 4 * a equals b
          from c in Enumerable.Range(1, 13)
          join d in Enumerable.Range(1, 13) on 5 * c equals d
          from e in Enumerable.Range(1, 13)
          join f in Enumerable.Range(1, 13) on 3 * e equals 2 * f
          join g in Enumerable.Range(1, 13) on 2 * (c + d) equals 3 * g
          from h in Enumerable.Range(1, 13)
          join i in Enumerable.Range(1, 13) on 3 * h - 2 * (e + f) equals 3 * i
          from j in Enumerable.Range(1, 13)
          join k in Enumerable.Range(1, 13) on 3 * (a + b) + 2 * j - 2 * (g + c + d) equals k
          from l in Enumerable.Range(1, 13)
          join m in Enumerable.Range(1, 13) on (h + i + e + f) - l equals 4 * m
          where (4 * (l + m + h + i + e + f) == 3 * (j + k + g + a + b + c + d))
          select new { a, b, c, d, e, f, g, h, i, j, k, l, m,
                       Total = a + b + c + d + e + f + g + h + i + j + k + l + m };

        solveForWeights.ToList().ForEach(result => Console.WriteLine(result));
      }
    }

    Conclusion

It turns out this is an example of using LINQ to solve integer constraint problems.  In the general case, special purpose libraries built to solve these problems are almost certainly more efficient, but the LINQ example using joins is sufficiently quick at least for this case.  More importantly, the LINQ solution is not too hard to arrive at, and doesn't require knowledge of some special purpose Integer Programming framework.  I see this as an example of one of the great benefits of LINQ - that I can do all sorts of query and transformation in C# without resorting to special purpose frameworks for each different domain I need to work with.


  • LukeH's WebLog

    In-Memory Query with C#2.0 and C#3.0

    • 12 Comments

    Here's some sample code from a demo I did a few months ago.  It's a simple example of how much easier in-memory query and transformation are with C#3.0 and LINQ, compared with a straightforward implementation using C#2.0.

    The Query

    The goal is to implement this query:

    "Get all the instance methods on System.String, ordered by name and grouped into overloads, each with the name of the overloaded member and the number of overloads."

    For example, you can imagine that a C# editor IntelliSense implementation might need code such as this one to present a completion list on an instance of System.String.

    C# 2.0 Implementation

    MethodInfo[] methods = typeof(string).GetMethods();
    
    List<MethodInfo> instanceMethods = new List<MethodInfo>();
    foreach (MethodInfo method in methods)
        if (!method.IsStatic)
            instanceMethods.Add(method);
    instanceMethods.Sort(delegate(MethodInfo m1, MethodInfo m2) { return m1.Name.CompareTo(m2.Name); });
    Dictionary<string, List<MethodInfo>> methodGroups = new Dictionary<string, List<MethodInfo>>();
    foreach (MethodInfo method in instanceMethods)
    {
        if (methodGroups.ContainsKey(method.Name))
        {
            methodGroups[method.Name].Add(method);
        }
        else
        {
            methodGroups.Add(method.Name, new List<MethodInfo>());
            methodGroups[method.Name].Add(method);
        }
    }
    List<Container> result = new List<Container>();
    foreach (string name in methodGroups.Keys)
    {
        Container container = new Container();
        container.MethodName = name;
        container.Overloads = methodGroups[name].Count;
        result.Add(container);
    }
    foreach (Container item in result)
    {
        Console.WriteLine(item);
    }
    

    C# 3.0 Implementation

    MethodInfo[] methods = typeof(string).GetMethods();
    
    var query =
        from m in methods
        where !m.IsStatic
        orderby m.Name
        group m by m.Name into g
        select new { MethodName = g.Key, Overloads = g.Count() };
    
    foreach (var item in query)
    {
        Console.WriteLine(item);
    }

    Conclusion

    The C#3.0 implementation is quite a bit shorter!  But more importantly, it reads a lot like the English description of the query given above.   It's also a more declarative description of the query - it says what to do, but not how to do it.  These are general features of using LINQ to Objects with C#3.0.  There are many cases where what would have been for loops, if statements, and assignments to intermediate collections - can now be written as queries, providing a clearer description of what the code is doing.

    BTW - Which do you think is faster?  Why?


  • LukeH's WebLog

    C# 3.0 and CodeDOM

    • 12 Comments

The CodeDOM is a very handy .NET API which allows you to programmatically compile code using the .NET compilers and programmatically construct code without just pasting together strings. 

With the new version of the language, we've heard a number of questions about how to use the CodeDOM with the new compiler.

There are two aspects to the integration of C#3.0 with CodeDOM:

• Programmatically compiling C# source code: This is supported in .NET3.5.  The existing CodeDOM is augmented with a CSharpCodeProvider constructor overload which takes a "providerOptions" argument.  This can be used to point CodeDOM at the new .NET Framework 3.5 C# compiler (which supports C#3.0), by passing a "providerOptions" dictionary containing "CompilerVersion" as "v3.5".  This can also be controlled via the .config file, which is done for example in Orcas ASP.NET websites targeting .NET3.5. 

    • Programmatically constructing C#3.0 source code:  There won't be support for this in the .NET Framework 3.5 CodeDOM that ships with Orcas.  Luckily, very few of the C# 3.0 features need CodeDOM support - since most new features are expressions, and the CodeDOM doesn't go down to the expression level. 
    Example of programmatically compiling C#3.0 source code:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using Microsoft.CSharp;
    using System.CodeDom.Compiler;
    class Program
    {
        static void Main(string[] args)
        {
            var csc = new CSharpCodeProvider(new Dictionary<string, string>() { { "CompilerVersion", "v3.5" } });
            var parameters = new CompilerParameters(new[] { "mscorlib.dll", "System.Core.dll" }, "foo.exe", true);
            parameters.GenerateExecutable = true;
            CompilerResults results = csc.CompileAssemblyFromSource(parameters,
            @"using System.Linq;
                class Program {
                  public static void Main(string[] args) {
                    var q = from i in Enumerable.Range(1,100)
                              where i % 2 == 0
                              select i;
                  }
                }"
    );
            results.Errors.Cast<CompilerError>().ToList().ForEach(error => Console.WriteLine(error.ErrorText));
        }
    }

  • LukeH's WebLog

    Huffman Coding with F#

    • 11 Comments

    I recall my sense of awe when I first wrote a simple compression application using Huffman Coding a few years ago for a school assignment.  Compression is one of those things that just kind of feels like magic - you get to take something, and make it smaller without losing any information!  It's done so often that it now seems commonplace - but, like so many other programming projects, there is something rewarding about writing your own implementation and seeing it actually work.

    When I was in Cambridge a couple weeks ago visiting the Microsoft Research-based part of the F# team, I stopped by one of the nice bookstores in Cambridge and ended up picking up a copy of Information Theory, Inference and Learning Algorithms by David MacKay - which has been a truly great read so far.  When I got to the section on Huffman coding, it reminded me of how much I enjoyed implementing the algorithm a few years back, that time in C.  So I decided I should try again, this time in F#!

    [ Note: If you are more interested in F# than in Huffman Coding, feel free to skip to the comments after the code below - where I'll talk about some of the reasons why F# works so well for this kind of application. ]

    Compression with Huffman Coding

    Wikipedia has a great description of Huffman Coding - so instead of trying to describe it myself, let me just quote from the Wikipedia topic:

    Huffman Compression

    "In computer science and information theory, Huffman coding is an entropy encoding algorithm used for lossless data compression. The term refers to the use of a variable-length code table for encoding a source symbol (such as a character in a file) where the variable-length code table has been derived in a particular way based on the estimated probability of occurrence for each possible value of the source symbol."

    Basic Technique

    "The technique works by creating a binary tree of nodes. These can be stored in a regular array, the size of which depends on the number of symbols, n. A node can be either a leaf node or an internal node. Initially, all nodes are leaf nodes, which contain the symbol itself, the weight (frequency of appearance) of the symbol and optionally, a link to a parent node which makes it easy to read the code (in reverse) starting from a leaf node. Internal nodes contain symbol weight, links to two child nodes and the optional link to a parent node. As a common convention, bit '0' represents following the left child and bit '1' represents following the right child. A finished tree has n leaf nodes and n − 1 internal nodes."

    A Huffman Coder in F#

    open System
    
    /// Huffman coding uses a binary tree whose leaves are the input symbols 
    /// and whose internal nodes are the combined expected frequency of all the
    /// symbols beneath them.
    type HuffmanTree = 
        | Leaf of char * float
        | Node of float * HuffmanTree * HuffmanTree
    
    /// Provides encoding and decoding for strings containing the given symbols and expected frequencies
    type HuffmanCoder(symbols: seq<char>, frequencies : seq<float>) =
       
        /// Builds a list of leaves for a huffman encoding tree from the input frequencies
        let huffmanTreeLeafs =    
            Seq.zip symbols frequencies
            |> Seq.toList
            |> List.map Leaf
            
        /// Utility function to get the frequency from a huffman encoding tree node
        let frequency node =
            match node with
            | Leaf(_,p) -> p
            | Node(p,_,_) -> p    
    
        /// Builds a huffman encoding tree from a list of root nodes, iterating until a single root node remains
        let rec buildCodeTree roots = 
            match roots |> List.sortBy frequency with
            | [] -> failwith "Cannot build a Huffman Tree for no inputs" 
            | [node] -> node
            | least::nextLeast::rest -> 
                       let combinedFrequency = frequency least + frequency nextLeast
                       let newNode = Node(combinedFrequency, least, nextLeast)
                       buildCodeTree (newNode::rest)
                   
        let tree = buildCodeTree huffmanTreeLeafs
         
        /// Builds a table of huffman codes for all the leaves in a huffman encoding tree
        let huffmanCodeTable = 
            let rec huffmanCodes tree = 
                match tree with
                | Leaf (c,_) -> [(c, [])]
                | Node (_, left, right) -> 
                    let leftCodes = huffmanCodes left |> List.map (fun (c, code) -> (c, true::code))
                    let rightCodes = huffmanCodes right |> List.map (fun (c, code) -> (c, false::code))
                    List.append leftCodes rightCodes
            huffmanCodes tree 
            |> List.map (fun (c,code) -> (c,List.toArray code))
            |> Map.ofList
    
        /// Encodes a string using the huffman encoding table
        let encode (str:string) = 
            let encodeChar c = 
                match huffmanCodeTable |> Map.tryFind c with
                | Some bits -> bits
                | None -> failwithf "No frequency information provided for character '%A'" c
            str.ToCharArray()
            |> Array.map encodeChar
            |> Array.concat
           
        
        /// Decodes an array of bits into a string using the huffman encoding tree
        let decode bits =
            let rec decodeInner bitsLeft treeNode result = 
                match bitsLeft, treeNode with
                | [] , Node (_,_,_) -> failwith "Bits provided did not form a complete word"
                | [] , Leaf (c,_) ->  (c:: result) |> List.rev |> List.toArray
                | _  , Leaf (c,_) -> decodeInner bitsLeft tree (c::result)
                | b::rest , Node (_,l,r)  -> if b
                                             then decodeInner rest l result
                                             else decodeInner rest r result
            let bitsList = Array.toList bits
            new String (decodeInner bitsList tree [])
                     
        member coder.Encode source = encode source
        member coder.Decode source = decode source
    
    Pattern Matching

    Pattern matching is one of the basic but very powerful features of F#.  Using pattern matching allows code to be succinct while at the same time very clear about its behaviour.  Just about every function in the code above uses pattern matching - which is typical of a lot of F# code. 

    In simple cases, such as in "huffmanCodes" above, pattern matching makes it easy to switch on the possible cases of a union data structure. 

    match tree with
    | Leaf (c,_) -> //...
    | Node (_, left, right) -> //...

    In more complex cases like "decodeInner" above, pattern matching helps to guide your code. You list each shape of input that you know how to handle, and the leaf nodes of the pattern you matched indicate the data needed to define the behaviour of that case.  Then the compiler kindly tells you which cases you haven't covered. When I wrote this function originally, I was missing the first case, and got this helpful compiler warning:

    Warning: Incomplete pattern matches on this expression. The value '([],Node (_, _, _))' will not be matched 

    Which was exactly right - this particular input indicates that the user provided illegal inputs!

     

    Pipelining

    Pipelining is a nice way to declaratively describe a series of operations to perform on an input.  It's philosophically just like pipelining in a command shell - take the input data from the left and pass it as input to the function on the right. 

    Because the F# libraries provide a good collection of primitive operations to perform on your data, it's easy to describe a transformation as a pipeline of data through a sequence of these operations.  For example, you can filter, project, collapse, zip, and re-package data easily and declaratively.
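
    As a small illustration of that style (the symbol/frequency data below is made up for the example, not taken from the coder above), a pipeline can zip, filter, sort, project, and re-package a sequence in a few lines:

    let topSymbols =
        Seq.zip "abcde" [0.35; 0.05; 0.25; 0.15; 0.20]    // zip two inputs together
        |> Seq.filter (fun (_, freq) -> freq >= 0.15)     // drop the rare symbols
        |> Seq.sortBy (fun (_, freq) -> -freq)            // most frequent first
        |> Seq.map fst                                    // project to just the symbol
        |> Seq.toList                                     // re-package as an F# list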

     

    Collections

    The code uses the 4 common F#/.NET collections:

    • F# "List": Immutable linked list used for recursive algorithms which walk across a list
    • F# "Map": Immutable dictionary used to store the codes for each symbols
    • F# "Seq" = .NET "IEnumerable":  Primitive collection interface, used for the inputs
    • .NET "Array": Basic typed arrays used as the output of the encoding

    Note that it is easy and common to switch between these collections as needed, using functions like "List.toArray" and "Map.ofList".
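
    For instance (with made-up values, just to show the conversions), the same data can be moved between these collections in a line or two:

    let codes = [ ('a', [ true; false ]); ('b', [ false ]) ]   // F# list of pairs
    let table = codes |> Map.ofList                            // F# Map keyed by symbol
    let bitsForA = table |> Map.find 'a' |> List.toArray       // .NET array of bits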

     

    "It's a .NET component!" a.k.a. "The Slickness"

    When I wrote this code, I started out in "experimentation" mode. I just wrote a bunch of small functions at the top level and used F# Interactive to execute them as I went. But once I had something that seemed to work, I wanted to wrap it up in a class that could encapsulate all the functionality, and provide a nice .NET interface. Here's what I did:

    1. Tab everything in one level
    2. Wrap the code in:

      type HuffmanCoder(symbols : seq<char>, frequencies : seq<float>) = 

          // All the code goes here...

          member coder.Encode source = encode source
          member coder.Decode source = decode source

    I can't really overstate how amazing this is.  In F#, it is super-simple to smoothly transition from experimentation coding to component design coding.  This is something that F# can manage because it is a hybrid functional/OO language integrated carefully into .NET.  And because of this, it is really easy to build components of a larger .NET system in F#.  I heard this referred to once as "the slickness" of functional OO coding in F#, and I like that term, because whenever I do this it just feels so slick. :-)

    In case you are curious, here's what it looks like from C#:

        public class HuffmanCoder
        {
            public HuffmanCoder(IEnumerable<char> symbols, IEnumerable<double> frequencies);
    
            public string Decode(bool[] source);
            public bool[] Encode(string source);
        }

    Summary

    I found this code to be a good example of some of the things I really like about F#.  The algorithmic nature of the code lends itself well to F#'s syntax and programming style.  You can start off just writing a bunch of simple, composable functions, each just a half dozen lines of pattern matching and pipelining.  As you are doing this, you can use the F# Interactive to experiment with the code and the algorithms as you evolve them into the right functionality.  Then when they are working, you can smoothly wrap them up into a .NET component and expose it to the rest of your .NET consumers.

  • LukeH's WebLog

    C# is verbose. Verbose is good.

    • 10 Comments

    On ScottWil’s blog recently, Thomas Eyde commented on C#, saying “Basically I think C# is too static, complex and verbose. There [is] so much you have to type which does not add value.”  I am admittedly biased, but I think that C#’s verboseness is a good thing. 

     

    On the face of it, being able to express the same idea in fewer lines or characters of code sounds like a useful trait in a language.  But there are a couple of reasons why I don’t think this is as good of a goal as it sounds like.

     

    The first is that far more time is spent reading code than writing it.  If writing code in a particular language takes twice as many lines or characters, but this causes the reader to spend half as much time trying to understand it, the tradeoff is very likely worthwhile (since the code will be read more times than it was written).  This is the same argument that is made to explain why it’s important to use descriptive variable names.  Using variable names that don’t carry any connotations forces the reader to follow the entire flow of the variable through the code to understand what it represents.  On the other hand, variable names that carry connotations about their meaning can help a programmer convey information about the variable, so that the reader can just read a snippet of code and have a good idea what it is doing.

     

    The second is that tools, like intellisense, can help make the number of characters typed far less than the number of characters of code that are generated.  As a simple example of this, take the following block of code:

     

    foreach (Item<T> item in state.Items)

    {

      if (item.Index >= item.Production.Right.Count)

      {

        foreach (Terminal<T> follow in Follow(item.Production.Left))

        {

          ActionTable.Add(state, follow, new Reduce<T>(item.Production));

        }

      }

    }                                    (233 keystrokes)

     

    This can be written in the Visual Studio C# editor with approximately half as many keystrokes as produced characters, by typing the following sequence of characters.

     

    fore(It<T> item in sta.I)

    {

    if(it.I>=i.P.R.C)

    {

    f(Te<T> follow in Fol(it.P.L))

    {

    ActionT.A(s,follow,new Re<T>(i.P));

    }

    }

    }                                    (122 keystrokes)

     

    Or even better, you could take advantage of code snippets, and write the same code in even fewer keystrokes (“→” represents a tab):

     

    fore→→It<T>→item→sta.I→

    if→→it.I>=i.P.R.C→

    f→→Te<T>→follow→Fol(it.P.L)→

    ActionT.A(s,follow,new Re<T>(i.P));  (107 keystrokes)

     

    Of course, the characters typed in above are clearly unreadable, and so the presentation (and persistence) of these as the original C# program helps readability and maintenance of the code for future readers.  But the end result is that in relatively few keystrokes, a programmer can create code that is verbose enough to be easy to read by the next programmer who needs to modify it.  This gives a sort of “best of both worlds”, which feels like a good thing to me.  What do you think?

  • LukeH's WebLog

    F# in Silverlight

    • 9 Comments

    Over the last couple years, there has been an explosion of interest in Silverlight.  As a .NET-based runtime, it is possible to compile Silverlight applications with any .NET language, and we’ve seen a lot of F# developers using F# in Silverlight.  However, until recently this involved building an application using the desktop version of the F# runtime, which could result in some pitfalls and mixed levels of success.

    With the recent F# May CTP though, we now provide a Silverlight version of the F# runtime, FSharp.Core.dll, along with the F# release.  This enables building truly first-class Silverlight components using F#.

    To make this easier, I’ve posted some Silverlight F# project templates and samples on Code Gallery. 

    Download

    F# Templates and Samples for Silverlight

    Templates

    image

    Samples

    L-Systems

    Lindenmayer Systems are an interesting way of generating a variety of fractals using a simple set of rewrite rules.  Check out the fascinating book The Algorithmic Beauty of Plants for details.  The Silverlight application below uses an L-System rewriter and renderer written in F#.

    • F, G, A, B are drawn as a move forward.
    • + is a turn right.
    • - is a turn left.
    • X, Y and anything else are skipped during rendering.

    Launch the L-System demo 

    open System.Windows
    open System.Windows.Shapes
    
    let rec internal applyRulesInOrder rules c =
        match rules with
        | [] -> string c
        | rule::rules' -> 
            match rule c with 
            | None -> applyRulesInOrder rules' c
            | Some result -> result
    
    let internal step rules current =
        current
        |> String.collect (applyRulesInOrder rules)
    
    let internal rotate (x,y) theta =
        let x' = x * cos theta - y * sin theta
        let y' = x * sin theta + y * cos theta
        (x',y')
    
    let rec internal render (x,y) (dx,dy) angle points system =
        match system with 
        | [] -> (x,-y)::points
        | 'A'::system' | 'B'::system' | 'F'::system' | 'G'::system' -> 
            let x',y' = x+dx,y+dy
            render (x',y') (dx,dy) angle ((x,-y)::points) system'
        | '+'::system' -> 
            let (dx',dy') = rotate (dx,dy) angle
            render (x,y) (dx',dy') angle points system'
        | '-'::system' -> 
            let (dx',dy') = rotate (dx,dy) (-angle)
            render (x,y) (dx',dy') angle points system'
        | _::system' -> render (x,y) (dx,dy) angle points system'
    
    let rec internal applyN f n x =
        if n = 0 then x
        else f (applyN f (n-1) x)
    
    let internal normalize points =
        let minX = points |> Seq.map (fun (x,_) -> x) |> Seq.min
        let minY = points |> Seq.map (fun (_,y) -> y) |> Seq.min
        points |> List.map (fun (x,y) -> new Point(x-minX, y-minY))
    
    type LSystem(rulesString:string, start:string, angle:int, stepSize:int, n:int) =
        let expanded,isError =
            try 
                let rules =
                    rulesString.Split([|"\r";"\n"|], System.StringSplitOptions.RemoveEmptyEntries)
                    |> Array.map (fun line -> line.Split([|"->"|], System.StringSplitOptions.RemoveEmptyEntries))
                    |> Array.map (fun fromAndTo -> (fromAndTo.[0].[0], fromAndTo.[1]))
                let ruleFunctions = [ for (c, s) in rules -> fun x -> if x = c then Some s else None]
                applyN (step ruleFunctions) n start, false 
            with 
            | e -> "", true 
    
        member this.Render(polyline : Polyline) =
            let points = render (0.0,0.0) (float stepSize,0.0) (float angle * System.Math.PI / 180.0) [] (List.ofSeq expanded)
            for pt in normalize points do polyline.Points.Add(pt)
            isError
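
    For reference, a hypothetical use of the LSystem type above might look like the following (the Koch-style rule string and parameter values are illustrative, not taken from the sample):

    let koch = LSystem("F->F+F--F+F", "F", 60, 5, 4)
    let polyline = new Polyline()
    let hadError = koch.Render(polyline)   // fills polyline.Points and reports whether the rules failed to parse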
    
            

    Console Control

    A reusable Silverlight control providing a console emulation. The control exposes input and output streams akin to those on the System.Console class. Could be used to provide console input and output as part of a Silverlight application, or as a way to convert Windows Console apps to Silverlight apps.

    This sample hooks the Console up to a simple echo loop, along the lines of the sketch below.
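
    A minimal sketch of such an echo loop (the "startEchoLoop" function is a hypothetical helper, not part of the sample itself, and it assumes the SilverlightConsole control defined further down):

    // Echo each line typed into the console back to its output.
    // Reads block until input is available, so keep this off the UI thread.
    let startEchoLoop (console : SilverlightConsole) =
        async {
            while true do
                let line = console.In.ReadLine()
                console.Out.WriteLine("echo: " + line)
        } |> Async.Start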

    Launch the Console Control

    namespace System.Windows.Controls
    
    open System
    open System.IO
    open System.Windows
    open System.Windows.Controls
    open System.Windows.Input
    open SilverlightContrib.Utilities.ClipboardHelper
    open System.Text
    
    // A shared base implementation of Stream for 
    // use by the console input and output streams
    [<AbstractClass>]
    type private ConsoleStream(isRead) = 
        inherit Stream() 
        override this.CanRead = isRead 
        override this.CanWrite = not isRead 
        override this.CanSeek = false 
        override this.Position 
            with get() = raise (new NotSupportedException("Console stream does not have a position")) 
            and set(v) = raise (new NotSupportedException("Console stream does not have a position")) 
        override this.Length = raise (new NotSupportedException("Console stream does not have a length")) 
        override this.Flush() = () 
        override this.Seek(offset, origin) = raise (new NotSupportedException("Console stream cannot seek")) 
        override this.SetLength(v) = raise (new NotSupportedException("Console stream does not have a length")) 
    
    /// A control representing a Console window
    /// Provides an InputStream and OutputStream
    /// for reading and writing character input.
    /// Also supports copy/paste on some browsers
    type SilverlightConsole() as self = 
        inherit TextBox() 
    
        // The queue of user input which has been collected by the
        // console, but not yet read from the input stream
        let readQueue = new System.Collections.Generic.Queue<int>() 
    
        // A stream that reads characters from user input
        let inputStream = 
            { new ConsoleStream(true) with 
                override this.Write(buffer,offset,count) = 
                    raise (new NotSupportedException("Cannot write from Console input stream")) 
                override this.Read(buffer,offset,count) = 
                    do System.Diagnostics.Debug.WriteLine("Starting to read {0} bytes", count) 
                    let rec waitForAtLeastOneByte() = 
                        let shouldSleep = ref true 
                        let ret = ref [||] 
                        lock readQueue (fun () -> 
                            shouldSleep := readQueue.Count < 1 
                            if not !shouldSleep then 
                                let lengthToRead = min readQueue.Count count 
                                ret := Array.init lengthToRead (fun i -> byte (readQueue.Dequeue()))) 
                        if !shouldSleep then 
                            System.Threading.Thread.Sleep(100)
                            waitForAtLeastOneByte() 
                        else !ret 
                    let bytes = waitForAtLeastOneByte() 
                    System.Array.Copy(bytes, 0, buffer, offset, bytes.Length) 
                    do System.Diagnostics.Debug.WriteLine("Finished reading {0} bytes", bytes.Length) 
                    bytes.Length } 
    
        // A stream that sends character output onto the console screen
        let outputStream = 
            { new ConsoleStream(false) with 
                override this.Read(buffer,offset,count) = 
                    raise (new NotSupportedException("Cannot read from Console output stream")) 
                override this.Write(buffer,offset,count) = 
                    let isDelete = offset < 0 
                    let newText = if isDelete then "" else UnicodeEncoding.UTF8.GetString(buffer, offset, count) 
                    let _ = self.Dispatcher.BeginInvoke(fun () -> 
                        if isDelete then 
                            if self.Text.Length >= count then 
                                self.Text <- self.Text.Substring(0, self.Text.Length - count) 
                        else 
                            do self.Text <- self.Text + newText 
                        do self.SelectionStart <- self.Text.Length 
                        do self.SelectionLength <- 0) 
                    () } 
    
        let shiftNumbers = [|')';'!';'@';'#';'$';'%';'^';'&';'*';'('|] 
        let currentInputLine = new System.Collections.Generic.List<int>() 
    
        // Handles key down events
        // Processes the pressed key and turns it into console input
        // Also echoes the pressed key to the console
        let keyDownHandler(keyArgs : KeyEventArgs) = 
            try 
                do keyArgs.Handled <- true 
                let shiftDown = Keyboard.Modifiers &&& ModifierKeys.Shift <> (enum 0) 
                let ctrlDown = Keyboard.Modifiers &&& ModifierKeys.Control <> (enum 0) 
                let p = keyArgs.PlatformKeyCode 
                if ctrlDown || keyArgs.Key = Key.Ctrl then 
                    if keyArgs.Key = Key.V then 
                        lock currentInputLine (fun () -> 
                            let clipboard = new ClipboardHelper() 
                            let fromClipboard = clipboard.GetData() 
                            for c in fromClipboard do 
                                do currentInputLine.Add(int c) 
                                outputStream.WriteByte(byte c) 
                                if c = '\n' then 
                                    for i in currentInputLine do 
                                        do System.Diagnostics.Debug.WriteLine("Enqueued {0}", char i) 
                                        do readQueue.Enqueue(i) 
                                    do currentInputLine.Clear() ) 
                    elif keyArgs.Key = Key.C then 
                        let text = self.SelectedText 
                        let clipboard = new ClipboardHelper() 
                        clipboard.SetData(text) 
                else 
                    System.Diagnostics.Debug.WriteLine("Got key {0} {1} {2}", p, char p, keyArgs.Key) 
                    let ascii = 
                        match p with 
                        | n when n >= 65 && n <= 90 -> if shiftDown then p else p+32 
                        | n when n >= 48 && n <= 57 -> if shiftDown then int shiftNumbers.[p-48] else p 
                        | 8 -> 8 // backspace
                        | 13 -> int '\n'
                        | 32 -> int ' '
                        | 186 -> if shiftDown then int ':' else int ';'
                        | 187 -> if shiftDown then int '+' else int '='
                        | 188 -> if shiftDown then int '<' else int ','
                        | 189 -> if shiftDown then int '_' else int '-'
                        | 190 -> if shiftDown then int '>' else int '.'
                        | 191 -> if shiftDown then int '?' else int '/'
                        | 192 -> if shiftDown then int '~' else int '`'
                        | 219 -> if shiftDown then int '{' else int '['
                        | 220 -> if shiftDown then int '|' else int '\\'
                        | 221 -> if shiftDown then int '}' else int ']'
                        | 222 -> if shiftDown then int '\"' else int '\''
                        | _ -> -1 
                    if ascii = 8 then 
                        lock currentInputLine (fun () -> 
                            if currentInputLine.Count > 0 then 
                                currentInputLine.RemoveAt(currentInputLine.Count - 1) 
                                outputStream.Write([||], -1, 1) ) 
                    elif ascii > 0 then 
                        lock currentInputLine (fun () -> 
                            do currentInputLine.Add(ascii) 
                            outputStream.WriteByte(byte ascii) ) 
                        if ascii = int '\n' then 
                            lock currentInputLine (fun () -> 
                                for i in currentInputLine do 
                                    do System.Diagnostics.Debug.WriteLine("Enqueued {0}", char i) 
                                    if i = 10 then do readQueue.Enqueue(13) 
                                    do readQueue.Enqueue(i) 
                                do currentInputLine.Clear()) 
                    do self.SelectionStart <- self.Text.Length 
                    do self.SelectionLength <- 0 
            with 
            | e -> System.Diagnostics.Debug.WriteLine(e) 
    
        // Lazily initialized StreamReader/StreamWriter
        let outReader = lazy (new System.IO.StreamWriter(outputStream, Encoding.UTF8, 256, AutoFlush=true)) 
        let inReader = lazy (new System.IO.StreamReader(inputStream, Encoding.UTF8, false, 256)) 
    
        // Manually handle the Return key so we can accept newlines
        do self.AcceptsReturn <- true
        // Make sure a vertical scrollbar appears when needed
        do self.VerticalScrollBarVisibility <- ScrollBarVisibility.Auto 
        // Make the control read-only so that users cannot move the cursor or change the contents
        // Unfortunately, this also greys it out - ideally we could separate these two.
        do self.IsReadOnly <- true
        // Hookup the keyDownHandler
        do self.KeyDown.Add(keyDownHandler) 
    
        /// The raw input stream for the Console
        member this.InputStream = inputStream :> Stream 
        /// The raw output stream for the Console
        member this.OutputStream = outputStream :> Stream 
        /// A StreamWriter for writing to the Console
        member this.Out = outReader.Value 
        /// A StreamReader for reading from the Console
        member this.In = inReader.Value

    Summary

    Try out F# with Silverlight using the F# May CTP and the F# for Silverlight templates.

  • LukeH's WebLog

    Intro + C# Express

    • 8 Comments

    Hi, my name’s Luke and I am a program manager on the C# team at Microsoft.  I’ve wanted to start up a blog for awhile now, so here goes...

    I’m pretty new to Microsoft, but since I’ve been here I’ve had a chance to be involved in some really fun work.  I’m on the C# team, and in particular the C# IDE team.  That means I get to work a lot with Cyrus, Anson, Joe, JayBaz, and all the other folks on the IDE team.  It also means I get to work on some really interesting projects.

    The project that I’ve been working on for most of my time at Microsoft is the Visual C# Express Edition.  So as you can probably guess, I’m really excited that the betas of the Express editions are now available for download.  It’s been really great seeing all the feedback we’ve already received on the Express idea after less than 48 hours.  It sounds like people are really excited about the Express products.

     

    So go grab an Express product, and let us know what you think!

  • LukeH's WebLog

    C# Express RSS Screensaver

    • 7 Comments

    One of the cool projects I’ve worked on as part of working on C# Express is the “starter kit” that we added to the C# Express product.  Dan and Shaykat have both blogged about the starter kit, but I’ll try to go into some more detail about what these starter kits are.

     

    What is a starter kit?

    This is a question we get all the time.  The answer is that starter kits are like samples, but with better documentation and with clear ideas for extending the application to add new features.  They can be used as just a cool application, as an example of how to use some new language features (generics) and .NET framework APIs (strongly typed resources), or as a starting point for playing around with a small application.  They are also integrated closely into the environment by being added to the File->New->Project… dialog box.

     

    What is the RSS Screensaver starter kit?

    The starter kit we have in the C# Express beta is an RSS Screensaver.  Dan has some great screenshots here.  The documentation that’s included with the starter kit has lots of information about how it works.  There is also a section at the end with some possible ways to extend the screensaver.

     

    What is cool about the RSS Screensaver starter kit?

    1)      It’s a fully functional screensaver.  You can set it to be your screen saver so you can always be up to date on the latest C# info!

    2)      You can use the source to learn about C#, the .NET frameworks and how to use the Visual Studio environment.

    3)      You can learn about lots of neat GDI+ tricks, like using transparency and writing formatted text to the screen.

    4)      It has a set of classes included that can parse RSS feeds.  You can use these as a starting place for writing your own applications that use RSS feeds.

     

    How do I get the RSS Screensaver starter kit?

    Download the C# Express Beta from here, and look in the File->New->Projects… dialog box.

     

    How do I report a bug or a suggestion for the starter kit?

    You can use the MSDN Product Feedback Center to report any bugs or suggestions you have.  You can also post your experiences here.  And if you have extended the screen saver in a cool way, tell us about it.

  • LukeH's WebLog

    F# on Windows Azure

    • 5 Comments

    Windows Azure was announced yesterday, and along with it, the first CTP of the SDK and Visual Studio tools.  If you haven’t yet tried it, go take a look.  On top of serving as a hosting service for web applications, Azure also provides a really simple way to do distributed compute and storage in the cloud.

    Azure supports running .NET applications, which means you can build Azure worker roles using F#! The tools released with Azure don’t have F# support out of the box though, so I’ve posted a few simple templates and samples up on Code Gallery.

    Download

    F# Templates and Samples for Windows Azure

    Templates

    image

    Cloud WebCrawl Sample

    namespace SearchEngine_WorkerRole
    
    open System
    open System.Threading
    open Microsoft.ServiceHosting.ServiceRuntime
    open System.Net
    open System.IO
    open System.Text.RegularExpressions
    open Microsoft.Samples.ServiceHosting.StorageClient;
    open System.Web
    open System.Runtime.Serialization.Formatters.Binary
    
    type WorkerRole() =
        inherit RoleEntryPoint()
    
        // The page to start crawling from
        let startpage = @"http://blogs.msdn.com/lukeh"
        // The filter to apply to links while crawling
        let pageFilter = fun (url:string) -> url.StartsWith("http://blogs.msdn.com/")
    
        /// Get the contents of a given url
        let http(url: string) = 
            let req    = WebRequest.Create(url) 
            use resp   = req.GetResponse()
            use stream = resp.GetResponseStream() 
            use reader = new StreamReader(stream) 
            let html   = reader.ReadToEnd()
            html
    
        /// Get the links from a page of HTML
        let linkPat = "href=\s*\"[^\"h]*(http://[^&\"]*)\""
        let getLinks text =  [ for m in Regex.Matches(text,linkPat)  -> m.Groups.Item(1).Value ]
        
        /// Handle the message msg using the given queue and blob container
        let HandleMessage (msg : Message) (queue : MessageQueue, container: BlobContainer) =
            // There was a new item, get the contents
            let url = msg.ContentAsString();
            let urlBlobName = HttpUtility.UrlEncode(url)
            // Don't get the page if we've already seen it
            if not(container.DoesBlobExist(urlBlobName)) 
            then
                do RoleManager.WriteToLog("Information", String.Format("Handling new url: '{0}'", url));
                try
                    // Get the contents of the page
                    let content = http url
                    // Store the page into the blob store
                    let props = new BlobProperties(urlBlobName)
                    let _ = container.CreateBlob(props, new BlobContents(System.Text.UTF8Encoding.Default.GetBytes(content)), true);
                    
                    // Get the links from the page
                    let links = getLinks content
                    
                    // Filter down the links and then create a new work item for each
                    links
                    |> Seq.filter pageFilter
                    |> Seq.distinct
                    |> Seq.filter (fun link -> not(container.DoesBlobExist(HttpUtility.UrlEncode(link))))
                    |> Seq.iter (fun link -> queue.PutMessage(new Message(link)) |> ignore)
                    queue.DeleteMessage(msg) |> ignore
                with
                | _ ->()
        
        /// Main loop of worker process
        let rec Loop (queue : MessageQueue, container: BlobContainer) = 
            // Get the next page to crawl from the queue
            let msg = queue.GetMessage(240);
            if msg = null
            then Thread.Sleep(1000)
            else HandleMessage msg (queue, container)
            Loop(queue,container)
        
        override wp.Start() =
            // Initialize the Blob storage
            let blobStorage = BlobStorage.Create(StorageAccountInfo.GetDefaultBlobStorageAccountFromConfiguration());
            let container = blobStorage.GetBlobContainer("searchengine");
            let a = container.CreateContainer(null, ContainerAccessControl.Public);
    
            // Initialize the Queue storage
            let queueStorage = QueueStorage.Create(StorageAccountInfo.GetDefaultQueueStorageAccountFromConfiguration());
            let queue = queueStorage.GetQueue("searchworker");
            let b = queue.CreateQueue()
            
            // Put an initial message in the queue, using the start page
            let c = queue.PutMessage(new Message(startpage));
            
            // Begin the main loop, processing messages in the queue
            Loop(queue, container)
            
        override wp.GetHealthStatus() = RoleStatus.Healthy 

    Worker Roles

    The code above defines the implementation of a Worker Role – a process which runs in the background, waiting for work to do, and then processing these work requests.  The worker role is set to run 4 instances simultaneously, which means that there will be 4 instances of this worker processing work items as they come in.  This gives an implicit parallelism – in fact, the initial release of Azure will run one process per core, so you really are getting effective parallelism this way.  Notice also that this requires that the worker processes are inherently stateless.  Both aspects make typical functional design approaches that are common in F# natural for developing these worker roles.

    Queues and Blobs

    This sample uses two of the three storage services provided by Windows Azure.  The queue storage holds the work items. The blob storage holds the pages visited during the web crawl.  When an instance of the worker role Starts, it connects to the blob store and the queue store, then puts an initial work item in the queue and goes into a loop processing work items out of the queue.

    Conclusion

    Ideas for any other interesting F# applications on Windows Azure?  Download the templates and samples.
  • LukeH's WebLog

    At TechEd Orlando

    • 4 Comments

    I just arrived in Orlando for TechEd 2007.  I'll be presenting later on in the week on C#3.0 - but until then, I get to spend some time talking to folks at the C# booth and attending a couple of the C#, LINQ and Visual Studio sessions to see what people are saying.

    If you are at TechEd this year - come by some of our C# and LINQ sessions - starting with the LINQ Overview talk Tuesday at 8:30AM.  And definitely stop by the C# booth to say "Hi"!

    If you aren't at TechEd this year - there are actually a number of interesting public live webcasts of TechEd presentations which will be available at Virtual TechEd.  In particular, the LINQ Overview presentation will be webcast live.

  • LukeH's WebLog

    F# Scaling from Explorative to .NET Component - F# Talk @ TechEd 2010

    • 3 Comments

    Last Thursday I gave an F# talk at TechEd New Orleans titled "F# in Visual Studio 2010".  The talk aimed to highlight a few ideas:

    • F# enables exciting new developer scenarios in Visual Studio (explorative bottom-up programming in the F# Interactive)
    • F# can leverage many/most of the powerful developer tools in Visual Studio (projects, debugging, editing, help integration, etc.) and .NET (produce and consume vanilla .NET APIs)
    • F# has unique strengths in key target domains and extends the .NET platform to these audiences
    • F# scales smoothly from explorative to .NET component development (start with ideas and iteratively and smoothly add structure as methods, types and projects)

    The video of the talk is available now on TechEd Online, and the demo code is below for those interested in taking a look. 

    File iconTechEd2010Demos.zip

    Those who read my blog may recognize a number of the examples as things I've posted about in the past.   The F# Silverlight apps used in the demos are also available in those previous posts.

    There was one theme that seemed to resonate especially well - here are a few more thoughts on the idea of scaling from explorative coding to component building, from the perspective of using the "tab" refactoring in F#.

    "tab" as Extract Method (or, more generally, Encapsulate)

    Indentation is meaningful in F#, and is used to indicate nesting.  This, along with type inference and a strong uniformity of syntax at different levels of nesting, enables a simple but really fundamental shift to how you can think about encapsulation in F#. 

    An example. I start by solving a problem by writing code directly at the top level.  I don't know what shape the solution has to take yet, but I leverage the F# Interactive to prototype and explore the solution.

     

    let url = "http://www.yahoo.com"
    let req    = WebRequest.Create(Uri(url))
    let stream = req.GetResponse().GetResponseStream()
    let reader = new StreamReader(stream)
    let html   = reader.ReadToEnd()
    
    html.Length

    Notice that because we're in the F# Interactive, we tend to have some sample data we want to experiment with bound in our environment - like "url" above.

    Once we establish that this code does what we want - we can "give it a name" - using a simple and local set of changes:

    let url = "http://www.yahoo.com"
    
    let http url = 
        let req    = WebRequest.Create(Uri(url))
        let stream = req.GetResponse().GetResponseStream()
        let reader = new StreamReader(stream)
        let html   = reader.ReadToEnd()
        html
    
    let html = http url
    html.Length
    

    We just hit "tab", and then wrote out the signature for the new function.  Note that we were able to "parameterize over" any of the names used in this piece of code that we wanted to be variable in our abstraction.  So in this case, we parameterize over "url".  Any later dependencies on the results of the original code can be left intact by re-binding the name "html" to a call to the new function - thereby making this a truly local code change.

    Notice that a key to this was the uniformity of "let" between the top level and the function body scope. 

    Later in my coding I may end up with something like this:

    let username = "abc"
    let password = "def"
        
    let http url = 
        let credentials = NetworkCredential(username, password)
        let req    = WebRequest.Create(Uri(url), Credentials=credentials)
        let stream = req.GetResponse().GetResponseStream()
        let reader = new StreamReader(stream)
        let html   = reader.ReadToEnd()
        html
        
    let homeTimeline() = http "http://api.twitter.com/statuses/home_timeline.xml"
    let publicTimeline() = http "http://api.twitter.com/statuses/public_timeline.xml"
    let userTimeline() = http "http://api.twitter.com/statuses/user_timeline.xml"
    
    

    At this point, I have some interesting functionality - and may want to introduce a true component to encapsulate the implementation and expose an intentionally designed API.  OO is an excellent and proven tool for component design, and is a key part of F# programming.  It is also very easy to smoothly transition from simple top level coding into F# component design with OO.  And again - it's mostly just about "tab".

    let username = "abc"
    let password = "def"
    
    type Twitter(username, password) =     
        let http url = 
            let credentials = NetworkCredential(username, password)
            let req    = WebRequest.Create(Uri(url), Credentials=credentials)
            let stream = req.GetResponse().GetResponseStream()
            let reader = new StreamReader(stream)
            let html   = reader.ReadToEnd()
            html
        
        let homeTimeline() = http "http://api.twitter.com/statuses/home_timeline.xml"
        let userTimeline() = http "http://api.twitter.com/statuses/user_timeline.xml"
        let publicTimeline() = http "http://api.twitter.com/statuses/public_timeline.xml"
        
        member this.HomeTimeline() = homeTimeline()
        member this.UserTimeline() = userTimeline()
        member this.PublicTimeline() = publicTimeline()
    
    let twitter = new Twitter(username, password)
    let homeTimeline = twitter.HomeTimeline()
    let userTimeline = twitter.UserTimeline()
    let publicTimeline = twitter.PublicTimeline()

    Notice how similar this transition was to the change we needed to give the "http" function a name.  In this case, we just tab in all the code except the things we want to parameterize over, give the signature of the type and its constructor (again, the things we want to parameterize over), and then define the API we want to expose from the functionality as "member" definitions.

    So we've now introduced a true .NET class with the API we want.  The last step is to transition from the explorative environment to a component.  We can just take the code above - strip off the sample data that was just there to experiment in the F# Interactive - and compile into a .NET assembly:

    namespace TwitterAPI
    
    open System
    open System.Net
    open System.IO
    
    type Twitter(username : string, password : string) =     
        let http url = 
            let credentials = NetworkCredential(username, password)
            let req    = WebRequest.Create(Uri(url), Credentials=credentials)
            let stream = req.GetResponse().GetResponseStream()
            let reader = new StreamReader(stream)
            let html   = reader.ReadToEnd()
            html
        
        let homeTimeline() = http "http://api.twitter.com/statuses/home_timeline.xml"
        let userTimeline() = http "http://api.twitter.com/statuses/user_timeline.xml"
        let publicTimeline() = http "http://api.twitter.com/statuses/public_timeline.xml"
        
        member this.HomeTimeline() = homeTimeline()
        member this.UserTimeline() = userTimeline()
        member this.PublicTimeline() = publicTimeline()
    
    

    image

    And then we can use this from C# (or VB, or any other .NET language):

    using System;
    using TwitterAPI;
    
    class Program
    {
        static void Main(string[] args)
        {
            Twitter twitter = new Twitter("abc", "def");
            string homeTimeline = twitter.HomeTimeline();
            Console.WriteLine(homeTimeline);
        }
    }
    
    

    Summary

    The TechEd talk covered a number of interesting aspects of using F# in Visual Studio - tools, core syntax, some examples of current commercial users, and the physical structure of F# projects used in client applications.  But the theme of scaling from explorative programming to .NET component is one that I find particularly powerful, and which is a really unique strength of the F# language.

     

    File iconTechEd2010Demos.zip

  • LukeH's WebLog

    Twitter OAuth in F#

    • 3 Comments

    I have a few F# demos which use the Twitter APIs as simple examples of accessing online data interactively and working with it in F#.    Recently, Twitter moved to require OAuth for accessing Twitter APIs on behalf of a user.  Below is the F# code I wrote to integrate OAuth, which should work for any other F# Twitter scripts and apps.

    OAuth Implementation
    open System
    open System.IO
    open System.Net
    open System.Security.Cryptography
    open System.Text
    
    // Twitter OAuth Constants
    let consumerKey : string = failwith "Must provide the consumerKey for an app registered at http://dev.twitter.com/apps/new"
    let consumerSecret : string = failwith "Must provide the consumerSecret for an app registered at http://dev.twitter.com/apps/new"
    let requestTokenURI = "https://api.twitter.com/oauth/request_token"
    let accessTokenURI = "https://api.twitter.com/oauth/access_token"
    let authorizeURI = "https://api.twitter.com/oauth/authorize"
    
    
    // Utilities
    let unreservedChars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_.~";
    let urlEncode str = 
        String.init (String.length str) (fun i -> 
            let symbol = str.[i]
            if unreservedChars.IndexOf(symbol) = -1 then
                "%" + String.Format("{0:X2}", int symbol)
            else
                string symbol)
    
    
    // Core Algorithms
    let hmacsha1 signingKey str = 
        let converter = new HMACSHA1(Encoding.ASCII.GetBytes(signingKey : string))
        let inBytes = Encoding.ASCII.GetBytes(str : string)
        let outBytes = converter.ComputeHash(inBytes)
        Convert.ToBase64String(outBytes)
    
    let compositeSigningKey consumerSecret tokenSecret = 
        urlEncode(consumerSecret) + "&" + urlEncode(tokenSecret)
    
    let baseString httpMethod baseUri queryParameters = 
        httpMethod + "&" + 
        urlEncode(baseUri) + "&" +
        (queryParameters 
         |> Seq.sortBy (fun (k,v) -> k)
         |> Seq.map (fun (k,v) -> urlEncode(k)+"%3D"+urlEncode(v))
         |> String.concat "%26") 
    
    let createAuthorizeHeader queryParameters = 
        let headerValue = 
            "OAuth " + 
            (queryParameters
             |> Seq.map (fun (k,v) -> urlEncode(k)+"\x3D\""+urlEncode(v)+"\"")
             |> String.concat ",")
        headerValue
    
    let currentUnixTime() = floor (DateTime.UtcNow - DateTime(1970, 1, 1, 0, 0, 0, 0)).TotalSeconds
    
    
    /// Request a token from Twitter and return:
    ///  oauth_token, oauth_token_secret, oauth_callback_confirmed
    let requestToken() = 
        let signingKey = compositeSigningKey consumerSecret ""
    
        let queryParameters = 
            ["oauth_callback", "oob";
             "oauth_consumer_key", consumerKey;
             "oauth_nonce", System.Guid.NewGuid().ToString().Substring(24);
             "oauth_signature_method", "HMAC-SHA1";
             "oauth_timestamp", currentUnixTime().ToString();
             "oauth_version", "1.0"]
    
        let signingString = baseString "POST" requestTokenURI queryParameters
        let oauth_signature = hmacsha1 signingKey signingString
    
        let realQueryParameters = ("oauth_signature", oauth_signature)::queryParameters
    
        let req = WebRequest.Create(requestTokenURI, Method="POST")
        let headerValue = createAuthorizeHeader realQueryParameters
        req.Headers.Add(HttpRequestHeader.Authorization, headerValue)
        
        let resp = req.GetResponse()
        let stream = resp.GetResponseStream()
        let txt = (new StreamReader(stream)).ReadToEnd()
        
        let parts = txt.Split('&')
        (parts.[0].Split('=').[1],
         parts.[1].Split('=').[1],
         parts.[2].Split('=').[1] = "true")
    
    /// Get an access token from Twitter and returns:
    ///   oauth_token, oauth_token_secret
    let accessToken token tokenSecret verifier =
        let signingKey = compositeSigningKey consumerSecret tokenSecret
    
        let queryParameters = 
            ["oauth_consumer_key", consumerKey;
             "oauth_nonce", System.Guid.NewGuid().ToString().Substring(24);
             "oauth_signature_method", "HMAC-SHA1";
             "oauth_token", token;
             "oauth_timestamp", currentUnixTime().ToString();
             "oauth_verifier", verifier;
             "oauth_version", "1.0"]
    
        let signingString = baseString "POST" accessTokenURI queryParameters
        let oauth_signature = hmacsha1 signingKey signingString
        
        let realQueryParameters = ("oauth_signature", oauth_signature)::queryParameters
        
        let req = WebRequest.Create(accessTokenURI, Method="POST")
        let headerValue = createAuthorizeHeader realQueryParameters
        req.Headers.Add(HttpRequestHeader.Authorization, headerValue)
        
        let resp = req.GetResponse()
        let stream = resp.GetResponseStream()
        let txt = (new StreamReader(stream)).ReadToEnd()
        
        let parts = txt.Split('&')
        (parts.[0].Split('=').[1],
         parts.[1].Split('=').[1])
    
    /// Compute the 'Authorization' header for the given request data
    let authHeaderAfterAuthenticated url httpMethod token tokenSecret queryParams = 
        let signingKey = compositeSigningKey consumerSecret tokenSecret
    
        let queryParameters = 
                ["oauth_consumer_key", consumerKey;
                 "oauth_nonce", System.Guid.NewGuid().ToString().Substring(24);
                 "oauth_signature_method", "HMAC-SHA1";
                 "oauth_token", token;
                 "oauth_timestamp", currentUnixTime().ToString();
                 "oauth_version", "1.0"]
    
        let signingQueryParameters = 
            List.append queryParameters queryParams
    
        let signingString = baseString httpMethod url signingQueryParameters
        let oauth_signature = hmacsha1 signingKey signingString
        let realQueryParameters = ("oauth_signature", oauth_signature)::queryParameters
        let headerValue = createAuthorizeHeader realQueryParameters
        headerValue
    
    /// Add an Authorization header to an existing WebRequest 
    let addAuthHeaderForUser (webRequest : WebRequest) token tokenSecret queryParams = 
        let url = webRequest.RequestUri.ToString()
        let httpMethod = webRequest.Method
        let header = authHeaderAfterAuthenticated url httpMethod token tokenSecret queryParams
        webRequest.Headers.Add(HttpRequestHeader.Authorization, header)
    
    type System.Net.WebRequest with
        /// Add an Authorization header to the WebRequest for the provided user authorization tokens and query parameters
        member this.AddOAuthHeader(userToken, userTokenSecret, queryParams) =
            addAuthHeaderForUser this userToken userTokenSecret queryParams
    
    
    
    
    let testing() =   
    
        // Compute URL to send user to to allow our app to connect with their credentials,
        // then open the browser to have them accept
        let oauth_token'', oauth_token_secret'', oauth_callback_confirmed = requestToken()
        let url = authorizeURI + "?oauth_token=" + oauth_token''
        System.Diagnostics.Process.Start("iexplore.exe", url)
        
        // *******NOTE********:
        // Get the 7 digit number from the web page, pass it to the function below to get oauth_token
        // Sample result if things go okay:
        //    val oauth_token_secret' : string = "lbll17CpNUlSt0FbfMKyfrzBwyHUrRrY8Ge2rEhs"
        //    val oauth_token' : string = "17033946-8vqDO7foX0TUNpqNg9MyJpO3Qui3nunkZPixxLs"
        let oauth_token, oauth_token_secret = accessToken oauth_token'' oauth_token_secret'' ("1579754")
    
    
        // Test 1:
        let streamSampleUrl2 = "http://api.twitter.com/1/statuses/home_timeline.xml"
        let req = WebRequest.Create(streamSampleUrl2) 
        req.AddOAuthHeader(oauth_token, oauth_token_secret, [])
        let resp = req.GetResponse()
        let strm = resp.GetResponseStream()
        let text = (new StreamReader(strm)).ReadToEnd()
        text
    
        // Test 2:
        System.Net.ServicePointManager.Expect100Continue <- false
        let statusUrl = "http://twitter.com/statuses/update.xml" 
        let request = WebRequest.Create (statusUrl, Method="POST") 
        let tweet = urlEncode("Hello!") 
        request.AddOAuthHeader(oauth_token,oauth_token_secret,["status",tweet]) 
        let bodyStream = request.GetRequestStream() 
        let bodyWriter = new StreamWriter(bodyStream) 
        bodyWriter.Write("status=" + tweet) 
        bodyWriter.Close() 
        let resp = request.GetResponse() 
        let strm = resp.GetResponseStream() 
        let text = (new StreamReader(strm)).ReadToEnd() 
        text
  • LukeH's WebLog

    Benoit Mandelbrot

    • 3 Comments

    I just read that earlier today, Benoît Mandelbrot passed away.  Mandelbrot’s most famous discovery, the Mandelbrot fractal, played an important early role in my interest in both programming and mathematics, and I’ve revisited Mandelbrot fractals many times as programming samples for new languages or frameworks I’ve looked at.  I was lucky to have an opportunity to hear Mandelbrot speak in person a few years ago when he stopped by the Microsoft campus on a book tour for his book The Misbehaviour of Markets: A Fractal View of Financial Turbulence.

    In honor of Mandelbrot, here are two very simple takes on the Mandelbrot fractal that I’ve written in the last couple years.

    Mandelbrot in the F# Interactive

    One of the first pieces of F# code I wrote is below.  I blogged about this here.

    image  
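
    The code in that post was shown as an image; purely as an illustration (and not the original code), a minimal Mandelbrot test to run in the F# Interactive might look something like this:

    /// Iterate z <- z*z + c and report whether the orbit stays bounded
    let mandelbrot maxIter cr ci =
        let rec loop i zr zi =
            if i >= maxIter then true
            elif zr * zr + zi * zi > 4.0 then false
            else loop (i + 1) (zr * zr - zi * zi + cr) (2.0 * zr * zi + ci)
        loop 0 0.0 0.0
    
    // Render a coarse ASCII view of the set
    for y in -12 .. 12 do
        for x in -39 .. 39 do
            printf "%s" (if mandelbrot 50 (float x / 20.0 - 0.5) (float y / 10.0) then "*" else " ")
        printfn ""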

    Mandelbrot in 175 characters of Javascript

    Inspired by the js1k.com competition earlier this year, I adapted Ken Perlin’s very cool C program to Javascript as a 175 character program for rendering a Mandelbrot fractal in any browser.  Couldn’t quite get it to fit in a single tweet though (rule #6 of the js1k challenge).

    y=-16;s="<pre>";while(s+='\n',y++<15)for(x=0;x++<84;s+=(" .:-;!/>)|&IH%*#"[k&15]))for(i=k=r=0;j=r*r-i*i-2+x/25,i=2*r*i+y/10,j*j+i*i<11&&k++<111;r=j);document.body.innerHTML=s

     clip_image001

    Check it out here for the full page source.

    [UPDATE:  And here's a slightly more fully featured implementation, using Javascript and HTML5 canvas. ] 

  • LukeH's WebLog

    F# Session at TechEd Israel

    • 3 Comments

    I've had the opportunity to do a lot of F# presentations recently, which has been really great.  Over the last few months, I've done presentations at the Financial Services Developer Conference in New York, at the Lang.NET conference in Redmond, and at a few internal "brown bag"s for teams in Microsoft.

    Most recently though, I was in Israel last week for TechEd Israel 2008 - a great 3-day conference with more than 3000 attendees, held in sunny Eilat, Israel.  I presented an "Introduction to F#" session at the conference, which drew a packed room and some great questions and discussion during and after the talk.

    Here are the slides and demo I presented for the TechEd Israel session:

    File iconDev341.pptx     File iconFSharpDirectX.zip

     

  • LukeH's WebLog

    Standard Deviation and Event-based Programming

    • 3 Comments

    A couple weeks ago, I had the opportunity to do a really fun presentation on F# at the .NET Meetup in New York.  At the end, I got a great question, which I talked about a little at the presentation, but thought I’d talk about further in a blog post.

    The following isn’t exactly how the conversation went - but it roughly captures what we talked about, and covers a lot of fun F# topics.  There are some allusions here to ideas from statistics, time-series analysis, CEP and FRP.

    “How would you compute standard deviation over a stream of data in F#?”

    Standard deviation is a pretty straightforward computation with F#.  There are a few ways to write it, but with “average” and “averageBy” built-in, they provide a particularly simple option.
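
    For reference, the quantity being computed here is the population standard deviation,

        \sigma = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2 }

    which is exactly the shape of the code below: take the mean, average the squared deviations, and take the square root.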

    /// Square a number
    let sqr x = x * x
    
    /// Compute the standard deviation of a list of numbers
    let stddev nums = 
        let mean = nums |> List.average
        let variance = nums |> List.averageBy (fun x -> sqr(x - mean))
        sqrt(variance)
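
    As a quick sanity check, the list below has mean 5.0, so the (population) standard deviation comes out to exactly 2.0:

    // Quick check of stddev on a small list: mean is 5.0, stddev is 2.0
    let check = stddev [2.; 4.; 4.; 4.; 5.; 5.; 7.; 9.]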

    “Okay – but what if I want the inputs to be long, potentially infinite sources of data?”

    The previous function worked over lists.  We can change it to work over sequences (IEnumerables) instead.  Pleasantly, it’s basically the same code.  But the result of working over sequences is that this function will work with any sort of data.  It could be a sequence of numbers that result from a LINQ to SQL query, it could be a sequence of numbers read lazily from a very long file, etc. 

    /// Compute the standard deviation of a sequence of numbers
    let stddevSeq numSeq = 
        let mean = 
            numSeq |> Seq.average
        let variance = 
            numSeq |> Seq.averageBy (fun x -> sqr(x - mean))
        sqrt(variance)
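
    For instance, here’s a sketch of using it over numbers read lazily from a file, one number per line (“data.txt” is just an assumed file name):

    open System.IO

    // Numbers are parsed lazily as the file is read; note the sequence is
    // enumerated twice (once for the mean, once for the variance)
    let fileStdDev =
        File.ReadLines("data.txt")
        |> Seq.map float
        |> stddevSeq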

    “That’s computing what I want, but can I get all the partial results as my stream of data is processed? ”

    This makes things more interesting.  Instead of just computing the standard deviation, we really want to do an “online” calculation, where we keep just enough information so that as each new datapoint comes in, we can easily produce the new standard deviation.  This is a different algorithmic approach to the same mathematical formula.  Luckily, Wikipedia has what we need - an on-line algorithm for standard deviation from Knuth.

    /// Compute the running mean and standard deviation
    /// over a sequence of numbers
    let meanAndStdDev nums = 
        // A function which updates the on-line computation of 
        // n, mean and M2 given a new datapoint x
        let newValues (n, mean, M2) x = 
            let n' = n+1.
            let delta = x - mean
            let mean' = mean + delta/n'
            let M2' = M2 + delta*(x - mean')
            (n', mean', M2')
        // The initial values of n, mean and M2
        let startValues = (0., 0., 0.)
        // The result is a scan over the input sequence
        nums 
        |> Seq.scan newValues startValues
        |> Seq.map (fun (n, mean, M2) -> (mean, sqrt(M2/(n))))

    This time the code has a few interesting features:

    1. A local function “newValues” updates n, mean and M2 based on old values and a new datapoint x.  Notice that (a) this is strikingly similar to the pseudo-code in the Wikipedia article but (b) this is a functional implementation, so instead of changing the values, we just compute new ones.
    2. The function “Seq.scan” takes care of the heavy lifting.  This function, and the related “fold” and “reduce”, can be used to capture many common design patterns seen in “programming in the small”.  Scan walks across an input collection, applying a function to its current accumulator argument, and returning a collection of the results that it builds up as it goes.  If you haven’t written code using these before, they can feel uncomfortable at first, but after you’ve used each of these once or twice, you’ll notice them as common patterns that you are currently encoding as for loops, if statements and local state in your programs.  “scan” and its related functions give us the ability to name that design pattern and then re-use it – resulting in more declarative code.  A small usage example follows this list.
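
    Here’s a small usage sketch, feeding in a handful of samples and printing each partial result.  Seq.scan also yields the initial accumulator state, so we skip that first element:

    [3.; 7.; 7.; 19.]
    |> meanAndStdDev
    |> Seq.skip 1
    |> Seq.iter (fun (mean, stddev) -> printfn "Mean = %A, StdDev = %A" mean stddev)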

    “Perfect!  But my actual stream of data (stock ticker prices) is long running, and will provide new data points periodically over minutes, days and weeks.”

    Another fun twist on the problem!  Sequences (IEnumerables) have the property that the producer of the data offers up a way to ask for more, and the consumer is in charge of pulling it out.  While it’s doing so, it’s busy the whole time, so if the producer  doesn’t have the data yet, it will have to block waiting for it to be available.  This is not what you want in the long running case, where instead we want to think about the data stream as a sequence of events.  With events, the consumer offers up a way to deal with new data whenever it’s available, and the producer is in charge of pushing the new data points out.  For the same reasons that there is so much interest in smart phones with push email, we should all want our long-running data streams to be event based instead of sequence based.

    let boundedMouseMax = 
        form.MouseMove 
        |> Event.filter (fun mea -> mea.X > 10 && mea.Y > 10 && mea.X < 90 && mea.Y < 90)
        |> Event.map (fun mea -> max mea.X mea.Y)
     
    do boundedMouseMax |> Event.add (fun maxCoord -> printfn "max is: %A" maxCoord)

    Events in F# are just standard .NET events.  But F# allows you to program with these events in a “first-class” way, meaning you can pass them around as data, and there are functions in the “Event” module for taking events and generating new events from them.  For instance, we could take a mouse move event and call “Event.filter” to filter out any time it is outside a bounding box, then “Event.map” to pick the larger of the X or Y coordinate of the mouse.  The result is then a new event which will fire only when the mouse is in the bounding box, and will produce the larger of its X and Y coordinates as its value. We can then hook up a listener to print out the values whenever this new event fires.

    But how do we use this to solve the original problem?

    /// Compute a running mean and standard deviation over the 
    /// input event stream of floating point numbers
    let meanAndStdDev nums = 
        // A function which updates the on-line computation of n, mean and M2
        let newValues (n, mean, M2) x = 
            let n' = n+1.
            let delta = x - mean
            let mean' = mean + delta/n'
            let M2' = M2 + delta*(x - mean')
            (n', mean', M2')
        // The initial values of n, mean and M2
        let startValues = (0., 0., 0.)
        // The resulting event stream is a scan over the input events 
        nums 
        |> Event.scan newValues startValues
        |> Event.map (fun (n, mean, M2) -> (mean, sqrt(M2/(n))))

    It’s almost identical to the “on-line” algorithm over sequences! This is one of the important benefits of writing more declarative code using constructs like “scan”.  Although the implementations of “scan” over sequences and over events are quite different under the hood – the design pattern of scanning makes sense over both, and is the concept we can program with to tackle the problem at a higher-level.  It really makes you feel that we’ve succeeded in expressing the “what” in place of the “how”.

    Here’s what it looks like to use this new function we’ve created:

    // Create a new event – we would hook up to the stock ticker events if they were available
    let numEvent = new Event<float>()
    let numEventStream, fireNewNum = numEvent.Publish, numEvent.Trigger
    
    // Derive a new event that computes the running mean and stddev 
    // over the original event stream
    let stddevEvents = meanAndStdDev numEventStream
    
    // Hook up an event handler to print out the new mean and standard deviation 
    do stddevEvents |> Event.add (fun (mean, stddev) -> printfn "Mean = %A, StdDev = %A" mean stddev)
    
    do fireNewNum 3.
    do fireNewNum 7.
    do fireNewNum 7.
    do fireNewNum 19.

    Conclusion

    This is a fun example that really shows off why it’s so valuable to use higher-level design patterns for programming in the small – like “map”, “average”, “averageBy” and “scan”.  First-class events in F# are a unifying feature which allows you to take the same kind of programming you use with sequences and apply it to events as well.  This is really compelling, because it’s a case where two data sources which are conceptually very similar are often programmed in very different ways – but with the higher-level programming models that functional programming and F# provide, you can program against both using the same techniques.

  • LukeH's WebLog

    ICFP Programming Contest

    • 3 Comments

    This year's installment of the ICFP Programming Contest is coming up this weekend.  For those who haven't had the chance to try out this programming contest before, I definitely recommend it.  I've done the contest 3 of the last 5 years, and each time has been an amazing experience.  This year, we've got a couple of teams here who'll be doing the contest, and we're excited to see what the organizers have in store. 

    The task for the contest changes quite a bit each year, but tends to incorporate themes like implementing interpreters/compilers for simple languages/runtimes, and solving optimization or state-space exploration problems related to these interpreters or to programs written to run on them.  In each of the last two years' contests, some really fun puzzle elements have been incorporated into the contest, along with richer storytelling.

    "Universal Machine" in F#

    Of the years I've done the ICFP programming contest, my favorite so far has been the 2006 contest.  It was truly an inspired creation, and I am rather in awe of the group from CMU who put it together.  If you didn't do the contest in 2006, you can still go back to the site and try it out - which I can definitely recommend.

    The contest started seemingly simply - pointing to a spec for a virtual machine (the "UM"), a binary to run on the virtual machine (the "Codex") and a "decryption key" to use: '(\b.bb)(\v.vv)06FHPVboundvarHRAk'. 

    Earlier this week, to re-live some of the fun of the 2006 contest, and to get mentally ready for this year's contest, I re-implemented the UM, this time in F#.  Here's what it looks like (take a look at the spec linked above for an explanation of what it's doing):

    open System
    open System.IO
    
    type UM(p0 : byte[], input : Stream,  output : Stream) = 
    
        // Read the byte[] in as a big-endian uint[]
        let p0AsUInts =
            [| for i in 0..4..(p0.Length-4) ->
                ((uint32 p0.[i])<<<24) + 
                ((uint32 p0.[i+1])<<<16) + 
                ((uint32 p0.[i+2])<<<8) + 
                ((uint32 p0.[i+3])) |]
        // The resizable array of memory pages starts with just the input
        let mem = ResizeArray.create 1 p0AsUInts
        // The registers are initialized to 0
        let reg = [|0u;0u;0u;0u;0u;0u;0u;0u|]
    
        // The main loop of the interpreter.  Should be *fast*.
        let rec cycle finger =
            let arr = mem.[0]
            let op = arr.[finger]
            let code = op >>> 28
            let (a,b,c) = (int ((op >>> 6) &&& 0b111u), int ((op >>> 3) &&& 0b111u), int ((op) &&& 0b111u))
            let (a2, value) = (int (op >>> 25) &&& 0b111, op &&& 0b1111111111111111111111111u)
            match code with
            | 0u -> (if reg.[c] <> 0u then reg.[a] <- reg.[b]); cycle (finger+1)
            | 1u -> (reg.[a] <- mem.[int reg.[b]].[int reg.[c]]); cycle (finger+1)
            | 2u -> (mem.[int reg.[a]].[int reg.[b]] <- reg.[c]); cycle (finger+1)
            | 3u -> (reg.[a] <- reg.[b] + reg.[c]); cycle (finger+1)
            | 4u -> (reg.[a] <- reg.[b] * reg.[c]); cycle (finger+1)
            | 5u -> (reg.[a] <- reg.[b] / reg.[c]); cycle (finger+1)
            | 6u -> (reg.[a] <- ~~~ (reg.[b] &&& reg.[c])); cycle (finger+1)
            | 7u -> ()
            | 8u -> (mem.Add(Array.zeroCreate (int reg.[c])); reg.[b] <- uint32 (mem.Count - 1)); cycle (finger+1)
            | 9u -> (mem.[int reg.[c]] <- [||]); cycle (finger+1)
            | 10u -> (output.WriteByte(byte reg.[c])); cycle (finger+1)
            | 11u -> (reg.[c] <- uint32 (input.ReadByte() % 255)); cycle (finger+1)
            | 12u -> (if reg.[b] <> 0u then mem.[0] <- Array.copy mem.[int reg.[b]]); cycle (int reg.[c])
            | 13u -> (reg.[a2] <- value); cycle (finger+1)
            | _ -> failwithf "Did not understand %A" (op, code, a, b, c)
    
        /// Start the Universal Machine running
        member this.Run() =
            cycle 0
    
    let program = File.ReadAllBytes("codex.umz")
    let ee = UM(program, Console.OpenStandardInput(), Console.OpenStandardOutput())
    
    // Try the following instead if too much is printed to the screen:
    // let ee = UM(program, Console.OpenStandardInput(), new FileStream("umix.um", FileMode.Create))
    
    ee.Run()
    
    
    Performance

    This code is doing a lot of bit-level operations in F#.  The meat of the code is the tight loop in the "cycle" function.  Since this is the runtime that the provided binary will run on, it needs to be fast.  And, fortunately, it is.  On my laptop, this loop runs at about 15 million iterations per second.  Of course, there are opportunities to improve on the performance of this code, but this turns out to be fast enough for the problem, and it's convenient that you get this performance profile by default.
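
    For reference, a rough way to time a full run is sketched below; it assumes the contest's "sandmark.umz" benchmark binary is sitting in the working directory:

    open System.Diagnostics

    // Time an end-to-end run of the interpreter on the benchmark binary
    let sw = Stopwatch.StartNew()
    let bench = UM(File.ReadAllBytes("sandmark.umz"), Console.OpenStandardInput(), Console.OpenStandardOutput())
    bench.Run()
    printfn "Elapsed: %O" sw.Elapsed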

     

    That Doesn't Look Very Functional?

    This code is not particularly 'functional'.  But it is totally reasonable F# code.  We often talk about F# as a hybrid functional, imperative and OO programming language.  This is an example where the problem domain really suggests the use of some imperative programming ideas.  The problem defines some global state: the memory, and the registers.  Since we're going to be modifying these constantly during the execution of the runtime, it makes sense to use mutable arrays to store the contents.  F# makes this nice and easy to do.
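
    For readers new to the imperative side of F#, array creation and in-place update look like this - a trivial sketch of the register-style updates used above:

    // A mutable fixed-size array, like the UM registers
    let regs = Array.zeroCreate<uint32> 8      // eight registers, all initialized to 0u
    regs.[3] <- regs.[1] + regs.[2]            // in-place update, as in the ADD opcode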

     

    Wrapping it up with a little OO

    As I mentioned in a previous blog post, it's really easy to transition between prototyping and component design with F#.  The code above wraps up the state and algorithm for the UM in a simple type definition.  This provides nice encapsulation and abstraction over the specific binary to run, and the input and output streams to use.  And it also wraps up the F# code in a type which could just as easily be consumed from C# or any other .NET language.

  • LukeH's WebLog

    ICFP Programming Contest 2009

    • 2 Comments

    This year’s ICFP Programming Contest starts today.  We’ve got a team participating, any other F# teams out there? 

    Last year, I posted an F# implementation of the 2006 ICFP contest problem, which was an amazing and complex set of puzzles inside a custom virtual machine.   Here’s a Silverlight version of that virtual machine, implemented on top of the Console sample I posted yesterday.

    Silverlight UM – ICFP 2006

    See http://boundvariable.org/ and my previous post for details of what’s running in this console.

    Launch the UM

    /// Start the UM running on a background thread.
    /// The input and output functions will be called on the
    /// background thread, and block the execution of the UM
    static member Launch(binaryUri : Uri, input : Stream, output : Stream) =
        async {
            try
                let writer = new System.IO.StreamWriter(output, AutoFlush = true)
                let reader = new System.IO.StreamReader(input)
                do writer.WriteLine("Press <enter> to begin. Note: Will download a large UMIX image.")
                do reader.ReadLine() |> ignore
                let request = System.Net.WebRequest.Create(binaryUri)
                do writer.WriteLine(binaryUri)
                let! response = request.AsyncGetResponse()
                let stream = response.GetResponseStream()
                do writer.WriteLine("Downloading...")
                let progress(percent) = writer.WriteLine("Downloaded: " + percent.ToString() + "%")
                let! bytes = asyncReadAllBytesFromStreamWithProgress (stream, int response.ContentLength, progress)
                do writer.WriteLine("Booting...")
                let ee = UM(bytes, Func<int>(input.ReadByte), Action<byte>(output.WriteByte))
                ee.Run()
            with
            | e -> System.Diagnostics.Debug.WriteLine("UM failed with {0}", e)
        } |> Async.Start

    F# Async in Silverlight

    In Silverlight, the APIs for synchronous I/O have been removed, which forces developers to do the ‘right thing’ and make async calls for long-running I/O operations, thus (hopefully) keeping the UI responsive.  This makes F# a really nice implementation language for Silverlight applications, using F#’s async workflows.  In the code above, an asynchronous workflow describes the flow of control of launching the UM.  At the two points of long-running network operations, async calls are made with “let!”, so that the UI thread is not blocked and the application remains responsive.  This hides a lot of the complexity that would come from chaining these asynchronous operations together with the standard .NET callback-based patterns.

    Note also that the code above calls a helper function “asyncReadAllBytesFromStreamWithProgress”, which shows how async code in F# can be nicely factored using the same sort of “extract method” factoring you are used to in synchronous programming.  The implementation of this helper also shows that async calls with ‘let!’ can be placed inside ‘for’ and ‘while’ loops.  Chances are you don’t even want to try doing that with standard Begin/End calls!

    let internal asyncReadAllBytesFromStreamWithProgress(stream : Stream, length : int, progress : int -> unit) =
        async {
            let offset = ref 0
            let count = ref length
            let buffer = Array.zeroCreate<byte> !count
            while !count > 0 do
                let! bytesRead = stream.AsyncRead(buffer, !offset, (min 524288 !count))
                if bytesRead = 0 then raise (new EndOfStreamException("Read beyond the EOF"))
                do offset := !offset + bytesRead
                do count := !count - bytesRead
                progress(100 * !offset / length)
            return buffer
        }

    Consuming F# from C#+XAML

    The majority of this app is written in F# – the console control is authored in F# and the application logic is in F#.  Here is the C# and XAML that ties it together:

    <UserControl xmlns:my="clr-namespace:System.Windows.Controls;assembly=SilverlightConsole" x:Class="UM.Page"
                 xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
                 xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
                 Width="600" Height="400">
        <Grid x:Name="LayoutRoot" Background="White">
            <my:SilverlightConsole x:Name="console" Width="600" Height="400"
                                   FontFamily="Courier New" FontSize="13" FontWeight="Bold"
                                   Foreground="GreenYellow" Background="Black" Text="" />
        </Grid>
    </UserControl>

    public partial class Page : UserControl {
        public Page() {
            InitializeComponent();
            Uri curPage = System.Windows.Application.Current.Host.Source;
            UriBuilder builder = new UriBuilder(curPage.Scheme, curPage.Host, curPage.Port, "umix.dll");
            ICFP.UM.Launch(builder.Uri, console.InputStream, console.OutputStream);
        }
    }

    Summary

    Check out ICFP Programming Contest  2009 this weekend, and take F#+Async+Silverlight for a spin.

  • LukeH's WebLog

    F# for Parallel and Asynchronous Programming - PDC 2009

    • 2 Comments


    Last November at PDC 2009 in Los Angeles I gave a talk on F# for Parallel and Asynchronous Programming. 

    The talk begins by covering basic F# concepts, and then focuses on four challenging issues related to concurrency and the tools F# brings for addressing these - immutability, async workflows, and agents.  In trying to cover this ground in a 60 minute talk, there were naturally a lot of relevant topics I had to skip, and I also got a number of questions about things I'd glossed over after the talk.  So I thought I'd use this post to discuss some of the other features and topics I would have highlighted in this presentation if there'd been more time.
     

    Note:  The topics covered below mostly assume you've either watched the session video or taken a look through the source code attached above.  If you haven't yet - check out the talk.

     

    Some F# Parallel and Async Topics


    PSeq

    The talk briefly shows the use of a PSeq module to write code like the following:

    let rec sumOfSquares nums =
        nums
        |> PSeq.map sqr
        |> PSeq.sum

    As a few folks have noted, the F# core library does not actually provide this PSeq module.  In the demo code, I was just using my own little implementation of this, to show off the benefits of the more declarative programming style used here. 

    The PSeq I used is just a very simple wrapper around the functionality provided in PLinq - for example, PSeq.map just calls ParallelEnumerable.Select.  We are expecting a future release of the F# PowerPack to include this wrapper library.  In the meantime, you can check out Matt Podwysocki's blog post on using PLinq from F#, or Talbot Crowell's recent post with an updated Beta2 version.
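
    For the curious, a minimal sketch of such a wrapper (not the exact code from the demo, and much less complete than the PowerPack version) might look something like this:

    module PSeq =
        open System.Linq

        /// map, via PLINQ's ParallelEnumerable.Select
        let map (f : 'a -> 'b) (source : seq<'a>) : seq<'b> =
            source.AsParallel().Select(fun x -> f x) :> seq<'b>

        /// sum over floats, via PLINQ's ParallelEnumerable.Sum
        let sum (source : seq<float>) : float =
            source.AsParallel().Sum()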

     

    Using Begin/End Methods with F# Async

    In showing the ease of turning synchronous code which calls long-running I/O operations into asynchronous code, I showed adding the pieces highlighted in red below to make it async:

    let downloadImageAsync(blob : CloudBlob) =
        async {
            let! pixels = blob.AsyncDownloadByteArray()
            let fileName = "thumbs-" + blob.Uri.Segments.[blob.Uri.Segments.Length-1]
            use outStream = File.OpenWrite(fileName)
            do! outStream.AsyncWrite(pixels, 0, pixels.Length)
            return fileName }

    I got a very good question after the talk though: "Where did the AsyncDownloadByteArray function come from?" 

    In general, for any asynchronous API provided on the platform, it can be simply wrapped and exposed as a method returning Async<T> to be used within F# async workflows as above.  In the core F# libraries, we wrap many of the core async APIs in .NET Framework, providing extension methods on the relevant types.  This means that functions like Stream.AsyncWrite used above are available by default in F#.  However, CloudBlob.DownloadByteArray is defined in the libraries provided with the Windows Azure SDK, and so we need to create a wrapper.

    In fact, the Azure SDK doesn't even provide a BeginDownloadByteArray/EndDownloadByteArray pair - but it does provide a more fundamental async operation BeginDownloadToStream/EndDownloadToStream:

    public IAsyncResult BeginDownloadToStream(Stream target, AsyncCallback callback, object state);
    public void EndDownloadToStream(IAsyncResult asyncResult);

    Notably, the reason for not including the higher-level async operations is likely just the substantial complexity of authoring and maintaining these methods in C# today. Had the Azure team written their API in F#, they would likely have found it easy enough to add these.

    So, what does the wrapper look like?

    type CloudBlob with
        /// Asynchronously download blob contents as an array
        member blob.AsyncDownloadByteArray() = async {
            let ms = new MemoryStream(int blob.Properties.Length)
            do! Async.FromBeginEnd(
                    (fun (cb, s) -> blob.BeginDownloadToStream(ms, cb, s)),
                    blob.EndDownloadToStream)
            return ms.ToArray() }

    Some notes on this implementation:

    Async.FromBeginEnd

    The key tool for wrapping many .NET async APIs is Async.FromBeginEnd.  This method takes two functions as arguments - the Begin/End pair of the method being wrapped - and returns a value of type Async<T>, where T is the type returned from the End method. When directly wrapping a method, this is often trivial - member foo.AsyncBar() = Async.FromBeginEnd(foo.BeginBar, foo.EndBar).
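
    As a concrete illustration, here is how you could wrap WebRequest's BeginGetResponse/EndGetResponse pair yourself - purely a sketch, since the F# library already provides WebRequest.AsyncGetResponse:

    open System.Net

    type WebRequest with
        /// Illustration only - the F# core library already ships an equivalent AsyncGetResponse
        member req.AsyncGetResponse2() =
            Async.FromBeginEnd(req.BeginGetResponse, req.EndGetResponse)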

    Composing Async Operations

    There are generally two ways to create Async<T> values - call a function which returns one, like Async.FromBeginEnd, Stream.AsyncWrite, or Async.Parallel - or write an async { ... } block to include any arbitrary F# code as part of the composition.  In the case above, we need not only to call the underlying Begin/End pair, but also to construct a stream to download into, and ultimately return the array of bytes left in the resulting stream.  This composition of asynchronous operations with some synchronous code is easy to do inside the async { ... } block.
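
    To make that composition concrete, here's a small sketch which runs two of the blob downloads from above in parallel, then does a bit of ordinary synchronous work with the results (blob1 and blob2 are hypothetical CloudBlob values):

    let downloadBoth (blob1 : CloudBlob) (blob2 : CloudBlob) = async {
        // Two Async<byte[]> values composed with Async.Parallel...
        let! results =
            [ blob1.AsyncDownloadByteArray(); blob2.AsyncDownloadByteArray() ]
            |> Async.Parallel
        // ...followed by plain synchronous code inside the async block
        return results |> Array.sumBy (fun bytes -> bytes.Length) }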

    Other Async Interop Utilities

    We saw details above of wrapping Begin/End pairs as Async<T> objects - there are a few other interesting interop scenarios which are nicely supported by the Async API in F#:

    • Async.AsBeginEnd - converts an async operation into a set of Begin/End/Cancel functions which can be used to expose an implementation of the .NET Asynchronous Programming Model to other .NET consumers.
    • Async.AwaitEvent/AwaitWaitHandle/AwaitIAsyncResult/AwaitTask - there are many other async models used in .NET beyond the APM, and F# provides primitives for easily wrapping any of these.  A small sketch using Async.AwaitEvent follows this list.
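
    Here's a small sketch of Async.AwaitEvent, waiting - without blocking a thread - for a System.Timers.Timer to fire once:

    open System.Timers

    /// Wait asynchronously for a one-shot timer to fire
    let awaitTimerOnce (intervalMs : float) = async {
        let timer = new Timer(intervalMs, AutoReset = false)
        timer.Start()
        let! args = Async.AwaitEvent timer.Elapsed
        printfn "Timer fired at %O" args.SignalTime }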

     

    Exceptions with Async

    In the talk, I commented on how hard it is to do correct exception handling with the typical asynchronous programming tools available today in C#/VB.  But I didn't actually show how much easier this is with F# async workflows.  Here's what it looks like to make our downloadImageAsync function robust to the many exceptions that could possibly be raised during its execution:

    let downloadImageAsync(blob : CloudBlob) = async {
        try
            let! pixels = blob.AsyncDownloadByteArray()
            let fileName = "thumbs-" + blob.Uri.Segments.[blob.Uri.Segments.Length-1]
            use outStream = File.OpenWrite(fileName)
            do! outStream.AsyncWrite(pixels, 0, pixels.Length)
            return fileName
        with
        | e ->
            Log.Write(e.ToString())
            return "" }

    The key here is that exception handling is no different for async code than for synchronous code.  Under the hood, F# will ensure that exceptions occurring anywhere in the workflow are bubbled up through the non-blocking calls and ultimately handled by the exception handler code.  This will happen as expected even when the code executes on multiple different threads during its operation.

     

    Cancellation with Async

    When we have long-running operations in our programs, as well as making them non-blocking, we very often want to make them cancellable.  F# async workflows support cancellation automatically, inserting cancellation checks at each let! or do!, and also at each iteration of a loop.  Cancellation is tracked and triggered using the new unified cancellation model in the .NET Framework, "System.Threading.CancellationToken".  For example, if we want to make downloadImageAsync cancellable, we simply need to pass in a cancellation token when we start the workflow, and then hook up a button to cancel a running operation.  The deltas to the PhotoViewer application are indicated in red below:

    F# code:

    member this.DownloadAll(cancellationToken) =
        let work =
            container.ListBlobs()
            |> Seq.map (fun blob ->
                downloadImageAsync(container.GetBlobReference(blob.Uri.ToString())))
            |> Async.Parallel
        Async.StartAsTask(work, cancellationToken = cancellationToken)

    C# client code:

    CancellationTokenSource cancellableWork;

    private void Start_Click(object sender, RoutedEventArgs e) {
        cancellableWork = new CancellationTokenSource();
        downloader.DownloadAll(cancellableWork.Token);
    }

    private void Cancel_Click(object sender, RoutedEventArgs e) {
        if (cancellableWork != null)
            cancellableWork.Cancel();
    }
     

    Other Great F# Async Samples

    Don has recently posted two great articles on some other end-to-end samples built on F# Async.  These posts cover many of the same core F# Async topics, plus some additional topics not touched on here.  The samples are also a lot of fun, using data pulled from Twitter and Bing Translation services.

     

    Summary

    F#'s immutability, async workflows and agents are useful features for enabling easier parallel and asynchronous programming in application code.  The PDC talk provides a good starting point for understanding how these features can be applied.  The topics above cover a few of the related topics that I've been asked about since then, but there is also much, much more that can be enabled on top of these core features.
