Carl Nolan’s ramblings on development
As promised, to conclude the Co-occurrence Approach to an Item Based Recommender posts I wanted to port the MapReduce code to C#; just for kicks and to prove the code is also easy to write in C#. For an explanation of the MapReduce post review the previous article:
http://blogs.msdn.com/b/carlnol/archive/2012/07/07/mapreduce-based-co-occurrence-approach-to-an-item-based-recommender.aspx
The latest version of the code now includes the MSDN.Recommender.MapReduceCs project, that is a full C# implementation of the corresponding F# project; MSDN.Recommender.MapReduce. These code variants are intentionally similar in structure.
The code for the Order Mapper is as follows:
And that of the Order Reducer is:
Your friend when writing the C# code is the yield return command for emitting data, and the Tuple.Create() command for creating the key/value pairs. The two commands enable one to return a key/value sequence; namely a .Net IEnumerable.
For completeness the code for the Product Mapper is:
That of the Product Combiner is:
And finally, that of the Product Reducer is:
A lot of the processing is based upon processing IEnumerable sequences. For this reason foreach and Linq expressions are used a lot. I could have defined a ForEach IEnumerable extension, to use Linq exclusively, but have preferred the C# syntax.
One difference in the final Reducer is how the Sparse Matrix is defined. In the F# code an F# extension allows one to build a SparseMatrix type by providing a sequence of index/value tuples. In the C# code one has to define an empty SparseMatrix and then interactively set the element values.
Note: There may be some confusion in the usage of the float type for .Net programmers, when switching between C# and F#, due to F#’s naming for floating-point number types. F# float type corresponds to System.Double in .NET (which is called double in C#), and F# float32 type corresponds to System.Single in .NET (which is called float in C#). For this reason I have switched to using the type double in F#, as it is another type alias for the System.Double type.
Of course the recommendations produced by the C# code are the same as the F# code, and clearly demonstrate that the Framework cleanly support writing MapReduce jobs in C#. I personally find the F# code more succinct and easier to read, but this is my personal preference.