Carl Nolan’s ramblings on development
In my last post I demonstrated an F# MapReduce program based on Hadoop Streaming. One thing that intrigued me was the possibility of using the JS Console to do a quick visualization of the MapReduce output. So here is my first foray into the idea.
From the last example the data output was:
The data represents, for each mobile platform device, the min, average, and max query times. So if one wanted a quick visualization of the data, using the JS Console, the process would be as follows.
Firstly one would need to access the output of the MapReduce job and parse the data.
The output from these commands would be an array represented as:
Once we have the raw data in an array we would parse the “querytime” string into an integer representing the number of seconds for the query times. Picking the Average query time as an example one could write:
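A minimal sketch of such a conversion follows. It assumes the query times arrive as strings in hh:mm:ss form. The sample rows, the field names (device, avg) and the helper toSeconds are hypothetical and stand in for the parsed job output:

```javascript
// Hypothetical rows standing in for the parsed MapReduce output.
// The field names and the hh:mm:ss format are assumptions.
var data = [
  { device: "DeviceA", avg: "00:01:30" },
  { device: "DeviceB", avg: "00:00:45" }
];

// Convert an "hh:mm:ss" string into a whole number of seconds.
function toSeconds(timespan) {
  var parts = timespan.split(":");
  return parseInt(parts[0], 10) * 3600 +
         parseInt(parts[1], 10) * 60 +
         parseInt(parts[2], 10);
}

// Build a new array that keeps the device name and replaces
// the time string with its integer number of seconds.
var graphData = data.map(function (row) {
  return { device: row.device, avg: toSeconds(row.avg) };
});
```

Keeping the conversion in a small helper means the same function could be reused for the min and max columns if those were wanted instead.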
Again this gives us an array with an integer value that we can easily graph:
So once we have this array, plotting the graph becomes easy:
This then renders the following graph:
Although this approach is very code-based and dependent on a little JScript/jQuery knowledge, it can provide a quick validation of one's MapReduce output.
Thanks for the blog. Not sure why but I'm getting the following as part of the parse output:
map: [object Function]
filter: [object Function]
forEach: [object Function]
reduce: [object Function]
My parse output looks somewhat like this.
js> data = parse(file.data, "OS,Marketshare:long")
Is there a way this can be avoided?