<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Carl's Blog - All Comments</title><link>http://blogs.msdn.com/b/carlnol/</link><description>Carl Nolan&amp;#39;s ramblings on development and data processing</description><dc:language>en-US</dc:language><generator>Telligent Evolution Platform Developer Build (Build: 5.6.50428.7875)</generator><item><title>re: Hadoop .Net HDFS File Access</title><link>http://blogs.msdn.com/b/carlnol/archive/2013/02/08/hdinsight-net-hdfs-file-access.aspx#10419649</link><pubDate>Fri, 17 May 2013 12:28:37 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10419649</guid><dc:creator>Oleg Subachev</dc:creator><description>&lt;p&gt;When I used the library from Windows Azure HDInsight Service CTP all was OK. I could copy files from HDInsight headnode to HDFS and then see through Interactive JavaScript console that the files are there in HDFS.&lt;/p&gt;
&lt;p&gt;But now under Windows Azure HDInsight preview something goes wrong: when I run the same test application on headnode all seems OK - diagnostic messages and email are OK, but when I look in HDFS through Interactive JavaScript console there are no copied files there. So no errors, but no files ???&lt;/p&gt;
&lt;p&gt;What&amp;#39;s wrong ?&lt;/p&gt;
&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10419649" width="1" height="1"&gt;</description></item><item><title>re: Hadoop .Net HDFS File Access</title><link>http://blogs.msdn.com/b/carlnol/archive/2013/02/08/hdinsight-net-hdfs-file-access.aspx#10419637</link><pubDate>Fri, 17 May 2013 12:15:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10419637</guid><dc:creator>Oleg Subachev</dc:creator><description>&lt;p&gt;The problem was solved by connecting not to local loopback address (127.0.0.1), but to normal IP address of the local computer.&lt;/p&gt;
&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10419637" width="1" height="1"&gt;</description></item><item><title>re: Hadoop Binary Streaming and PDF File Inclusion</title><link>http://blogs.msdn.com/b/carlnol/archive/2012/01/01/hadoop-binary-streaming-and-pdf-file-inclusion.aspx#10405051</link><pubDate>Mon, 25 Mar 2013 11:33:01 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10405051</guid><dc:creator>Carl Nolan</dc:creator><description>&lt;p&gt;Hi Sagar you can use Hadoop for document processing in this fashion, provided you have sufficient volume. If you know Java you can use the Java binary reader that comes with this code as the reader for submitting a MR job written just in Java.&lt;/p&gt;
&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10405051" width="1" height="1"&gt;</description></item><item><title>re: Hadoop Binary Streaming and PDF File Inclusion</title><link>http://blogs.msdn.com/b/carlnol/archive/2012/01/01/hadoop-binary-streaming-and-pdf-file-inclusion.aspx#10404219</link><pubDate>Thu, 21 Mar 2013 14:32:13 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10404219</guid><dc:creator>Sagar Nikam</dc:creator><description>&lt;p&gt;Sir, your idea is great; I want to do the same thing with some modifications.&lt;/p&gt;
&lt;p&gt;The idea is this: I have thousands of files (pdf, txt, docx) in a folder. I want to extract the 10 most frequently occurring words for each file using Hadoop or any software that gives quick results.&lt;/p&gt;
&lt;p&gt;I don&amp;#39;t know C# or .NET at all. I tried to understand the code, but I couldn&amp;#39;t.&lt;/p&gt;
&lt;p&gt;I know a little Java. Can you tell me how to convert it into a Java program?&lt;/p&gt;
&lt;p&gt;I would be grateful if you could convert it completely into MapReduce form, as many people use Java for Hadoop programming.&lt;/p&gt;
&lt;p&gt;You can mail me also - sagarnikam123@gmail.com&lt;/p&gt;
&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10404219" width="1" height="1"&gt;</description></item><item><title>re: Hadoop .Net HDFS File Access</title><link>http://blogs.msdn.com/b/carlnol/archive/2013/02/08/hdinsight-net-hdfs-file-access.aspx#10403629</link><pubDate>Tue, 19 Mar 2013 20:05:15 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10403629</guid><dc:creator>Rui</dc:creator><description>&lt;p&gt;Unable to connect!! getting INFO: Retrying connect to server: 127.0.0.1/127.0.0.1:9000. Already tried 0 time(s).&lt;/p&gt;
&lt;p&gt;I have HDInsight running on my box (http://localhost:8085/ works fine), but I can&amp;#39;t run your test &amp;quot;WinHdfsManagedTest&amp;quot;.&lt;/p&gt;
&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10403629" width="1" height="1"&gt;</description></item><item><title>re: Hadoop .Net HDFS File Access</title><link>http://blogs.msdn.com/b/carlnol/archive/2013/02/08/hdinsight-net-hdfs-file-access.aspx#10401477</link><pubDate>Tue, 12 Mar 2013 07:05:04 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10401477</guid><dc:creator>Imaya</dc:creator><description>&lt;p&gt;Good one.&lt;/p&gt;
&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10401477" width="1" height="1"&gt;</description></item><item><title>re: Hive and XML File Processing</title><link>http://blogs.msdn.com/b/carlnol/archive/2012/12/13/hive-and-xml-file-processing.aspx#10399020</link><pubDate>Mon, 04 Mar 2013 09:41:46 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10399020</guid><dc:creator>Carl Nolan</dc:creator><description>&lt;p&gt;The error java.lang.UnsupportedClassVersionError: org/apache/mahout/classifier/bayes/XmlElementStreamingInputFormat : Unsupported major.minor version 51.0 could be down to the current code being compiled with the Oracle SDK. Are you running this on Azure? If so, you may need to recompile the Java classes. There is a script to do this; just point it to the correct javac executable.&lt;/p&gt;
&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10399020" width="1" height="1"&gt;</description></item><item><title>re: Hive and XML File Processing</title><link>http://blogs.msdn.com/b/carlnol/archive/2012/12/13/hive-and-xml-file-processing.aspx#10396279</link><pubDate>Fri, 22 Feb 2013 18:21:50 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10396279</guid><dc:creator>Gordon</dc:creator><description>&lt;p&gt;I hit an error when executing the statement below:&lt;/p&gt;
&lt;p&gt;CREATE EXTERNAL TABLE StoresXml (storexml string)&lt;/p&gt;
&lt;p&gt;STORED AS INPUTFORMAT &amp;#39;org.apache.mahout.classifier.bayes.XmlElementStreamingInputFormat&amp;#39;&lt;/p&gt;
&lt;p&gt;OUTPUTFORMAT &amp;#39;org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat&amp;#39;&lt;/p&gt;
&lt;p&gt;LOCATION &amp;#39;/stores/demographics&amp;#39;;&lt;/p&gt;
&lt;p&gt;Error:&lt;/p&gt;
&lt;p&gt;java.lang.UnsupportedClassVersionError: org/apache/mahout/classifier/bayes/XmlElementStreamingInputFormat : Unsupported major.minor version 51.0&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;at java.lang.ClassLoader.defineClass1(Native Method)&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;at java.lang.ClassLoader.defineClass(ClassLoader.java:615)&lt;/p&gt;
&lt;p&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)&lt;/p&gt;
&lt;p&gt;What&amp;#39;s wrong? How can I fix it?&lt;/p&gt;
&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10396279" width="1" height="1"&gt;</description></item><item><title>re: Hadoop .Net HDFS File Access</title><link>http://blogs.msdn.com/b/carlnol/archive/2013/02/08/hdinsight-net-hdfs-file-access.aspx#10395530</link><pubDate>Wed, 20 Feb 2013 11:54:41 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10395530</guid><dc:creator>Carl Nolan</dc:creator><description>&lt;p&gt;What account are you trying to connect to the server as? Have you tried running under the Isotope user or from the HDInsight command prompt?&lt;/p&gt;
&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10395530" width="1" height="1"&gt;</description></item><item><title>re: Hadoop .Net HDFS File Access</title><link>http://blogs.msdn.com/b/carlnol/archive/2013/02/08/hdinsight-net-hdfs-file-access.aspx#10394812</link><pubDate>Mon, 18 Feb 2013 11:58:53 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10394812</guid><dc:creator>Oleg Subachev</dc:creator><description>&lt;p&gt;The problem was with debug version of my test app. There are no debug versions of C++ runtime on Azure HDInsight cluster to load debug version of &amp;#39;WinHdfsManaged.dll&amp;#39;. Release version does not throw the exception.&lt;/p&gt;
&lt;p&gt;But there is another problem:&lt;/p&gt;
&lt;p&gt;Feb 18, 2013 10:05:23 AM org.apache.hadoop.ipc.Client$Connection handleConnectionFailure&lt;/p&gt;
&lt;p&gt;INFO: Retrying connect to server: 127.0.0.1/127.0.0.1:9000. Already tried 0 time(s).&lt;/p&gt;
&lt;p&gt;Feb 18, 2013 10:05:25 AM org.apache.hadoop.ipc.Client$Connection handleConnectionFailure&lt;/p&gt;
&lt;p&gt;INFO: Retrying connect to server: 127.0.0.1/127.0.0.1:9000. Already tried 1 time(s).&lt;/p&gt;
&lt;p&gt;And so on.&lt;/p&gt;
&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10394812" width="1" height="1"&gt;</description></item></channel></rss>