• mwinkle.blog

    So Many Cool Things Going On

    • 0 Comments

    Just a quick summary:

    • Orcas Beta 1 is out
    • BPEL folks are looking for feedback
    •  labs.biztalk.net is live (Check out Clemens, Dennis and John's posts)
      • Maybe at MIX we can talk about how this might make things interesting?
      • I imagine that I will probably do more than a couple of samples based on this
    • I'll be at MIX next week and am looking forward to meeting up with anyone who is attending.  I'll be in the mashups area and I'd love to see folks using WF to power some mashups.
  • mwinkle.blog

    Off to Barcelona!

    • 0 Comments
    I'm in the office this morning, wrapping some things up and then it's wheels up to Barcelona for a week.  Paul, Clemens, Shy and I will all be there hanging out somewhere between our talks and the Connected Systems booths, so stop on by and say hi!
  • mwinkle.blog

    Pointing out something cool...

    • 0 Comments

    So, everybody out there probably saw this XBox 360 tracker application from http://untitlednet.com/.  I know that I've heard reports of links to this thing smoking through offices faster than an "it's Friday so you can leave an hour early" email.

    Craig takes the idea and really kicks it up a notch, while at the same time showing off some cool features that will ship as part of WinFx.  Go check out the code, and I'd like to point out the way that he uses workflow as a means by which to organize all of the work that he needs his application to do.  He also shows some ways to get your workflow instance to communicate back to your user interface. 

    Check out this channel9 video that documents the approach he took, and how he leveraged these technologies, WCF, Workflow, and WPF in order to create a pretty sweet application.  Now I just need to find an XBox 360 in Seattle!

  • mwinkle.blog

    Rules Engine Webcast

    • 0 Comments

    Jurgen put together a great webcast on building rules-based applications using WF.

     

    Check it out.  Now.

  • mwinkle.blog

    Hello World (WF Services) in Spanish

    • 0 Comments

    I noticed on Matias' blog that Ezequiel posted a hello world tutorial in Spanish based on the March CTP.  

  • mwinkle.blog

    Starting and Transacting

    • 0 Comments

    Two quick links before I run back to my day job.

    • Paul posts a Web Workflow Starter Kit.  Check it out for a good sample on hosting in ASP.NET, managing workflow data, and doing a task that a lot of people immediately think of when they start thinking workflow.
    • Jason posts a really cool "Developer Meets Server" screencast showing how WCF transactions can be flowed in and down to the Transactional NTFS capabilities.  Run, don't walk, to check this thing out here.
  • mwinkle.blog

    Musings for 2008 Resolutions

    • 2 Comments

    It seems to be the thing to note a few things one will try to do better in the new year.  What follows are my resolutions that are related to things at work.

    • Work with Outlook shut off - let me focus on the task at hand
    • Work with minimal internet induced interruption - see note above :-)
    • Find interesting WF content to blog about until I can talk more about the work going on for Oslo
    • Ask a lot more questions of you, what your experience is with the WF designer, and what we can do to make that better
    • Learn F# - I enjoyed the brief excursion into Lisp while an undergrad, I would like to get back into thinking in a functional way
    • Work to improve my speaking skills - I've been happy with my performance at the conferences I have spoken at, but I haven't really done anything to try to take that to the next level. 
      • A corollary: Deliver at least one "non-traditional" presentation - I (and many people in the audiences) see a million and a half "bludgeoned by bullet point" presentations a year.  I'd like to try something different (a good example of this is Larry Lessig's talk at TED this past spring (deeper sidenote, a great collection of interesting speakers and speaking styles are present at the TED podcast.  I always make sure to have a few of those on my Zune to watch on the bus))
    • See you all at PDC :-)
  • mwinkle.blog

    Matt’s PDC Session List

    • 0 Comments

    I’m hoping that this PDC might be a little different and I may actually get to attend some sessions, rather than just prepping for mine (or the others from my team).  I’ve gone through all 22 pages of published talks, and I think there are some interesting ones.  So, without further ado, and with little, if any regard for actual scheduling of talks in relation to mine, here are the talks I would be interested in going to at PDC09.

    Hopefully I can get to 10 of them.

    Session Name

    Comment

    Building Data-Driven Applications Using Microsoft Project Code Name "Quadrant" and Microsoft Project Code Name "M"

    Doug’s had a few things to say recently, and I think this talk will be interesting.  My team was part of the Quadrant team for a while, and I’m curious to see what they’ve been up to.

    Windows 7 and Windows Server 2008 R2 Kernel Changes

    Anytime that you get to hear Mark talk about the kernel, it’s a great opportunity to learn a lot about a topic we don’t think about every day.

    Code Visualization, UML, and DSLs

    The architecture tools team has done some really, really neat stuff to help you understand your code base better. I've found the tools to be really useful when looking at our code base, and I'd like to learn more here.

    Advanced Microsoft SQL Server 2008 R2 StreamInsight

    I've been watching StreamInsight since there was an internal talk on the technology. The capabilities here to do high capacity event stream queries are amazing. I think there are a number of interesting classes of problems where this can be useful and I'd like to find out more.

    Introduction to Microsoft SQL Server 2008 R2 StreamInsight

     

    Dynamic Binding in C# 4

    This feature is one that I was excited to hear about in Anders' talk last year. A whole hour on how this works? That's just bliss.

    Windows Error Reporting

    This kind of data is gold for folks who are building software.  You can’t catch every bug, or be aware of a video card incompatibility in a certain language on XP SP2, or always know why things might go wrong in your apps.  WER gives you a way to get that kind of data.

    How Microsoft Visual Studio 2010 Was Built with Windows Presentation Foundation 4

    This has been a huge undertaking, and hearing Paul talk about this will be insightful about the challenges faced, the way to integrate large, existing code bases with WPF, native WPF interop, etc.

    Windows Presentation Foundation 4 Plumbing and Internals

    I have a weak spot for deep dives into plumbing like this talk. This kind of knowledge is so useful for building apps on top of WPF, to really understand how the pieces work together and what's happening at the low levels.

    Future of Garbage Collection

    I got to sit next to Patrick at a dinner and had an amazing conversation that knocked my socks off. This is one of the guys who built the GC in .NET, and hearing the way he thinks will be interesting.

    Microsoft Perspectives on the Future of Programming

    Look at the list of this panel, how could you not want to hear this conversation? Just getting these folks together means there will be some interesting topics with lots of different backgrounds (from Jeffrey Snover to Erik Meijer).

    REST Services Security Using the Access Control Service

    Justin is my former partner in crime from our days as technical evangelists, and I have lunch with him regularly and the stuff that he is working with is wicked cool. He's also one of the best presenters in the company, and there is always something I learn about presenting when I watch him talk.

    Data-Intensive Computing on Windows HPC Server with the DryadLINQ Framework

    I'd love to see anything that talks about "familiar declarative syntax of LINQ combined with the fault-tolerant distributed graph scheduling of the Dryad runtime"

    Building Sensor- and Location-Aware Applications with Windows 7 and the Microsoft .NET Framework 4.0

    The location API made me open up my first C++ project in a long time. The range of scenarios that this enables in Win7 is awesome.

    Code Contracts and Pex: Power Charge Your Assertions and Unit Tests

    My first "from the labs" talk on the list, Pex and Code Contracts bring some really compelling capabilities to .NET development. This would be great to see.

    Microsoft Application Server Technologies: Present and Future

    This is from my team, and I'm excited to see the reaction to some of the cool stuff we will be talking about in the App Server space

    Rx: Reactive Extensions for .NET

    Erik Meijer is one of those guys I can't get enough of, I'd sign up for the fan club on channel9 if it were available.

    Building Amazing Business Applications with Microsoft Silverlight and Microsoft .NET RIA Services

    Brad is a really great presenter, and this is a whole space I have not had a chance to pay much attention to. I'd love to learn more about the ways to rapidly create business applications.

    SketchFlow: Prototyping to the Rescue

    Having just finished a project with a lot of designer/developer interaction, I have a lot of hope for things like SketchFlow.

    Developing REST Applications with the .NET Framework

    I like watching Don's talks, and REST is kind of a thing these days. Done deal.
  • mwinkle.blog

    Picture Services

    • 1 Comments

    As Justin announces here, my team recently released the picture services sample.  This is a cool way to expose the pictures on a machine that are found via Windows Desktop Search out in a simple, easy to consume REST endpoint.

    There are a few things here that I think are pretty cool

    • Pretty easily return POX and Syndication formatted data, and creating a URI hierarchy (here)
    • Querying Windows Desktop Search (here)
    • Adding Simple List Extensions (here)

    Justin has a screencast here.

  • mwinkle.blog

    Introducing the Windows Server 2008 Developer Training Kit

    • 0 Comments

    Finally, a post of mine without code, haven't had one of those in a while.

    Through some interesting organizational hierarchies, I actually report up through the Longhorn Server, strike that, Windows Server 2008 evangelism team, and every now and then do deliver some content that is relevant to Windows Server 2008 (there, got it right that time).

    Check out this Developer Training Kit we just released over on James' blog.

    This thing contains about 15 presentations on topics relevant to Windows Server 2008 development, including some cool stuff on TxF (Transactional File System) and, of course, WF and WCF.

    Check it out, here.

  • mwinkle.blog

    WF and BizTalk

    • 1 Comments

    Before joining Microsoft, I did spend a fair amount of time in the BizTalk world, and to this day, it remains one of the most common sources of questions I am asked when presenting on WF.

    Paul showed off some cool stuff at TechEd, and yesterday released the code to enable a pretty interesting pattern where processes can be modeled in WF and then an orchestration can be created from the workflow to handle the messaging.  This gives you flexible process modeling in WF and lets you rely on BizTalk to handle all those messy real world details like transforming messages in the send port, communicating via the built in adapters, and handling retries.

    This is a cool project that lets you use both technologies together today.  Check it out and give feedback at the connect site!

  • mwinkle.blog

    WF4 Beta1 => Beta2 Breaking Changes Document Published

    • 0 Comments

    Fresh on microsoft.com downloads, you can get the details of the major breaking changes that occurred for WF between Beta 1 and Beta 2.

    Get the document here.

    We will publish a similar document for any changes between Beta2 and RTM, although that list should be on the shorter side.  If you have feedback on the document, like the way something is presented or think we could have done a better job explaining it, please let me know.  Either comment here or use the “Email” link on the side.

  • mwinkle.blog

    Upcoming Ohio and Michigan Tour

    • 3 Comments

    Next week I will be in Ohio and Michigan meeting with customers and speaking about Windows Workflow Foundation at the Cleveland and Ann Arbor .NET User Group meetings.  If you're in the neighborhood, you should stop on by!

    I'll be posting slides from the talks and will try to summarize questions that I get here as well. 

  • mwinkle.blog

    Microsoft HDInsight Installation & Dependency Management

    • 1 Comments

    It’s a rainy Saturday afternoon here in Seattle, and the kids are keeping themselves busy running around the Christmas tree, so I’ve got a little time to put together a post that addresses some questions that have come up a few times in the forums as well as in our internal discussion aliases for our on-premises install of HDInsight.

    We currently use the Web Platform Installer to take care of dependency management of the installation. 

    It’s important to point out that there are actually two key pieces that get installed:

    • Hortonworks Data Platform installer
    • Microsoft HDInsight installer

    The second has a dependency on the first.  The WebPI feed also contains a number of pre-requisites required to set up IIS and a few other things for the HDInsight dashboard.  Here’s what it looks like on a completely fresh Windows Server 2012 machine.  Most developer machines likely have some or most of the IIS pre-reqs installed.  We’re also working to clean up some of this to minimize installation & setup.

    image

     

    Let’s talk a little bit about what’s in each one.

    Hortonworks Data Platform installer

    This msi includes the core Hadoop bits (Map/Reduce, HDFS), as well as a number of other Apache projects in the Hadoop ecosystem.  The full list of projects included in the current installer is:

    • Map Reduce
    • HDFS
    • Hive
    • Pig
    • HCatalog

    Each of these projects is packaged into a zip file that contains a PowerShell script that automates the installation and setup of the component.  There are more components in Hortonworks Data Platform, and the teams are working to get these packaged and included.

    Microsoft HDInsight Installer

    This msi contains bits that are Microsoft specific, and may also contain additional Hadoop projects.  The current install (as of today) contains:

    • HDInsight dashboard
    • Sqoop
    • Isotope.js
    • Getting Started content

    These are packaged the same way as the Hadoop projects above.  Additionally, there is an installation PowerShell script here which will do some initialization of the single node installer, such as starting the services for the Hadoop components.

    Alternate Approaches

    The team discussed a number of potential factorings, and we very much welcome feedback here.  A few ideas that we’ve thought about:

    • Stable and Experimental packages.  This would allow us to set expectations around quality and stability of the bits. 
    • Decomposing every project into an individual msi
    • Integrating and building a Chocolatey package for these

    What Does This Mean For Me?

    What this means is that when you install HDInsight out of WebPI, you are installing two different msi’s.  We are revving the Microsoft msi every two weeks to pull in bug fixes (and very shortly include some experimental features).  The Hortonworks msi will be revved on a different schedule, as the team there decides to release an update.  We are partnering closely with the team there and so we will coordinate releases so that the combined installation will always work.

    More directly, this implies that if you want to uninstall completely, you will need to uninstall both packages from Add/Remove Programs:

    image

    This also means that when we issue an update for the Microsoft HDInsight package, you don’t have to “lose” your cluster by uninstalling both products.  You should be able to simply uninstall & update the Microsoft HDInsight package. 

    The team would love to get more feedback on this approach, so, let us know what you think!

  • mwinkle.blog

    Talking About Hadoop on Windows

    • 0 Comments

    A few folks have asked, so I decided to put the data in one place.  We'll be talking more about Hadoop at TechEd North America and Hadoop Summit next week, and then later in the month at TechEd Europe.  

    Here are the sessions we're presenting:

    • TechEd (NA and Europe, links are to NA sessions)
      • Learn Big Data Application Development on Windows Azure -- Wenming Ye
        • Web 2.0 companies have been fully taking advantage of Hadoop based open source tools to tackle Big Data needs. Microsoft now offers the best of both worlds with its own Hadoop solution on Windows Azure with full compatibility and additional rich toolsets. This session is a "getting-started" tutorial on developing Big Data applications on Windows Azure. We cover application scenarios, Hadoop on Azure, tools, and applied data analytics. More importantly, we show you how to put everything together with a couple of sample applications
      • Big Data, Big Deal? -- Gert Drapers
        • Are you ready for the exploding world of big data? Do you know the difference between Hive and Pig? Do you know why MapReduce is being taught in many universities rather than SQL? If not, pay attention because this talk will help get you started in understanding this new world. While sometimes the Hadoop toolkit (which includes HDFS, MapReduce, Hive, Pig, and Sqoop) is used as an alternative to relational database systems such as SQL Server, more frequently customers are using it as a complementary tool. Sometimes it may be used as an ETL tool or to perform an initial analysis of a freshly acquired data set to determine whether or not it is worth loading into the data warehouse, and sometimes to process massive data sets that are too big to even contemplate loading into all but the very largest data warehouses. In addition to covering the basics of the various parts of the Hadoop stack, this talk discusses the strengths and weaknesses of the Hadoop approach compared to that provided by relational database systems and explores how the two technologies can be used productively in conjunction with one another.
      • Harnessing Big Data With Hadoop
        • Attend this session to learn about the Hadoop Big Data solution from Microsoft that unlocks insights on all your data, including structured and unstructured data of any size. Accelerate your analytics with a Hadoop service that offers integration with Microsoft BI and the ability to enrich your models with publicly available data. Finally, learn about our roadmap for Hadoop on Windows Server and Windows Azure and for broadening access to Hadoop through simplified deployment, management and programming, including JavaScript integration
    • Hadoop Summit
      • Unleash Insights On All Data With Microsoft Big Data -- Tim Mallalieu
        • Do you plan to extract insights from mountains of data, including unstructured data that is growing faster than ever? Attend this session to learn about Microsoft’s Big Data solution that unlocks insights on all your data, including structured and unstructured data of any size. Accelerate your analytics with a Hadoop service that offers deep integration with Microsoft BI and the ability to enrich your models with publicly available data from outside your firewall. Come and see how Microsoft is broadening access to Hadoop through dramatically simplified deployment, management and programming, including full support for JavaScript.
      • How Klout is changing the landscape of social media with Hadoop and BI -- Denny Lee, David Mariani (VP of Engineering, Klout)
        • In this age of Big Data, data volumes grow exceedingly larger while the technical problems and business scenarios become more complex. Compounding these complexities, data consumers are demanding faster analysis to common business questions asked of their Big Data. This session provides concrete examples of how to address this challenge. We will highlight the use of Big Data technologies—including Hadoop and Hive —with classic BI systems such as SQL Server Analysis Services.

          Session takeaways:
          • Understand the architectural components surrounding Hadoop, Hive, Classic BI, and the Tier-1 BI ecosystem
          • Get strategies for addressing the technical issues when working with extremely large cubes
          • See how to address the technical issues when working with Big Data systems from the DBA perspective

    I think that there is a pretty nice mix of Hadoop, Microsoft plans, and applications across these sessions.  Hope you get a chance to see them (or watch them after the events!)

  • mwinkle.blog

    HELP WANTED

    • 0 Comments

    My boss, James, has a post on his blog about two positions that are open on our team.  One focuses on Orcas evangelism, while the other is for IIS7.  Our team does a ton of cool stuff, and if you're interested, certainly drop James a line.  If you want to create, deliver, and scale your passion for .net (or IIS) to the world, give it a look!

  • mwinkle.blog

    Down In Orlando

    • 0 Comments

    David and I arrived in Orlando yesterday morning via the redeye for TechEd 2007. We're settled into the Port Orleans French Quarter, and will be heading on over to the conference center later today.  Once we get all checked in, I'll post some info on the talks I'll be giving, and the talks that I wouldn't want to miss.  For now, I'm out to enjoy the non-Seattle-like weather and adjust to the time change. 

  • mwinkle.blog

    PDC, 2008

    • 1 Comments

    The site went live last night, check it out and I hope to see you there.

    A few of the sessions that caught my eye.

    Windows 7: Optimizing for Energy Efficiency and Battery Life

    A single application can reduce mobile battery life by up to 30%. Windows 7 provides advances for building energy-efficient applications. In this session we will discuss how to leverage new Windows infrastructure to reduce application power consumption and efficiently schedule background tasks and services.

    Windows 7: Touch Computing

    In Windows 7, innovative touch and gesture support will enable more direct and natural interaction in your applications. This session will highlight the new multi-touch gesture APIs and explain how you can leverage them in your applications.

    Windows Mobile: Location, Location, Location

    Mobile location based services are the next "big thing", and this session will review the tools and technologies Windows Mobile makes available to developers wanting to integrate location into their applications. Location acquisition, services integration, mapping and visualization, and best practices will be discussed.

    Live Platform: Mesh Services Architecture Deep Dive

    You've heard about Microsoft's new software+services platform Live Mesh, combining the world of the web and the world of digital devices. Come take a look under the hood and learn about the underlying service architecture behind this mass-scale cloud service and client platform. We'll look at services such as FeedSync-based synchronization, accounts and security services, P2P communications, pub-sub infrastructure, and the Mesh Operating Environment (MOE).

    Architecture of the Building Block Services

    Dive into the architecture that links many of the building block services and lets ISVs and businesses deliver compelling solutions. Learn how to compose these services to create applications in the cloud and connect them with on-premises systems. In this session we'll cover the next generation of messaging, data, identity, and directory services, and how they help developers.

     

    oh, and this one...

    Advanced Workflow Services

    This session covers significant enhancements in Windows Communication Foundation (WCF) and Workflow Foundation (WF) to deal with the ever increasing complexity of communication patterns. Learn how to use WCF to correlate messages to service instances using transport, context, and application payloads. We'll show you how to use the new WF messaging activities to model rich protocols and how to use WCF as a rich default host for your workflows and expand the reach of WF with features like distributed compensation. See how service definition in XAML completes the union of WF and WCF with a unified authoring experience that dramatically simplifies configuration and is fully integrated w/ IIS activation and deployment.

    We'll probably have a few other things to talk about with WF as well ;-)

  • mwinkle.blog

    Got a cool .NET app? Hang with Tortoises for 12 days

    • 0 Comments

    tortoise

    photo courtesy of flickr user mikeweston 

    This is something that the .NET team is putting together: a contest with super cool prizes (seriously, a 12 day trip to the Galapagos, for real (as my 4 year old says)).  (obligatory legalese here). If you’ve built a sweet app with the .NET framework, we’d like to hear about it.   So check out http://www.myDotNetStory.com today and enter to win.

  • mwinkle.blog

    Updating HDInsight Preview

    • 1 Comments

    Today we’ve shipped an update to the single node HDInsight Server Preview that is installable via Web Platform Installer.  

    What Are We Doing?

    Every two weeks, we’re going to take a snapshot of the work in progress and update the public installer with this (provided it passes a basic level of validation).  This gives us the opportunity to rapidly get new bits in front of customers, experiment with new features, and address bugs in a timely fashion. 

    We’ve also snapped versions such that releases of the SDK as well as the One-Box installer will all share the same version.  As such, we’ve versioned the msi as 0.2.0.0.

    How do I Get This?

    Simply by installing HDInsight out of WebPI onto a new machine, or after uninstalling the previous version (in place upgrade is not currently supported).  To completely uninstall previous versions, you will want to first uninstall “Microsoft HDInsight Community Technology Preview” and then the “Hortonworks Data Platform”.  We will not preserve data in HDFS using uninstall/reinstall, so archive that prior to uninstalling.


    What’s New?

    There are no new features in this release; we’ve worked to improve the install and setup experience and to address some bugs that have been reported on this alias as well as the forums.

    • IIS / Dashboard setup issues experienced on certain OS SKU’s
    • Addition of HDInsight Dashboard link into Start Menu
    • Fix to Hive console for multiple line input errors
    • Clean up some uninstall issues
    • Minor changes to getting started content

    Hortonworks has also shipped an update to the HDP installation to address the bug encountered on Friday (documented here).  By removing both installations above and re-installing, you will get updated bits here.

    What if Something Doesn’t Work?

    Please report issues on the forums.  We’re also tracking a set of known issues via the release notes.

    When Will These Bits Be Updated In Azure?

    We’re using this as an opportunity to ship these bits early and often, and they will find their way into future service updates to Azure HDInsight, typically in the next monthly update.

    What’s Next?

    We’ll continue to address common issues as they are reported.  From a feature perspective, we hope the next update will include updates to the dashboard to include a new and improved Hive console.

  • mwinkle.blog

    Tap, tap, tap, is this thing on?

    • 2 Comments

    I'll be back to blogging here in a bit.  Since you last visited, I've had a fun journey working further on Workflow and Azure, and now I've landed in the SQL team working on Hadoop, in particular, the developer story.

    Not a lot to say right now, but make sure to check out hadoopOnAzure.com !

  • mwinkle.blog

    Azure HDInsight Job Logging

    • 0 Comments

    We’ve made a nice fix to the Templeton job submission service that runs on the HDInsight clusters for remote job submission.  We’ve talked with a number of customers who want to be able to get access to the logs for the jobs remotely as well.  This typically requires access directly to the cluster.  We’ve updated Templeton to support dropping the job logs directly into ASV as part of the status directory.

    The way to do this is to pass “enablelogs” as a query string parameter set to true.  Here’s what the request looks like:

    image
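As a rough illustration of such a request, here is a minimal Python sketch that only builds the URL and form body without sending anything. The one detail taken from this post is the “enablelogs” query string parameter. The cluster address, the `/templeton/v1/hive` path, the form field names, and the status directory are assumptions for illustration, and `build_hive_job_request` is a hypothetical helper, not part of any SDK.

```python
from urllib.parse import urlencode


def build_hive_job_request(cluster_url, user, query, status_dir, enable_logs=True):
    """Return (url, form_body) for a Hive job submission with job logging on.

    Only the "enablelogs" query string parameter comes from the post; the
    endpoint path and the form field names below are assumptions.
    """
    # Assumed endpoint path for Hive submissions through Templeton.
    endpoint = cluster_url.rstrip("/") + "/templeton/v1/hive"

    # The flag goes on the query string, set to true.
    query_string = urlencode({"enablelogs": "true" if enable_logs else "false"})

    # Assumed form fields: the submitting user, the Hive statement, and the
    # status directory that the logs folder is created under.
    form_body = urlencode({
        "user.name": user,
        "execute": query,
        "statusdir": status_dir,
    })
    return endpoint + "?" + query_string, form_body


# Hypothetical values, for illustration only.
url, body = build_hive_job_request(
    "https://example-cluster.invalid",
    "admin",
    "select count(*) from hivesampletable;",
    "example/status",
)
print(url)
print(body)
```

The pair returned here could be handed to whatever HTTP client you already use to POST to the cluster, with the credentials your cluster requires.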

    Upon job completion, the logs will be moved into the status directory, under a logs folder with the following structure:

    $log_root/list.xml (summary of jobs)
    $log_root/stderr (frontend stderr)
    $log_root/stdout (frontend stdout)
    $log_root/$job_id (directory home for a job)
    $log_root/$job_id/job.xml.html
    $log_root/$job_id/$attempt_id (directory home for an attempt)
    $log_root/$job_id/$attempt_id/stderr
    $log_root/$job_id/$attempt_id/stdout
    $log_root/$job_id/$attempt_id/syslog
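If you want to fetch these files with a script, the layout above can be expanded into concrete blob paths. The following is a minimal sketch, assuming you already know the job and attempt ids; `job_log_paths` is a hypothetical helper, the log root is a made-up example, and the ids are the ones that appear in the sample log later in this post.

```python
import posixpath


def job_log_paths(log_root, job_id, attempt_ids):
    """Expand the documented log layout into a flat list of file paths."""
    # Files that sit directly under the log root.
    paths = [posixpath.join(log_root, name) for name in ("list.xml", "stderr", "stdout")]

    # One directory per job, holding the job details page.
    job_dir = posixpath.join(log_root, job_id)
    paths.append(posixpath.join(job_dir, "job.xml.html"))

    # One directory per attempt, each with three log files.
    for attempt_id in attempt_ids:
        for name in ("stderr", "stdout", "syslog"):
            paths.append(posixpath.join(job_dir, attempt_id, name))
    return paths


# The log root is hypothetical; substitute your own status directory.
paths = job_log_paths(
    "example/status/logs",
    "job_201307110233_0003",
    ["attempt_201307110233_0003_m_000000_0"],
)
for path in paths:
    print(path)
```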

    Here’s a screen shot from Storage Studio that shows the folder structure:

    image

    If you look in the syslog file here, you’ll see a bunch of goodness about your job execution.  For more complex jobs that spin off multiple map/reduce jobs (e.g., Pig, Hive, Cascading), you will see the set of jobs recorded there. The root directory will also contain a list.txt with the details of all the jobs, and each job will contain a jobs.xml.html which contains all of the details, environment variables, and configuration of the job.  All of this information is useful in debugging and tuning your jobs.
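Once a syslog file has been downloaded, a small script can pull out the interesting entries. Here is a minimal sketch that assumes each entry starts with a timestamp, a level, and a source followed by a colon, which is the shape of the lines in my sample output in this post. `parse_syslog` is a hypothetical helper, and any line that does not match (such as the operator tree dump) is attached to the preceding entry.

```python
import re
from datetime import datetime

# Assumed entry shape: "<date> <time>,<millis> <LEVEL> <source>: <message>"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (\w+) ([^\s:]+):\s*(.*)$"
)


def parse_syslog(lines):
    """Parse task syslog lines into a list of dict entries."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = LINE_RE.match(line)
        if match:
            stamp, level, source, message = match.groups()
            entries.append({
                "time": datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f"),
                "level": level,
                "source": source,
                "message": message,
                "extra": [],  # continuation lines, if any
            })
        elif entries:
            # Not a new entry: keep it with the entry it follows.
            entries[-1]["extra"].append(line)
    return entries


sample = [
    "2013-07-11 02:48:56,632 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library",
    "2013-07-11 02:48:56,992 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!",
    "2013-07-11 02:48:58,164 INFO ExecMapper:",
    "< MAP>Id =4",
]
entries = parse_syslog(sample)
warnings = [entry for entry in entries if entry["level"] == "WARN"]
for entry in warnings:
    print(entry["time"], entry["source"], entry["message"])
```

Filtering on the level is a quick first pass when tuning a job; the same entries could just as easily be grouped by source.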

    We will be updating the SDK to support this parameter in the next release, but for now, you can submit jobs directly to the cluster and add this parameter to get job logs.

    Here’s the text from my execution of a simple Hive query:

    2013-07-11 02:48:56,632 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
    2013-07-11 02:48:56,726 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: c:/hdfs/mapred/local/taskTracker/distcache/1562199005048822745_1328179698_1268869633/namenodehost/hive/scratch/hive_2013-07-11_02-48-38_757_5240085401495207012/-mr-10003/34a2bf31-e18b-440d-8d91-be8d0e445d2e <- c:\hdfs\mapred\local\taskTracker\admin\jobcache\job_201307110233_0003\attempt_201307110233_0003_m_000000_0\work\HIVE_PLAN34a2bf31-e18b-440d-8d91-be8d0e445d2e
    2013-07-11 02:48:56,992 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
    2013-07-11 02:48:57,289 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.WindowsResourceCalculatorPlugin@a4f5b6d
    2013-07-11 02:48:57,726 WARN org.apache.hadoop.hive.conf.HiveConf: hive-site.xml not found on CLASSPATH
    2013-07-11 02:48:57,961 WARN org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library not loaded
    2013-07-11 02:48:58,101 INFO org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file asv://mwinkletemp37@mwinkle.blob.core.windows.net/hive/warehouse/hivesampletable/HiveSampleData.txt
    2013-07-11 02:48:58,101 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 0
    2013-07-11 02:48:58,117 INFO ExecMapper: maximum memory = 954466304
    2013-07-11 02:48:58,117 INFO ExecMapper: conf classpath = [file:/C:/hdfs/mapred/local/taskTracker/admin/jobcache/job_201307110233_0003/attempt_201307110233_0003_m_000000_0/classpath-5670276484193870096.jar]
    2013-07-11 02:48:58,117 INFO ExecMapper: thread classpath = [file:/C:/hdfs/mapred/local/taskTracker/admin/jobcache/job_201307110233_0003/attempt_201307110233_0003_m_000000_0/classpath-5670276484193870096.jar]
    2013-07-11 02:48:58,164 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding alias hivesampletable to work list for file asv://mwinkletemp37@mwinkle.blob.core.windows.net/hive/warehouse/hivesampletable
    2013-07-11 02:48:58,164 INFO org.apache.hadoop.hive.ql.exec.MapOperator: dump TS struct<clientid:string,querytime:string,market:string,deviceplatform:string,devicemake:string,devicemodel:string,state:string,country:string,querydwelltime:double,sessionid:bigint,sessionpagevieworder:bigint>
    2013-07-11 02:48:58,164 INFO ExecMapper:
    < MAP>Id =4
      <Children>
        <TS>Id =3
           <Children>
            <FIL>Id =2
              <Children>
                 <SEL>Id =1
                  <Children>
                    <FS>Id =0
                      <Parent>Id = 1 null<\Parent>
                    < \FS>
                  <\Children>
                  <Parent>Id = 2 null<\Parent>
                <\SEL>
              <\Children>
              <Parent>Id = 3 null<\Parent>
            <\FIL>
          <\Children>
          <Parent>Id = 4 null<\Parent>
        <\TS>
      <\Children>
    < \MAP>
    2013-07-11 02:48:58,164 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Initializing Self 4 MAP
    2013-07-11 02:48:58,164 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing Self 3 TS
    2013-07-11 02:48:58,164 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Operator 3 TS initialized
    2013-07-11 02:48:58,164 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing children of 3 TS
    2013-07-11 02:48:58,164 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: Initializing child 2 FIL
    2013-07-11 02:48:58,164 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: Initializing Self 2 FIL
    2013-07-11 02:48:58,179 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: Operator 2 FIL initialized
    2013-07-11 02:48:58,179 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: Initializing children of 2 FIL
    2013-07-11 02:48:58,179 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing child 1 SEL
    2013-07-11 02:48:58,179 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing Self 1 SEL
    2013-07-11 02:48:58,179 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: SELECT struct<clientid:string,querytime:string,market:string,deviceplatform:string,devicemake:string,devicemodel:string,state:string,country:string,querydwelltime:double,sessionid:bigint,sessionpagevieworder:bigint>
    2013-07-11 02:48:58,179 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Operator 1 SEL initialized
    2013-07-11 02:48:58,179 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing children of 1 SEL
    2013-07-11 02:48:58,179 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing child 0 FS
    2013-07-11 02:48:58,179 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing Self 0 FS
    2013-07-11 02:48:58,195 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Operator 0 FS initialized
    2013-07-11 02:48:58,195 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initialization Done 0 FS
    2013-07-11 02:48:58,195 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initialization Done 1 SEL
    2013-07-11 02:48:58,195 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: Initialization Done 2 FIL
    2013-07-11 02:48:58,195 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initialization Done 3 TS
    2013-07-11 02:48:58,195 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Initialization Done 4 MAP
    2013-07-11 02:48:58,601 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Processing alias hivesampletable for file asv://mwinkletemp37@mwinkle.blob.core.windows.net/hive/warehouse/hivesampletable
    2013-07-11 02:48:58,601 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 4 forwarding 1 rows
    2013-07-11 02:48:58,601 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 3 forwarding 1 rows
    2013-07-11 02:48:58,617 INFO ExecMapper: ExecMapper: processing 1 rows: used memory = 87647896
    2013-07-11 02:48:58,617 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 4 forwarding 10 rows
    2013-07-11 02:48:58,617 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 3 forwarding 10 rows
    2013-07-11 02:48:58,617 INFO ExecMapper: ExecMapper: processing 10 rows: used memory = 87647896
    2013-07-11 02:48:58,617 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 4 forwarding 100 rows
    2013-07-11 02:48:58,617 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 3 forwarding 100 rows
    2013-07-11 02:48:58,617 INFO ExecMapper: ExecMapper: processing 100 rows: used memory = 87647896
    2013-07-11 02:48:58,632 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 4 forwarding 1000 rows
    2013-07-11 02:48:58,632 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 3 forwarding 1000 rows
    2013-07-11 02:48:58,632 INFO ExecMapper: ExecMapper: processing 1000 rows: used memory = 87647896
    2013-07-11 02:48:58,804 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 4 forwarding 10000 rows
    2013-07-11 02:48:58,804 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 3 forwarding 10000 rows
    2013-07-11 02:48:58,804 INFO ExecMapper: ExecMapper: processing 10000 rows: used memory = 87647896
    2013-07-11 02:48:59,211 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 4 finished. closing...
    2013-07-11 02:48:59,211 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 4 forwarded 59793 rows
    2013-07-11 02:48:59,211 INFO org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0
    2013-07-11 02:48:59,211 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 3 finished. closing...
    2013-07-11 02:48:59,211 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 3 forwarded 59793 rows
    2013-07-11 02:48:59,226 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: 2 finished. closing...
    2013-07-11 02:48:59,226 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: 2 forwarded 0 rows
    2013-07-11 02:48:59,226 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: PASSED:0
    2013-07-11 02:48:59,226 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: FILTERED:59793
    2013-07-11 02:48:59,226 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 finished. closing...
    2013-07-11 02:48:59,226 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 forwarded 0 rows
    2013-07-11 02:48:59,226 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 0 finished. closing...
    2013-07-11 02:48:59,226 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 0 forwarded 0 rows
    2013-07-11 02:48:59,226 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Final Path: FS hdfs://namenodehost:9000/hive/scratch/hive_2013-07-11_02-48-38_757_5240085401495207012/_tmp.-ext-10001/000000_0
    2013-07-11 02:48:59,226 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: FS hdfs://namenodehost:9000/hive/scratch/hive_2013-07-11_02-48-38_757_5240085401495207012/_task_tmp.-ext-10001/_tmp.000000_0
    2013-07-11 02:48:59,226 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: New Final Path: FS hdfs://namenodehost:9000/hive/scratch/hive_2013-07-11_02-48-38_757_5240085401495207012/_tmp.-ext-10001/000000_0
    2013-07-11 02:48:59,336 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 Close done
    2013-07-11 02:48:59,336 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: 2 Close done
    2013-07-11 02:48:59,336 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 3 Close done
    2013-07-11 02:48:59,336 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 4 Close done
    2013-07-11 02:48:59,336 INFO ExecMapper: ExecMapper: processed 59793 rows: used memory = 96526440
    2013-07-11 02:48:59,336 INFO org.apache.hadoop.mapred.Task: Task:attempt_201307110233_0003_m_000000_0 is done. And is in the process of commiting
    2013-07-11 02:48:59,382 INFO org.apache.hadoop.mapred.Task: Task 'attempt_201307110233_0003_m_000000_0' done.
    2013-07-11 02:48:59,414 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
    2013-07-11 02:48:59,476 INFO org.apache.hadoop.io.nativeio.NativeIO: Initialized cache for UID to User mapping with a cache timeout of 14400 seconds.
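A log like this is easier to use for debugging once the row counts are pulled out. In the run above, for example, the FilterOperator reports PASSED:0 and FILTERED:59793, so the query's filter threw away every row it read. Here is a rough sketch of how you might extract those numbers. The regular expressions are my own guess based only on the lines shown here, and `summarize_syslog` is a hypothetical helper name, so treat it as a starting point rather than a general Hive log parser.

```python
import re

# Matches lines such as "...ql.exec.MapOperator: 4 forwarded 59793 rows".
# The in-progress "forwarding N rows" lines are deliberately not matched.
FORWARDED = re.compile(r"ql\.exec\.(\w+): (\d+) forwarded (\d+) rows")

# Matches counter lines such as "...ql.exec.FilterOperator: FILTERED:59793".
COUNTER = re.compile(r"ql\.exec\.(\w+): ([A-Z_]+):(\d+)$")


def summarize_syslog(text):
    """Return (forwarded, counters) extracted from syslog text.

    forwarded maps (operator id, operator class) to the final row count.
    counters maps (operator class, counter name) to the counter value.
    """
    forwarded = {}
    counters = {}
    for line in text.splitlines():
        line = line.strip()
        match = FORWARDED.search(line)
        if match:
            key = (int(match.group(2)), match.group(1))
            forwarded[key] = int(match.group(3))
            continue
        match = COUNTER.search(line)
        if match:
            counters[(match.group(1), match.group(2))] = int(match.group(3))
    return forwarded, counters


# A few lines copied from the log above, used as sample input.
SAMPLE = """\
2013-07-11 02:48:58,617 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 4 forwarding 10 rows
2013-07-11 02:48:59,211 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 4 forwarded 59793 rows
2013-07-11 02:48:59,211 INFO org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0
2013-07-11 02:48:59,226 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: 2 forwarded 0 rows
2013-07-11 02:48:59,226 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: PASSED:0
2013-07-11 02:48:59,226 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: FILTERED:59793
"""

if __name__ == "__main__":
    rows, stats = summarize_syslog(SAMPLE)
    for (op_id, op_class), count in sorted(rows.items()):
        print(op_id, op_class, count)
    for (op_class, name), value in sorted(stats.items()):
        print(op_class, name, value)
```

Comparing the count forwarded by one operator with the count forwarded by the next is a quick way to see where rows disappear in the pipeline.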

  • mwinkle.blog

    Every Day is a Good Day When You Paint

    Slightly off-topic, but a friend of mine posted a video on Facebook that struck me as being very relevant to our space. 

     

    At the same time I saw this, another friend and co-worker was having a fairly crummy day with lots of crazy meetings and requests from co-workers, but that day was made a lot better by sitting down and solving a fun little deployment script problem.  The deployment script itself is neither here nor there; the point is that so many of the folks in this industry love solving a problem, whether that is getting a query right, chasing down a pesky bug, or getting that a-ha moment when you are trying to design something.  That’s “painting” for a lot of us. 

    Every day is a good day when you paint.

    [and working auto-tuned Bob Ross into a blog post was just too good to pass up]
