
Map Geek Heaven

[After letting this blog languish for over a year - mostly because all of the OneNote topics I might have wanted to write about have been under tight wraps - I'm going to try to fire it up again.  It will probably end up being mostly general tech blather, since we're setting up a separate OneNote team blog]

SmugMug recently started supporting geocoding of images, and has nice integration with Google Maps. Unless you have a camera with a built-in GPS, however, their current method of marking up images with lat/lon gets old rather quickly. There clearly needs to be an interface that shows you all the pictures in a gallery and lets you drag/drop them onto the map to indicate where they were taken. Perhaps there's some external app that hooks up to Google Earth or NASA World Wind to edit JPEG EXIF like this?

Speaking of which - I've been using Google Earth (formerly Keyhole) for a while, but only got around to playing with World Wind a few weeks ago (both of these apps are streaming 3D earth mapping clients - the map data streams in on demand as you navigate around the planet).  What I like about WW:

  • A much more useful mouse navigation model. Unlike in GE, you can actually pan, zoom and rotate without using the keyboard.
  • Many base imagery layers, including Landsat satellite, USGS topo and orthophoto (higher-resolution than all but the urban-area satellite imagery), MODIS (up-to-date imagery of current events), plus a plethora of specialized, animatable thematic layers. GE has only a single bitmap for the entire planet.
  • Related to the above, support for different layers for different camera elevations (e.g. 1:24,000 topo if you're at 2,000m, 1:100,000 topo if you're at 20,000m).
  • It's completely free (GE has "Plus" and "Pro" versions).
  • It's written in C# using managed DirectX.  Gotta love that.

Google Earth still has the edge on:

  • Render quality and speed. It's snappier and uses properly filtered textures at large distances, plus it has a nice atmosphere effect. Given the same data, it just looks better.
  • Better vector data layers (streets, rivers, etc.).
  • More consistently accurate elevation data. WW's data seems to be based on the Shuttle Radar Topography Mission data set, which can be higher resolution but is marred by occasional glitches (sudden cliffs where there shouldn't be any, etc.).
Posted by pbaer | 0 Comments

HTML Import in OneNote 2003

Copy/paste is one of those invisible features that you never really think about or notice until something goes wrong.  It should just work, right?  Especially for OneNote (which is supposed to be a sort of “data well” [1]), good clipboard integration is vital, and that turns out to actually be kind of hard.  We spent as much time and effort on it as any other major feature.  Here's why.

The primary interchange format in Office, including OneNote, is HTML.  When you paste content from one Office application, or the web, into another Office application, you're moving HTML around.  The source app has to take the content you selected and transform it from its own internal data format into HTML.  Then, the destination app has to take this HTML and transform it into its internal data format.  Choosing HTML as the lowest common denominator for this handoff has an obvious advantage - if you're writing code to read and write HTML anyway (because it's circa 1995 and that's the sort of thing you do now [2]), then you might as well use it to exchange content between apps as well.  And since it's an endlessly pliable format, it's easy to load it with Office-specific goo to make the exchange appropriately richer between Office apps than with other (“downlevel”) apps without having to use a different format.  Brilliant!

But the flexible, general-purpose nature of HTML is also what makes writing a really good (that is, invisible) importer for it maddeningly difficult.  To accommodate the needs of all these billions of web pages, HTML has evolved into an electronic publishing format that is as idiosyncratic as it is rich.  Consuming content from the web means being prepared to deal with any weird glob of HTML that the web designer, via your user, may choose to hurl at you.

Now, WYSIWYG fidelity when pasting external content was never the design goal - without a general-purpose HTML layout engine at our core, that was impossible anyway.  Rather, the goal was to turn that content into great OneNote outlines.  And here we run into a very basic problem: there are a lot of things you can express in HTML that don't have any meaning in OneNote.

For example, OneNote doesn't have tables.  You can nest headings in an outline to produce table-like layout, but that's it.  This is actually a bigger deal with respect to HTML import than it might first appear, because a lot of web designers use the <table> tag to lay out content on the page, not just to display "tables" in the traditional sense.  If we created nested headings whenever we saw a <table> in HTML, the output would, frankly, be a mess most of the time [3].  So we decided to do this only when importing from other Office apps (where we have a reasonable expectation that a <table> tag actually corresponds to something that looks like a table to the user - Excel being the prime example), and to ignore <table> tags in general HTML from the web.  That's why it's not uncommon to select a bit of harmless-looking text on a web page and have it show up linearized in some unexpected way when it's pasted into OneNote - chances are the content was chopped up into table cells in the source HTML.
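To make the two behaviors concrete, here is a minimal Python sketch (not OneNote's actual code). The `OutlineImporter` class, the `import_html` function, and the "first cell is the heading, later cells nest under it" rule are all assumptions made up for illustration; the post doesn't say how nested headings were actually built from rows and cells.

```python
from html.parser import HTMLParser


class OutlineImporter(HTMLParser):
    """Collect (level, text) outline items from an HTML fragment.

    Hypothetical illustration only; names and nesting rule are assumptions.
    """

    def __init__(self, trust_tables: bool):
        super().__init__()
        self.trust_tables = trust_tables  # True only for Office sources
        self.items = []                   # list of (level, text) tuples
        self._cell_index = 0              # 1-based position of the current cell in its row

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cell_index = 0
        elif tag in ("td", "th"):
            self._cell_index += 1

    def handle_endtag(self, tag):
        # Leaving a row or table: following text is no longer inside a cell.
        if tag in ("tr", "table"):
            self._cell_index = 0

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Trusted table: first cell of a row is a heading, later cells nest under it.
        # Untrusted table: table tags are ignored, so every cell lands at the top level.
        nested = self.trust_tables and self._cell_index > 1
        self.items.append((1 if nested else 0, text))


def import_html(html: str, from_office: bool):
    parser = OutlineImporter(trust_tables=from_office)
    parser.feed(html)
    parser.close()
    return parser.items


fragment = (
    "<table><tr><td>Name</td><td>Ada</td></tr>"
    "<tr><td>Role</td><td>Engineer</td></tr></table>"
)
web_items = import_html(fragment, from_office=False)
office_items = import_html(fragment, from_office=True)
```

With `from_office=False` the four cells come out as four unrelated top-level paragraphs, which is the "linearized in some unexpected way" effect described above. With `from_office=True` the second cell of each row is nested under the first.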

We also run into problems when HTML can express something at a finer granularity than we can.  For example, we attempt to figure out what each pasted paragraph's "indent" on the page should be, so that we can preserve any outline-like structure that may have existed in the source content.  But outline elements in OneNote can only be indented in half-inch increments, so we have to snap each imported element to the next half-inch indent level, which can cause outline elements that were at different indents in the source to land at the same level in OneNote.  Argh.
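The collision is easy to see in a few lines of Python. This is only a sketch: the function name is made up, and rounding to the nearest half-inch is an assumption (the real importer may well round up or down instead).

```python
import math

INDENT_STEP_INCHES = 0.5  # outline levels sit at half-inch increments


def snap_indent_level(indent_inches: float) -> int:
    """Return the outline level (0, 1, 2, ...) nearest to a source indent.

    Assumption: round to the nearest step, with halves rounding up.
    """
    return max(0, math.floor(indent_inches / INDENT_STEP_INCHES + 0.5))


# Four source paragraphs with four distinct indents, in inches.
source_indents = [0.0, 0.3, 0.45, 1.0]
levels = [snap_indent_level(indent) for indent in source_indents]
```

Here the paragraphs at 0.3" and 0.45" both snap to level 1, so a visible difference in the source disappears after import.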

The truth is, complex content pasted from the web or other apps will probably always require some amount of cleanup before you're happy with it.  But I think we've made it as painless as possible given the constraints.

1: “NoteWell” was an actual name we considered for the product at one point, though I'm not entirely sure whether “Well” was supposed to be an adverb or a noun.  Maybe that was the point.  Someone also proposed the Latin equivalent of the adverb form, “Nota Bene,” but a) apparently the company has a rule that product names have to either be English or completely made up (e.g. “Encarta”), and b) it's already taken anyway.  You can read more about the OneNote naming process in Chris Pratley's blog.

2: Chris has some background on this.

3: In the original OneNote 2003 release, our plaintext import (used when HTML isn't available) created nested headings when it saw inline TABs - i.e. TABs that are not at the beginning of a line.  A lot of text pasted from Notepad and other non-HTML-emitting apps didn't show up very nicely when factored into a nested heading outline like this, so we dropped it in SP1 and now try to preserve the whitespace within a line as well as we can.
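A rough Python sketch of the difference between the two plaintext behaviors in footnote 3. The function names and the exact nesting (first piece as the heading, remaining pieces one level down) are assumptions for illustration, not the shipped logic.

```python
def has_inline_tab(line: str) -> bool:
    """True if the line contains a TAB after its leading TABs."""
    return "\t" in line.lstrip("\t")


def import_line_original(line: str):
    """Original-release style (assumed): split on inline TABs and nest the pieces."""
    if not has_inline_tab(line):
        return [(0, line)]
    head, *rest = line.lstrip("\t").split("\t")
    return [(0, head)] + [(1, piece) for piece in rest if piece]


def import_line_sp1(line: str):
    """SP1 style (assumed): keep the line whole, whitespace included."""
    return [(0, line)]


row = "Milk\t2 gal\t$5"
original_items = import_line_original(row)
sp1_items = import_line_sp1(row)
```

The original-style import turns one line of tab-separated text into a three-item outline, while the SP1-style import leaves it as a single line with its TABs intact.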

Posted by pbaer | 5 Comments

All the cool kids are doing it...

Sure seems like blogging is the hip thing to be doing at Microsoft these days!

I'll be writing mostly about OneNote, the note-taking application that debuted last year in the Office System.  I've been a developer on that product since its inception.  Previous to that, I worked on a (fondly remembered) product called PhotoDraw.

Posted by pbaer | 3 Comments
 