
A teaser on how OneNote storage and replication works

The other day someone internally was asking how OneNote stores its files and how often saves actually happen. In other words, if you were to pull the power cord on your computer, what would you lose and what wouldn't you lose? Irina Yatsenko from the OneNote Test team wrote up the following to answer the question, and she wanted me to post it for all to see:

Now, I'll describe in more detail what we do in OneNote 2007:

  1. Internally, all data, from a single paragraph on a page up to a whole notebook, is represented as a graph, which is split into areas we call "graph spaces". This allows us to load and save incrementally, one graph space at a time, so when you open a notebook you see all the section tabs pop up almost immediately even though the pages inside those sections aren't loaded yet. When saving, we can also choose which piece to save rather than saving everything.
  2. We never save directly to the server hosting the files (even if it's the local machine). First we save into a local cache file. Because the cache is local and OneNote has exclusive access to it, we can guarantee that the save always succeeds (if it doesn't, OneNote will force an exit, because running without a cache means users might lose data, and we think it's better to exit than to lose data). Saving into the cache happens every 30 sec or on exit. ([descapa] I have found this to be faster at times, though I am not pulling my power cord out.)
  3. To propagate the data from the cache back to the original location of the sections we use a background process: replication (= sync). The sync schedule depends on the actual store: UNC servers and the local machine replicate every 30 sec, but for SharePoint the default is 10 min. If replication fails (e.g. because the machine has lost power), the cache will still have the data and OneNote will try to replicate again after it is restarted.
  4. The actual mechanics of the incremental save are rather technical. The bottom line is that we have our own binary format and all changes are stored in the form of "revisions", a sort of diff between the current state and the previously saved state. As these revisions grow, OneNote runs an optimization to clean up the revisions and update the main base state.
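
To make point 4 a little more concrete, here is a minimal Python sketch of a revision-based store. It is not OneNote's actual binary format or code: the `RevisionStore` class, the dictionary used as the "base state", and the `max_revisions` threshold are all made up for illustration.

```python
class RevisionStore:
    """Toy model: a base state plus a list of diffs ("revisions")."""

    def __init__(self, max_revisions=3):
        self.base = {}                      # last optimized base state
        self.revisions = []                 # diffs saved since the base was written
        self.max_revisions = max_revisions  # hypothetical optimization threshold

    def load(self):
        """Rebuild the current state by replaying revisions over the base."""
        state = dict(self.base)
        for rev in self.revisions:
            state.update(rev["set"])
            for key in rev["removed"]:
                state.pop(key, None)
        return state

    def save(self, current):
        """Store only what changed since the previously saved state."""
        previous = self.load()
        changed = {k: v for k, v in current.items() if previous.get(k) != v}
        removed = [k for k in previous if k not in current]
        if changed or removed:
            self.revisions.append({"set": changed, "removed": removed})
        if len(self.revisions) > self.max_revisions:
            self.optimize()

    def optimize(self):
        """Fold all revisions into the base state and clear the revision list."""
        self.base = self.load()
        self.revisions = []
```

Each `save` call appends a small diff instead of rewriting everything, and once the list of diffs passes the (assumed) threshold, `optimize` folds them into a new base state.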

 

Hope this clears things up a bit; let me know if you have any questions.

Thanks Irina! I hope this explains why we have a cache (which allows OneNote to go offline, merge changes and more) and why our app works the way it does. The storage tech is actually quite complex and innovative; I didn't really appreciate it until I had to deal with other sync technologies that make me choose which copy is the most up to date. There is still a lot more going on under the covers, but this is a good overview. If you have more questions, please let us know.
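
The cache-then-replicate flow from points 2 and 3 can be sketched roughly as follows. This is only an illustration in Python, not how OneNote is implemented: `LocalCache`, `FlakyStore` and `sync_interval_seconds` are hypothetical names, and treating every http(s) location as SharePoint is an assumption of the sketch.

```python
def sync_interval_seconds(location):
    """Pick a replication interval from the kind of location."""
    if location.lower().startswith(("http://", "https://")):
        return 10 * 60   # SharePoint default from the post: 10 min
    return 30            # UNC share or local folder: 30 sec


class LocalCache:
    """Toy cache: saves always land here first and are replicated later."""

    def __init__(self):
        self.pending = []    # changes saved locally but not yet replicated

    def save(self, change):
        # Saving to the local cache never depends on the remote store.
        self.pending.append(change)

    def replicate(self, store):
        """Try to push pending changes; keep them if the store is unreachable."""
        try:
            store.write(list(self.pending))
        except ConnectionError:
            return False     # data stays in the cache for the next attempt
        self.pending.clear()
        return True


class FlakyStore:
    """Hypothetical remote location that can be switched offline."""

    def __init__(self):
        self.online = True
        self.data = []

    def write(self, changes):
        if not self.online:
            raise ConnectionError("store unreachable")
        self.data.extend(changes)
```

If `replicate` fails, nothing is lost: the changes stay in `pending` and go out on the next successful attempt, which is the same idea as the cache retrying after OneNote restarts.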

Published Tuesday, February 20, 2007 11:31 PM by descapa

Comments

# re: A teaser on how OneNote storage and replication works

As a OneNote 2003 junkie I have great interest in this new version.

So for the "incremental save"  does that mean I can use a USB drive to just carry the "incremental change" data around and not have to worry about having to carry the entire notebook file(s) around?

Tuesday, February 20, 2007 9:10 PM by Dave Fourputt

# re: A teaser on how OneNote storage and replication works

Dave - As you can see on this blog post:

http://blogs.msdn.com/descapa/archive/2006/08/02/686087.aspx

with OneNote 2007 you can store all of your notes on a USB drive and sync between two computers.

Tuesday, February 20, 2007 9:29 PM by descapa

# re: A teaser on how OneNote storage and replication works

Perhaps the periodic "Optimization of revisions" described above explains my biggest problem with OneNote 2007: sometimes OneNote disk utilization jumps to constant (hard drive light full on) and stays that way for a couple of hours. During this time, CPU utilization hovers between 90 and 100 percent, although Task Manager claims that OneNote's CPU usage is very low. However, if I terminate OneNote, the disk and CPU usage immediately return to normal. If this optimization really is a possible culprit, I would be interested to know if there is anything I can do to "force" the optimization to be done at a certain time, so that it doesn't happen when I'm trying to take notes in a meeting, for instance.

Wednesday, February 21, 2007 6:00 PM by Blair

# re: A teaser on how OneNote storage and replication works

I did not make my point clear.   Do just the "deltas"  go onto the USB drive?

Like:

Computer-A's OneNote  ---->User makes change--->delta goes to USB---->USB plugged into Computer-B------>Computer-B's OneNote synched.

Without this, if your OneNote notebook is larger than the USB drive's capacity, there will be a problem.

Wednesday, February 21, 2007 8:01 PM by Dave Fourputt

# re: A teaser on how OneNote storage and replication works

Dave - No, the whole file is stored on the USB drive (which will include the deltas and the base).  If you store your notes on USB then all of your notes will be there, but you can make changes on either computer, and when you plug in the USB drive OneNote will sync the changes to the device.  If there are too many deltas then OneNote will optimize the files.

More clear now?

Thursday, February 22, 2007 12:50 PM by descapa

# re: A teaser on how OneNote storage and replication works

Blair - You can look in Tools-->Options under Save and there are some options in there.  You can tell OneNote to run all of your optimizations when you click a button and it should clean everything up.  In most cases I never have problems with optimization except for when I ran the beta release.  In RTM I haven't had problems.

Here is what I suggest: click on the Optimize Now button and let OneNote finish.  Then see if you get those problems again.  Let us know if this fixes your problem.

Thursday, February 22, 2007 6:35 PM by descapa

# re: A teaser on how OneNote storage and replication works

It's clear now.   With the USB synch method the size of your Notebook is limited by the size of your USB drive.

Ugh.   Why not take a snapshot of the notebook at startup, let one do their work and then push a button to create the delta file?    Then that delta file can be used to sync the notebook on the other machine.

In any case I am a OneNote 2003 junkie and I'll probably upgrade to OneNote 2007.   I'm also keeping my fingers crossed for Zoho, as a hosted solution for notes would be mighty cool.

Great blog!   Subscribed!!!!   :-)

Thursday, February 22, 2007 9:36 PM by Dave Fourputt

# re: A teaser on how OneNote storage and replication works

Hi Dan,

Thanks for suggesting the Optimize Now option. I tried it, and it does seem to exhibit the behavior (high disk and CPU usage, PC is much less responsive, goes on for more than 2 hours) that I was unhappy about.

I can understand what Irina described about there being a certain "threshold" of unsaved revisions beyond which OneNote decides to synchronize/optimize them. It would be helpful, though, if I had some option, when the sync starts up automatically, to tell OneNote "Now is not a good time!" and have it back off and try again later. (I noticed that I had the option to cancel when I manually invoked the Optimize.)

Saturday, February 24, 2007 2:34 PM by Blair

# re: A teaser on how OneNote storage and replication works

By the way, I should have mentioned that these problems have been experienced using 2007 RTM.

Saturday, February 24, 2007 2:35 PM by Blair

# re: A teaser on how OneNote storage and replication works

Dan,

Do you know if there's a way to tailor the UNC sync interval?  Here's why.

I tried running against a non-IIS WebDAV server to share some work with buddies of mine.  Despite all my best efforts to configure it to work correctly, the whole setup is just unstable.  So now I've set up VPNed access to a samba share on a personal machine.

Suffice to say, due to cable upload speeds, access to the SMB shares is pretty slow.  So slow that OneNote ALWAYS indicates that it is synching with the share.  It'd be nice if I could tweak the registry or something to tell it only to ping the server every 10 minutes or so.

Is this possible?

Evan

Thursday, March 01, 2007 2:35 PM by Evan Easton

# re: A teaser on how OneNote storage and replication works

Evan - I just looked and there are no policies/reg keys for the UNC sync interval, only on SharePoint.  If you were to connect via http:// then it would be 10 minutes instead of 30 seconds.

How about telling OneNote to work offline and then going back online?  You can do this by going to File-->Sync-->Sync Status.  There you can choose Work Offline and then go back online when you are ready to sync.  Will this work for you?

Monday, March 05, 2007 4:56 AM by descapa

# re: A teaser on how OneNote storage and replication works

Dan, that's (going offline and back on later) exactly what I and the others I'm working with are doing for the moment.  I just don't have access to a SharePoint service (or IIS WebDAV) and am not too keen on investing in a pay-for service at this point.  

Of course, every once in a while someone forgets to go online when we're collaborating and has a d'oh moment after wondering why they're not seeing updates.

So it would be nice if the OneNote team could consider adding per-notebook sync schedule tailoring regardless of the type of share used.

Thanks.

Wednesday, March 07, 2007 2:30 PM by Evan Easton

# re: A teaser on how OneNote storage and replication works

Evan - Good feedback...have you thought about having just a simple account with Office Live?  I believe they have a free service that will let you do SharePoint over the Internet.  Perfect for what you are doing.  Maybe this doesn't work for you but it is a solution.

Otherwise good feedback

Wednesday, March 07, 2007 5:37 PM by descapa

# re: A teaser on how OneNote storage and replication works

I'll take a look into it.  I did do a quick search on "free sharepoint" and "free webdav" a while ago but most outfits had ridiculously small disk space offerings.  500MB for the Office Live Basics might do me for a while.

Thursday, March 08, 2007 8:48 AM by Evan Easton

# re: A teaser on how OneNote storage and replication works

Can we change cache location programmatically?

Monday, July 30, 2007 8:15 AM by Ravi

# re: A teaser on how OneNote storage and replication works

No you cannot do this with the OneNote API.  That value is stored in the registry so you could modify it via registry APIs and then reboot OneNote so it would use the new cache location.  Hope that helps

Wednesday, August 01, 2007 8:14 PM by descapa

# re: A teaser on how OneNote storage and replication works

Can we change OneNote cache file path programmatically?

Friday, August 03, 2007 1:44 AM by Ravi

# re: A teaser on how OneNote storage and replication works

Sorry for posting the question again; I can't see the reply you gave.

Friday, August 03, 2007 1:47 AM by Ravi

# re: A teaser on how OneNote storage and replication works

No you cannot do this with the OneNote API.  That value is stored in the registry so you could modify it via registry APIs and then reboot OneNote so it would use the new cache location.  Hope that helps

Friday, August 03, 2007 2:03 AM by descapa

# re: A teaser on how OneNote storage and replication works

Can we get the OneNote cache location using the OneNote API?

In other words, I need to get the path (C:\Documents and Settings\....\OneNoteOfflineCache_Files) programmatically at run time.

Friday, August 03, 2007 2:46 AM by Harshal

# re: A teaser on how OneNote storage and replication works

Thanks Dan,

Can you tell me the registry location where we can find the offline cache file path for OneNote? I tried looking in the registry, but only a few paths are available there, like the backup folder path and unfiledNoteSection, under HKEY_CURRENT_USER\software\Microsoft\Office\12.0\OneNote\Options\Paths.

Friday, August 03, 2007 5:35 AM by Ravi

# re: A teaser on how OneNote storage and replication works

It should be in there as well but you just haven't configured the option so it doesn't appear in the registry since ON uses the default value.

Please go to Tools-->Options, Save and choose to modify the cache location.  Then look for that key in the registry.  Now you have the key you can use.  Hope this helps
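
If you want to look at that key from code, here is a small Python sketch using the standard `winreg` module. The key path is the one quoted above; the name of the cache value is not given anywhere in this thread, so the sketch lists every value under the key rather than guessing one. The function name `read_onenote_paths` is made up for the example.

```python
import sys

# Key path as quoted in the comment above (Office 12.0 = OneNote 2007).
PATHS_KEY = r"Software\Microsoft\Office\12.0\OneNote\Options\Paths"


def read_onenote_paths():
    """Return {value_name: data} for every value under the Paths key.

    Returns an empty dict on non-Windows systems or if the key is missing.
    """
    if sys.platform != "win32":
        return {}
    import winreg  # standard library, Windows only

    values = {}
    try:
        with winreg.OpenKey(winreg.HKEY_CURRENT_USER, PATHS_KEY) as key:
            value_count = winreg.QueryInfoKey(key)[1]  # number of values
            for index in range(value_count):
                name, data, _type = winreg.EnumValue(key, index)
                values[name] = data
    except OSError:
        pass  # key not present for this user
    return values


if __name__ == "__main__":
    for name, data in read_onenote_paths().items():
        print(f"{name} = {data}")
```

Run it once before and once after changing the cache location in Tools-->Options, Save; whichever value appears or changes should be the one you are after.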

Friday, August 03, 2007 1:56 PM by descapa

# Embedded files & OneNote section file size

I love getting emails from readers of the blog, it is great to hear from so many of you. Of course I

Wednesday, December 12, 2007 9:43 PM by Daniel Escapa's OneNote Blog

# re: A teaser on how OneNote storage and replication works

One problem with this optimization scheme that I and many of my fellow students are suffering from is the following:

If you are working on one page, say for a 1.5 hr lecture, then somewhere past 45 min into the lecture optimization kicks in (I believe) and fries your session. This happens consistently for many people. I believe it has to do with "percentage of unused space allowed in files without optimizing", but I am not entirely sure...

This problem is particularly bad if you are also recording your lectures, as CPU usage will skyrocket to 100%, making OneNote unusable!

Thursday, February 14, 2008 2:33 PM by Robert
