"User friendly" explanation of how we cache embedded files

Irina Yatsenko, a OneNote tester, wrote up the following explanation of how OneNote stores embedded files and how they interact with the OneNote cache. She wrote it for another user but thought it would be useful for everyone, so I am posting it here. If you have any questions, please let us know. And thanks to Irina!

 

 

What you see in OneNote:

[Image clip_image002: the notebook as seen in OneNote]

 

What you see in File System:

[Image clip_image004: the notebook as seen in the file system]

 

"classical.one" and "jazz.one" both contain embedded files, but you don't see those files in the file system because they are truly embedded inside the sections. The section size reflects this: there is a single page in each section, but the embedded files make each section file rather big:

clip_image006

clip_image008

 

What you see in the cache:

Embedded files are stored separately in OneNoteOfflineCache_Files:

clip_image010

 

There is also the OneNoteOfflineCache.onecache file, which holds the content of all your sections except the embedded files.

 

So, conceptually it looks like this:

clip_image012
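One way to see this split for yourself is to add up the sizes of the three pieces. The sketch below is only an illustration: the function names are made up for this example, and you have to supply the notebook folder, the OneNoteOfflineCache_Files folder and the OneNoteOfflineCache.onecache file yourself, since the post does not say where they live on your machine.

```python
import os


def dir_size(root):
    """Return the total size in bytes of all files under root."""
    total = 0
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            total += os.path.getsize(os.path.join(folder, name))
    return total


def compare_sizes(notebook_dir, cache_files_dir, cache_file):
    """Report the on-disk size of the three pieces described above.

    All three paths are supplied by the caller; nothing here knows
    where OneNote actually keeps its cache.
    """
    return {
        "sections (embedded files included)": dir_size(notebook_dir),
        "cached embedded files": dir_size(cache_files_dir),
        "cache without embedded files": os.path.getsize(cache_file),
    }
```

Calling `compare_sizes(...)` with your own three paths returns a small dictionary of byte counts, which makes it easy to watch how the cache grows compared with the sections themselves.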

What happens when you open or close a notebook?

If you have a folder with some sections (in this example, "studies") and you open it in OneNote as a notebook, OneNote will go through all of the sections and cache them in the manner described above. It will *not* remove the embedded files from the sections stored in the file system.

 

If you now try to play the audio in OneNote, it relies on the cached external copy of the file being present (because it is much faster and easier to play from this file than from the embedded data). So, if you delete the cache folder, OneNote won't be able to play the file. However, if you close the notebook and open it again, OneNote will re-cache everything and all your data should be available again.

 

What might go wrong and cause size bloat?

Some operations inside OneNote create duplicate copies of embedded files; the most common example is copy-pasting an embedded file. So if you spent a session reorganizing your notes and copy-pasted a lot of embedded files, you would see a spike in cache size. Those copies should be removed by OneNote's garbage collection, which leads us to the next point.
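If you suspect this kind of bloat, a rough check is to look for files in the cache folder that have identical content. The sketch below assumes the duplicated copies are byte-for-byte identical, which the explanation above does not actually confirm, and the function names are made up for this example. It only reads files and never deletes anything.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Group files under root that share the same content.

    Returns a list of groups; each group is a sorted list of paths
    whose contents are identical. Files with unique content are left out.
    """
    by_digest = defaultdict(list)
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            by_digest[file_digest(path)].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

A long list of groups right after a copy-paste session would be consistent with the duplication described above; whether those copies are safe to remove is a separate question that this check cannot answer.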

 

OneNote has asynchronous garbage collection (GC) logic: from time to time it cleans up cached files that are no longer needed. In some cases GC either fails to run at all (e.g. access denied conditions sometimes happen on Vista for no apparent reason) or fails to collect specific files because it incorrectly concludes they are still needed.
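To make the two failure modes concrete, here is a toy model of the idea. It is not OneNote's actual algorithm, which isn't described here; the function, its parameters and the file names in the usage note are all invented for illustration.

```python
def collect_garbage(cached, referenced, locked=frozenset()):
    """Toy garbage collection pass over a set of cached file names.

    cached     -- names of files currently in the cache
    referenced -- names the collector believes are still needed
    locked     -- names that cannot be deleted (e.g. access denied)

    Returns a (kept, removed) pair of sets.
    """
    kept, removed = set(), set()
    for name in cached:
        if name in referenced:
            kept.add(name)      # believed to be in use, rightly or wrongly
        elif name in locked:
            kept.add(name)      # should go, but the delete fails
        else:
            removed.add(name)   # unneeded and deletable
    return kept, removed
```

With `cached = {"song.wma", "song_copy.wma", "old.pdf"}` and `referenced = {"song.wma"}`, both leftovers are removed. Pass `locked={"old.pdf"}` and that file survives, which is the access denied case. Leave a stale entry for `"song_copy.wma"` in `referenced` and the copy survives too, which is the "incorrectly concludes they are still needed" case.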

 

The surgical solution is to close all notebooks (making sure that nothing ended up in the Misplaced Sections virtual notebook) and delete all cached files manually. Then re-open the notebooks and let OneNote cache everything afresh, hoping that GC will jump over whatever hole it previously fell into...
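If you would rather script the manual delete than do it in Explorer, a cautious sketch might look like the one below. The function name is made up, the cache folder path is something you must supply yourself, and it defaults to a dry run that only lists what it would remove. It is an illustration of the step above, not a supported tool, and it should only be run after all notebooks are closed.

```python
import os


def clear_cache_files(cache_files_dir, dry_run=True):
    """List, and optionally delete, every file under cache_files_dir.

    With dry_run=True (the default) nothing is deleted; the function
    just returns the sorted list of paths it would remove.
    """
    targets = []
    for folder, _subdirs, files in os.walk(cache_files_dir):
        for name in files:
            targets.append(os.path.join(folder, name))
    targets.sort()
    if not dry_run:
        for path in targets:
            os.remove(path)
    return targets
```

Run it once with the default `dry_run=True` and read the list before calling it again with `dry_run=False`. The sketch only covers files inside the folder you pass in, so the OneNoteOfflineCache.onecache file would still have to be handled separately.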

Published Tuesday, September 18, 2007 10:21 PM by descapa

Comments

# re: "User friendly" explanation of how we cache embedded files

Here is the short version of who/what/why:  I am a teacher/techie at a 1:1 school in Cary, NC.  We have 700+ students with Tablet PCs and ON is the tool for us.  Last year we were in the habit of converting teacher handouts to ON with Send to ON, unfortunately file size created some server issues for us.  This year we have moved to embedding PDFs into ON pages.  Students then edit the handouts using Bluebeam Revu.  As we are dealing with kids from 12-18 y.o., it has been hard to trace, but students have indicated that annotations to the PDF sometimes "disappear."  I am speculating that either the editor or ON closes improperly (end task or system shutdown), although one student reported a hibernation may have also caused loss of data.  Are these the only logical causes for ON not to keep track of the changes in embedded files?  Thanks for your help.  I enjoy your blog and ON even more.  It is an incredible app.

Tuesday, September 18, 2007 9:42 PM by Sam Morris

# re: "User friendly" explanation of how we cache embedded files

If I may offer a suggestion as to why access denied seems to be happening on Vista for no apparent reason... I've noticed that the Vista indexer tends to lock files while it is indexing them. You can verify this using Sysinternals Process Explorer: just search for handles that have the OneNote cache in their path and you'll see all processes that have an open handle to this folder.

Wednesday, September 19, 2007 9:55 AM by Josh Einstein

# How about an explanation of how you sync content?

Dan,

Thanks for sharing.

Any chance one of your people has a detailed description of the concepts behind syncing?  I've done some deep digging, if you will, into the binary files and think I have a handle on the basics, but it'd be cool to see it described by the ON team.

Evan

Thursday, September 20, 2007 11:48 PM by Evan Easton

# re: "User friendly" explanation of how we cache embedded files

Where do you find the cache files exactly?

Tuesday, November 06, 2007 3:07 AM by christian
