I have been working with some developers who are building mobile applications that connect to HealthVault, and the issue of synchronizing data between the mobile device and HealthVault has come up a couple of times.  We’ve come up with some guidance in this area, and I’d like to share it more widely.

I’m assuming that the mobile application needs to keep the data around to work effectively – for example, it may want to produce a chart of weight values for the last few years. If your application only works with small amounts of data, you may not need to store any data locally.

Our goal

First, a bit of expectation setting about what our goal is.

We can start with a goal of “perfect sync” between the mobile application and HealthVault. It is possible to be current within a few seconds – but to do so will result in:

  1. A slower mobile application
  2. More bandwidth usage by the application
  3. Lots of extra requests to the HealthVault servers

The first two aren’t good for the user of the application, and the HealthVault team would like to avoid the third one.

Instead of trying for the “perfect sync”, we’re going to try for a “good enough sync” – a scheme that is right the majority of the time for the majority of users.

“Good enough sync” guidelines

These are our current guidelines for keeping data synced on mobile devices.

1. Fetch only the items that have changed

If your application fetches all of the weight items in a user’s record and the user has recorded daily weights for the last 3 years, it’s going to take a while to process 1000+ weights.

Instead, when the application fetches data items, request the audits section in the format element that accompanies the filter:

<info>
  <group>
    <filter>
      <type-id>30cafccc-047d-4288-94ef-643571f7919d</type-id>
      <thing-state>Active</thing-state>
    </filter>
    <format>
      <section>audits</section>
      <section>core</section>
      <xml/>
    </format>
  </group>
</info>

You will get back XML that looks like this:

<thing>
  <thing-id version-stamp="cdb934a8-08c9-44a2-85e2-429cde027d01">1a868d6f-26c2-4504-bbc5-7b28b11de351</thing-id>
  <type-id name="Medication">30cafccc-047d-4288-94ef-643571f7919d</type-id>
  <thing-state>Active</thing-state>
  <flags>0</flags>
  <eff-date>2008-09-12T09:46:27.987</eff-date>
  <created>
    <timestamp>2008-09-12T16:46:28.22Z</timestamp>
    <app-id name="HelloWorld-SDK">05a059c9-c309-46af-9b86-b06d42510550</app-id>
    <person-id name="Eric Gunnerson">41b87f12-9b6f-40b6-a6ee-be4fbfd12170</person-id>
    <access-avenue>Online</access-avenue>
    <audit-action>Created</audit-action>
  </created>
  <updated>
    <timestamp>2008-09-12T16:46:28.22Z</timestamp>
    <app-id name="HelloWorld-SDK">05a059c9-c309-46af-9b86-b06d42510550</app-id>
    <person-id name="Eric Gunnerson">41b87f12-9b6f-40b6-a6ee-be4fbfd12170</person-id>
    <access-avenue>Online</access-avenue>
    <audit-action>Created</audit-action>
  </updated>
  <data-xml>
    <medication>
      <name>
        <text>Marmoset</text>
      </name>
    </medication>
    <common/>
  </data-xml>
</thing>

When you process the items, extract the updated timestamp from each one and keep track of the maximum value across all of the items that you have fetched. Then, on the next call, you can ask for only the items that are newer than that timestamp:

<filter>
  <type-id>30cafccc-047d-4288-94ef-643571f7919d</type-id>
  <thing-state>Active</thing-state>
  <updated-date-min>2011-10-10T16:50:45.334Z</updated-date-min>
</filter>

That will reduce the amount of data that you fetch considerably.
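
To make the bookkeeping concrete, here is a minimal Python sketch of the two steps above: find the newest updated timestamp in a batch of things, then build the next filter from it. The function names and the wrapping root element are hypothetical, and the sketch assumes the response fragment has already been pulled out of the transport envelope.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def parse_timestamp(text):
    """Parse an audit timestamp such as 2008-09-12T16:46:28.22Z (UTC)."""
    text = text.strip().rstrip("Z")
    fmt = "%Y-%m-%dT%H:%M:%S.%f" if "." in text else "%Y-%m-%dT%H:%M:%S"
    return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)


def latest_update(things_xml, previous=None):
    """Return the newest <updated>/<timestamp> among the things, or `previous`."""
    newest = previous
    for thing in ET.fromstring(things_xml).iter("thing"):
        stamp = thing.findtext("updated/timestamp")
        if stamp is None:
            continue  # audits section was not requested for this item
        when = parse_timestamp(stamp)
        if newest is None or when > newest:
            newest = when
    return newest


def build_filter(type_id, since=None):
    """Build a <filter>; add <updated-date-min> only when `since` is known."""
    filt = ET.Element("filter")
    ET.SubElement(filt, "type-id").text = type_id
    ET.SubElement(filt, "thing-state").text = "Active"
    if since is not None:
        millis = since.microsecond // 1000
        stamp = since.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis:03d}Z"
        ET.SubElement(filt, "updated-date-min").text = stamp
    return ET.tostring(filt, encoding="unicode")
```

The application would persist the value returned by latest_update between runs and pass it back in as both `previous` and `since`. If the lower bound turns out to be inclusive, the newest item may come back a second time, so it is safest to insert or replace cached items by thing id rather than append them.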

2. Fetch partial things initially

When a filter would return a large number of items, HealthVault will return the more recently updated items in their entirety, and then it will return the remainder of the items as partial items. Instead of a full “<thing></thing>” structure, you will see something like this:

<unprocessed-thing-key-info>
  <thing-id version-stamp="fdd3f6cd-6eb6-46d7-bd7d-3d2c2fccc08b">34b0f978-f454-476a-b4a2-004c4a504e66</thing-id>
  <type-id name="Medication">5c5f1223-f63c-4464-870c-3e36ba471def</type-id>
  <eff-date>2011-07-22T19:22:30.94</eff-date>
</unprocessed-thing-key-info>

The number of full items that are returned can be specified by the caller, in the following manner:

<group max-full="1">
  <filter>
    <type-id>30cafccc-047d-4288-94ef-643571f7919d</type-id>
    <thing-state>Active</thing-state>
  </filter>
  <format>
    <section>audits</section>
    <section>core</section>
    <xml/>
    <type-version-format>30cafccc-047d-4288-94ef-643571f7919d</type-version-format>
  </format>
</group>

The “max-full” attribute sets the number of full items returned in this request. There is a hard limit of 240 full items per request.

Instead of fetching a large number of instances all at once, an application should fetch a few full items initially and save the unprocessed keys. It can then fetch the remaining items in the background. This is done by listing the instance ids (obtained from the <thing-id> element of each unprocessed thing key info section) in the group:

<group>
  <id>34b0f978-f454-476a-b4a2-004c4a504e66</id>
  <id>e8655aca-eb68-4670-b253-4bc3585fa513</id>
  <format>
    <section>audits</section>
    <section>core</section>
    <xml/>
  </format>
</group>
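
The following Python sketch shows one way to handle this: split a response group into full things and pending ids, then build follow-up groups for the pending ids in batches. The function names and the batch size of 50 are assumptions for illustration; the batch size only needs to stay under the limit on full items per request.

```python
import xml.etree.ElementTree as ET


def split_group(group_xml):
    """Return (full_things, pending_ids) for one response group."""
    root = ET.fromstring(group_xml)
    full = list(root.iter("thing"))
    pending = []
    for key in root.iter("unprocessed-thing-key-info"):
        thing_id = key.findtext("thing-id")
        if thing_id:
            pending.append(thing_id)
    return full, pending


def id_groups(pending_ids, batch_size=50):
    """Yield one <group> request fragment per batch of pending ids."""
    for start in range(0, len(pending_ids), batch_size):
        group = ET.Element("group")
        for thing_id in pending_ids[start:start + batch_size]:
            ET.SubElement(group, "id").text = thing_id
        fmt = ET.SubElement(group, "format")
        ET.SubElement(fmt, "section").text = "audits"
        ET.SubElement(fmt, "section").text = "core"
        ET.SubElement(fmt, "xml")
        yield ET.tostring(group, encoding="unicode")
```

The application can display the full things right away and work through the batches from id_groups on a background thread, updating the local cache as each response arrives.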

 

3. Consider fetching more than one data type in a single request

If your application needs to fetch data for more than one data type, you should consider whether you can fetch all the data in a single get request. This will reduce the number of trips you make to the server and the amount of data you send over the network.

This is done by including multiple groups, each with its own filter, in a single request:

<group>
  <filter>
    <type-id>30cafccc-047d-4288-94ef-643571f7919d</type-id>
    <thing-state>Active</thing-state>
  </filter>
  …
</group>
<group>
  <filter>
    <type-id>3d34d87e-7fc1-4153-800f-f56592cb0d17</type-id>
    <thing-state>Active</thing-state>
  </filter>
   …
</group>

The response will have one group for each group in the request.
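
A request like this is easy to generate from a list of type ids. Here is a minimal Python sketch; build_info is a hypothetical helper, and the format sections are simply copied from the earlier examples.

```python
import xml.etree.ElementTree as ET


def build_info(type_ids, max_full=None):
    """Build an <info> element with one <group> per data type."""
    info = ET.Element("info")
    for type_id in type_ids:
        group = ET.SubElement(info, "group")
        if max_full is not None:
            group.set("max-full", str(max_full))
        filt = ET.SubElement(group, "filter")
        ET.SubElement(filt, "type-id").text = type_id
        ET.SubElement(filt, "thing-state").text = "Active"
        fmt = ET.SubElement(group, "format")
        ET.SubElement(fmt, "section").text = "audits"
        ET.SubElement(fmt, "section").text = "core"
        ET.SubElement(fmt, "xml")
    return ET.tostring(info, encoding="unicode")
```

Calling build_info with the two type ids shown above produces a single request body with two groups, so both data types are fetched in one round trip.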

4. Fetch only the data that you need now

HealthVault records can contain a considerable amount of data. A weight tracking application may work fine with your set of test data, but if it gets pointed at a record that contains daily weights for the past 3 years, it may roll over and die.

Applications should instead limit their queries to fetch bounded amounts of data. If the application is only going to show the user their weight for the last month, it should only fetch that data.
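
As an illustration of a bounded query, here is a small Python sketch that builds a filter covering only the last N days. It assumes the filter accepts an eff-date-min element, analogous to the updated-date-min element shown earlier; that element name is an assumption and should be checked against the GetThings schema. The helper name is hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta


def recent_filter(type_id, days, now=None):
    """Build a <filter> limited to items effective within the last `days` days."""
    now = now or datetime.now()
    cutoff = now - timedelta(days=days)
    filt = ET.Element("filter")
    ET.SubElement(filt, "type-id").text = type_id
    ET.SubElement(filt, "thing-state").text = "Active"
    # Assumed element name; eff-date values in the examples carry no "Z" suffix.
    ET.SubElement(filt, "eff-date-min").text = cutoff.strftime("%Y-%m-%dT%H:%M:%S")
    return ET.tostring(filt, encoding="unicode")
```

If the user later scrolls back to an older period, the application can issue another bounded query for that range instead of fetching everything up front.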

5. Fetch only on startup and when data has been changed locally

Mobile apps tend to run for short periods of time, so the recommended approach is to check for new data at startup and whenever you know there is new data (i.e., you put it there).

Applications may add a “refresh” option if desired.

Deleted data

HealthVault does not currently provide a simple way for applications to determine when data has been deleted from a user’s HealthVault record. Since deletions are rare in HealthVault, we recommend that mobile applications do not attempt to track deletions automatically.

An app may provide a “synchronize” option that the user can invoke. In that case, the app should fetch using max-full="0", which returns the minimal amount of data.

If an application does want to detect deleted data, it should fetch the things using max-full="0", and then remove any items in the local cache that were not returned by the query.
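
Here is a minimal Python sketch of that comparison. It assumes the local cache is a dictionary keyed by thing id and that the response group has already been extracted; both function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def returned_ids(group_xml):
    """Collect every thing id in a response group, full or partial."""
    root = ET.fromstring(group_xml)
    ids = set()
    for tag in ("thing", "unprocessed-thing-key-info"):
        for node in root.iter(tag):
            thing_id = node.findtext("thing-id")
            if thing_id:
                ids.add(thing_id)
    return ids


def prune_cache(cache, group_xml):
    """Delete cached items the server no longer returns; return removed ids."""
    live = returned_ids(group_xml)
    removed = [thing_id for thing_id in cache if thing_id not in live]
    for thing_id in removed:
        del cache[thing_id]
    return removed
```

One caution with this approach: the query has to cover everything that is in the cache. If the cache was filled by a broader filter than the one used for the deletion check (for example, a different date range), items that still exist would be removed by mistake.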