Welcome to MSDN Blogs Sign in | Join | Help

Timestamps

In one of the main scenarios for data integration with Groove:  you have a Groove Forms tool with a set of records, and a SQLServer table or a SharePoint list with a set of records, and you either want to synchronize from Groove to the other system, or from the other system into Groove, or bidirectionally.

There are a couple ways to do this.  One is to listen for events from the source (or both systems, for bidirectional synch) and push the records which caused those events into the other system as soon as possible.  Another is to occasionally query each table, identify which records have changed, then push a batch of changes from one side to the other.  For a variety of reasons, this second strategy seems to be the approach most people take.

With this query-based approach, you need to be quite careful about timestamps.  So here's the gory detail.

Groove Forms System Fields

Every record in a Groove Forms tool has some "system fields" which are always present:

_Created

_CreatedBy

_CreatedByURL

_Modified

_ModifiedBy

_ModifiedByURL

_RecordID

_ParentID

_Editors

_Readers

_IgnoreUnread

Forms_Tool_grooveFormID

Forms_Tool_IPContents

For now, let's just look at _Created and _Modified.  (The others are all interesting, but that'll wait for another time).

The _Created field is a date/time value (internally, it's stored as milliseconds since Unix epoch, like the JavaScript datevalue) which is the time at which this record was created.  The _Modified field stores the time when the record was last modified.  These values are set by the creator, or editor, of the record.

Here's the problem: they're local time.  If another member of the workspace created or modified the record, their timestamps might be in the future, or in the past, compared to your local clock.

Timestamp Query

Let's assume you're writing an integration application to be deployed on a Groove Data Bridge server, which is a member of the workspace(s) which need to be synchronized with (say) SQLServer.  Your application has a timer, and every few minutes, asks Groove for a list of the records which changed since you last synchronized.

This query would seem straightforward enough.  Find the Forms tool, then call QueryRecords:

DateTime lastQueried; // from the last time we queried for data

GrooveForms2.RecordQuery recordQuery = new GrooveForms2.RecordQuery();

// set query properties, including this:

recordQuery.WhereClause = String.Format("_Modified > #{0:yyyy'-'MM'-'dd'T'HH':'mm':'ss'Z'}#", lastQueried.ToUniversalTime());

// Ask service to query records

lastQueried = DateTime.Now;

GrooveForms2.FormsRecordDataSet fds = formsService.QueryRecords(recordQuery);

Unfortunately this won't work.

First issue: "lastQueried" is local to your application, which might be running on a different machine from Groove (which is quite a common scenario with EDB server-based integration);  instead, you might want to scan your known records instead to find the high-water-mark of _Modified dates which you know about.

Second issue: the query says > where it should say >=.  The internal timestamp is milliseconds, but this query only asks for seconds-based matching, so it might miss values.  Or you might query against the milliseconds value directly.

Third issue, the kicker: that _Modified date corresponds to the client's clock where the modification originated.  Even if their machine clock was synchronized with yours, it's entirely possible that the user was offline when they made this change, then connected to the network a few days later, and you're only now seeing their changed record; and this simplistic query will never see it.

Forms_Tool_EDBSeqTime

What you need is a monotonic timestamp.  We implemented this on the Data Bridge server only -- not in the client software, because there's a small performance overhead -- using a special field called "Forms_Tool_EDBSeqTime".

On the Data Bridge, make sure you have installed the "Forms Tool Sequence Time Extension" (documentation in the data bridge administrator's guide).  This adds a local-only field to every Forms record, and updates its value every time an update is received.  This means that the Forms_Tool_EDBSeqTime value is a timestamp local to the server, not the originator of a change.

So, to implement our query properly:

DateTime lastHighWaterMark = (...last known value...); // we usually want to persist this somewhere...

GrooveForms2.RecordQuery recordQuery = new GrooveForms2.RecordQuery();

// set query properties, including this:

recordQuery.WhereClause = String.Format("Forms_Tool_EDBSeqTime >= #{0:yyyy'-'MM'-'dd'T'HH':'mm':'ss.ffff'Z'}#", lastHighWaterMark.ToUniversalTime());

// Ask service to query records

GrooveForms2.FormsRecordDataSet fds = formsService.QueryRecords(recordQuery);

lastHighWaterMark = /* highest Forms_Tool_EDBSeqTime from known data */;

Alternatives

Event-based, rather than query-based, synchronization doesn't requre the Forms_Tool_EDBSeqTime monotonic timestamp.  But implementation (as always) is another subject, for another day.

 

Published Tuesday, May 23, 2006 9:52 AM by hpyle
Filed under:

Comments

# hughpyle Timestamps | Patio Chairs

Tuesday, June 02, 2009 10:39 PM by hughpyle Timestamps | Patio Chairs
Anonymous comments are disabled
 
Page view tracker