In a recent discussion with Allan Mitchell, we talked about the event shape concepts in StreamInsight and their relationship to timestamps. I thought the information would be useful to a wider audience, so I am posting it here as well.

The different event shapes - point, interval, edge - are concepts that only exist in adapters. In the engine, every event has a start and an end time. On the input side:

  • With an interval input adapter, you specify both start and end timestamps.
  • With a point adapter, you just specify a start time, and the adapter framework will set the end time to one tick later. The engine sees an interval of one tick.
  • With an edge adapter, you can create (i) a start edge with a single start timestamp, which the API will again translate into an interval event with end time = infinity, and (ii) an end edge, which will be translated into an event that “corrects” the previous one to a finite end time, assigned by you in the adapter.
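
To make the three translations concrete, here is a minimal Python sketch of the idea. It is a conceptual model only, not the StreamInsight API: the names IntervalEvent, from_point, from_start_edge and apply_end_edge are hypothetical, datetime.max stands in for "infinity", and the tick is approximated by Python's one-microsecond resolution.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumption: Python's smallest time step (1 microsecond) stands in for a tick.
# A real .NET tick is 100 nanoseconds.
TICK = timedelta(microseconds=1)
# Assumption: datetime.max stands in for an open-ended ("infinite") end time.
INFINITY = datetime.max


@dataclass(frozen=True)
class IntervalEvent:
    """What the engine sees: every event has a start and an end time."""
    start: datetime
    end: datetime
    payload: object


def from_interval(start, end, payload):
    # Interval shape: both timestamps are supplied by the adapter.
    return IntervalEvent(start, end, payload)


def from_point(start, payload):
    # Point shape: only a start time is given; the end is one tick later.
    return IntervalEvent(start, start + TICK, payload)


def from_start_edge(start, payload):
    # Start edge: the event stays open until an end edge arrives.
    return IntervalEvent(start, INFINITY, payload)


def apply_end_edge(open_event, end):
    # End edge: corrects the open event to a finite end time.
    return IntervalEvent(open_event.start, end, open_event.payload)
```

Whatever shape the adapter uses, the result in this model is always an IntervalEvent, which is the point of the list above.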

All of these events, as soon as they reach the engine (and hence a query), will have both start and end timestamps. And they will be treated the same by all operators in that respect. There is one operator called ToPointEventStream(), which sets the end time to start time + 1 tick, but these are still intervals. It’s just a convenience syntax.

On the output side, the result interval events will surface to the outside world according to the output event shape specified in the ToQuery() call (provided that the output adapter factory can instantiate the corresponding adapter):

  • Using interval, you see both start and end timestamps for each event. The events will be ordered by start time if you use StreamEventOrder.FullyOrdered in the ToQuery() call, or by their end times otherwise.
  • Using point, the end times will be discarded and you just see the start timestamps.
  • Using edge, you will get start edges for interval starts and end edges for interval ends. This lets output appear earlier than with an interval adapter, since the engine can produce a start edge before a CTI has committed the end of the interval.
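
The three output views can be sketched the same way. Again this is a simplified Python model with hypothetical names (as_interval, as_point, as_edges), using plain integers as timestamps. It ignores CTIs and output ordering options, and it tags each edge only with the time at which it occurs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntervalEvent:
    start: int   # integer timestamps keep the example short
    end: int
    payload: str


def as_interval(events):
    # Interval output: both timestamps are visible.
    return [(e.start, e.end, e.payload) for e in events]


def as_point(events):
    # Point output: the end time is discarded.
    return [(e.start, e.payload) for e in events]


def as_edges(events):
    # Edge output: one start edge and one end edge per interval,
    # listed in the order of the timestamps at which they occur.
    edges = []
    for e in events:
        edges.append((e.start, "start", e.payload))
        edges.append((e.end, "end", e.payload))
    return sorted(edges, key=lambda edge: edge[0])


results = [IntervalEvent(1, 5, "a"), IntervalEvent(2, 3, "b")]
print(as_interval(results))
print(as_point(results))
print(as_edges(results))
```

In the edge view, the start edge for "a" sits at time 1, well before its end at time 5 is known, which is why edge output can appear earlier than interval output.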

As you can see, the query designer can mix and match adapters as needed. The specified event shape on the input influences how the incoming (interval) events are created from the actual event source data, while on the output it dictates how the result (interval) events are interpreted by the output adapter.

When using the event tracing functionality in the debugger tool, you might have noticed that there are other kinds of events besides insert events, namely retraction events. A retraction always has to match an earlier insert, and it moves that event’s end timestamp to an earlier time. Apart from showing up in the debugger, retraction events do not surface to the user; they are mostly generated inside the engine in the course of query execution. However, as you might have guessed, the “end edge” event will in fact be translated into a retraction when enqueued by an edge input adapter. This is why an end edge needs to match the previous “start edge” (the insert event) by payload and start time.
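
The matching rule can be illustrated with a small Python model. This is only a sketch of the semantics described above, not how the engine is implemented: the store dictionary and the insert and retract functions are hypothetical, and timestamps are plain numbers.

```python
INFINITY = float("inf")  # stand-in for an open-ended end time


def insert(store, start, payload):
    # A start edge becomes an insert with an infinite end time.
    store[(start, payload)] = INFINITY


def retract(store, start, payload, new_end):
    # An end edge becomes a retraction. It must match an earlier insert
    # by start time and payload, and it can only shorten the event.
    key = (start, payload)
    if key not in store:
        raise KeyError("retraction does not match a previous insert")
    if new_end >= store[key]:
        raise ValueError("a retraction must move the end time earlier")
    store[key] = new_end


events = {}
insert(events, 10, "sensor-1")       # start edge at time 10
retract(events, 10, "sensor-1", 25)  # end edge: the event now ends at 25
print(events)
```

An end edge with a different start time or payload finds nothing to correct, which in this model surfaces as a KeyError.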

Regarding calculations over the event timestamps: unless the timestamps are also in the payloads, you can’t use them in projections. If your timestamps are in fact data that you need to work with, then they belong in the payload as well. We think that the separation of data (payload) and system fields (timestamps) is a useful one and creates clean query semantics.

[StreamInsight Concepts on MSDN]