Comparison of the three techniques:

Even workload: invoicing 100,000 single-line sales orders. The bundle size used for this test is 1,000.

image

Uneven workload: invoicing 1,000 sales orders, where the number of lines per order varies between 1 and 500. The bundle size used is 100.

image

Very large number of work items: Because I wanted more than a million work items for this test, I did not use sales order invoicing. Instead, I used a workload that completes much faster (it checks the status of a few different things and updates my work item table accordingly). I used 2 million work items for this test. The metric below shows that when you create a very large number of work items, neither ‘top picking’ nor creating individual tasks will scale.

image

Recap:

Bundling:

PROS:

  • Works well for a simple, even workload.
  • No staging table is needed, so the application code has nothing extra to maintain.
  • Does not over-pollute the batch table.

CONS:

  • A fixed number of tasks is (usually) created when the job is scheduled. The batch framework is designed so that the number of batch threads can grow or shrink, either automatically through the batch server schedule or manually by the admin. Because the tasks are pre-created, the job does not scale with the batch schedule: it can neither use extra resources when they become available nor yield them when they are taken away.
  • For an uneven workload, you may need a complex algorithm to distribute the work equally.
  • In some applications, it may not be possible to distribute the workload evenly.
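
As a rough illustration of the balancing problem, the sketch below compares plain fixed-size bundles with a greedy "largest order first" heuristic. It is a minimal Python sketch, not batch framework code: the function names, the randomly generated order sizes, and the choice of 10 bundles are all assumptions made for illustration, and the greedy heuristic is just one common approach, not necessarily what the tests above used.

```python
import heapq
import random


def fixed_size_bundles(line_counts, bundle_size):
    """Split work items into consecutive bundles of bundle_size items each."""
    return [line_counts[i:i + bundle_size]
            for i in range(0, len(line_counts), bundle_size)]


def balanced_bundles(line_counts, bundle_count):
    """Greedy heuristic: hand the next-largest item to the lightest bundle."""
    heap = [(0, i) for i in range(bundle_count)]  # (total lines, bundle index)
    bundles = [[] for _ in range(bundle_count)]
    for lines in sorted(line_counts, reverse=True):
        total, idx = heapq.heappop(heap)          # lightest bundle so far
        bundles[idx].append(lines)
        heapq.heappush(heap, (total + lines, idx))
    return bundles


def spread(bundles):
    """Difference in total lines between the heaviest and lightest bundle."""
    totals = [sum(b) for b in bundles]
    return max(totals) - min(totals)


# Hypothetical uneven workload: 1,000 orders with 1 to 500 lines each.
random.seed(0)
orders = [random.randint(1, 500) for _ in range(1000)]

naive = fixed_size_bundles(orders, 100)   # 10 bundles of 100 orders
balanced = balanced_bundles(orders, 10)   # 10 bundles, balanced by line count

print("fixed-size spread:", spread(naive))
print("balanced spread:  ", spread(balanced))
```

With the greedy heuristic, the gap between the heaviest and the lightest bundle can never exceed the size of the largest single order. Fixed-size bundles have no such bound, which is why they can leave some threads idle while others are still working.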

Individual Task modeling:

PROS:

  • Works well with an uneven workload.
  • Simple to write.
  • Because the number of tasks is not fixed, the job scales up or down with the batch schedule, using extra resources when they are available and yielding them when they are not.
  • Best fit when you need dependencies among the work items.

CONS:

  • Relies fully on the batch framework.
  • When the number of tasks is very large, the overhead of the batch framework impedes performance quite severely.
  • It can negatively affect other batch jobs, because it puts pressure on the framework tables.
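
To make the pressure on the framework tables concrete, the number of batch tasks each technique creates can be estimated roughly as follows. This is a back-of-the-envelope Python sketch: the function name `task_count` and the worker count of 8 are hypothetical, and the real number of records the framework writes per task may differ.

```python
import math


def task_count(technique, work_items, bundle_size=None, workers=None):
    """Rough number of batch tasks created for a job (illustrative only)."""
    if technique == "bundling":
        # One task per bundle of work items.
        return math.ceil(work_items / bundle_size)
    if technique == "individual":
        # One task per work item.
        return work_items
    if technique == "top_picking":
        # A fixed set of worker tasks, independent of the workload size.
        return workers
    raise ValueError(f"unknown technique: {technique}")


# Workloads from the comparison above; 8 workers is an assumed value.
print(task_count("bundling", 100_000, bundle_size=1_000))
print(task_count("individual", 2_000_000))
print(task_count("top_picking", 2_000_000, workers=8))
```

Only individual task modeling grows the task count one-for-one with the number of work items, which fits the observation that it stops scaling at around 2 million items.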

Top Picking:

PROS:

  • Works well with an uneven workload.
  • Simple to write.
  • Does not over-pollute the batch table.

CONS:

  • Needs an extra staging table to track the progress and the workload.
  • A fixed number of tasks is (usually) created when the job is scheduled. Because the tasks are pre-created, the job does not scale with the batch schedule: it can neither use extra resources when they become available nor yield them when they are taken away.
  • When a very large number of short work items need to be processed, tracking the work items through the staging table hurts performance and throughput.
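
The top picking pattern can be sketched as a fixed set of workers that each repeatedly claim the next unprocessed item from a shared staging table. The Python sketch below is only an analogy under stated assumptions: an in-memory queue guarded by a lock stands in for the staging table and its row locking, threads stand in for batch tasks, and all names (`StagingTable`, `claim_next`, `run_top_picking`) are hypothetical.

```python
import threading
from collections import deque


class StagingTable:
    """In-memory stand-in for the staging table of work items."""

    def __init__(self, work_items):
        self._pending = deque(work_items)
        self._lock = threading.Lock()
        self.done = []  # (work item, worker id) pairs, in completion order

    def claim_next(self):
        """Atomically pick the top unclaimed item, or None if none is left."""
        with self._lock:
            return self._pending.popleft() if self._pending else None

    def mark_done(self, item, worker_id):
        with self._lock:
            self.done.append((item, worker_id))


def worker(table, worker_id, process):
    """One pre-created task: keep picking items until the table is empty."""
    while True:
        item = table.claim_next()
        if item is None:
            return
        process(item)
        table.mark_done(item, worker_id)


def run_top_picking(work_items, worker_count, process):
    table = StagingTable(work_items)
    threads = [threading.Thread(target=worker, args=(table, i, process))
               for i in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return table.done


done = run_top_picking(range(1000), worker_count=4, process=lambda item: None)
print(len(done), "work items processed")
```

Every claim and every completion goes through the shared table, so when each work item is very short, this bookkeeping can make up a large share of the total cost. That is the throughput problem described in the last bullet above.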

Depending on the nature of the workload and the amount of work that needs to be done on a regular basis, you can choose the technique that best suits your needs.