The Microsoft Dynamics CRM Blog
News and views from the Microsoft Dynamics CRM Team

Part III - Bulk Operations: SDK Support for Duplicate Detection in Titan

  • The Intro is here
  • Part I - Configuration is here
  • Part II - Run-time is here

So far I have discussed run-time duplicate detection. However, there is more to duplicate detection in Titan. What if you need to detect duplicates across a large set of records that already exist in the system? This part explains bulk duplicate detection.

1. Using bulk duplicate detection.

Another powerful duplicate detection feature in Titan is bulk duplicate detection. Let’s say that, in spite of the strong run-time duplicate detection support, duplicates still manage to enter Microsoft Dynamics CRM, perhaps through an upgraded deployment or a migration of a large number of records from another CRM system. In this case the user can run a bulk detection of duplicates in the system. Bulk duplicate detection is an asynchronous operation that runs in the background, and the duplicates it finds are logged in the DuplicateRecord entity.

Although the functionality I am going to discuss here is already available out of the box (OOB), for the sake of completeness let us see how it all works:

// Running a bulk duplicate detection job

QueryExpression query = new QueryExpression();

query.EntityName = EntityName.account.ToString();

query.Criteria.AddCondition("createdon", ConditionOperator.GreaterThan, new CrmDateTime(_lastSunday));

BulkDetectDuplicatesRequest request = new BulkDetectDuplicatesRequest();

request.Query = query;

request.JobName = "Duplicate accounts since last Sunday’s migration";

// Send email notification using bulk duplicate detection email template.

request.SendEmailNotification = true;

request.TemplateId = _templateId;

request.ToRecipients = new Guid[] { _currentUserId };

request.CCRecipients = new Guid[0];

request.RecurrencePattern = String.Empty;

request.RecurrenceStartTime = CrmDateTime.FromUser(CrmDateTime.MinValue);

BulkDetectDuplicatesResponse response = null;

try

{

response = (BulkDetectDuplicatesResponse)_crmService.Execute(request);

// Save the job id so that the asynchronous job can be polled for completion

_jobId = response.JobId;

}

catch (SoapException e)

{

}

As you can see from the properties of the request, it has a lot of functionality built into it: it not only detects duplicates but can also send an email notification to users when the job completes. Moreover, bulk duplicate detection can be run on a regular basis by specifying a recurrence pattern and a start time. The job then runs periodically according to the specified recurrence pattern to keep the system clean.

Refer to the SDK documentation for details about all the parameters and their usage. Refer to Appendix C for fetching the bulk duplicate detection email template.
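Because the job runs asynchronously, the caller has to wait for it to finish before querying the DuplicateRecord entity. Here is a minimal polling sketch. The class JobPoller, the method WaitUntil and the isCompleted delegate are hypothetical names and are not part of the SDK; in practice the delegate would look up the asynchronous operation identified by _jobId and report whether it has completed.

```csharp
using System;
using System.Threading;

public static class JobPoller
{
    // Calls isCompleted every 'interval' until it returns true or 'timeout' elapses.
    // Returns true if the job completed in time, false if the wait timed out.
    // Hypothetical helper: the delegate hides how job status is actually read.
    public static bool WaitUntil(Func<bool> isCompleted, TimeSpan interval, TimeSpan timeout)
    {
        if (isCompleted == null)
        {
            throw new ArgumentNullException("isCompleted");
        }

        DateTime deadline = DateTime.UtcNow + timeout;

        while (true)
        {
            if (isCompleted())
            {
                return true;
            }

            if (DateTime.UtcNow >= deadline)
            {
                return false;
            }

            Thread.Sleep(interval);
        }
    }
}
```

A caller could poll every few seconds with an overall timeout of a few minutes, and only run the reporting queries once WaitUntil returns true.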

2. Duplicate reporting for bulk duplicate detection.

After the bulk detection of duplicates is complete, the next question is: how do I retrieve the duplicates that were found, and how do I report them?

The duplicate record entity has two fields: the base record id (baserecordid) and the duplicate record id (duplicaterecordid). The base record id is the id of the record for which duplicates were found, and the duplicate record id is the id of the duplicate record. Once the job has completed, you can fetch the duplicate records of an entity type by querying that entity with a join on the duplicate record entity. Below is an illustration showing how you can achieve this:

First, fetch all the base records present in the duplicate record entity. Then, for each base record, fetch the duplicate records logged by the bulk duplicate detection job.

// Fetch base records

QueryExpression query = new QueryExpression(EntityName.account.ToString());

query.ColumnSet = new ColumnSet(new String[] { "accountid", "name" });

LinkEntity linkEntity = new LinkEntity(query.EntityName, EntityName.duplicaterecord.ToString(),"accountid", "baserecordid", JoinOperator.Inner);

linkEntity.LinkCriteria.Conditions.Add(new ConditionExpression("asyncoperationid", ConditionOperator.Equal, _jobId));

query.LinkEntities.Add(linkEntity);

query.Distinct = true;

query.PageInfo.Count = 10;

query.PageInfo.PageNumber = 1;

RetrieveMultipleRequest request = new RetrieveMultipleRequest();

request.Query = query;

RetrieveMultipleResponse response = null;

try

{

response = (RetrieveMultipleResponse)_crmService.Execute(request);

}

catch (SoapException e)

{

}

_baseRecordId = ((account)response.BusinessEntities[0]).accountid.Value;

// For each of the base records retrieved above, fetch duplicate records

QueryExpression query = new QueryExpression(EntityName.account.ToString());

query.ColumnSet = new ColumnSet(new String[] { "accountid", "name" });

LinkEntity linkEntity = new LinkEntity(query.EntityName, EntityName.duplicaterecord.ToString(), "accountid", "duplicaterecordid", JoinOperator.Inner);

linkEntity.LinkCriteria.Conditions.Add(new ConditionExpression("asyncoperationid", ConditionOperator.Equal, _jobId));

linkEntity.LinkCriteria.Conditions.Add(new ConditionExpression("baserecordid", ConditionOperator.Equal, _baseRecordId));

linkEntity.LinkCriteria.FilterOperator = LogicalOperator.And;

query.LinkEntities.Add(linkEntity);

query.Distinct = true;

query.PageInfo.Count = 10;

query.PageInfo.PageNumber = 1;

RetrieveMultipleRequest request = new RetrieveMultipleRequest();

request.Query = query;

RetrieveMultipleResponse response = null;

try

{

response = (RetrieveMultipleResponse)_crmService.Execute(request);

}

catch (SoapException e)

{

}
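For reporting, the results of these queries can be collected into one structure keyed by base record. The sketch below assumes the ids have already been read from the query results into (base record id, duplicate record id) pairs. The class DuplicateReport and the method GroupByBase are hypothetical names, not part of the SDK.

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateReport
{
    // Groups (baseRecordId, duplicateRecordId) pairs by base record.
    // Repeated pairs are collapsed so each duplicate is listed once per base record.
    public static Dictionary<Guid, List<Guid>> GroupByBase(
        IEnumerable<KeyValuePair<Guid, Guid>> pairs)
    {
        Dictionary<Guid, List<Guid>> report = new Dictionary<Guid, List<Guid>>();

        foreach (KeyValuePair<Guid, Guid> pair in pairs)
        {
            List<Guid> duplicates;
            if (!report.TryGetValue(pair.Key, out duplicates))
            {
                duplicates = new List<Guid>();
                report.Add(pair.Key, duplicates);
            }

            if (!duplicates.Contains(pair.Value))
            {
                duplicates.Add(pair.Value);
            }
        }

        return report;
    }
}
```

Each entry of the resulting dictionary can then be shown as one base account followed by the list of its duplicates.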

This is how you can use bulk duplicate detection to build powerful duplicate detection, duplicate reporting and duplicate handling applications.

I am sure that every one of you will love this exciting new duplicate detection feature in Titan, and that it will help you provide a richer user experience while maintaining high data quality! This article is just an aid to opening up an all-new, duplicate-free world.

Wait and see as there’s a lot more to come on this awesome feature!

APPENDIX A: Using Cross Entity Duplicate Detection Rules

Cross entity duplicate detection rules are very helpful when we want to detect duplicates across entities. For example, it is worthwhile to detect an incoming lead that duplicates an existing contact. Wouldn’t it be good if the system could tell you that the lead you are trying to create is already present in the system as a contact? Why create a new lead when it is already a contact? Such scenarios are not uncommon in business today. That’s exactly why we have cross entity duplicate detection rules.

You can set up cross entity duplicate detection rules just like any single entity rule. Let’s say I want to create a rule which says a Lead is a duplicate of a Contact if they have the same first name and the same last name. Here’s how I can do it:

// Create duplicate rule object

duplicaterule rule = new duplicaterule();

rule.name = "Leads with same first name and last name as that of Contact";

rule.baseentityname = EntityName.lead.ToString();

rule.matchingentityname = EntityName.contact.ToString();

rule.iscasesensitive = new CrmBoolean(false);

// Create duplicate rule condition objects

duplicaterulecondition ruleCondition_1 = new duplicaterulecondition();

ruleCondition_1.operatorcode = new Picklist((int)DuplicateRuleOperator.Equals);

ruleCondition_1.operatorparam = null;

ruleCondition_1.baseattributename = "firstname";

ruleCondition_1.matchingattributename = "firstname";

duplicaterulecondition ruleCondition_2 = new duplicaterulecondition();

ruleCondition_2.operatorcode = new Picklist((int)DuplicateRuleOperator.Equals);

ruleCondition_2.operatorparam = null;

ruleCondition_2.baseattributename = "lastname";

ruleCondition_2.matchingattributename = "lastname";

duplicaterulecondition[] ruleConditions = new duplicaterulecondition[] { ruleCondition_1, ruleCondition_2 };

// Create duplicate rule with duplicate rule conditions in one go

TargetCompoundDuplicateRule target = new TargetCompoundDuplicateRule ();

target.DuplicateRule = rule;

target.DuplicateRuleConditions = ruleConditions;

CompoundCreateRequest request = new CompoundCreateRequest();

request.Target = target;

try

{

CompoundCreateResponse response = (CompoundCreateResponse) _crmService.Execute(request);

_ruleId = response.Id;

}

catch (SoapException e)

{

}

APPENDIX B: Retrieving Duplicates for Multiple Rules

Titan’s duplicate detection takes into account all published rules for an entity when detecting duplicates for it. So if you have two published rules for an entity, a duplicate may be reported because of either of these rules. Let us take an example to understand this better.

An organization has set up two duplicate detection rules on the lead entity:

1. A Lead is a duplicate of another Lead if they have the same first name and the same last name.

2. A Lead is a duplicate of Contact if they have the same first name and the same last name.

Now let’s say I am creating a Lead and I get a duplicate record found exception. How do I know what duplicates are there in the system? How do I report them in an organized manner? Since duplicates can be of different entity types, we have designed the RetrieveDuplicates SDK message to accept a matching entity name, which makes fetching duplicates more organized and meaningful. This way the user can specify what kind of duplicates she is interested in.

Great! I can have the duplicates compartmentalized by entity type. But that leaves one concern: how do I know which matching entity types to look for? There’s a very easy way to do it. To find the possible duplicate types for an entity, we can query the duplicate rule entity for all the published rules with the given base entity.

Continuing the example, after I get a duplicate record found exception I can find out which matching entities to query for as follows:

// Fetch matching entity names for a given base entity name

QueryExpression query = new QueryExpression(EntityName.duplicaterule.ToString());

query.ColumnSet = new ColumnSet(new String[] { "matchingentityname" });

query.Criteria.Conditions.Add(new ConditionExpression("baseentityname", ConditionOperator.Equal, EntityName.lead.ToString()));

query.Criteria.Conditions.Add(new ConditionExpression("statecode", ConditionOperator.Equal, _activeState));

query.Criteria.Conditions.Add(new ConditionExpression("statuscode", ConditionOperator.Equal, _publishedStatus));

query.Criteria.FilterOperator = LogicalOperator.And;

query.Distinct = true;

RetrieveMultipleRequest request = new RetrieveMultipleRequest();

request.Query = query;

RetrieveMultipleResponse response = null;

try

{

response = (RetrieveMultipleResponse)_crmService.Execute(request);

}

catch (SoapException e)

{

}

Once you get the list of matching entities for the given base entity, you can call RetrieveDuplicates once for each matching entity to find the duplicates of that entity type.

In our case, the query should return two matching entities: Lead and Contact. We can then call RetrieveDuplicates twice, once with matching entity name lead and once with matching entity name contact. You can see how easy and efficient it is to work with multiple duplicate detection rules for duplicate detection and reporting.

Similarly, for bulk duplicate detection, you can query for duplicates logged by one particular rule by using the duplicateruleid field of the duplicate record entity.

APPENDIX C: Retrieving Email Template for Bulk Duplicate Detection

Bulk duplicate detection has an extremely useful feature: it can notify users by email when the asynchronous job completes. Although you can use any email template of type asyncoperation, there is an OOB template available for bulk duplicate detection. You can fetch the bulk duplicate detection email template using the following piece of code:

// Fetch email template for bulk duplicate detection

QueryExpression query = new QueryExpression(EntityName.template.ToString());

query.ColumnSet = new ColumnSet(new String[] { "templateid" });

query.Criteria.Conditions.Add(new ConditionExpression("templatetypecode", ConditionOperator.Equal, _asyncTypeCode));

query.Criteria.Conditions.Add(new ConditionExpression("generationtypecode", ConditionOperator.Equal, GenerationTypeCode.BulkDupDetectCompleted));

query.Criteria.Conditions.Add(new ConditionExpression("languagecode", ConditionOperator.Equal, _orgLanguage));

query.Criteria.FilterOperator = LogicalOperator.And;

RetrieveMultipleRequest request = new RetrieveMultipleRequest();

request.Query = query;

RetrieveMultipleResponse response = null;

try

{

response = (RetrieveMultipleResponse)_crmService.Execute(request);

}

catch (SoapException e)

{

}

APPENDIX D: Unpublish Duplicate Detection Rule

Unpublishing a duplicate detection rule is the exact opposite of publishing it. An unpublished rule no longer participates in duplicate detection. This is useful when a rule is needed only for a specific period or on a specific occasion. For example, say I do a periodic data import of leads and I have a duplicate detection rule that I use only during this import operation. I can have the rule published only while I am doing the import and keep it unpublished the rest of the time. You can use the unpublish SDK message to unpublish a rule:

// Unpublish duplicate detection rule

UnpublishDuplicateRuleRequest request = new UnpublishDuplicateRuleRequest();

request.DuplicateRuleId = _ruleId;

try

{

UnpublishDuplicateRuleResponse response = (UnpublishDuplicateRuleResponse)_crmService.Execute(request);

}

catch (SoapException e)

{

}

* Note: Before unpublishing a rule, keep in mind that publishing it again is a time-consuming operation.
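The import scenario above, where a rule is published only for the duration of an import, can be sketched as a small wrapper that always unpublishes the rule afterwards, even if the import fails. The class ScopedRule and the three delegates are hypothetical and not part of the SDK. The publish delegate is assumed to return only after publishing has finished, and the unpublish delegate would wrap the UnpublishDuplicateRuleRequest call shown above.

```csharp
using System;

public static class ScopedRule
{
    // Publishes a rule, runs the import, and always unpublishes the rule afterwards.
    // Hypothetical helper: publish, import and unpublish are supplied by the caller.
    public static void RunWithRulePublished(Action publish, Action import, Action unpublish)
    {
        if (publish == null) throw new ArgumentNullException("publish");
        if (import == null) throw new ArgumentNullException("import");
        if (unpublish == null) throw new ArgumentNullException("unpublish");

        publish();
        try
        {
            import();
        }
        finally
        {
            // Runs whether the import succeeded or threw an exception.
            unpublish();
        }
    }
}
```

If publishing itself fails, the rule was never published, so the wrapper does not attempt to unpublish it.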

Abhishek Agarwal
