SQL Server Data Quality Services (DQS) is a knowledge-driven data quality solution. You can define knowledge manually, acquire it from samples of your organization's data, or use knowledge provided by third-party services. We call the last option Reference Data Services (RDS). DQS makes it easy to cleanse and enrich your data using leading reference data service providers from Windows Azure Marketplace, such as Melissa Data, Digital Trowel, Loqate and CDYNE Corp. You can attach a DQS domain or DQS composite domain to a reference data service provider and use it to cleanse and enrich your data.
This post focuses on getting you up and running with the DQS RDS feature, and demonstrates how you can cleanse US address data using Melissa Data’s AddressCheck service and DQS.
After you install DQS, the first step toward using the RDS features is to subscribe to a reference data service provider. To subscribe, go to the DataMarket site and select the reference data provider you would like to subscribe to. For this example we will use Melissa Data’s AddressCheck service:
Once subscribed, go to your account details and copy your account key. You will need this key for the DQS configuration.
Now that you have an account key and are subscribed to a reference data provider, go to the DQS client and do the following:
Note: If your DQS server accesses the internet through a proxy server, you should also configure your proxy settings in the DQS general settings.
After you have configured the reference data services settings in DQS, you need to attach the reference data service to a specific domain in your knowledge base and map it. Let’s create a knowledge base and attach a composite domain to the Melissa Data AddressCheck service:
Now that we have attached our composite domain to a reference data service, we can use the knowledge base we created in a cleansing project to cleanse and enrich data.
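To make the attach-and-map step more concrete, here is a minimal conceptual sketch in Python. It is not DQS or Melissa Data code: the field names in `PROVIDER_FIELDS`, the domain names, and the `unmapped_fields` helper are all hypothetical, chosen only to illustrate the idea that each domain inside a composite domain is paired with one field of the provider's schema.

```python
# Hypothetical illustration only: none of these names are DQS or
# Melissa Data APIs. They model the idea of mapping each domain in a
# composite domain to one field of a reference data provider's schema.

# Assumed provider schema for a US address service (made up for this sketch).
PROVIDER_FIELDS = ["AddressLine", "City", "State", "Zip"]


def unmapped_fields(domain_to_field, required_fields):
    """Return the required provider fields that no domain is mapped to."""
    mapped = set(domain_to_field.values())
    return [field for field in required_fields if field not in mapped]


# A composite "US Address" domain: each child domain maps to one provider field.
mapping = {
    "Address Line": "AddressLine",
    "City": "City",
    "State": "State",
    "Zip": "Zip",
}

# An empty list means every provider field has a domain mapped to it.
missing = unmapped_fields(mapping, PROVIDER_FIELDS)
```

A check like this mirrors what you would verify by eye in the DQS client: every field the provider expects should have a domain mapped to it before the knowledge base is used in a cleansing project.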
Have fun with the Reference Data Services feature. We look forward to hearing your comments, suggestions, and bug reports.
The DQS Team