This blog provides information, news, tips, and announcements about the SQL Server Data Quality Services (DQS) feature introduced in SQL Server 2012.
SQL Server 2012 Data Quality Services (DQS) enables you to cleanse data using your own Knowledge Base. In this blog article, I am going to show how to create a simple Knowledge Base which you can use to cleanse your customer data.
The steps for using DQS to cleanse data is as follows:
A. Create DQS Knowledge BaseB. Create a DQS project to cleanse your data using the Knowledge Base
For this example, I am using the following sample data - you can run the following script on your SQL Server database :
CREATE TABLE MyCustomers ( CustomerID INT, CustomerName NVARCHAR(255), City NVARCHAR(32), Province NVARCHAR(32), LastUpdate DATETIME ) INSERT INTO MyCustomers VALUES (1, 'Consolidate Co Ltd', 'Miami', 'FL','2013-01-01'), (2, 'Consolidation Company Ltd', 'New York', 'NY','2013-01-01'), (3, N'什锦的件', 'LA', 'CA','2013-01-01'), (4, 'Chop-suey Chinese', 'Los Angeles', 'CA', '2013-03-03'), (5, 'Big Cheese, The', 'Redmond', 'WA', '2013-02-02'), (6, 'THE BIG CHEESE', 'Chicago', 'Il','2013-02-02'), (7, 'To Be Filled Later', 'Redmond', 'Wash.', '2013-01-01')
Note: by default, DQS include DQS_NULL as valid value for the domain, you can change the type to invalid if you would like the record with missing value to be flagged
Creating your own DQS Knowledge Base sometimes requires a lot of effort. For things such as address cleansing, phone number cleansing, creating your own complete list of all valid and invalid values can be a huge effort. DQS supports integration with third party service provider to cleanse your data through DQS. Refer to my blog article on how to cleanse Customer Data using Dun & Bradstreet for more information.
In this blog article, I discussed how to create a Knowledge Base to cleanse customer data, including use of domain values and term based relations. You can then refer to the Knowledge Base to create Data Quality Projects to cleanse your data. You can use the same knowledge base to perform cleansing on many Data Quality Projects. You can also automate the cleansing using SQL Server 2012 Integration Services. Matt Mason wrote a nice article : Overview of DQS Transform that describes the SSIS DQS Cleansing transform.
You may also notice that there appears to be duplicate records in the sample data I used. In the next article, I will describe on how to enhance your knowledge base by adding matching policy and identify duplicate and related records in your dataset.
Indeed a very nice post. I am also associated with imaginationworksla, Foam board printing los angeles, vehicle wraps los angeles, car wraps los angeles, banner printing los Angeles, banner printing san fernando valley, coroplast printing los Angeles Thanks for writing such good posts and as I have subscribed to your blog.
Thanks, this is really useful. I'm working with the DQS Client at the moment. If I open an existing (completed) project it will open at stage 4 - Export. I can use the BACK button to get to Stage 3, but it then becomes greyed out.
How can I get back to Stages 1 and 2 to refiew the Mapping/Cleansing stages?
Richard. You can close the project without clicking finish to allow you to change/restart your project. However, after you click finish, then you can't go back beyond reviewing the results.
great blog,excelent way to cleanse our data and to replace the old unreliable data with the up-to-date data. For more details visit us at <a href="www.optinlistsgroup.com/.../data_cleansing.php">click here.</a>
Informative. Data cleansing plays a vital role.. for clean Data cleansing service contact us at <a href="b2bemaillistz.com/data-matching.php">Data Cleansing</a>
The DQS toolset is at the moment very inmature and unusable in large enterprise environments. I've been working on it for the past 5 months and I finally had to remove it completely and bastardize MDS & SQL to perform this function. The main obstacles I've found are:
- The SSIS task is a joke. It cannot be migrated between environments as it cannot be parameterized. It has and awfully slow performance, even with the suggestions to split the data-flow and work with multiple items at the same time it would take long minutes to map anything beyond trivial amounts of data (less than 1000 rows)
- The project files that the SSIS task generates are also useless as they omit all the contextual information about the rows and concentrate just on the values to clean. Any data steward would have a hard time figuring out the source of any individual row that has to be cleaned.
- Domain editing features are too cumbersome for large domains.
- Importing and exporting values from the client does not support standard formats like csv or excel, just the propietary xml format of the tool.
- Integration with MDS is in its infancy as you cannot use MDS as a reference service for DQS
- The APIs are closed so the product is not integrable with other toolsets.
Hopefully some resources have been allocated towards improving this in the upcoming SQL 2014.
Great Blog,,,, Agree with you, data cleansing has played a key role to maintain updated data. for further details on data cleansing up to date information ,please visit:<a href="http://www.b2bemaillistz.com">click here</a>
Excellent Article, Cleansing Customer Data Using Data Quality Services is really very informative to get Great deals on Data Cleansing services visit<a href="www.pioneerlists.com/.../index"> Pioneerlists.com</a>
Helpful Post on data cleansing .Data cleansing play very important role in email marketing .To Know more about data cleansing visit <a href="www.b2bmarketingpartners.com/.../Data-Cleansing">Data Cleansing</a>
Impressive blog, data cleansing service helps to maintain updated information which really helps email marketing. For more details on data cleansing visit <a href=” www.b2bmarketingarchives.com/.../Data-Cleansing “> b2bmarketing archives </a>
Creative blog, the article is well described to make effective use of quality data service, for more information on Data cleansing visit <a href="http://www.b2boptinlists.com">B2Boptinlists.com</a>
At Create DQS Knowledge Base, step 8, my 'Import values' option is disabled. How do I turn it on? Thx.