This blog provides information, news, tips, and announcements about the SQL Server Data Quality Services (DQS) feature introduced in SQL Server 2012.
Call to Action: When you’re done reading this article and checking out the resources it references, please take a moment to comment and let the DQS team know what additional resources you would like provided. If there’s a topic you would like covered in a blog post or webcast or demo, we want to know about it. We can’t promise we’ll be able to create everything that’s requested, but if you don’t ask, odds are you won’t receive.
This morning the Data Quality Services team hosted a session as part of the ongoing “Twelve Days of SQL Server 2012” webcast series. Senior Program Manager Matthew Roche ran the session, while team members Howie Kroehl, Gadi Peleg, Welly Lee, Cim Ryan and Matt Masson hung out in the Q&A area answering attendee questions. We ended up having just over 200 attendees for the session, and the feedback was overwhelmingly positive. If you’re interested in viewing the recorded session on-demand, please take a look at the Twelve Days of SQL Server 2012 event web site – the recording should be online within 24 to 48 hours.
While the recording includes the session content, it does not include the questions asked (and answers provided) during the live session. For your reference, here is the Q&A transcript, slightly formatted for ease of reading:
Question: Will Microsoft ever do at source profiling?
Answer: Can you clarify what you mean by "at source profiling"?
Question: At source profiling = going to the source DB2 system and looking at the data instead of moving the data to the SQL Server in chunks to profile it.
Answer: You're right that profiling the data in non-SQL Server systems isn't really feasible using DQS or SSIS directly in SQL Server 2012; the data does need to be staged in SQL Server first. As for the future, all we can say here is that Microsoft hasn't publicly announced capabilities along those lines.
Question: If I can get the right drivers to set up a linked server for the AS400 DB, I should be able to profile it, correct?
Answer: Correct.
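For readers new to the term, the kind of column statistics a profiling pass collects once data has been staged can be sketched in a few lines of Python. This is only an illustration of the idea and is not how DQS or SSIS implement profiling; the `profile_column` function and the sample rows are hypothetical.

```python
from collections import Counter


def profile_column(rows, column):
    """Return basic profile statistics for one column of already-staged rows.

    Hypothetical helper for illustration only; not a DQS or SSIS API.
    """
    values = [row.get(column) for row in rows]
    # Treat None and empty strings as missing values.
    non_null = [v for v in values if v not in (None, "")]
    counts = Counter(non_null)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0] if counts else None,
    }


# Made-up sample standing in for rows staged from a source system.
staged_rows = [
    {"city": "Seattle", "state": "WA"},
    {"city": "Seatle", "state": "WA"},
    {"city": "Seattle", "state": None},
    {"city": "", "state": "WA"},
]

city_profile = profile_column(staged_rows, "city")
state_profile = profile_column(staged_rows, "state")
```

A low distinct count next to a near-duplicate value such as "Seatle" is the sort of signal that suggests a column would benefit from cleansing.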
Question: What is the cleansing speed?
Answer: See http://blogs.msdn.com/b/dqs/archive/2012/04/17/significant-performance-enhancements-in-dqs-with-the-cumulative-update-1-cu1-release-for-sql-server-2012.aspx. The table in that blog post describes the performance of DQS in different activities, comparing the RTM version to the latest improvements in Cumulative Update 1 (CU1).
Question: Can this be integrated with SSIS so as to make it a part of my process of loading the data?
Answer: Yes, there is a DQS Cleansing Transform for use in the SSIS Data Flow.
Question: Do you have to have Windows Azure to use Data Quality Services?
Answer: No, DQS is not a cloud product in SQL Server 2012. The Reference Data Services (RDS) capabilities are an optional addition on top of the capabilities offered in the box.
Question: This is amazing!!! I love this demo.
Answer: Thanks!
Question: Can data in an XML source be used as source data for DQS?
Answer: Unfortunately, no. You can use Excel, CSV, or a SQL Server table or view as a source, or use a linked server to connect to any other source.
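Since CSV is one of the supported source formats, one possible workaround for XML data is to flatten it to CSV before handing it to DQS. The sketch below assumes a simple XML layout in which every record is an element whose children are the columns; the function name, the `customer` tag, and the sample data are hypothetical, and real XML with attributes or nesting would need more work.

```python
import csv
import io
import xml.etree.ElementTree as ET


def xml_records_to_csv(xml_text, record_tag, out_file):
    """Flatten simple XML records into CSV and return the number of rows written.

    Assumes each <record_tag> element holds one child element per column.
    """
    root = ET.fromstring(xml_text)
    records = [
        {child.tag: (child.text or "").strip() for child in record}
        for record in root.iter(record_tag)
    ]
    # Build the header from column names in first-seen order.
    fieldnames = []
    for record in records:
        for name in record:
            if name not in fieldnames:
                fieldnames.append(name)
    # Columns missing from a record are written as empty strings.
    writer = csv.DictWriter(out_file, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(records)
    return len(records)


# Made-up sample data for illustration.
sample_xml = """<customers>
  <customer><name>Contoso</name><city>Seattle</city></customer>
  <customer><name>Fabrikam</name><city>Redmond</city><country>US</country></customer>
</customers>"""

buffer = io.StringIO()
row_count = xml_records_to_csv(sample_xml, "customer", buffer)
csv_text = buffer.getvalue()
```

In practice you would pass a file opened with `newline=""` instead of the in-memory buffer, then point DQS at the resulting CSV file.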
Question: Do the Azure RDS have a fee?
Answer: Yes, you can subscribe to these services and pay according to the plan you select: <https://datamarket.azure.com/browse/Data>
Question: Are the RDS's used with no fee? Is it included in the DQS license?
Answer: Most RDS do have additional fees, although some provide cleansing free of charge. For more information, see https://datamarket.azure.com/
Question: Did I hear that this SSIS integration is not in 2012 now?
Answer: The DQS Cleansing component for SSIS is in SQL Server 2012.
Question: Can you post the link to the DQS online tutorial, please?
Answer: http://technet.microsoft.com/en-us/sqlserver/hh780961
Question: Can I use local SQL Server data as a reference data set (i.e. a lookup table as a reference data set)?
Answer: DQS does not support lookup matching. However, you can import or discover your data into the domains in your knowledge base and then use it to cleanse your data, either via SSIS or in a DQS project.
Question: Is MDS the tool to manage the data between the systems and DQS for cleansing and matching?
Answer: To "manage", yes. To transfer, use SSIS.
In the final slide of the session we included links to additional DQS resources online. One of them is the DQS blog (which you have already found), while the others were for the “DQS Movies” video series and the DQS forums. Here are the URLs for these resources and a few more:
Please let us know what you think (both about the session and about DQS in general) and what additional resources you’d like to see us create. We can’t wait to hear from you!
When opening a SSIS Data Cleansing project in DQS Client, all the attributes aren't available in DQS. It only shows Domain related columns. It will be good to show all columns for the entity so that on exporting, all these records can be properly referenced back to original records.