SQL Server 2012 CDC for Oracle – a Review of One Implementation

SQL Server 2012 CDC for Oracle – a Review of One Implementation

  • Comments 7

SQL Server 2012 shipped with a new feature named SQL Server 2012 Change Data Capture for Oracle By Attunity (shortened to SQL 2012 Oracle CDC in this blog). This feature allows using Change Data Capture in a SQL Server database, with an Oracle database as the source of data. One notable advantage of this feature is that very low data latency can be achieved; another is that knowledge of Oracle is not required for ETL development. Overall, this feature can significantly improve the quality of SQL Server data warehousing projects that source some or all of their data from Oracle. The feature was developed by Attunity and is shipping as a part of SQL Server 2012.

This blog is a review of SQL 2012 Oracle CDC implementation on a data warehousing project at one of Microsoft's customers. I will concentrate here on implementation details that are not very obvious, are not described in existing documentation, and may take some time to figure out on your own. I will also mention a few design patterns we used that may be helpful to someone implementing this feature. This blog assumes that the reader is familiar with the basics of the "native" SQL Server CDC, as well as with the basics of SQL 2012 Oracle CDC. If you are new to this topic, then it is highly recommended to first review the MSDN SQL Server CDC documentation, and then a number of other sources of information on SQL 2012 Oracle CDC that are currently available:

  • SQL 2012 CDC documentation
  • SQL 2012 Oracle CDC documentation, available on the machine where the feature is installed. Look for two help files, named Attunity.SqlServer.XdbCdcSvcConfig.chm and Attunity.SqlServer.XdbCdcDesigner.chm, located by default in C:\Program Files\Change Data Capture for Oracle by Attunity
  • SQL 2012 Oracle CDC TechNet Wiki
  • SSIS team blog on SQL 2012 Oracle CDC
  • Attunity support forum

The implementation described in this blog is based on the requirements of a specific data warehousing project. Not everything mentioned here will necessarily be applicable or even optimal on other such projects. This blog is not a comprehensive step-by-step guide, and what I present here is not intended as the "official" implementation recommendations for SQL 2012 Oracle CDC; however, my hope is that most of this will be useful to others implementing this new feature of SQL Server 2012, particularly when dealing with advanced design and troubleshooting scenarios.

As of this writing, the most current version of SQL 2012 Oracle CDC is available as a part of SQL 2012 SP1 CU7. These bits contain several important fixes. Note that unlike other SQL Server components, SQL 2012 Oracle CDC ships as two standalone MSI files; therefore you only need these MSIs from a SQL Server SP or CU to fully install the product. For example, for this update, this means downloading the files named 2012_SP1_AOrcleSer_CU7_2894115_11_0_3393_x64 and 2012_SP1_AOrcleDes_CU7_2894115_11_0_3393_x64. Over time, updated versions of SQL 2012 Oracle CDC may be released in a similar fashion as part of regular SQL Server 2012 updates.

SQL 2012 Oracle CDC can be installed on a Windows server running on commodity hardware. This can be, but does not have to be, the ETL/SSIS server, or the data warehouse server. The server where SQL 2012 Oracle CDC is installed will run one or more instances of the CDC service (xdbcdcsvc.exe). The CDC service connects to an Oracle database, captures change data, and saves that data in a SQL Server CDC enabled database. That database, called the change data database in this blog, can be placed on any SQL Server 2012 instance, not necessarily on the same server where SQL 2012 Oracle CDC is installed. In our project, both the CDC services and the SQL Server instance hosting change data databases were installed on the same midrange server with 8 cores and 192 GB of RAM. I have not found the CDC service to be a big consumer of either CPU or RAM, so in all likelihood, a less powerful server would suffice as well.

Once the current product version is installed, two MMC consoles can be found under All Programs, Change Data Capture for Oracle by Attunity. One is the CDC Service console, typically used once at setup time to configure the CDC service(s). The other is the CDC Designer console, used to create and monitor CDC instances.

CDC Service Configuration

Setting up the CDC service is the first step to be done after installing the product. Documentation describes that process step by step, so here I will only point out a few additional details.

  • A current version of the Oracle client must be installed on the server where the CDC service will be running. The bitness of the Oracle client must match the bitness of the CDC service (although it's ok to also install another Oracle client of different bitness if needed for other purposes). If installing on a cluster, Oracle client must be installed identically on all nodes.
  • Most CDC service configuration and status info is stored in a small database named MSXDBCDC, hosted on the same SQL Server instance where the change data database(s) will be hosted. This database is created when you use the Prepare SQL Server action from the CDC Service console. It is created with default settings of the model database – consider adjusting some of those (recovery model, owner, etc.) as required in your environment.
  • If necessary, multiple instances of the CDC service can be set up on the same server. In our project, we created two service instances, because we were sourcing data from two independent Oracle databases. This both improved scalability and avoided a single point of failure.
  • When installing under Windows Server 2008 R2 or newer, the CDC service by default does not use a service account. Instead, permissions are granted to the service identity using per-service SID. This is a good thing, as there is one less account to manage. Because of that, the default user name shown in the New Oracle CDC service dialog is NT Service\OracleCDCService1. Note that the Password box in this case is still enabled, which is a little misleading, since a password is not applicable when per-service SID is used. In this scenario, you may also run into a kind of a chicken and egg problem. The service identity (NT Service\OracleCDCService1) requires a login on the SQL Server instance where the change data database(s) will be hosted. However, that login cannot be created until the Windows service itself is created and the SID associated with NT Service\OracleCDCService1 becomes known to Windows. But, when the CDC Service console creates the service, it also tries to connect to the SQL Server instance, which fails because the login does not exist yet. The workaround is simple – after getting the Login failed error, manually create a login for NT Service\OracleCDCService1 (there is no need to grant it any additional privileges). At that time, the service has already been created in Windows, so its SID is known. Then, hit OK one more time in the New Oracle CDC service dialog. This will create the service successfully.
  • CDC services can be clustered as a Generic Service resource. In our project, we installed the CDC services on the same cluster where the SQL Server instance hosting change data databases was installed, and added the CDC services to the SQL cluster group for that instance, making them dependent on the SQL resource. This provided the same level of high availability for the CDC services as was there for SQL Server. When clustering the CDC service, you need to use a domain account to run the service (just like you need to use a domain account to run SQL Server on a cluster). As a side benefit, this avoids the above problem with service creation, as long as you create a login for the CDC service domain account before creating the service in the New Oracle CDC service dialog. When clustering the CDC service, it must be first installed in an identical fashion on all nodes of the cluster. After that, open the Failover Cluster Management console on the cluster node owning the SQL Server group, add the Generic Service resource(s) for the CDC service(s) to that group, and make them dependent on the SQL Server resource. This only needs to be done on one node of the cluster.
  • In the same New Oracle CDC service dialog, you have to enter the CDC service master password. This is the password that protects an asymmetric encryption key, created in each change data database. That key is used to encrypt the password of the Oracle log mining account. One caveat here is that the asymmetric key password is subject to password complexity rules in effect on the Windows server. However, the password is not validated against those rules by the CDC Service console when the CDC service is created. If the master password is not complex enough, the CDC service will be created successfully, but will later fail to create the asymmetric key in the change data database.

Preparing Oracle Database for Data Capture

Before a CDC instance can be configured, the source Oracle database must be prepared for data capture. As described in greater detail in product documentation, you have to enable and configure Oracle Log Miner, create the log mining account that will be used by the CDC service to connect and capture changes from Oracle redo logs, and grant a number of privileges to that account. These steps require holding sufficient privileges in the Oracle database; therefore, you may need to work with an Oracle DBA who can run corresponding scripts.

To enable Log Miner, the source Oracle database must be in ARCHIVELOG mode, with supplemental logging enabled at the database level. This is done by executing the following:

ALTER DATABASE ARCHIVELOG;
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;

We used the following script to create the Oracle log mining account (named DW_ETL here) and grant it required privileges:

CREATE USER DW_ETL IDENTIFIED BY &Password DEFAULT TABLESPACE USERS TEMPORARY TABLESPACE TEMP PROFILE DEFAULT ACCOUNT UNLOCK;
GRANT CREATE SESSION TO DW_ETL;
GRANT SELECT ANY TRANSACTION TO DW_ETL;
GRANT EXECUTE ON DBMS_LOGMNR TO DW_ETL;
GRANT SELECT ON V_$LOGMNR_CONTENTS TO DW_ETL;
GRANT SELECT ON V_$ARCHIVED_LOG TO DW_ETL;
GRANT SELECT ON V_$LOGMNR_LOGS TO DW_ETL;
GRANT SELECT ON V_$LOG TO DW_ETL;
GRANT SELECT ON V_$LOGFILE TO DW_ETL;
GRANT SELECT ON V_$DATABASE TO DW_ETL;
GRANT SELECT ON V_$INSTANCE TO DW_ETL;
GRANT SELECT ON V_$THREAD TO DW_ETL;
GRANT SELECT ON V_$PARAMETER TO DW_ETL;
GRANT SELECT ON DBA_REGISTRY TO DW_ETL;
GRANT SELECT ON ALL_INDEXES TO DW_ETL;
GRANT SELECT ON ALL_OBJECTS TO DW_ETL;
GRANT SELECT ON DBA_OBJECTS TO DW_ETL;
GRANT SELECT ON ALL_TABLES TO DW_ETL;

Depending on Oracle database configuration, some of these privileges may have to be granted on views starting with V$ rather than V_$, e.g. V$THREAD instead of V_$THREAD. The documentation mentions that, however there is an apparent typo where instead of V_$ it has V $.

Even though the log mining account does not query the source Oracle tables directly, it still requires the SELECT privilege on each source table, so a separate script granting that has to be executed. If SELECT privileges are missing, data capture will fail.

In addition to enabling supplemental logging at the database level, it has to be also enabled for each source Oracle table. The CDC Designer console will generate a script to do that (more on this below).

CDC Instance Design Workflow

Once the Oracle database is prepared for data capture, the CDC Designer console can be used to create a CDC instance. A CDC instance refers to a combination of a source Oracle database, a CDC service, and a SQL Server CDC enabled database that stores change data. Note that the same CDC service may be associated with more than one CDC instance (in our case we could have used a single CDC service to capture changes from both of our source Oracle databases). It is also possible to create multiple CDC instances for the same source Oracle database, and include a different set of source tables under each CDC instance.

To design our CDC instances, we used a specific workflow that resulted in a set of deployment scripts. We then used these scripts to quickly deploy CDC instances to various environments, instead of having to use the CDC Designer console interactively every time a new CDC instance needed to be created. The workflow steps follow. For each step, I will mention some details that are relevant in the context of the step.

Step 1

Start the New CDC Instance wizard and follow through the steps. The wizard does the following:

  • Creates change data database
  • Prompts for Oracle connection information and credentials
  • Prompts to add source tables to the CDC instance
  • Creates a script to enable supplemental logging for each source table
  • Validates Oracle environment, enables CDC in the change data database, and saves CDC instance configuration

There are several details to be added here:

  • If the wizard is canceled, it will drop the change data database it has created.
  • The change data database created by the wizard uses the default database options of the model database, which may or may not be optimal for your purposes.
  • When specifying Oracle connection info, you can enter it in the format suggested on the wizard page (host:port/servicename); however, you can also enter a TNS name from TNSNAMES.ora file on the server where the CDC service is running. The latter option is particularly useful when connecting to a RAC cluster, since additional connection options such as multiple addresses of RAC nodes can be specified in the TNSNAMES.ora file.
  • It is not necessary to add all (or any) source tables to the CDC instance as a part of running the wizard. It is entirely possible to create an empty CDC instance and add tables later.
  • Similarly, it is also not necessary to run the Oracle supplemental logging script as a part of running the wizard. In fact, there is a reason to postpone this step until later (more on this below). Also, since that script must be executed with sufficient privileges, there is a good chance that it will have to be handed over to an Oracle DBA.

Step 2

If you have not added any source tables while running the wizard, do that now. For a given CDC instance, use the Properties command under Actions, click on the Tables tab in the Properties dialog, and then click Add Table. Note that the Table Selection dialog may appear a bit slow and unresponsive with large Oracle databases. When searching for a table, you can use % as a wildcard in the Table name pattern box. If you have many tables to add, you don’t have to add them all at the same time. You can add tables in batches – each batch will be saved with the CDC instance.

Step 3

While in the Properties dialog, click on the Advanced tab and add any advanced options you might need (available options are described in product documentation). In our implementation, we added two options:

  • cdc_stop_on_breaking_schema_changes. When set to 1, this will stop data capture with an error if a schema change occurs in the Oracle database such that accurate data capture can no longer be guaranteed. If this option is missing or is set to 0, then in the event of such schema change, the corresponding capture instance in the SQL Server change data database will be silently dropped, while data capture continues for the rest of the tables. In our project, capturing consistent data was more important than capturing the latest data, therefore we chose to stop the capture process in the event of a schema change in the Oracle database, rather than continue without some table(s). It is important to understand the mechanism that is used to detect schema changes – more on that below. Note that advanced option names are case sensitive, and this particular option is incorrectly spelled as CDC_stop_on_breaking_schema_changes in the documentation. The correct option name is all lower case; otherwise the option will not be enabled.
  • trace. By default, SQL 2012 Oracle CDC writes basic status information to the cdc.xdbcdc_trace table in the change data database. This includes status changes, errors, etc. More detailed tracing can be enabled by setting the trace option to 1. This can be done even while the CDC instance is running. An even more detailed trace can be created by setting this option to SOURCE. Note that in the latter case, a large amount of trace data will be generated quickly, so it is not recommended to keep this on for a long time. Trace data is very helpful in troubleshooting CDC instance configuration and operation problems. When tracing to the cdc.xdbcdc_trace is not possible, i.e. because the SQL Server instance is down, trace data will be written to the Application event log on the server where the CDC service is running. We added this option with the value of 0, to not trace detailed information by default, but to make the option visible in case tracing will be required later. Note that by default, the CDC service prunes the cdc.xdbcdc_trace table to contain only the last 24 hours of trace data.

Step 4

Once all tables and advanced options have been added to the CDC instance, click on CDC Instance Deployment Script under Actions. This generates a SQL Server script that will create the change data database and configure it for data capture. We used this script as a starting point to create our own CDC instance deployment scripts, described below. A few notes on the generated script:

  • The change data database is always created with the SQL_Latin1_General_CP1_CS_AS collation, which may not match the server collation and is not necessarily the optimal choice.
  • Even though only SQL Server 2012 is supported, the script sets the compatibility level of the change data database to 100. This should be 110.
  • The script adds an extended database property named Description. Even though the value of this property may be blank, it still must be present – removing this part of the deployment script will break CDC Designer console when it works against the deployed CDC instance.
  • The password of the Oracle log mining account is not scripted. This makes sense from security perspective, but it will cause problems if the CDC instance is deployed using the deployment script just as it was generated by the console (more on handling Oracle account password below).

Step 5

Split the generated deployment script into three parts:

Script 1

The first script creates the change data database, sets its options as required, and enables the database for CDC. Here we adjusted database file sizes and filegroup configuration, used an appropriate collation, set database recovery model as required, and changed the database owner to a specific server principal. Other parts of the generated deployment script remained unchanged.

Script 2

The second script creates mirror tables and capture instances in the change data database. This is a good point to take a step back and describe some relevant details of SQL 2012 Oracle CDC architecture.

Mirror tables are tables in the SQL Server change data database that correspond one-to-one to the source tables in the Oracle database. Each mirror table has exactly the same column names as the Oracle table, these columns are in the same order as in the Oracle table, and have data types that match the Oracle data types as closely as possible.

Mirror tables will remain empty at all times. In fact, to ensure that, the generated deployment script denies all DML permissions on each mirror table. Conceptually, this makes sense – the change data database will contain only changes, not the entire contents of source tables in the Oracle database. Mirror tables are used to create capture instances in the change data database by calling the sys.sp_cdc_enable_table stored procedure, but actual data changes occur in the corresponding Oracle tables, not in SQL Server. This is really the “secret sauce” of SQL 2012 Oracle CDC: The CDC service captures changes in the Oracle database and writes them to the SQL Server change data database, however it writes them not to the mirror tables, but to the corresponding system tables in the cdc schema (the ones with the _CT suffix), which are created when each mirror table is enabled for CDC. With this design, any consumer of change data uses the well-established API of SQL Server CDC (i.e. the fn_cdc_get_all_changes and fn_cdc_get_net_changes functions), even though the actual changes occur in an Oracle database. In other words, consumers of change data can be completely unaware that changes actually originate in an Oracle database – working with the change data database is no different from working with the regular SQL Server CDC enabled database.

One reason for having to keep all columns of the Oracle table in the mirror table is to enable detection of schema changes in Oracle. For every captured column in a mirror table, the column_id value in the cdc.captured_columns table must match the COLUMN_ID value of the corresponding column in the corresponding Oracle table (as it appears in the SYS.ALL_TAB_COLUMNS Oracle view). If the mirror table had fewer columns than the corresponding Oracle table, then its column_id values would not match the Oracle COLUMN_ID values, which would be interpreted as a schema change in the Oracle database. As noted earlier, this either causes the capture instance for the table to be dropped, or stops data capture altogether.

Now, just because the mirror tables must contain all columns of the corresponding Oracle table does not mean that the capture instance must actually capture changes in all of these columns. If you are only interested in a subset of columns in a table, you can modify the call to the sys.sp_cdc_enable_table stored procedure and change the @captured_column_list parameter to only include the columns of interest. This is what we did to avoid the overhead of capturing many unneeded changes.

Finally, we made another rather important change to the mirror table definitions. When a capture instance is created for a table in a CDC enabled database, the sys.sp_cdc_enable_table stored procedure will enable the net changes mode for the capture instance (i.e. will create the fn_cdc_get_net_changes function) only if the table has a primary key/unique constraint or a unique index. The net changes mode can simplify and improve performance of the ETL code that consumes changes in the change data database and applies them to the destination database. In the all changes mode, which is the only option when the source table does not have a unique constraint or unique index, each change must be applied to the destination database individually, one row at a time, which is both relatively complex and inefficient. In the net changes mode however, all changes processed in an incremental ETL run can be applied as a set (i.e. using the MERGE statement) – this is both faster and simpler. Now, if the source Oracle table has a primary key constraint defined, then the corresponding mirror table in the generated deployment script will have that primary key constraint as well, and the net changes mode will be enabled. However, in our source system, most tables did not have a declared primary key, but did have one or more unique indexes, typically on a sequence based not nullable numeric column. Therefore, in the corresponding mirror tables, we made the matching columns not nullable, and added primary key constraints on these columns. This allowed us to use the net changes mode for most of the source tables and greatly simplified our ETL code.

Script 3

The third script drops the CDC capture job, which is created automatically when a database is enabled for CDC but is not used with SQL 2012 Oracle CDC. We added a call to the sys.sp_cdc_change_job to increase the retention period for change data from the default of three days. Then, we modified the UPDATE statement that writes CDC instance configuration info to the cdc.xdbcdc_config table in the change data database to include the encrypted password for the Oracle account (if you recall, this is not included in the generated script). The password must be encrypted by the asymmetric key that the CDC service creates in the change data database, which is done by calling the ENCRYPTBYASYMKEY() function. The CDC service will use the same key to decrypt the password in order to use it for connecting to Oracle. However, the caveat here is that it may take some time after the change data database is created for the CDC service to create the key. So by the time this script runs, the key may or may not exist. Therefore, we added a WAITFOR loop to the script to wait until the key appears in the database, before attempting to write configuration info. The part of the third script that does all this looks like this:

:setvar OracleAccountPassword "<SENSITIVE>" 
:setvar OracleConnectString "TNSNAME"

-- Wait for the CDC service to create the encryption key
WHILE NOT EXISTS (
                
SELECT 1
                
FROM sys.asymmetric_keys
                
WHERE name = 'xdbcdc_asym_key'
                
)
    
WAITFOR DELAY '00:00:05';

UPDATE cdc.xdbcdc_config SET
    
connect_string = N'$(OracleConnectString)',
    
use_windows_authentication = 0,
    
username =N'DW_ETL',
    
password = ENCRYPTBYASYMKEY(ASYMKEY_ID('xdbcdc_asym_key'), N'$(OracleAccountPassword)'),
    
transaction_staging_timeout = 120,
    
memory_limit= 50,
    
options= N'cdc_stop_on_breaking_schema_changes=1;trace=0;'
;

On a couple of occasions, we noticed that the WHILE loop would run indefinitely, waiting for the key to be created. While we did not find out the root cause for the CDC service not creating the key for a new change data database, in all such cases restarting the CDC service caused the key to be created, and allowed this script to finish.

Step 6

At this point, we have the customized scripts needed to recreate the CDC instance with the desired configuration, so we can delete the existing "prototype" CDC instance in the CDC Designer console. Note that deleting the CDC instance also drops the associated change data database. Once deleted, we can go ahead and execute the three scripts described above in the same sequence, which should result in a new CDC instance, configured as desired. The instance status will be INITIAL (you may need to refresh the CDC Designer console to see it).

Step 7

Next step is to create the Oracle supplemental logging script using the corresponding action in the CDC Designer console. Note that we deliberately postponed this step until now. If we did this as a part of initial CDC instance design, the generated script would enable supplemental logging for all columns of every source table, causing unnecessary overhead on the Oracle side. However, at this point the supplemental logging script will only enable logging for those columns that are being captured in each table, i.e. the columns specified in the @captured_column_list parameter of the sys.sp_cdc_enable_table stored procedure. The Oracle supplemental logging script can now be handed over to an Oracle DBA to be executed. Note that when running this script, database activity against a table may prevent a log group for that table from being created. If this happens, temporarily suspend the Oracle database workload, if possible, or wait until a period of low activity in the Oracle database. Then, rerun the script. The script generated by the CDC Designer console is idempotent and can be executed against the same Oracle database as many times as needed while the CDC instance is in the INITIAL status. If you have to manually edit the Oracle supplemental logging script, be aware that each Log Miner supplemental log group for a table cannot contain more than 32 columns. If you are capturing more than 32 columns per table, create additional log groups for the same table. The CDC Designer console handles this correctly, so we chose to simply regenerate the script when a change was needed. Also note that if for some reason a column that is added to a capture instance is omitted from the Oracle supplemental logging script, everything will work without error, however the column will always be NULL in the change data.

Step 8

Once supplemental logging is enabled for each source table, you can start the CDC instance. It may take about a minute to start change capture – the status will be changing from INITIAL to IDLE to PROCESSING, and the monitoring counters on the console will start showing non-zero values. If you see LOGGER as the status, it could mean two things. One is that the CDC instance encountered a temporary delay mining Oracle logs, i.e. due to a spike in the amount of changes. If so, LOGGER should change to PROCESSING fairly soon, typically within seconds. However, if you see LOGGER for a relatively long time, i.e. for longer than a couple of minutes, it probably means that there is some problem with mining Oracle logs. If so, check the cdc.xdbcdc_trace table for errors. If necessary, set the trace parameter to either 1 or SOURCE to troubleshoot further.

Step 9

Once the CDC instance is processing changes, you can start the initial load from the source Oracle database into the destination SQL Server database, and once that completes, start running incremental loads. For both types of loads, we used the new CDC components in SSIS 2012 (CDC Control task, CDC Source and CDC Splitter transformations) to manage LSN ranges and to consume change data from the change data database. Since these components use the standard SQL Server CDC APIs, using them for Oracle-sourced data is no different from using them against a native SQL Server CDC enabled database.  For SSIS implementation details, refer to the CDC Flow Components MSDN documentation, and also to Matt Masson’s TechEd 2012 presentation, which describes several design patterns for working with CDC components in SSIS 2012.

Here are some other operational details related to the CDC Designer console:

  • While the console is typically installed on the same server where the CDC service is running, it does not have to be. Similar to SQL Server client tools, it can be installed on any machine that has connectivity to the SQL Server instance where the MSXDBCDC database and the change data database(s) are hosted. When you start the CDC Designer console, it prompts you for a SQL Server name to connect to. If you deployed SQL 2012 Oracle CDC in multiple environments, then you could manage all environments from multiple instances of CDC Designer console on the same administrator machine (but see the next point).
  • The console does not display the name of SQL Server instance it is connected to. If you open multiple instances of the CDC Designer console to manage SQL 2012 Oracle CDC in multiple environments (i.e. QA and PROD), then you may not be able to easily tell which is which. Needless to say, you don't want to make a change in the wrong environment.
  • Neither the CDC Designer console, nor the CDC Service console displays the product version. As of this writing, the only way to determine which version is installed is to check the properties of files on disk in C:\Program Files\Change Data Capture for Oracle by Attunity.
  • All configuration and status information that the CDC Designer console uses or displays is stored in the change data database(s), in several tables in the cdc schema prefixed with xdbcdc_ (note that these appear under System Tables in SSMS). This means that you can monitor each CDC instance by querying these tables. For us, low data latency was very important, therefore we set up a data latency monitor job that would run the following script every minute and raise a warning if data latency exceeded a predefined threshold:
DECLARE @status nvarchar(16),
        
@sub_status nvarchar(64),
        
@status_message nvarchar(512),
        
@last_transaction_timestamp nvarchar(30),
        
@last_change_timestamp nvarchar(30),
        
@db_name sysname,
        
@current_latency int,
        
@latency_alert_threshold_minutes int = 15 -- May be adjusted as required
;

SELECT TOP (1)
      
@status = ISNULL(status, '')
      ,
@sub_status = ISNULL(sub_status, '')
      ,
@status_message = ISNULL(status_message, '')
      ,
@last_transaction_timestamp = ISNULL(CAST(last_transaction_timestamp AS nvarchar(30)), '')
      ,
@last_change_timestamp = ISNULL(CAST(last_change_timestamp AS nvarchar(30)), '')
      ,
@current_latency = DATEDIFF(minute, last_transaction_timestamp, SYSUTCDATETIME())
      ,
@db_name = DB_NAME()
FROM cdc.xdbcdc_state
WHERE DATEDIFF(minute, last_transaction_timestamp, SYSUTCDATETIME()) > @latency_alert_threshold_minutes
;

-- Raise a warning if no transactions have been captured within the threshold number of minutes.
-- May not be actionable during periods of low activity in the source system.
IF @@ROWCOUNT > 0
    
RAISERROR(
'WARNING: Oracle CDC has not captured any transactions for at least %d minutes. Most recent captured Oracle transaction occurred %d minutes ago. Database: %s, Status: %s, SubStatus: %s, Status Message: %s, Last Transaction Timestamp (UTC): %s, Last Change Timestamp (UTC): %s.',
16, 1, @latency_alert_threshold_minutes, @current_latency, @db_name, @status, @sub_status, @status_message, @last_transaction_timestamp, @last_change_timestamp
            
);

Troubleshooting

Here I will mention several problems that we ran into, and what the solution was for each. This list is by no means comprehensive, but may provide some helpful troubleshooting ideas and techniques.

  1. Character data types in the deployment script generated by the CDC Designer console were doubled. For example, for a VARCHAR(10) column in the Oracle table, the mirror table had nvarchar(20). This was a bug in the product, fixed in the latest version referenced at the top of this post.
  2. Occasionally, data capture would become permanently suspended with the LOGGER status. From the trace, we could see that a call to Log Miner resulted in the "SCN not found" error (in Oracle, SCN stands for System Change Number, which is analogous to LSN in SQL Server). This had everyone stumped for a while, until we noticed that this was consistently happening in one particular environment only. The root cause turned out to be rather simple – that environment had the original RTM version of the product deployed (this is something that isn’t easy to notice unless you check file versions on disk). Installing the latest bits resolved the problem.
  3. Change data values for a non-nullable column in an Oracle table were all NULL in the capture instance _CT table. This turned out to be because the column was mistakenly omitted from the Oracle supplemental logging script.
  4. As mentioned earlier, we were sourcing data from two independent Oracle databases. Both systems were running the same version of the same off-the-shelf Oracle application, with each system being used by a different regional organization of our customer. Our initial assumption was that the database schemas of the two Oracle databases must be the same, too. Therefore, we used the CDC instance deployment script generated for the first Oracle database to deploy the CDC instance for the second database, too. However, we found that as soon as we started the second CDC instance, it would log this error to the trace: "ORACDC323I:The metadata validation for Oracle table cdc.<table name>_CT failed. Column <column> index was changed", and then immediately drop the capture instance for the table (the cdc_stop_on_breaking_schema_changes option was not enabled at that time). It turned out that our initial assumption was incorrect – each database had slightly different schemas due to customizations supported by the Oracle application that each organization has made. Specifically, column order in some tables was different. Because the method used to detect schema changes is based on comparing column order between Oracle tables and mirror tables (see above), this resulted in the above error. Note that the error message may be slightly misleading on the first read – you may think it's referring to an index on the referenced column. In fact, the word "index" refers to the column ordinal position in the table. The solution here was to maintain separate CDC instance deployment scripts for the two source databases.
  5. Occasionally, data capture would become suspended with the LOGGER status. Unlike the previous similar problem, the error in the trace was different. We saw that a call to Log Miner resulted in the "missing logfile" error. There were three distinct causes for this. In one case, it was happening because the backup job on the Oracle database server would delete archived logs while they were still being mined by SQL 2012 Oracle CDC. The solution was to increase the retention period for archived logs. In another case, running out of disk space prevented Oracle from archiving an online log once it has rolled over to a new log file. This resulted in an inactive online log without a corresponding archived log. SQL 2012 Oracle CDC does not use inactive online logs, so when it requested a particular SCN range from Log Miner that happened to be in that inactive online log, that range was not found in any log file. This problem was resolved once Oracle DBAs freed some disk space and allowed the inactive online log to be archived. Finally, we have seen the same error in a non-production environment when Oracle DBAs had to reset the logs as a part of database maintenance work. In this case, the only way to resolve the problem was to stop capture and reset CDC using the Reset action in the CDC Designer console. This removes all change data. If your requirement is not to miss any changes, you will need to restart data capture and run the initial load again.

If you run into a problem and your initial troubleshooting is unsuccessful, you may need to open a support case. If so, it is highly recommended to collect a set of diagnostic information for the support engineers (actually this may help you in your troubleshooting efforts as well). The CDC Designer console provides a convenient Collect Diagnostics link, which will create a text file with diagnostics collected from both Oracle and SQL Server databases. Note that this includes the contents of cdc.xdbcdc_trace table. If you have set the trace option to SOURCE sometime prior to using Collect Diagnostics, then the trace table may be so large that Collect Diagnostics will fail because of a timeout retrieving all that data. Therefore, while it is recommended to have trace set to SOURCE while Collect Diagnostics is running, it’s best to do that just prior to clicking on the link.

In addition to the output from Collect Diagnostics, if the problem seems to be related to the Oracle database and Log Miner, then it is also helpful to include the output from the following Oracle queries, which shows the state of online and archived logs:

SELECT * FROM V$LOG;
SELECT * FROM V$LOGFILE;
SELECT * FROM V$ARCHIVED_LOG;
SELECT * FROM V$DATABASE;

To conclude, SQL 2012 Oracle CDC is still a relatively new feature of SQL Server, as of this writing. The purpose of this blog is to make it easier for SQL Server developers and DBAs who are working on data warehousing projects to discover, implement, and support this feature. Please post your comments and questions below.

 

Leave a Comment
  • Please add 7 and 2 and type the answer here:
  • Post
  • Hey, this is a great article and has helped us out a huge amount.  One thing we're unable to work out.  How would you recover from error if the Mark Processed Range task failed, but everything else before had executed and committed ok?

  • Glad it helped.

    As a general pattern, you want to make all ETL between Get Processing Range and Mark Processed Range restartable. In other words, it should be able to transparently reprocess data that has already been processed and is present at the destination. Either MERGE or INSERT ... WHERE NOT EXISTS work well for that if you stage change data from the CDC Source in intermediate tables, before applying it to the final destination. Another option is to use the __$reprocessing column. CDC Source will set it to 1 for every row that may have already been processed.

    This way, once you resolve the root cause of the failure, the next run after the failed one will succeed.

  • Does this functionality require Oracle Enterprise or will it work fine on Oracle Standard?

  • SQL 2012 Oracle CDC documentation does not make a distinction between Enterprise and Standard edition. Here are the supported versions of Oracle database from the current documentation:

    [Supported] Source Oracle Database

    Oracle Database 10g Release 2: 10.2.0.1-10.2.0.5 (patchset as of April 2010)

    Oracle Database 11g Release 1: 11.1.0.6-11.1.0.7 (patchset as of September 2008)

    Oracle Database 11g Release 2: 11.2.0.1-11.2.0.3 (patchset as of September 2011)

    I would expect everything to work on Standard as long as Standard includes Log Miner. To be certain, you can ask this question in the Attunity support forum - see link at the top of this blog post.

  • Thanks Dimitri.

  • What kind of impact did this have on the Oracle database's performance?  The oracle database we are using is the OLTP database?

  • We did not see any noticeable impact on the source database performance, but that surely depends on the specific workload, i.e. the rate of changes in your OLTP database. Testing is recommended.

    One impact on the source database that we did see was that the Oracle alert log was flooded with Log Miner informational messages, such as "Begin mining logfile for session ..." and "End   mining logfile for session ...". From what we were able to find, this is a part of Log Miner functionality, and cannot be disabled.

Page 1 of 1 (7 items)