When working with federations, database schema requires a special consideration.
With federations, you take parts of your schema and scale it out. With each federation in the root database, a subset of objects are scaled-out. You may create multiple federations because distribution characteristics and scalability requirements may vary across sets of tables. For example, with an ecommerce app, you may both have a large customer-orders set of tables and a very large product catalog that may have completely different distribution requirements.
All the schema artifacts in the root and in federation members are scoped to the database. Meaning objects in the root database are only visible when connected to root and objects in the members are only visible in the member. There is no schema enforcement across members of a federation to have exactly matching schema elements. So federation member 1 and 2 may have completely different schema.
The root database, provide the containment boundary for all information required to access all the information app cares about – root know about all federations and all federation members in these federation members and their distribution scheme and current distribution layout. This is all exposed through metadata for federations. New bunch of system tables like sys.federation, sys.federation_members and more will give you this information in the root and in federation members. Root’s containment boundary also serves as a control point for authentication and account management across all federations and their members. the following figure represent the setup I described above;
To express all this, there are surprisingly few annotations required in the federated schema. All object other than tables do not require any special annotations. Stored procedures, indexes, functions or triggers work within the scope of the database they are created in, regardless of the specific location of the object: in the root or in any federation member. The only annotations required are on the federated tables. Lets take a step back and take a look at the types of tables in databases with federations. Federations introduce 3 types of tables; federated tables, reference tables and central tables.
Federated Tables: Refer to tables that contain data that is distributed by the federation. Federated tables are created in federation members and CREATE TABLE syntax contain a federation distribution key annotated with the FEDERATED ON(federation_distribution_key = column_name) clause.
CREATE TABLE t1(…) FEDERATED ON (distribution_key=column_name)
In the figure above, federated tables are marked light blue. Federated tables contain part of the scaled out data. There can be many federated tables in a federation member. When a SPLIT operation is issued; federated table schema is fully copied to the destination members. However, federated table data is filtered based federation key and based on the value of the split point to the destination federation members.
First thing to remember is that column specified in the federated tables as the distribution key is required the exactly match the data type of the federation schemes data type definition. This is specified as part of the CREATE FEDERATION statement for the federation.
Another important aspect is that the column that is the federation key is present in each federated table. For some tables, this may require denormalization. For example, order_details table may not contain customer_id but to annotate the table as part of the classic orders & order_details schema, customer_id needs to be added to the order_details table as well as the orders table.
There are a few others requirements to remembers;
Reference Tables: Refer to tables that contain reference information to optimize lookup queries in federations. Reference tables are created in federation members and no special annotation is required when creating reference tables. The absence of the FEDERATE ON clause is all that is needed. Reference tables typically contain small lookup information useful for query processing such as looking up zipcodes that is cloned to each federation member.
In the figure above, reference tables are marked green. When a SPLIT operation is issued, reference table schema and data is cloned to both destination members.
A few things to remember with reference tables;
Central Tables: Refer to tables that are created in the federation root for typically low traffic objects such as application metadata. No special annotation is required for the these. In the figure above, central tables are marked orange. Central tables simply only exist in the root database context and are accessible only when you are in the root database. Some of the limitations listed above do not apply to the root database and tables created in the root database. Since centralized tables are in the root database only, they do not participate in any repartitioning operations. I refer to this above as well but one thing to remember is that centralized tables are only visible when connected to the root and not accessible when connected to the member databases.
Other Considerations for Federation Member Schema
Finally, a few other limitation in v1 exist in federation member schema.
The above list summarizes the list of requirements for the schema. As always, if you have questions, feel free to reach out through the blog.
-Cihan