Moving to Multi-Tenant Database Model Made Easy with SQL Azure Federations

Moving to Multi-Tenant Database Model Made Easy with SQL Azure Federations

Rate This
  • Comments 2

In an earlier post I talked about how federations provide a robust connection model. I wanted to take a moment to drill into the great ability of the filtering connections and how it makes taking a classic on premise database model known as; single-tenant-per-database model to a fully dynamic multi-tenant database model. With the move, app has to make little changes to the business logic but now can support multiple tenants per database to take advantage of best economics.

Obvious to many but for completeness, lets first talk about why single-database-per-tenant model causes challenges;

  1. In some cases the capacity requirements of a small tenant could scale down below the minimum capacity database available in SQL Azure. Assume you have tenants with MBs of data. Consuming a GB database wastes valuable $s. You may ask why there is a scale-down limit on a database in SQL Azure and it is a long discussion but the short answer is this; there is a constant cost to a database and a smaller database does not improve much in cost to provide benefits. In future we may be able make optimizations but for now, 1GB Web Edition database size is the smallest database you can buy.
  2. When the tenant count is large, managing very large number of independent surfaces causes issues. Here are a few; with hundreds or thousands of tenants, if you were to create thousands of databases for them, you would find that operations that are scoped globally to address all your data such as schema deployments or queries that need to report size usage per tenant across all databases would take excessively long to compute.
  3. Apps that have to work with such large number of databases end up causing connection pool fragmentations. I touched on this on previous post as well but to reiterate; assuming that your middle tier that can get a request to route its query to any one of the hundreds or thousands of tenants, the app server ends up maintaining as many pools  as the number of tenants. That is simply because every app server would have a single connection in each pool pointing to the individual db that contain each tenant. Some of these connection go ‘cold’ If each app server does not always get a request for each tenant frequently enough. So many times, the app finds no connection in the pool thus need to establish a cold connection from scratch creating performance problems.

Federations address all of these problems. Imagine a Federation with ‘TenantID’ as the federation key;

  1. Multi-tenant database model with federations allow you to dynamically adjust the number of tenants per database at any moment in time. Repartitioning command like SPLIT allow you to perform these operations without any downtime.
  2. Federation allow dense packing of tenants because you are not stuck with a static tenant layout. This means you get cost efficient dynamic tenant model that can adjust to growing workloads in individual tenants or easily onboard new tenants.
  3. With federations connection pool efficiency is built in: You can connect through a single endpoint and don’t have to worry about connection pool fragmentation for your applications regular workload. SQL Azure’s gateway tier does the pooling for you.

What’s the catch? Isn’t there any downside to multi-tenancy?

Sure there are downsides to implementing multi-tenancy in general. The most important one is database isolation. With multi-tenant database model, you don’t get strict security isolation of data between tenants. For example, you can no longer have a user_id that can only access a specific tenant in that database in SQL Azure with a multi-tenant database. You have to govern access always through a middle tier to ensure that isolation is protected. Access to data through open protocols like OData or WebServices can solve these issues.

Another complaint I hear often is the need to monitor and provide governance between tenants in your application in the multi-tenant case. Imagine a case where you have 100 tenants in a single database and one of these tenants consume excessive amount of resources. Your next query from a well-behaving tenant may get slowed-down/blocked/throttled in the database. In v1, federations provide no help here. Without app level governance you cannot create a fair system between these tenants in federations v1. The solutions to governance problem tends to be app specific anyways so I’d argue you would want this logic in the app tier and not in the db tier anyways. Here is why; with all governance logic, it is best to engage and deny/slow-down the work as early as possible in its execution lifetime. So pushing this logic higher in the stack to app tier for evaluation is a good principle to follow. That is not to say db should not participate in the governance decision. SQL Azure can contribute by providing data about a tenant’s workload and in future versions we may help with that.

Yes, there are issue to consider… But multi-tenancy has been around for a while and there are also good compromises and workarounds in place.

I like the multi-tenant model and I’d like to migrate my single-tenant per db over. How do I inject tenantID into every query?

Lets switch gears and assume you are sold… Lets see how federations help.

In a single-tenant app, the query logic in application is coded with the assumption that all data in a database belongs to one tenant. With multi-tenant apps that work with identical schemas, refactored code simply injects tenant_id into the schema (tables, indexes etc) and every query the app issues, contains the tenant_id=? predicate. In a federation, where tenant_id is the federation key, you are asked to still implement the schema changes. However federations provide a connection type called a FILTERING connection that automatically injects this tenant_id predicate without requiring app refactoring. Our data-dependent routing sets up a FILTERING connection by default. Here is how;

  1: USE FEDERATION orders_federation(tenant_id=155) WITH RESET, FILTERING=ON

Since SQL Azure federations know about your schema and your federation key, you can simply continue to issue the “SELECT * FROM dbo.orders” query on a filtered connection and the algebrizer in SQL Azure auto-magically injects “WHERE tenant_id=155” into your query. This basically means that connection to federations do the app logic refactoring for you…You don’t need to rewrite business logic and fully validate this code to be sure that you have tenant_id injected into every query.

With federations there is still good reasons to use UNFILTERING connections. FILTERING connections cannot do schema updates or does not work well when you’d like to fan out a query. So, “FILTERING=OFF” connections come with no restrictions and can help with efficiency. Here is how you get an UNFILTERING connection.

  1: USE FEDERATION orders_federation(tenant_id=155) WITH RESET, FILTERING=OFF 

Existence of two modes of connection however create also an issue; what happens when TSQL intended for FILTERING connections is executed on an UNFILTERING connection? We can help here; federation provide a way for you to detect the connection type; in sys.dm_exec_sessions you can query the federation_filtering_state field to detect the connection state.

  1: IF (select federation_filtering_state from sys.dm_exec_sessions 
  2:   where @@SPID=session_id)=1
  3: BEGIN
  4:   EXEC your single-tenant logic …
  5: END
  6: ELSE 
  7: BEGIN
  8:   RAISERROR ..., "Not on a filtering connection!"
  9: END

There are many other ways to ensure this does not happen but defensive programming is still the best approach to ensure robust results.

So net net, multi-tenancy is great for a number of reasons and you can move applications with heavy business logic coded in single-tenant model, with smaller work into a multi-tenant model with federations. I specifically avoid saying this is a transparent port due to issues I discussed and you still need to consider compromises in your code.

When federations become available this year, we’ll have more examples of how to move single-tenant applications over to multi-tenant model with federations but for those of you dreading the move, there is help coming.

-Cihangir Biyikoglu

  • Very useful information. We are building a multi tenanted application in Azure, and have gone with a multi tenanted architecture in SQL Azure. We have taken the approach of having separate schemas per tenant. So the question is - can Federation work with the concept of DB Schemas instead of tenant_id to separate the data?

    One challenge we face is how to permit users to take backups of their data in this scenario/ Any thoughts on this - either in the Schema approach, or in the tenant_id approach?

  • Hi Cihan,

    This is an interesting concept. It addresses a big reason not to go from single tenant to multi tenant dbs in the reuse of a lot of the data access logic, reporting etc.

    What I wonder about is some of the administrative benefits of single tenant such as:

    *a single tenant database means clients data can be backed up independently and restored independently.

    Consider Tim's comment from www.mygreatwindowsazureidea.com/.../655599-enable-backups

    "Support phone line rings: Hello…… I see, one of your users has run the year end routine by mistake and you want to restore your database from last night's backup? Ok, well, your database is SQL Azure and it is totally resilient to any kind of disaster and has a guaranteed uptime of over 99.9% and is totally scalable. It is ‘awesome’ but you can’t restore from last night’s backup so better tell your accountant that you have just started a new financial year…"

    Here he is talking about not be able to restore data, this is made even more difficult if it is only a logical database not a separate single tenant database.

    Is there any planning around this, to enable independent tenant backup and restore? Are the 'shards' able to be backed up and restored independently?

    My fear is that every feature that is not in the platform, has to be built by the platform consumer, and making these low level features from scratch is expensive.

    I would appreciate your thoughts.

    Cheers

    Simon

Page 1 of 1 (2 items)