Geeks Bearing Gifts

Unleash your potential

AppFabric Cache: Read from & Write to Database (Read through - Write behind)


The Windows Server AppFabric 1.1 CTP has the much-awaited read-through and write-behind feature.

  • Read through - If a requested item doesn't exist in the cache, the data can be loaded from the backend store and then inserted into the cache. With a read-through provider, the cache detects the missing item and calls the provider to perform the data load. The item is then seamlessly returned to the cache client.
  • Write behind - In the same way, items that are added or updated in the cache can be periodically written to the backend store using a write-behind provider. This happens asynchronously, on an interval defined by the cache.

One key scenario where this feature comes in handy is when a client requests an item that has already been removed from the cache. The item will automatically be loaded back into the cache from the backend store. Likewise, the changes you make to items in the cache are persisted asynchronously from time to time without any user intervention.

To use this feature, you will have to write a provider (a class library) that extends the DataCacheStoreProvider abstract class, install the provider in the GAC of each cache host in the cluster, and finally configure your named cache to use the provider. To get started, make sure you have installed the Windows Server AppFabric 1.1 CTP. Next, create a C# class library and reference the Microsoft.ApplicationServer.Caching assembly. For the 1.1 CTP, this is normally located under 'Program Files/Windows Server Appfabric'. Now implement the abstract class DataCacheStoreProvider.

using System;
using System.Collections.Generic;
using Microsoft.ApplicationServer.Caching;

namespace SampleProvider
{
    public class Provider : DataCacheStoreProvider
    {
        private String dataCacheName;
        private String connectionString;

        public Provider(string cacheName, Dictionary<string, string> config)
        {
            dataCacheName = cacheName; // Store the cache name for future use
            connectionString = config["DbConnection"]; // Key matches the -ProviderSettings entry used while registering the provider
        }

        public override void Delete(System.Collections.ObjectModel.Collection<DataCacheItemKey> keys) { }

        public override void Delete(DataCacheItemKey key) { }

        protected override void Dispose(bool disposing) { }

        public override void Read(System.Collections.ObjectModel.ReadOnlyCollection<DataCacheItemKey> keys, IDictionary<DataCacheItemKey, DataCacheItem> items) { }

        public override DataCacheItem Read(DataCacheItemKey key) { return null; } // implemented below

        public override void Write(IDictionary<DataCacheItemKey, DataCacheItem> items) { }

        public override void Write(DataCacheItem item) { }
    }
}
A few things about the above code first. Your provider needs a constructor with the signature public Provider(string cacheName, Dictionary<string, string> config). These parameters are passed in when your cache host initializes the provider: cacheName contains the name of the cache for which you have created the provider, and config contains any configuration details, such as the connection string, that you pass while registering your provider in the final step.
The other methods you need to implement are Read, Write, Delete and Dispose. You need not implement all of them; if you only want the read-through feature, just implement Read. The following list gives you a gist of what each method does.

  • Read - Reads a value from the database, loads it into the cache and returns it to the client requesting the cache item.
  • Write - Persists items in the cache to the backend store (database) after a specified interval.
  • Delete - Deletes an item from the backend store when a cache item is removed from the cache by the client.
  • Dispose - Called when the host is shutting down; any cleanup code goes here.

Implementing Read

Read has two overloads: one takes a collection of keys (along with a dictionary to fill with the corresponding items), and the other takes a single DataCacheItemKey. Read requests from the cluster are sent as a collection of keys. Let us first implement the overload with the DataCacheItemKey parameter and then use it to process the other Read overload, which accepts the collection.

public override DataCacheItem Read(DataCacheItemKey key)
{
    Object retrievedValue = null;
    DataCacheItem cacheItem;

    retrievedValue = ReadFromDatabase(key.Key); // Your own method that searches the backend store based on the key

    if (retrievedValue == null)
        cacheItem = null;
    else
        cacheItem = DataCacheItemFactory.GetCacheItem(key, dataCacheName, retrievedValue, null);

    return cacheItem;
}
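ReadFromDatabase above is your own helper, not part of the AppFabric API. A minimal sketch of such a helper, assuming a hypothetical SQL Server table named CacheData with Key and Value columns and the connection string captured in the constructor, could look like this:

private object ReadFromDatabase(string key)
{
    // Hypothetical schema: a CacheData table with Key and Value columns.
    // Adapt the table name, columns and (de)serialization to your backend store.
    using (var connection = new System.Data.SqlClient.SqlConnection(connectionString))
    using (var command = new System.Data.SqlClient.SqlCommand(
        "SELECT [Value] FROM CacheData WHERE [Key] = @key", connection))
    {
        command.Parameters.AddWithValue("@key", key);
        connection.Open();
        return command.ExecuteScalar(); // null when the key is not in the backend store
    }
}

Whatever the helper returns must be serializable, since the cluster hands it back to the cache client.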

You need to use the DataCacheItemFactory class to create your cache item, as follows: DataCacheItemFactory.GetCacheItem(key, dataCacheName, retrievedValue, null). If the item is not found, we return null, which matches the cache cluster behaviour. Next, let's implement the Read overload which accepts the collection of keys.

public override void Read(System.Collections.ObjectModel.ReadOnlyCollection<DataCacheItemKey> keys, IDictionary<DataCacheItemKey, DataCacheItem> items)
{
    foreach (var key in keys)
    {
        items[key] = Read(key);
    }
}

As mentioned before, we are simply using the single-key overload 'public override DataCacheItem Read(DataCacheItemKey key)' to read the items from the backend store. You need to add the items to the IDictionary<DataCacheItemKey, DataCacheItem> items parameter, as in the code above, before returning from the Read method.

 

Implementing Write

Write is implemented in the same way as Read. It's important to remove the items that have been written to the backend store from the IDictionary<DataCacheItemKey, DataCacheItem> items parameter. If you have successfully written all the values to the backend store, you can simply clear the dictionary by calling items.Clear(). A sketch of such an implementation follows the method signature below.

 public override void Write(IDictionary<DataCacheItemKey, DataCacheItem> items)

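This overload's body is where the actual persistence happens. A minimal sketch, assuming a hypothetical WriteToDatabase helper (not part of the AppFabric API) that inserts or updates a key/value row using the stored connection string:

public override void Write(IDictionary<DataCacheItemKey, DataCacheItem> items)
{
    // Track the keys that were persisted successfully so they can be removed from 'items'.
    var written = new List<DataCacheItemKey>();

    foreach (var pair in items)
    {
        WriteToDatabase(pair.Key.Key, pair.Value.Value); // hypothetical upsert helper
        written.Add(pair.Key);
    }

    // Remove only what was actually written, per the guidance above.
    foreach (var key in written)
    {
        items.Remove(key);
    }
}

public override void Write(DataCacheItem item)
{
    WriteToDatabase(item.Key, item.Value);
}

If every item was written successfully, this is equivalent to calling items.Clear() as described above.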
 

Compiling your provider

Your provider assembly should be signed with a strong name key. To do this, go to Project Properties > Signing > check 'Sign the assembly' > choose or create a strong name key file.

 

Installing your provider

Your provider must be placed in the GAC of every host in the cluster. Open the Visual Studio command prompt and run the following command to add your provider to the GAC.

gacutil /i SampleProvider.dll

You now need the fully qualified name of your provider assembly. To get it, run the following command (without the .dll extension):

gacutil /l SampleProvider

This would give you a message as follows:

SampleProvider, Version=1.0.0.0, Culture=neutral, PublicKeyToken=0dca281230246e10, processorArchitecture=MSIL

We need to prefix the fully qualified class name to the assembly name above and remove the processorArchitecture part. We will use this string to register the provider with the cache:

SampleProvider.Provider, SampleProvider, Version=1.0.0.0, Culture=neutral, PublicKeyToken=0dca281230246e10

Registering your provider

Time to fire up the Caching Administration Windows PowerShell console. Here I am registering the provider for an existing named cache. If you are creating a new cache, use the same parameters with the New-Cache command.

Set-CacheConfig TestCache -ReadThroughEnabled true -WriteBehindEnabled true -WriteBehindInterval 60 -ProviderType "SampleProvider.Provider, SampleProvider, Version=1.0.0.0, Culture=neutral, PublicKeyToken=0dca281230246e10" -ProviderSettings @{"DbConnection"="<your connection string>";}

If your provider supports read-through, set -ReadThroughEnabled to true. If your provider supports write-behind, set -WriteBehindEnabled to true and also set -WriteBehindInterval, the interval (in seconds) at which cache items are persisted to the backend store. The minimum value that can be set in this CTP is 60 seconds. In -ProviderType, pass the string we constructed in the "Installing your provider" step. To pass configuration parameters to the constructor 'public Provider(string cacheName, Dictionary<string, string> config)', use -ProviderSettings as in the example above. Make sure you apply the same configuration on all the hosts in your cluster. Once that is done, start your cache cluster. If any of your cache hosts fails to start, there is probably a problem with your provider; look for errors in the system event log.

 

Testing your provider

To test your provider, create a sample test client and request an item that is not present in the cache. The Read method of your provider will be called when the cluster fails to find the item in the cache. To test Write, add an item to the cache cluster and wait for the -WriteBehindInterval to pass. A minimal test client is sketched below.
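A sketch of such a client, assuming the client's cache configuration (for example, the dataCacheClient section) already points at the cluster and the named cache is TestCache as in the registration example:

using System;
using Microsoft.ApplicationServer.Caching;

class TestClient
{
    static void Main()
    {
        DataCacheFactory factory = new DataCacheFactory(); // uses the configured cache hosts
        DataCache cache = factory.GetCache("TestCache");

        // A miss on this key should invoke the provider's Read (read-through)
        // and, if the backend store has it, return the value to the client.
        object value = cache.Get("SomeKeyOnlyInTheDatabase"); // hypothetical key
        Console.WriteLine(value ?? "Not found in cache or backend store");

        // This item should be persisted by the provider's Write once the
        // write-behind interval elapses.
        cache.Put("NewKey", "NewValue");
    }
}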

 

Debugging your provider

To debug your provider code, open the class library project, go to Debug > Attach to Process and select DistributedCacheService.exe. Set breakpoints as necessary; Read will be hit when you request a key.
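If attaching to the process is awkward (for example, when the provider fails while the cache host is starting, before you get a chance to attach), one alternative, not part of the original steps, is to request a debugger from inside the provider constructor:

public Provider(string cacheName, Dictionary<string, string> config)
{
    // Prompts to attach a debugger as soon as the cache host loads the provider.
    // For development use only; remove before deploying to a real cluster.
    if (!System.Diagnostics.Debugger.IsAttached)
    {
        System.Diagnostics.Debugger.Launch();
    }

    dataCacheName = cacheName;
    connectionString = config["DbConnection"];
}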

 

Feel free to send in your comments or questions. Happy coding. :)

 

Comments
  • Hi Prathul,

    can you please post code for implementing the Delete method. I have a problem that I have described here:

    social.msdn.microsoft.com/.../e03c2c18-cdb7-4e78-b827-13695d414579

    Thanks, Andree

  • Hi Andree,

    Your implementation of Delete is fine. Even I noticed the same behaviour in both 1.1 CTP and 1.1 RTM. When a key is removed, the count is decremented to a negative value. Could be a bug. I will convey this to the appropriate team.

  • Which I am looking for.

    Thank you very much.

  • Hi,

    Do you think that "Read through - Write behind" could be used for persisting data in a backend store? In our IS, we are planning to use AppFabric Caching as a datasource for all of our web sites. This feature could help us to "dump" the cache at regular intervals.

    Thanks,

  • Yes, you can use read through to load data from database if its not present in the cache and write behind to write the data at regular intervals.

  • Maybe I was a little unclear. Data will be put in the cache by back/middle tier components/notification messages received from external providers (write model) and will be consumed by our front end services/websites (read model). The Read through - Write behind feature will be used "only" for storing items in the cache (maybe a NoSQL database). It's not our main database, but simply a physical copy of the cache (seems better than requesting a provider again when there is a cache miss). Another reason for this is that we do not want to have business logic in the cache host (painful deployment)

    Do you think it is a good approach ?

  • Hi,

    I won't be able to give a definite answer unless I understand your scale, requirements, etc., which I think will not be possible. For clarity, let me break down your statements and give my comments on each so that you can better understand the limitations and take a call.

    Data will be put in the cache by back/middle tiers components/notifications messages received from external providers (write model) and will be consumed by our front end services/websites (read model).

    Do you mean to say that the data will be put into the cache by an external provider? If so, this is not an implementation of the write-behind feature of AppFabric. In the write-behind feature of AppFabric, the data in the cache is periodically written to the backend store. Do note that if a cache item is changed multiple times, only the latest value is persisted. With the read-through feature, if a product with id=3 is requested and that particular data is not present in the cache, you can load it into the cache by querying your backend.

    Typically the cache items are changed through front end (shopping cart of user) or middle tier. If you are indeed proposing to use an external source to write to the cache, you will have to take into consideration the complexities involved. For example, what would happen if a value was changed by some other source and was overwritten by your external source?

    The Read through - Write behind feature will be used "only" for storing items in the cache (maybe a NoSQL database). It's not our main database, but simply a physical copy of the cache (seems better than requesting again a provider when there is a cache miss).

    You are just dumping the cache data loaded from your external source into a NoSQL database. Are you proposing that if there is a cache miss, you will request the data from the NoSQL database rather than your actual backend? Unless your backend operation is very costly, there is no significant performance gain here. You also need to consider what would happen when an item that is not present in the NoSQL store is requested for the first time.

    Another reason for this is that we do not to have business logic in the cache host (painful deployment)

    If you are going by read through-write behind approach, as you must have already read, you need to install your custom provider in all the cluster hosts. Of course you will have to give a mapping on where to read a key from or where it must be written to.

  • First of all, sorry for my poor English.

    I have read some posts and articles on the Read through - Write behind feature and I think I understand it quite well now. I am an architect on a quite large web site (more than 100 million page views/month).

    1) (on providers)

    No. Providers won't update the cache directly. We have publish/subscribe services that are notified by providers. Each service is implemented by our team and will put data into the cache.  

    2) (on dumping)

    Yes, the idea is dumping the cache in order to have a fast, read-heavy database. As a classic WebForms app, our web site does not scale very well and the database is quite overloaded (average CPU at 50% and more than 3,000 sp execs/s; cannot scale up).

    2 bis) Of course! Read through - Write behind is a first fallback method. The second one is the cache-aside programming model. These two implementations may seem the same but will improve performance and availability. Yes, our backend operation is very costly because of the database, and we need frequent refreshes of the UI (like trading). We will also have many named caches for all of our Biz domains (trading, account, marketing, cms, offer...)

    3) (Another reason ...)

    As I said, reloading data can involve providers, other middle tiers or the main database... I find it quite painful to put all the logic in the provider. It does not seem flexible (publish to each host and restart the cluster).

    Thanks for all,    

  • I got the picture now. I understand that you have proper synchronization in place so that you don't use stale data from cache. Your implementation looks fine.
