Recently, a deployment with cache servers on domain1 and cache clients (app servers or web servers) running on domain2 led to some debugging challenges when the cache clients were unable to communicate with the servers. Whether this describes your DEV servers accessing cache servers or a production topology that is common in your enterprise, this blog might help unblock you with a quick workaround.
In such a deployment, your cache client might receive the following exception:
Message : ErrorCode<ERRCA0016>:SubStatus<ES0001>:The connection was terminated, possibly due to server or network problems or serialized Object size is greater than MaxBufferSize on server. Result of the request is unknown.
We understand that this message is misleading, especially when the object being stored is only a few bytes (a string object, for instance). You might also notice that instantiating the DataCacheFactory and getting a reference to the DataCache object in your code succeeds, and that the exception is thrown only when the first cache operation (GET or PUT) is executed.
Here is a set of things to confirm before concluding that this is the problem:
Here is an extract from a trace file captured when this problem occurred.
2010-9-15 13:33:01.466
DistributedCache.ClientChannel.Client1
0x000005CC
Creating channel for [net.tcp://SERVER1.DOMAIN1:22233]."
----
2010-9-15 13:33:01.664
DistributedCache.DRM.Client1
0x00000A1C
'2:-1' PUT;Routed;MyCache;Default_Region_0982;1975349082;test key;Version = 0:0 - Starting to process."
2010-9-15 13:33:01.665
Config for [MyCache,1975349082] is [net.tcp://SERVER1:22233 (120)]."
The problem is that the DataCacheFactory is instantiated with the fully qualified domain name (FQDN), as seen in the first trace entry above (SERVER1.DOMAIN1). The internal routing table, however, stores only the short server name (SERVER1), as the last trace entry shows. A cache operation therefore fails, because DNS on the cache client machine (app server or web server machine) is unable to resolve SERVER1.
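To confirm this diagnosis from a cache client machine, it can help to check whether the short server name resolves alongside the FQDN. The following is a minimal Python sketch, not part of the product; the helper names `can_resolve` and `check_cache_host` are hypothetical, and the host name is taken from the trace above. It assumes the short name is simply the first label of the FQDN.

```python
import socket
from typing import Callable, Dict


def can_resolve(host: str) -> bool:
    """Return True if the local resolver can map `host` to an address."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False


def check_cache_host(
    fqdn: str, resolver: Callable[[str], bool] = can_resolve
) -> Dict[str, bool]:
    """Check both the FQDN and its short name (the first DNS label).

    The resolver is injectable so the check can be exercised without
    touching the network; by default it uses the machine's own resolver.
    """
    short_name = fqdn.split(".", 1)[0]
    return {fqdn: resolver(fqdn), short_name: resolver(short_name)}


if __name__ == "__main__":
    # Host name from the trace extract; replace with your cache server.
    for name, ok in check_cache_host("SERVER1.DOMAIN1").items():
        print(f"{name}: {'resolves' if ok else 'does NOT resolve'}")
```

If the FQDN resolves but the short name does not, the client is likely hitting the situation described above.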
The above snapshots were taken from a customer engagement, which resulted in this key lesson learnt. We deeply appreciate such feedback and the customer's patience in working with us to identify the root cause.
We have surfaced this issue to the product team, which is fixing it in the next release.
Author: Rama Ramani
Reviewers: Jaime Alva Bravo, Rahul Kaura, Jason Roth