It's been a while since I last looked at Rx, and I must confess that my first impression was that the number of ways to do the same thing, and the number of extension methods, was overwhelming at first. But as with any new framework, after a while you settle on a few that solve your most common problems. Still, I never saw a really compelling reason to use Rx, and very few of my friends and colleagues use it. So last week I was excited to read that Netflix uses it a lot, and I'm looking forward to reading more about how they use it and what they've learned along the way.
This was brought to my attention and I was blown away by the fact that somebody would mark classes as TestClass without any tests in them just to reuse some setup code, and then make assumptions about the order in which the methods are called. If you really want to do that, the constructor is a great place for it, which is also why I prefer xUnit.net: it does not have a TestInitialize attribute but uses the constructor instead. But I actually think it is better to be explicit about how you initialize your tests than to rely on the order in which base classes happen to be called by default. In general I believe the consensus is that composition is superior to inheritance when it comes to code reuse, which is why you should not put yourself in a situation where you rely on execution order between base and child classes.
Recently I was asked about a specific scenario where some code that already used tasks was being converted to use TAP. Since the existing code used the "IsFaulted" property a lot, I came up with this little extension method that makes it easy to see whether a task failed or not using the await keyword, without adding a lot of exception handling.
public static Task<bool> WaitAsync(this Task task)
{
    var tcs = new TaskCompletionSource<bool>();
    // Complete with true only if the task ran to completion.
    task.ContinueWith(
        t => tcs.TrySetResult(!(t.IsFaulted || t.IsCanceled)));
    return tcs.Task;
}
So why avoid exceptions? Well, I think that is a separate topic, but a few short observations: exceptions have a performance penalty, so you don't want to use them for common errors, and some people think exceptions make it harder to follow the flow of the code. Whatever you think about all that, I think you'll find the WaitAsync method useful more or less often.
Time for my 2012 according to this blog's statistics. As last year, we'll start off with the five most read posts:
For the fourth year in a row native code coverage reports is the most read article I have on my blog. I was sad to see Object calisthenics not make the top five. Another interesting thing though is that Ten commandments of egoless programming was linked by some popular sites, which got it into the top five. It in turn links to the Viking laws, which turned out to be much more popular than the commandments post. Exactly the way I feel. The top ten search terms to land on this blog in 2011 were (2010 ranking in parentheses):
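As a minimal sketch of how the method might be used, the snippet below repeats the extension method so it stands on its own and adds a caller. The class names TaskWaitExtensions and WaitAsyncDemo and the method DescribeAsync are made up for illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskWaitExtensions
{
    // Same method as above, repeated so the sketch is self-contained.
    public static Task<bool> WaitAsync(this Task task)
    {
        var tcs = new TaskCompletionSource<bool>();
        task.ContinueWith(
            t => tcs.TrySetResult(!(t.IsFaulted || t.IsCanceled)));
        return tcs.Task;
    }
}

public static class WaitAsyncDemo
{
    // Hypothetical caller: no try/catch, the bool tells us how the task ended.
    public static async Task<string> DescribeAsync(Task work)
    {
        bool succeeded = await work.WaitAsync();
        return succeeded ? "succeeded" : "failed";
    }
}
```

Note that the caller never touches the exception itself, so if you need the failure details you still have to look at the Exception property of the original task.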
Once in a while I need to convert an object from one type to another because the two types represent slightly different views of the same data but do not share a common parent. An example would be an object used internally to represent some complex state (let us call it FooComplex) and something simple you just want to return in an API (let us call it FooSimple). So how do I convert between these? There are several options:
fooSimple.ToComplex() and fooComplex.ToSimple()
My first option is to add a member function to each of the classes that converts to the other. The first obvious downside is that both classes now need to know about each other, which is probably really bad since they might be in different assemblies. It also means that when I need to add a property to both classes, I also need to change the converters in two different files.
Explicit casts
You can also implement explicit cast operators, which is really just a variant of the option above. Some may argue that it is better since it uses a language feature to illustrate that you convert between two classes. While that is true, I think it will be confusing, since cast operators are typically used to change the type when the two types actually have an inheritance relationship. So casting between two logically related but in practice separate types might actually be more confusing than using ToXxx methods.
Partial classes
To get around the downside of having ToComplex and ToSimple in two different code files, you can always make FooSimple and FooComplex partial classes. But this might not always be possible.
fooSimple.ToComplex() and FooSimple.FromComplex(FooComplex)
To get around the problem of two-way dependencies and needing to implement the conversion in two different classes (and files), you can just replace one direction with a static FromXxx method. This works pretty well, but now you have two different patterns for converting depending on direction (To vs From).
Extension methods to the rescue!
However, if you make your ToComplex and ToSimple methods extension methods, I think you get the best of all worlds: both directions in the same file, the same pattern, and no explicit cast operators. I also like to call them AsXxx rather than ToXxx to indicate there is some transformation happening.
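A minimal sketch of what that could look like follows. The properties on FooSimple and FooComplex and the class name FooConversions are made up for illustration, since the post only names the two types.

```csharp
using System;

// Hypothetical shapes: only the type names come from the post.
public class FooSimple
{
    public string Name { get; set; }
}

public class FooComplex
{
    public string Name { get; set; }
    public DateTime LastUpdated { get; set; }
}

// Both directions live in one file and neither class references the other.
public static class FooConversions
{
    public static FooSimple AsSimple(this FooComplex complex)
    {
        return new FooSimple { Name = complex.Name };
    }

    public static FooComplex AsComplex(this FooSimple simple)
    {
        // Assumption: state missing from the simple view gets a default value.
        return new FooComplex { Name = simple.Name, LastUpdated = DateTime.UtcNow };
    }
}
```

Callers then write fooComplex.AsSimple() and fooSimple.AsComplex(), and adding a property to both classes means touching the converters in one place only.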
I couldn't resist creating a method for a scenario even less common than WhenSome. The crazy scenario here is that you have N tasks of type Task<T> and you want to return when a random task completes. The easiest way to do this is to just pick a random task and wait for it, like this:
public static Task<T> WhenRandom<T>(params Task<T>[] tasks)
{
    var random = new Random();
    // Shuffle by a random sort key and take the first task.
    return (from task in tasks
            orderby random.Next()
            select task).FirstOrDefault();
}
However, if you also want to be able to cancel the WhenRandom call, you need to be a little more creative I guess:
public static async Task<T> WhenRandom<T>(CancellationToken cancellationToken,
    params Task<T>[] tasks)
{
    var random = new Random().Next(0, tasks.Length);
    var remainingTasks = new List<Task<T>>(tasks);
    while (remainingTasks.Count > 0)
    {
        await Task.WhenAny(remainingTasks);
        // Reconstructed: the listing lost the lines that used the token and removed tasks.
        cancellationToken.ThrowIfCancellationRequested();
        for (int i = 0; i < remainingTasks.Count; i++)
        {
            if (remainingTasks[i].IsCompleted)
            {
                if (random-- == 0)
                    return await remainingTasks[i];
                remainingTasks.RemoveAt(i--);
            }
        }
    }
    return default(T);
}
In this last version I don't pick a random task and wait for it; instead I pick a random completed task. That is, given the same random number, the order in which the tasks complete affects which task is returned.
While I wish I could write a long article on how Spotify works technically this is not what I want to tell you about today. Nor will I tell you how I would build Spotify if I had to, but that would be an interesting blog post. But today I want to tell you about a great article describing how Spotify has organized their teams, how they work and best of all; they have cool names for it too: Squads, Tribes, Chapters & Guilds!
In HTTP 1.1 connections are reused by default. This means that if you make two HTTP requests one after the other, you can do it over the same TCP connection. This saves you the overhead of setting up a new TCP connection. This is even more important if you're using HTTPS, since the SSL handshake to set up the secure communication is relatively expensive for the server. That is why you typically want to reuse connections when you can. If you're using an HTTP client library (such as HttpClient in .NET), the client library will open a few connections if needed (this is configurable and the default is two) and then reuse them as long as new requests arrive soon enough. But this can also be a problem if your server is a cluster, such as a few instances in an Azure deployment. The problem occurs when you have few (relatively speaking) clients generating a lot of requests on your servers.
This is easily illustrated with an example. Let's assume that you have two servers behind a VIP (i.e. for each new TCP connection one server is chosen using a round robin selection) and three clients who each need one TCP connection but do 10 requests per second using this connection. Client one will first open a connection to server one, client two connects to server two and then client three connects to server one. And now we have an uneven load of 20 requests per second on server 1 and 10 requests per second on server 2. This might not be too bad if we had 101 clients and two servers given that each client is equal but that is probably not the case. Consider for example the case where client two in the example above only generates one request per second and closes the connection between each request. Now server one will alternate between 20 and 21 requests per second while server two has zero or one request per second. If you have a mix of these short lived connections and long lived connections and assuming that even the long lived connections once in a while will be closed, then you will notice that some of your servers will have much more load than some others. This is probably not desirable since random servers will have very high load at random times while a lot of servers will have less than average load. This problem is called connection affinity.
You might think that a smarter VIP choosing where to route new connections based on capacity would help and yes it will. A little. The problem is that once a TCP connection is established the VIP cannot do anything to reroute it. At least not your vanilla VIP. Maybe there is some super advanced HTTP inspecting VIP out there but you probably don't have one so don't bother thinking about that too much.
What you want to do is to let each TCP connection handle a few requests to get the best of both worlds: reuse a connection for a while to reduce connection overhead, but once in a while force the client to connect again so that your VIP can load balance. While it is definitely possible to keep a list of all clients currently connected to your web service, that bookkeeping wastes some memory and CPU cycles, and instead you can let math help you. If you want to reuse each connection N times on average, you can randomly choose to close a connection after each request with a probability of 1/N. This works great for a mix of long lived and short lived connections, since the long lived connections will be closed on average every N requests (trust math!) while short lived connections with just a few requests are unlikely to be closed prematurely.
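The random choice can be sketched as below. This is only an illustration: the class name ConnectionRecycler and its members are made up, and how the server actually closes the connection (for example by answering with the standard HTTP/1.1 "Connection: close" response header) depends on your web stack and is left out.

```csharp
using System;

public sealed class ConnectionRecycler
{
    private readonly Random _random;
    private readonly double _closeProbability;

    public ConnectionRecycler(int averageRequestsPerConnection, Random random)
    {
        if (averageRequestsPerConnection < 1)
            throw new ArgumentOutOfRangeException(nameof(averageRequestsPerConnection));
        _closeProbability = 1.0 / averageRequestsPerConnection;
        _random = random; // Random is not thread-safe; guard it on a real server.
    }

    // Call once per handled request; true means "close this connection now".
    public bool ShouldClose()
    {
        return _random.NextDouble() < _closeProbability;
    }

    // Simulates a single long lived connection and counts the requests it serves.
    public int SimulateConnectionLifetime()
    {
        int requests = 1;
        while (!ShouldClose()) requests++;
        return requests;
    }
}
```

Averaging SimulateConnectionLifetime over many simulated connections should land close to N, and no per-connection state is needed to get there.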
You might be tempted to just have a global counter and close a connection every time the total count hits N. This does not achieve what you want. There is a famous problem called the coupon collector's problem that tells us that if you have N options that are equally probable, the number of picks you need to make before you can expect to have picked all N options is roughly N ln(N). That means that if you have C connections and close one every N requests, it will take roughly N C ln(C) requests before you can expect to have closed every connection, so some connections are going to live a lot longer than N requests. Once you add a number of short lived connections it gets even worse. Trusting randomness is much easier and more accurate!
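For reference, the N ln(N) figure is the leading term of the exact expectation for the coupon collector's problem. The symbols T (the number of picks until every option has been picked at least once) and H_N (the N-th harmonic number) are introduced here for illustration:

```latex
\mathbb{E}[T] \;=\; N \cdot H_N \;=\; N \sum_{k=1}^{N} \frac{1}{k}
\;\approx\; N \ln N + \gamma N, \qquad \gamma \approx 0.577
```

For N = 10 options the exact sum gives about 29.3 picks, so even a small pool needs roughly three times as many picks as it has options before everything has been hit once.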
Last week I helped a colleague who was experiencing UnobservedTaskExceptions in his code. The problem was essentially that the code started several tasks and then, in a loop, checked each one to see if it was faulted. If a task was faulted, the method threw an exception. This meant that if two tasks in the collection faulted, the second one was never observed, causing an UnobservedTaskException that brought down the process. While this sounds simple, it turned out to be a hard nut to crack for a number of reasons.
First of all you need to know some things about UnobservedTaskExceptions: while they crash your process by default in .NET 4.0, they don't in .NET 4.5. The fun thing is that you get the 4.5 behavior for your 4.0 assemblies just by having 4.5 installed. There is a way to configure your application to use the old 4.0 behavior and you can read about that and why the default behavior changed here. Forgetting this can frustrate you if your build environment does not have 4.5 but your own machine does.
Second, you have to remember that even if you use async/await you can still end up writing code that has the same "problem". The "problem" is that you only bubble up the first error you see and not all errors. For many reasons I think this is what you actually want (WhenAllOrError anybody?), but if you really want to get all exceptions you can just use Task.WhenAll and you'll be good.
I read this interesting article that illustrated the difference between processes optimized for flow efficiency and processes optimized for resource efficiency. It is maybe not obvious from that article why the flow-optimized process costs less than (or the same as) the resource-optimized one, but if we assume customer satisfaction is a great asset, I think it is obvious which process is preferable from a customer satisfaction perspective...
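A small sketch of the Task.WhenAll approach is shown below. The names WhenAllDemo, FailAsync and CollectErrorsAsync are made up for illustration. Awaiting the combined task rethrows only one exception, but the combined task itself still carries all of them.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Hypothetical helper: a task that always fails with the given message.
    private static async Task FailAsync(string message)
    {
        await Task.Yield();
        throw new InvalidOperationException(message);
    }

    public static async Task<string[]> CollectErrorsAsync()
    {
        Task all = Task.WhenAll(FailAsync("first"), FailAsync("second"));
        try
        {
            await all; // await rethrows only one of the exceptions
        }
        catch (InvalidOperationException)
        {
            // Swallowed on purpose: every failure is read from the task below.
        }
        // The AggregateException on the combined task holds every failure,
        // and reading it marks the exceptions as observed.
        return all.Exception.InnerExceptions.Select(e => e.Message).ToArray();
    }
}
```

Because every task is handed to WhenAll and its Exception property is read, none of the failures is left unobserved.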