Being Cellfish

Stuff I wished I've found in some blog (and sometimes did)

Change of Address
This blog has moved to blog.cellfish.se.
Posts
  • Being Cellfish

    Using triggers to protect your database from developers - still not a good idea

    • 0 Comments

A long time ago I mentioned that triggers are rarely a good option in your database. I recently read this article (requires free registration) that shows you a solution to prevent developers from updating a full table (or deleting all of it) by mistake because they failed to add a WHERE clause to their query. Is this a good reason to add triggers to your database? I think not! There are so many things wrong here in my opinion...

First of all, the script proposed in the example has the same trigger logic duplicated three times (once each for the update, delete, and update/delete triggers) when that duplication could easily have been avoided, since the script is just building a string to run anyway. So from the start the proposed script could easily contain bugs that differ depending on the type of trigger added.

Second, this only protects against the simplest mistake: forgetting your WHERE clause altogether. What if a query has a clause like "WHERE ID > 1" (assuming there is at least one record not satisfying that condition)? You are probably still miserable.

I would still say that the solution is stored procedures and a proper security setup, so that users can only execute stored procedures and cannot modify data directly. That way the stored procedure can be reviewed, and even checked automatically for things like always having a WHERE clause. The stored procedure can also be covered by unit tests. All of that should be enough, and better than adding triggers that give you a false sense of protection. And remember your backups in case you fail to catch something like this earlier. But that goes without saying whenever you work with a database, I think.

  • Being Cellfish

    SQL server backup myths

    • 0 Comments

Any kind of backup other than a straight file copy has always been magic to me. Mostly because in the early years of my career it was so rare that the backups actually worked when needed, because nobody ever tested the backups to see if they were actually working... This post (free registration needed to read) goes over some backup-related SQL server myths. Way too many screenshots for my taste, but interesting.

  • Being Cellfish

    Azure Websites and logs that are no longer updated

    • 0 Comments

Last weekend I helped my father-in-law (who is hosting his company web site on azure websites) troubleshoot why he was no longer getting any data in the web server logs. Logs that he uses to analyze what content is popular. Well, it turned out that you should read the FAQ... Once your logs reach 35MB they are no longer updated until you clean up the directory. I guess it kind of makes sense, because I did think it was weird that you could have any amount of logs for your website for free. There is no free lunch, I guess... I just wish the user interface where you turn this feature on would mention the limit in its little help popup, because there are probably a lot of people out there who don't read the FAQ and just assume the data will be there...

Also, the logs didn't just magically start appearing again, due to a bug mentioned here. I had to change the size of the VM temporarily and then back in order to get the logs going again.

  • Being Cellfish

    Asynchronous enumerations - summary

    • 0 Comments

So what did we end up doing after the discussion two months ago? Well, we ended up using option #1, mostly because it simplified things to have all items available during processing. Technically that was not needed, but it simplified things.

My generic advice would be to use option #1 or #2, and to favor option #2 whenever possible, since #1 can easily be implemented by wrapping #2. If your project is already heavy on Rx, then and only then would I consider option #3.
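The claim that option #1 can be implemented by wrapping option #2 can be sketched like this (a minimal sketch; the extension method name is my own, not from the series):

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

static class AsyncEnumerationExtensions
{
    // Option #1 (Task<IEnumerable<T>>) implemented as a thin wrapper
    // around option #2 (IEnumerable<Task<T>>): await every per-item
    // task and hand back the completed collection.
    public static async Task<IEnumerable<T>> ToTaskOfEnumerableAsync<T>(
        this IEnumerable<Task<T>> items)
    {
        T[] results = await Task.WhenAll(items);
        return results;
    }
}
```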

    Here are the links to previous parts for your convenience:

  • Being Cellfish

    Asynchronous enumerations - Part 5: MyEnumerationAsync

    • 0 Comments

    For background, read the introduction.

If you have no need for filtering using LINQ (or Rx) and you do not want to expose an IEnumerable, you can create your own "IEnumeratorAsync" that asynchronously fetches the next item. Again, this is something I would advise against. First of all you lose the option to use constructs like foreach, since you do not use the standard interfaces. Second, you lose the option of using LINQ even for simple things.
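A minimal sketch of what such a hand-rolled "IEnumeratorAsync" could look like (the interface below is hypothetical; the post does not give one):

```csharp
using System.Threading.Tasks;

// Hypothetical hand-rolled asynchronous enumerator. Because it is not
// an IEnumerator<T>, neither foreach nor LINQ can be used with it.
public interface IEnumeratorAsync<T>
{
    // The item fetched by the most recent MoveNextAsync call.
    T Current { get; }

    // Asynchronously fetch the next item; false when the end is reached.
    Task<bool> MoveNextAsync();
}
```

Consumption then has to be spelled out by hand, `while (await e.MoveNextAsync()) Process(e.Current);`, instead of a foreach loop.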

  • Being Cellfish

    Asynchronous enumerations - Part 4: IEnumerable<Task<IEnumerable<T>>>

    • 0 Comments

    For background, read the introduction.

This is really just a variant of option #2, using the fact that you use azure tables and know that you will get several items at a time. In my opinion you shouldn't even consider this option, since it relies on a feature of the underlying storage layer, and compared to IEnumerable<Task<T>> you do not get any benefit other than creating fewer Tasks. Compared to the time it takes to actually retrieve the data items, the overhead of creating a task for each item is negligible.
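For illustration, consuming this shape could look like the sketch below (the names are mine): each task yields one segment, e.g. one batch of rows from an azure table query, and the segment's items are processed as soon as that segment has arrived.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class SegmentedConsumer
{
    // Consume an IEnumerable<Task<IEnumerable<T>>>: await one segment
    // at a time and process its items as soon as the segment arrives.
    public static async Task ProcessAsync<T>(
        IEnumerable<Task<IEnumerable<T>>> segments, Action<T> process)
    {
        foreach (Task<IEnumerable<T>> segmentTask in segments)
        {
            foreach (T item in await segmentTask)
            {
                process(item);
            }
        }
    }
}
```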

  • Being Cellfish

    Asynchronous enumerations - Part 3: IObservable<T>

    • 0 Comments

    For background, read the introduction.

    In part 2 I briefly mentioned that if you have simple filtering needs then you might implement your own LINQ-like methods. Well if you have more advanced filtering needs there is already something out there to help you; Reactive Extensions (Rx). The only thing you have to do is to expose your data result as an IObservable<T>. So are there no disadvantages with this approach? Well I think there are several.

• Rx has a very large set of extension methods to do all kinds of fancy things to your data collection. Except for really simple stuff the learning curve can be quite steep, since there are often many ways to achieve the same thing.
• Some interfaces have unexpected results at first glance. For example, the Count() method does not return an integer but an IObservable<int>. Naturally that is the way it has to be in order to be asynchronous, but for new developers this adds to the learning curve of Rx and easily leads to misunderstandings.
    • A lot of developers are very familiar with IEnumerable and LINQ. Tasks and async/await are becoming household knowledge for more developers each day. Rx is a new framework you have to learn.
    • Since Rx uses IObservables rather than IEnumerables and Tasks your code will be inconsistent in its use of asynchronous patterns. Something that might have impact on your maintainability.
• Rx is a framework that is not part of .NET but must be downloaded separately (even though it is developed by Microsoft).

As you can see, while Rx is a great fit for asynchronous enumerations (that is exactly what an IObservable<T> is), it uses clever constructs rather than language support to achieve it. Things that are clever often end up complex rather than simple, and complex stuff typically breaks. But if your project already uses Rx a lot then it's a no-brainer; use Rx!
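A small illustration of the Count() point above, assuming the Reactive Extensions package is referenced (System.Reactive on NuGet in current versions):

```csharp
using System;
using System.Reactive.Linq;

class RxCountExample
{
    static void Main()
    {
        IObservable<int> numbers = Observable.Range(1, 10);

        // Count() does not return an int but an IObservable<int> that
        // produces the count once the source sequence completes.
        IObservable<int> evenCount =
            numbers.Where(n => n % 2 == 0).Count();

        evenCount.Subscribe(c => Console.WriteLine(c)); // prints 5
    }
}
```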

  • Being Cellfish

    Asynchronous enumerations - Part 2: IEnumerable<Task<T>>

    • 2 Comments

    For background, read the introduction.

This is essentially the opposite of option #1. With this approach you are optimized for processing one item at a time, or even processing all items in parallel! So this is a very good option if you do not need all your data at once and you have a large number of rows to work on. This option is also very flexible if you have a few scenarios where you actually need all the data, since you can always use Task.WhenAll to wait for all data to be available.

The only big drawback with this approach is filtering using LINQ. Since it is an enumeration of tasks, there is no way for you to asynchronously wait for an item in order to, for example, filter on its value using LINQ. Hence this is not a good option if you need to do a lot of filtering on your items. If you have just a few simple filters, it might be worth implementing your own filters like WhereAsync and SelectAsync. All that together might be worth it given your data and scenarios.
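A sketch of what a hand-rolled WhereAsync of the kind mentioned above could look like (the signature is my guess, not from the post; note that awaiting every item in order to filter it effectively collapses the result back into option #1):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class AsyncFilters
{
    // Awaits each task so the predicate can see the actual value,
    // returning the filtered items once all tasks have completed.
    public static async Task<IEnumerable<T>> WhereAsync<T>(
        this IEnumerable<Task<T>> source, Func<T, bool> predicate)
    {
        var result = new List<T>();
        foreach (Task<T> task in source)
        {
            T item = await task;
            if (predicate(item))
            {
                result.Add(item);
            }
        }
        return result;
    }
}
```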

  • Being Cellfish

    Asynchronous enumerations - Part 1: Task<IEnumerable<T>>

    • 2 Comments

    For background, read the introduction.

If you need all the data before you start processing it, or if you expect just a few records each time, then this is probably the simplest way to get an "asynchronous enumeration". Technically the enumeration is not asynchronous, since you retrieve all the data asynchronously first and then process it. You can also easily use LINQ on the enumeration for processing, which is very nice.

The only real drawback is that you're not flexible in case you one day retrieve a lot of data (or no longer need all the data at once for processing). If you have those needs from day one, then this is not the option for you, I think.
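A minimal sketch of option #1 (the data source below is hypothetical): await once, then use plain LINQ on the result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

class Option1Example
{
    // Hypothetical data source that returns all rows at once.
    static async Task<IEnumerable<int>> GetRowsAsync()
    {
        await Task.Delay(10); // stand-in for the real I/O
        return new[] { 1, 2, 3, 4, 5 };
    }

    static async Task Main()
    {
        IEnumerable<int> rows = await GetRowsAsync();

        // After the single await, ordinary LINQ just works.
        foreach (int n in rows.Where(x => x % 2 == 0))
        {
            Console.WriteLine(n);
        }
    }
}
```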

  • Being Cellfish

    Asynchronous enumerations - Introduction

    • 0 Comments

A couple of weeks ago I had a discussion with a co-worker about what would be the proper way to asynchronously iterate over some data in azure tables. Exploring different options was very interesting and helped us understand the pros and cons of each asynchronous strategy. So over the next few weeks I'll go over each option we looked at in more detail. But first a little background so you understand the basic problem.

When you retrieve rows from an azure table you can obviously get all the rows first and then do whatever you need to do with the result. However, if that is a lot of rows it means a (relatively) long wait to get the data, and it will also use a lot of memory. If you do not need all that data at once, or if the first thing you do is some kind of filtering that leaves only a few records in the final collection you want to work on, then asynchronous retrieval of records would be a good thing. If you know a little about azure, you know that when you retrieve records from azure tables you may or may not get a continuation token. That is something most people never deal with manually, but in this case you might want to, for example by using the BeginExecuteSegmented method. That method gives you asynchronous access to zero or more rows at a time.

Already you can see that there are several different scenarios to take into account. Do you need all the records together for your processing? Can you process each record individually? Do you need to scan a lot of records, keeping only a few? Depending on what you need, I believe your choice of "asynchronous enumeration" will differ. But rest assured; I'll help you pick the right one for you! The options I will cover in the next few weeks are:

    • Get the whole enumeration asynchronously, a.k.a. Task<IEnumerable<T>>
    • Get each item asynchronously, a.k.a. IEnumerable<Task<T>>
    • Use reactive extensions, a.k.a. IObservable<T>
    • Get segments asynchronously, a.k.a. IEnumerable<Task<IEnumerable<T>>>
    • Create your own custom solution, a.k.a. MyEnumerationAsync
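For reference, the segmented retrieval described above can be sketched as follows, using the Task-based counterpart of BeginExecuteSegmented from the Microsoft.WindowsAzure.Storage table client (the SDK names are from memory; MyEntity is a hypothetical entity type):

```csharp
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage.Table;

class SegmentedRead
{
    static async Task ReadAllAsync(CloudTable table, TableQuery<MyEntity> query)
    {
        TableContinuationToken token = null;
        do
        {
            // Each call returns zero or more rows plus, possibly,
            // a continuation token pointing at the next segment.
            TableQuerySegment<MyEntity> segment =
                await table.ExecuteQuerySegmentedAsync(query, token);
            foreach (MyEntity row in segment.Results)
            {
                Process(row);
            }
            token = segment.ContinuationToken;
        } while (token != null);
    }

    static void Process(MyEntity row) { /* handle one row */ }
}

class MyEntity : TableEntity { }
```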

     
