October, 2011

Dan on eScience & Technical Computing @ Microsoft

eScience & Technical Computing - Web Services and Scientific Research

  • Dan on eScience & Technical Computing @ Microsoft

    .NET Bio: the new name for Microsoft Biology Foundation and now open source

    Microsoft Research is putting .NET Bio, a bioinformatics toolkit, into the Outercurve Foundation, allowing community involvement in the future of this open-source project.

    See the full post by Simon Mercer describing the transfer to Outercurve as a new Research Accelerator and the new functionality being included in this release.

    There is a training event on .NET Bio this week (October 20-21) at UCSD.



    The Microsoft Biology Foundation (MBF) has undergone a significant transformation since it was first released. Over time, it’s become clear that a new name was also in order. So today, I am pleased to announce that MBF will now be known as .NET Bio. In addition to the new name, .NET Bio will also have a new location: the Outercurve Foundation. This move is the next logical step in the life of the project: transferring its ownership to a nonprofit foundation that is dedicated to open-source software underscores our community-led philosophy; while Microsoft will continue to contribute to the code, it will do so as one among a growing community of users and contributors.


    Users can perform a range of tasks with .NET Bio, including:

    • Importing DNA, RNA, or protein sequences from files with a variety of standard data formats, including FASTA, FASTQ, GFF, GenBank, and BED.
    • Constructing sequences from scratch.
    • Manipulating sequences in various ways, such as adding or removing elements or generating a complement.
    • Analyzing sequences by using algorithms such as Smith-Waterman and Needleman-Wunsch.
    • Submitting sequence data to remote websites (for example, a Basic Local Alignment Search Tool [BLAST] website) for analysis.
    • Outputting sequence data in any supported file format, regardless of the input format.

    Microsoft Biology Foundation Evolves into New Toolkit: .NET Bio - Microsoft Research Connections Blog
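
    To make the task list above a bit more concrete, here is a minimal sketch in plain Python of three of those steps: importing FASTA records, generating a reverse complement, and scoring a Smith-Waterman local alignment. This is not the .NET Bio API. The function names (parse_fasta, reverse_complement, smith_waterman_score) and the scoring values are assumptions chosen purely for illustration.

```python
from io import StringIO

# Translation table for DNA bases; N (unknown base) maps to itself.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def parse_fasta(handle):
    """Yield (identifier, sequence) pairs from a FASTA-formatted text stream."""
    identifier, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if identifier is not None:
                yield identifier, "".join(chunks)
            header = line[1:].split()
            identifier, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line.upper())
    if identifier is not None:
        yield identifier, "".join(chunks)


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score (linear gap penalty).

    Only the score is computed, not the alignment itself, so two rows of the
    dynamic-programming matrix are enough.
    """
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0]
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            current.append(score)
            best = max(best, score)
        previous = current
    return best


if __name__ == "__main__":
    sample = StringIO(">seq1 example record\nATGC\nGGTA\n>seq2\nTTAA\n")
    for name, sequence in parse_fasta(sample):
        print(name, sequence, reverse_complement(sequence))
```

    A real toolkit adds much more on top of this (alphabet validation, format-specific metadata, affine gap penalties, traceback to recover the aligned regions), which is the reason to reach for a library rather than hand-rolled code.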

  • Dan on eScience & Technical Computing @ Microsoft

    Big Data and LINQ in CACM


    Just read "The World According to LINQ" in October’s Communications of the ACM. Erik Meijer does a really good job describing LINQ and how it can be used with Big Data from many different data sources (i.e., databases, REST services, and other unstructured data sources). He also describes the mathematical foundations of LINQ.

    The World According to LINQ

    Erik Meijer

    Big data is about more than size, and LINQ is more than up to the task.

    Programmers building Web- and cloud-based applications wire together data from many different sources such as sensors, social networks, user interfaces, spreadsheets, and stock tickers. Most of this data does not fit in the closed and clean world of traditional relational databases. It is too big, unstructured, denormalized, and streaming in real time. Presenting a unified programming model across all these disparate data models and query languages seems impossible at first. By focusing on the commonalities instead of the differences, however, most data sources will accept some form of computation to filter and transform collections of data.

    Erik Meijer. 2011. The world according to LINQ. Commun. ACM 54, 10 (October 2011), 45-51. DOI=10.1145/2001269.2001285 http://doi.acm.org/10.1145/2001269.2001285

    The World According to LINQ | October 2011 | Communications of the ACM
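
    The abstract's point about filtering and transforming collections from any source can be sketched in a few lines. The example below is Python rather than C#, and it is only an analogy to LINQ's standard query operators Where, Select, and SelectMany, not an implementation of LINQ. The helper names (where, select, select_many, hot_sensors) and the sample sensor data are assumptions made up for illustration.

```python
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def where(source: Iterable[T], predicate: Callable[[T], bool]) -> Iterator[T]:
    """Lazily keep only the items that satisfy the predicate (a filter)."""
    return (item for item in source if predicate(item))


def select(source: Iterable[T], transform: Callable[[T], U]) -> Iterator[U]:
    """Lazily map each item to a new value (a projection)."""
    return (transform(item) for item in source)


def select_many(source: Iterable[T], expand: Callable[[T], Iterable[U]]) -> Iterator[U]:
    """Lazily map each item to a collection and flatten the results."""
    return (inner for item in source for inner in expand(item))


def hot_sensors(readings: Iterable[dict]) -> Iterator[str]:
    """One query, written once, that works on any iterable of readings."""
    return select(where(readings, lambda r: r["celsius"] > 30.0), lambda r: r["sensor"])


def sensor_stream() -> Iterator[dict]:
    """Hypothetical stand-in for a streaming source that yields readings one by one."""
    yield {"sensor": "roof", "celsius": 33.1}
    yield {"sensor": "lab", "celsius": 21.4}
    yield {"sensor": "yard", "celsius": 30.2}


if __name__ == "__main__":
    # The same query runs over an in-memory table and over a stream.
    table = [{"sensor": "a", "celsius": 31.5}, {"sensor": "b", "celsius": 22.0}]
    print(list(hot_sensors(table)))
    print(list(hot_sensors(sensor_stream())))
```

    Because the helpers only assume that the source can be iterated, the query itself does not care whether the data sits in a list, arrives from a generator, or comes from some other provider. That is the commonality the abstract refers to, shown here in a much reduced form.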
