I am a developer at Microsoft and work on the .NET Common Language Runtime (CLR) team. For the last 4 years I have been working on virtual machine technologies on a variety of form factors including desktops (Windows, Linux), tablets (Win8), gaming consoles (Xbox 360), and mobile devices (Windows Phone 7, Windows CE, Symbian). I have worked on various core pieces of the runtime including the garbage collector, memory manager, platform abstraction layer, runtime performance, etc. Before working on .NET I worked on Visual Studio Team Foundation Server, Visual Studio Team System, Adobe FrameMaker, Adobe Acrobat, and Texas Instruments' Code Composer Studio.
This is an announcement-only post. Do subscribe to this blog feed or follow http://twitter.com/abhinaba, as I’ll be making more detailed posts on these topics as we get close to handing over these bits to our developer customers.
ARM processors support SIMD (Single Instruction, Multiple Data) instructions through the ARM® NEON™ technology that is available on the ARMv7 ISA. SIMD allows parallelization/HW-acceleration of some operations and hence performance gains. Since the Windows Phone 7 chassis specification requires ARMv7-A, NEON is available by default on all WP7 devices. However, the CLR on Windows Phone 7 (NETCF) did not utilize this hardware functionality, and hence it was not available to managed application developers. We just announced at MIX11 that in the next Windows Phone release the NETCF runtime JIT will utilize the SIMD capabilities of the phones.
What it means to the developers
Certain operations on some XNA types will be accelerated using the NEON/SIMD extensions available on the phone. Examples include operations on Vector2, Vector3, Vector4, and Matrix from the Microsoft.Xna.Framework namespace. NOTE: At this point the exact types and the exact operations on them are not finalized and are subject to change. Do note that user types will not get this acceleration. E.g. if you have rolled your own vector type and use, say, dot operations on it, the CLR will not accelerate them. This is a targeted acceleration for some XNA types and not a vectorizing JIT compiler feature.
Apps that use these XNA types heavily (our research shows a lot of games do) will see a good performance gain. For example, we took this fluid simulation sample from a team (note that it was not written specifically for us to demo) and saw huge gains, because it relies heavily on Matrix and Vector operations to simulate fluid particles and the forces acting between them. Frame rates shot up from 18fps to 29fps, roughly a 60% improvement, on the very same device.
Depending on how much your app uses these types and operations, you’d see varying amounts of gain. However, this feature should be a good motivation to move to these XNA types.
How does SIMD work
SIMD, as the name suggests, applies the same operation to multiple pieces of data in parallel.
Consider the following Vector addition:
public static Vector2 Add(Vector2 value1, Vector2 value2) {
    Vector2 vector;
    vector.X = value1.X + value2.X;
    vector.Y = value1.Y + value2.Y;
    return vector;
}
If you look at the two assignment lines (the ones setting vector.X and vector.Y), they are essentially doing the same addition operation on two data sets and putting the results in two other locations. Today this is performed sequentially: the JITer emits processor instructions to load the X values into registers, add them, and store the result, and then does the same thing again for the Y values. However, this is inherently parallelizable, and ARM NEON provides an easy way to load such values (vpop, vldr) in single instructions and then use a single VADD NEON instruction to add the values in parallel.
One way to visualize this: a single instruction loads both X1 and Y1, another instruction loads both X2 and Y2, and a third adds the two pairs together in parallel.
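To make the difference concrete, here is a minimal C# sketch of the two approaches. It is only a model: Lanes2 and SimdSketch are hypothetical names invented for this illustration (they are not XNA or CLR types), and the C# itself still runs as ordinary scalar code. The point is just to show the shape of the work, two separate adds versus two loads followed by one lane-wise add.

```csharp
using System;

// Hypothetical stand-in for a 2-lane NEON register; not an XNA or CLR type.
public struct Lanes2
{
    public float Lane0;
    public float Lane1;

    public Lanes2(float lane0, float lane1)
    {
        Lane0 = lane0;
        Lane1 = lane1;
    }
}

public static class SimdSketch
{
    // Scalar path: X and Y are handled one after the other,
    // the way a non-SIMD JIT would emit the code.
    public static void AddScalar(float x1, float y1, float x2, float y2,
                                 out float x, out float y)
    {
        x = x1 + x2;
        y = y1 + y2;
    }

    // Models a single load instruction that fills both lanes at once.
    public static Lanes2 Load(float x, float y)
    {
        return new Lanes2(x, y);
    }

    // Models a single lane-wise add (in the spirit of VADD):
    // the same operation is applied to every lane.
    public static Lanes2 Add(Lanes2 a, Lanes2 b)
    {
        return new Lanes2(a.Lane0 + b.Lane0, a.Lane1 + b.Lane1);
    }
}
```

With this model, SimdSketch.Add(SimdSketch.Load(x1, y1), SimdSketch.Load(x2, y2)) mirrors the three-instruction sequence described above. Remember that on the phone only the supported XNA types get the real hardware acceleration; a user type like this one would not.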
For an easy sample of how that works, head over to http://www.arm.com/files/pdf/NEON_Support_in_the_ARM_Compiler.pdf
This is an announcement-only post. Do subscribe to this blog feed or follow http://twitter.com/abhinaba, as I’ll be making more detailed posts on how we are building the Generational GC and what developers need to know.
Today in the MIX11 keynote ScottGu announced something that I’ve been working on for some time. The next version of Windows Phone will have a Generational Garbage Collector (GenGC for short). A bunch of folks have worked really hard to get this piece into WP7 in a short time.
Today on Windows Phone 7 we have a stop-the-world, mark-sweep-compact, non-generational GC. When it runs, it pauses all execution and looks through every object in the application to find and eliminate all unused data. This manifests as longer app startup time and stutters during time-critical execution.
In Mango we are adding Generational GC to reduce collection latency and address both of these problems. Existing apps and games, even without any changes, can expect faster startup, faster level loads, and fewer gameplay stutters due to collection. Developers can specifically optimize for the new generational GC to completely remove the stutters during animations and gameplay that were caused by these GC pauses.
As an example, see how one of the existing games, butterfly, benefits from the GenGC. One of the phones below is running the GenGC and the other is not, and it should be obvious which is which based on which one starts up first. (Do note that both in the keynote and here we are showing startup gains because they are easier to show; gameplay stutters are hard to show in lower-resolution videos.) Also note that the gain is not just at the initial startup: every level loads a bit faster as well.
Direct link http://www.youtube.com/watch?v=FtusaSuFIpc
Please refer to my previous blog post http://blogs.msdn.com/b/abhinaba/archive/2009/03/02/back-to-basics-generational-garbage-collection.aspx for what a generational GC is and how it helps.
The new GenGC uses 2 generations and write barriers to track Gen1 to Gen0 references. This post is just to announce the feature. I will be making a series of posts to get into the gory details as we get closer to handing over the bits to our developers. I am sure developers will want to know the sizes of the various generations, when full vs. generational collections happen, and so much more. Do subscribe to my blog feed or follow my Twitter account http://twitter.com/abhinaba for announcements as I publish these posts.
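Until those detailed posts are out, here is a toy C# sketch of the general idea behind a write barrier that tracks Gen1 to Gen0 references. Everything in it is an assumption made for illustration: ToyObject, ToyHeap, WriteField and FindLiveGen0 are hypothetical names, and recording the referencing objects in a remembered set is just one common technique, not a description of how the phone's GC actually does it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy object; real runtime objects look nothing like this.
public class ToyObject
{
    public string Name;
    public int Generation;   // 0 = young (Gen0), 1 = old (Gen1)
    public ToyObject Field;  // a single reference field

    public ToyObject(string name, int generation)
    {
        Name = name;
        Generation = generation;
    }
}

public class ToyHeap
{
    // Remembered set (an assumption): Gen1 objects that were seen
    // storing a reference to a Gen0 object.
    public readonly HashSet<ToyObject> RememberedSet = new HashSet<ToyObject>();

    // Write barrier: every reference store goes through here so that
    // Gen1 -> Gen0 references get recorded as they are created.
    public void WriteField(ToyObject owner, ToyObject value)
    {
        owner.Field = value;
        if (owner.Generation == 1 && value != null && value.Generation == 0)
        {
            RememberedSet.Add(owner);
        }
    }

    // Gen0-only collection: trace from the roots and from the remembered
    // Gen1 objects, without scanning the rest of Gen1. Gen1 objects are
    // simply assumed to be alive for the duration of this collection.
    public HashSet<ToyObject> FindLiveGen0(IEnumerable<ToyObject> roots)
    {
        var live = new HashSet<ToyObject>();
        var pending = new Stack<ToyObject>();
        foreach (var root in roots) pending.Push(root);
        foreach (var old in RememberedSet) pending.Push(old.Field);

        while (pending.Count > 0)
        {
            var obj = pending.Pop();
            // Skip nulls, Gen1 objects, and Gen0 objects already visited.
            if (obj == null || obj.Generation != 0 || !live.Add(obj)) continue;
            pending.Push(obj.Field);
        }
        return live;
    }
}
```

In this model, a Gen0 object referenced only from a Gen1 object survives a Gen0 collection because the barrier recorded that Gen1 object. Without the barrier the collector would have to walk all of Gen1 to find such references, which is exactly the work a generational collection tries to avoid.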
One of the most requested features for the emulator was support for sensors. Developers were apparently asking their managers for travel budget to go to Hawaii/Europe so that they could test out their location-based apps :). Today at MIX11 we announced that GPS sensor support for the emulator will ship in the next version of the tools. Some of the features that will be present are
Here’s a short video which I quickly took from the laptop of our PM. Being away from family in Las Vegas was getting me down so he took me to Seattle in a second (or rather simulated it).
Direct link http://www.youtube.com/watch?v=wt0DdiYqdkk
The accelerometer is another newly supported sensor. So no more picking up the PC monitor running the emulator and rotating it to see if the emulator picks up the movement. There’s going to be a new tool window through which the developer can feed motion to the apps running in the emulator. Some pre-canned motions like shake will also be present. Here’s another video of that.
Direct link http://www.youtube.com/watch?v=Gc1kuXj7eCE