I originally started working on Code Canvas back in 2007, but it was essentially put on hold after 2009 because the technology to make it testable in the field was not available. For example, Visual Studio at the time did not support editing code in anything other than the standard file-based tabbed-editors. Also there was too much work required to integrate Code Canvas into the rest of the IDE, but this integration would have been necessary in order to test it with real developers on real code. Right about this time Brown University came out with their own canvas-based environment called Code Bubbles which we wanted to collaborate on, so we created a project called Debugger Canvas which combined aspects of Code Canvas and Code Bubbles together. Debugger Canvas was a success and is alive and well on Microsoft DevLabs with over 15,000 downloads so far.
Debugger Canvas is similar to Code Canvas but is really only a small piece of the overall vision of Code Canvas. Code Canvas aims to let the developer view and interact with their entire software system as a whole, much like a code map or schematic, whereas Debugger Canvas only lasts for one debugging session and is only meant to make that debugging session easier to understand. You can read more on my personal thoughts and comparisons of Code Canvas vs. Code Bubbles vs. Debugger Canvas in my previous blog post.
The first major Code Canvas prototype was a standalone Windows application based on WPF and was shown at MSR TechFest in 2009. The demo consisted of several features that worked together to convey what code on a canvas might look like and what you could do with it.
At the heart of Code Canvas is the ability to view all of a system’s source code, designers, and artifacts at the same time next to each other on a single zoomable canvas. The prototype demoed at TechFest did not have the ability to show visual designers yet, but it did show what it would be like to have multiple C# code files on a canvas next to each other at the same time. The figure on the right shows the entire source code for the old CTP of the WPF Ribbon Preview from www.codeplex.com.
By itself this is not so impressive (even though it was incredibly difficult to get it to scale well), but this was the foundation for all of the features that came next.
Since most codebases today are organized based on files and directories, Code Canvas of course has the ability to display them as such. But one of the earliest design goals of Code Canvas was to have the ability to view your code based on the structure of the code instead of its organization on disk. The first prototype had the ability to make the code on the canvas look like Visual Studio’s Class Diagram instead, by drawing borders and backgrounds behind type definitions using the same colors and shapes as the built-in Class Diagram view. You’ll need to look at the figures in high detail to see the difference when zoomed out, but viewing the code based on its types instead of its files makes it easy to see how many types there are and how large they are relative to each other, and it even makes nested types much more obvious than they would be when viewing the code in traditional text editors.
You can also turn on all of these views at once. In that case, you will see the files surrounded by the silver file background with the class diagram graphics layered on top and the directory/group background colors underneath. In the case of C# files that consist mainly of one gigantic type definition, the only place you’d really notice the silver file background would be behind the using statements and the namespace declaration.
Code Canvas takes the approach of zooming out instead of scrolling when you run out of space on your screen (although you could certainly pan everywhere instead of zooming if you liked). Unfortunately as you’re zooming out then everything quickly becomes too small to read. But that’s ok, since when you’re zooming out it’s reasonable to assume you want a “higher level” view of your system anyway, for example the view offered by Visual Studio’s Class Diagram which only shows you boxes and names of types and members without showing you any of the actual code. We could have abruptly switched from one view to the other at some predefined level of zoom, but that would be jarring to the user and may even make it feel like two disconnected views.
Instead, Code Canvas slowly introduces floating labels with icons as soon as the zoom level is less than 99%, and the labels line up exactly with the text identifiers defining the types and members. As the user zooms out, the floating labels and icons stay the same size even though everything else becomes smaller and smaller. The floating icons and labels are anchored to the defining text in the code (which gets closer together as the user zooms out) but since the icons and labels don’t get smaller, they eventually start to collide. When this happens, they push each other out of the way but maintain their relative order so that they begin to stack vertically like a list. The labels are constrained to the boundaries of their containing type, so eventually they also run out of room horizontally and vertically, at which point they either clip or “squish” together, with the private members squishing out of existence first, then the protected members, then the public members, etc.
Since the floating labels are opaque, they eventually cover up all of the teeny tiny code behind them and all the user is left with is a bunch of the icons and labels stacked on top of each other which looks pretty much like the view in the Class Diagram does. This all happens fluidly and dynamically in real-time as the zoom scale increases and decreases so that it truly feels like the same content as the user zooms in and out.
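To make the stacking behavior a little more concrete, here is a minimal Python sketch of the vertical part of it. This is not the prototype's code (the prototype was WPF-based), and everything in it is an assumption made for illustration: the `Label` type, the `stack_labels` function, the idea of measuring anchors from the top of the containing type, and the simplification that members are dropped outright rather than gradually squished.

```python
from dataclasses import dataclass

# Lower rank is squeezed out first: private, then protected, then public.
DROP_RANK = {"private": 0, "protected": 1, "public": 2}


@dataclass
class Label:
    name: str
    anchor_y: float   # y of the defining identifier at 100% zoom, from the top of its type
    visibility: str   # "private", "protected" or "public"


def stack_labels(labels, zoom, label_height, type_height):
    """Place fixed-height labels inside a type box that shrinks with zoom."""
    available = type_height * zoom
    kept = sorted(labels, key=lambda label: label.anchor_y)

    # Squeeze out the least visible members until the rest can fit.
    while kept and len(kept) * label_height > available:
        kept.remove(min(kept, key=lambda label: DROP_RANK[label.visibility]))

    # Top-down pass: start at the anchor, but never overlap the label above.
    ys = []
    next_free = 0.0
    for label in kept:
        y = max(label.anchor_y * zoom, next_free)
        ys.append(y)
        next_free = y + label_height

    # Bottom-up pass: keep the stack inside the bottom edge of the type.
    limit = available
    for i in range(len(ys) - 1, -1, -1):
        ys[i] = min(ys[i], limit - label_height)
        limit = ys[i]

    return [(label.name, y) for label, y in zip(kept, ys)]
```

The labels keep their relative order because they are sorted by anchor once and only ever pushed down or pulled up as a stack; the anchors move with the zoom while `label_height` stays constant, which is what eventually forces the collisions.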
While it’s possible to drag everything around by hand on the canvas just like you would in a diagramming program, there are times when you would rather automatically arrange the code based on some rule or structure (such as class inheritance for example). Code Canvas used Microsoft’s Automatic Graph Layout libraries to accomplish this. Unfortunately the built-in algorithms were tailored towards large node-link diagrams where nodes would align their midpoints (instead of aligning their top or bottom edges with each other). When moving nodes around they would also attempt to “gravitate” towards one another to continually optimize for the shortest paths and minimal edge crossings. Code Canvas also has similar desires (short edge lengths, minimal edge crossings), but after playing with the layout it felt very odd to have the code fragments “floating” around. A much more grid-based (or line/column-based) layout algorithm is needed in order to line up lines of code and left-side margins since we are usually dealing with fixed-width text and not the typical nodes or “bubbles” that are common in graph-based layout algorithms today.
The original Code Canvas prototypes grouped all code files by their directory by default. The groups were obviously hierarchical and often deeply nested, resulting in lots and lots of yellow folders surrounding each other. Usually this was somewhat redundant since if you had a file named C:\Users\MyUser\Documents\Projects\Foo\Bar.cs then we’d show Bar.cs surrounded by a yellow folder named “Foo”, which was surrounded by another yellow folder named “Projects”, surrounded by another folder named “Documents”, etc. We eventually added the feature to automatically collapse this situation into a single yellow folder named “C:\Users\MyUser\Documents\Projects\Foo”.
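The collapsing rule is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not the prototype's implementation; the `Folder` type and the `collapse_chains` function are hypothetical names, and the rule assumed here is "a folder with no files and exactly one subfolder is merged with that subfolder".

```python
from dataclasses import dataclass, field


@dataclass
class Folder:
    name: str
    files: list = field(default_factory=list)
    subfolders: list = field(default_factory=list)


def collapse_chains(folder, sep="\\"):
    """Merge every folder that holds nothing but a single subfolder into that subfolder."""
    children = [collapse_chains(child, sep) for child in folder.subfolders]
    if not folder.files and len(children) == 1:
        only = children[0]
        return Folder(folder.name + sep + only.name, only.files, only.subfolders)
    return Folder(folder.name, folder.files, children)


# Build C: > Users > MyUser > Documents > Projects > Foo > Bar.cs
chain = Folder("Foo", files=["Bar.cs"])
for name in ["Projects", "Documents", "MyUser", "Users", "C:"]:
    chain = Folder(name, subfolders=[chain])

print(collapse_chains(chain).name)
```

A folder that contains files of its own, or more than one subfolder, is left alone, so only the redundant "wrapper" folders disappear.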
You could also add your own groups and name them whatever you’d like. The idea behind this was more than simply a means of controlling layout; it was also meant as a way to provide architectural context on the canvas. For example, a common use case may be to have a group for “Model” code, “View” code, “ViewModel” code, etc. The layout engine would make sure that items in one group would not be allowed in another, and that groups could not overlap unless they were strictly nested. We did not have the ability at the time to have non-hierarchical grouping (e.g. Venn or Euler diagrams) but this is something that Code Canvas definitely needs in order to take full advantage of this idea.
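The "no overlap unless nested" rule can be written down as a small check. The Python below is a sketch under assumptions, not what the layout engine actually did: groups are treated as axis-aligned rectangles, and `Rect`, `contains`, `disjoint`, and `overlapping_pairs` are names invented for this example.

```python
from itertools import combinations
from typing import NamedTuple


class Rect(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def contains(outer, inner):
    """True if inner lies entirely within outer."""
    return (outer.left <= inner.left and outer.top <= inner.top
            and outer.right >= inner.right and outer.bottom >= inner.bottom)


def disjoint(a, b):
    """True if the two rectangles share no interior area."""
    return (a.right <= b.left or b.right <= a.left
            or a.bottom <= b.top or b.bottom <= a.top)


def overlapping_pairs(groups):
    """Return the name pairs that overlap without one being nested in the other."""
    return [(n1, n2)
            for (n1, r1), (n2, r2) in combinations(groups.items(), 2)
            if not (disjoint(r1, r2) or contains(r1, r2) or contains(r2, r1))]


groups = {
    "Model": Rect(0, 0, 100, 100),
    "View": Rect(120, 0, 220, 100),
    "Validation": Rect(10, 10, 50, 50),    # nested inside Model, which is allowed
    "ViewModel": Rect(90, 0, 130, 100),    # straddles Model and View, which is not
}
print(overlapping_pairs(groups))
```

A layout engine would enforce this as a constraint while laying out rather than checking it afterwards, but the rule being enforced is the same.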
The groups could also be turned on or off individually (which simply toggled the visibility of the background, label, and graphics) but the layout engine would always consider them when performing layout even if the graphics were not visible. This made it so toggling the visibility would not cause the layout to change, which was a good thing. Code Canvas has the ability to show many different aspects of the code and it was a design goal that changing the display of the code would not change the layout of the code. After all, we are attempting to leverage spatial memory to its fullest extent, and having everything jump around by simply changing the visibility of the graphics would be a bad thing.
In addition to adding context to the canvas via grouping and named labels, we also had support for adding other basic annotations such as sticky notes and pushpins. Sticky notes were rather rudimentary; you could type text in them and point arrows from a sticky note to the items it was referring to, but that was about it. The pushpins were simple but they were very useful as landmarks since they did not scale linearly with the rest of the canvas. They used what is known as power-law scaling which essentially makes them shrink slower than the rest of the canvas so that you can always see them no matter how far you zoom out.
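As a rough illustration of power-law scaling, the on-screen size can be computed as the base size times the zoom factor raised to an exponent between 0 and 1. The function name and the exponent of 0.5 below are assumptions chosen for the example, not values taken from the prototype.

```python
def pushpin_screen_size(base_size, zoom, exponent=0.5):
    """On-screen size of a landmark that shrinks more slowly than the canvas.

    Ordinary canvas content scales as base_size * zoom; an exponent below 1
    makes the pushpin lag behind when zooming out.
    """
    if zoom <= 0:
        raise ValueError("zoom must be positive")
    return base_size * zoom ** exponent


for zoom in (1.0, 0.25, 0.01):
    linear = 32 * zoom
    pin = pushpin_screen_size(32, zoom)
    print(f"zoom={zoom}: content={linear:.2f}px, pushpin={pin:.2f}px")
```

With an exponent of 0.5, zooming out to 25% shrinks ordinary content to a quarter of its size but the pushpin only to half.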
Most IDEs (or their add-on packages) have the ability to highlight code in the editor to display some kind of information (e.g. code coverage, test metrics, etc.). Code Canvas also has this ability and it makes it even more useful by allowing you to zoom out and see the highlights across your entire codebase which can help you visually detect “problem areas” or “hot spots” if your code is arranged in a meaningful way. Even when zoomed in you are still immersed in the ambient information with the highlighting all around you and the surrounding code. Whether the ambient information represents code coverage, bug data, team activity/ownership, or anything else, you can be constantly surrounded by the information in real-time without having to switch tool windows or run separate tools. This can act as an “augmented reality” for your code and help enable early problem detection and smarter software development.
Code Canvas has the ability to overlay graphics other than simple highlighting. One of the prototypes included Control Structure Diagrams drawn next to the code, and another integrated with Pex to show all possible execution paths through the code, with the ability to show or hide individual paths. Code Canvas also draws a link from an identifier to its definition whenever you click a floating icon or label, or when the text cursor is on an identifier token. When you select a range of text, you can see edges for all of the selected identifiers pointing to the things they reference.
When you are zoomed in far enough to select text then it’s likely that whatever that text references is going to be off the screen (unless you have an extremely small project or an extremely large monitor). In this case Code Canvas shows a little preview on the edge of the screen (kind of like a floating rectangular magnifying glass) that gives you a preview of what the other end of the arrow points to. Clicking on the preview “glass” automatically zooms you out from where you are and flies you to the area being shown in the preview at the same level of zoom. This of course means the source of the arrow is now off the screen in the other direction, so you would then be left with a new preview glass that shows you a preview of where you just were. You can quickly and easily navigate back and forth between two ends of an edge this way.
Another feature of Code Canvas is the ability to search and see results overlaid on top of the canvas. This can provide more meaning than a traditional output window or list view, since it’s easy to see clusters of results and patterns of frequency in different areas of the codebase. This is another case where it’s important to have a meaningful layout of the code rather than a temporary or automatically-generated layout, since if the locations of code don’t mean anything then the clusters and patterns seen will also not mean anything. If, on the other hand, you organize your canvas into an architecturally meaningful design (for example, UI code in one section, business logic in another, data access in a third, etc.) then it will mean something when a majority of your search results appear in a certain area of the canvas instead of another. Code Canvas even takes advantage of this spatial meaning when zoomed in by keeping off-screen search results snapped to the edge of the screen in the direction of the result. This uses the same algorithm as the off-screen reference preview window, but instead of showing a preview window of the result it simply shows the search result in a semi-transparent way. The transparency not only distinguishes it from an actual on-screen result, but it also allows multiple groups of results to “build-up” into a more solid, thicker blob indicating a high frequency of results in that direction.
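One plausible way to snap an off-screen item to the edge of the screen "in the direction of the result" is to draw a line from the center of the viewport to the item and stop where that line leaves the viewport. The sketch below assumes that interpretation; the function name and the tuple-based coordinates are made up for illustration and are not the prototype's API.

```python
import math


def snap_to_viewport_edge(target, viewport):
    """Return where to draw `target` (x, y) for a viewport (left, top, right, bottom).

    On-screen targets are returned unchanged. Off-screen targets are moved along
    the line from the viewport center toward the target until they touch the edge.
    """
    left, top, right, bottom = viewport
    cx, cy = (left + right) / 2, (top + bottom) / 2
    half_w, half_h = (right - left) / 2, (bottom - top) / 2
    dx, dy = target[0] - cx, target[1] - cy

    if abs(dx) <= half_w and abs(dy) <= half_h:
        return target

    # Shrink the center-to-target vector just enough to fit inside the viewport.
    scale = min(half_w / abs(dx) if dx else math.inf,
                half_h / abs(dy) if dy else math.inf)
    return (cx + dx * scale, cy + dy * scale)


viewport = (0, 0, 100, 100)
print(snap_to_viewport_edge((250, 50), viewport))   # far to the right
print(snap_to_viewport_edge((150, 150), viewport))  # diagonal, lands in the corner
```

Results that lie in roughly the same direction end up near the same spot on the edge, which is what lets semi-transparent markers build up into a denser blob.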
One of the most well-received demos at TechFest was the ability to play back a recorded trace of a program’s execution and watch as the flow of control moved throughout the code on the canvas. You can't tell from the picture, but the dashed lines actually animate along the execution path and you can control them with the slider(s) on top. In the case of a multi-threaded program you will have multiple sliders (one per thread) which are linked by default so that when you drag one from left to right then the other will move automatically to match the execution as it executed in parallel. When the sliders are linked this way then you are essentially dragging the slider “through time” and it’s easy to see how thread execution interleaves and interacts. This is just the tip of the iceberg in this area. We haven't even begun visualizing the data state that's associated with each thread, or what it looks like when there is a preemption in a thread, or how to tell what the "current" thread is (or current "threads" when you're running multicore), or lots of other things.
The original Code Canvas prototypes made for excellent demos and received internal and external acclaim, but several technological factors prevented us from being able to actually deploy and test it “in the field” on large codebases. These are some of the things that we really need(ed) in order to continue the Code Canvas research and make it into something that can be tested in the field and used on a regular basis.
In order to be a fully immersive experience, Code Canvas needs to include visual designers such as WinForms, WPF, Silverlight, ASP.NET/HTML, etc. Not only is this necessary to keep the developer immersed within their spatial context, but also because a primary benefit of Code Canvas is the ability to see arrows pointing from tokens to whatever they refer to. When tokens refer to visual UI elements with no “real” code associated with them (other than auto-generated code by the designer) then we need to be able to display the UI element on the canvas to give the arrow somewhere to go. One example is the relationship between a button on a form and its code-behind “click” handler. Another example is when a piece of code references a textbox there should be a link pointing from the variable in the code to the textbox on the form.
Our first attempt at integrating Code Canvas with Visual Studio gave us the ability to display WinForms-based designers on the same canvas as code editors, but the API did not give us access to the sub-elements on the form. This prevented us from drawing relationship/reference arrows to elements on the forms or even highlighting code fragments when their corresponding elements were selected. This somewhat nullified the effectiveness of displaying the two side-by-side and we never had the chance to research it accordingly.
The crux of Code Canvas that differentiates it from other visualizations or diagramming tools is that it must support editing on the canvas so that the developer is constantly surrounded by the context of the surrounding code, the code’s relationships, and other ambient information about the software. Unfortunately, the original prototypes of Code Canvas did not allow editing at all because it was a standalone application and no code editors outside of Visual Studio were available at the time. When Visual Studio 2010 came out we attempted to integrate Code Canvas with it but editing was slow and several issues with the Visual Studio SDK prevented us from providing a desirable enough experience to confidently deploy internally or test in the lab.
We also need to explore the possible ways of dealing with elided, occluded, or hidden code. For example, code surrounded by #region can be “collapsed” in the traditional editor, but what does this mean on the canvas? One of the great benefits of the canvas is tractability, meaning you can simply zoom out to get a visual representation of “how much code exists in this project” and be assured that “if I read through all of this code laid out before me then there will be nothing left that I don’t know about”. If we give the ability to collapse or hide code then that confidence can no longer be trusted and new developers will never be sure that they won’t run into some tiny piece of collapsed code that explodes into another huge area to learn.
Even if you ignore the issue of tractability (which you shouldn’t), what should expanding/collapsing the code do to the layout of the code that surrounds it? By clicking the innocent-looking +/- box next to the code you have the potential of pushing all sorts of adjacent code out of the way since we don’t allow overlaps. This may be better suited to the “filtered views” described below.
Real-time layout and edge routing is critical for Code Canvas, and we are fortunate to have Microsoft Automatic Graph Layout (MSAGL) to help with this. MSAGL had just added support for Fast Incremental Layout when we were creating the first Code Canvas prototypes, and performance was good enough to give demos, but it was too slow to support real-sized code projects and several features were missing that Code Canvas needed. For example it lacked support for concave group/node geometry, non-hierarchical overlapping group layout, snap-based/grid-based node layout, rectilinear edge routing, edge labels, etc. One thing that it did have is the ability to specify “constraints” on the layout, for example you could create a rule that super-classes must be above sub-classes by saying that all inheritance edges must point downwards (or upwards, depending on if you’re coming from UML or not). You could also specify alignment constraints, for example you could force the tops, middles, or bottoms of code fragments to be aligned, and when you dragged one fragment around then the other fragments would follow to ensure the constraints were not broken. Unfortunately we didn’t get a chance to test this in the field or explore the best way to expose this functionality to the user.
The original prototype briefly touched on the ability to add annotations and landmarks, but it was never really studied and it never had the ability to “draw” next to or around code. It seems intuitive that any canvas-based environment should have the ability to draw a border around a bunch of items to form a group, or that you should be able to scribble a comment and draw an arrow to the thing you’re talking about. But what does it mean when we zoom in and out? In traditional ZUIs (ranging all the way from modern-day Prezi back to the original Pad++), you wind up scaling the annotation equally with the rest of the content and you’re left with either gigantic pen strokes taking up your entire screen or teeny tiny pixels that you might not even know are there. Denim improves on this with its semantic zoom but will it scale to a real system or work well with text-based code fragments? It’s also unclear where to draw the line between annotating directly on the canvas vs. simply linking to content in a new window. For example, text and illustrative drawings probably belong directly on the canvas but an embedded video might be better off opening in a separate window.
The buzz around Code Bubbles and the success of Debugger Canvas makes it clear that a whole-system view where everything is on the canvas at once is not always what you want. When you have a very focused task to do it may be desirable to filter the canvas to only those portions relevant to your current task. But how can we do this in a way that maximizes reuse of your already-built spatial memory, and keeps you oriented in the context of your system even though the rest of the system is no longer visible? We experimented a bit with what we called “filtered canvases” when we attempted to integrate Code Canvas with Visual Studio, but we didn’t have enough technology available to actually test what we wanted in the field.
Video games that involve panning have had minimaps since the dawn of time, but it’s unclear whether they are needed when zooming is supported too. Minimaps also have the choice of being a “dumbed-down” representation of the actual surface (e.g. representing each item as a single pixel or icon in the minimap), or being a realistic scaled-down version of the surface instead. When semantic zoom is involved then a minimap might actually be both at once, since the semantically zoomed-out version of an item might actually be a single icon. In other words, the minimap might simply show exactly what the real surface would show if the user zoomed all the way out. In this case, the minimap is just an extra view that is smaller and docked in a corner of the window, with the addition of a little rectangle that represents the viewport on the actual surface. There should be nothing stopping the extra view from being hosted outside the original window, perhaps on a secondary monitor, and if it were maximized then you would essentially have two large, detailed views on the same surface at once. Each one could contain “little rectangles” that represent where the other view(s) are looking, and thus the original “minimap” is just one particular configuration of the capability of having more than one view in general.
Being able to use spatial memory to navigate code makes it much harder to “get lost in your code”, but it’s not impossible. Especially if you’re frequently using the off-screen preview windows or clicking on search results to “fly” across the canvas, you will most certainly want a way to go back to where you just were (just like in a web browser). I have a 5-button mouse and it’s extremely convenient to use the extra buttons to navigate backwards and forwards. The only question is how fine-grained the buttons should be. For example, if I’ve used the arrow keys to move the cursor up and down the code, the forward- and back-buttons should probably not move me back and forward one line at a time. But if I used PageUp and PageDown instead of the arrow keys then perhaps they should. If I double-clicked on a search result or preview window then I should definitely be able to click the back-button and fly back to where I came from, but if I got there manually by zooming and panning the canvas by hand then it’s not so clear.
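One possible answer to the granularity question is a browser-style history that only records "jumps" (flying to a search result or preview target) and ignores small local movements. The sketch below shows that policy only as an example; which actions should count as jumps is exactly the open question, and the `NavigationHistory` class and its method names are hypothetical.

```python
class NavigationHistory:
    """Browser-style back/forward stacks for canvas locations."""

    def __init__(self, start):
        self.current = start
        self._back = []
        self._forward = []

    def move_to(self, location):
        """Small local movement (e.g. arrow keys): update position, record nothing."""
        self.current = location

    def jump_to(self, location):
        """Deliberate jump (e.g. clicking a search result): remember where we were."""
        self._back.append(self.current)
        self._forward.clear()
        self.current = location

    def back(self):
        if self._back:
            self._forward.append(self.current)
            self.current = self._back.pop()
        return self.current

    def forward(self):
        if self._forward:
            self._back.append(self.current)
            self.current = self._forward.pop()
        return self.current


history = NavigationHistory("Foo.cs:10")
history.move_to("Foo.cs:14")   # cursor keys, not recorded
history.jump_to("Bar.cs:200")  # flew to a search result
print(history.back())          # returns to the last position before the jump
```

Under this policy the back button returns to wherever the cursor was just before the jump, not to every line it passed through on the way.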
We’ve also done some work experimenting with the best way to use the keyboard to navigate from one code fragment (or item) to another, but it’s still confusing to use Up/Down/Left/Right to navigate essentially a graph where items are not necessarily aligned in a grid-like fashion. We know that most programmers find it too time consuming to lift their hands off the keyboard in order to use the mouse, so being able to navigate throughout the canvas using the keyboard is important. (No, we don’t suggest that software development should be done using an Xbox controller.)
Code Canvas has a ton of information layered on top of a huge (infinite) zoomable surface. Making it all user-friendly, visually appealing, and non-overwhelming is a constant exercise in user experience design. There are several additional techniques that we would like to implement and test, for example:
Even if the original Code Canvas prototypes were fully functional and scaled to real code, we never determined the best way to “get started” using Code Canvas. Our original target audience – internal Microsoft developers – has large amounts of pre-existing and legacy code. In the standalone prototype, the workflow was to click File -> Import and point it to a .sln or .csproj file and it would do a one-time import of the code. The user would then have to wait for the import to finish which could take a very long time on nontrivial projects.
It’s also unclear whether it is best to start out by “dumping the entire codebase onto the canvas” instead of allowing (forcing?) the developer to selectively choose which pieces to incrementally add to the canvas by hand. If we dump the entire codebase onto the canvas, how should it be laid out? Preliminary work was done to develop a fixed-size leaf node tree map algorithm but it still needs work to support predefined group hierarchies.
In addition to the features already prototyped and demonstrated, there are still several new areas and aspects we’d like to research and explore. Here are just a few.
Software inevitably makes use of external libraries and is typically structured into “layers” of dependencies either intra-project or inter-project. Typically the references between layers are unidirectional but widespread. For example, consider the .NET framework. All .NET programs have a reference to the .NET framework, and it may be useful to visualize which areas of code reference which areas of the .NET framework (e.g. perhaps only a certain area of user code should access the System.IO or System.Net namespaces in order to maintain a sandbox level of security). It may be visually meaningful to organize one’s user code on the canvas to roughly match the areas of functionality in the framework. For example, consider the following sketch showing an ASP.NET web app that generates visualizations from files on disk. The file access code is on the far left of the canvas, the HTML web page code on the far right of the canvas, and the image loading and visualization generation code is in the middle.
In this case there are only two “layers” of dependency: 1) the application code seen on top, and 2) the framework code shown below. .NET programs can become more complex with multiple layers of dependency (for example if this program used the Entity Framework, which ships separately and on top of the .NET Framework, then it would sit “between” the user code layer and the .NET Framework code layer). Programs can also contain internal layers or aspects which may be ideal to visualize in this pseudo-3D kind of way. For example a program may contain logging code that is accessed throughout the entire codebase, and rather than showing many arrows overlapping on the same plane as the rest of the user code, it may be desirable to show the logging code “underneath” the rest of the user code instead.
In the case of the .NET Framework (and in fact also with the Entity Framework and most other frameworks by Microsoft), the libraries are read-only monolithic packages that can essentially be treated as black boxes. They are of high quality and maintained by Microsoft, so in the example above there is not much need to zoom into the .NET framework to view its source code (unless debugging of course), and one might question the value of being able to do so in Code Canvas. But when dealing with open-source and highly segmented but interconnected libraries (jQuery.js, Knockout.js, Hammer.js, Sigma.js, Smoke.js, Envision.js, to name a few) the ability to drill into and even modify the libraries in-place becomes more and more important as the web of dependencies in the software community becomes more and more complex.
In traditional corporate-based software development where all team members work in the same hallway it’s fairly easy to know who’s working on what and who’s responsible for a particular piece of code or functionality. But what can we do in the world of open-source development and crowdsourcing? Not only do we want to explore how we can use the canvas to help in this area (such as color coding areas of the code based on ownership, or drawing “fences” around “territory”, or showing thumbnail portraits of other developers on the edge of the screen when they are actively editing code nearby), but what should be done about the ownership of the canvas itself? Who decides how the software should be laid out and what the architectural boundaries should be? Should it be possible for every developer to be moving code around and re-arranging it at once? Certainly this would be very confusing to one developer if someone else moves his code “out from under him” while he’s working on it! Since the relative positioning of code and artifacts has meaning in the overall system, perhaps layout changes should not be made without consensus. But even if a single benevolent dictator owns the layout of the canvas and it’s read-only to everyone else, it should still be possible for individual contributors to add annotations and illustrative drawings to the canvas, right? We know from experience that meaningful layout of diagrams and code is useful and is usually shared across a team, but we haven’t had the opportunity to study what the best method of sharing is in this type of interactive real-time immersive environment.
The 2D canvas works well for displaying static code since there is only one “copy” of the code to display, but how can we show several “instances” of the code at runtime while still maintaining the spatial context and layout of the canvas? Should we show several instances of the code itself (which would be very straightforward and easy to understand, but wastes a lot of screen real-estate), or just show the data values contained by each instance instead? Should 3D be used here, and if so then how would it coincide with other uses of 3D such as the layered view shown above? It seems obvious that the spatial layout of the code on the canvas should be re-used and preserved as much as possible in order to take advantage of spatial memory when displaying runtime heap data and object graphs, but the exact visualization of the scenario is a research area that needs much more exploration.
In the ever-changing world of open-source it becomes more important than ever to track and understand code changes and see how a code base is changing over time. In traditional bento box IDEs this is accomplished by comparing one file at a time and typically opening a new window that shows the two files side-by-side or interleaved line-by-line. This may not be the best approach when using the canvas, since 1) showing a code fragment to the side of an existing one would end up physically colliding or overlapping any source code adjacent to it, and 2) it should also be useful to see the differences in the relationships between code and how the layout of the code on the canvas changes over time as well. This is an area that needs to be explored further so that the user does not need to escape from the canvas in order to view diffs and code changes using external programs.
Wow, how did I miss this before now? Very cool.
Integrate it with Kinect for the full Minority Report coding experience... 8o)
Can we also have this for C++?
This is so cool. I really like to put my hands on it to try out.
I, daniel, second Daniel's first comment. Secondly, I too am daniel.
Based on how fun Google Maps 8-bit was, I'd love to see something that generated tile data for code projects so it mapped solutions onto an atlas. Continent per project, maybe, with national/state boundary overlays for classes/methods/etc. :). It'd be interesting just to leverage the work the map teams (Bing, Google, whoever) have already done in terms of zooming/granularity (BirdsEye, StreetView, etc).
Of course, then the v2 switches over to a game engine like Civilization or The Sims / Sim City so then you can map bugs to disasters/crime/etc you need to fix and features to things like building a space station. :)