Visual Studio 2012 New Features: Code Clone Analysis


As I travel across the country with my fellow Evangelist, Clint Edmonson, talking about Visual Studio, we often come across great stories to tell. One of our favorite true stories is of a customer whose web application was running very slowly. We ran code metrics against it and, sure enough, the Page_Load event handler had 9,000 lines of code in it.

 

Naturally we were curious, so we opened it up and saw that it was basically the same if statement copied over and over. Apparently they needed to find out who was coming to the website in order to show customized content, and the solution they came up with was this massive set of statements.

 

For better or worse we have all had code that gets copied throughout our solutions. Until now there was no tool to tell us there were copies and, instead, we had to rely on other metrics to hopefully reveal any code smells that lead us to duplicates. Now, however, we have the new Code Clone Detection (aka Code Clone Analysis) feature.

 

According to the documentation:

“Code clones are separate fragments of code that are very similar. They are a common phenomenon in an application that has been under development for some time. Clones make it hard to change your application because you have to find and update more than one fragment. Visual Studio can help you find code clones so that you can refactor them.”

http://msdn.microsoft.com/en-us/library/hh205279%28v=vs.110%29.aspx

 

 

Specific Clones

You can find clones of specific code by selecting the segment you are interested in, then right-clicking the selection and choosing Find Matching Clones in Solution from the context menu:

image

 

Visual Studio will search for code clones and produce the result in the new Code Clone Search Results window:

image

 

The original line of code is put in a group on its own and then all the matches are put into a different group. You can expand the groups to see the specific locations of the matches:

image

 

Also, you can double-click any entry in the list to go to the matching selection in your code file:

image

 

 

Solution Clones

Besides looking for specific clones, you can also look for code clones across the entire solution. To use this feature, go to Analyze | Analyze Solution for Code Clones:

image

 

This creates a result set for the entire solution:

image

 

By default, it groups and sorts the results by the strength of the match. Exact matches come first, followed by matches that are close but not exact, and so on. The terms you may see are Exact, Strong, Medium, and Weak.

 

 

Reviewing Matches

Once you have the result set, there are a couple of ways you can compare them against each other.

 

Comparison Tools

If you have a comparison tool configured, you can right-click any item and select Compare from the shortcut menu:

image

 

To check whether you have this feature available, go to Tools | Options | Source Control | Team Foundation Server and click Configure User Tools.

 

 

Manual Comparison

If you don’t have a comparison tool, you can manually compare two entries in the list. If the clones are in different files, you can just double-click each entry; as mentioned earlier, this opens the file and highlights the duplicated fragment.

 
 

What Is Found

You are probably curious as to what is found by this tool. The heuristics for finding clones will find duplicates even if the following changes have happened:

· Renamed identifiers

· Inserted and deleted statements

· Rearranged statements

 

 

What Is Not Found

There are some rules for what is not found as well. I have taken this list from the documentation pretty much verbatim.

· Type declarations are not compared. For example, if you have two classes with very similar sets of field declarations, they will not be reported as clones. Only statements in methods and property definitions are compared.

· Analyze Solution for Code Clones will not find clones that are less than 10 statements long. However, you can apply Find matching clones in solution to shorter fragments.

· Fragments with more than 40% changed tokens are not reported.

· If a project contains a .codeclonesettings file, code elements that are defined in that project will not be searched if they are named in the Exclusions section of the .codeclonesettings file.

· Some kinds of generated code are excluded:

    · *.designer.cs, *.designer.vb

    · InitializeComponent methods

 

 

Code Clone Settings and Exclusions

A settings file is available to configure this feature at the project level. Currently we have only announced the ability to define exclusions in the file, but other elements will most likely be added later on. The file is just XML with a .codeclonesettings extension. The only requirement for use is that the file exists in the top-level directory of the project.

 

The base elements consist of a CodeCloneSettings element with an Exclusions child:

image
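
Since the screenshot is not reproduced here, the skeleton described above can be sketched roughly as follows. The element names come from the text; the comment is only a placeholder.

```xml
<CodeCloneSettings>
  <Exclusions>
    <!-- <File>, <Namespace>, <Type>, and <FunctionName> entries go here -->
  </Exclusions>
</CodeCloneSettings>
```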

 

Within the Exclusions element you can have the following children:

<File>

This element is used to indicate files that should be excluded from analysis. Path names can be absolute or relative and you can use wildcards as well. So, for example, to ignore all the C# text template files that have been put in their own directory (called MyTextTemplates) you might have the following:

image
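
As a rough sketch of that case, the entry below uses a relative path with a wildcard. It assumes the MyTextTemplates directory sits directly under the project folder and that the files to skip have a .cs extension; adjust the pattern if your layout differs.

```xml
<CodeCloneSettings>
  <Exclusions>
    <!-- Assumed layout: skip every C# file in the MyTextTemplates directory -->
    <File>MyTextTemplates\*.cs</File>
  </Exclusions>
</CodeCloneSettings>
```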

 

<Namespace>, <Type>, and <FunctionName>

You can also exclude namespaces, types, and functions. Just like files these items can use absolute names or names with wildcards in them. Here is an example of what it might look like:

image
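
A sketch of such entries follows. Every namespace, type, and method name in it is hypothetical and only illustrates the two styles, fully qualified names and names with wildcards.

```xml
<CodeCloneSettings>
  <Exclusions>
    <!-- Fully qualified names (all names here are hypothetical) -->
    <Namespace>MyCompany.MyProject</Namespace>
    <Type>MyCompany.MyProject.MyClass1</Type>
    <FunctionName>MyCompany.MyProject.MyClass2.MyMethod</FunctionName>
    <!-- Names containing wildcards -->
    <Namespace>*.AnotherProject</Namespace>
    <Type>*.AnotherClass*</Type>
    <FunctionName>MyProject.*.AnotherMethod</FunctionName>
  </Exclusions>
</CodeCloneSettings>
```

File, namespace, type, and function exclusions can be mixed freely inside the same Exclusions element.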

 

 

Exclusion File Example

In the Tailspin Toys sample there is some generated code in the TailSpin.SimpleSqlRepository project that is the bulk of the duplications:

image

 

When I run code clone analysis, this is the result:

image

 

Code clone analysis doesn’t automatically know to ignore text templates so I would create an XML file called TailSpinRepository.codeclonesettings and insert an entry like this:

image

 

Now if I run clone analysis again here is what I get:

image

 

As you can see, there are significantly fewer results than the first time the analysis ran. It’s common to create several exclusions in different projects to weed out noise in the analysis results.

 

 

Finally

Code Clone Detection is a great new tool to add to your arsenal for improving code quality. Combined with Code Analysis and Code Metrics, this will help quickly find potential issues.

  • Unfortunately, for some reason, when i do this, it's not ignoring *.designer.cs files automatically.  I am finding I have to add a codeclonesettings file and specify that pattern manually.  I have no idea why!

  • Hey Grank :)

    Strange behavior but looks like the settings file did the trick.

    Z

  • It's a nice added feature for Visual Studio to keep everything integrated, but the statement that "up until now there was no way to check these code smells" is not entirely accurate. For code clone / duplication analysis there has been the tool Simian. This tool has been around for 10 years or so. Also DevExpress has the code duplication feature in their product for 2 years or so? Anyway still a welcome feature in VS.

  • John Doe (love the name btw),

    You are absolutely correct :) For the most part I look through the lens of Out-of-the-Box Visual Studio since I don't know what various company policies allow. There are some organizations that do not allow extensions to be installed and I try to write my posts to the lowest common denominator.

    Z
