
Testing Reverse Engineering Tools/Framework


Hello, I am Manish Vasani, a Software Design Engineer in Test on the Phoenix Analysis and Tools team. In this post, I’ll be giving a brief overview of the Phoenix Pereader-based Analysis tools with an emphasis on the difficulties traditionally involved in testing these types of tools and how Phoenix makes things much easier. For those not familiar with Pereader, it’s a Phoenix component that reads Portable Executable (PE) files and converts them to Intermediate Representation (IR). You can find more information on Pereader in the Phoenix SDK Documentation/Phoenix Technical Overview Article.

Overview of Pereader based Analysis Tools

It was during my bachelor’s third year that I first heard the term “Decompiler”. A couple of my close friends had implemented an i386 to C transformation tool for their main project. Not surprisingly, this was the most talked about project in the department and rightly so. Reverse engineering is just so amazingly cool! You need to have the same expertise as a traditional compiler developer to understand the different compiler optimizations and codegen outputs in order to reverse engineer it. Additionally, a huge challenge lies in ensuring robustness and correctness of your tool against a vast and dynamic input matrix.

Quite often the term Reverse Engineering gets associated with the notion of applying algorithms or heuristics to map certain patterns in lower level constructs to higher level constructs. One needs to be clear that there are two distinct steps involved in this process: (a) using the binary (and the pdb for native binaries) as a repository of information and converting it to IR, and (b) applying algorithms that identify patterns in the IR and convert them to higher level constructs. One of the most widely used features of the Phoenix framework is raising managed/native binaries to an intermediate code representation using Pereader (step a). Once the binary has been raised to Phoenix IR, a Pereader client can perform various types of analysis on the IR, such as call dependency analysis (to build a call graph), security analysis (to identify security vulnerabilities in code), etc. Our test team’s job is to test the Pereader framework for these tools.

Testing Reverse Engineering Tools

Testing a reverse engineering tool has two main challenges:

1) The primary challenge, as I mentioned earlier, is the test matrix. Difficulty lies in the fact that different compilers (or even the same compiler used with different switches) have differing codegen outputs for the same piece of code. Your implementation will work fine for a subset of constructs or code patterns built with a selected set of compilers. But as soon as you expose it to a wider test matrix, you find the need to modify your algorithm to account for the new patterns that you encounter in the input. Everything might work fine for binaries built with c2 shipped with VS2005, but a cool new optimization switch added in VS2008 c2 or brand new machine instructions added by the hardware vendors might render your tool useless for these binaries. The test team needs to track such changes and should continuously add to their test matrix.

2) Another challenge in testing reverse engineering tools is verifying that you read/raised the binary correctly. The possible techniques that come to mind are:

a. Adding asserts and trace during different phases of the raise. This solution is not suitable for automated scenario testing as well as for the release builds, so I won’t dwell on this technique here.

b. Verifying semantic correctness of the raise: This can be done by lowering back the raised IR to binary level to generate a cloned binary and then verify the clone by either:

      1) Executing the original and cloned binary and comparing the execution results

      2) Comparing the original and cloned binary. For managed assemblies this can be done through  metadata comparison, but there is no trivial technique for comparing native binaries and pdbs.

c. Dumping the raised IR for manual validation/comparison with IR dump during compile. This technique is suitable only if you have the IR dump during compile, against which you wish to compare.

None of these can individually provide a complete testing strategy. Strategy (b) seems appropriate for breadth-wise and robustness testing against Real World binaries, but it does not guarantee 100% correctness: a piece of code that wasn’t raised correctly might lie inside a conditional statement and never get hit during normal execution flow. On the other hand, strategy (c) seems more appropriate for depth-wise testing against targeted test code samples, but comparing IR dumps for huge Real World binaries is non-trivial because you would expect some amount of noise in the dumps. Hence a hybrid of all these is normally required.

How Phoenix makes it easier to write tests for these tools

Now that I have given you some background on different testing strategies for Reverse engineering tools, I would like to explain how Phoenix aids us in implementing tests for each of these strategies. The modular and extensible architecture of Phoenix provides plug-in points for adding tests at various phases of the tool. I will mainly concentrate on testing strategy b (cloning the binary) and strategy c (correctness test through dumps).

Cloning Binaries through Pewriter

The Phoenix framework provides Pewriter APIs which enable the client to generate binaries from IR. Using these APIs we generate a clone of the original binary for our “Verifying semantic correctness” test. Once the clone has been generated, the exact technique used to compare it with the original binary is for the client to decide. You can find more information about Pewriter in the Phoenix SDK Documentation.

Correctness Test through dumps

A basic correctness test for a Pereader based tool might dump out the IR and any other relevant information to the analysis tool and then compare it to a previously generated baseline dump or an expected output. In a more advanced automated test, you would want to have the baseline dump generated through a tool during compilation of the source and have a compare tool which compares these dumps.

Let us consider the basic test first. The figure below demonstrates how you would design it using Phoenix:

[Figure: phoenix1]

The above figure shows how each of the Phoenix components (e.g. the Pereader framework to raise binaries) can be represented as a list of Phoenix Phases. Our test can be implemented as a modular Test phase which plugs into the PhaseList for the client tool and is executed when the client triggers its PhaseList execution. Let us consider an example scenario to understand this well.

Consider an analysis tool which uses the Pereader Phases to raise the binary and runs static analysis using the Phoenix lattice framework (if you are unfamiliar with the lattice framework you may ignore the specifics of the test). A test for this tool could be implemented as a LatticeTestPhase which runs lattice simulation on the raised IR and then dumps out the relevant lattice cell properties. The following code snippet shows how we can implement and plug in the Lattice Test Phase into the tool.

//Implementation of Lattice Test Phase

namespace TestPhases
{

public ref class LatticeTestPhase : public Phx::Phases::Phase
{
public:

   static TestPhases::LatticeTestPhase ^
   New
   (
      Phx::Phases::PhaseConfiguration ^ configuration
   );

   static void StaticInitialize();

protected:

   // C++/CLI spells an overriding method as 'virtual ... override'
   virtual void
   Execute
   (
      Phx::Unit ^ unit
   ) override;

   void OutputConstantCellProperties
   (
      Phx::Lattices::ConstantCell ^ constantCell
   );

   void OutputDeadCellProperties
   (
      Phx::Lattices::ConstantCell ^ deadCell
   );

   ...

   void
   RunSimulation
   (
      Phx::FunctionUnit ^ functionUnit,
      Phx::Simulators::RegionSimulator ^ simulator,
      Phx::Collections::LatticeList ^ latticeList,
      Phx::Lifetime ^ currentLifetime,
      Phx::Simulators::SimulatorDirection directionToTest,
      Phx::Simulators::SimulatorMode modeToTest
   );
};

}

//Append the lattice test phase to the list of phases to be executed for the Pereader based tool

Phx::Phases::Phase ^
PEReaderBasedTool::BuildFunctionPhaseList
(
   Phx::Phases::PhaseConfiguration ^ configuration
)
{
   Phx::Phases::PhaseList ^ unitList;

   // Append RaiseIR phase and other pereader phases
   ...

   if (this->doLatticeTest)
   {
      // Plug in the Lattice test phase into Tool's PhaseList
      unitList->AppendPhase(TestPhases::LatticeTestPhase::New(configuration));
   }

   ...
}

We can enable the lattice test by just turning on a switch for our client application. The actual validation of the lattice output will be done against a manually generated expected output file.
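
The plug-in idea itself does not depend on Phoenix. As a rough illustration, here is a minimal sketch in standard C++; the Unit, Phase, RaisePhase, TestPhase and PhaseList types and the BuildPhaseList function are hypothetical stand-ins invented for this sketch and are not the Phoenix API. It only shows a test phase being appended to a tool's phase list when a switch is on.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT Phoenix types.
struct Unit {
    std::vector<std::string> log;  // records which phases ran, in order
};

class Phase {
public:
    virtual ~Phase() = default;
    virtual void Execute(Unit& unit) = 0;
};

class RaisePhase : public Phase {
public:
    void Execute(Unit& unit) override { unit.log.push_back("raise"); }
};

class TestPhase : public Phase {
public:
    void Execute(Unit& unit) override { unit.log.push_back("test-dump"); }
};

class PhaseList {
public:
    void AppendPhase(std::unique_ptr<Phase> phase) {
        assert(phase != nullptr);
        phases_.push_back(std::move(phase));
    }
    // Runs every phase in the order it was appended.
    void Execute(Unit& unit) {
        for (auto& phase : phases_) phase->Execute(unit);
    }
private:
    std::vector<std::unique_ptr<Phase>> phases_;
};

// Builds the tool's phase list; the test phase is appended only when the switch is on.
PhaseList BuildPhaseList(bool doTest) {
    PhaseList list;
    list.AppendPhase(std::make_unique<RaisePhase>());
    if (doTest) list.AppendPhase(std::make_unique<TestPhase>());
    return list;
}
```

Calling BuildPhaseList(true).Execute(unit) runs the raise phase followed by the test phase; with false, only the raise phase runs, so the tool's normal behavior is untouched when the test switch is off.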

Automating the Correctness test

The next logical extension to our test is to automate it end-to-end. This can be done by using a Phoenix based c2 during the compile phase. Just as we plugged in the LatticeTestPhase into the PhaseList for Pereader based analysis tool, we can write a Phoenix c2 plug-in to insert the same LatticeTestPhase into c2 PhaseList. LatticeTestPhase will be executed during compile as well as after raise. A simple comparison tool can compare the outputs and provide an end to end automated test.

[Figure: phoenix2]
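
A comparison tool of the kind described here could be as small as the following sketch. It is only an assumption about what such a tool might look like: the dump format (one property per line) and the convention that lines starting with ';' are noise to be ignored are both invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Splits a dump into lines, dropping blank lines and lines starting with ';'.
// The ';' noise prefix is a hypothetical convention, e.g. for timestamps.
std::vector<std::string> SignificantLines(const std::string& dump) {
    std::vector<std::string> lines;
    std::istringstream in(dump);
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line[0] == ';') continue;
        lines.push_back(line);
    }
    return lines;
}

// Returns the 0-based index of the first differing significant line,
// or -1 if the two dumps agree on every significant line.
long FirstMismatch(const std::string& compileDump, const std::string& raiseDump) {
    const std::vector<std::string> a = SignificantLines(compileDump);
    const std::vector<std::string> b = SignificantLines(raiseDump);
    const std::size_t common = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < common; ++i) {
        if (a[i] != b[i]) return static_cast<long>(i);
    }
    return a.size() == b.size() ? -1 : static_cast<long>(common);
}

// Small usage example: noise lines differ, significant lines agree.
inline void ExampleUsage() {
    assert(FirstMismatch("x = 1\n; t=1\ny = dead\n", "x = 1\n; t=2\ny = dead\n") == -1);
    assert(FirstMismatch("x = 1\ny = 2\n", "x = 1\ny = 3\n") == 1);
}
```

Filtering noise before comparing is what keeps such a test stable on large binaries, where some differences between the compile-time dump and the post-raise dump are expected.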

Conclusion

Compare this with having to implement the test for a non-Phoenix based tool. The points to consider are:

1) Does your compiler support adding plug-ins? If the answer is yes,

a. Can we re-use the same plug-in after raise?

b. Does the plug-in need to be statically linked? (With Phoenix, the plug-in can lie in a different dll)

2) Does the test plug-in logically fit into your overall architecture or do you require providing some dirty hooks to plug it in?

This is what makes Phoenix such a cool framework. Just when you feel you have identified all the places where you can put it to use, up comes a new one :). Hopefully you have enjoyed reading this, and if you find another new way in which Phoenix can be used, please do share it with us.

Terminology:

c2: VC++ compiler backend

Binary Raising: Process of reverse engineering PE files into IR.

Phoenix Simulation based optimization and lattices: Phoenix Simulation-Based Optimization (SBO) is a framework for analyzing a function by simulating the execution of Intermediate Representation (IR) in an abstract interpreter called a lattice. Lattices can be used as a foundation to determine correct transformations in an optimizing compiler.

Posted by vcblog | 8 Comments

TechEd 2008 - meeting customers at the booth!

Hello, my name is Li Shao. I am a Software Design Engineer in Test on the Visual C++ team. From June 3 to June 6, I had the opportunity to attend TechEd 2008 in Orlando along with two colleagues, Marian Luparu and Martha Wieczorek.  Most of my time was spent at the Visual C++ booth answering customers’ questions. I also had the opportunity to work with Kate Gregory, one of our MVPs, at one booth session. Here are some overall impressions based on the customers that I talked to:

 

·         Customers are really excited to know that we have new MFC functionality.

Customers from many of the companies came by and looked at our demos of new MFC features. I have followed up with many of them with information such as the download site for the Visual Studio 2008 feature pack, pointers to walkthroughs, and a link to Kate Gregory’s PowerPoint presentation on new MFC functionality. Customers are all very glad to know about this new addition to MFC and agree that this provides a solution to modernizing their MFC applications.

 

There were some other questions related to MFC. Some customers asked whether they should stay with MFC or migrate to newer technologies, such as WPF.  Some are concerned about staying with MFC because they cannot find entry-level people to do UI work in MFC, while it is much easier to find an entry-level developer to write WPF/WinForms applications.   Basically, these customers want guidance on how they should move their applications forward.

 

Of course, there is no simple answer to this question that applies to all our customers – each one has different constraints and priorities.  In the end, this is a decision you’ll need to make based on your particular application needs.  Regardless of what you choose, we’ll support you within Visual C++!  If you want to stick with MFC, you can rest assured there will be many more updates to this technology.  If you prefer to use WPF, we are working on building out guidance to help with this. In fact, Henry Sowizral from the Expression team gave a talk on how to refactor an MFC application and migrate it to WPF.  With the new MFC support, it is fairly easy to modernize existing MFC applications or create new MFC applications with a modernized UI.

 

One customer asked for the new MFC features on Smart Devices, which we don’t currently support. As far as I know, there is no current plan to support the new MFC on Smart Devices.

 

Here are some additional resources to help you learn more about the new MFC: a Channel 9 Video by Pat Brenner, one of our developers: http://channel9.msdn.com/showpost.aspx?postid=355087 and Pat’s VC Blog entry: http://blogs.msdn.com/vcblog/archive/2007/11/09/quick-tour-of-new-mfc-functionality.aspx

 

·         Customers have a lot of questions about native/managed interop

We received many questions regarding native/managed interop. Many developers seem to face the challenge of moving their native applications to .NET.  We mentioned the Marshalling library to customers and they were interested in it. Here is a VCBlog entry on the Marshalling library: http://blogs.msdn.com/vcblog/archive/2007/04/25/marshaling-library-in-orcas.aspx. We also have a Channel 9 video on the Marshalling library by Sarita Bafna, one of our program managers: http://channel9.msdn.com/shows/WM_IN/Sarita-Bafna-VC-quotOrcasquot-Marshaling-Library-and-MFC-support-for-Common-Controls/ . Kate Gregory also has a great site for the Marshalling library: http://www.marshal-as.net/ .

 

Some other interop related suggestions we had from customers are:

o   Some customers asked whether they can automatically generate a wrapper class for their native components. The answer is that if the native component is a COM component, you can use the “Add reference” feature to generate a managed wrapper class, which is essentially the same as the wrapper class that would be generated by calling Tlbimp.exe. However, there is no automatic way to generate a wrapper class for regular native classes.

o   One customer asked if there are any samples on calling Silverlight components from a native application. The answer is no for now. But this is a good suggestion and we can consider it.

o   One customer would like to have a tool to merge his pure managed assemblies and IJW assemblies. Currently, ILMerge.exe can only merge pure managed assemblies. The recommended way to produce a single assembly from a C# application and a managed C++ application is to generate .NETMODULE files from your C# code and pass the .NETMODULE files to the C++ linker along with the .obj files from the C++ code.

o   One customer commented that C++/CLI is great! He is using more /clr and less P/Invoke or COM interop.

 

·         Customers are interested in TR1 support

People were impressed with TR1 – specifically the shared_ptr and regex support. One customer asked a question about compatibility between the Boost library and TR1. Here is the answer I conveyed from Stephan T. Lavavej:  

 

TR1 is compatible with Boost in the sense that you can include both <regex> and <boost/regex.hpp> and nothing bad will happen. TR1 isn't compatible with Boost in the sense that tr1::regex and boost::regex are completely different types. TR1 is almost compatible with Boost in the sense that they both follow the TR1 spec closely.  Programs currently using the TR1 subset of Boost can be converted to use TR1 with minor changes.

 

You can take a look at this blog entry and attached slides by Stephan T. Lavavej to learn more about TR1: http://blogs.msdn.com/vcblog/archive/2008/02/22/tr1-slide-decks.aspx.

 

·         Customers in general are very excited by parallel computing and the concurrency runtime.

They would like to know more about parallel computing and about the time frame in which parallel computing libraries and programming support will be available to the public. There were a couple of talks on parallel computing during TechEd.

 

Overall, it was a very exciting and educational experience to be able to talk to so many customers, listen to their concerns and answer their questions. Meanwhile, the questions and feedback we got from the customers also confirm our main strategies:

·         Continued native support: such as renewed MFC support, C++0x, TR1, parallel computing, etc

·         Improved and “friction free” managed and native interop.

 

I and the other members of our team hope to meet many more of our customers. In fact, a number of my colleagues will be at the PDC this year. Hope to see some of you there!

Posted by vcblog | 7 Comments

Channel 9 video: STL Iterator Debugging and Secure SCL - Stephan T. Lavavej

Hello

Stephan has just completed his latest Channel 9 video, this one on debugging in our implementation of the STL: http://channel9.msdn.com/shows/Going+Deep/STL-Iterator-Debugging-and-Secure-SCL/. This topic gets a little confusing as we have two technologies available here, namely Iterator Debugging and Iterator Checking – if you are ever confused about exactly what the difference is (and when you should use one or the other) then this video is for you.

You can also see Stephan’s Channel 9 video on TR1 here: http://channel9.msdn.com/shows/Going+Deep/Stephan-T-Lavavej-Digging-into-C-Technical-Report-1-TR1/

Thanks Damien

Visual C++

Posted by vcblog | 9 Comments

Interesting Visual C++ Resources

A few weeks ago the Visual C++ team delivered two days of technical content to developers down in Northern California.  At this event, we mentioned a large number of useful resources.  We thought we’d take the opportunity to pass them on to readers of the blog as well.  Hopefully you’ll find them valuable.

 

Debugging Tips & Tricks

·  Debugging blogs

http://blogs.msdn.com/jimgries

http://blogs.msdn.com/greggm

http://blogs.msdn.com/rchiodo

http://blogs.msdn.com/jacdavis

http://blogs.msdn.com/stevejs

http://blogs.msdn.com/ms_joc

http://blogs.msdn.com/jmstall

·  Good books

Debugging Microsoft .NET 2.0 Applications by John Robbins.  This book has a lot of great information in it about debugging in general, not just about debugging managed code.

·  Finding COM pointers on the stack

http://blogs.msdn.com/greggm/archive/2005/08/01/446293.aspx

·  Retail code debugging

http://blogs.msdn.com/greggm/archive/2004/12/15/315673.aspx

·  Getting crash dumps before Windows Error Reporting (WER) sends them off

http://blogs.msdn.com/greggm/archive/2007/05/24/debugging-windows-error-reporting.aspx

·  Gadgets for Windows Error Reporting (WER)

http://www.codeplex.com/wer/

 

 

Concurrency

·  Concurrency developer center (very managed-focused today, but you’ll see a lot more native shortly)

http://msdn.microsoft.com/concurrency

·  Native concurrency blog

http://blogs.msdn.com/nativeconcurrency

 

TR1

·  VC++ 2008 Feature Pack download

http://www.microsoft.com/downloads/details.aspx?FamilyID=d466226b-8dab-445f-a7b4-448b326c48e7&displaylang=en

·  Pete Becker’s book: The C++ Standard Library Extensions: A Tutorial and Reference

http://www.amazon.com/C%2B%2B-Standard-Library-Extensions-Reference/dp/0321412990/ref=pd_bbs_sr_1?ie=UTF8&s=books&qid=1210782172&sr=8-1

·  Channel9 VC++ videos

http://channel9.msdn.com/Showpost.aspx?postid=385821

·  VC++ Libraries forums

http://forums.msdn.microsoft.com/en-US/vcgeneral/threads/

·  C++ Standard

http://www.open-std.org/jtc1/sc22/wg21/

·  TR1 Standard

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1836.pdf

·  MSDN article

http://msdn.microsoft.com/en-us/magazine/cc507634.aspx

 

New MFC Features

·  VC++ 2008 Feature Pack download

http://www.microsoft.com/downloads/details.aspx?FamilyID=d466226b-8dab-445f-a7b4-448b326c48e7&displaylang=en

·  MSDN article

http://msdn.microsoft.com/en-us/magazine/cc507634.aspx

·  VC++ Libraries forums

http://forums.msdn.microsoft.com/en-US/vcgeneral/threads/

·  Channel9 VC++ videos

http://channel9.msdn.com/Showpost.aspx?postid=355087

·  BCGSoft

http://www.bcgsoft.com/

 

Phoenix

·  Phoenix

http://connect.microsoft.com/phoenix

·  Forum

http://forums.msdn.microsoft.com/en-US/phoenix/threads/

·  Channel 9

http://channel9.msdn.com/tags/Phoenix+Framework

 

And there always are the general Channel 9 videos on VC++

http://channel9.msdn.com/tags/C++  

 

Enjoy!

 

-          The Visual C++ Development Team

Posted by vcblog | 5 Comments

Data-driven Visual C++ Editing Test Framework

Hello, I am Smile, a member of the QA team on the Visual C++ Compiler Team. I would like to write about the methods we use to verify IntelliSense results while editing in the Visual C++ IDE. To better understand this, consider the following common user scenario:

Andy opens his project in the IDE to continue his work from the day before. He opens Client.cpp, finds the line he finished at yesterday, and starts coding. After adding a few lines of code, he wants to use a member of a struct and calls the MemberList IntelliSense operation (Ctrl + J on the keyboard) on the instance of that struct to check the full name of that member ….

A simple scenario, right? Yes, but our job is to test whether the MemberList operation would report accurate and complete results, not only in simple scenarios like this, but also in very complex scenarios that may involve thousands of lines of code and a variety of IDE operations, such as FindAndReplace, FindAllReference, Delete/Undelete, SwitchBetweenFiles, …, and so on.

To improve the productivity of writing tests and reduce their maintenance, you might think that it could be a good idea to write C++ code for each test. Unfortunately this solution is not good because each line of the test code itself might be buggy and therefore needs to be verified. Also some tests may be used for 20 years and modified from time to time. It might be painful for a QA to read and try to understand test code written 20 years ago, considering the fact that everyone has his own coding style and there might not be enough comments for each of the modifications on this test between 20 years ago and now.

So how do we do it?

The solution we use is called the “Data-driven Visual C++ Editing Test Framework”. The idea is to abstract common IDE edit operations and operation sequences into APIs, and to wrap those APIs into script statements. The script statements corresponding to one or more specific user scenarios are stored in a .xml file as xml data. To write a test, a QA won’t need to write even one line of C++ code. What he needs to do is figure out the user action sequences and map them to the pre-defined script statements. That’s all. The test framework reads those script statements from the .xml file, interprets them into the corresponding APIs and executes them at runtime. Compared with writing C++ code for each test, this data-driven approach has the following advantages:

1.       relatively bug free – because all the error cases and exception handling are already done and verified in the implementation of those APIs.

2.       easy to maintain – adding, changing or removing user editing operations only requires adding, editing or removing a few lines of script statements. No C++ test code needs to be written, and therefore no verification code needs to be written to justify the test code.

3.       easier to understand – high-level abstracted script statements hide implementation details, and are therefore much easier to understand than low-level C++ code, especially after 20 years.

Ok, I guess the next question would be: what kind of script statements would we support?

I list some examples below:

·         OPEN_SOLUTION -- Open a solution.

·         OPEN_FILE -- Open a file.

·         FIND_TEXT -- Search for particular text in currently opened file.

·         INS_AT_OFFSET -- Insert some text at the beginning of the line a few lines below the first appearance of an existing text.

·         MOVE_CURSOR_TO_OFFSET -- Move cursor to a specified location relative to an existing text or current location of the cursor.

·         ADD_FI_INCLUDED -- Change project configuration to add /FI option to the configuration of a VC project.

·         MEMBER_LIST -- Verify MemberList IntelliSense operation results at the cursor location.

With those script statements, a test on the simple scenario I introduced above can be as simple as the following:

·         OPEN_SOLUTION@@TEST_ROOT@@\TestCode\Client\Client.sln   //Open solution

·         OPEN_FILE@@TEST_ROOT@@\TestCode\Client\Client.cpp   //Open Client.cpp

·         INS_AT_OFFSET@@//To be done@@1@@ int b = mySon.    //Add “int b = mySon.” 1 line below “//To be done”

·         MEMBER_LIST@@CONTAIN@@int Parent::g(int j)    //Verify whether memberlist contains “int Parent::g(int j)”

       (@@ is the separator)
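
To make the "read, interpret, execute" step concrete, here is a minimal sketch of how a statement could be split on the @@ separator and dispatched to a handler. It is an illustration under assumptions, not the framework's actual implementation: the Statement struct, the Handler type and the SplitStatement, ParseStatement and RunScript functions are all hypothetical names, and the sketch does not strip the trailing // comments shown above because "//" can also appear inside an argument (for example "//To be done").

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Splits one script statement on the "@@" separator.
std::vector<std::string> SplitStatement(const std::string& statement) {
    static const std::string kSeparator = "@@";
    std::vector<std::string> fields;
    std::size_t start = 0;
    for (;;) {
        const std::size_t pos = statement.find(kSeparator, start);
        if (pos == std::string::npos) {
            fields.push_back(statement.substr(start));
            return fields;
        }
        fields.push_back(statement.substr(start, pos - start));
        start = pos + kSeparator.size();
    }
}

// Hypothetical parsed form: the first field is the command, the rest are arguments.
struct Statement {
    std::string command;
    std::vector<std::string> args;
};

Statement ParseStatement(const std::string& line) {
    std::vector<std::string> fields = SplitStatement(line);
    Statement result;
    result.command = fields.front();  // SplitStatement always returns at least one field
    result.args.assign(fields.begin() + 1, fields.end());
    return result;
}

// A handler performs one IDE operation and reports success or failure.
using Handler = std::function<bool(const std::vector<std::string>&)>;

// Runs each statement through its registered handler; stops at the first
// unknown command or failing handler.
bool RunScript(const std::vector<std::string>& lines,
               const std::map<std::string, Handler>& handlers) {
    for (const std::string& line : lines) {
        const Statement statement = ParseStatement(line);
        const auto it = handlers.find(statement.command);
        if (it == handlers.end()) return false;
        assert(it->second);  // a registered handler must be callable
        if (!it->second(statement.args)) return false;
    }
    return true;
}
```

With this shape, adding a new script statement means registering one more handler in the map; existing tests stay as data and do not need to be recompiled.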

This framework could still be improved in multiple ways, such as enriching the script statement library, introducing more complex language structures (e.g. if-else, while-do statements) and enhancing the debugging mechanism. Among all these thoughts, one improvement might be most beneficial to our users as well as our product, which I call the “motion-capture-based problem repro” feature. With this feature, whenever a user meets a problem in the IDE, he no longer needs to write a long email describing how the problem can be reproduced. What he needs to do is enable this feature and repeat his previous operations in the IDE (for example: move the cursor to a specific location -> edit some code -> trigger memberlist … -> problem appears). Each of those operations will be logged and used to generate a script as described above. So reporting the problem on the user side becomes fairly easy: enable feature -> repeat operations -> send out the generated script and related source code. Reproducing the problem on our side also becomes fairly easy: simply run the script, and we can see the problem repro. Besides, those problem repro scripts can be categorized and selected as regression tests, which further helps us improve our productivity and product quality.
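
The logging half of that idea can be sketched in a few lines. This is only an assumption about how such a recorder might look; the OperationRecorder class and its Record and Script methods are hypothetical names, and the only thing taken from the post is the @@-separated statement format.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records IDE operations as they happen and renders them as "@@"-separated
// script statements, one per line. Hypothetical sketch, not the real feature.
class OperationRecorder {
public:
    void Record(const std::string& command, const std::vector<std::string>& args) {
        assert(!command.empty());
        std::string line = command;
        for (const std::string& arg : args) {
            line += "@@";
            line += arg;
        }
        lines_.push_back(line);
    }

    // Returns the recorded script, one statement per line.
    std::string Script() const {
        std::string out;
        for (const std::string& line : lines_) {
            out += line;
            out += '\n';
        }
        return out;
    }

private:
    std::vector<std::string> lines_;
};
```

Because the recorder emits the same statement format the test framework already interprets, a captured repro can be replayed as-is and later promoted to a regression test.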

Any thoughts/suggestions about our test framework, script language, and future work? I look forward to hearing from you. :)

Thanks,

Smile Wei

Visual C++ Compiler Team

 

Posted by vcblog | 7 Comments

Some C++ Gotchas

Hi - Jonathan Caves again.  Over the last couple of weeks I’ve seen some reports from users that the C++ compiler does not act the way they think it should.  As it turns out, these reports weren’t real bugs, but the issues brought up are interesting enough to share with a wider audience.

 

The first was from a customer who reported that the compiler was calling the wrong function.  The problem code can be reduced to the following:

 

class string {
public:
   string(const char*);
};

void f(string, string, bool = false);
void f(string, bool = false);

void g()
{
   f("Hello", "Goodbye");
}

 

The user’s observation was that the compiler should call the first function:

 

void f(string, string, bool = false);

 

but in reality it was calling the second function:

 

void f(string, bool = false);

 

and they thought that this was a bug.  At first glance it does appear that the user is correct - but looks can be deceiving.  Just because the string class has a converting-constructor from a string-literal doesn’t mean that the compiler has to use it.  For the first argument to the function call, the conversion is straightforward - both of the candidate functions expect a string and so the compiler will use the provided converting-constructor to convert the string-literal to an instance of the string class. The second argument is not so straightforward.  For the first function the compiler can again use the converting-constructor, but for the second function it can use the standard pointer-to-bool conversion to convert the string-literal (which the compiler will consider as type “const char*”) to bool.  As this is a standard conversion, the C++ Standard considers this conversion to be cheaper than calling the converting-constructor (which is a user-defined conversion) and hence the second function is a better match than the first function and the compiler, correctly, calls that function.
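
The behavior is easy to observe in a self-contained form. The sketch below wraps the reduced example in a namespace called demo and makes each overload return its parameter count; both of those are additions for illustration and were not part of the customer's code.

```cpp
#include <cassert>

namespace demo {

class string {
public:
    string(const char*) {}  // converting constructor (a user-defined conversion)
};

// Each overload returns its parameter count so the caller can tell which one ran.
inline int f(string, string, bool = false) { return 3; }
inline int f(string, bool = false) { return 2; }

}  // namespace demo

// "Goodbye" decays to const char* and converts to bool (a standard conversion),
// which beats the user-defined conversion to demo::string.
inline int CallWithLiterals() { return demo::f("Hello", "Goodbye"); }

// Constructing the second argument explicitly leaves only the three-parameter
// overload viable, because demo::string does not convert to bool.
inline int CallWithExplicitString() { return demo::f("Hello", demo::string("Goodbye")); }

inline void CheckOverloads() {
    assert(CallWithLiterals() == 2);
    assert(CallWithExplicitString() == 3);
}
```

Some compilers can warn about a string literal being converted to bool, which is one way to catch this kind of surprise before it reaches a debugger.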

 

Note: the real issue here is with the use of default-arguments.  Without default-arguments the user would not be left to the mercy of the C++ conversion rules.  If they wanted to call the three parameter version then they would need to provide three arguments; if they want to call the two parameter version then they need to provide two arguments. At first glance, default-arguments seem to be a great C++ language feature but I have seen them cause users no end of problems. I’ve even seen users doing stuff like the following:

 

SomeFunction(arg1, arg2 /*, arg3 = false, arg4 = true */);

 

They do this just so it is clear to readers of the code that default arguments are being used.  If you are going to go this far then just get rid of the default arguments (and the comments).  Believe me that it will make your life much easier and your code more maintainable!
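
As a rough sketch of that advice, here is the earlier reduced example with the default arguments removed. The namespace nodefault and the integer return values are additions for illustration only.

```cpp
#include <cassert>

namespace nodefault {

class string {
public:
    string(const char*) {}
};

// No default arguments: the number of arguments at the call site now states
// which function the caller means.
inline int f(string, string, bool) { return 3; }
inline int f(string, bool) { return 2; }

}  // namespace nodefault

// Three arguments can only match the three-parameter function.
inline int CallThree() { return nodefault::f("Hello", "Goodbye", false); }

// Two arguments can only match the two-parameter function.
inline int CallTwo() { return nodefault::f("Hello", false); }

inline void CheckExplicitArity() {
    assert(CallThree() == 3);
    assert(CallTwo() == 2);
}
```

A call such as f("Hello", "Goodbye") would still select the two-parameter function through the pointer-to-bool conversion, but a reader can now see from the argument count alone which function is being called.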

 

The second issue was around code that compiled but when the user applied what they thought was a minor edit the code no longer compiled. The code could be reduced to the following:

 

namespace std {
   template<typename T>
   class list {
   public:
      size_t size() const;
   };
}

class X : std::list<int> {
public:
   size_t mf1() const { return list::size();      }
   size_t mf2() const { return std::list::size(); }
};

 

The problem was that while mf1 compiles fine, mf2 generates the following error message:

 

a.cpp(12) : error C2955: 'std::list' : use of class template requires template argument list

        a.cpp(3) : see declaration of 'std::list'

 

The user’s question was why? Surely if the first function compiles then the second function should also compile because all the user has done is to make it clearer to the compiler what was going on. But the problem was that they have been too specific. It all comes down to something in C++ called the “injected-class-name”. 

 

In C++, each class has a member that is added – injected – by the compiler and this member has the same name as the class. (Note: don’t confuse this with a constructor which only looks as if it has the same name as the class. In reality constructors have no-name or at least a name that cannot be written in C++ code.) This member is needed in order that there can be rules for defining a constructor outside of a class.  Without this injected-member the C++ Standard would have to revert to hand-waving - something writers of Standards really hate.  One further twist is that in the case that the class is a specialization of a class template, there are two versions of the injected-class-name: one is the name of the class without the template-arguments, list in the example above, and the other is the name of the class with the template-arguments, list<int> in the example above.

 

So in the first example when the compiler sees the identifier, list, it does normal name-lookup.  It first looks up ‘list’ in the current class scope and finds nothing.  It then looks up ‘list’ in the scope of the base class where it finds the injected-class-name and, eventually, works out that by ‘list’ the user means ‘std::list<int>’ which is a base-class of the current class. So the compiler treats the code as-if the user had written:

 

     size_t mf1() const { return list<int>::size(); }

In the second example the compiler first sees ‘std’ which it looks-up and finds the namespace std.  It then looks up ‘list’ within the namespace ‘std’ and finds the class-template - but this is not an injected-class-name – and there is no version in which the template-arguments are implied. So in this case the user needs to explicitly provide the template-arguments, hence the error message.  
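To make the difference concrete, here is a minimal sketch of a class that compiles both ways. The class and member names are placeholders of mine, not the ones from the user's code; the only assumption is a class that derives publicly from std::list<int>, as the explanation above implies.

```cpp
#include <cstddef>
#include <list>

// Placeholder class: derives from std::list<int>, like the class discussed above.
struct IntList : public std::list<int> {
    // Unqualified 'list' is found in the base class scope as the
    // injected-class-name, so it already means std::list<int>.
    std::size_t viaInjectedName() const { return list::size(); }

    // 'std::list' names the class template itself, so the template
    // arguments have to be written out explicitly.
    std::size_t viaQualifiedName() const { return std::list<int>::size(); }
};
```

Both member functions call the same base-class size(); the second one is simply the spelling that the error message is asking for.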

 

I think the lesson here is: don’t try to help the compiler; let it resolve the code by itself.  If you have got it wrong it will tell you.  If it compiles and you want to double-check the result, then you should debug the code (you already do step through all the code you write in the debugger, don’t you?). The worst offenders in this category are people who add casts thinking that they will force the compiler to compile the code the way they want it to be compiled. At best the cast is unnecessary (and hence makes the code more difficult to maintain) and at worst it will lead to hard-to-detect runtime errors.

 

Thanks,

Jonathan

Posted by vcblog | 24 Comments

Channel9: The Route to C++ Code Optimization

A Channel9 video just got published where Russell Hadley (Senior Developer on the VC++ Team) speaks more about the VC compiler. Check out the video at http://channel9.msdn.com/showpost.aspx?postid=405345

Thanks,
Visual C++ Development Team

 

Posted by vcblog | 1 Comments

VC Runtime Binding...

Hello, I am George Mileka, a developer on the Visual C++ Libraries Team. After we released the Visual C++ 2008 Feature Pack back in April, we got a lot of useful feedback from MVPs and customers. One of the areas that definitely needed clarification is the version dependency embedded in the generated manifest.

 

The question goes like this: Why does the VS team provide two options:

    (1) bind the application to the RTM version of the VC runtimes.

    (2) bind the application to the current version of the VC runtimes.

And what is the default behavior?

 

Before going into why each case can be useful, I'd like to emphasize something...

 

The policy redirection provided by VC redirects application requests for older versions to newer ones (for a given servicing baseline*, and after applying the <app>.config file if present**).

This means that:

    - an application that binds to the RTM version of the libraries can be directed to RTM or later (if a newer runtime is installed).

    - an application that binds to the current version of the libraries can be directed to the current version (or later version when a newer version is released).

 

Now - back to answering the question.

 

Option (1) is useful for the following scenario:

- Alan installs the RTM product and uses it to develop his product.

- Alan builds his product and ships the VC runtime dlls he has along with it (RTM ones).

- Microsoft releases VS SP1, and Alan installs it to get a fix for a non-libraries related issue (a perf improvement in some area, for example).

- Alan finds a bug in his own product and decides to fix it, rebuild his binaries and ship a patch for his customers.

 

In this scenario, Alan does not really care about the newer VC runtime dlls. Alan should be able to rebuild his binaries and hand them to customers without having to redistribute the VC runtime. To decouple the new application binaries from the new VC runtime dlls, Alan chooses to bind to the RTM version of the VC runtime dlls.

 

Option (2) is useful for the following scenario:

- Jim installs the RTM product and uses it to develop his product.

- Jim finds a problem with the VC libraries. The problem affects his product behavior.

- Jim asks Microsoft for a fix (a QFE) so that his product works properly.

- Microsoft fixes the problems and hands Jim a QFE with the fix. The QFE contains new headers/libraries/runtime dlls.

- Jim builds his product and ships the VC runtime dlls he got through the Microsoft QFE.

 

In this scenario, the correct behavior of Jim’s product depends on the newer runtime dlls. Jim prefers that his application does not start at all rather than starting and exhibiting incorrect behavior. Jim can enforce that by binding to the QFE version of the runtime dlls.
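The two scenarios can be pictured with a toy model of the redirection rule described earlier. This is only a sketch under simplifying assumptions, not the actual binding logic: it ignores servicing baselines and any <app>.config file, and it assumes policy always picks the newest installed runtime that is not older than the requested one. The function name and the version numbers used with it are made up for illustration.

```cpp
#include <array>
#include <vector>

// Four-part version number; std::array compares element by element.
using Version = std::array<int, 4>;

// Toy model: return the newest installed runtime that is not older than the
// requested version, or nullptr if none qualifies (the app would not start).
const Version* resolveRuntime(const Version& requested,
                              const std::vector<Version>& installed) {
    const Version* best = nullptr;
    for (const Version& candidate : installed) {
        if (candidate >= requested && (best == nullptr || candidate > *best)) {
            best = &candidate;
        }
    }
    return best;
}
```

In this model, an application bound to the RTM version resolves on a machine that has only the RTM dlls and is directed to a newer runtime when one is installed, which is what Alan needs. An application bound to the QFE version gets no match on a machine that has only the RTM dlls, which is the fail-to-start behavior Jim wants.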

 

In VS2005, the default (out of the box after installing SP1) is Scenario #2.

In VS2008, the default is Scenario #1.

 

The reason we have switched the default behavior is that (based on customer feedback) scenario #1 is way more common than scenario #2.

 

One can argue that we could have had different defaults based on the release context (QFE vs. SP1), but we think this will be more confusing than just sticking with one policy for everything.

 

Please, let me know if you have any questions…

 

 

George Mileka

Visual C++ Development Team

 

* A “servicing baseline” can be VS RTM, VS SP1, etc.

**  The <app>.config can be used to disable the central policy altogether – or it can map the application request to something else before the central policy is applied.

Posted by vcblog | 12 Comments

Visual Studio 2008 Service Pack Beta now available

We are very excited to announce that Microsoft Visual Studio 2008 SP1 Beta is now available for download. We encourage you to go ahead and try out this release. Please remember that this is a beta so some caution is advised. The goal of this beta is to gather feedback from the community. Please use the connect site to report any issues or improvements.

 

The Visual C++ 2008 Feature Pack  has been rolled into this Service Pack Beta release. In addition, there are also a number of bug fixes from Visual Studio 2008 RTM that are included in this update. Some of the improvements are:

o    Connect bug: Microsoft Macro Assembler (MASM or vc/bin/ml.exe) has been included in Visual C++ Express

o    Connect bug: Silent bad codegen in _mm_load_ss() and _mm_load_sd() intrinsics is fixed

o    Connect bug