
TechEd Europe Demo Session - Face-Lifting MFC Applications on Windows 7

Hello,

My name is Damien Watkins, and I am a Program Manager on the Visual C++ team. Today, I thought I would post the “overview script” for my TechEd Europe session – so those who could not attend can try it out for themselves at home. My talk was about the new MFC features we have added in Visual Studio 2010.

Just before we get started, I would like to point out that Pat Brenner has been doing some Channel 9 videos on the new MFC functionality too. You can find the first three videos here (and keep your eyes open for more in the future):

1.       Pat Brenner: Visual Studio 2010 - MFC and Windows 7 (a 25 minute interview on the rationale for and design of the new MFC features)

2.       MFC: Implementing handlers for preview, thumbnail and search filtering (a 3 minute “How To” style video)

3.       MFC: Integrating your application with the Windows Restart Manager (a 5 minute “How To” style video)

Creating my Solution/Application

To begin, you could run Visual Studio 2010 (VS2010) as an administrator.  This is needed to register the “File Type Handlers” for providing Preview, Search and Thumbnail support for your document type. Alternatively, you could run VS2010 as a normal user and run the command to register the file type handlers in an elevated command prompt.

After starting VS2010 (as an administrator, if you chose that option), create a new MFC Application (File->New->Project or CONTROL+SHIFT+N). In the New Project Wizard, select MFC Application and enter a name for the project. In my example below I entered “VCBlogDemo”. You can use any name you like; however, if you use the same name as I did, it will be easier to follow the instructions below when I refer to specific classes, methods, etc. Then click “OK”. As an aside, unless I say otherwise, I always accept the defaults on any screen.

Now work your way through the MFC Application Wizard dialogs, selecting/enabling these features as you go:

1.       On the “Overview” page, just click “Next”.

2.       On the “Application Type” page, set the Project Style to Office (I will show some Ribbon features further on and so I want a Ribbon based UI) and then click “Next”.

3.       On the “Compound Document Support” page, just click “Next”.

4.       On the Document Template Properties page:

·         Add a file extension; I entered demo-ms.

·         Add Preview, Thumbnail and Search handler support by clicking the corresponding options once they are enabled.

·         Check that your settings match the screenshot:

·         Then click “Next”.

5.       On the “Database Support” page, just click “Next”.

6.       On the “User Interface Features” page, just click “Next”.

7.       On the “Advanced Features” page, note that “Support Restart Manager” is already selected, and then just click “Next”.

8.       On the “Generated Classes” page:

·         Change the Base Class for the CVCBlogDemoView class to CEditView. The reason for doing this is that the default implementation of this class can be used to highlight the Preview, Search and Thumbnail functionality without the need for the developer to write any additional code. Your screen should now look like this:

·         Then click Finish

Once the Wizard completed, as a quick sanity check, I changed my “Solution Configuration” to “Release”, built the solution (F7) and started the application without debugging (CONTROL+F5). I essentially just closed it after I saw it running. Then I returned to VS2010.

Configuring Restart Manager and Auto-Save

By default, the auto-save interval is 5 minutes, but rather than wait five minutes after starting the application to demonstrate this functionality, I am going to set the interval to 15 seconds. I do this by first finding the statement that enables Restart Manager support in the CVCBlogDemoApp constructor and then adding a line immediately after it. The constructor will then contain these two lines:

 

m_dwRestartManagerSupportFlags = AFX_RESTARTMANAGER_SUPPORT_ALL_ASPECTS;

m_nAutosaveInterval = 15000; // add this line after the preceding line

Editing the Ribbon

You can then change to the Resource View, go to the VCBlogDemo project and expand VCBlogDemo.rc. Then expand the Ribbon resources and double-click the IDR_RIBBON resource. Next I dragged a Panel from the Toolbox, placed it on the ribbon and changed its caption to “Restart”. I then dragged a Button from the Toolbox to the Restart panel and changed the button’s ID to ID_CRASHME and its caption to CrashMe.

Return of the MFC Class Wizard!

Now you need to add event handlers for the “CrashMe” button in the ribbon. At this point it is time to welcome back an old friend, the Class Wizard. Hit CONTROL+SHIFT+X (a different key binding from the original) and the Wizard appears. It is hard to imagine a more commonly requested feature than the return of the Wizard, and we really hope you enjoy using it!

Now change the “Class name” field to CVCBlogDemoApp and on the Commands Tab enter ID_CRASHME in the search box. The Class Wizard screen looks like this:

Next, select COMMAND and click “Add Handler”, and then select UPDATE_COMMAND_UI and click “Add Handler”. The COMMAND handler will crash our application, so we can see the Restart Manager functionality at work. The UPDATE_COMMAND_UI handler will disable the “CrashMe” button until our application has been running for more than 1 minute, so that the Windows Restart Manager machinery is enabled (and so that the auto-save timeout has expired at least once). Now use the “Edit Code” button to add the following code (of course, you can add whatever logic you like to enable/disable the button and terminate the application; the code below is the simplest and most constrained I could think of, although it is not an example of good coding practice):
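As a rough sketch of the gating logic only (not the original handler code), the one-minute rule can be written in portable C++. The names IsCrashMeEnabled and Clock are hypothetical and do not come from MFC or from the demo project:

```cpp
#include <cassert>
#include <chrono>

// Hypothetical stand-in for the UPDATE_COMMAND_UI logic; none of these names
// come from MFC or from the original demo.
using Clock = std::chrono::steady_clock;

// Returns true once the application has been running for longer than
// minUptime, which is when the "CrashMe" button should become enabled.
bool IsCrashMeEnabled(Clock::time_point appStart,
                      Clock::time_point now,
                      std::chrono::seconds minUptime = std::chrono::seconds(60))
{
    assert(now >= appStart);  // a monotonic clock never runs backwards
    return (now - appStart) > minUptime;
}
```

In the MFC application, the UPDATE_COMMAND_UI handler would presumably pass a result like this to pCmdUI->Enable(...), and the COMMAND handler would deliberately fault, for instance by writing through a null pointer.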

Demonstrating our application

So now rebuild the Solution and start the application (remember - it should be built for “Release”.) When you start the application without debugging (CONTROL+F5) you will notice that the “CrashMe” button is initially disabled. Next I will open two more documents (keying CONTROL-N twice will do).

1.       In document one I will type “This document is saved”, click on the save icon and save the document in the Documents folder.

2.       In document two I will type “This document is not saved” (and I will NOT save it).

3.       In document three I will type “This sentence is saved”, click on the save icon and save the document in the Documents folder. Next I will type on the second line “This sentence is not saved” (and I will NOT save it).

Windows 7 Taskbar

To allow the auto-save functionality time to work, at this point you can hover over the application’s icon in the new Windows 7 Taskbar. Notice how each document has a separate tab (although the writing is very small in this example) and you can see a real time view of the document – for example notice the flashing cursor in the active document (whichever one that happens to be.)

 

 

Demonstrating the Restart Manager

Now we can crash the application by clicking on the enabled “CrashMe” button. After taking a few seconds to restart, you get a nice example of our new CTaskDialog and the option to auto-recover our work.

I recover everything by clicking on the “Recover the auto-saved documents” option. Now I hover on the application icon in the taskbar and, when the document previews appear, I start closing the documents by clicking on the red X. You will notice that a document that has been saved is just closed, a document that has been saved but subsequently updated gets the “Save” dialog, and a document that has never been saved gets the “Save As” dialog.

Explorer Interaction

Finally, I start Explorer in the documents folder, type “saved” in the Search field and select one of the demo-ms files and thus we can see the Icon, Preview and Search functionality.

Support for High DPI

For those who have a few minutes, at this stage you can change the DPI setting of your display and either restart or log off (depending on your OS requirements) and see how MFC has added support for High DPI displays.

Summary

As you can see, using the new Windows functionality from MFC is very simple. Add to this the return of the Class Wizard and the addition of the Ribbon Editor and we hope that MFC developers will be excited about the upcoming Visual Studio 2010 release.

One other point: you may notice that the Ribbon Designer looks a lot like the BCGSoft designer. As with the MFC extensions in VS2008 SP1 (a.k.a. the MFC Feature Pack), we got some assistance from BCGSoft on the implementation of some of the new MFC features. So that was the overview script of my TechEd Europe talk.

Thanks,

Damien

 

 

Posted by vcblog | 0 Comments

Improvements to Find all references in Visual Studio 2010

Hello, my name is Raman Sharma, and I am a Program Manager on the VC++ team.  Through this blog post, I wish to highlight some changes we have made to an important feature in the C++ IDE, called “Find all References”.  As most of you would know, this feature is used to search for references to any element of your code (classes, class members, functions etc.), inside the entire solution.  In Visual Studio 2010, we have made some changes to this feature to provide more flexibility.

Let’s look at these changes through an example.  Let’s say you have the following piece of code:

Figure 1: Sample Code

The Old World

In Visual Studio 2008, if you invoke “Find all References” on the member function printArea of class Circle as in:

Figure 2: Invoke Find all References

You will see the results showing line numbers 7, 20 & 26 as below:

Figure 3: Results in VS 2008

Note that the results window does not list all the places the word “printArea” appears. It lists only the places where printArea means the “member function printArea of class Circle” (on which Find all References was invoked). The point is that this is in some sense a compiler-verified search (I just coined that term but the concept is correct :-) ), wherein you get exactly the C++ symbol you searched for.
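To make the distinction concrete, here is a small hypothetical sample. It is not the code from Figure 1, so its line numbers do not correspond to the results quoted in this post. The name printArea appears as a member of Circle, as a member of an unrelated class, and in a comment; only the first of these is the symbol Circle::printArea:

```cpp
#include <cassert>
#include <string>

// Hypothetical sample; it is not the code from Figure 1.
class Circle {
public:
    explicit Circle(double radius) : radius_(radius) { assert(radius >= 0.0); }
    double area() const { return 3.14159265358979 * radius_ * radius_; }
    // The symbol we search for: Circle::printArea.
    std::string printArea() const { return "circle area: " + std::to_string(area()); }
private:
    double radius_;
};

class Square {
public:
    explicit Square(double side) : side_(side) { assert(side >= 0.0); }
    // Same name, different symbol: Square::printArea.
    std::string printArea() const { return "square area: " + std::to_string(side_ * side_); }
private:
    double side_;
};

// A comment that mentions printArea is a textual hit but not a reference.
std::string describeBoth(const Circle& c, const Square& s)
{
    // One reference to Circle::printArea and one to Square::printArea.
    return c.printArea() + "\n" + s.printArea();
}
```

A purely textual search for printArea would report every one of these occurrences, while a compiler-verified search on Circle::printArea should keep only the declaration in Circle and the call c.printArea().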

The New World

In Visual Studio 2010 we have effectively created two modes of search: one that focuses on speed and the other that focuses on accuracy.  Needless to say, you can specify the default mode through options and we will remember that setting.  More on that later in this post.

Speed Mode

If you perform the same operation as described above in VS 2010, by default you will get the following results:

Figure 4: Default results in VS 2010

Please note that this time around, the results window does list all the places the word “printArea” appears (including the comments!).  Why? Because we wanted to provide the user with an option to search without invoking the compiler.  Needless to say, this option will speed up the search significantly, especially for large projects with a lot of hits.

It’s worth mentioning that this search is unlike just searching for some text in the entire solution.  This is because for this search, the C++ IDE uses an index to narrow the list of files to search.  So it doesn’t look through all the files in the solution.  The outcome is significantly better performance than just “Find in Files” (or grep), especially for large solutions.

Accuracy Mode

However, we understand that there can be instances where you want to reduce the ambiguity by filtering the results further.  This means you want better accuracy.  If you search for a member function of a class, you only want references to that member function of that class.  This is even if there are other classes with same named members or other overloads for this function within the same class.  To do this, right-click on the results in the results window and invoke “Resolve Results”:

Figure 5: Invoke Resolve Results

“Resolving results” uses the compiler to actually verify/confirm the entries and removes references that don’t precisely match.  You will be presented with the following results which contain only exact references to Circle::printArea:

Figure 6: Resolved results in VS 2010

If you are feeling more adventurous, there is another part of the accuracy mode which is like asking the following question to the compiler: “Filter out as many extraneous results as you can, and for those you are not sure about, show them anyway”.  The reason you would want to ask that question is if you want comments, code in different macro states etc. to be included in the search results.

The way to ask this question is to simply right-click on the resolved results in the window and to uncheck “Hide Unconfirmed”:

Figure 7: Uncheck "Hide Unconfirmed"

Doing this will make sure that search results will exclude only those results which the compiler has verified are definitely not references of Circle::printArea. Anything that the compiler verifies is correct and anything it is not sure about will be listed.  The results will be as shown below:

Figure 8: Resolved and Unconfirmed results in VS 2010

Notice this time the results include the comments. It’s worth mentioning that this is the only mode that was supported by VS 2008.

The default mode for search is the speed mode. This means that when you invoke Find all References, you will see all the places your search item appears. However, as with everything else in Visual Studio, there is a way to change this default behavior. If you go to “Tools -> Options -> Text Editor -> C/C++ -> Advanced”, under References you will find two new options, “Disable Resolving” and “Hide Unconfirmed”:

Figure 9: Options to set default search mode

By default the “Disable Resolving” flag is set to True (that is, Speed mode). Setting it to False causes all results to be verified with the compiler (that is, Accuracy mode). Similarly, the default value of the “Hide Unconfirmed” flag is True. Setting it to False ensures that search results contain unconfirmed results in addition to the resolved results. The second flag makes sense only when the first flag is set to False. Note also that resolving/unresolving from the results window does not affect the values of these flags. These are global settings meant to specify default behavior.
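The way the two flags combine can be sketched as a small filter. This is only a model of the behavior described above; the types Verdict, Hit and ReferenceOptions are hypothetical and are not part of Visual Studio:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of the search results; these types are illustrative and
// are not part of Visual Studio.
enum class Verdict { Confirmed, Unconfirmed, Rejected };

struct Hit {
    std::string location;  // e.g. "shapes.cpp(7)"
    Verdict verdict;       // what the compiler would conclude, if asked
};

struct ReferenceOptions {
    bool disableResolving = true;  // default: speed mode
    bool hideUnconfirmed = true;   // only consulted when resolving is enabled
};

// Decides whether a single hit is listed under the given options.
bool isVisible(const Hit& hit, const ReferenceOptions& options)
{
    if (options.disableResolving)
        return true;  // speed mode: every textual hit is listed
    switch (hit.verdict) {
        case Verdict::Confirmed:   return true;
        case Verdict::Unconfirmed: return !options.hideUnconfirmed;
        case Verdict::Rejected:    return false;
    }
    assert(false && "unhandled Verdict");
    return false;
}

// Returns the hits that would be listed in the results window.
std::vector<Hit> visibleHits(const std::vector<Hit>& hits,
                             const ReferenceOptions& options)
{
    std::vector<Hit> shown;
    for (const Hit& hit : hits)
        if (isVisible(hit, options))
            shown.push_back(hit);
    return shown;
}
```

With the defaults, every hit is shown. Turning resolving on leaves only confirmed hits, and additionally clearing hideUnconfirmed brings back the hits the compiler could not decide on.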

Overall, we believe that these changes have introduced more flexibility enabling users to optimize their experience based on their own needs.  We are excited about this change and hope you will like it.

Thank you.

 

Posted by vcblog | 24 Comments

Visual C++ Precompiled Header Errors on Windows 7

Several customers have encountered the following error while using the Visual C++ compiler on Windows 7:

 

fatal error C1859: 'stdafx.pch' unexpected precompiled header error, simply rerunning the compiler might fix this problem

 

This error manifests under the following conditions:

·         The Visual C++ compiler is invoked on Windows 7.

·         Precompiled header (PCH) files are enabled.

·         /analyze is enabled (this is not a necessary condition, but it increases the probability of encountering this behavior).

 

Despite the error message’s suggestion, “simply rerunning the compiler” probably won’t help the situation. Indeed, the underlying cause of this error is far from “simple” as it stems from an interaction between our venerable precompiled header architecture and the new security enhancements in Windows 7.

 

Visual C++ Precompiled Headers and ASLR

Precompiled header files store the “state” of a compilation up to a certain point, and that state information can be reused in subsequent compiler invocations to significantly increase build throughput. For the past 15 years, our compiler has persisted precompiled headers to disk and reloaded them directly into virtual memory with 99.999% reliability and considerable performance gains. The tradeoff, however, was a degree of fragility in our architecture.

 

Since the PCH file itself contains internal pointers, it must be loaded at the exact same address in virtual memory where it was created. The pointers will be inaccurate if the PCH is loaded at a different address in subsequent compilations. To complicate matters, the PCH also contains polymorphic objects, and each polymorphic object contains its own virtual function table pointer (VFTP). These VFTPs point to virtual function tables stored in modules. Therefore, if a polymorphic object in the PCH depends on a virtual function table in a particular module, that module must be loaded at the exact same virtual address as when the PCH was created. If the module is loaded at a different address in subsequent compilations, the VFTPs in the PCH will be inaccurate.
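A tiny illustration of why absolute pointers tie a persisted image to its address may help. This is a sketch under assumptions: Node, Image and the helper functions are hypothetical and bear no relation to the real PCH format. The image is built at one address, its raw bytes are copied to another, and the internal link still refers to the old location:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical miniature of a persisted "image" whose records link to each
// other with absolute addresses.
struct Node {
    int value;
    Node* next;  // absolute address, valid only where the image was built
};

struct Image {
    Node nodes[2];
};

// Builds the image in place; nodes[0] links to nodes[1].
void buildImage(Image& image)
{
    image.nodes[1].value = 2;
    image.nodes[1].next = nullptr;
    image.nodes[0].value = 1;
    image.nodes[0].next = &image.nodes[1];
}

// Simulates saving the raw bytes and reloading them at a different address.
void reloadAt(Image& destination, const Image& source)
{
    assert(&destination != &source);  // memcpy requires distinct objects
    std::memcpy(&destination, &source, sizeof(Image));
}

// True when the internal link points at this image's own second node.
bool linkIsValid(const Image& image)
{
    return image.nodes[0].next == &image.nodes[1];
}
```

After reloadAt, linkIsValid holds for the original image but not for the copy, whose next pointer still refers to the original's second node. The same reasoning applies to a virtual function table pointer when the module that holds the table moves.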

 

That’s a long-winded way of saying “both the PCH file and the modules it depends upon must not move between compilations!” The Visual C++ compiler verifies that both of these conditions are met before building; otherwise it fails immediately with the above error. The latter condition bit us on Windows 7, which introduced more aggressive algorithms for Address Space Layout Randomization (ASLR). ASLR mitigates certain malware exploits by randomly relocating modules within a process. To circumvent ASLR on Vista, the compiler modules in Visual Studio 2008 were built with /dynamicbase:no. This was insufficient on Windows 7, as randomization became more aggressive.

 

Our first attempt to fix the problem involved setting the preferred base address of each compiler module to a location that we considered “safe” (i.e. one that would sufficiently decrease the odds of modules colliding). Unfortunately, cascading rebases continuously thwarted our efforts, as one module would move into the preferred address space of another, and the domino effect would continue until a module that the PCH used was rebased. Failures like this were difficult to diagnose and often involved subtle factors, such as process creation order (e.g. devenv.exe loads a module that cl.exe uses) and the Native DLL Loader. We were essentially locked in a losing battle with the butterfly effect.

 

Our Solution

The majority of alternative solutions required either a substantial amount of work or an unacceptable performance hit. We finally decided to implement our own dispatch mechanism within the PCH data structures that eliminated the virtual function tables altogether. By “devirtualizing” the PCH data structures, we successfully eliminated the second criterion: compiler modules can now move about the process without breaking the precompiled header file.
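The general idea of replacing virtual dispatch with data-driven dispatch can be sketched as follows. This shows the technique in its simplest form (a type tag plus a switch) and is not the compiler's actual mechanism; Kind, Record and evaluate are hypothetical names:

```cpp
#include <cassert>

// Hypothetical record type: the tag stored in the data replaces the virtual
// function table pointer, so nothing in the record refers to a module address.
enum class Kind : unsigned char { Leaf, Pair };

struct Record {
    Kind kind;
    int a;
    int b;
};

// Dispatches on the tag instead of through a virtual function table.
int evaluate(const Record& record)
{
    switch (record.kind) {
        case Kind::Leaf: return record.a;
        case Kind::Pair: return record.a + record.b;
    }
    assert(false && "unknown Kind");
    return 0;
}
```

Because the record holds only plain data, the code that interprets it can live at any address, which is the property that lets compiler modules move without invalidating the persisted data.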

 

This fix will be available in the final release of Visual Studio 2010, and a hotfix for Visual Studio 2008 will be released shortly. If you are encountering this problem in the interim, please try the following workarounds:

·         Disable /analyze (if enabled).

·         Invoke a clean build.

·         Reboot your machine.

·         Disable PCH files.

 

Thanks,

Mark Roberts

Visual C++ Compiler Team

Posted by vcblog | 19 Comments

Steps to Open Actionable Bugs

Hi, I am Vaishnavi Sannidhanam, test lead from the Visual C++ Compiler Backend team. I joined the Visual C++ team 4 years ago and helped ship VS2005 and VS2008. In this post, I would like to introduce you to the most effective way of reporting actionable bugs to us.

Where to open the bug

 

When you find a bug, the best way to report it is through Microsoft Connect. To report a bug through Microsoft Connect, open the Connect page (https://connect.microsoft.com/) and click on “Your Dashboard”. Select the project the bug pertains to and proceed with opening the bug.

Usage Scenario

 

Give a description of why you are doing what you are doing. This helps us understand and evaluate the importance of the bug fix for you.

Providing us with a repro

 

A repro is nothing but a series of steps that reproduce the behavior you are seeing. It is always useful to give us a repro, and the smaller the repro, the better it is for us. Sometimes you might not be able to attach a repro case through Microsoft Connect because of its size. If that is the case, please make a note in the bug that you have a repro that you were not able to attach, and we will make other arrangements to get it from you. It would also be great if you provided us with the expected and the actual behaviors. For UI/accessibility/localization related issues, it is very useful if you give us a screenshot of the broken behavior. It is also very valuable if you let us know which version of the product you found the bug in.

 

Below are specific issues where special pieces of information pertaining to the bug would greatly help us:

               

1.       Project conversion issues: It is tremendously helpful for us when the opened bug contains the following information:

1.1.    Original project file

1.2.    Conversion log (upgradelog*.xml which can be found in the solution or project directory depending on what is getting converted)

1.3.    Description of the problem you ran into

 

2.       Visual Studio Crashes: These are usually reported to us as Watson reports, so nothing further needs to be done. However, if for some reason you are not getting a Watson prompt, you can open a bug with the repro steps detailing the version of VS, the type of the project (you are more than welcome to share the project with us, as this improves our chances of reproducing the crash), the series of actions you performed that caused VS to crash, etc. If the crash is on Windows 7, the usual JIT window for attaching a debugger after a crash will not be shown, because the default settings hide it. If you want to debug the issue, here are the steps to turn the debugger on:

2.1.    Go to Control Panel -> System and Security -> Action Center -> Change Action Center settings -> Problem Reporting Settings

2.2.    Select “Each time a problem occurs, ask me before checking for solutions”

Alternatively, if your group policy blocks this setting, you can click on “Select programs to exclude from reporting” and add %ProgramFiles%\Microsoft Visual Studio 10.0\Common7\IDE\devenv.exe and %ProgramFiles%\Microsoft Visual Studio 10.0\VC\vcpackages\vcpkgsvr.exe

 

3.       IntelliSense failures and browsing related issues: You can use the new logging feature in Visual Studio 2010 (Dev10). To enable it, go to Tools -> Options -> Text Editor -> C/C++ -> Advanced, set logging to True and set the Logging Level to 5. Logging is disabled by default. Once it is enabled, you can reproduce the problem and attach the resulting log. You can see this log in the Output window by selecting “Visual C++ Log”.

 

4.       Memory Consumption Issues: It is easier for us to track down memory issues if you let us know the amount of memory being used, along with a detailed description of what types of actions are causing this to happen. Sharing your project, or a project that exhibits this problem as a repro case, would greatly increase our ability to repro the bug on our end. You can also use the Sysinternals tool vmmap (http://technet.microsoft.com/en-us/sysinternals/dd535533.aspx) to look at various memory metrics such as virtual memory, working set, native heap, managed heap etc. You can create vmmap files by running vmmap.exe, attaching to devenv.exe when it is in the high-memory state, and selecting Save from the menu. This saves some data about how the memory is being used in the system, which allows us to more easily spot where to start looking for the problem. You can attach this file as part of the bug as well.

 

5.       Project Build issues: A detailed build log is very helpful when you hit project build issues. You can enable detailed logging by going to Tools -> Options -> Projects and Solutions -> Build and Run and setting “MSBuild Project build output verbosity” to “Diagnostic” for IDE builds, or by passing the /v:diag switch to MSBuild for command line builds. It is also helpful if you can provide the build log from previous versions of Visual Studio.

 

6.       Compile Time Issues: You can produce a pre-processed file or a link repro depending on what kind of problem you are running into. The steps to generate these files are documented at http://support.microsoft.com/kb/974229. It is also always useful to gather all the switches you are using when doing a compile. By default the switches passed to the compiler are not shown in the output window; you can turn this on by going to Tools -> Options -> Projects and Solutions -> Build and Run and setting “MSBuild Project build output verbosity” to “Diagnostic”.

 

Pre-processed files:

The very best kind of repro is the small section of source that compiles all on its own without any dependencies and produces an executable that reproduces the problem. If you have this kind of a dependence free repro, please report it along with what wrong behavior you observe and what you think the right behavior is. However, it is very hard and time consuming to produce a dependence free repro. That’s why the next best kind of repro is the single, pre-processed file.

 

For managed code, it is more useful for us if you give us the preprocessed file (following the instructions above) along with all the assemblies referenced during compilation. You can find out what assemblies are referenced by adding /showIncludes to the command line or by clicking on project properties -> Configuration Properties -> C/C++ -> Advanced -> Show Includes -> Yes

 

Reporting bugs in the pre-processor:

For bugs in the pre-processor (i.e. compiling something with /P or /EP doesn’t yield what you think it should), a more elaborate repro helps us out. The elaborate repro is nothing but the main source file where you noticed the problem, along with all the files that it #includes.

 

The easiest way to obtain all the source files is to build your repro case with /showIncludes. This will list all the files that are brought in during compilation. If you copy all those files to a directory, you should be able to reproduce the problem in isolation from the rest of your sourcebase. For instance, you should be able to set INCLUDE=. and still be able to produce an object file.

 

Link Repros:

Link repros are used when bugs are occurring during link time for both LTCG and non-LTCG cases.

To generate repros for Profile Guided Optimization (PGO) problems with the instrumented build, i.e. when passing /PGI, simply follow the link repro instructions detailed at http://support.microsoft.com/kb/974229. All needed files will be copied over to the link repro directory.

To generate repros for PGO problems with the optimized build, i.e. when passing /PGO, simply follow the link repro instructions detailed at http://support.microsoft.com/kb/974229. In addition, include the .PGC files from the scenario that is causing the problem you want to report.

 

7.       Application Runtime Issues:

 

            Floating point precision issues

After using the above described techniques to reduce your repro size, letting us know what your expected result is and what we are producing is very valuable for us. Also helping us understand what kind of an impact this precision error has on your product would be useful for us when we decide on when to fix the bug. You might also want to look into the various floating point options that the compiler provides that help you choose between precision and performance (http://msdn.microsoft.com/en-us/library/e7s85ffb(VS.100).aspx ).
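A reduced floating point repro can be very small. The sketch below is a hypothetical example, not taken from any reported bug. It shows the kind of expected-versus-actual statement that is useful: the same three numbers summed with two different groupings give different results under strict IEEE-754 double arithmetic, which is why options that allow the compiler to reassociate (such as /fp:fast) can change an answer:

```cpp
#include <cassert>

// Adds the first two operands before the third.
double sumLeftFirst(double a, double b, double c) { return (a + b) + c; }

// Adds the last two operands before the first.
double sumRightFirst(double a, double b, double c) { return a + (b + c); }

// Self-check for one input where the two groupings disagree, assuming strict
// IEEE-754 double evaluation (no reassociation by the compiler).
void checkGroupingExample()
{
    // 1e20 and -1e20 cancel exactly, then 1.0 is added.
    assert(sumLeftFirst(1e20, -1e20, 1.0) == 1.0);
    // 1.0 is far below the precision of 1e20, so it is lost before the cancel.
    assert(sumRightFirst(1e20, -1e20, 1.0) == 0.0);
}
```

A report along the lines of "expected 1.0, got 0.0 with these switches", together with the impact on your product, gives us most of what we need to start.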

 

            Application crashes

Your application crashes and you think it is a compiler issue because:

·         you don’t find any fault in your code

·         you see that this crash happens with the new version after upgrade and does not happen with the old version

·         you see that the crash happens only in release mode and not in debug mode

In these cases you could provide us with the source code that exhibits this behavior. We would really appreciate it if it were a reduced repro case as this tremendously helps us in narrowing down the problem.

 

Thank you for reading through this blog. I hope this blog enables you to open actionable bugs which in turn will help us ship a higher quality product. Thank you for dogfooding our product.

Posted by vcblog | 28 Comments

Visual C++ Code Generation in Visual Studio 2010

Hello, I’m Ten Tzen, a Compiler Architect on the Visual C++ Compiler Code Generation team. Today, I’m going to introduce some noteworthy improvements in Visual Studio 2010.

 

Faster LTCG Compilation:  LTCG (Link Time Code Generation) allows the compiler to perform better optimizations with information on all modules in the program (for more details see here).  To merge information from all modules, LTCG compilation generally takes longer than non-LTCG compilation, particularly for large applications.  In VS2010, we improved the information merging process and sped up LTCG compilation significantly. An LTCG build of Microsoft SQL Server (an application with .text size greater than 50MB) is sped up by ~30%.

 

Faster Pogo Instrumentation run:  Profile Guided Optimization (PGO) is an approach to optimization where the compiler uses profile information to make better optimization decisions for the program.  See here or here for an introduction of PGO.  One major drawback of PGO is that the instrumented run is usually several times slower than a regular optimized run.  In VS2010, we support a no-lock version of the instrumented binaries.  With that the scenario (PGI) runs are about 1.7X faster. 

 

Code size reduction for X64 target: Code size is a crucial factor in performance, especially for applications that are sensitive to the behavior of the instruction cache or working set. In VS2010, several effective optimizations are introduced or improved for the X64 architecture. Some of the improvements are listed below:

·         More aggressively use RBP as the frame pointer to access local variables. RBP-relative address mode is one byte shorter than RSP-relative.

·         Enable tail merge optimizations in the presence of C++ EH or Windows SEH (see here and here for EH or SEH).

·         Combine successive constant stores to one store. 

·         Recognize more cases where we can emit 32-bit instruction for 64-bit immediate constants.

·         Recognize more cases where we can use a 32-bit move instead of a 64-bit move.

·         Optimize the code sequence of C++ EH destructor funclets.

 

Altogether, we have observed code size reduction in the range of 3% to 10% with various Microsoft products such as the Windows kernel components, SQL, Excel, etc.

 

Improvements for “Speed”: As usual, many code quality tunings and improvements were also made across different code generation areas for “speed”. In this release, we have focused more on the X64 target. The following are some of the important changes that have contributed to these improvements:

·         Identify and use CMOV instruction when beneficial in more situations

·         More effectively combine induction variable to reduce register pressure

·         Improve detection of region constants for strength reduction in a loop

·         Improve scalar replacement optimization in a loop

·         Better avoidance of store forwarding stalls

·         Use XMM registers for memcpy intrinsic

·         Improve Inliner heuristics to identify and make more beneficial inlining decisions

Overall, we see an 8% improvement as measured by integer benchmarks and a few percentage points on the floating point suites for X64.

 

Better SIMD code generation for X86 and X64 targets:  The quality of SSE/SSE2 SIMD code is crucial to game, audio, video and graphic developers.  Unlike inline asm which inhibits compiler optimization of surrounding code, intrinsics were designed to allow more effective optimization and still give developers access to low-level control of the machine.  In VS2010, we have added several simple but effective optimizations that focus on SIMD intrinsic quality and performance.  Some of the improvements are listed below:

 

·         Break false dependency:  The scalar convert instructions (CVTSI2SD, CVTSI2SS, CVTSS2SD, or CVTSD2SS) do not modify the upper bits of the destination register. This causes a false dependency which could significantly affect performance. To break the false dependency in memory-to-register conversions, the VS2010 compiler inserts MOVD/MOVSS/MOVSD to zero out the upper bits and uses the corresponding packed conversion.  For instance,

 

cvtsi2ss xmm0, mem-operand   →   movd xmm0, mem-operand
                                 cvtdq2ps xmm0, xmm0

For register to register conversions, XORPS is inserted to break the false dependency.

cvtsd2ss xmm1, xmm0          →   xorps xmm1, xmm1
                                 cvtsd2ss xmm1, xmm0

Even though this optimization may increase code size we have observed a significant positive performance improvement on several real world code and benchmark programs. 

 

·         Perform vectorization for constant vector initializations: In VS2008, a simple initialization statement, such as __m128 x = { 1, 2, 3, 4 }, would require ~10 instructions. With VS2010, it’s optimized down to a couple of instructions.  This can apply to dimensional initialization as well.  The instructions generated for initialization statements like __m128 x[] = {{1,2,3,4}, {5,6}} or __m128 t2[][2]= {{{1,2},{3,4,5}}, {{6},{7,8,9}}};  are greatly reduced with VS2010. 

 

·         Optimize the _mm_set_*(), _mm_setr_*() and _mm_set1_*() intrinsic family.  In VS2008, a series of unpack instructions is used to combine the scalar values. When all arguments are constants, this can be achieved with a single vector instruction.  For example, the single statement return _mm_set_epi16(0, 1, 2, 3, -4, -5, 6, 7) would require ~20 instructions in previous releases, while only one instruction is required in VS2010.

 

·         Better register allocation for XMM registers, thus removing many redundant loads, stores and moves.

·         Enable Compare & JCC CSE (Common Sub-expression Elimination) for SSE compares.  For example, the code sequence below at left will be optimized to the code sequence at right:

 

ECX, CC1 = PCMPISTRI               ECX, CC1 = PCMPISTRI
JCC(EQ) CC1                        JCC(EQ) CC1
ECX, CC2 = PCMPISTRI         →     JCC(ULT) CC2
JCC(ULT) CC2                       JCC(P) CC3
ECX, CC3 = PCMPISTRI
JCC(P) CC3
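Two of the intrinsic patterns from the list above can be tried out with a small sketch. It assumes an x86 or x64 target with SSE2; the function names are made up for illustration, and the comments describe the instructions the intrinsics normally map to, not what any particular compiler version is guaranteed to emit.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics (also pulls in the SSE header)
#include <cstdint>

// Scalar form: _mm_cvtsi32_ss maps to CVTSI2SS, which keeps the upper three
// lanes of its first operand, so the result depends on that register.
inline float int_to_float_scalar(__m128 upper, int value) {
    return _mm_cvtss_f32(_mm_cvtsi32_ss(upper, value));
}

// Packed form: MOVD zeroes the upper lanes, then CVTDQ2PS converts all four
// lanes, so nothing depends on the previous contents of the register.
inline float int_to_float_packed(int value) {
    __m128i lanes = _mm_cvtsi32_si128(value);      // movd
    return _mm_cvtss_f32(_mm_cvtepi32_ps(lanes));  // cvtdq2ps
}

// All arguments are constants, so the whole vector is known at compile time.
// The last argument of _mm_set_epi16 ends up in lane 0.
inline __m128i constant_vector() {
    return _mm_set_epi16(0, 1, 2, 3, -4, -5, 6, 7);
}

// Hypothetical helper: reads one 16-bit lane back out of a vector.
inline std::int16_t lane_of(__m128i v, int index) {
    alignas(16) std::int16_t out[8];
    _mm_store_si128(reinterpret_cast<__m128i*>(out), v);
    return out[index];
}
```

Both conversion helpers return the same value; the difference is only in which instructions are used and what the destination register depends on.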

 

Support for AVX in Intel and AMD processors:   Intel AVX (Intel Advanced Vector Extensions) is a 256 bit instruction set extension to SSE and is designed for applications that are floating point intensive (See here and here for detailed information from Intel and AMD respectively).  In VS2010 release, all AVX features and instructions are fully supported via intrinsic and /arch:AVX.  Many optimizations have been added to improve the code quality of AVX code generation which will be described with more details in an upcoming blog post. In addition to AVX support in the compiler, the Microsoft Macro Assembler (MASM) in VS2010 also supports the Intel AVX instruction set for x86 and x64.

 

 

More precise Floating Point computation with /fp:fast: To achieve maximum speed, the compiler is allowed to optimize floating point computation aggressively under the /fp:fast option.  The consequence is that floating point computation errors can accumulate, and a result could be so inaccurate that it severely affects the outcome of a program.  For example, we observed that more than half of the programs in the floating point benchmark suite fail with /fp:fast in VS2008 on the X64 targets.  In order to make /fp:fast more useful, we “down-tuned” a couple of optimizations in VS2010. This change could slightly affect the performance of some programs that were previously built with /fp:fast but will improve their accuracy.  And if your programs were failing with /fp:fast in earlier releases, you may see better results with VS2010.
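To see why aggressive floating point optimization can change results, here is a minimal sketch. It assumes that reordering additions is among the transformations a fast floating point mode may apply; the function names are made up for illustration, and the code performs the reordering by hand rather than relying on any compiler switch.

```cpp
// The same three values summed in two different orders. Under strict IEEE
// semantics the compiler must keep the order written in the source.
inline double sum_left_to_right(double a, double b, double c) {
    return (a + b) + c;
}

// The order a reassociating optimizer might choose instead.
inline double sum_reassociated(double a, double b, double c) {
    return a + (b + c);
}
```

With a = 1e20, b = -1e20 and c = 1.0, the first function returns 1.0 and the second returns 0.0, because 1.0 is too small to survive being added to -1e20 in double precision. Errors of this kind are what accumulate in longer computations.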

 

Conclusion: The Visual C++ team cares about the performance of applications built with our compiler and we continue to work with customers and CPU vendors to improve code generation. If you see issues or opportunities for improvements, please let us know through Connect or through our blog.

 

 

 

 

Posted by vcblog | 40 Comments

Channel 9 Video: Stephan T. Lavavej - Everything you ever wanted to know about nullptr

Stephan T. Lavavej is back in front of the Channel 9 cameras once again for a discussion on our recently implemented C++0x feature “nullptr”. In a previous Channel 9 appearance, Stephan spoke about the C++0x language and library features we were implementing for VS2010 and the interdependencies between them (for example, how rvalue references – a language feature – enable move semantics and perfect forwarding in our Standard Template Library implementation – a library feature.) In this video Stephan describes how rvalue references exposed a few loopholes in the C++ type system around the NULL macro (or more specifically around the value of the NULL macro, the integer constant 0, and how the compiler treats this value “differently” from other integer constants.) The issue had already been identified by the C++ Language Committee and a solution had been added to the C++0x language specification (the aforementioned “nullptr”). And to add even more good news, customers can see and use this feature (and all our other C++0x Features) in the Visual Studio 2010 Beta 2 which was released this week.  We hope you enjoy Stephan’s latest theatrical release.

 

Thanks

Damien

Posted by vcblog | 9 Comments

Visual Studio 2010 Beta 2 Is Now Available For Download

We are very pleased to announce we have released Visual Studio 2010 Beta 2.  You can read the official Beta 2 announcement on Soma’s blog. The Visual C++ team has added a few new features and, of course, many bug fixes. The additional features include some substantial new functionality in the MFC library and the return of the MFC Class Wizard. We  are currently filming some Channel 9 videos and writing a few VC Blog posts. Our first two Channel 9 videos on Beta 2 are already online.  The video on the C++0x Language feature “nullptr”  is here and the video on the MFC features is here. You can download Beta2 from this location.  Please be sure to continue to watch this blog for updates.   As always we welcome your comments/suggestions/criticisms on our blog.

 

Thank you,

Damien Watkins and Kelly Evans

Visual C++ Team

 

Posted by vcblog | 54 Comments

Visual C++ Code Model in Visual Studio 2010

Hello, I’m Vytautas Leonavičius, a developer on the Visual C++ IDE team. Today, I’d like to discuss how the new principles we’re applying to code browsing in Visual Studio 2010 (see here and here) will affect VCCodeModel.  General information about VCCodeModel can be found here.

Improvements

Because of our new incremental update architecture, VCCodeModel implicitly gets certain benefits:

        Availability: Once initial population is done, VCCodeModel is pretty much always available. That is a positive result of design decisions to parse files independently. No change of a single file (even the omnipresent windows.h, if you like) will cause “rebuild of the world”. That means, if issued, calls to VCCodeModel.Synchronize will return almost instantly.

        Reliability: We no longer use the old NCB database. Instead, we now use the same redistributable SQL CE components that we provide to our end-users.  Suggestions to “delete your .ncb file” (due to DB corruption) in order to restore functionality of Browsing/Intellisense are a thing of the past. 

        Independence from a macro state: Prior to Visual Studio 2010, a file snapshot was taken in a certain undefined macro state. Depending on macro state active in a translation unit at the moment file contents were analyzed, certain constructs would or would not make it into the NCB database.  That caused a lot of confusion.  In Visual Studio 2010 macro state is ignored and all source file declarations and definitions make it into database, and therefore into VCCodeModel.

 

// Both A and B will make it into database and VCCodeModel, no matter if X is
// defined or not.

#ifdef X

class A {};

#else

class B {};

#endif


See Dealing with preprocessor conditional directives and Hint files sections of Thierry’s post.

        Improved eventing model: VCCodeModel edit Events are more precise and correct, thanks to improvements to our code difference algorithms.  The new difference algorithms have better understanding of user edits.  This is especially true in the presence of templates. This improves the reliability of the Visual C++ Wizards and clients who listen to VCCodeModel Events.

        Performance: VCCodeModel is faster in scenarios where an object reference is taken and passed through external third party code (your code!) which may perform looped inquiries. This is thanks to some targeted caching we added to VCCodeModel.  All clients should benefit from this since many of VCCodeModel services are implemented using its externally exposed APIs for correctness.

Limitations

The aforementioned benefits are primarily brought about by the design decision to treat every single file independently and through some targeted performance work.  However, a downside that this design decision exposes is a lack of absolute precision in full symbol resolution in Visual Studio 2010 Browsing. It applies the following limitations on VCCodeModel:

        No symbol resolution: Type Strings for functions (return type), typedefs, base classes, macro definitions (except those specified in hint files) and parameters now come directly from source code. They are no longer resolved by the compiler, as they used to be.  This problem may surface if a caller wants to figure out precisely what type a function is returning:

 

Example 1:

using namespace X;

using namespace Z;

Type foo() { return Type(); }

 

The Type String for the VCCodeFunction object referring to function foo is Type. We don’t really know if it is X::Type, global Type or Z::Type. Note that the using namespace directives in this sample need not be in the same file; they can be in a header (far away). Without building the full translation unit, there’s no efficient way to resolve the symbol Type deterministically and to maintain the performance improvements that our customers desire and that we have achieved with this new design.

Example 2:

File.h

class Base {};

 

File2.h

namespace A {
       class Base {};

}

 

namespace B {

       class Base {};

}

 

File.cpp

#include "a_header_including_many_headers_and_introducing_using_directives.h"

 

class Derived : Base // Which Base is that? Base, A::Base or B::Base?

{};


Lack of precise fully-qualified symbol resolution affects the TypeString properties of VCCodeParameter and VCCodeTypedef and the FullName property of VCCodeBase. It also indirectly affects several APIs of the CodeTypeRef object, which is used as a link between an object in source code and its type.

VCCodeModel provides heuristic APIs that attempt to resolve a name to a type, even if the name is not fully qualified. For incompletely qualified names (like “Type”, if Type belongs to namespace X), VCCodeModel.CodeTypeFromFullName and VCCodeModel.CodeElementFromFullName prefer names defined at global scope and only resolve to a name defined in a namespace if a global one is not found.

We’ve added a helper API called VCCodeModel.CodeElementFromFullName2  in Visual Studio 2010 to help clients detect this ambiguity. See below.

        No code elements from imported assemblies in C++/CLI: Code elements from imported assemblies are not added to the Browsing database. As a consequence, the types coming from these assemblies would not be resolved.  For example, if you have a function with a parameter type specified as Exception in source code, VCCodeModel won’t be able to resolve it to System::Exception, since currently such class is not present in database.

 

        We’ve removed Managed C++ support from VCCodeModel in Dev10: Support for C++/CLI is still there.

 

New APIs in Visual Studio 2010

We have introduced the following APIs to VCCodeModel in Visual Studio 2010. All APIs are additions to the VCCodeModel interface:

        CodeElementFromFullName2: Is identical to CodeElementFromFullName, except that it will disregard namespaces during lookup. Because there is no symbol resolution in Visual Studio 2010 version of VCCodeModel it is sometimes beneficial to know whether a particular symbol is ambiguous. The primary source of ambiguity in VCCodeModel is using namespace directives. CodeElementFromFullName2 API looks up the name disregarding namespace. For the following source code:

class X {};

namespace NS1 {

       class X {};

       namespace NS2 {

              class X {};

       }

}

 

Calls to VCCodeModel.CodeElementFromFullName2(“X”) will yield {X; NS1::X; NS1::NS2::X}.
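The namespace-insensitive lookup described above can be modelled with a few lines of standalone C++. This is only a toy model over a list of qualified names, not the VCCodeModel automation API; find_ignoring_namespaces, unqualified and is_ambiguous are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the unqualified part of a name, e.g. "NS1::NS2::X" -> "X".
inline std::string unqualified(const std::string& full_name) {
    const std::size_t pos = full_name.rfind("::");
    return pos == std::string::npos ? full_name : full_name.substr(pos + 2);
}

// Toy stand-in for a lookup that disregards namespaces: every known element
// whose unqualified name matches is returned, whatever scope it lives in.
inline std::vector<std::string> find_ignoring_namespaces(
        const std::vector<std::string>& known, const std::string& name) {
    std::vector<std::string> matches;
    for (const std::string& candidate : known) {
        if (unqualified(candidate) == name) {
            matches.push_back(candidate);
        }
    }
    return matches;
}

// A caller can treat more than one match as "this symbol is ambiguous".
inline bool is_ambiguous(const std::vector<std::string>& known,
                         const std::string& name) {
    return find_ignoring_namespaces(known, name).size() > 1;
}
```

For the three classes in the sample above, looking up X in this model returns X, NS1::X and NS1::NS2::X, which is the signal a client can use to decide that the name needs further qualification.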

 

        CodeTypeFromFullName2: Is identical to CodeTypeFromFullName except that it will attempt typedef resolution. That is, for the following code:

class X {};

typedef X TD;

 

a call to VCCodeModel.CodeTypeFromFullName2(“TD”) will yield class X.

        IsSynchronized: returns true or false depending on whether VCCodeModel is in sync with solution’s source code. Useful for avoiding blocking the calling thread for an undefined period of time while Browsing database is being populated.

        SynchronizeFiles: If VCCodeModel is not in sync with source code, it is not safe to query for a VCFileCodeModel for a project file (caller will get null reference if project files are not yet registered in Browsing database). Call to SynchronizeFiles makes sure that FileCodeModel property on a project file is guaranteed to be not null.

        SynchronizeCancellable: if the caller invokes this API while the Browsing database is being populated, and there’s a significant delay until population completes (for example: initial population of the solution Browsing database), the user will see a dialog box with a progress bar.  The dialog box also allows the user to cancel the wait, in which case the API exits and unblocks the calling thread.

 

We’re looking forward to hearing your feedback.  Any suggestions and comments about what we can do to make VCCodeModel better are welcome.

Vytas/Visual C++

 

Posted by vcblog | 3 Comments

The ATL/MFC Trace Tool and the Tracing Mechanism

Hi, I am Pat Brenner, a Software Design Engineer in the Visual C++ Libraries group.  Some time back I wrote about Spy++.  Today, I am going to write about another Visual Studio debugging tool, the ATL/MFC Trace Tool, and the tracing mechanism that it interacts with in ATL and MFC.

The tracing mechanism

The tracing mechanism is used to control the type of information, and the amount of that information, that is dumped to the output window during execution of a program.  There are a number of categories of information, and different levels of that information, that can be displayed.

The tracing macros

An application can output tracing messages to the output window by:

·         using the ATLTRACE macros (for ATL), defined in atltrace.h.

·         using the TRACE macros (for MFC) defined in afx.h.

There are uses of the ATLTRACE and TRACE macros sprinkled throughout the ATL and MFC source code.  For example, in CStringT.h in the atlmfc\include folder, in the CStringT::CheckImplicitLoad method, you can find this line of code:

ATLTRACE( atlTraceString, 2, _T( "Warning: implicit LoadString(%u) failed\n" ), nID );

This will dump the message to the output window if level-2 messages in the string category are turned on.

The ATL/MFC Trace Tool

Below is a screen shot of the ATL/MFC Trace Tool.  An MFC application named Editor.exe is running.  The “atlTraceString” category is selected for the MFC100UD.DLL module in the Editor.exe process.  Since a category is selected in the tree, all three of the groups (Process, Module and Category) are enabled.  If the Editor.exe process was selected in the tree, only the Process group would be enabled, and if the MFC100UD.DLL module was selected in the tree, only the Process and Module groups would be enabled.  With this tool, you can configure exactly what categories you would like to see trace messages for, and what amount of messages in those categories.  Here I have indicated that I would like to see a fairly minimal number of trace messages for the entire process, and that the module should inherit the settings from the process, but I have overridden those values and indicated that I want to see a moderate number of trace messages in the string category.

ATL tracing categories

The categories of trace information that can be dumped by ATL:

·         atlTraceGeneral: general and miscellaneous trace messages

·         atlTraceCOM: COM object and method trace messages

·         atlTraceQI: QueryInterface trace messages (category not used in ATL or MFC)

·         atlTraceRegistrar: registration trace messages

·         atlTraceRefcount: reference count trace messages (category not used in ATL or MFC)

·         atlTraceWindowing: Windows message trace messages

·         atlTraceControls: ActiveX control related trace messages

·         atlTraceHosting: in-place client/site related trace messages

·         atlTraceDBClient: database client related trace messages

·         atlTraceDBProvider: database provider related trace messages

·         atlTraceSnapin: snap-in related trace messages

·         atlTraceNotImpl: “interface not implemented” trace messages

·         atlTraceAllocation: memory allocation trace messages

·         atlTraceException: “exception thrown” trace messages

·         atlTraceTime: COleDateTime related trace messages

·         atlTraceCache: caching related trace messages (category not used in ATL or MFC)

·         atlTraceStencil: stencil related trace messages (category not used in ATL or MFC)

·         atlTraceString: CStringT related trace messages

·         atlTraceMap: CAtlMap related trace messages

·         atlTraceUtil: thread and thread-pool related trace messages

·         atlTraceSecurity: CSecurityDesc/CAccessToken related trace messages

·         atlTraceSync: synchronization object related trace messages

·         atlTraceISAPI: ISAPI related trace messages (category not used in ATL or MFC)

·         atlTraceUser: user-defined trace messages (obsolete category not used in ATL or MFC)

·         atlTraceUser2: user-defined trace messages (obsolete category not used in ATL or MFC)

·         atlTraceUser3: user-defined trace messages (obsolete category not used in ATL or MFC)

·         atlTraceUser4: user-defined trace messages (obsolete category not used in ATL or MFC)

Note: the categories that are not used internally by ATL or MFC will probably be removed in a future version of ATL, in order to clean up the interface in the ATL/MFC Trace Tool.

MFC tracing categories                                                                                    

The categories of trace information that can be dumped by MFC:

·         traceAppMsg: main message pump trace messages, including DDE

·         traceWinMsg: Windows message trace messages

·         traceCmdRouting: Windows command routing trace messages

·         traceOle: special OLE callback trace messages

·         traceDatabase: special database trace messages

·         traceInternet: special internet client trace messages

·         traceDumpContext: trace messages from CDumpContext

·         traceMemory: generic non-kernel memory trace messages

·         traceHtml: HTML trace messages

·         traceSocket: socket trace messages

Tracing levels

The information dumped by the trace mechanism is assigned a level from 0 (zero) to 4, where 0 is the most important level and 4 the least important.  These levels correspond to the five ticks on the sliders in the ATL/MFC Trace Tool.

Tying it together

So, based on the settings I set in the ATL/MFC Trace Tool above, the source line above in CStringT.h will dump out the message to the output window, because although I have indicated that I want only level-0 and level-1 messages from the process, I want level-2 messages in the string category.
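The filtering rule used in this walkthrough can be written down as a small model. This is a sketch of the behavior described in this post, not the ATL implementation: TraceSettings and its members are hypothetical names, and the module layer between process and category is left out for brevity.

```cpp
#include <map>
#include <string>

// Toy model of the trace settings: a process-wide level plus optional
// per-category overrides.
struct TraceSettings {
    int process_level = 0;
    std::map<std::string, int> category_level;  // overrides of process_level

    // A category without an override inherits the process-wide level.
    int level_for(const std::string& category) const {
        std::map<std::string, int>::const_iterator it = category_level.find(category);
        return it == category_level.end() ? process_level : it->second;
    }

    // A message is dumped when its level does not exceed the level
    // configured for its category (0 is the most important level).
    bool should_emit(const std::string& category, int message_level) const {
        return message_level <= level_for(category);
    }
};
```

With process_level set to 1 and an override of 2 for atlTraceString, a level-2 string message is emitted while a level-2 message in any other category is filtered out.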

How it works

When a module is loaded that is using debug ATL (ATLSD.LIB), part of the initialization process is the initialization of the global CAtlAllocator object g_Allocator (see externs.cpp in the atlmfc\src\ATL\ATLS folder).  This method creates a named shared memory area, and part of the name is the process ID (e.g., for process EB0A, the shared memory area is named “AtlDebugAllocator_FileMappingNameStatic_100_EB0A”).  This shared memory area is used to contain all the settings for the process, modules and categories that can be modified by the ATL/MFC Trace Tool.

When the ATL/MFC Trace Tool is started up, it first enumerates all the processes in the system, and for each, checks to see if a named shared memory area exists for that process (using the naming scheme mentioned above).  If so, then the tool loads up all the settings for that process and from then on is able to modify the settings for the process in the shared memory area, thus affecting the runtime trace behavior of that process.
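As an illustration of the naming scheme, the sketch below rebuilds the example name from a process ID. The prefix and the “100” segment are copied literally from the example above; the exact format string ATL uses is an assumption here, and trace_mapping_name is a hypothetical helper.

```cpp
#include <cstdio>
#include <string>

// Builds the shared memory name for a process by appending the process ID
// in uppercase hexadecimal, matching the "..._100_EB0A" example.
inline std::string trace_mapping_name(unsigned long process_id) {
    char buffer[96];
    std::snprintf(buffer, sizeof buffer,
                  "AtlDebugAllocator_FileMappingNameStatic_100_%lX", process_id);
    return buffer;
}
```

A tool that enumerates processes could compute this name for each process ID and then try to open a mapping with that name to find out whether the process uses debug ATL.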

An interesting recent discovery

We have implemented support in ATL and MFC for preview, thumbnail and filter (search) handlers.  These are loaded by the Windows Explorer and other Windows components (including the Windows Search service).  Recently we had an issue where the Windows Search component could not load our debug DLL, so we never got search filter results in indexed locations.  As it turns out, this was because the search filter host (which loaded the ATL filter handler DLL in order to do the indexing) was running without any file system permissions.  The ATL tracing mechanism, however, tries to set up the shared memory area (for communication with the ATL/MFC Trace Tool) using the CreateFileMapping API.  The lack of file system permissions caused this to fail, and the DLL initialization was aborted, and thus our filter handler was not called.  Apparently this is an issue that has lurked in ATL since the tracing mechanism was invented.  So, in order to fix this issue, I had to allow DLL initialization to continue if the tracing initialization failed, and then simply bail out of any further calls into the tracing mechanism if the initialization had failed.  This then allowed the DLL to load and the filter handler was called correctly, and the bug was fixed.
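The shape of that fix can be sketched in portable C++. This is not the ATL source: TraceChannel and initialize_module are hypothetical names, and a plain bool stands in for whether the shared memory area could be created.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy tracing channel that remembers whether its setup succeeded.
class TraceChannel {
public:
    // `can_create_mapping` stands in for the result of creating the
    // shared memory area.
    explicit TraceChannel(bool can_create_mapping) : ready_(can_create_mapping) {}

    bool ready() const { return ready_; }

    // Bails out immediately if initialization failed.
    bool trace(const std::string& message) {
        if (!ready_) {
            return false;
        }
        log_.push_back(message);
        return true;
    }

    std::size_t message_count() const { return log_.size(); }

private:
    bool ready_;
    std::vector<std::string> log_;
};

// Module initialization no longer depends on tracing being available.
inline bool initialize_module(TraceChannel& tracing) {
    tracing.trace("module loaded");  // harmless no-op when tracing is unavailable
    return true;
}
```

The important property is that a failed tracing setup degrades to silence instead of aborting the load of the whole module.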

 

I hope this has been interesting.  Let me know if you have any questions.

Pat Brenner

Visual C++ Libraries Development

 

Posted by vcblog | 11 Comments

Ribbon Designer

Hello, my name is Samatha Mannem and I am a QA with the IDE team.

The world has become sophisticated and the time has come to make every application geeky as well as fancy. That is where ‘Ribbon’ has evolved. The recent UI designs that people are attracted to are Microsoft Office and Windows 7 ribbons.

While Visual Studio 2008 SP1 included the ability to create an application that has a ribbon UI, it was difficult for you to configure it as desired. Detailed Information on Ribbon Designer VS2008 is available at http://msdn.microsoft.com/en-us/library/bb386089.aspx. The Visual Studio product team received a lot of feedback on this issue. With Visual Studio 2010, designing a ribbon-based UI is made much easier with the “Ribbon Designer”.

 

During project creation, the Application Wizard allows you to select the ribbon style for your application. In addition to Office, Visual Studio and Windows Native which were available for Visual Studio 2008, Windows 7 ribbon style is also available in Visual Studio 2010.

Changing the style of the application can easily be done on the fly. At any time during application development, the style of the ribbon’s UI can be changed easily via the ‘style’ dropdown shown below. Changing the style of the ribbon only affects the ribbon’s appearance – it does not in any way disturb the functionality of your application.

With Dev10, creating a ribbon in your own style or adding/deleting a few tools from the existing Office/Windows ribbon is just a drag and drop action. Writing and debugging complicated UI code is now a thing of the past. Adding behavior to the ribbon’s tools is easily done by adding an event handler to each (explained later).

The following images show the variety of controls that can be used on the Ribbon.

Each control shown below can be designed using Ribbon Designer’s tool box shown here. This tool box can be viewed either by hovering the mouse over the ‘Toolbox’ in the Designer window or by using the menu View->Toolbox.  You can add the ribbon like any other resource (dialog, icon) with the Add Resource->Ribbon menu in the Resource view.

 

 

A ribbon resource, once created, can be added to an existing MFC application. To do so, modify the application to load the ribbon resource.

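// Declare the ribbon bar, typically as a member of the frame window class
CMFCRibbonBar m_wndRibbonBar;

// Typically in the frame's OnCreate handler: create and initialize the ribbon control
if (!m_wndRibbonBar.Create(this))
{
    return -1;    // returning -1 from OnCreate fails window creation
}

// Load the ribbon layout produced by the Ribbon Designer
if (!m_wndRibbonBar.LoadFromResource(IDR_RIBBON))
{
    return -1;
}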

Setting properties on a control makes it behave the way the user needs.

The image and Menu Items of the Button can be set using the properties window and can be viewed by Right clicking on Ribbon ->properties. Setting the properties in this window is same as writing the following code in CMainFrame.cpp.

You can double-click any control on the designer to open an Items Editor and add more items in its sub menu.  You can create event handlers for all other control events by using either the Properties window or right click on a control and choose ‘Add Event Handler’.

 

A Resource file in the solution, ribbonname.mfcribbon-ms, contains the property values of each control on the Ribbon.

For example, the following properties are equivalent to the Properties window shown. Values modified in the Properties window are reflected in this resource file.

 

With this Ribbon designer, our goal is to make your UI creation easy and flexible to change. Overall we believe that with this designer you will have your Application Ribbon easy to play with. We are excited about this feature and would like to hear back from you.

Posted by vcblog | 26 Comments

User Feedback

Hello! My name is Joshua Baxter, and I am a programming writer on the Microsoft team that produces Help content for Visual C++. I am writing this article to explain how Microsoft collects and handles user feedback.

 

User feedback is an important part of our documentation improvement program. We maintain over 33,000 topics about C++, and we receive an average of 350 comments from Help users every month. Not only does your feedback help us to improve the quality of existing documentation, it also helps us improve the quality of future documentation.

 

Most of the feedback we receive comes from ratings and comments that users submit through the MSDN Web site. At the top right corner of every topic is a link that you can use to send us feedback.

 

 

 

We encourage you to leave feedback that describes specific sections in the topic that are wrong, misleading, or confusing so that we can better understand how to fix the topic. Perhaps it goes without saying, but we cannot address generic comments (“bad topic”, “needs work”, “unclear”).

 

Occasionally, we receive user feedback from other sources. For example, Microsoft MVP Joseph Newcomer maintains MSDN Documentation Errors and Omissions, which at present contains more than 400 issues. Although we also address this kind of feedback regularly, we encourage you to submit your feedback on the MSDN Web site so that it gets to us faster.

 

When we receive feedback that a topic is inaccurate, we verify whether the concern is valid for both Visual Studio 2008 and Visual Studio 2010. Verification might involve testing the reported inaccuracy by using a code sample or by contacting a member of the development team. If a concern is verified, we revise the documentation as appropriate. Sometimes the topic is actually technically accurate, but it requires clarification or additional information.

 

Note: We do not maintain versions of the documentation that are earlier than Visual Studio 2008. If we receive feedback about earlier versions, we determine whether it also applies to Visual Studio 2008 or Visual Studio 2010, and then revise the documentation for those versions as required.

 

Although we address feedback and revise topics regularly, changes do not necessarily appear immediately on the MSDN Web site. All revised topics are reviewed to ensure technical accuracy, and this may take a while. Also, topics may have to wait in the MSDN publishing queue until the next scheduled update, which occurs every few weeks.

 

Again, we appreciate the feedback that we receive from our users. If you have feedback about a topic on the MSDN Web site, please click the feedback link at the top right corner of the topic and send us your comments.

Posted by vcblog | 4 Comments

Linker throughput

Hello, my name is Chandler Shen, a developer from the Visual C++ Shanghai team.

We have made some changes in the upcoming Visual C++ 2010 release to improve the performance of the linker. I would like to first give a brief overview of the linker and how we analyzed the bottlenecks of the current implementation. Later, I will describe the changes we made and the impact on linker performance.

Our Focus

 

We targeted linker throughput in the full build scenario for large scale projects, because this scenario matters most for linker throughput scalability. Incremental linking and smaller projects will not benefit from the work I describe in this blog.

Brief Overview of Linker

 

Traditionally, the work done by the linker can be split into two phases:

1.       Pass1: collecting definitions of symbols (from both object files and libraries)

2.       Pass2: fixing up references to symbols with final address (actually Relative Virtual Address) and writing out the final image.
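The two passes can be illustrated with a toy model. This is a deliberately simplified sketch, not how the Visual C++ linker is implemented: ObjectFile, collect_definitions and fix_up_references are hypothetical names, objects are simply laid out one after another, and libraries, sections and the output image are ignored.

```cpp
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Toy object file: the symbols it defines (as offsets within the object),
// the symbols it refers to, and its size in bytes.
struct ObjectFile {
    std::map<std::string, std::uint32_t> definitions;
    std::vector<std::string> references;
    std::uint32_t size = 0;
};

using SymbolTable = std::map<std::string, std::uint32_t>;

// Pass1: collect every symbol definition and assign it a final address.
inline SymbolTable collect_definitions(const std::vector<ObjectFile>& objects) {
    SymbolTable table;
    std::uint32_t base = 0;
    for (const ObjectFile& object : objects) {
        for (const auto& definition : object.definitions) {
            table[definition.first] = base + definition.second;
        }
        base += object.size;
    }
    return table;
}

// Pass2: replace every reference with the address found in Pass1.
inline std::vector<std::uint32_t> fix_up_references(
        const std::vector<ObjectFile>& objects, const SymbolTable& table) {
    std::vector<std::uint32_t> resolved;
    for (const ObjectFile& object : objects) {
        for (const std::string& name : object.references) {
            const SymbolTable::const_iterator it = table.find(name);
            if (it == table.end()) {
                throw std::runtime_error("unresolved external symbol: " + name);
            }
            resolved.push_back(it->second);
        }
    }
    return resolved;
}
```

The split matters because a reference can only be fixed up once every object has been seen: a symbol used in the first object may be defined in the last one.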

Link Time Code Generation (LTCG)

If /GL (Whole Program Optimization) is specified when compiling, the compiler will generate a special format of object file containing intermediate language. When the linker encounters such object files, Pass1 becomes a two-phase procedure. From these object files, the linker first calls into the compiler to collect definitions of all public symbols and build a complete public symbol table. Then the linker supplies this symbol table to the compiler, which generates the final machine instructions (code generation).

Debug Information

During Pass2, in addition to writing the final image, the linker will also write debug information into a PDB (Program Database) file if the user specifies /DEBUG (Generate Debug Info). Some of this debug information, such as the addresses of symbols, is not decided until link time.

Bottlenecks

In this section, I will show how we analyze some test cases to figure out bottlenecks of performance.

Test Cases

To get an objective conclusion, we chose four real world projects of different scale (names omitted) as test cases: proj1, proj2, proj3 and proj4.

Table 1 Measurements of test cases

                   Proj1     Proj2     Proj3     Proj4
Files   Total         55        27       168      1066
        .obj           4         6         7       882
        .lib          51        21       161       184
Symbols             6026     22436     69570    110262

In Table 1, the number of “symbols” is the number of entries of the symbol table which is internally used by linker to store the information of all external symbols. It is noticeable that “proj4” is much bigger than others.

Test Environment

The test machine was configured as follows:

·         Hardware

o   CPU: Intel Xeon 3.20 GHz, 4 cores

o   RAM: 2 GB

·         Software: Windows Vista 32-bit

Results

To minimize the effect of the environment, each case was run five times. All times are in seconds.

Table 2 and Table 3 show that for each test case there is always one run (usually the first) that takes much longer than the others, while another run may be much shorter. There are two reasons for this:

·         The OS caches a file’s content in memory for the next read (called prefetch on Windows XP, and SuperFetch on Windows Vista)

·         Most modern hard disks cache a file’s content for the next read

Comparing Table 2 with Table 3, we can see that Pass2 is much shorter when /DEBUG is off. This indicates that most of Pass2 is spent writing the PDB file.

Table 2 Test result of Non-LTCG with /Debug On

Project  Run      Pass1      Pass2      Total
Proj1      1      4.437      2.328      6.765
           2      0.266      1.218      1.484
           3      0.265      1.188      1.453
           4      0.265      1.219      1.484
           5      0.235      1.375      1.610
Proj2      1      9.484     15.766     25.250
           2      1.531      8.188      9.719
           3      1.579      8.078      9.657
           4      1.625      7.890      9.515
           5      1.610      8.297      9.907
Proj3      1     27.266     43.687     70.953
           2      4.250     17.672     21.922
           3      4.141     17.265     21.406
           4      4.203     18.500     22.703
           5      4.688     19.078     23.766
Proj4      1     47.453     70.172    117.625
           2     17.250     59.813     77.063
           3     17.547     55.672     73.219
           4     16.516     47.172     63.688
           5     14.937     44.079     59.016

 

Table 3 Test result of Non-LTCG with /Debug Off

Project  Run      Pass1      Pass2      Total
Proj1      1      0.187      0.078      0.265
           2      0.218      0.031      0.249
           3      0.187      0.047      0.234
           4      0.203      0.031      0.234
           5      0.187      0.031      0.218
Proj2      1      6.209      0.297      6.506
           2      1.310      0.187      1.497
           3      1.295      0.187      1.482
           4      1.342      0.203      1.545
           5      1.310      0.203      1.513
Proj3      1     15.382      0.764     16.146
           2      3.541      0.546      4.087
           3      3.650      0.562      4.212
           4      3.557      0.546      4.150
           5      3.588      0.562      4.150
Proj4      1     12.059      1.856     13.915
           2     10.811      1.778     12.589
           3     10.874      1.809     12.683
           4     12.855      1.794     14.649
           5     10.796      1.778     12.574

 

It is highly recommended that users use /LTCG (Link-time Code Generation) to optimize applications. The test results with /LTCG are shown in Table 4.

Table 4 Test Result of LTCG with /Debug On

Project  Run      Pass1      Pass2      Total
Proj1      1    178.797      1.734    180.531
           2    155.593      0.954    156.547
           3    153.750      1.031    154.781
           4    152.562      0.891    153.453
           5    153.156      0.797    153.953
Proj2      1    120.375      5.546    125.921
           2    102.343      5.172    107.515
           3    102.203      5.235    107.438
           4    102.016      5.343    107.359
           5    102.250      5.078    107.328
Proj3      1    222.859     20.719    243.578
           2    185.281     22.437    207.718
           3    184.984     21.422    206.406
           4    185.203     22.656    207.859
           5    186.078     22.844    208.922
Proj4      1    522.329    122.984    645.313
           2    490.188     54.406    544.594
           3    441.125     51.860    492.985
           4    430.609     51.813    482.422
           5    437.344     49.750    487.094

 

Observations

Based on the above results and other investigation, we made the following observations:

1.       If /LTCG is used, most of the link time is spent on code generation (a compiler task) in Pass1.

2.       OS caching of input files decreases the time spent in both passes quite a lot.

3.       The majority of the time spent in Pass2 goes to writing the PDB file.

Linker changes and impact in VS2010

Multi-threading during Pass2

After some investigation, we decided to introduce a dedicated thread for writing PDB files because:

1.       Most users specify /DEBUG when linking, irrespective of whether the application is built under the “debug” or “release” configuration.

2.       The data written into the final binary does not depend on the result of writing the PDB file, and vice versa; i.e., the binary writing task is independent of the PDB writing task.

3.       When the project is big, the linker has much other work to do during Pass2 in addition to writing the PDB file, such as reading data from object files and libraries.
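The second point is what makes a dedicated thread safe. Below is a minimal sketch of the idea using std::thread. It is not the linker's actual implementation: the LinkOutputs structure and the function name are hypothetical, and strings stand in for the image and the PDB.

```cpp
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-ins for the two independent Pass2 outputs.
struct LinkOutputs {
    std::string image;     // stands in for the final binary
    std::string debugInfo; // stands in for the PDB contents
};

LinkOutputs WriteOutputsConcurrently(const std::vector<std::string>& sections,
                                     const std::vector<std::string>& symbols) {
    LinkOutputs out;

    // Dedicated thread: writes debug information and touches only out.debugInfo.
    std::thread pdbWriter([&out, &symbols] {
        for (const std::string& s : symbols) {
            out.debugInfo += s + ";";
        }
    });

    // Meanwhile the calling thread writes the image and touches only out.image.
    for (const std::string& s : sections) {
        out.image += s;
    }

    pdbWriter.join(); // both outputs are complete once the writer has finished
    return out;
}
```

Because neither thread reads what the other writes, no locking is needed; the only synchronization point is the join at the end.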

Results

The following tables compare linker performance between VS2010 and VS2008 SP1. To remove the effect of caching, we rebooted the test machine (with SuperFetch disabled) before each run. For ease of comparison, the uncached times of the old linker (the first runs in Table 2 and Table 4) are listed as well.

Table 5 Test Result of new linker, Non-LTCG with /Debug On

          New linker (VS2010)             Old linker (VS2008 SP1)         Pass2       Total
          Pass1     Pass2     Total       Pass1     Pass2     Total       Improved    Improved
Proj1     3.547     1.859     5.406       4.437     2.328     6.765       20.15%      20.08%
Proj2     9.797    10.266    20.063       9.484    15.766    25.250       34.89%      20.54%
Proj3    17.078    22.609    39.687      27.266    43.687    70.953       48.25%      44.07%
Proj4    47.500    54.281   101.781      47.453    70.172   117.625       22.65%      13.47%

 

Table 6 Test Result of new linker, LTCG with /Debug On

          New linker (VS2010)             Old linker (VS2008 SP1)         Pass2       Total
          Pass1     Pass2     Total       Pass1     Pass2     Total       Improved    Improved
Proj1   153.516     0.953   154.469     178.797     1.734   180.531       45.04%      14.44%
Proj2   119.703     5.391   125.094     120.375     5.546   125.921        2.79%       0.66%
Proj3   225.688    16.594   242.282     222.859    20.719   243.578       19.91%       0.53%
Proj4   525.375    80.375   605.750     522.329   122.984   645.313       34.65%       6.13%

 

From Table 5 and Table 6, it can be seen that multi-threading the linker has improved the performance of Pass2, and it is especially effective for bigger projects.
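The “Improved” columns are consistent with the time saved relative to the old linker, i.e. (old - new) / old. A small helper (the function name is mine, not part of any tool) lets you reproduce the figures:

```cpp
// Percentage of time saved by the new linker relative to the old one.
double ImprovementPercent(double oldSeconds, double newSeconds) {
    return (oldSeconds - newSeconds) / oldSeconds * 100.0;
}
```

For example, ImprovementPercent(43.687, 22.609) gives the Pass2 improvement for Proj3 in Table 5, and ImprovementPercent(117.625, 101.781) gives the total improvement for Proj4.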

Future

We will continue to look into linker throughput even after the 2010 release to find areas to improve. If you have any suggestions or feedback, feel free to let us know.

 

Posted by vcblog | 22 Comments

Compiler Warning C4789

When Visual Studio 2010 ships, it will include improvements to warning C4789 that allow it to catch more cases of buffer overrun. This blog post covers what C4789 warns about and how to resolve the warning.

What does C4789 mean?

 

When compiling your source file, you may receive the warning: “warning C4789: destination of memory copy is too small.”

This message means that the compiler has detected a possible buffer overrun in your code.

Example 1

Let’s say we have the source file a.cpp that contains the following:

1: #include <memory.h>

2:

3: int p[1];

4:

5: void bar() {

6:     memset(&p[1], 1, sizeof(int));

7: }

 

From the “Visual Studio 2008 Command Prompt”, if you compile this with the command:

cl /c /O2 a.cpp

You will receive the warning:

a.cpp(6) : warning C4789: destination of memory copy is too small

For the example above, the compiler has detected a buffer overrun for the variable ‘p’. 'p' has been allocated as an array with one element. Arrays are zero-indexed, so the memset on line 6 is taking the address of the second element of an array; this means that we are actually writing to memory outside the array, corrupting memory!

In this case, the user most likely meant to memset the first element, and thus to fix this issue, the memset would be changed to

memset(&p[0], 1, sizeof(int));
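As a side note, one way to avoid this class of mistake is to let the compiler compute the size of the whole array instead of indexing by hand. This is my suggestion rather than part of the warning's documentation, and it only works where 'p' is a real array (not a pointer):

```cpp
#include <cstring>

int p[1];

void bar() {
    // sizeof(p) is the size of the entire array, so the fill cannot run past its end.
    std::memset(p, 1, sizeof(p));
}
```

If the array later grows to more elements, the call keeps covering exactly the array without further edits.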

 

Typical User Scenarios

 

In practice, a lot of buffer overruns will not be as obvious as Example 1, so I’ll provide some more examples to help you in your investigations.

Example 2

Let’s say we have the source file a.cpp that contains the following:

1:  short G1;

2:

3:  void foo(int * x)

4:  {

5:      *x = 5;

6:  }

7:

8:  void bar() {

9:     foo((int *)&G1);

10: }

From the “Visual Studio 2010 Command Prompt”, if you compile this with the command:

cl /c /O2 a.cpp

You will receive the warning:

a.cpp(9) : warning C4789: destination of memory copy is too small

In this example, we’ve created a variable 'G1' of type short (which is only two bytes), but we’ve taken its address and cast it to 'int *' to pass to 'foo'. 'foo' then writes 4 bytes to the memory location pointed at by 'x'. As 'G1' is only 2 bytes in length, the store “*x = 5” will write past 'G1', resulting in a buffer overrun.

There are a couple of important things to note about this example. This buffer overrun will only be caught with the improvements made in Visual Studio 2010. Also, this warning is caught by inlining 'foo' into 'bar'. This means that this buffer overrun is only caught when optimizations are enabled.

To fix the buffer overrun in Example 2, we declare 'G1' as int. If that isn’t an option, we can create a new variable to pass to ‘foo,’ and assign that variable to ‘G1’ (which truncates the int to a short):

   int y;

   foo(&y);

   G1 = y;

 

Example 3

Let’s say we have the source file a.cpp that contains the following:

1:  int G1;

2:  int G2;

3:

4:  void foo(int ** x)

5:  {

6:      *x = &G2;

7:  }

8:

9:  void bar() {

10:     foo((int **)&G1);

11: }

From the “Visual Studio 2010 X64 Cross Tools Command Prompt”, if you compile this with the command:

cl /c /O2 a.cpp

You will receive the warning:

a.cpp(10) : warning C4789: destination of memory copy is too small

This example is just like Example 2, with one key difference: 'foo' now stores an “int *” into 'G1', which is declared as “int”. On x86 this store is harmless (int and int * are the same size, 4 bytes). However, on x64 “int” is 4 bytes and “int *” is 8 bytes, so the code is no longer correct when it runs on x64.
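One possible fix, assuming the intent is for 'G1' to receive the address that 'foo' produces, is to give 'G1' a pointer type so that the destination is pointer-sized on every platform. This is a sketch of that assumption, not the only way to resolve the warning:

```cpp
int G2;

void foo(int** x) {
    *x = &G2;       // stores a pointer: 4 bytes on x86, 8 bytes on x64
}

int* G1 = nullptr;  // pointer-sized destination instead of 'int G1'

void bar() {
    foo(&G1);       // no cast needed, so nothing hides a size mismatch
}
```

Removing the cast matters as much as changing the type: without it, the compiler rejects any call where the destination is not an 'int *'.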

C4789 False Positives

 

You may hit cases of C4789 where the warning is incorrect. This can happen because the compiler detects a buffer overrun along a code path that will never execute.

Example 4

1:  __int64 G1;

2:  int lengthOfG1 = 8;

3:

4:  void foo(char * x, int len) {

5:      if (len > 8) {

6:          x[8] = 1;

7:      }

8:  }

9:

10: void bar() {

11:     foo((char *)&G1, lengthOfG1);

12: }

From the “Visual Studio 2010 Command Prompt”, if you compile this with the command:

cl /c /O2 a.cpp

You will receive the warning:

a.cpp(11) : warning C4789: destination of memory copy is too small

In this example, the compiler thinks that ‘G1’ can be overrun because “x[8] = 1” would write outside ‘G1’. However, as long as ‘lengthOfG1’ is the correct length of ‘G1’, “x[8] = 1” will never execute for ‘G1’, and thus a buffer overrun will never occur.

For some of these false positives, the only option will be to disable the warning. In this particular example, however, changing “int lengthOfG1 = 8” to

const int lengthOfG1 = 8;

would solve the problem.

Workarounds

 

If you have proven that the warning is a false positive, there are a couple of different ways to disable the warning.

1.       Disable the warning for one function (recommended)

2.       Disable the warning for all functions

Disable the warning for one function

The compiler allows you to disable a warning for a particular function. This is done by putting

#pragma warning ( disable : 4789 )

before the function, and putting

#pragma warning ( default : 4789 )

after the function. This will disable the warning (in this case warning 4789) for that function (and any functions which inline it).

Example 5

1:  __int64 G1;

2:  int lengthOfG1 = 8;

3:

4:  #pragma warning ( disable : 4789 )

5:  void foo(char * x, int len) {

6:      if (len > 8) {

7:          x[8] = 1;

8:      }

9:  }

10: #pragma warning ( default : 4789 )

11:

12: void bar() {

13:     foo((char *)&G1, lengthOfG1);

14: }

15:

16: void bar1() {

17:     foo((char *)&G1, 9);

18: }

With the #pragma around ‘foo,’ you will receive no warnings; while without it you will receive the warnings:

a.cpp(13) : warning C4789: destination of memory copy is too small

a.cpp(17) : warning C4789: destination of memory copy is too small

You can also choose to disable the warning for one of the functions where the warning occurs. In the example above, we could put the #pragma around ‘bar’ instead of ‘foo’, and then we’d eliminate the warning for line 13, but still receive the warning on line 17.
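A variation worth knowing: “#pragma warning ( default : 4789 )” resets the warning to its default state, which is not necessarily the state it had before your function (for example, if the warning was already disabled by an enclosing header or by /wd4789). The push/pop forms of the pragma save and restore the previous state instead. Here is a sketch based on Example 5; I use long long in place of __int64 and guard the pragmas with _MSC_VER only so that the snippet also builds with other compilers:

```cpp
long long G1;          // 8 bytes, like __int64 in Example 5
int lengthOfG1 = 8;

#ifdef _MSC_VER
#pragma warning(push)            // remember the current warning state
#pragma warning(disable : 4789)  // suppress C4789 from here on
#endif
void foo(char* x, int len) {
    if (len > 8) {
        x[8] = 1;
    }
}
#ifdef _MSC_VER
#pragma warning(pop)             // restore whatever state was active before the push
#endif

void bar() {
    foo((char*)&G1, lengthOfG1);
}
```

The scope of the suppression is the same as in Example 5; only the way the previous state is restored differs.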

Disable the warning for all functions

If you need to ignore warning 4789 completely, you can specify /wd4789 on the command line.

cl /c /O2 /wd4789 a.cpp

This option isn’t recommended as it will hide potential buffer overruns in your code.

 

Posted by vcblog | 11 Comments

Windows SDK V7.0/V7.0A Incompatibility Workaround

Hi,

My name is Nada AboElseoud and I work in QA on the VC++ Libraries team. I joined MS in February 2009. I would like to talk here about an incompatibility issue with WinSDK v7.0*.

If you are a developer who has recently migrated to WinSDK v7.0 (standalone SDK) or v7.0A (inbox with VS 2010), you may encounter errors such as “The procedure entry point K32*** could not be located in the dynamic link library KERNEL32.dll” while running your application. This implies that you are running your application on an OS other than Windows 7 or Windows Server 2008 R2. This blog will explain this blocking issue and provide the workaround.

Let me first explain why this issue happens. For performance reasons, some APIs have been moved from Psapi.dll to Kernel32.dll in Windows 7 and Windows Server 2008 R2, and WinSDK v7.0* reflects these modifications to stay compatible with the new system DLLs. This is by design, but wait! If you link your application to Psapi.lib and then target any OS older than Windows 7 or Windows Server 2008 R2, you will get this runtime error. Breaking this down: all APIs from Psapi.dll are copied to Kernel32.dll in Windows 7 and Windows Server 2008 R2 (Psapi.dll itself remains unchanged), and linking to Psapi.lib marks these APIs as Kernel32 APIs so that they are loaded from Kernel32.dll instead. The list of these APIs follows.

//Snapshot from Psapi.h – WinSDK V7.0*

#if (PSAPI_VERSION > 1)

#define EnumProcesses               K32EnumProcesses

#define EnumProcessModules          K32EnumProcessModules

#define EnumProcessModulesEx        K32EnumProcessModulesEx

#define GetModuleBaseNameA          K32GetModuleBaseNameA

#define GetModuleBaseNameW          K32GetModuleBaseNameW

#define GetModuleFileNameExA        K32GetModuleFileNameExA

#define GetModuleFileNameExW        K32GetModuleFileNameExW

#define GetModuleInformation        K32GetModuleInformation

#define EmptyWorkingSet             K32EmptyWorkingSet

#define QueryWorkingSet             K32QueryWorkingSet

#define QueryWorkingSetEx           K32QueryWorkingSetEx

#define InitializeProcessForWsWatch K32InitializeProcessForWsWatch

#define GetWsChanges                K32GetWsChanges

#define GetWsChangesEx              K32GetWsChangesEx

#define GetMappedFileNameW          K32GetMappedFileNameW

#define GetMappedFileNameA          K32GetMappedFileNameA

#define EnumDeviceDrivers           K32EnumDeviceDrivers

#define GetDeviceDriverBaseNameA    K32GetDeviceDriverBaseNameA

#define GetDeviceDriverBaseNameW    K32GetDeviceDriverBaseNameW

#define GetDeviceDriverFileNameA    K32GetDeviceDriverFileNameA

#define GetDeviceDriverFileNameW    K32GetDeviceDriverFileNameW

#define GetProcessMemoryInfo        K32GetProcessMemoryInfo

#define GetPerformanceInfo          K32GetPerformanceInfo

#define EnumPageFilesW              K32EnumPageFilesW

#define EnumPageFilesA              K32EnumPageFilesA

#define GetProcessImageFileNameA    K32GetProcessImageFileNameA

#define GetProcessImageFileNameW    K32GetProcessImageFileNameW

#endif

Now it should be obvious why calling some API (say EnumProcessModules) produces the runtime error “The procedure entry point K32EnumProcessModules could not be located in the dynamic link library KERNEL32.dll”, which points to a different API name.

However, did you notice the IF condition involved?

#if (PSAPI_VERSION > 1)

This means that these APIs are defined/tagged only if the PSAPI_VERSION > 1. By default this value is set to 2 and _WIN32_WINNT is set to _WIN32_WINNT_MAXVER (which is 0x601 for Win7).
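If the effect of that #if is not obvious, the following self-contained sketch mimics it. Everything in it is a stand-in I made up for illustration (DEMO_PSAPI_VERSION, DemoEnumProcesses and K32DemoEnumProcesses are not SDK names); the real macros and functions come from the SDK headers and system DLLs.

```cpp
#include <string>

#ifndef DEMO_PSAPI_VERSION
#define DEMO_PSAPI_VERSION 2   // hypothetical stand-in for PSAPI_VERSION
#endif

// Hypothetical stand-ins for the two exports; each returns the export it models.
std::string K32DemoEnumProcesses() { return "KERNEL32.dll!K32EnumProcesses"; }
std::string DemoEnumProcesses()    { return "Psapi.dll!EnumProcesses"; }

// Same shape as the SDK snapshot: rename the API when the version is greater than 1.
#if (DEMO_PSAPI_VERSION > 1)
#define DemoEnumProcesses K32DemoEnumProcesses
#endif

std::string CallSite() {
    // The source says DemoEnumProcesses, but the preprocessor decides which
    // function is actually referenced.
    return DemoEnumProcesses();
}
```

With the version left at 2 the call site resolves to the K32 name, which is exactly the symbol an older Kernel32.dll does not export; compiling with DEMO_PSAPI_VERSION defined to 1 makes the same call site resolve to the Psapi name.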

Workaround

After reading about this issue, you may already have figured out the solution. Simply put, if you target any OS prior to Windows 7 or Windows Server 2008 R2, define _WIN32_WINNT to an earlier version (before 0x601) or define PSAPI_VERSION to 1.

For example:

cl /MD /EHsc  /D _WIN32_WINNT=0x501 mytest.cpp /link Psapi.lib

Or

cl /MD /EHsc  /D PSAPI_VERSION=1 mytest.cpp /link Psapi.lib

Does this mean that this generated exe will work fine on Windows7 and Windows Server 2008 R2 as well? Definitely. As stated above, Psapi.dll is not modified and the workaround is just loading the APIs from Psapi.dll.

Otherwise, if you are just targeting Windows7 (or Windows 2008 R2), to take advantage of this performance boost, you can simply use the default predefined macros or explicitly define them as below.

cl /MD /EHsc  /D _WIN32_WINNT=0x601 mytest.cpp /link Psapi.lib

Or

cl /MD /EHsc  /D PSAPI_VERSION=2 mytest.cpp /link Psapi.lib

 

Hope this is helpful :)

Nada  AboElseoud

 

Posted by vcblog | 3 Comments

Tag Parsing C++

Hello, my name is Thierry Miceli and I am a developer on the Visual C++ Compiler Front End team. Although our team is mostly known for writing and maintaining the part of the C++ compiler that analyzes your source code and builds an internal representation from it, a great deal of our effort in the last few years has been directed into servicing the IDE and improving the IntelliSense experience (refreshers here, here, and here).

Today, I am going to write about a new parser that has been specifically created to provide a fast and scalable way to extract information from C++ source code. This parser is one of our new additions to Visual Studio 2010 and we call it the “tag parser”.

The tag parser is used in Visual Studio 2010 to populate the SQL database that supersedes the NCB file. All of the browsing features of VC++ rely in some way on results provided by the tag parser. These include Class View, Call Hierarchy, Go To Definition/Declaration, Get All References, Quick Search, the Navigation Bar and VCCodeModel.

A Fuzzy Parser

It is a fuzzy parser, which means that instead of trying to strictly recognize and validate the full C++ syntax (we have an excellent compiler front-end to do that), it lazily matches an input stream of tokens against some patterns. This parser doesn’t populate a symbol table during parsing, it has no notion of types apart from built-in ones, it doesn’t build a full macro context, and its unit of translation is a single file (i.e. it doesn’t follow #include directives). Nevertheless, the parser is able to deal with all of C++, C++/CLI and IDL.

High level of tolerance to incomplete code and errors.

The tag parser doesn’t try to make sense of every symbol or identifier in the source code. It will be satisfied with being able to recognize the different parts of a declaration and their positions in the source file. If a name in the type specification of a declaration couldn’t get resolved by our C++ compiler this would not prevent the tag parser from recognizing the declaration and it will show up in Class View as in the example below.

The tag parser is somewhat analogous to a human reader of the source code that would just be looking at one unique declaration without knowing much about the rest of the project. He may not know what most identifiers actually represent but he can tell with a high level of confidence what the declaration is and locate its subparts.

In addition to the tolerance of ‘semantically’ incorrect code, which is a property of fuzzy parsers, the tag parser has heuristic-based error recovery for the most common causes of erroneous code during editing. For example, it will try to detect incomplete declarations or unclosed function bodies, as shown in the snapshot below.

Dealing with preprocessor conditional directives.

The tag parser’s main role is to extract information from the source code that is then consumed by the IDE browsing features. Because browsing features closely relate to the editing experience it is more useful that the tag parser generates a structured representation of the full source code as it appears in the editor rather than a representation of the code that would get compiled under a specific project configuration.

The tag parser deals with preprocessor conditional directives (#if, #ifdef, #ifndef, #else, #elif, #endif) in a special way. It incorporates the full code in each of the branches of preprocessor conditional directives but still only parses complete declarations. For example in the image below, both the inactive and active branches are parsed and Class View shows both function declarations.

The tag parser is also able to deal with more complex cases where a declaration is interrupted by one or more preprocessor conditional directives. For example in the image below both of the declarations that can be induced by the 2 branches are parsed and reported.
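To give an idea of what “interrupted by a preprocessor conditional directive” looks like, here is a small example of my own (the names DEMO_WIDE and DemoLength are made up, and this is not the code from the screenshot). The function header is split across the two branches while the body is shared:

```cpp
#include <cstddef>

#define DEMO_WIDE 1  // hypothetical configuration switch

#if DEMO_WIDE
std::size_t DemoLength(const wchar_t* text)
#else
std::size_t DemoLength(const char* text)
#endif
{
    // Shared body: count characters up to the terminating zero.
    std::size_t n = 0;
    while (text[n] != 0) {
        ++n;
    }
    return n;
}
```

A compiler sees only the branch selected by DEMO_WIDE, whereas a parser that considers both branches can report both the wchar_t and the char overloads of DemoLength.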

Faster and scalable

Tag parsing scales because it is incremental – it doesn’t need to re-parse hundreds (or thousands) of compilation units after a header file is changed, as is often the case in an actual build.  It is also faster than a full compiler (despite its heuristics) because it is not burdened by macro expansion and full semantic resolution.  Thus it is well suited to capture real-time information for even the largest projects.

No built-in semantic resolution

Since the tag parser operates strictly on a per-file basis, certain semantic resolutions are left to its clients. For example, since function declarations and definitions typically appear in separate files, the tag parser reports a function declaration and its definition separately without any binding information. Therefore Class View has to match a function declaration and its definition so that they appear as a single entry in the Class View tree.

The tag parser is light-weight and this comes with some responsibilities on the side of the consumers of the parser results. The good thing here is that clients only have to incur the cost of building the semantic knowledge that they need and they can dig into the data with SQL now.

Hint Files

We tried to make the tag parser as standalone as possible. It doesn’t need to know about any kind of project configuration (include paths, compiler switches, etc.). In many cases the tag parser could be invoked with a source file name as its only argument and it would do an excellent job of extracting detailed information about the code in this file. The only caveat is preprocessor macros that interfere with the C++ syntax so badly that fuzzy parsing and error recovery heuristics cannot make sense of the code. One example of such a macro is STDMETHOD; when expanded, it generates a member function signature from something like:

STDMETHOD(OnDocWindowActivate)(BOOL fActivate)

 

You’ll have a hard time guessing what the above line means if you don’t know what STDMETHOD is. Since the tag parser doesn’t follow #include directives and doesn’t perform SQL lookups into the symbol database*, it cannot discover macro definitions by itself. Nevertheless, its macro state can be preconfigured with what we call a ‘hint file’. A hint file simply contains the definitions of macros that are needed for the tag parser to correctly recognize your source code in the presence of macros that fundamentally interfere with the C++ syntax.

If you have Beta1 installed, you will find a “cpp.hint” file in your Visual Studio 2010 install directory under vc\vcpackages; this is the hint file for the VC and SDK library headers. Very often the tag parser will do just fine with only this preset hint file. Nevertheless, if your code or some third party library code you are using contains macros that tamper with the C++ syntax, you may need to set up your own hint file. The IDE will look for files named “cpp.hint” in the directory where your source files are located and in all the parent directories up to the root directory or until a file named “cpp.stop” is found. All the hint files that are found will be preprocessed to build the macro context before your files actually get parsed. I won’t go into more details about hint files for now, but feel free to ask questions and, by the way, they will be thoroughly documented on MSDN.

Don’t worry too much if this machinery seems complex, most of the time you won’t have to define your own hint files or you’ll just need to drop a “cpp.hint” file with a few macro definitions in your project or solution directory.
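As an illustration, the sketch below shows the kind of macro definition a hint file would carry for STDMETHOD and what the declaration above turns into once the macro is known. The typedefs and the empty STDMETHODCALLTYPE are stand-ins I added so the snippet compiles without the Windows headers, the macro body is simplified from the SDK definition, and DemoSite is a made-up class.

```cpp
// Hypothetical stand-ins so the sketch compiles without the Windows headers.
typedef long HRESULT;
typedef int BOOL;
#define STDMETHODCALLTYPE

// The kind of line a cpp.hint file would contain (simplified).
#define STDMETHOD(method) virtual HRESULT STDMETHODCALLTYPE method

struct DemoSite {
    // Expands to: virtual HRESULT OnDocWindowActivate(BOOL fActivate)
    STDMETHOD(OnDocWindowActivate)(BOOL fActivate) { return fActivate ? 0 : 1; }
    virtual ~DemoSite() {}
};
```

Only the #define line would go into a hint file; the rest is there to show that, with the macro expanded, the declaration is an ordinary virtual member function that any parser can recognize.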

In the future we are planning to work on tools that will help you decide where hint files are needed and possibly generate them for you. And we will also work on making the tag parser act smarter in the presence of macros so that fewer hints need to be added to a hint file.

*In theory the tag parser could query the database for macro definitions when additional information is needed to recognize or disambiguate a declaration, but a reliable implementation of symbol lookups (even if it was only for macros) would push the tag parser away from being light-weight, standalone, incremental and independent of project configurations.

 

Posted by vcblog | 15 Comments