Hi, Jim Springfield again. This post covers our current work to fundamentally change how we implement Intellisense and code browsing for C++ in Visual Studio 10. I previously covered the history of Intellisense and outlined many of the problems that we face. See here http://blogs.msdn.com/vcblog/archive/2007/12/18/intellisense-history-part-1.aspx for that posting and more detailed information.
As Visual C++ has evolved over the years there has been tension between getting code information quickly and getting it accurately. We have moved from fast and not very accurate to sometimes fast and mostly accurate in Visual Studio 2008. The main reason for the slowness has been the need to reparse .cpp files when a header is changed. For large projects with some common headers, this can mean reparsing just about everything when a header is edited or a configuration or compile option is changed. We are mostly accurate except that we only capture one parse of a header file even though it could be parsed differently depending on the .cpp that includes it (i.e. different #defines, compile options, etc).
For Visual Studio 10, which is the next release after Visual Studio 2008, we are going to do a lot of things differently. For one, the NCB file is being eliminated. The NCB file was very similar to a BSC file and the IDE needed to load the entire thing into memory in order to use it. It was very hard to add new features to it (i.e. template support was bolted on) and some lookups required walking through a lot of information. Instead of this, we will be using SQL Server Compact for our data store. We did a lot of prototyping to make sure that it was the right choice and it exceeded our expectations. Using Compact will allow us to easily make changes to our schema, change indexes if needed, and avoid loading the entire thing into memory. We currently have this implemented and we are seeing increased performance and lower memory usage.
SQL Server Compact is an in-process version of SQL that uses a single file for the storage. It was originally developed for Windows CE and is very small and efficient, while retaining the flexibility of SQL.
Also, there is a new parser for populating this store. This parser performs a file-based parse of the files in the solution in a way that is independent of any configuration or compile options and does not look inside included headers. Because of this, a change to a header will not cause a reparse of all the files that include it, which avoids one of the fundamental problems today. The parser is also designed to be extremely resilient: it can handle ambiguous code and mismatched braces or parentheses, and it supports a “hint” file. Due to the nature of C/C++ macros, and because we aren’t parsing into headers, there is a good bit of code that would otherwise be misunderstood. The hint file contains definitions for certain macros that fundamentally change the parsing, and therefore the understanding, of a file. As shipped, the hint file will contain all known macros of this type from the Windows SDK, MFC, ATL, etc. It can be extended or modified, and we are hoping to be able to identify potential macros in the source code. Since we will be looking at every header, we want to be able to propose recommended additions to the hint file.
However, this style of parsing means that we don’t get completely accurate information in all cases, and complete accuracy is especially desirable in the Intellisense scenarios of auto-complete, parameter info, and quick info. To handle this, we will do a full parse of a translation unit (i.e. a .cpp file) when it is opened in the IDE. This parse will be done in the fullest sense possible and will use all compile options and other configuration settings. All headers will be included and parsed in the exact context in which they are used. We believe we can do this initial parse for most translation units very quickly, with most taking no more than a second or two. It should be comparable to how long it takes to actually compile that translation unit today, although since we won’t be doing code generation and optimization, it should be faster than that. Additionally, this parse will be done in the background and won’t block use of the IDE. As changes are made to the .cpp or included headers, we will track the edits and incrementally reparse only those bits that need to be parsed.
The database created from the file-based parse will be used for ClassView, CodeModel, NavBar, and other browsing based operations. In the case of “Find All References”, the database will be used to identify possible candidates and the full parser will then be used to positively identify the actual references.
We have also been thinking about some longer-term ideas that build on this. One is using a full SQL Server to store information about source code, which multiple people could share. For example, you could do a “goto definition” in your source and be taken to a file that isn’t even on your machine. This could be integrated with TFS so that the store is automatically updated as code is checked in, potentially even allowing you to query across history. Another idea would be to populate a SQL database with information from a full build of your product. This would include very detailed information (i.e. like a BSC file) but shared among everyone and covering all parts of an application. This would be very useful, as you could identify all callers of a method when you are about to make a change to it.
What are your ideas if you had this type of information? Let us know!
Visual C++ Architect
I think that you are going to do *great* work on the VC++ IDE! I hope you really make "10 is the new 6" happen :-)
One thing you could implement with your SQL-based system is the following: make it possible to query the DB resulting from parsing the C++ source code, for things like: "give me all ANSI strings in this source code file", so a macro could convert all the ANSI strings in source code into Unicode version (decorating them with L"..."), or decorate them with _T("...").
cool :) i will be looking forward to this.
Several things occur to me reading this.
One, I'm pretty sure you know this, but files can be included more than once in a translation unit, and their effects can be different each time. You used the singular in your article.
On a related point, we have some legacy code with oddball names for included code chunks. When we browse into one of those files it isn't treated as C++ even though the compiler and intellisense have both parsed it recently. That is probably not something you'll want to change but it violates the principle of least astonishment, at least for me.
The other issue is that good intellisense is *more* important for broken code than it is for working code. If you can make that work properly I'll finally start believing IDEs may have a future :-) Whatever "properly" means in the context of broken code. What I mean is that it helps me understand the cryptogram the compiler emitted and investigate how it might be fixed. Giving me the wrong expansions for all the DLL import/export and UNICODE and X64 conditional macros does not help at all. Failing to provide intellisense / browse functionality for a source file that doesn't scan is better but still not optimal.
It would be an *enormous* help if I could get a "preprocessed" view of my file without having to check out and modify the project file, recompile the translation unit (including finding it in the tree view if the error navigation took me straight to the header), browse to $(IntDir) and find the .asm file, search for the right spot, and then remember to undo my change to the project file before checking in.
And while I'm here, any chance of fixing the bug that causes the permanent hourglass when you right-click in a source window while intellisense is restarting after a build? The window for that bug may be small on the toys you guys compile in your labs but with 4+ million lines of code in 200+ projects in one solution you could drive a bus through it. The only recovery is to kill the process and start over.
OK, one more thing related to source browsing: When I fix an error reported in the 3-hour-long full build and do my check build on the modified sources, my full build log is gone. I have to nurse it along, trading off the need to test my changes incrementally against the need to fix more than one error per day. There should be a way to push-and-pop the log or perhaps numbered output windows like you have for find.
Related but less fatal: when you double-click on an item in the Errors window it syncs the "Build" output window but not the "Build Order" output window. When you have 40,000 lines of build output it can take a while to locate the error you were working on.
And one more thing :) select-and-copy are a bit funky in the build output windows. It seems to often take more than one try to get a good copy in the buffer to paste from. But that could be a VMware issue.
Yes, I know, Team Build fixes all ills. Except it doesn't work with third-party compilers (Intel Visual Fortran, in our case) and we're still porting and are weeks away from being able to build everything. It doesn't make much sense to start producing 2000 work items per build right now.
Thanks for listening!
Any chance this will help out with general C++ code editing? I'm specifically comparing it to C#, where the IDE is just a dream to work with.
For example, if I type:
MyClass c = new
the IDE will pop up a little window that contains the line:
MyClass
and all I have to do is hit enter. (It also helps me type "new", "int", and "MyClass".) This sort of user friendliness is *all over* C#, and I really miss it when I do my C++ work.
mos: You should really try Visual Assist X: it is great for C++!
Has anyone checked out the Eclipse CDT project?
I've heard it handles this sort of stuff really well.
Using a SQL database as a code reference is a great idea (IMHO). It could be used for code refactoring for example, or all these C# available features that are missing in the C++ IDE (as Mos says, above).
Also, you say that "SQL Server Compact is very small and efficient, while retaining the flexibility of SQL". Yep, I agree with that. But using SQL Server Compact in native code is not easy at all... You know that since you've done it.
Using SQL Server Compact could be useful in many native C++ applications (it is in the Visual C++ IDE! ;)). But there is nothing in MFC to use it. Not even a sample.
Firstly, I have no idea why you're tooting your horn about getting rid of the NCB file if all you really mean is that you're taking one totally opaque file and replacing it with another totally opaque file.
Secondly, the most useful thing to me would be if you honored the 'OutputDirectory' or 'IntermediateDirectory' settings in my project. I tend to do development on a flash drive (why? because!, that's why!). It's set up to drop all of the large, constantly changing files onto a temp directory on the real hard drive -- except that I can't redirect the huge, slow NCB file. Which is constantly being 'rebuilt'.
All I can suggest, Jim, and I know you guys are having a tough time, is to approach the C++ community out there (especially on Windows, where it is still big and very disappointed with .NET bloatware after it was used and sold for the last 7 years):
Get some funding.
Don't buy other companies.
Put money into the VC++ team.
Employ more people.
Talk to Bill and Steve.
Engage the community.
Don't fight Boost.
Open up the compiler.
With the in-process database I don't see much improvement over the current situation (other than it could be more performant), but an out-of-process SQL Server session could be incredibly powerful, so long as you detail the specification and have it under a license which allows others to use it. Not only would it be able to do the things you described (after weaving it with TFS), but with some extensions it could do much more powerful things like:
scan your code for possible cases of dereferencing null pointers;
make sure objects instantiated with the new keyword have a corresponding delete;
point out possible stack/buffer overflows;
discover redundant code;
find similar code.
I totally disagree with your statement "The other issue is that good intellisense is *more* important for broken code than it is for working code".
In general, coding is putting new code around functions to glue together existing code from an _established_ codebase/toolbox/library that you, a coworker, or a third party wrote before. Because you don't remember the exact parameters and functions, you use intellisense to look them up.
But the new code that you are writing, and that gets "broken" as you edit it, is definitely in your "zone". Being in that "zone" means you remember every bit and don't need intellisense while editing it.
That's why I would expect intellisense to only provide information about the established codebase. But this information should be exact, and all macros in the translation unit should be handled as the compiler handles them.
Why don't you just parse the already-preprocessed C++ code that is passed to the compiler, by redirecting a copy of it?
Thank you for the great post (as usual). I see massive potential in MS VC++ and I'm glad you are in the thick of it!
I hear you on the lack of info on using SQL CE with MFC. I spent far too long figuring this out (and ended up using non-Microsoft-provided details on how to do it: sad but true). I've complained to deaf MS ears, so I'll say it again: Microsoft, please provide more *NATIVE* MFC C++ sample code for new technologies.
I agree with you 100% well said!!
I also agree that your proposed (cutting edge) features would be hugely beneficial. I wish MS invested more in quality & useful VC++ features and less on .NET bloatware.