Hello, I’m Mark Hall, an architect in the Visual C++ group. I wanted to follow up on Jim Springfield’s previous blogs about the history of C++ intellisense and the changes we’re making in the upcoming Visual Studio 2010 release. It’s been almost a year since Jim’s posts, which can be found here:
Many thanks to Jim for his archeological dig through our old products and explanation of where we’re going in “Dev10”. I’d like to add a few more details, track our progress over the last year, and offer my perspective on our investment in C++.
When we first implemented intellisense for C++, it was easy to exceed expectations, mostly because customers weren’t expecting very much. It had to run fast, but if it didn’t work for templates or a complicated overload scenario, people didn’t complain.

Naturally we wanted to leverage the “front-end” of our C++ command line compiler for intellisense, but it wasn’t designed with intellisense in mind. It was designed to run in 256K of RAM on 16-bit DOS. It would parse and analyze one statement at a time and immediately lower everything to an assembly-level intermediate language. Parsing and semantic analysis were completely intertwined. A good architecture for intellisense separates parsing from semantic analysis: semantic analysis is expensive, so for intellisense you only want to do it for the one item the user is asking about.

We considered writing a new C++ front-end for intellisense. Thankfully we came to our senses – C++ is far too rich and complex. That would have taken much longer than one product cycle, and we needed something more expedient. So we “derived” our intellisense compiler from our command line C++ compiler through the liberal use of ifdefs in the code. We called the intellisense compiler FEACP, for “Front End Auto Complete Parser”. Our command line compiler is called C1XX, for “C++ phase 1” (turn the X’s sideways). The ifdefs removed all the C1XX code that did any “lowering” to the intermediate language. They also removed any semantic analysis we didn’t think was essential for intellisense (initialization, member offset computation, vtable production, etc.). CPU speeds at the time were only 100 MHz, so we had to shortcut a lot of C1XX code to populate dropdown windows within 100 milliseconds. We also removed all non-essential memory usage, storing far less information about each symbol we encountered. This was especially important because FEACP had to run in the IDE process, which already consumed copious amounts of memory.
FEACP was a product success, but a testing and maintenance nightmare. There were so many subtle dependencies on the code and data we had ifdef’d out that crashes were common, along with corruption of the NCB file. We paid a hefty toll in maintenance. Since FEACP and C1XX were so different, all our testing of C1XX (millions of lines of code daily) had little effect on the quality of FEACP; it had to be tested more or less independently. The lesson here is that supporting two different compilers in the same source base is only slightly smarter than supporting two completely separate compilers, and neither is a good choice for long-term maintainability. (Actually it was three compilers, since ANSI C was another set of ifdefs – and truth be told it was four if we include the /analyze compiler.)
In the years since we first released FEACP, we came to realize that we’d ventured as far into “ifdef hell” as we could go. At the same time, intellisense for simpler languages like Java and C# raised user expectations far beyond what we could support in FEACP. With CPU speeds and memory capacities 1000x what they had been, the (once valid) assumptions we made when we created FEACP no longer held. Moreover, the multitude of ifdefs (and the resulting compilation models) severely diminished our ability to add language features to C1XX, our bread and butter. Our bug counts were climbing, and the “language gap” was growing. At the same time, the number of people qualified and willing to work on a C++ compiler was shrinking. Something had to give.
With the speed and capacity of modern machines we knew one compiler could service both code generation and intellisense for C++. What we lacked was a high-level internal representation of the source code. If we had that, we could query the representation to service intellisense requests, or lower it to produce machine code. We wouldn’t have thousands of #ifdefs polluting the front-end code, and there would be just one model of compilation. Testing costs would be slashed dramatically. The barrier to entry for new compiler developers would be significantly reduced. But it wouldn’t be free – even with GHz clock speeds, you can’t run the full front-end over all the code for every intellisense request. We would have to create a new intellisense framework that could run the full front-end only on the regions of code that were necessary to produce a desired result. It would still be a lot of work, but we knew we could do it in a single product cycle. I’m happy (and relieved) to say that we did, and Dev10 is the result.
Having read this far you’re probably asking yourself, “OK, so you’ve lowered your cost of ownership. Good for Microsoft. But what’s in it for me?” The most compelling feature this brings to intellisense is accuracy. Visual C++ compiles millions of lines of C++ code daily in build labs and desktops all over the world, and does so with “five nines” of accuracy. Harnessing the command line compiler for intellisense means it will have the same level of accuracy. This is our biggest differentiator in the intellisense market for C++.
Being accurate means more than just getting the right set of members in an auto-complete dropdown – it enables other features that would be impossible or undesirable without it. For example, accuracy means that any errors encountered during the intellisense parse are real errors, and we can expose them to the user as “squiggles” in the editor window as they edit or browse code. They can fix the errors without leaving the editor. This cuts out saving files and kicking off builds to get the command line compiler to provide such diagnostics.
A future benefit of accuracy is that our data stores will be reliable enough to use for code analysis and transformations, such as assisted refactoring of source code. There wasn’t time in Dev10 to provide access to our expression-level data, but users will be able to browse the symbol store to extract symbol-level information about their source bases. In a future release we will provide user-facing APIs that expose accurate information about C++ source bases, opening up a whole new ecosystem of analysis, understanding, and productivity tools for C++ on Windows.
Dev10 is just the beginning.
Really interesting article. I look forward to the improved IntelliSense. What is the latest on availability of VS2010 betas?
I know about the CTP released last year, but I was wondering when I could get an installable beta release?
>> "Being accurate means more than just getting the right set of members in an auto-complete dropdown – it enables other features that would be impossible or undesirable without it. For example, accuracy means that any errors encountered during the intellisense parse are real errors, and we can expose them to the user as “squiggles” in the editor window as they edit or browse code. They can fix the errors without leaving the editor. This cuts out saving files and kicking off builds to get the command line compiler to provide such diagnostics."
I look forward to using it.
Does the new intellisense produce better information about scoping? One of my most desired features from other code completion systems is the listing order, from closest scope to most distant scope, instead of purely alphabetical.
So where does Phoenix come into this? Is it still slated to be the official VC++ compiler in some future release?
Really nice to hear!
Is the new intellisense available in the CTP? I haven't tried it yet.
Also, I hope the /analyze will be available in the professional edition too, at this point.
I'm looking forward to VS2010 already. My main annoyance when switching from C# to C++ was intellisense. In C# intellisense treated you like royalty; in C++, not so much ;).
Sounds great but will it feel as snappy on my 8GB quad core as VC6 felt on my old 2GB dual core?
So, were these error “squiggles” implemented in Dev10?
One of the biggest problems in intellisense is how it fails due to small syntactical errors in the code.
That said, I applaud your concentration on accuracy rather than "just getting it to work." Hopefully, though, your highest priority will be stability.
this is very interesting, thanks for the background & update.
Does this relate to Phoenix? Are you actually using the Phoenix framework and intermediate code reps?
Mark Hall again - thanks for all your comments. I’ll try to cover all the questions thus far.
Regarding dates for beta availability: I can’t divulge schedule dates, of course. I believe this is the site you want to watch for that info: http://www.microsoft.com/visualstudio/en-us/products/2010/default.mspx.
Regarding squiggles in Dev10: We’re evaluating them right now as we dogfood the product. So far, so good, but like many features, this one needs a lot of “bake time” - it will take a lot of use by a lot of different developers over a wide variety of coding styles before we know it’s “ship quality”. Parsing C++ exactly right in all its glory is already insanely hard – doing it incrementally, during live edits is, well, insaner. I’m confident we can fix any showstoppers that arise, but one too many would push squiggles out beyond Dev10. My advice is to keep reading vcblog, download the beta when it comes out, use the new features, and tell us if you run into problems.
Regarding stability in the face of small errors in the code – really this is the whole reason we’re moving away from FEACP. It was so fragile. And it never told you why you weren’t getting intellisense, so you could do precious little to get it working if it was broken. The great thing about squiggles is that now you’ll know why you aren’t getting intellisense, and where you need to fix your code. It’s actionable. So again, download the beta when it comes out, and help us bake Dev10.
Regarding scoping information: I’m sorry to say that this feature got cut. We have the data, but we don’t have the UI. It’s a great feature. I would use it a lot.
Regarding “snappy like VC6”: Visual Studio is a big product, and really the “snappy” feel has to come from everywhere in the product. I won’t be able to answer that until the whole product comes together. I will say this: if you loved C++ intellisense in VC6, you’ll love C++ intellisense in Dev10. But if you hated C++ intellisense in VC6 (header edits pegging the CPU, large projects hanging the system, results consistently wrong, tiny errors resulting in no intellisense at all, templates and namespaces unhandled), then you’ll really love C++ intellisense in Dev10!
Regarding using Phoenix as our intermediate rep: While we have made a lot of progress in the code generation and optimization space, Phoenix has not yet reached the level of quality necessary to ship it as part of dev10. Microsoft continues to invest in Phoenix and we will evaluate it for inclusion in a future version of Visual Studio. Using Phoenix to expose our expression-level data would make a lot of sense.
Thanks again. -markhall
Can you provide some benchmarks comparing VS10 (Phoenix), VS6.0, and other C++ compilers, showing speed improvements in executed code? Faster code is important to us in the image and numerical processing business.
Can you provide insight on how VS10 has made it easier to port Unix makefile-based C++ projects to VS10 projects and get a clean compile? This is an ongoing pain for us, as decent code for numerical and image processing almost always originates in a Unix makefile-based project on a Unix-like platform.