Rebuilding Intellisense


Hi, my name is Boris Jabes. I've been working on the C++ team for over 4 years now (you may have come across my blog, which has gone stale...). Over the past couple of years, the bulk of my time has been spent on re-designing our IDE infrastructure so that we may provide rich functionality for massive code bases. Our goal is to enable developers of large applications that span many millions of lines of code to work seamlessly in Visual C++. My colleagues Jim and Mark have already published a number of posts (here, here and here) about this project, and with the release of Visual Studio 2010 Beta 1 this week, we're ready to say a lot more. Over the next few weeks, we will highlight some of the coolest features and also delve into some of our design and engineering efforts.

In this post, I want to provide some additional details on how we built some of the core pieces of the C++ language service, which powers features like Intellisense and symbol browsing. I will recap some of the information from the posts linked above, but I highly recommend reading them in full as they provide a ton of useful detail.

The Problem

Without going into too much detail, the issue we set out to solve in this release was that of providing rich Intellisense and all of the associated features (e.g. Class View) without sacrificing responsiveness at very high scale. Our previous architecture involved two (in)famous components: FEACP and the NCB. While these were a great way to handle our needs 10+ years ago, we weren't able to scale them up while also improving the quality of results. Multiple forces were pulling us in directions that neither of these components could handle.

1. Language Improvements. The C++ language grew in complexity, which meant constant changes in many places to make sure each piece was able to grok new concepts (e.g. adding support for templates was a daunting task).

2. Accuracy & Correctness. We needed to improve accuracy in the face of this complexity (e.g. VS2005/2008 often gets confused by what we call the "multi-mod" problem, in which a header is included differently by different files in a solution).

3. Richer Functionality. There has been a ton of innovation in the world of IDEs and it's essential that we unlock the potential of the IDE for C++ developers.

4. Scale. The size of ISV source bases has grown to exceed 10 million lines of code. Arguably the most common (and vocal!) piece of feedback we received about VS2005 was the endless and constant reparsing of the NCB file (this reparsing happened whenever a header was edited or when a configuration changed).

Thus, the first step for us in this project was to come up with a design that would help us achieve these goals.

A New Architecture

Our first design decision involved both accuracy and scalability. We needed to decouple the Intellisense operations that require precise compilation information (e.g. getting parameter help for a function in the open cpp file) from the features that require large-scale indexes (e.g. jumping to a random symbol or listing all classes in a project). The architecture of VS2005 melded these two in the NCB, and in the process lost precision and caused constant reparsing, which simply killed any hope of scaling. We thus wanted to transition to a picture like this (simplified):

At this point, we needed to fill in the blanks and decide how these components should be implemented. For the database, we wanted a solution that could scale (obviously) and that would also provide flexibility and consistency. Our existing format, the NCB file, was difficult to modify when new constructs were added (e.g. templates), and the file itself could get corrupted, leading our users to delete it periodically if things weren't working properly in the IDE. We did some research in this area and decided to use SQL Server Compact Edition, which is an in-process, file-oriented database that gives us many of the comforts of working with a SQL database. One of the great things about using something like this is that it gave us real indexes and a customizable, constant memory footprint. The NCB, on the other hand, contained no indexes and was mapped into memory.

Finally, we needed to re-invent our parsers. We quickly realized that the only reasonable solution for scalability was to populate our database incrementally. While this seems obvious at first, it goes against the basic compilation mechanism of C++, in which a small change to a header file can change the meaning of every source file that follows, and indeed every source file in a solution. We wanted to create an IDE where changing a single file did not require reparsing large swaths of a solution, thus causing churn in the database and even possibly locking up the UI (e.g. in the case of loading wizards). We needed a parser that could parse C++ files in isolation, without regard to the context in which they were included. Although C++ is a "context sensitive" language in the strongest sense of the word, we were able to write a "context-free" parser for it that uses heuristics to parse C++ declarations with a high degree of accuracy. We named this our "tag" parser, after a similar parser that was written for good old C code long ago. We decided to build something fresh in this case, as this parser operates quite differently from a regular C++ parser, is nearly stand-alone, and involved a lot of innovative ideas. In the future, we'll talk a bit more about how this parser works and the unique value it provides.

With the core issue of scalability solved, we still needed to build an infrastructure that could provide highly accurate Intellisense information.  To do this, we decided to parse the full “translation unit” (TU) for each open file in the IDE editor* in order to understand the semantics of the code (e.g. getting overload resolution right). Building TUs scales well – in fact, the larger the solution, the smaller the TU is as a percentage of the solution size.  Finally, building TUs allows us to leverage precompiled header technology, thus drastically reducing TU build times.  Using TUs as the basis for Intellisense would yield highly responsive results even in the largest solutions.

Our requirements were clear but the task was significant. We needed rich information about the translation unit in the form of a high-level representation (e.g. an AST), and we needed it available while the user was working with the file. We investigated improving on FEACP to achieve this goal, but FEACP was a derivation of our compiler, which was not designed with this in mind (see Mark's post for details). We investigated building a complete compiler front-end designed for this very purpose, but this seemed like an ineffective use of our resources. In the 1980s and 1990s, a compiler front-end was cutting-edge technology that every vendor invested in directly, but today, innovation lies in providing rich value on top of the compiler. As a result, there has been a multiplication of clients for a front-end beyond code generation, and we see this trend across all languages: from semantic colorization and Intellisense to refactoring and static analysis. As we wanted to focus on improving the IDE experience, we identified a third and final option: licensing a front-end component for the purposes of the IDE. While this may seem counter-intuitive, it fit well within our design goals for the product. We wanted to spend more resources on the IDE, focusing on scale and richer functionality, and we knew of a state-of-the-art component built by the Edison Design Group (commonly referred to as EDG). The EDG front-end fit the bill, as it provides a high-level representation chock-full of the information we wanted to build upon to provide insight in the IDE. The bonus is that it already handles all of the world's gnarly C++ code, and their team is first in line to keep up with the language standard.

With all these pieces in place, we have been able to build some great new end-to-end functionality in the IDE, which we’ll highlight over the coming weeks. Here’s a sneak peek at one we’ll talk about next week: live error reporting in the editor.

* - We optimize by servicing as many open files as possible with a single translation unit.

  • I noticed that Intellisense is no longer available for C++/CLI in the Beta.

    Is this intended to be remedied before RTM?

    I know your team seems not to care much about C++/CLI anymore, but some people still have projects to maintain.

  • Sunil -

    We completely empathize with the need to work with C++/CLI and we think it's a great way to do interop between native code and managed code. As part of this re-architecture, we had to make the difficult decision to reduce the scope to native C++ only for Intellisense. We still index symbols coming from C++/CLI code and you can browse them with Class View etc...

    While the lack of Intellisense for C++/CLI is unfortunate, we expect that it only represents a small portion of your source code that you don't need to edit nearly as often as the native code. Indeed, the only scenario we don't recommend is to use C++/CLI as a first-class .NET language. Instead, we think it's the ideal solution for interop.

  • There are two scenarios where this is inconvenient:

    1. Trying to use a native API in a file compiled with /clr (i.e. when you have included a native header and want to remember the parameters).

    2. Trying to use a BCL type in your project (I do not write a lot of managed code and found the Intellisense helped with API discovery).

    All I want to do is wrap a native library so that it can be called from a C# WPF application. I only want to write around two classes in C++/CLI. I don't need Intellisense for the C++/CLI code I am writing, but for my native types and the managed DLLs I am consuming.

    Surely this should be possible (i.e. get the EDG parser to ignore managed type declarations, parse ordinary C++ code, and just use reflection to provide Intellisense for existing managed DLLs).

    Btw, generally I am pleased with the IDE improvements, including squiggles.

  • Well, congratulations, I like the new Intellisense very much; from what I've tested so far, it's way better and more responsive than it was in VS2005/VS2008. Please do make some improvements to the code editor, though: richer syntax highlighting, support for different coding styles (K&R style / GNU style / user-defined). Just take a look at Eclipse CDT / the NetBeans C++ pack; they offer so many options for these things. I'm sorry to say it, but when it comes to pure text editing, even the venerable Vim surpasses the built-in editor of VS.

  • One of the things I really don't like about NetBeans is that there is constant movement in the editor while I type. It keeps deciding the line I am typing has an error, then it changes its mind and decides something else is an error, and the red underlines dance all over the context of my typing.

    I hope with VC++, the red underlines are not so distracting. Also, make sure they can be turned off.

  • Phil -

    I definitely hope our "underlines" aren't distracting. We've actually done some tweaks to improve the experience, and one of my colleagues is going to post a more detailed blog entry soon. As for turning them off: we've added a ton of "power user" options under Tools > Options > Text Editor > C/C++ > Advanced, including the one you ask for.

  • If we're talking text editors, there's a feature from TextPad I would love to see in Visual Studio:

    If you go to column X and type Y characters, then push down, you'll be back at column X again on the next line (not at column X+Y as with most editors including Visual Studio).

    Once you use an editor which does this you never want to go back, especially when editing monospaced, often-tabular data such as source-code.

    I'm looking forward to the new intellisense! I think that and the C++0x features will make me overlook my distaste for the new UI. (Do any teams at Microsoft know about "standard look & feel" anymore? :( If I wanted every application on my desktop to look completely different I'd be using Unix. :-)) I guess I'll have to get used to it, though, and the under-the-covers changes sound very compelling, so good work there.

  • Good, using SQL Compact is a good idea. I hope Intellisense improves speed in very large C++ projects of many millions of lines of code.

  • The other thing I noticed was that intellisense does not like R-value references and Lambdas. This is presumably because the EDG front-end does not support them yet.

    I hope this can get fixed before RTM.

  • Sunil -

    Good call on Intellisense with C++0x features. You'll see that coming in Beta2 :)

  • It feels a little scary to me to have one compiler provide typing support and preventative checking and another doing the actual compilation.

    Maybe it's not such a big deal, but I'd hate being stuck with a squiggly error mark for something that EDG doesn't accept, but cl.exe builds without a hitch.

    The other way around (i.e. EDG accepts something cl.exe doesn't) is almost as bad.

    Sounds promising otherwise!

  • Delphi has different front-ends and it commonly marks errors even if the code is correct!

    --

    About C++/CLI: Are you joking?! No Intellisense!?

    You should really make a decision: to support or not to support the language! You cannot ship "C++/CLI" templates and then not have Intellisense for the language! It doesn't make any sense!

    Statements like "you don't really require intellisense for C++/CLI" are terrifying!

    I use C++/CLI to expose the object model of the application I use, and having Intellisense disabled is much worse than using P/Invoke on each function call!

    Given this fact, I simply won't upgrade to VC++ 2010!

    (And I'm really tired of using features that are abandoned in the next VS version!)

  • I like the new IntelliSense, but it still seems lacking. For example, when starting to type something, IntelliSense STILL doesn't offer any suggestions (this works in other languages such as C# or with a 3rd-party IntelliSense tool such as Visual Assist X).

    And the other problem is that IntelliSense is still very, very slow. Indeed, it's so slow that it sometimes is hardly of any use! By this I mean the "live compilation" part; it's still responsive when offering up suggestions.

    The live compilation tends to mark stuff wrong for a long time before changing its mind and deciding it's right. This is not right and must be fixed.

    Also, the go to declaration and/or definition feature is much slower than the VS08 version. When I tried, it stood there blocking for something like half a minute before jumping to the declaration. This is inexcusable! I hope you can fix this before the RTM.

    On a side note, I like the new IDE very much, but as with all previous releases, it seems every release is also a step back - they are all slower than previous versions. In fact, the 2010 IDE is now so slow that it's counter-productive to use it. It uses up EVEN MORE memory than previous versions too! It's literally a memory hog. This is really something Microsoft MUST fix!

    On (yet another) side note... I couldn't find the VC++ paths in the options anymore... but I did find something in the project configuration? What is the reasoning behind this? It seems a step back to me.

  • It seems that for templates Intellisense doesn't work quite right, if I have something like this :

    template<typename T>
    struct list_node {
     struct list_node*     flink_;
     struct list_node*     blink_;
     T                     data_;
     list_node() :
       flink_( 0 ) ,
       blink_( 0 ) ,
       data_() {
     }
     list_node( const T& element ) :
       flink_( 0 ) ,
       blink_( 0 ) ,
       data_( element ) {
     }
     ~list_node() {
     }
    };

    template< typename T >
    class linked_list {
    private :
     list_node<T>*   sentinel_;
     size_t          items_;
    public :
     linked_list();
     ~linked_list();
     void  push_front( const T& element ) throw();
     void  push_back( const T& element ) throw();
     T*    pop_front( void ) throw();
     T*    pop_back( void ) throw();
     void  clear( void ) throw();
     inline size_t items( void ) const throw();
     inline bool empty( void ) const throw();
    };

    When inside a member function, typing sentinel_->blink_-> gives the "No members available" error. It works flawlessly for the non-template version, though.

  • Boris,

    You seem like a nice guy, Boris, but not supporting Intellisense for C++/CLI is a huge slap in the face for those of us trying desperately, with precious little support from Microsoft, to move existing MFC applications forward to WPF.  "Managed code is the future!" says Microsoft.  "But if you are one of those many existing applications that has helped make Microsoft Windows successful, and you are using C++/CLI to ease into managed code, No Soup For You!"

    With existing code and existing header file dependencies, even though we are doing everything we can to limit our use of C++/CLI, there are significant limits to the extent to which we can reduce it.  And the reality is that we have 3 developers who will be spending at least half of their time in C++/CLI code for the foreseeable future.  We have Intellisense now in C++/CLI code with Visual Studio 2008, and it actually works 70% of the time.  How do I explain to these developers (one of whom is me, by the way) that their productivity is to be halved so that native C++ developers and C# developers can have better Intellisense?

    If C++/CLI is not included, VC10 Intellisense is a failure.  Period.  End of discussion.

    Thanks,

    Eric
