Exploring the Visual C++ Browse Database

Exploring the Visual C++ Browse Database

  • Comments 7

Hello, this is Jim Springfield.  This post will cover some low-level details about how we represent information in the browse database in VS 2010.  As I’ve mentioned in a previous post, we are using SQL Server Compact Edition (SSCE) for storing information about all of the C, C++, and IDL files in your solution.  I will show some SQL examples that illustrate how to mine this database for information about your code.

NOTE: The particular database schema we use in VS2010 may very well change in future versions, so the examples I show may not work in future versions of VS.

Opening the SDF file

The database file (SDF) can normally be found in the same directory as your solution file (SLN), although it may get relocated if your SLN is on a network share or a flash drive.  You can open the SDF using several different tools.  SSMS (SQL Server Management Studio) is a good one if you have access to it.  However, you can also open it in Visual Studio itself.  Before opening the SDF to play with, it is best to close the associated solution. 

1.       Within VS, go to the “Server Explorer” window, right click on “Data Connections”, and select “Add Connection…”.

2.       Select “Microsoft SQL Server Compact 3.5” for the Data Source and click “Continue”.

3.       Click the “Browse…” button and then navigate to the SDF file to open.

4.       If the SDF file is large (i.e. > 250MB), you will need to set an option before opening.  To do this, click the “Advanced…” button and set “Max Database Size” to something larger than the SDF you are opening.  4091MB is the largest value allowed and you can just use that if you wish.

5.       Finally, click the “OK” button and VS should open your SDF file.

Learning your way around

If you expand the new node for your SDF in Server Explorer, you will see a “Tables” node.  Expand that and you can see all of the tables that are currently defined.  The “code_items” table contains information on every definition and declaration that occur in your source.  I don’t have space to cover what all of the tables do, but take a look around.  Don’t expect to see data in the refs or symbols tables as those are there for some possible future use. 

Note:  Don’t try to make changes to the schema or indexes and expect that to persist.  When we open the SDF for actual browsing use, we do a consistency check and if anything is not “correct”, we delete the SDF and rebuild it.

Take a look at the “code_item_kinds” table.  You can see the contents by right-clicking and selecting “Show Table Data”.  There should be 59 entries here.  These values are used in the “code_items” table and you can use them in queries to find certain types of code items.

Creating a query

Right click on the “code_items” table and select “New Query”.  You will get a query window with some tools that help you build and run queries.  A window will popup asking you to add a table.  Just click close as you will just be copying queries into the query window for now.

Try this query first to get your feet wet.  To run a query, click the red exclamation icon in the toolbar or press “Ctrl+R”.

select * from code_items where kind=1

This query returns all code items that are C++ classes.  If you look at all of the columns, most should make sense.  Note that the database gives start and end position information for the entire class and just the name portion of the code item.  It is hard to see what file a code item comes from, however, as the code_items table uses a file_id and not a filename.  To see the filename as well, try the following query which does a join with the files table to get the filename.

SELECT     f.name, ci.*

FROM       code_items AS ci

           INNER JOIN files AS f ON ci.file_id = f.id

WHERE     (ci.kind = 1)

One thing to notice is that this query returns information on code items that occur in your own code as well as in the SDK and other headers.  You can return information for files that are explicitly in your solution by using the “config_files” table.  This table has a column that indicates whether a file is implicit or explicit.  A file may have multiple entries in the config_files table as a file can be used in multiple configs/projects.  The “DISTINCT” keyword prevents returning duplicate copies of code items.

SELECT   DISTINCT f.name, ci.*

FROM     code_items AS ci

         INNER JOIN files AS f ON ci.file_id = f.id

         INNER JOIN config_files AS cf ON ci.file_id = cf.file_id

WHERE    (cf.implicit = 0) AND (ci.kind = 1)

The parent_id in the code_items table refers to another item in the code_items table.  Using this information, you can get some parent/child information.  The id of 0 is the global namespace.  So, to get all functions in the global namespace you could do this.

SELECT    *

FROM      code_items

WHERE     (kind = 22) AND (parent_id = 0)

You can also join the code_items table to itself to find a set of code_items whose parent matches some set of criteria.  The following example finds all functions whose parent code_item is named “ATL”.

SELECT     ci1.*

FROM         code_items AS ci1 INNER JOIN

             code_items AS ci2 ON ci2.id = ci1.parent_id

WHERE     (ci1.kind = 22) AND (ci2.name = 'ATL')

I have only scratched the surface of what you can do to mine the SDF for interesting information about your source code.  There are many other ways to leverage the data contained in the SDF to gather information about your source code.  All of our browsing features for C/C++ are implemented on top of the SDF.  We do take advantage of some caching, prepared commands, and the special “table direct” mode that SSCE provides in order to increase performance, but everything comes from the SDF at some point.

The query processor in SSCE is limited in some ways, but you could even replicate the data from the SDF into a full SQL database and perform even more complex queries than are allowed by SSCE.

 

Jim Springfield
Visual C++ Architect

  • Brilliant! We can now sneak into intellisense content itself.

  • Can you go through any VCCodeModel changes?  I'm having trouble where some previously working code now only returns exceptions when accessing VCCodeClass.DeclarationText in VS2010.

  • Thanks for the suggestion JK.  I will talk to the dev that worked on CodeModel and see about getting something written up.  If there is a specific bug you have, please open it on Connect.  connect.microsoft.com/VisualStudio

  • Hi JK,

    Please, have a look at the following blog post: blogs.msdn.com/.../visual-c-code-model-in-visual-studio-2010.aspx .  Here, we tried to enumerate differences in VCCM of VS2010.

    We're interested in source code that causes DeclarationText to throw. Could you please share it with us?

    Vytas/Visual C++ IDE

  • Since you have the old VC++6.0 (MFC and all) source code somewhere there why not just recompile it and update the compiler to produce modern cpu code and include the latest MFC.

    That would make the great 6.0 compiler available again. And kick all managed and .NET stuff to some other package, give them an other name C++NET and ManagedC++ etc. and keep the old VC++ as it was. It is crazy to try to integrate everything since it just greates problems and the compiler front end just gets worse and worse.

    If I had the 6.0 source code available I would just recompile it and swap the compiler to a later version and include fresh libraries.

  • Hi Janne, I always think about the same when I run VS ..I suppose that the reasons are that they designed the IDE to stand at the same level ( or better) amongst Eclipse, Netbeans and Co. where several languajes (Java, Php, JavaFX, Ruby, Python, etc) converge in the same IDE. After all, programming languajes share the same components ( statements, conditional, loops,etc). So there is some solid foundation in the current architecture of the IDE.

    But currently, we use the same IDE to develop programs for "managed code", "native code" and an "C++/CLI" hybrid.

    Managed Code: VB.NET, ASP.NET, C#, F#

    Native Code: C++

    Hybrid Code: C++ /CLI

    It would be better if C++ can have its own special IDE ( with the latests controls such as property grids, ribbons,etc) as in VC++ 6.0 and group the .NET languages in another IDE. After all, most .NET programmers don't like to program in C++.

    About the browse database, it has improved a lot since the bsc and ncb format, the database it seems to be updated in realtime and  is is not necessary to recompile the code to in order to update the database.

  • @ Vytas:  I've filed Connect bug 568823  which demonstrates the issue.

Page 1 of 1 (7 items)