
From Phoenix to Media Center...

After nearly six years working on Visual C++ and Phoenix, I will be taking on a new job at Microsoft on the eHome team.  I'm going to be working on the Media Center TV product to help bring the future of TV to you. 

It's been a pleasure talking with developers here, on MSDN forums, at conferences, and via email.  But my departure doesn't mean that you won't hear from me -- some of you also must watch TV, so I hope to continue having conversations with some of you over there.  It may not be in this particular blog, but I don't think it will be hard to find me...

Lastly, I hope you're enjoying Visual Studio 2008.  It's a great release.  And this division will continue to delight you in future releases -- even if I'm not actively working on them  :-)

Posted by kanggatl | 1 Comments

What Do You Want More Information About (with respect to Phoenix)?

As you can probably imagine, we are still hard at work on Phoenix (yes, a new version of the SDK is coming, although I don't have a date yet). One of the things that I'm very interested in is what YOU would like to see in the samples and documentation sections.  We want to make sure that we give you the best bang for the buck.

So please, leave a comment (or email) letting me know what code samples and/or documentation you'd like to see.  We'll do our best to make it happen either for this upcoming SDK or some future version of the SDK -- or maybe even this blog.

Posted by kanggatl | 0 Comments

What do c2 Phases do?

On the Forums, someone asked the good question of "What do the C2 phases actually do?"  Andy got some info from our documentation team about the phases, so I thought I'd also add them here.  Expect this info in a future version of the SDK:

 

| Phase | Phase Action |
| --- | --- |
| CxxIL Reader | Converts CxxIL to Phoenix IR. |
| Warnings Analysis Detection Phase | Detects and emits back-end warnings (C4700, C4701, and so on). |
| Add CallGraph Call Site Information | Modifies call graph (if present) to refer to specific call sites within the function. |
| Inliner | Inlines functions. |
| Type Checker | Verifies that IR is correctly typed. |
| Flow Optimization | Streamlines control flow (for example, eliminates jumps to jumps). |
| MIR Lower | Lowers object model instructions. |
| Loop recognition and loop transformations | Performs basic loop restructuring. |
| Global optimization | Performs optimizations such as constant propagation, global value numbering, common subexpression elimination, and dead code elimination. |
| Loop optimizations | Recognizes induction variables for strength reduction and loop-invariant code motion. |
| GS Shadow Copying | Implements the /GS option (stack security). |
| GS Security Cookie Allocation | Implements the /GS option (stack security). |
| X86 scalar Sse | Prepares the compiler to use SSE floating point instead of x87 floating point. |
| Canonicalize | Transforms IR to canonical form in preparation for lowering. |
| ShiftExpansion Strength Reduction | Transforms multiplications and divisions by constants into shifts and additions. |
| Address Mode Builder | Transforms HIR/MIR to take advantage of target architecture's built-in addressing modes. |
| Lower | Lowers most instructions to machine level. |
| SSA-based idioms optimization phase | Performs classic "peephole" optimizations. |
| Priority Order Register Allocation | Assigns registers to operands. |
| X87 Stack Allocation | Allocates the x87 stack. |
| GS Security Cookie Initialization and Check | Implements the /GS option (stack security). |
| Native EH Lower | Expresses exception handling the way that the run-time convention expects. |
| Stack Packer | Assigns stack locations to operands. |
| Frame Generation | Determines shape of stack frame for the function. |
| Switch Lower | Lowers switch instructions. |
| Dead Stores | Removes dead stores (that is, stores that are guaranteed not to be read). |
| Block Layout | Places IR in final order. |
| Flow Optimization | Removes jump-to-jump and jump-to-next instructions. |
| Finish EH Lower | Finishes the process begun by Native EH Lower. |
| Encoding | Produces machine encoding for IR, and debug information. |
| Emission | Puts encoded IR into the object file. |
| Emit Referenced Symbols | Verifies the references made from the encoded IR to ensure that the compiler emits a fully closed set of functions and data. |
| Assembly Listing | Produces the .asm listing file. |
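
To make one of the phases above a little more concrete, here is a small sketch of the kind of rewrite that the ShiftExpansion Strength Reduction phase is described as doing.  This is only an illustration in plain C++, not the Phoenix implementation; the function names are made up for this example, and it sticks to unsigned values because signed division needs extra rounding fix-up that is not shown here.

```cpp
#include <cstdint>

// x * 10 rewritten as (x * 8) + (x * 2), with each power-of-two
// multiplication expressed as a left shift.
constexpr std::uint32_t timesTenShifted(std::uint32_t x) {
    return (x << 3) + (x << 1);
}

// Unsigned x / 8 rewritten as a right shift by 3.
constexpr std::uint32_t divideByEightShifted(std::uint32_t x) {
    return x >> 3;
}

// Compile-time spot checks that the rewrites agree with the originals.
static_assert(timesTenShifted(7) == 7u * 10u, "shift form matches multiply");
static_assert(divideByEightShifted(100) == 100u / 8u, "shift form matches divide");
```

Because unsigned arithmetic wraps the same way in both forms, the shifted version matches the multiplication even when the result overflows 32 bits.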

Posted by kanggatl | 0 Comments

Finding the Base Class of a Function with Phoenix

This came up on an internal alias today, so I thought I would post the solution.

The issue arises when you have code like the following:

    class BaseClass
    {
    }

    class InheritedClass : BaseClass
    {
        void SomeFunction() { }
    }

If you have a pointer to the SomeFunction function unit, how do you find out the associated base class of that function (in this case it is BaseClass)?

The answer is: 

functionSymbol.EnclosingAggregateType.PrimaryBaseAggregateType

Posted by kanggatl | 3 Comments

volatile, acquire/release, memory fences, and VC2005

One of the more common questions I get about VC2005 code generation relates to the code generation of volatile on x86/x64.  If we take a look at MSDN we see that it defines the semantics for volatile in VC2005 as:

 

o    A write to a volatile object (volatile write) has Release semantics; a reference to a global or static object that occurs before a write to a volatile object in the instruction sequence will occur before that volatile write in the compiled binary.

 

o    A read of a volatile object (volatile read) has Acquire semantics; a reference to a global or static object that occurs after a read of volatile memory in the instruction sequence will occur after that volatile read in the compiled binary.

 

So, what does this mean for code that you might write?  Let's look at the Read Acquire semantics in an example.  In this example the volatile variable has the name 'V'.

 

Read Acquire Semantics:

Store A

Load B

Load V

Store C

Load D

 

The Read Acquire semantics say that Store C and Load D must remain below Load V.  Store A and Load B are not constrained by Load V (at least, they have no constraint as a result of the load acquire semantics, but other hardware constraints may constrain their movement).

 

Now let’s look at Store Release semantics.  Again, the volatile variable is 'V':

 

Store Release Semantics:

Store A

Load B

Store V

Store C

Load D

 

The store release semantics state that Store A and Load B must remain above Store V.  In this case Store C and Load D are not constrained by Store V (again, at least not with respect to the store release semantics). 
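
For readers who want to play with these two rules in code, here is a minimal sketch of the classic "hand off a value" pattern.  Note the assumptions: it uses C++11 std::atomic with explicit acquire/release ordering (which postdates VC2005) as a portable stand-in for a VC2005 volatile, and the function name handOff is just something I made up for the example.  The atomic flag plays the role of 'V'.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: passes `value` from a producer thread to a consumer
// thread and returns what the consumer observed.
int handOff(int value) {
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};  // plays the role of the volatile 'V'
    int seen = -1;

    std::thread consumer([&] {
        // "Load V" with acquire semantics: the read of payload that follows
        // in program order cannot be moved above this load.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;
    });

    std::thread producer([&] {
        payload = value;  // earlier store: must stay above the release store
        // "Store V" with release semantics.
        ready.store(true, std::memory_order_release);
    });

    producer.join();
    consumer.join();
    return seen;
}
```

Calling handOff(42) should return 42: once the consumer sees the flag set, the release/acquire pairing guarantees that the earlier write to payload is visible as well.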

 

OK, this behavior is exactly what many people want.  So what they often do at this point is use volatile in code and then look at the generated assembly code to see what type of synchronization the compiler introduces to ensure that the acquire/release semantics are preserved.  (Note that the compiler has internal constraints which ensure that it does not violate these semantics when it generates code.)  Many people are surprised to see that there are no synchronization primitives used.  Wait, this can't be right?!  How do we keep the CPU from violating these semantics without some type of lfence or sfence or something?  Well, let's talk about what the hardware might do to our instruction sequence. 

 

With respect to a single core all loads and stores are perceived by the programmer to occur in program order (note, that when I say program order, at this point I mean the assembly program).  There is no reordering that occurs.  OK, that makes things easy, but again that's just a single core looking at its own instruction sequence.

 

But things get more interesting when you have more than one processor/core (don't they always?).  Across processors, one "might" see different ordering, i.e., processor 1 might observe loads/stores retired in a different order than processor 0 has in its program order (note, this is probably the weakest memory model you will see on x86/x64, but if we work here, we'll work for something stronger).  Hmmm… that may cause problems for our volatile (or will it?).  Let's dig into what reordering Processor 1 might observe.

 

The possible reordering that Processor 1 might observe is that Loads can pass Stores (as long as the Store is non-conflicting).  But Loads with respect to other Loads will remain ordered.  And Stores with respect to other Stores will remain ordered.  Let's see an example:

 

Original Program Order on Processor 0:

Load A

Store B

Load C

Load D

Store E

Load F

 

Possible Reordering Processor 1 Might See:

Load A

Load C

Load D

Store B

Load F

Store E

 

or

 

Load A

Load C

Load D

Load F

Store B

Store E

 

If you look at this, you see that Loads can "float" upwards past Stores (again, as long as the Store is non-conflicting), and "can" continue to float upwards until they hit another Load. 
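
If you want to check an ordering against this rule mechanically, here is a small sketch of a checker.  It is a toy model of the rule exactly as stated above (a Load may move above an earlier, non-conflicting Store, and nothing else may swap), not a model of any real processor.  The types and the isAllowedReordering name are made up for this example, and it assumes each operation appears only once in the sequence.

```cpp
#include <cstddef>
#include <string>
#include <vector>

enum class Kind { Load, Store };

struct Op {
    Kind kind;
    std::string address;
};

inline bool operator==(const Op& a, const Op& b) {
    return a.kind == b.kind && a.address == b.address;
}

// Returns true if `observed` is a reordering of `program` in which the only
// pairs that changed relative order are an earlier Store and a later Load
// to a different address (that is, a Load floated up past a Store).
bool isAllowedReordering(const std::vector<Op>& program,
                         const std::vector<Op>& observed) {
    if (program.size() != observed.size()) return false;

    // Find where each program-order operation ended up in the observed order.
    std::vector<std::size_t> position(program.size());
    for (std::size_t i = 0; i < program.size(); ++i) {
        bool found = false;
        for (std::size_t k = 0; k < observed.size(); ++k) {
            if (program[i] == observed[k]) {
                position[i] = k;
                found = true;
                break;
            }
        }
        if (!found) return false;
    }

    // Examine every pair that was swapped relative to program order.
    for (std::size_t i = 0; i < program.size(); ++i) {
        for (std::size_t j = i + 1; j < program.size(); ++j) {
            if (position[j] < position[i]) {
                bool loadPassedStore = program[i].kind == Kind::Store &&
                                       program[j].kind == Kind::Load &&
                                       program[i].address != program[j].address;
                if (!loadPassedStore) return false;
            }
        }
    }
    return true;
}
```

Feeding it the program order from Processor 0 and either of the two reorderings listed above returns true, while an ordering that puts Store E ahead of Store B is rejected.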

 

So how does this affect our volatile semantics?  Let’s start with the Read Acquire semantics example (example copied from above):

 

Read Acquire Semantics:

Store A

Load B

Load V

Store C

Load D

 

Another processor observing this instruction sequence may see Load B float above Store A, which is fine (no violation).  But since Stores don't float upward, Store C must remain below Load V.  Load D can float upward, but it can't go past another Load, so it can't pass Load V.  Thus any instruction originally below Load V will be observed by another processor to execute after Load V.  Good.

 

Now let’s look at the Store Release Semantics (again, copied from above):

 

Store Release Semantics:

Store A

Load B

Store V

Store C

Load D

 

In this case Load D can float past Store C and Store V, but Store Release semantics don't care about instructions that occur below Store V, so no violation here.  Loads can float upward, but not downward, so Load B cannot be observed to execute after Store V.  And Stores are always observed in program order.  Again we're good.  So our volatile model is preserved, even with these reordering semantics.

 

Last thing… these rules don't apply to SSE streaming instructions or fast string operations; so if you are using weakly ordered instructions, then you'll need to use lfence, sfence, mfence, etc...

 

PS - On Itanium, with its weaker memory model, we generate ld.acquire and st.release instructions explicitly. 

Posted by kanggatl | 8 Comments

PLDI Phoenix Tutorial Sold Out!

For those of you who were interested in attending the PLDI tutorial on Phoenix, I hope you have signed up already.  We actually sold out during the pre-registration timeframe!  We think it will be a fun tutorial, with a strong focus on writing code. 

If you didn't sign-up, hopefully we'll have some more events in the future.

Posted by kanggatl | 0 Comments

Native code raise to MIR?

I've heard several questions lately about Phoenix's ability to raise native code to MIR (Phx.FunctionUnit.SymbolicFunctionUnitState).  Today Phoenix does not support raising native code to MIR.  We do plan to support raising beyond LIR in the future for native code, but at this moment in time it is just LIR. 

Note that we can raise managed code to Phx.FunctionUnit.SymbolicFunctionUnitState.

Posted by kanggatl | 0 Comments

Phoenix Tutorial Updates...

OK, first of all I'd like to start out by saying that it sounds like the CGO tutorial was a success.  I wasn't there, but I've heard second hand that people really liked it.  We have some of the material from the tutorial available for download now at: https://connect.microsoft.com/Phoenix/Downloads/DownloadDetails.aspx?DownloadID=5742 (you'll need a Live ID).

The next thing is that the PLDI 2007 Tutorial info is now posted and you can get info on the Phoenix tutorial.  It is located at: http://ties.ucsd.edu/PLDI/tutorials.shtml#phoenix.  Conference registration is not expected until May 12th, so mark your calendars.  Also, the webpage lists the tutorial as 2 hours, but it is a 4-hour tutorial. 

 That's it for now.  More shortly...

Posted by kanggatl | 0 Comments

Phoenix News!

There's been quite a bit going on with Phoenix lately.  Probably the biggest thing is that a new RDK has been released.  Go to the Phoenix Connect site and you can download it:

https://connect.microsoft.com/site/sitehome.aspx?SiteID=214

From the description on the webpage:

"Phoenix RDK March 2007 features improved API naming, more optimizations, volatile supports for acquire/release semantics, improvements to c2 generated debug information, more robust and accurate raising of MSIL to LIR, improved conditional branch semantics, and improved documentation."

Some other cool news is that there is a Phoenix MSDN Forum now!  In the past the forum was restricted to academics, and it was on a more difficult-to-use message board system.  Now we've joined all of the other Visual Studio technologies in the MSDN Forum, and it is open to everyone.  So if you have any questions or comments, that's a great place to post them!

And just as a reminder, the Phoenix CGO tutorial is coming up in just about a week.  I got to sit in on a dry run session at Microsoft that Andy Ayers gave and it's some cool stuff.  We didn't do the whole tutorial, but we used extension objects, the dataflow/simulation package, and had fun with flow graphs and the bit-vector package.  Online registration, I believe, is closed, but there are still spaces open if you register onsite.

And lastly, Jim Hogg, Andy Ayers, and I had our proposal for a Phoenix tutorial at PLDI 2007 accepted.  While it will also be hands-on, the focus will be on building tools.  More info on this as I get it.  Also, if people have feedback on what they'd like to see in a hands-on tutorial, please let me know.  We have plenty of time to custom craft it.

Posted by kanggatl | 5 Comments

A couple of new things in the Phoenix RDK

We are getting ready to have a new Phoenix RDK in time for CGO 2007 and expect to see quite a few new things in it.  Probably the two most visible things are that we are enabling a lot more optimizations in c2 and that the API will look very different. 

For the optimizations, the Phoenix C++ code generator will generate much better code with this new RDK, although the code quality still won't be quite as good as the shipping Visual C++ 2005 compiler.

What will probably be the most visible difference to users of the RDK is that we have done some name auditing, and now the names for classes, methods, properties, etc. are a lot clearer, and we removed needless (and often confusing) abbreviations.  Additionally, we needed to be more in line with our own company's guidelines on building .NET class libraries.  We will also ship a tool in the RDK that will help you convert your existing Phoenix code to use these new names, so we're making this transition as easy as possible.

All in all, we think that this will go a decent ways towards making the API usable.  And I know this is something that many people have inquired about.

Posted by kanggatl | 0 Comments

Run VS2005 as Administrator on Vista when debugging

I recently was doing some ASP.NET 2.0 work (sometimes it's good to mix up what you work on, ya know) and I had a problem where I could not debug my ASP.NET application on Vista.  It was weird because the debugger would start, but then would exit immediately.  The webpage would come up just fine, JavaScript and server side components worked perfectly, but it would just not be in debug mode. 

I couldn't find anyone having a similar problem (or I'm just not good at finding it), but I eventually did find that some people were having other problems with debugging ASP.NET on Vista.  I decided to try a solution that others were using, which actually was something I had used to solve a few other problems I'd had on Vista, which was to right-click VS2005 and run it as Administrator.  Unsurprisingly, this fixed the problem. 

 I write this blog to simply help someone who is having similar problems.

Keywords: f5 debug debugging vista vs2005 vs 2005 visual studio asp asp.net asp.net 2.0 exit

Posted by kanggatl | 8 Comments

Phoenix at CGO 2007

The Phoenix team will be doing a hands-on tutorial at CGO 2007, led by Andy Ayers, one of the architects on the team.  I think the tutorial should be quite interesting.  Not your typical tutorial.  For more info, go to this link:  http://www.cgo.org/cgo2007/html/tutorials.html#Practical:Phoenix

 BTW, if you have specific topics you'd like covered in a tutorial on Phoenix, let me know.

 

 

Posted by kanggatl | 3 Comments

New Year's Resolutions

Today marks the first day of 2007 and so it is time for the obligatory New Year's resolutions.  Here's my list of resolutions:

1) Post to my blog more regularly.  I had not posted to my blog since September.  That's just not great regularity.  There's a lot that's been going on, so it is worth putting it out there.

2) Spend more time listening to potential customers.  We're really making great progress on Phoenix and we want to continue to use the great feedback from customers in the product.  So if you do have some ideas, please let me know.  And if you and your company would like some time for one-on-one chatting, let me know.

Also if there are things you'd like for me to write about in my blog, please let me know about those too.

3) Become an ASP.NET expert.  It's just fun stuff.

4) Get more Phoenix tutorials and short snippets out there.  I personally believe that everyone learns by discovery (whether they know it or not), and through reading examples, people go through the process of discovery. 

BTW, OOPSLA 2006 went really well.  We had good turn out at the BoF, with a lot of positive feedback.  We will be at CGO and PLDI this year.  More details on what we'll be doing at those conferences later...

Posted by kanggatl | 2 Comments

Phoenix at OOPSLA

This year's OOPSLA conference (OOPSLA 2006) is being held in Portland, OR and we will be there with a Birds-of-a-Feather session (BOF).  This is a great chance to chat with someone from the Phoenix team, and other like-minded tools developers who are either using or interested in using Phoenix.

You can get more information about the BOF (and OOPSLA) at this link: http://www.oopsla.org/2006/program/program/birds_of_a_feather_bof_sessions.html#2

And if you do attend, make sure to let me know that you read about it on my blog.

Posted by kanggatl | 0 Comments

Walk Through: Adding a Function Call to a Program

Here is the scenario: you have compiled and linked a big program – you may have even shipped it out to customers.  After it was built you realize that in order to find a bug or determine some necessary information, you need to instrument a certain function in the program.  With Phoenix you don’t need to rebuild the program, but can simply use this Phoenix tool to instrument the binary directly.  We do this by inserting a function call into the original program, from a DLL that you have written (It is worth noting that this function call that is inserted takes no arguments in this sample.  We will handle passing arguments later, as that is a more complex task).

 

As I promised, I will switch between C# and C++/CLI.  This program is taken from the Phoenix RDK and is written in C++/CLI.  If you aren’t familiar with the syntax, it is actually quite similar to C#, but if you want more details then the language specification is located here. 

 

Things Covered in this Article

·         Reading/Writing a PE file.

·         Creating imports.

·         Creating a memory operand (MemOpnd).

·         Loading functions.

·         Adding function calls to an instruction stream.

 

There is one part of the program that this article will NOT cover: controls, which is Phoenix lingo for the command line arguments.  We will cover controls in depth in a later article.  They are somewhat apparent from reading the code, so I don’t think you should be confused by their presence, as you’ve probably written code to parse the command line a million times yourself.   

 

Also, I haven’t talked about the details of the various Units in Phoenix.  This is something I’ll have to do in a future posting.  If any of this blog is unclear due to this omission, let me know, and I’ll make sure to make this clarification.

 

The Main Function

Like StaticGlobalDump, we start with main(), which is given below (it’s just after Code Point 9).  The bolded code consists of calls to functions with user-defined functionality behind them, whereas the non-bold code calls directly into supplied framework code (either the CRT, STL, CLR, or Phoenix). 

 

Code Point 1: Looking at the code, we see that the first thing we do is to initialize the Phoenix targets.  In the StaticGlobalDump walkthrough I explained this code, so I’ll skip discussion of it here.  The code is identical (except in StaticGlobalDump it was in a separate function). 

 

Code point 2: This is where we begin initialization of the infrastructure.  This is the second time we have seen the BeginInit method, as we also saw it in the StaticGlobalDump program. 

 

What happens when you call BeginInit is that a LOT of things get initialized under the covers.  Everything from the initialization of threading and memory management infrastructure of Phoenix, to the symbol and type table, to the controls infrastructure.  BeginInit is just something you need to do to get Phoenix started.

 

Code point 3: Phoenix has a very rich set of command-line parsing capabilities.  The various command line arguments one can pass to a Phoenix client are called controls, and are accessed in the Phx::Ctrls namespace.  InitCmdLineParser is a user-defined function that parses the command-line arguments for the program using the routines in the Phx::Ctrls namespace.  We will cover this capability in a future article.

 

Code point 4: One reasonable question is “Why is InitCmdLineParser in-between BeginInit and EndInit, whereas in StaticGlobalDump there was nothing in-between those two calls?” 

 

The reason is that at EndInit Phoenix parses the command-line for the controls, thus you need to have the controls set up before EndInit is called, and naturally you can’t set up the Phoenix controls before you start initialization of Phoenix.  Therefore the call to InitCmdLineParser must reside in-between BeginInit and EndInit.

 

EndInit actually does more than just parse the command-line, but for this particular piece of code that’s the only thing that is relevant.  We will dive into some of the other things that need to happen in-between BeginInit/EndInit during another article where it is relevant.
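
To illustrate why that ordering matters, here is a toy model of a framework with the same two-step initialization.  To be clear, none of this is Phoenix code: ToyFramework, its method names, and the "-name:value" argument format are all hypothetical and exist only to show the constraint that controls have to be registered after initialization begins and before the command line is parsed.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Toy stand-in for a framework with begin/end initialization (not Phoenix).
class ToyFramework {
public:
    void beginInit() { state_ = State::Initializing; }

    // Controls may only be registered while initialization is in progress.
    void registerControl(const std::string& name) {
        if (state_ != State::Initializing) {
            throw std::logic_error(
                "controls must be registered between beginInit and endInit");
        }
        controls_[name] = "";
    }

    // Parses arguments of the (assumed) form "-name:value" into the controls
    // that have been registered so far; unknown arguments are ignored.
    void endInit(const std::vector<std::string>& args) {
        if (state_ != State::Initializing) {
            throw std::logic_error("endInit called without beginInit");
        }
        for (const std::string& arg : args) {
            std::size_t colon = arg.find(':');
            if (arg.size() < 2 || arg[0] != '-' || colon == std::string::npos) {
                continue;
            }
            auto it = controls_.find(arg.substr(1, colon - 1));
            if (it != controls_.end()) {
                it->second = arg.substr(colon + 1);
            }
        }
        state_ = State::Ready;
    }

    // Returns the parsed value, or an empty string if the control is unknown.
    std::string value(const std::string& name) const {
        auto it = controls_.find(name);
        return it == controls_.end() ? std::string() : it->second;
    }

private:
    enum class State { NotStarted, Initializing, Ready };
    State state_ = State::NotStarted;
    std::map<std::string, std::string> controls_;
};
```

In this model, registering a control before beginInit throws, and a control registered after endInit would never see its command-line value, which mirrors the reasoning for where InitCmdLineParser sits in main().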

 

Code point 5: This is a call to CheckCmdLine.  This is a simple user-defined function that checks to make sure that each of the required command-line arguments is supplied.  If not, it exits the program with an error.  Again, we will cover this capability in a future article.

 

Code point 6: This is where we open a PE file, in the same way we did with StaticGlobalDump.  The main difference is that we get the string name from a global variable called “GlobalPlaceHolder”.  GlobalPlaceHolder is a class that has a set of controls in it, each one mapping to one of the command line arguments:

 

public ref class GlobalPlaceHolder {
public:
   static Phx::Ctrls::StringCtrl ^ in;
   static Phx::Ctrls::StringCtrl ^ out;
   static Phx::Ctrls::StringCtrl ^ pdbout;
   static Phx::Ctrls::StringCtrl ^ importdll;
   static Phx::Ctrls::StringCtrl ^ importmethod;
   static Phx::Ctrls::StringCtrl ^ localmethod;
};

 

GlobalPlaceHolder::in->GetValue(nullptr) gets the string out of the “in” field, which corresponds to the name of the input PE file for this program.  For this walkthrough, ignore the nullptr argument.  We will cover that in the future when I discuss controls.

 

Code point 7: These two lines simply copy the command-line arguments out of the controls into the two corresponding fields in the PEModuleUnit.  The first is the path for the resulting PE image, and the second is the output PDB filename.

 

Code point 8:  LoadGlobalSymbols takes a PE ModuleUnit and loads all of the global symbol data out of the associated PDB file.  So after doing this call you will have a symbol table populated with all of the global/static variables, the symbols for the types, and the methods associated with those types. 

 

We will go into more depth about LoadGlobalSymbols in a future posting, but if you’re curious, you can dump the Symbol Table for the PEModuleUnit after you load it (PEModuleUnit->SymTable->Dump(dumpOptions)).

 

Code point 9:  DoAddInstrumentation is where the user logic to add the new calls to the existing function takes place.  See later in this article for the section on DoAddInstrumentation.

 

After that is the call to moduleUnit->Close().  It does more than simply close the PEModuleUnit.  It also checks whether the OutputImagePath is non-null.  If it is, it writes the PEModuleUnit out to disk, using the OutputImagePath.  Note that we did set the OutputImagePath in code point 7, thus when this program ends it generates a new binary.  It also generates a new PDB file, placing it at OutputPdbPath, which we also set in code point 7.

 

int main(array<String ^> ^ args) {
   // Initialize the target architectures.
   // 1
   Phx::Targets::Archs::Arch ^ arch =
      Phx::Targets::Archs::X86::Arch::New();
   Phx::Targets::Runtimes::Runtime ^ runtime =
      Phx::Targets::Runtimes::VCCRT::Win32::X86::Runtime::New(arch);
   Phx::GlobalData::RegisterTargetArch(arch);
   Phx::GlobalData::RegisterTargetRuntime(runtime);

   // Initialize the infrastructure.
   // 2
   Phx::Init::BeginInit();

   // Init the cmd line stuff.
   // 3
   ::InitCmdLineParser();

   // Check for Phoenix wide options like "-assertbreak".
   // 4
   Phx::Init::EndInit(L"PHX|*|_PHX_", args);

   // Check the command line.
   // 5
   ::CheckCmdLine();

   // Open the module and read it in.
   Phx::PEModuleUnit ^ moduleUnit;
   // 6
   moduleUnit =
      Phx::PEModuleUnit::Open(GlobalPlaceHolder::in->GetValue(nullptr));

   // Setup output file name and PDB.
   // 7
   moduleUnit->OutputImagePath = GlobalPlaceHolder::out->GetValue(nullptr);
   moduleUnit->OutputPdbPath = GlobalPlaceHolder::pdbout->GetValue(nullptr);

   // Iterator will load symbols implicitly.
   // However Load Global Symbols upfront and print total.
   // 8
   moduleUnit->LoadGlobalSyms();

   Phx::Output::WriteLine(L"Total Global Symbols  Count - {0} ",
      moduleUnit->SymTable->SymCount.ToString());

   // Do some useful work on the tool front here:
   // 9
   ::DoAddInstrumentation(moduleUnit);

   // Close the ModuleUnit.
   moduleUnit->Close();

   // If this was not the end of the application it would be best to
   // delete the ModuleUnit.

   // moduleUnit->Delete();

   return 0;
}

 

The DoAddInstrumentation Function

This function is where we do the meat of the work to instrument the PE image.  We instrument the PE image with a call to an imported function at entry to the function which is specified on the command-line.

 

Let’s step back and think about the steps that are required to add a call to a function, even outside of a framework such as Phoenix. 

 

1.    Import the DLL that contains the function, F, which we wish to inject into the specified function.

2.    Get the import symbol of F from within the import module that we wish to inject into the specified function, S, in the PE file.

3.    Get S and find its first instruction.

4.    Inject a call to F from the imported DLL before this first instruction in S.

 

Those are the basic steps that we need to do.  DoAddInstrumentation will do these four steps using Phoenix.
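
Before looking at how Phoenix does it, here is a toy sketch of those four steps in plain C++, outside of any framework.  Everything in it is hypothetical: ToyModule, addImport, and instrumentEntry are names made up for this illustration, and instructions are modeled as plain strings rather than real IR.

```cpp
#include <list>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Toy model of a binary: a list of imports plus, for each function name,
// its instruction stream (instructions are just strings here).
struct ToyModule {
    std::vector<std::string> imports;
    std::map<std::string, std::list<std::string>> functions;
};

// Steps 1 and 2: record the import of F from the DLL and return its symbol.
std::string addImport(ToyModule& module, const std::string& dll,
                      const std::string& function) {
    std::string symbol = dll + "!" + function;
    module.imports.push_back(symbol);
    return symbol;
}

// Steps 3 and 4: find S and insert a call to F before its first instruction.
void instrumentEntry(ToyModule& module, const std::string& target,
                     const std::string& importSymbol) {
    auto it = module.functions.find(target);
    if (it == module.functions.end()) {
        throw std::invalid_argument("no such function: " + target);
    }
    std::list<std::string>& instructions = it->second;
    instructions.insert(instructions.begin(), "call " + importSymbol);
}
```

After calling addImport and then instrumentEntry on a function S, the first entry in S's instruction stream is the new call and the original instructions follow unchanged.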