A thread in an internal discussion group reveals that dynamic_cast is very slow on x64 systems. One of the developers explains the reason:
From:
Sent: Tuesday, October 17, 2006 11:52 AM
To:
Subject: RE: dynamic_cast code runs faster in WOW mode than native x64
I haven't looked at profiles or tried your testcase, but I think I know why you see this difference.
dynamic_cast works by looking at the RTTI (run-time type info) associated with a type. That RTTI has a bunch of pointers in it. On x86, these pointers are just that, raw 32-bit pointers. But on 64-bit platforms, the pointers are actually 32-bit offsets which need to be added to the base address of the DLL or EXE in which the RTTI resides to compute a true 64-bit pointer. That addition shouldn't be causing any major perf problems, since it's cheap and not too common. But determining the module base address, which happens once per dynamic_cast, could be expensive. It's done via the API call RtlPcToFileHeader, which (in the fast case) takes the loader lock and walks the list of loaded modules to find where the RTTI data resides.
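To make the two steps in that explanation concrete, here is a minimal, portable sketch. It is not the actual CRT or loader code. FakeImage, ModuleRange, find_module_base and type_name_from_record are all hypothetical names invented for illustration, and the linear scan only stands in for the module lookup the email attributes to RtlPcToFileHeader (no loader lock is modelled). The point is that the cheap part is the addition of base and offset, while the lookup that finds the base is repeated on every call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a mapped DLL or EXE: a flat byte buffer.
// Bytes 0..3 hold a 32-bit image-relative offset (the "RTTI pointer"),
// and the type name is stored at that offset.
struct FakeImage {
    alignas(8) unsigned char bytes[64];
};

// Hypothetical description of one loaded module's address range.
struct ModuleRange {
    const unsigned char* base;
    std::size_t size;
};

// Build a fake image whose record at offset 0 points at a name at offset 16.
inline FakeImage make_image(const char* name) {
    assert(name != nullptr);
    FakeImage img{};  // zero-filled, so the name stays NUL-terminated
    const std::uint32_t name_offset = 16;
    std::memcpy(img.bytes, &name_offset, sizeof name_offset);
    std::strncpy(reinterpret_cast<char*>(img.bytes + name_offset), name,
                 sizeof img.bytes - name_offset - 1);
    return img;
}

// Stand-in for the expensive step: walk the module list to find the image
// that contains addr. Returns nullptr if no module contains it.
inline const unsigned char* find_module_base(
        const std::vector<ModuleRange>& modules, const void* addr) {
    const auto a = reinterpret_cast<std::uintptr_t>(addr);
    for (const ModuleRange& m : modules) {
        const auto b = reinterpret_cast<std::uintptr_t>(m.base);
        if (a >= b && a - b < m.size) {
            return m.base;
        }
    }
    return nullptr;
}

// Resolve the name a record refers to: look up the base (costly, done on
// every call), then add the stored 32-bit offset (cheap).
inline const char* type_name_from_record(
        const std::vector<ModuleRange>& modules, const void* record_addr) {
    const unsigned char* base = find_module_base(modules, record_addr);
    if (base == nullptr) {
        return nullptr;
    }
    std::uint32_t offset = 0;
    std::memcpy(&offset, record_addr, sizeof offset);
    return reinterpret_cast<const char*>(base + offset);
}
```

Because the record stores an offset rather than an absolute address, the same bytes stay valid wherever the image is placed in memory. The price is that the base has to be found before anything can be dereferenced, and that lookup is the cost the email points at.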