One intern in my team has been working on a utility, which makes use of binary instrumentation. So I think it's time to recap on that.
Understand the Fundamentals
As we mentioned in Microsoft Binary Technologies and Debugging, there are many binary technologies. Most of these technologies can be used either statically (patch and write back to the disk) or dynamically (hotpatching in memory at runtime). Which one to choose really depends on the requirement.
In most cases, API hooking would be sufficient since it captures the skeleton of execution flow, as well as the inputs and outputs.
API hooking normally happens on the callee side using trampoline technology like Detours. PE header (esp. delay loading, address fix up) and calling convention are must known, and being able to read a bit assembly language is a bonus.
Sometimes you would like to hook API from caller side, normally this would be hijacking the PE import directory. This would eliminate the internal invocations since they are not routed via IAT (e.g. by adding trampoline to a function which invokes itself recursively, each recursion would go through the trampoline thunk).
When it goes deeper such like implementing a profiler or a code coverage analysis tool, Extended Basic Block (EBB) Analysis is unavoidable. This requires solid knowledge over disassembler, compiler backend and linker (code generation, optimization, symbol file, etc.).
If the instrumentation happens in kernel mode, special things need to be considered, such as page-in and page-out, IRQL and spinlock. For SSDT level hooking these normally wouldn't become a problem.
Understand the Runtime and Environment
The art of instrumentation is to live well within the target process.
The instrumentation code would consume additional resources and might introduce side effects: