ICorDebug has a nicely abstracted "Stepper" object, via ICorDebugStepper (I talked more about that here). You set up a stepper object (via CreateStepper), resume the process, and then get an asynchronous "StepComplete" debug event when the stepper is finished.

For example, you could set up a StepOut, Continue, and then get a StepComplete when the thread returns from the current function.

The Stepper is a very high-level object. Stepping is just a high-level abstraction built on breakpoints and single-step ("step one machine instruction") flags. ICorDebug (ICD) handles the plumbing underneath: setting any intermediate breakpoints or single-step flags, staying aware of recursion and stack depth, etc. ICD has several types of steppers: step-in, step-over, and step-out.
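
To make the "built on single-step flags, aware of stack depth" idea concrete, here's a minimal sketch of a step-out over a toy model. Everything in it (`ToyThread`, the pre-recorded trace, `step_out`) is hypothetical and is not the ICorDebug API; a real implementation would also use breakpoints rather than single-stepping every instruction.

```python
from dataclasses import dataclass


@dataclass
class ToyThread:
    """Hypothetical stand-in for a thread: a pre-recorded instruction trace."""
    trace: list   # one (function_name, stack_depth) pair per instruction
    ip: int = 0   # index of the current instruction in the trace

    def single_step(self):
        """Execute one 'machine instruction'; return False when the trace ends."""
        if self.ip + 1 >= len(self.trace):
            return False
        self.ip += 1
        return True

    @property
    def depth(self):
        return self.trace[self.ip][1]


def step_out(thread):
    """Single-step until the stack is shallower than where we started.

    Returns the number of intermediate single-step events consumed,
    or None if the thread ran out of instructions first.
    """
    start_depth = thread.depth
    events = 0
    while thread.single_step():
        events += 1
        # A recursive call to the same function is deeper, not shallower,
        # so comparing depths (not function names) handles recursion.
        if thread.depth < start_depth:
            return events
    return None


# fact() recurses once, unwinds, then returns to main.
trace = [("fact", 2), ("fact", 3), ("fact", 3), ("fact", 2), ("fact", 2), ("main", 1)]
thread = ToyThread(trace)
events = step_out(thread)
print(thread.trace[thread.ip], events)
```

Note how one logical step-out consumed several intermediate single-step events, which is why the speed of those events matters so much.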

Here's some trivia about ICorDebug steppers:
1. Stepping is built on intermediate debug events, so those events need to be fast in order to get a fast overall stepping experience. A source-level step through a basic block will likely result in a native single-step event (which is a native exception on x86) on each native instruction in the source line. So I repeat: it's very important to have those intermediate single-step events go fast, because that's a case where your app will be firing a lot of exceptions. We improved this a lot in Whidbey (400x) for interop-debugging.
2. Steppers have thread-affinity, so if you want fancy cross-thread stepping, your debugger needs to build that on top of the stepper primitives ICD provides. GreggM explains how VS does stepping from a web client to a web service in a single seamless session.
3. ICD allows multiple steppers and abstracts them from each other, in the exact same way it does with breakpoints or func-evals. Thus multiple steppers can run concurrently, and finish concurrently.
4. In light of multiple steppers, you can "deactivate" outstanding steppers.
5. Once a stepper fires a step-complete, it's "dead". You can recycle it by issuing another "step" command (which I think was a very unfortunate API design, because it makes the ICorDebugStepper's lifetime undefined).
6. Steppers can have different flags guiding policy about what to step into, etc.
7. Steppers don't care about 0xFeeFee sequence points at the ICD level. This is handled by the debugger: when the debugger hits a 0xFeeFee, it has a policy to reissue another step. V1.0 Cordbg didn't have this check, but V2 MDbg does.
8. ICorDebug has no notion of an "active" stepper or an "active" thread. Both of those concepts are maintained entirely by the debugger, which deactivates any outstanding steppers once the shell stops.
9. ICD tracks all outstanding steppers and lets you enumerate them via ICorDebugAppDomain::EnumerateSteppers.
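
The 0xFeeFee policy from item 7 lives in the debugger, not in ICD, and can be sketched in a few lines. This is a toy model only: `ToyLineStepper`, `step_to_visible_line`, and the `max_reissues` safety cap are hypothetical names and assumptions made for illustration, not anything from ICorDebug or MDbg.

```python
HIDDEN_LINE = 0xFEEFEE  # line number that marks a hidden sequence point


class ToyLineStepper:
    """Hypothetical stepper: each step() completes on the next sequence point."""

    def __init__(self, lines):
        self._lines = iter(lines)
        self.steps_issued = 0

    def step(self):
        """Issue one step and return the source line it completed on."""
        self.steps_issued += 1
        return next(self._lines)


def step_to_visible_line(stepper, max_reissues=100):
    """Debugger-side policy: reissue the step while we land on hidden code.

    max_reissues is an assumed safety cap so the loop cannot spin forever.
    """
    for _ in range(max_reissues):
        line = stepper.step()
        if line != HIDDEN_LINE:
            return line
    raise RuntimeError("too many consecutive hidden sequence points")


stepper = ToyLineStepper([HIDDEN_LINE, HIDDEN_LINE, 42])
print(step_to_visible_line(stepper), stepper.steps_issued)
```

From the user's point of view this is one step that lands on line 42; underneath, the debugger issued three.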

Let me drill in on the consequences of multiple steppers:
1. You can have multiple steppers on different threads. For example, you could enumerate all threads, create a step-out stepper on each of them, and then wait for all the StepCompletes to trickle in on their different threads. Or once you get the first step-complete, you could deactivate all the others.

2. You can even have multiple steppers on the same thread! For example, you could issue both a step-in and a step-out, and then see which one hits first.

3. You can have identical steppers in parallel on the same thread. You could issue two step-out steppers, and they'd each have their own unique ICorDebugStepper instance! One advantage here is for plug-in models. Two isolated debugger extensions could issue a series of step-commands and they wouldn't interfere with each other. There are other advantages too, such as letting a debugger create a "compound stepper" with more complex semantics.
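
The "see which one hits first, then deactivate the rest" pattern is pure debugger-side bookkeeping. Here's a rough sketch of it as a toy model; `ToyStepper`, `ToyDebugger`, and `on_step_complete` are hypothetical names, and the first-one-wins policy is just one choice a debugger could make, not something ICD imposes.

```python
class ToyStepper:
    """Hypothetical stepper with its own identity and active flag."""

    def __init__(self, thread_id, kind):
        self.thread_id = thread_id
        self.kind = kind      # "in", "over", or "out"
        self.active = True

    def deactivate(self):
        self.active = False


class ToyDebugger:
    """Tracks outstanding steppers; the engine itself has no 'active' one."""

    def __init__(self):
        self.outstanding = []

    def create_stepper(self, thread_id, kind):
        stepper = ToyStepper(thread_id, kind)
        self.outstanding.append(stepper)
        return stepper

    def on_step_complete(self, finished):
        """First-one-wins policy: cancel every other outstanding stepper."""
        finished.active = False  # a completed stepper is "dead" until reissued
        for stepper in self.outstanding:
            if stepper is not finished:
                stepper.deactivate()
        self.outstanding.clear()
        return finished


dbg = ToyDebugger()
step_in = dbg.create_stepper(thread_id=1, kind="in")
step_out = dbg.create_stepper(thread_id=1, kind="out")  # same thread is fine

winner = dbg.on_step_complete(step_in)  # pretend the step-in finished first
print(winner.kind, step_out.active)
```

Because each stepper is its own object, two identical step-outs on the same thread would simply be two entries in `outstanding`, each completing or being cancelled independently.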