Larry Osterman's WebLog

Confessions of an Old Fogey

Faster Syscall Trap redux

Raymond has a funny historical article about how Windows made system calls on 386 processors.

What he left out was the 286 version of this story.  Microsoft and Intel had a similar meeting to the one that Raymond described with the 386, but in that case, one of the Microsoft requests was for the ability to switch from protected mode back into real mode.

You see, the 286 had the ability to switch from real mode (no virtual memory) to protected mode (virtual memory), but not back.  The theory was that you'd never want to go back to real mode, that would be "silly".

But of course, that doesn't deal with the issue of compatibility.  OS/2 supported one real mode application running in the system, in the "DOS Box".  The DOS box was essentially just another task, it got time sliced like other processes (ok, it really didn't, but conceptually that's what happened), so the system did a LOT of switching between real mode and protected mode.

It was critical that we be able to switch from protected mode back to real mode (when switching to the DOS box).  The problem is that the only documented way of doing this was to write to the keyboard controller device (which controls WAY more than just the keyboard on a PC).  Unfortunately, the keyboard controller was REALLY slow, and this mechanism took literally milliseconds.

So Microsoft went crazy trying to find a fast way of switching back to real mode.

Eventually they found it.

Their solution:

    LIDT -1
    INT 1

What did this do?  Well, LIDT -1 sets the interrupt descriptor table to an invalid physical address.  The system tried to execute the INT 1 instruction, which caused it to fault the IDT into memory (a fault).  Well, the IDT couldn't be found, so that raised a not present fault (a double fault).  The not present fault tried to fault in the not present fault handler, which failed (a triple fault).

The 286 processor couldn't handle faults more than 3 deep, so it gave up the ghost and reset itself.  Which caused the system ROM to start executing, and we simply set the real mode start address (which was kept in real memory) and poof! we had transferred from protected mode to real mode in microseconds (not milliseconds).
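
The escalation described above can be sketched as a toy model. This is only an illustration, not the code Microsoft or Intel used: the names `deliver`, `vector_in_idt`, and `Outcome` are hypothetical, and the `LIDT -1` from the post is modeled here as an empty IDT (a limit of 0), which is an assumption. The 8-byte entry size and the vector numbers 13 (general protection) and 8 (double fault) come from Intel's architecture documentation rather than from this post. The model covers only the table lookup that fails three times in a row; it does not model the reset or the ROM restart that follows.

```python
from enum import Enum
from typing import Tuple

IDT_ENTRY_SIZE = 8        # bytes per IDT entry (assumption taken from Intel docs)
GENERAL_PROTECTION = 13   # vector raised when the requested entry is unreachable
DOUBLE_FAULT = 8          # vector raised when delivering a fault itself faults


class Outcome(Enum):
    DELIVERED = "delivered"
    SHUTDOWN = "shutdown"


def vector_in_idt(vector: int, idt_limit: int) -> bool:
    """True if the whole 8-byte entry for `vector` lies within the IDT limit."""
    return vector * IDT_ENTRY_SIZE + (IDT_ENTRY_SIZE - 1) <= idt_limit


def deliver(vector: int, idt_limit: int) -> Tuple[Outcome, int]:
    """Return (outcome, number of faults raised) for one interrupt request."""
    faults = 0
    # Try the requested vector, then the fault handler, then the double
    # fault handler. Each failed lookup escalates to the next candidate.
    for candidate in (vector, GENERAL_PROTECTION, DOUBLE_FAULT):
        if vector_in_idt(candidate, idt_limit):
            return Outcome.DELIVERED, faults
        faults += 1
    # Three failed lookups in a row: the processor gives up.
    return Outcome.SHUTDOWN, faults


if __name__ == "__main__":
    print(deliver(1, idt_limit=0))      # INT 1 against an empty IDT
    print(deliver(1, idt_limit=2047))   # INT 1 against a full 256-entry IDT
```

With an empty IDT, no vector can be found, so every interrupt request runs through all three lookups and ends in shutdown. That is why the choice of interrupt number in the trick does not matter.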

I actually found a write-up of this technique on the web here.  Interestingly enough, the article credits Intel for this technique.  They may be right: I remember it being developed in-house, but I may be wrong.  After I unpack my office, I'll check my 286 reference manual.

Much later: I just checked my 286 reference manual (a rather well thumbed first edition), and I found the reference that was discussed in that article.  The comment accompanying the code is "Setup null IDT to force shutdown on any protection error or interrupt".  That's it, that's the only hint that Intel came up with the triple faulting technique to force a restart in real mode.  Personally, I'm not surprised that the IBM engineers didn't pick up on this.

 

  • > The theory was that you'd never want to go
    > back to real mode, that would be "silly".

    Actually that much of it is true. The part they forgot is this: no matter how much you don't want to do silly things, sometimes you _have to_ do silly things, so you still ought to plan for them.
  • Wouldn't INT 3 have saved a byte?
  • Mike: You're right, it would have - that may have been what they did actually :)...
  • I recall the "official" way to switch back to real mode to be by programming the keyboard controller so it would end up resetting the processor ... or something like this. I read this in the only book on 286 programming I could get my hands on at the time. Uff ...
  • Hugh ... I guess I should have read more carefully the article. You mention it right there. Sorry about that.
  • I never really had much to do with PCs in the 286 era, and I've always wondered why the keyboard controller did so much more than control the keyboard. Is this related in some way to the "A20 gate enabled" option that used to be in the BIOS?
  • "I recall the "official" way to switch back to real mode to be by programming the keyboard controller so it would end up resetting the processor ... or something like this. I read this in the only book on 286 programming I could get my hands on at the time. Uff ..."

    If I remember correctly, the keyboard controller aspect of it had to do with enabling/disabling the A20 gate to get full use of memory when in protected mode...
  • Mike,
    They shoved the kitchen sink into the keyboard controller. Enabling the A20 line WAS a part of it, it also controlled the reset line to the CPU and other things.
  • In some ways this is the type of stuff I would never let my programmers do "try stuff till you find something that works". I guess it is a different world with CPUs, but I can't help but think of all those Windows programmers who rely on undocumented behavior (not features) in Windows that result in Microsoft having to add shim code to not break the applications.
  • Tim,
    It wasn't really "try stuff 'til you find something that works" - it was more of a "Ok, we've got to figure out how to reset the CPU - pore over the Intel reference manuals and see what you can find"

    The triple fault behavior was known, it just took some really clever people to put the triple fault behavior together with a desire to switch to real mode and make it work.

    Btw, I forgot to add to the article that for the 386, Intel added the ability to quickly switch back to real mode :)
  • In the original PC, the keyboard was controlled by a few bits in a parallel port (an 8255 chip if memory does not fail me). In the AT, the 8255 was replaced by a microcontroller, and it had to emulate the other features of the parallel port.

    Now, IBM decided to include a way to reset the processor by hardware (to get from protected mode to real mode). It makes sense to use one bit of the parallel port that is emulated by the keyboard controller.

    Btw, it was not talking to the keyboard controller that was slow; the fact is that a full reset was done, so a flag had to be saved somewhere, and a lot of initialization code had to be executed before resuming the OS.

    It should also be remembered that all this had to be done even in DOS, if you want to access memory above 1M+64K (for the first 64K above 1M there was a very good hack).
  • Daniel, using the triple fault trick brought the context switch time from 15+ milliseconds to 800ish microseconds. Since the same real-mode code to restart was executed in both cases, that implies that the 14 millisecond savings came from the time to reset the controller, and not from the time to get back to your real mode code.

    I actually wrote about the A20 line trick a while ago - himem.sys was the driver to enable that piece of magic.