I was browsing through the Bonus Chapters for Raymond's book and I remembered an old CPU bug we encountered with the early 286 processors.
Back in those days, it was common to manipulate the processor directly, especially from inside the operating system.
Whenever we needed a stretch of time during which the OS couldn't be interrupted by the hardware, we used the handy CLI instruction to turn interrupts off.
The CLI instruction (and its partner STI) clears (and sets) the "interrupt" flag that's part of the processor state. There are two other instructions of interest: pushf and popf, which push all the processor flags onto the stack and pop them back off.
As a result, periodically inside the system ROMs and OS, you'd find the following sequence:
    pushf   ; Push flags on the stack
    cli     ; Disable interrupts, we're doing something
    :
    :
    popf    ; Restore interrupts to their previous state.
The problem was that this version of the 286 had a bug in the popf instruction. If you executed a popf instruction when interrupts were off and the flags value on the stack also had interrupts disabled (in other words you were transitioning from "interrupt disabled" to "interrupt disabled"), the processor would enable interrupts (and then turn them off again).
Unfortunately that had the side effect of potentially allowing an interrupt to occur when the system didn't expect it.
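To make the failure concrete, here is a small toy model of the situation, written in Python rather than assembly so it can be run anywhere. It is only a sketch: ToyCpu, its methods, and nested_critical_sections are hypothetical names invented for illustration, and the model tracks nothing but the interrupt flag and one pending interrupt. It assumes the bug behaves exactly as described above (a disabled-to-disabled popf briefly enables interrupts) and shows why that matters when one pushf/cli/popf region is nested inside another.

```python
class ToyCpu:
    """Toy model of the interrupt flag (IF) only; not a real 286 emulator."""

    def __init__(self, buggy_popf):
        self.buggy_popf = buggy_popf     # True models the defective popf
        self.interrupts_enabled = True   # the IF bit
        self.stack = []                  # saved copies of IF
        self.pending = False             # an interrupt is waiting for IF=1
        self.expected = 0                # taken when software allowed interrupts
        self.unexpected = 0              # taken inside a "disabled" region

    def _service(self, in_glitch=False):
        # Deliver a waiting interrupt if IF currently allows it.
        if self.pending and self.interrupts_enabled:
            self.pending = False
            if in_glitch:
                self.unexpected += 1
            else:
                self.expected += 1

    def raise_interrupt(self):
        # Hardware asserts an interrupt; it waits if IF is clear.
        self.pending = True
        self._service()

    def pushf(self):
        self.stack.append(self.interrupts_enabled)

    def cli(self):
        self.interrupts_enabled = False

    def popf(self):
        saved = self.stack.pop()
        if self.buggy_popf and not self.interrupts_enabled and not saved:
            # Modeled defect: disabled -> disabled briefly enables interrupts.
            self.interrupts_enabled = True
            self._service(in_glitch=True)
            self.interrupts_enabled = False
        else:
            self.interrupts_enabled = saved
            self._service()


def nested_critical_sections(cpu):
    """Run two nested pushf/cli ... popf regions with an interrupt arriving inside."""
    cpu.pushf(); cpu.cli()   # outer region: IF was 1, now 0
    cpu.pushf(); cpu.cli()   # inner region: IF was already 0
    cpu.raise_interrupt()    # arrives while interrupts are disabled
    cpu.popf()               # inner restore: disabled -> disabled
    cpu.popf()               # outer restore: disabled -> enabled
    return cpu.unexpected, cpu.expected


if __name__ == "__main__":
    print("buggy popf  :", nested_critical_sections(ToyCpu(buggy_popf=True)))
    print("correct popf:", nested_critical_sections(ToyCpu(buggy_popf=False)))
```

In this model the correct popf holds the interrupt until the outer region ends, while the buggy one lets it in at the inner popf, when the outer region still believes interrupts are off.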
I'd forgotten how we fixed the problem, but amazingly enough, the first hit my search turned up was a Mike Abrash article in Byte magazine which (in part) discusses the issue and describes a workaround.
The great thing about this kind of bug is that software gets to be a hero, rather than a goat. Imagine if Intel had to recall every 286 chip because of a bug like this. Nowadays, chips have microcode patch areas so they can work around these problems. Essentially, Intel and AMD turned hardware problems into software problems because they're easier to fix that way.
That's the problem with today's rookies: none of them can write an ISR in assembly. Programmers? Pfeeew! Pardon me while I laugh out loud!
Larry, you and I are true men! They don't make them like us anymore. Ha!
ok,ok... back to work. I have this CSS problem to fix before lunch!
Tuesday, February 06, 2007 5:08 PM by Dave
> Imagine if Intel had to recall every 286 chip because of a bug
> like this.
Wouldn't that be around 100.003% like Intel recalling every Pentium some-variety chip because of a bug in floating point division? Some persuasion was necessary but they eventually did it.
An original Intel 8088 had a bug which allowed interrupts after the POP SS instruction. This was fixed within a year or two of the release of the IBM PC. If you pushed SS:SP onto the stack and then tried to pop SS and SP off the stack, there was a chance that an interrupt would come in BETWEEN the two instructions when the stack pointer (SS:SP) was in an inconsistent state. Nothing good could come of this.
There were also some interesting security-related design bugs in the 80286, and I think the 80386, which were documented in a paper and later fixed by Intel. I don't have the URL but I'm sure someone will contribute it to this blog.
I remember that particular bug. It made life interesting for DOS 4.
And then there was the PC Jr, which is worth an entire blog post in its own right.