Two guys walked into a bar, but the bar was broken

Sorting it all Out
Michael Kaplan's random stuff of dubious value
Be sure to read the disclaimer here first!

It was over a year ago that I pointed out in the post Keyboards: hardware vs. software how disconnected our team (which owns most of the keyboard layouts) and the hardware team (which owns most of the actual keyboard hardware) were.

And how impressive it was that we managed to be in sync so often, given that disconnect.

But it is possible I may live the rest of my life without being able to understand why almost every keyboard layout has a key which, when typed, will produce | (U+007c, a.k.a. VERTICAL LINE) yet printed upon the face of the key is ¦ (U+00a6, a.k.a. BROKEN BAR).

What's up with that?

It turns out that every single-byte Windows code page other than 874 supports U+00a6, and every single Windows code page bar none (pardon the pun) supports U+007c.
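
For anyone who wants to check that claim, here is a minimal sketch that asks Python's bundled codecs which code pages can encode each character. The list of code pages (874 plus 1250 through 1258) is an assumption about what counts as a single-byte Windows code page, and Python's codec tables are stand-ins for the ones Windows actually ships, so treat the result as a sanity check rather than the final word.

```python
import unicodedata

# Assumed list of single-byte Windows code pages, using Python's codec names.
SINGLE_BYTE_CODE_PAGES = ["cp874"] + [f"cp{n}" for n in range(1250, 1259)]


def can_encode(char, codec):
    """Return True if the codec has a byte for this character."""
    try:
        char.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def missing_from(char):
    """List the code pages (from the assumed list) that cannot encode char."""
    return [cp for cp in SINGLE_BYTE_CODE_PAGES if not can_encode(char, cp)]


for char in ("\u007c", "\u00a6"):
    name = unicodedata.name(char)
    print(f"U+{ord(char):04X} {name}: missing from {missing_from(char)}")
```

With Python's tables, only the Thai code page should turn up in the list for BROKEN BAR, and nothing at all for VERTICAL LINE.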

And just about every font that has one has the other.

Even so (to get back to keyboards), almost every keyboard prints one on the face of a key while the matching layout inputs the other.

So why this disconnect?

And more importantly, why does it persist?

And most important of all, why don't people complain? In either direction?

I suspect it is because no one really cares.

Or maybe it is just that two guys can walk into a bar. Even if it looks like it is broken. Since it turns out they may still be serving drinks....

 

This post brought to you by "|" (U+007c, a.k.a. VERTICAL LINE)

Blog - Comment List
  • I'm using the Microsoft Natural Multi-media keyboard and that doesn't have the broken bar http://www.microsoft.com/hardware/mouseandkeyboard/productdetails.aspx?pid=019
  • On the console's raster fonts (and Terminal) U+7c is still a broken bar, so I think that this is historical.

    It seems the bar was first broken in 1967 and later repaired in 1977, but not all devices changed along with it. I would assume that IBM decided the bar should be broken, so it made the original PC with a broken bar on the keyboard and in the graphics card fonts. Later on when Microsoft started implementing fonts of course they implemented the ANSI standard solid bar, but nobody ever thought to change the keyboard.

    At some point I guess ISO decided that the broken bar was still useful (backcompat maybe?), so they assigned it a code point in the upper half of the charset.
  • I suspect that people don't even understand that there's a difference - they just see it as two different ways of printing the same character.  Like 0 with a dot, a dash or empty.

    FWIW the random, cheap Dell keyboard I've got in front of me has both ¦ and | and they both print the right characters.
  • It's a solid bar on my old, clunky, UK keyboard!

    I suspect that one reason why no-one complains is that only geeks ever type that symbol, and geeks don't generally look at the keyboard while typing! I certainly didn't know what was printed on that key until you prompted me to look.

    Of course, on a UK keyboard the | symbol is on an extra key between the Z and the left shift key. A key which doesn't exist at all on a US keyboard! We also have a second extra key between the " (which is really the @ key!) and the enter key, which has on it # and ~. I've often wondered why the UK keyboard is so different from the US one, since the only extra symbol you get is the British pound sign in the place of # on the 3 key.

    My laptop has a US keyboard, and I initially tried to use the UK keymap on it despite the lying key captions, but I soon found out that I had four symbols I could no longer type due to the missing keys. Quite frustrating.
  • On my keyboard when I press '|' I get '¦' and vice versa.  And the only way that I can get '¦' involves use of the 'Alt Gr' key. Luckily I don't think I've had any cause to use it in the last 5 years!
  • I always thought of them as being the same character, like the many forms of writing 0. I didn't realize that they were both in character sets. When would one use one versus the other? I'm not aware of their uses in typography, just their use for piping in shells and their use in ASCII art.
  • When I was learning MS-DOS, I was really confused when the manual told me to type the "pipe" key, which looked like a "vertical line", and I couldn't find it on the keyboard.
  • I would guess the reason the keyboard shows a broken version is to avoid confusing the "|" key with the "I" key.  In fact, on my Microsoft Natural Keyboard, the "I" on the "I" key and the "|" on the "|" key look absolutely identical.

    Which brings up another question.  Why are the letter marks on the keyboard capital, but when you press them, the letter that shows up on the screen is lowercase?  (Unless you have Caps Lock on, of course.)
  • Oh boy...
    http://www.fileformat.info/info/unicode/char/007c/index.htm 
    http://www.fileformat.info/info/unicode/char/00a6/index.htm 
    http://www.fileformat.info/info/unicode/char/01c0/index.htm 
    http://www.fileformat.info/info/unicode/char/05c0/index.htm 
    http://www.fileformat.info/info/unicode/char/2223/index.htm 
    http://www.fileformat.info/info/unicode/char/2758/index.htm 
  • Not to mention
    http://www.fileformat.info/info/unicode/char/0049/index.htm
    http://www.fileformat.info/info/unicode/char/006c/index.htm
  • Uppercase letters on the key faces even though the default keystrokes will be lowercase is something inherited from typewriters....
  • Whee...

    Il|¦ǀ׀∣❘

    Watch out for that RTL character in there.
  • my mac keyboard has an unbroken bar.

    Vorn
  • Gabe was the closest on this one.   And in a way, it was IBM's fault, but not the way you might think.

    As I recall, the original ASCII specification, ANS X3.4-1968 had the broken bar, and so the keyboard committee (a different group) used it when the X4.22 and X4.23 keyboard standards were created.  (I'd go look for it but I don't want to move off this page.)

    The problem had to do with reserving the stylization of "!" as "|" (for the logical-or symbol) and of "^" as the logical-not symbol, "¬".  To allow for that, there could be no vertical bar already in the code and so position 7/12 got the broken bar.

    I have this vague recollection that an objection from IBM was involved.  There could have also been some concern that the 7/12 position was available for customization in international usage.

    In the X3.4-1977 revision, it was observed that those stylizations never caught on and the idea was removed.  Also, IBM hadn't implemented ASCII very much anyhow.  (IBM's move to ASCII didn't start in earnest until introduction of the PC and Microsoft may deserve some credit here.)  The vertical bar was restored to the 7/12 position in line with the International Reference version of ISO 646-1973.  It appears that the keyboard standard "got stuck".

  • It's all coming back to me (and thank heavens for digital libraries that go way back).

    The first ASCII (ANS X3.4-1963) was defined in a 7-bit code, but there were 29 undefined code points.  http://doi.acm.org/10.1145/366707.367524

    The lower-case letters were not defined yet, nor were any of the special characters in the same sticks as the lower case letters.  There were also 4 control-code positions in the top (7/12-7/15) positions.  Keep in mind that having a 4-bit subset and a good 6-bit (64-character) subset was important at that time.  Also think EBCDIC 64-character subsets with "|" and "¬" already tucked in there.  Think PL/I programming.

    When the filled-out code was being proposed and brought out for public comment, the situation was rather different.  The vertical bar was proposed for 7/14, there was a "¬" in 7/12 (but called overbar with a hook for readability), the tilde was in 6/12 and caret was in 6/14.  There were some pretty amazing intermediate stages, documented in http://doi.acm.org/10.1145/363831.363839

    In the rearrangement before X3.4-1968 was completed and agreed to, the back-slash arrived and the organization became what it is now.  The vertical bar became broken so that a 64-point subset could have vertical bar as a substitute for ! as lobbied for by IBM.  In ISO 646-1973 the tilde disappeared and the overbar (without the hook) ended up at code point 7/14.  I don't think I ever saw that used, but I can't verify that X3.4-1968 went directly to tilde at 7/14.

    It helps to remember that while all of this was being figured out, most computer memory organizations and printer/display capabilities made 6-bit character codes the norm.  There were few peripheral units that provided for more characters than that and I never saw a punched card with lower-case codes on it, though I'm sure there were some.  

    Although EBCDIC had been introduced (along with System/360's 8-bit bytes), it was a sparse 8-bit code and the telecommunication folk were having none of it.  It took minicomputers and teletype terminals to bring ASCII into serious use for computing.    
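
The "7/12" and "7/14" positions in the last few comments are column/row references into the 7-bit code table: the column is the high three bits and the row is the low four. A minimal sketch of that arithmetic follows, assuming the usual reading where the code point is column times 16 plus row. The helper name `from_column_row` is made up for illustration, and the results are the characters at those positions in today's ASCII, not in the historical drafts described above.

```python
import unicodedata


def from_column_row(column, row):
    """Map a column/row position (e.g. 7/12) to the character at that code point."""
    if not (0 <= column <= 7 and 0 <= row <= 15):
        raise ValueError("column must be 0-7 and row must be 0-15")
    return chr(column * 16 + row)


# Positions mentioned in the comments, plus the "!" and "^" positions.
for column, row in [(7, 12), (7, 14), (2, 1), (5, 14)]:
    ch = from_column_row(column, row)
    print(f"{column}/{row} -> U+{ord(ch):04X} {ch!r} {unicodedata.name(ch)}")
```

So 7/12 is 0x7C, the same code point the post calls U+007c, which is why the key's output stayed put while the glyph printed for it changed between revisions.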