Larry Osterman's WebLog

Confessions of an Old Fogey
How do I compare two different NetBIOS names?

On one of our internal aliases, someone asked the following question:

[i]s there any API that I can use to do case insensitive comparison of two OEM strings? (NetBIOS names are encoded in OEMCP.)

Wow, that question was a blast from the past.  Windows Networking before NT 3.1 (which includes NetBIOS) had this undeclared and undefined construct called the "network codepage". Essentially an administrator was required to decide what the single codepage was for every computer on the network, and ensure that all computers were running in the same codepage.

History lesson: A NetBIOS "name" is actually a series of 16 octets, and as such can only be compared by memcmp. In DOS 3.1 (1984, which was before DNS was designed), Microsoft layered the concept of a "computer name" on top of a NetBIOS name. It did that by uppercasing (using the internal DOS case mapping table) the computername being contacted and setting that as the NetBIOS name on the PC Lan Adapter.
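To make the "16 octets, compared by memcmp" point concrete, here is a minimal Python sketch. The helper names are hypothetical, the names are restricted to ASCII so that the case mapping is unambiguous, and the space padding plus trailing suffix byte are an assumption based on the common Microsoft convention, not something described above.

```python
NETBIOS_NAME_LEN = 16  # a NetBIOS name is always exactly 16 octets


def to_netbios_name(computer_name: str, suffix: int = 0x00) -> bytes:
    """Hypothetical helper: uppercase an ASCII computer name and pad it to 16 octets.

    Assumption: 15 name octets padded with spaces, followed by one suffix octet.
    """
    raw = computer_name.upper().encode("ascii")  # raises UnicodeEncodeError on non-ASCII
    if len(raw) > NETBIOS_NAME_LEN - 1:
        raise ValueError("computer name does not fit in 15 octets")
    return raw.ljust(NETBIOS_NAME_LEN - 1, b" ") + bytes([suffix])


def same_netbios_name(a: bytes, b: bytes) -> bool:
    """Octet-for-octet comparison, the moral equivalent of memcmp(a, b, 16) == 0."""
    return len(a) == len(b) == NETBIOS_NAME_LEN and a == b
```

Once the uppercasing has happened on the sending side, nothing in the comparison knows about case at all: two names match only if all 16 octets are identical.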

When DOS 3.3 came out, its major innovation was to borrow the concept of a "code page" from IBM's mainframe systems. Essentially it meant that instead of the case mapping table being hard coded into the OS, it was loaded by an application (chcp). Note that there was still only one codepage per system, and that codepage case mapping was still per-machine. As such, if you have machines with more than one codepage on the network, you're likely to have issues if those computernames contain internationalized characters. We received complaints from customers at the time about this; Microsoft's answer was essentially "don't use international characters in your computer names".

Windows 1.0 added a second codepage to DOS, called the "ANSI Codepage" (or Windows Codepage). Windows applications used this codepage, while MS-DOS continued to use the codepage loaded by chcp. This MS-DOS codepage became known as the OEM codepage.

Fast forward 8 years when NT 3.1 came out. NT 3.1 still had a single OEM codepage, but added support for Unicode. Millions rejoiced, especially those customers who got pissed off by our answer from 8 years earlier. NT 3.1's rules were slightly better than MS-DOS's rules. NT 3.1 took the Unicode computername and uppercased it using the current active codepage. It then converted that uppercase Unicode string to the single OEM codepage and used that series of octets as the computer name.

The customers who had been pissed off by our answer 8 years earlier were somewhat happier, but not very much. If you have more than one codepage on the network, you STILL can have issues because the upper casing rules are still per-machine, and characters uppercase differently depending on the character set on the machine. Essentially NT 3.1 helped things for some computer names, but we STILL had this undefined, undeclared concept of a "network codepage".
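A rough Python sketch of why the missing "network codepage" still bites: the same Unicode computer name, uppercased and then converted to two different OEM codepages, comes out as two different octet strings. The function name and the sample computer name are made up for illustration, Python's built-in `str.upper()` stands in for whatever case mapping the real machine uses, and `cp437` and `cp737` are simply two OEM codepages that both happen to contain the character in question.

```python
def oem_name_octets(computer_name: str, oem_codepage: str) -> bytes:
    """Hypothetical stand-in for the NT 3.1 steps described above.

    1. Uppercase the Unicode computer name (here: Python's Unicode rules,
       which are an assumption, not the actual Windows tables).
    2. Convert the result to the machine's single OEM codepage.
    """
    upper = computer_name.upper()
    return upper.encode(oem_codepage)  # raises UnicodeEncodeError if unrepresentable


name = "γ-server"  # made-up computer name containing a Greek letter

us_machine = oem_name_octets(name, "cp437")     # machine running the US OEM codepage
greek_machine = oem_name_octets(name, "cp737")  # machine running a Greek OEM codepage

# Same computer name, different octets on the wire.
print(us_machine.hex(), greek_machine.hex())
print(us_machine == greek_machine)
```

Each machine can round-trip its own octets back to the right name, but a byte-wise comparison between the two machines fails even though both started from the same computer name.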

 

As far as I know, this is still the state of the world w.r.t. NetBIOS names.

In general, you're better off matching the two computernames (i.e. before the Unicode to OEM conversion) than trying to match the two NetBIOS names.
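As a sketch of that advice, assuming you still have both computer names as Unicode strings, the comparison can be done before any codepage is involved. The function name is hypothetical, and Python's `str.upper()` is only an approximation of the uppercasing Windows would apply.

```python
def same_computer_name(a: str, b: str) -> bool:
    """Compare two computer names case-insensitively while they are still Unicode.

    Assumption: uppercasing both sides with Python's Unicode rules is close
    enough to the system's own rules for the names being compared.
    """
    return a.upper() == b.upper()


print(same_computer_name("γ-server", "Γ-SERVER"))
print(same_computer_name("alpha", "beta"))
```

Because no OEM conversion happens, the result does not depend on which codepage either machine is running.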

 

If you were to root cause the problem, the issue is that most networking protocols were not designed with internationalization in mind. As a result, most of them seem to assume that both sides of the network transaction are running with the same internationalization rules.  It's not surprising, honestly. I was involved in some of the efforts to define internationalization extensions to the IMAP4 protocol and it turned into a swamp. The problem was that at the time (late 1990s) there weren't many international standards for case folding, so the group was stuck with essentially punting the problem to the host OS, which wasn't considered a good solution because many OSes had limited support for multiple case folding tables.  As a result, networking protocols that specify case insensitivity tend to describe their command verbs as being in the 7-bit ASCII set (which has relatively straightforward case folding rules) and punt the problem of case folding to the server (which essentially means that you either support case sensitivity or you assume some kind of network codepage).
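The "7-bit ASCII verbs" approach is easy to sketch in Python. The function names are hypothetical and the verbs used in the usage lines are only illustrative; the point is that only the octets for A-Z are folded, and everything above 0x7F is left untouched so no codepage is ever consulted.

```python
def ascii_fold(data: bytes) -> bytes:
    """Fold only the 7-bit ASCII letters A-Z to lowercase; leave all other octets alone."""
    return bytes(b + 0x20 if 0x41 <= b <= 0x5A else b for b in data)


def same_verb(a: bytes, b: bytes) -> bool:
    """Case-insensitive comparison that is well defined without knowing any codepage."""
    return ascii_fold(a) == ascii_fold(b)


print(same_verb(b"FETCH", b"fetch"))  # ASCII letters fold as expected
print(same_verb(b"\xc9", b"\xe9"))    # high-bit octets are compared as-is
```

The second comparison is the interesting one: whether 0xC9 and 0xE9 are "the same letter in a different case" depends entirely on the codepage, so an ASCII-only fold deliberately refuses to guess.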

  • The DNS specification was first published in November 1983 by the IETF (RFC 881 through RFC 883).   I would argue that DNS had been "designed" by 1984, just not used (at least on a PC).

  • Richard, fair 'nuf, I do know that at the time I left CMU (August 1984) it had not yet been deployed on any of the hosts at CMU - they were still using host tables.

  • > Microsoft's answer was essentially "don't use international

    > characters in your computer names".

    That needs some adjusting.  In most countries where Microsoft operates, the answer is "don't use your own national characters in your computer names, ONLY use international characters from one particular country".  Now, some Windows systems still sometimes warn when this advice is violated, and that warning is pretty much accurate.  However, some don't obey their own warning!  For example when Vista is being installed, it might recommend a computer name which is based on what might be the owner's name, which isn't very likely to be in ASCII.

    > we STILL had this undefined, undeclared concept of a

    > "network codepage".

    We still DON'T have a "network codepage", which is why some problems persist to this day.  I think it doesn't explain why odd occurrences can be observed within a single computer, and it doesn't explain why filenames might still have problems, but it very well does explain why NetBIOS names still have problems.

    For practical purposes it remains better to use a subset of ASCII for computer names.  For the same practical purposes, it would be better if Windows would make that recommendation consistently.  Sure it requires holding noses while doing so, but that's easier than tracking down bugs and incompatibilities afterwards.

  • (no, this post is not about a rap or hip hop song, or its lyrics, though I admit the title may have been

  • Had netbios names been limited to what was reasonable at the time, such as a subset of ASCII (7-bit) (pulling something out of my behind and saying [a-z][A-Z][0-9] + some extra chars like ",.;_+-*#$%&", where I'd be very reluctant to include ",.;$%&" and especially "*" :-), leaving "_+-"), this would have been a non-problem then, and a non-problem now. Yes, even when shifting case.

    But let me ask: has anyone here encountered a netbios name having chars outside this subset? If yes, have they created more joy or more trouble?

    I know, hindsight is 20-20, but this should really have been a no-brainer - even at the time.

  • NetBIOS "names" were really a sequence of 16 octets.  Nothing more, nothing less.

    Any data could be put in those 16 byte fields.

    And yes, I've encountered netbios names outside this subset.  Many, many, many times.

  • Thursday, July 19, 2007 1:15 PM by Mike

    > ,.;_+-*#$%&

    I notice you omit @.  Various Windows systems with various settings have various problems with @ in various places.  You should have omitted $ for the same reason, instead of adding it to your "reluctant" list.  Actually I recall reading that $ had a meaning in NetBIOS even before Microsoft copied it.

    > this would have been a non-problem then, and a non-problem

    > now. Yes, even when shifting case.

    It would be a non-problem if shifting case would be done the way Turkey does it.  Well maybe not completely, because it might be a problem outside Turkey, but that doesn't count.  Turks know why Turkey is the only country that counts.  Americans might disagree with one of the details there but Americans understand that reasoning very well.

    Thursday, July 19, 2007 1:28 PM by LarryOsterman

    > Any data could be put in those 16 byte fields.

    Yes that is true.  And this is why Windows sometimes gives warnings (not errors) where, for practical purposes, some selections might not be advisable.  I think it would help if Windows would more accurately diagnose these conditions, for practical purposes, even though we have to hold our noses while doing so.

    One common reason for encountering NetBIOS names outside that subset is the installer for Windows in the first place.  After asking what the owner's name is going to be, it recommends a machine name that includes the owner's name.  If I recall correctly, during installation it doesn't even warn at all that the recommended machine name will have a high probability of NetBIOS problems.

  • As is being discussed on MichKap's site, http://blogs.msdn.com/michkap/archive/2007/08/16/4421520.aspx, non-Windows NetBIOS names and comments cause a different problem altogether - the default is to use the MacRoman single quote, which isn't handled by any of these options. Any advice?
