(This page was originally posted at http://i18nWithVB.com/win2k.htm but I thought it could use a wider audience)
A lot of work was done to this dialog since Windows 2000, including massive shifts in terminology. Here is the handy-dandy conversion chart of the most important items:
| Windows 2000 term | Windows XP/Server 2003 term |
| --- | --- |
| Regional Options | Regional and Language Options |
| Default User Locale | Standards and Formats |
| Default System Locale | Language for Non-Unicode Programs |
| Language Settings for System | Supplemental Language Support |
There are many other changes, as well. While I do welcome change when there are confusing issues, I am not sure how much I welcome change that others will find to be just as confusing. I'll let you decide how you feel about this particular issue yourself....
Anyway, here are some screenshots for the three important tabs for the dialog:
The first change is obvious -- the settings that used to show up on the first tab are now spread across three of them.
Here is each part, explained:
Language for Standards and Formats - Located in the first tab, these are the preferences that you, the user, have for items like date formats, calendar, preferences for text sorting, etc. Most of these settings can now be handled individually by clicking on the Customize button. You can think of this dropdown combobox as a useful way to be lazy and have settings made automatically based on the locale you choose. There are really no standards per se involved (such as sorting), but not everything there is a format so there had to be something else there.
I will talk about the Location stuff some other time.
Supplemental Language Support -- In Windows 2000 this was a list containing various families of locales corresponding to language groups, but now most of the support is already installed and turned on. In fact, there are only two groups that are not:
This information is in the second tab and is handled by two checkboxes. These two checkboxes control the installation of all the code pages, fonts, keyboards, etc. so that applications can support the particular language. You will probably be prompted for your Windows CD to install the files that you are in essence requesting.
The top of the second tab handles input methods. I will talk about that more another time.
User Interface Language - You may not have this control on your regional options at all; it is only there if you have MUI (the Multilingual User Interface) installed. This allows you to change the actual language of Windows itself. It has no effect, I repeat no effect, on your installation of Windows otherwise. At all. Period. If you think it will, then cure yourself of this delusion and realize that you do not need MUI to have a multilingual experience on Windows XP and Server 2003!
Language for Non-Unicode Programs (aka Default System Locale) - Located on the third tab, this setting is the one that controls, at the machine level, the locale that will be used for all conversions to and from Unicode for applications without Unicode support (like VB 6.0, for example). If you change the Default System Locale, you will be prompted to reboot afterwards (you may be prompted for your Windows CD first if you need to install some files). But I cannot stress it strongly enough: this is the top control on the third tab. You would not believe how many people mess this up and try to change the language at the top of the first tab under "Standards and Formats"! So think carefully and allow yourself to be one of the people laughing about the confusion, rather than one of the people being laughed at.
Incidentally, it also controls the font "language" that is used for the case of [primarily] East Asian fonts that have more than one name, based on language.
Under this are the various code pages you can install. I recommend you use Unicode and avoid needing these things. :-)
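To make the effect of this setting a little more concrete: a non-Unicode program hands Windows raw bytes, and the code page behind the system locale decides which characters those bytes turn into. The sketch below is only an illustration in Python, not anything the dialog or VB actually runs; the sample bytes and the three code pages (1252, 1251, 1253) are assumptions picked for the demonstration.

```python
# Illustration only: the same bytes become different text under different
# ANSI code pages, which is why the system locale matters to non-Unicode apps.
SAMPLE = bytes([0xE0, 0xE1, 0xE2])  # hypothetical text saved by a non-Unicode app


def decode_with(code_page: str, data: bytes = SAMPLE) -> str:
    """Decode data the way a system using the given code page would."""
    return data.decode(code_page)


if __name__ == "__main__":
    # Western European, Cyrillic, and Greek readings of the same three bytes.
    for cp in ("cp1252", "cp1251", "cp1253"):
        print(cp, decode_with(cp))
```

The bytes never change; only the interpretation does, which is the same thing that happens when the system locale is switched underneath an application that stores its text in a code page.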
Default Settings - Although the title is misleading, this checkbox located on the bottom of the third tab is incredibly useful in many situations. What it does is apply any changes you make on any of the three tabs to .DEFAULT, the default user profile (copied for all new user accounts), and several system accounts. In the case of keyboards, it copies all keyboards that have been selected by the given user, whether they were selected at this time or not, and applies them to the .DEFAULT account. The latter is very useful if you want the ability to switch keyboards in the logon dialog.
This setting does not exist in prior versions, which is a damn shame since people try all the time to e.g. set the default user locale on a web server and expect that change to be applied to their IIS. It does not immediately occur to most people that the setting only applies to the currently logged in user; unfortunately, understanding this is likely to piss off any reasonable person, since Windows 2000 does not provide any user interface to resolve the issue. Thankfully, much of the problem is taken care of with this one confusing setting.
That's all for now. Let me know if you have any questions or comments about this page!
The MSDN documentation generally recommends that you use the static versions of libraries like the C Runtime (CRT) or the Microsoft Foundation Classes (MFC). The reason for this is that the DLL versions have not been built for MSLU and thus have no knowledge of the need to use the Microsoft Layer for Unicode for Unicode APIs. However, many complex applications really need to use the DLL versions of these libraries. If you are the developer of one of these applications, you will need to rebuild them so they link with Unicows.lib. The following is a small guide on how to perform this task.

This document is divided into 3 parts:

1) How to build the C Runtime Library 7.1 with MSLU
2) How to build MFC 7.1 with MSLU
3) Switching between non MSLU builds and MSLU builds

The fine print

For more info on rebuilding MFC extension DLLs, see TN033: DLL Version of MFC, specifically the section entitled "Building the MFC DLL" towards the bottom of the article. Our steps here seem a lot nicer. :-)

All of these steps were used to build DLLs that were subsequently tested on Win98 SE. They are expected to work on all platforms.

Special thanks are owed to Ted W. for taking the time to do what we all knew was theoretically possible and making it technically possible for everyone. This document is mostly due to his efforts. Thanks, Ted!

In all instructions below, the assumption is a default install path and an en-US copy of Windows; if either is not the case, make sure you replace paths such as C:\Program Files\Microsoft Visual Studio .NET 2003 with the appropriate install location.

Also, special thanks to Tim Dowty of Music Match for the great text of step #4 under the MFC build!

Before you start:

Install Visual Studio .NET 2003 including all necessary files. When you install Visual Studio .NET 2003, make sure that both the Unicode MFC version and the CRT source code are installed.

Identify the folders and files you will be modifying.
If you installed to the default locations, all of the files we need to change are contained in the tree \Program Files\Microsoft Visual Studio .NET 2003\VC7. Find the ATLMFC\SRC folder and the CRT\SRC folder.

Install the Platform SDK and copy the latest unicows.lib file to your VC7\PlatformSDK\LIB folder. Since VC7 comes with a unicows.lib, this step is optional, although it is good to be on the latest unicows.lib from the most recent Platform SDK.

How to build the CRT 7.1 with MSLU

The first thing we want to do is make a backup of our VC7\Lib folder. We will be replacing files in it, so if we need to go back (or switch between MSLU and non-MSLU versions of the CRT) we can always do that.

Secondly, let's copy the VC7\CRT\SRC folder to a comfortable place so we can change it and build from it. For example, we'll copy it to the root of C: so we have a folder called C:\SRC, available for quick access from the command line.

When building the CRT we are actually building two DLLs: MSVCR71.DLL and MSVCP71.DLL. Since we are building both debug and release builds, that makes a total of four DLLs we need to build.

In the SRC folder, there is a provided batch file, bldwin9x.bat, that will build all the CRT DLLs, and an associated makefile.

Now we will open up the makefile in notepad. At the top of the makefile there is a section that controls the naming of the two DLLs to build. For this purpose we will use the name MSLU as a prefix to all of the DLLs instead of the standard name MSVC. So the four names of the DLLs we will create are:

    MSLUR71.DLL
    MSLUR71D.DLL
    MSLUP71.DLL
    MSLUP71D.DLL

Warning: Since most people who follow these steps will probably use the exact names given here, please be sure to keep these versions of the DLLs in your own private directory when you use them.

The default names provided in the makefile are _SAMPLE_ and SAMPLE_P. There are associated RC and DEF files for each of these names, so we need to copy them to the new names, i.e.
    copy _SAMPLE_.RC MSLUR71.RC
    copy SAMPLE_P.RC MSLUP71.RC
    copy SAMPLE_P.DEF MSLUP71.DEF
    copy SAMPLD_P.DEF MSLUP71D.DEF
    copy Intel\_SAMPLE_.DEF Intel\MSLUR71.DEF
    copy Intel\_SAMPLD_.DEF Intel\MSLUR71D.DEF

Next we need to change the LIBRARY name in each of the above DEF files to match the name of the DEF file. Open up each file in notepad to make the change.

The provided makefile needs some minor changes to get it to work properly and link with Unicows.lib.

Change the top block of defines to the following:

    RETAIL_DLL_NAME=MSLUR71
    RETAIL_LIB_NAME=MSLUR71
    RETAIL_DLLCPP_NAME=MSLUP71
    RETAIL_LIBCPP_NAME=MSLUP71
    DEBUG_DLL_NAME=MSLUR71D
    DEBUG_LIB_NAME=MSLUR71D
    DEBUG_DLLCPP_NAME=MSLUP71D
    DEBUG_LIBCPP_NAME=MSLUP71D

The VCTOOLS path should be changed to point to the path where you installed Visual Studio .NET 2003, e.g.

    VCTOOLS=C:\Program Files\Microsoft Visual Studio .NET 2003\VC7

We want to link to unicows.lib before any other lib files. On lines 1216, 1268, 1315, and 1364, change kernel32.lib to:

    unicows.lib kernel32.lib advapi32.lib user32.lib gdi32.lib shell32.lib comdlg32.lib version.lib mpr.lib rasapi32.lib winmm.lib winspool.lib vfw32.lib oleacc.lib oledlg.lib

Once we make these changes, we are ready to build the DLLs. It's simple: launch a Visual Studio .NET 2003 command prompt (Start menu > Programs > Visual Studio .NET 2003 > Visual Studio .NET 2003 Tools > Visual Studio Command Prompt), then go to the C:\SRC folder and type:

    set VCTOOLS=C:\Program Files\Microsoft Visual Studio .NET 2003\VC7
    BLDWIN9X

Once the DLLs finish building they will be in a subfolder called BUILD\INTEL. The Libs, PDBs, and Maps are also in that folder.

Now we've got 4 libs (2 debug, 2 release) we can link to. Let's copy those new libs back to the original names of the libs, e.g.
    copy MSLUR71.LIB MSVCRT.LIB
    copy MSLUR71D.LIB MSVCRTD.LIB
    copy MSLUP71.LIB MSVCPRT.LIB
    copy MSLUP71D.LIB MSVCPRTD.LIB

The reason we do this is so we can link our existing apps (and build MFC) without having to change the libraries that they link to. The Libs still point to the newly named DLLs, even though they don't share the same names as the new ones anymore.

Now copy the 4 MSVC libs to the VC7\Lib folder (overwriting the existing ones).

Repeat step 4 (the build). This step is necessary to rebuild MSLUP71(D).DLL again so it links to our newly created MSVCRT(D).LIB (which points to our new MSLUR71(D).DLL).

The CRT build is now done.

Before proceeding any further we need to close the command prompt that we used to build the CRT, because it created certain environment variables that will cause compile errors in the next step, building the Unicode version of MFC.

Building MFC 7.1 Unicode version with MSLU

First we will make a backup of the following folders (and all of their subfolders): VC7\ATLMFC\LIB and VC7\ATLMFC\SRC, so we can restore them later if necessary.

Building the Unicode version of MFC is slightly easier than building the CRT. The Unicode version of MFC is 2 different DLLs (unlike the 5 different DLLs that we had to worry about when building MFC 6.0):

    MFC71U.DLL (Unicode Release)
    MFC71UD.DLL (Unicode Debug)

There is also a static component to even a DLL build of MFC, named as follows:

    MFCS71U.LIB (Unicode Release, static library, deprecated classes)
    MFCS71UD.LIB (Unicode Debug, static library, deprecated classes)

To build MFC, there is one master Makefile in the VC7\ATLMFC\SRC folder named:

    ATLMFC.MAK

And there is one Makefile in the VC7\ATLMFC\SRC\MFC folder named:

    MFCDLL.MAK

First, we will change the MFCDLL.MAK file to link to Unicows.lib.
In each file, after the line that states:

    link @<<

insert the following lines:

    /nod:kernel32.lib /nod:advapi32.lib /nod:user32.lib /nod:gdi32.lib /nod:shell32.lib /nod:comdlg32.lib /nod:version.lib /nod:mpr.lib /nod:rasapi32.lib /nod:winmm.lib /nod:winspool.lib /nod:vfw32.lib /nod:secur32.lib /nod:oleacc.lib /nod:oledlg.lib /nod:sensapi.lib
    unicows.lib kernel32.lib advapi32.lib user32.lib gdi32.lib shell32.lib comdlg32.lib version.lib mpr.lib rasapi32.lib winmm.lib winspool.lib vfw32.lib oleacc.lib oledlg.lib

They must go in that position. If we don't do this, a library reference will be included that causes unicows.lib to be linked after kernel32.lib (which will then cause the unicows.dll load to fail). Other DLLs in the wrong order will simply cause APIs in those specific DLLs to not be called.

The line number to insert the above two lines after is line 273.

Now, we will decide what to name our new DLL. We do not want to use the standard name(s) for the same reasons we did not use the standard names for the CRT. So we will come up with a simple naming convention: we'll add an "L" to the name. So the new names will be:

    MFC71LU.DLL
    MFC71LUD.DLL

Now we're ready to build the versions of MFC:

From a Visual Studio .NET 2003 Command Prompt, create a new batch file called buildmfc.bat in the ATLMFC\SRC folder with the following content:

    nmake -f atlmfc.mak MFC libname=MFC71L

This will build all MFC libraries, not just the Unicode DLLs, but it will save us the effort of figuring out how to use the MFCDLL.MAK makefile.

Run the batch file. If you need to rebuild any time in the future you now have a convenient batch file to do so. The DLL and PDB files will be created in the VC7\ATLMFC\SRC\MFC\INTEL folder. The LIB files will be created in the ATLMFC\LIB\INTEL folder.

There is one crucial step missing from the supplied MFC makefiles.
If you take a look at line 425 of …vc7\mfc\makefile, you'll see that one of the options passed to the compiler is /Zc:wchar_t, which causes wchar_t to become an implicit type. This may be what you want, but if the application you're linking the lib to wasn't compiled with this same option (and therefore has wchar_t #defined to unsigned short), you will get unresolved externals when you link. Your program is looking for function signatures with unsigned shorts in them, but the lib only exports wchar_t in the function signatures.

You could remove the /Zc:wchar_t from the makefile, but this solution isn't universal; it would still prevent linking with programs compiled with the /Zc:wchar_t switch.

A better solution is to do what Microsoft did in the original mfc71 libraries: include alias records in the library so that you can link both implicit wchar_t and unsigned short programs. Alias records allow a library to export multiple function signatures that resolve down to the same object code.

So how do you add alias records to your newly-built MFC libraries?

For both the debug and release MFC libraries you need to do the following:

a) Extract all of the alias records from the corresponding retail MFC library
b) Create a new library comprising only these alias records
c) Merge your new Unicows-compliant MFC library with the associated alias-record library

Step A)

This one requires a small detour because lib.exe only allows you to extract one object at a time. We want to automate this step by creating a batch file to do all of the extractions.

First, get a command prompt and make …\Vc7\atlmfc\lib your current directory. Next, create a list of all of the alias records in both debug and release MFC libs using the following two command lines:

    lib /LIST mfc71ud.lib > mfc71ud.lib.lst
    lib /LIST mfc71u.lib > mfc71u.lib.lst

You should now have the two .lib.lst files that each contain a list of library objects, one per line.
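Since each .lib.lst file is just one object name per line, picking out the alias records comes down to a pattern match. For anyone who would rather not install Perl, here is a minimal sketch of the same idea in Python. It is an illustrative equivalent, not the script used for this article: the function names are made up, the sample listing is invented, and it assumes alias members are named like _alias123.obj (the same pattern the Perl scripts in this article match). It only builds the command text and runs nothing.

```python
import re

# Same pattern the article's Perl scripts match: "_alias" + digits + ".obj".
ALIAS_RE = re.compile(r"_alias[0-9]+\.obj")


def alias_members(listing_lines):
    """Return the lines of a `lib /LIST` listing that name alias records."""
    return [line.strip() for line in listing_lines if ALIAS_RE.search(line)]


def extract_commands(listing_lines, target_lib, out_dir):
    """Build one LIB /EXTRACT command per alias record (nothing is executed)."""
    cmds = [f"md .\\{out_dir}"]
    for member in alias_members(listing_lines):
        cmds.append(
            f"LIB /EXTRACT:{member} /OUT:.\\{out_dir}\\{member} {target_lib}"
        )
    return cmds


if __name__ == "__main__":
    # Made-up listing: two alias records and one ordinary object.
    sample = ["_alias1.obj\n", "afxmem.obj\n", "_alias20.obj\n"]
    print("\n".join(extract_commands(sample, "mfc71ud.lib", "_aliasRecordsD")))
```

Writing the returned lines to a .bat file would give the same kind of batch file the Perl route produces; the slow part (running lib.exe once per record) is unchanged either way.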
Now, we will create a perl script to build a pair of batch files from the .lib.lst files (if you don't already have perl, it's freely available from several sources).

Start up a text editor and enter the following text:

    #!/usr/bin/perl
    # builds a batch file to extract all alias records
    # in the input file (input file created with lib.exe /LIST)
    $targetLib = "mfc71ud.lib";
    $outDir = "_aliasRecordsD";
    print "md .\\$outDir\n";
    while (<>) {
        # find alias record name
        if (/_alias[0-9]+\.obj/) {
            chomp;  # strip the trailing newline
            print "LIB /EXTRACT:$_ /OUT:.\\$outDir\\$_ $targetLib\n";
        }
    }

Save the text as BuildAliasExtractBatchD.pl.

Now edit the text so that the $targetLib variable is changed as follows:

    $targetLib = "mfc71u.lib";

Also change $outDir as shown:

    $outDir = "_aliasRecords";

Save the edited text as BuildAliasExtractBatch.pl.

Now run the two perl scripts as follows from the command prompt:

    perl BuildAliasExtractBatchD.pl mfc71ud.lib.lst > BuildAliasExtractD.bat
    perl BuildAliasExtractBatch.pl mfc71u.lib.lst > BuildAliasExtract.bat

At this point you have two batch files, one of which will extract the alias records from the debug library and one that will extract from the release library.

To complete step a) all that's left is to run the batch files. Note that there are about 2000 alias records in each MFC library, and extracting them one by one is a slow process; each library extraction took about 4 hours on a fast PC.

At the completion of this step, you will have two new directories under …\vc7\atlmfc\lib, each of which contains extracted alias records. Each extracted alias record is a file with a name of the form _alias*.obj where * is one to four decimal digits.

Step B)

For Step b), we want to create a new library from the extracted records. Fortunately, this can be done in two simple steps; in contrast to Step a) we can use a response file with lib.exe to simplify our operation.

First, we create a pair of perl scripts that will build the response files.
Use your favorite text editor to enter the following text:

    #!/usr/bin/perl
    # builds a response file for lib.exe to build a library of
    # alias records. (input file created with lib.exe /LIST)
    $outLib = "mfc71udAlias.lib";
    $aliasDir = "_aliasRecordsD";
    print "/OUT:$outLib";
    while (<>) {
        # find alias record name
        if (/_alias[0-9]+\.obj/) {
            chomp;  # strip the trailing newline
            print " .\\$aliasDir\\$_";
        }
    }

Save the file as CreateAliasLibD.pl. Now edit the variable declarations so they read:

    $outLib = "mfc71uAlias.lib";
    $aliasDir = "_aliasRecords";

and save the file as CreateAliasLib.pl.

Run the perl scripts from the command line. Note that we reuse the .lib.lst files we created in Step A) as input here:

    perl CreateAliasLibD.pl mfc71ud.lib.lst > mfc71udAlias.rsp
    perl CreateAliasLib.pl mfc71u.lib.lst > mfc71uAlias.rsp

With the response files made, we now use them with lib.exe to create the alias libraries:

    lib @mfc71udAlias.rsp
    lib @mfc71uAlias.rsp

At the completion of this step you will have two new libraries: mfc71udAlias.lib and mfc71uAlias.lib. They will each contain their respective alias records.

Step C)

In this step, we simply merge our custom-built MFC libraries with the alias libraries we just made. While we're at it, we'll also rename the libraries so they'll replace the original libraries. Note that we get our custom-built libraries directly from their output locations.

    lib /OUT:mfc71ud.lib .\Intel\MFC71LUD.lib mfc71udAlias.lib
    lib /OUT:mfc71u.lib .\Intel\MFC71LU.lib mfc71uAlias.lib

After the building is done, we need to copy the rest of the created LIBs in the VC7\ATLMFC\LIB\INTEL folder back to their original names in VC7\ATLMFC\LIB (overwriting what's there) so that any of our apps that we link will use the new DLLs, i.e.

    copy MFC71LSU.LIB MFCS71U.LIB
    copy MFC71LSUD.LIB MFCS71UD.LIB

The reason we do this is the same as for the CRT: we don't have to worry about changing any linker options in our projects to link to the new version of MFC.
We should also copy the PDB files from the LIB\INTEL folder back to the LIB folder.

Now we're ready to do a test build of an application. Create a new SDI MFC application using the AppWizard, choose dynamic MFC, create a Unicode Debug and Release build, change the settings to link to unicows.lib, copy the newly created CRT and MFC DLLs to your DEBUG or RELEASE build folder(s), and then run the application. It should all work. Use Dependency Walker to make sure that everything is getting linked properly and the proper DLLs are being loaded (run a profile in Dependency Walker). There should be no references to the old names of the DLLs for either the CRT or MFC.

Switching between non MSLU builds and MSLU builds

Because we have done all of the above, any Unicode build on your machine will now link to MSLU. We may not necessarily want this, or we may want to link back to the original CRT and MFC DLLs. This is what we made the backups for. To restore the system, simply restore your VC7\LIB and VC7\ATLMFC\LIB folders. You could even make a simple batch file that copies older or newer versions of the LIBs back to the LIB folders depending on what you want to build.
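The "simple batch file" idea can be sketched in a few lines. The version below is in Python rather than batch, purely as an illustration: the function name switch_libs and the notion of a "saved set" folder (one folder holding the original LIBs and PDBs, another holding the MSLU ones) are assumptions, not something the steps above create for you under those names.

```python
import shutil
from pathlib import Path


def switch_libs(saved_set: Path, live_dir: Path) -> list:
    """Copy every .lib/.pdb file from a saved set over the live LIB folder.

    saved_set: folder holding one complete set (e.g. your backup of the
               original LIBs, or a copy of the MSLU-built ones).
    live_dir:  the LIB folder the linker actually uses.
    Returns the names of the files that were copied.
    """
    copied = []
    for src in sorted(saved_set.iterdir()):
        # Only touch library and debug-symbol files; leave anything else alone.
        if src.suffix.lower() in (".lib", ".pdb"):
            shutil.copy2(src, live_dir / src.name)
            copied.append(src.name)
    return copied
```

You would call it once per LIB folder, for example with your backup of VC7\LIB as the saved set and the real VC7\LIB as the live folder, and again for VC7\ATLMFC\LIB. Keeping both sets in separate folders means neither one ever gets lost by an overwrite.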
The MSDN documentation generally recommends that you use the static versions of libraries like the C Runtime (CRT) or the Microsoft Foundation Classes (MFC). The reason for this is that the DLL versions have not been built for MSLU and thus have no knowledge of the need to use the Microsoft Layer for Unicode for Unicode APIs. However, many complex applications really need to use the DLL versions of these libraries. If you are the developer for one of these applications, you will need to rebuild them so they link with Unicows.lib. The following is a small guide on how to perform this task.
This document is divided into 3 parts
For more info on rebuilding MFC extension DLLs, see TN033: DLL Version of MFC, specifically the section entitled "Building the MFC DLL" towards the bottom of the article. Our steps here seem a lot nicer. :-)
All of these steps were used to build DLLs that were subsequently tested on Win98 SE. They are expected to work on all platforms.
Special thanks are owed to Ted W. for taking the time to do what we all knew was theoretically possible and making it technically possible for everyone. This document is mostly due to his efforts. Thanks, Ted!
In all instructions below, the assumption is a default install path and an en-US copy of Windows; if either is not the case, make sure you replace paths such as C:\Program Files\Microsoft Visual Studio 2003 with the appropriate install location.
Also, special thanks to Tim Dowty of Music Match for the great text of step #4 under the MFC build!
Before you start:
The first thing you need to do is make sure that when you install Visual Studio .NET 2003 that you make sure both the Unicode MFC version and the CRT source code are installed.
If you installed to the default locations, all of the files we need to change are contained in the tree \Program Files\Microsoft Visual Studio .NET 2003\VC7. Find the ATLMFC\SRC folder and the CRT\SRC folder.
The first thing we want to do is make a backup of our VC7\Lib folder. We will be replacing files in it, so if we need to go back (or switch between MSLU and non-MSLU version of the CRT) we can always do that.
Secondly, let's copy the VC7\CRT\SRC folder to a comfortable place so we can change it and build from it. For example, we'll copy it to the root of C: so we have a folder called C:\SRC, available for quick access from the command line.
When building the CRT we are actually building two DLLs: MSVCR71.DLL and MSVCP71.DLL. Since we are building both debug and release builds it makes a total of four DLLs we need to build.
In the SRC folder, there is a provided batch file bldwin9x.bat that will build the all CRT DLLs and an associated makefile.
Now we will open up the makefile in notepad. At the top of makefile there is a section that controls the naming of the two DLLs to build. For this purpose we will use the name MSLU as a prefix to all of the DLLs instead of the standard name MSVC. So the four names of the DLLs we will create are:
MSLUR71.DLL MSLUR71D.DLL MSLUP71.DLL MSLUP71D.DLL
Warning: Since most people who follow these steps will probably use the exact names given here, please be sure to keep these versions of the DLLs in your own private directory when you use them.
The default names provided in the makefile are _SAMPLE_ and SAMPLE_P. There are associated RC and DEF files for each of these names, so we need to copy them to the new names, i.e.
copy _SAMPLE_.RC MSLUR71.RC copy SAMPLE_P.RC MSLUP71.RC copy SAMPLE_P.DEF MSLUP71.DEF copy SAMPLD_P.DEF MSLUP71D.DEF copy Intel\_SAMPLE_.DEF Intel\MSLUR71.DEF copy Intel\_SAMPLD_.DEF Intel\MSLUR71D.DEF
Next we need to change the LIBRARY name in each of the above DEF files to match the name of the DEF file. Open up each file in notepad to make the change.
The provided makefile needs some minor changes to get it to work properly and link with Unicows.lib.
RETAIL_DLL_NAME=MSLUR71 RETAIL_LIB_NAME=MSLUR71 RETAIL_DLLCPP_NAME=MSLUP71 RETAIL_LIBCPP_NAME=MSLUP71 DEBUG_DLL_NAME=MSLUR71D DEBUG_LIB_NAME=MSLUR71D DEBUG_DLLCPP_NAME=MSLUP71D DEBUG_LIBCPP_NAME=MSLUP71D
VCTOOLS=C:\Program Files\Microsoft Visual Studio .NET 2003\VC7
line 1216, 1268, 1315, 1364 change kernel32.lib to:
unicows.lib kernel32.lib advapi32.lib user32.lib gdi32.lib shell32.lib comdlg32.lib version.lib mpr.lib rasapi32.lib winmm.lib winspool.lib vfw32.lib oleacc.lib oledlg.lib
Once we make these changes, we are ready to build the DLLs. It's simple – launch a Visual Studio .NET 2003 command prompt (start menu-programs-Visual Studio .NET 2003 – Visual Studio .NET 2003 tools – Visual Studio command prompt) and then go to the C:\SRC folder and type:
set VCTOOLS=C:\Program Files\Microsoft Visual Studio .NET 2003\VC7 BLDWIN9X
Once the DLLs finish building they will be in a subfolder called BUILD\INTEL. The Libs, PDBs, and Maps are also in that folder.
copy MSLUR71.LIB MSVCRT.LIB copy MSLUR71D.LIB MSVCRTD.LIB copy MSLUP71.LIB MSVCPRT.LIB copy MSLUP71D.LIB MSVCPRTD.LIB
The reason we do this is so we can link our existing apps (and build MFC) without having to change the libraries that they link to. The Libs still point to the newly named DLLs, even though they don't share the same names as the new ones anymore.
The CRT build is now done.
Before proceeding any further we need to close the command prompt that we used to build the CRT because it created certain environment variables that will cause compile errors in the next step, building the Unicode version of MFC.
First we will make a backup of the following folders (and all subfolders of): VC7\ATLMFC\LIB, and VC7\ATLMFC\SRC so we can restore them later if necessary.
Building the Unicode version of MFC is slightly easier than building the CRT. The Unicode version of MFC is 2 different DLLs (unlike the 5 different DLLs that we had to worry about when building MFC 6.0):
There is also a static component to even a DLL build of MFC, named as follows:
To build MFC, there is one master Makefile in the VC7\ATLMFC\SRC folder named:
And there is one Makefile in the VC7\ATLMFC\SRC\MFC folder named
link @<<
insert the following lines:
/nod:kernel32.lib /nod:advapi32.lib /nod:user32.lib /nod:gdi32.lib /nod:shell32.lib /nod:comdlg32.lib /nod:version.lib /nod:mpr.lib /nod:rasapi32.lib /nod:winmm.lib /nod:winspool.lib /nod:vfw32.lib /nod:secur32.lib /nod:oleacc.lib /nod:oledlg.lib /nod:sensapi.lib unicows.lib kernel32.lib advapi32.lib user32.lib gdi32.lib shell32.lib comdlg32.lib version.lib mpr.lib rasapi32.lib winmm.lib winspool.lib vfw32.lib oleacc.lib oledlg.lib
They must go in that position, if we don't do this then a library reference will be included causing unicows.lib to be linked after kernel32.lib (which will then cause the unicows.dll load to fail). Other DLLs in the wrong order will simply cause APIs in those specific DLLs to not be called.
The line number to insert the above two lines after is line 273.
From a Visual Studio .NET 2003 Command Prompt, create a new batch file called buildmfc.bat in the ATLMFC\SRC folder with the following content:
nmake -f atlmfc.mak MFC libname=MFC71L
This will build all MFC libraries, not just the Unicode DLLs, but it will save us the effort of figuring out how to use the MFCDLL.MAK makefile.
Run the batch file. If you need to rebuild any time in the future you now have a convenient batch file to do so. The DLL and PDB files will be created in the VC7\ATLMFC\SRC\MFC\INTEL folder. The LIB files will be created in the ATLMFC\LIB\INTEL folder.
There is one crucial step missing from the supplied MFC makefiles. If you take a look at line 425 of …vc7\mfc\makefile, you’ll see that one of the options passed to the compiler is /Zc:wchar_t, which causes wchar_t to become an implicit type. This may be what you want, but if the application you’re linking the lib to wasn’t compiled with this same option (and -- therefore -- has wchar_t #defined to unsigned short), you will get unresolved externals when you link. Your program is looking for function signatures with unsigned shorts in them, but the lib only exports wchar_t in the function signatures.
You could remove the /Zc:wchar_t from the makefile, but this solution isn’t universal; it would still prevent linking with programs compiled with the /Zc:wchar_t switch.
A better solution is to do what Microsoft did in the original mfc71 libraries: include alias records in the library so that you can link both implicit wchar_t and unsigned short programs. Alias records allow a library to export multiple function signatures that resolve down to the same object code.
So how do you add alias records to your newly-built MFC libraries?
For both the debug and release MFC library libraries you need to do the following:
a) Extract all of the alias records from the corresponding retail MFC library
b) Create a new library comprising only these alias records
c) Merge your new Unicows-compliant MFC library with the associated alias-record library
Step A)
This one requires a small detour because lib.exe only allows you to extract one object at a time. We want to automate this step by creating a batch file to do all of the extractions.
First, get a command prompt and make …\Vc7\atlmfc\lib your current directory. Next, create a list of all of the alias records in both debug and release MFC libs using the following two command lines:
lib /LIST mfc71ud.lib > mfc71ud.lib.lstlib /LIST mfc71u.lib > mfc71u.lib.lst
You should now have the two .lib.lst files that each contain a list of library objects, one per line.
Now, we will create a perl script to build a pair of batch files from the .lib.lst files (if you don’t already have perl, it’s freely available from several sources. You can find Perl here).
Start up a text editor and enter the following text:
#!/usr/bin/perl
# Builds a batch file to extract all alias records
# in the input file (input file created with lib.exe /LIST).
$targetLib = "mfc71ud.lib";
$outDir = "_aliasRecordsD";
print "md .\\$outDir\n";
while (<>) {
    # find alias record name
    if (/_alias[0-9]+\.obj/) {
        chomp;  # strip the trailing newline (safer than chop)
        print "LIB /EXTRACT:$_ /OUT:.\\$outDir\\$_ $targetLib\n";
    }
}
Save the text as BuildAliasExtractBatchD.pl.
Now edit the text so that the $targetLib variable is changed as follows:
$targetLib = "mfc71u.lib";
also change $outDir as shown:
$outDir = "_aliasRecords";
Save the edited text as BuildAliasExtractBatch.pl.
Now run the two perl scripts as follows from the command prompt:
perl BuildAliasExtractBatchD.pl mfc71ud.lib.lst > BuildAliasExtractD.bat
perl BuildAliasExtractBatch.pl mfc71u.lib.lst > BuildAliasExtract.bat
At this point you have two batch files, one of which will extract the alias records from the debug library and one that will extract from the release library.
To complete step a) all that’s left is to run the batch files. Note that there are about 2000 alias records in each MFC library, and extracting them one by one is a slow process; each library extraction took about 4 hours on a fast PC.
At the completion of this step, you will have two new directories under …\vc7\atlmfc\lib each of which contains extracted alias records. Each extracted alias record is a file with a name of the form _alias*.obj where * is one to four decimal digits.
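Because the extraction runs for hours, it may be worth confirming that nothing was skipped before moving on. The following is a minimal Python sketch and not part of the original procedure. It applies the same _alias[0-9]+\.obj pattern the Perl scripts use to a .lib.lst listing and reports any alias record that has no matching file in the output directory. The function names are made up for illustration, and it assumes (as the Perl scripts do) that each matching line of the listing is the member name itself.

```python
import os
import re

# Same pattern the Perl scripts use to recognize alias records.
ALIAS_RE = re.compile(r"_alias[0-9]+\.obj")


def alias_records(listing_text):
    """Return the lines of a lib.exe /LIST listing that name alias records."""
    return [line.strip() for line in listing_text.splitlines()
            if ALIAS_RE.search(line)]


def missing_records(listing_text, extracted_names):
    """Return alias records in the listing that are absent from extracted_names.

    Names are compared case-insensitively, as on a Windows file system.
    """
    have = {name.lower() for name in extracted_names}
    return [rec for rec in alias_records(listing_text)
            if rec.lower() not in have]


def check_directory(listing_path, out_dir):
    """Compare a .lib.lst file against the files actually present in out_dir."""
    with open(listing_path) as f:
        listing_text = f.read()
    return missing_records(listing_text, os.listdir(out_dir))
```

For example, check_directory("mfc71ud.lib.lst", "_aliasRecordsD") should return an empty list if every debug alias record was extracted.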
Step B)
For Step b), we want to create a new library from the extracted records. Fortunately, this can be done in two simple steps; in contrast to Step a) we can use a response file with lib.exe to simplify our operation.
First, we create a pair of perl scripts that will build the response files.
Use your favorite text editor to enter the following text:
#!/usr/bin/perl
# Builds a response file for lib.exe to build a library of
# alias records (input file created with lib.exe /LIST).
$outLib = "mfc71udAlias.lib";
$aliasDir = "_aliasRecordsD";
print "/OUT:$outLib";
while (<>) {
    # find alias record name
    if (/_alias[0-9]+\.obj/) {
        chomp;  # strip the trailing newline (safer than chop)
        print " .\\$aliasDir\\$_";
    }
}
Save the file as CreateAliasLibD.pl.
Now edit the variable declarations so they read:
$outLib = "mfc71uAlias.lib";
$aliasDir = "_aliasRecords";
and save the file as CreateAliasLib.pl
Run the perl scripts from the command line. Note that we reuse the .lib.lst files we created in Step A) as input here:
perl CreateAliasLibD.pl mfc71ud.lib.lst > mfc71udAlias.rsp
perl CreateAliasLib.pl mfc71u.lib.lst > mfc71uAlias.rsp
With the response files made, we now use them with lib.exe to create the alias libraries:
lib @mfc71udAlias.rsp
lib @mfc71uAlias.rsp
At the completion of this step you will have two new libraries: mfc71udAlias.lib and mfc71uAlias.lib. They will each contain their respective alias records.
Step C)
In this step, we simply merge our custom-built MFC libraries with the alias libraries we just made. While we’re at it, we’ll also rename the libraries so they’ll replace the original libraries. Note that we get our custom-built libraries directly from their output locations.
lib /OUT:mfc71ud.lib .\Intel\MFC71LUD.lib mfc71udAlias.lib
lib /OUT:mfc71u.lib .\Intel\MFC71LU.lib mfc71uAlias.lib
copy MFC71LSU.LIB MFCS71U.LIB
copy MFC71LSUD.LIB MFCS71UD.LIB
The reason we do this is the same as for the CRT: we don't have to worry about changing any linker options in our projects to link to the new version of MFC. We should also copy the PDB files from the LIB\INTEL folder back to the LIB folder.
Now we're ready to do a test build of an application. Create a new SDI MFC application using the AppWizard, choose dynamic MFC, create a Unicode Debug and Release build, change the settings to link to unicows.lib, copy the newly created CRT and MFC DLLs to your DEBUG or RELEASE build folder(s) and then run the application. It should all work.
Use Dependency Walker to make sure that everything is linked properly and the proper DLLs are being loaded (run a profile in Dependency Walker). There should be no references to the old names of either the CRT or MFC DLLs.
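The check described above amounts to comparing the list of loaded modules against the stock DLL names. As a rough Python sketch, assuming you have copied the module names out of the profile by hand: the function name and the list of stock names below are illustrative assumptions rather than anything the tools provide, so adjust them to whatever your build actually replaces.

```python
# Stock DLL names that should no longer appear once the rebuilt libraries
# are in use. This list is an assumption for illustration; extend it to
# cover every DLL you rebuilt.
STOCK_DLLS = {"mfc71u.dll", "mfc71ud.dll", "msvcr71.dll", "msvcr71d.dll"}


def stale_references(module_names, stock=STOCK_DLLS):
    """Return the loaded modules that still use a stock DLL name."""
    return sorted(name for name in module_names if name.lower() in stock)
```

An empty result means none of the listed stock names showed up among the loaded modules.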
Because we have done all of the above, any Unicode build on your machine will now link to MSLU. We may not want this necessarily, or we may want to link back to the original CRT and MFC DLLs. This is what we made the backups for. To restore the system, simply restore your VC7\LIB and VC7\ATLMFC\LIB folders. You could even make a simple batch file that copies older or newer versions of the LIBs back to the LIB folders depending on what you want to build.
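The "simple batch file" idea could equally be a small script. Here is a minimal Python sketch under the assumption that each set of libraries (stock and MSLU) has been saved in its own folder; the function name and folder layout are hypothetical. It copies every .lib and .pdb file from the chosen saved set over the live LIB folder and leaves everything else alone.

```python
import shutil
from pathlib import Path


def switch_libs(saved_set, live_lib_dir):
    """Copy every .lib and .pdb file from a saved set over the live LIB folder.

    Returns the names of the files that were copied.
    """
    saved_set = Path(saved_set)
    live_lib_dir = Path(live_lib_dir)
    copied = []
    for src in sorted(saved_set.iterdir()):
        if src.is_file() and src.suffix.lower() in (".lib", ".pdb"):
            shutil.copy2(src, live_lib_dir / src.name)
            copied.append(src.name)
    return copied
```

You would call it once per folder you backed up, for example once for the CRT LIB folder and once for the MFC LIB folder.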
My policy and the policy of this blog is to not make substantive changes to a post. If I make a substantive addition to a post, I will mark it as such. Though that will probably be pretty rare. I will not retroactively try to make myself appear more impressive than I was. Not that I find myself to be particularly impressive (nor do I know of anyone who does), but just in case....
With that said, I have made (and will continue to make) small changes to fix typos and broken links. I will likely not give any sign of having made such minor editorial changes; it just seems silly to bother.
The contents of this article are going to make me unpopular with some blog "true-believers" who look at blogs as a static "slice of life" that should never be modified. But that's their problem -- and their issue. I have to be me, even if I am using a different means of communication. I guess I have to misquote one of Milos Forman's movies -- if you don't like Sorting It All Out then don't read it. But if you do, or if you have constructive suggestions about topics or thoughts about topics posted then I would love to hear from you. :-)
I am going to take these two questions out of order because (a) locales existed before cultures did, (b) neutral locales set the stage for neutral cultures, and (c) I think it may help us look less lame. Though that third reason is probably just naive optimism on my part....
To see what Windows does with neutral locales, you can look at the documented behavior of ConvertDefaultLocale. Basically, if you pass a neutral like LANG_ENGLISH then it will return the equivalent of MAKELANGID(LANG_ENGLISH, SUBLANG_DEFAULT); thus 0x0009 becomes 0x0409, 0x001a becomes 0x041a, and so forth. Easy, huh?
This ConvertDefaultLocale function calls an internal routine to do its work; the same routine is called by every NLS function, too. Which is a long way around to say that neutral locales do not exist to the Win32 NLS APIs.
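The arithmetic behind that conversion is small enough to work through. The Python sketch below mirrors the bit layout of the MAKELANGID, PRIMARYLANGID and SUBLANGID macros (sublanguage in the upper 6 bits, primary language in the lower 10) and models only the neutral-to-default case discussed here. The function names are my own, and this is an illustration of the bit arithmetic, not the actual internal routine.

```python
LANG_ENGLISH = 0x09      # primary language ID, as defined in winnt.h
SUBLANG_NEUTRAL = 0x00
SUBLANG_DEFAULT = 0x01


def make_lang_id(primary, sub):
    """Mirror of MAKELANGID: sublanguage in the top 6 bits,
    primary language in the low 10 bits of a 16-bit value."""
    return ((sub << 10) | primary) & 0xFFFF


def primary_lang_id(lang_id):
    """Mirror of PRIMARYLANGID: the low 10 bits."""
    return lang_id & 0x3FF


def sub_lang_id(lang_id):
    """Mirror of SUBLANGID: the high 6 bits."""
    return (lang_id >> 10) & 0x3F


def neutral_to_default(lcid):
    """Map a neutral LCID (sublanguage 0) to its SUBLANG_DEFAULT equivalent.

    Only the simple case is modeled; special values such as
    LOCALE_USER_DEFAULT are deliberately left out of this sketch.
    """
    lang = lcid & 0xFFFF
    if primary_lang_id(lang) != 0 and sub_lang_id(lang) == SUBLANG_NEUTRAL:
        default_lang = make_lang_id(primary_lang_id(lang), SUBLANG_DEFAULT)
        return (lcid & ~0xFFFF) | default_lang
    return lcid
```

With these definitions, neutral_to_default(0x0009) gives 0x0409, and an LCID that already has a sublanguage, such as 0x0409, comes back unchanged.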
Now there is one use for them in Win32 -- resource loading. You can use neutral LCIDs either to more accurately tag resources or to provide an easy fallback mechanism. Of course, if you want to put names on them then you cannot use GetLocaleInfo since asking for the LOCALE_SLANGUAGE of LANG_ENGLISH will give you "English (United States)" which is probably not what you wanted.
(In fact, I wonder what Visual Studio's resource editor does for its strings for these neutrals -- it must have its own strings somewhere, hard coded? Ick!)
In retrospect, it might have been a better idea to not do things that way, but it has shipped this way for at least the last ten versions of Windows. So we are kinda stuck with it.
Anyway, that's neutral locales -- at best, they are tolerated. But you can't really do much with them. Using that ConvertDefaultLocale-ish behavior can actually get you unexpected results sometimes, too. More on this another day.
This brings us to neutral cultures....
In the .NET Framework, a neutral CultureInfo mostly does not do this weird implied LCID fallback thing. There is actual data behind these cultures that you can query and use -- and you can get back the actual names and everything. It also does a great job on the resource loading fallback -- using the CultureInfo object's Parent property. The Parent property is not based on LCID tricks, either -- it's actual planned data for each culture. Obviously much cooler and a bit more thought out. There is even a CreateSpecificCulture method on CultureInfo that does the same sort of thing as ConvertDefaultLocale, creating a specific culture from a neutral one.
A lot of you probably noticed where I said "mostly" in that last paragraph (an occupational hazard of having readers who can be as cynical as I am!). Unfortunately, that weird LCID-esque fallback behavior still basically happens for collation and encoding via the culture's CompareInfo and TextInfo objects. Which is not such a big deal, and it is really necessary since both of those objects need the context of a specific culture.
In retrospect, it might have been a better idea to not do things that way -- CompareInfo and TextInfo should not have been made available (as happens for other objects like the associated DateTimeFormatInfo and NumberFormatInfo), but it has shipped this way for two versions so we are kinda stuck with it.
One important difference that distinguishes them from neutral locales is that one could create a class derived from CultureInfo that contains language-specific information which would make more sense in neutral cultures, which is really a fancy way of saying "language-only cultures" (which is itself kind of a fancy way of saying something).
There are also some odd situations with the LCID property of the CultureInfo, but that's a separate issue. More on this another day, too.
So, that's neutral cultures. Mostly not useful, except for resources, or in the same way that they are useful in Win32 (by pretending it's a specific culture), or for potential extensibility, either by Microsoft or by developers, in the future.
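Since resource fallback is the one place where neutral cultures clearly earn their keep, here is a small sketch of the idea in Python. It illustrates walking a parent chain from a specific culture to a neutral one to the invariant culture. It is not how the .NET Framework implements it, and the parent table, resource tables, and function names are all made up for the example.

```python
# Hypothetical parent table: specific culture -> neutral culture -> invariant ("").
PARENTS = {"en-US": "en", "en-GB": "en", "en": "", "fr-FR": "fr", "fr": ""}


def fallback_chain(name, parents=PARENTS):
    """Return the cultures to probe, most specific first, ending at invariant."""
    chain = [name]
    while name != "":
        name = parents.get(name, "")
        chain.append(name)
    return chain


def load_string(key, culture, resources, parents=PARENTS):
    """Look up key in the first culture along the chain that defines it."""
    for candidate in fallback_chain(culture, parents):
        table = resources.get(candidate, {})
        if key in table:
            return table[key]
    raise KeyError(key)
```

The point of the design is that the chain comes from data (the parent table) and not from bit tricks on an LCID, so a string tagged only with the neutral "en" is found for both "en-US" and "en-GB".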
Three steps forward, one step back? :-)
This post brought to you by "♎" (U+264e, a.k.a. LIBRA)
The MSDN documentation generally recommends that you use the static versions of libraries like the C Runtime (CRT) or the Microsoft Foundation Classes (MFC). The reason for this is that the DLL versions have not been built for MSLU and thus have no knowledge of the need to use the Microsoft Layer for Unicode for Unicode APIs. However, many complex applications really need to use the DLL versions of these libraries. If you are the developer of one of these applications, you will need to rebuild them so they link with Unicows.lib. The following is a small guide on how to perform this task.
This document is divided into 4 parts:
FAQ
How to build the C Runtime Library with MSLU
How to build MFC with MSLU
Switching between non MSLU builds and MSLU builds
The fine print
For more info on rebuilding MFC extension DLLs, see TN033: DLL Version of MFC, specifically the section entitled "Building the MFC DLL" towards the bottom of the article. Our steps here seem a lot nicer. :-)
All of these steps were used to build DLLs that were subsequently tested on Win98 SE. They are expected to work on all platforms.
In all instructions below, the assumption is a default install path and an en-US copy of Windows; if either is not the case, make sure you replace paths such as C:\Program Files\Microsoft Visual Studio with the appropriate install location.
Special thanks are owed to Ted W. for taking the time to do what we all knew was theoretically possible and making it technically possible for everyone. This document is mostly due to his efforts. Thanks, Ted!
FAQ
1. I dynamically link to MFC in my Unicode app and want to use MSLU. Why do I need to worry about this?
Answer: The MFC Unicode version was originally meant for NT based operating systems only. With the advent of MSLU, it is now possible to write a single binary that runs on all platforms.
If you have an existing Unicode MFC application or simply want to take advantage of MFC in a Unicode app you will need to rebuild both the CRT and MFC. You need to build the CRT because MFC MUST be linked dynamically to the CRT.
2. Doesn't the CRT already work on all platforms? Why do I need to rebuild the runtime library? I thought that the CRT was already Unicode ready - I've been using the string functions, e.g. wcscpy, extensively already.
Answer: No, not all CRT functions work on all platforms. The CRT does run on Windows 9x platforms but there are certain wide versions of functions, e.g. _wfullpath, that call wide Windows API functions, e.g. GetFullPathNameW, which are only available on NT based systems. So if you call one of these functions, as MFC does, the function will fail. This affects MFC in a major way: the MFC startup code passes in an invalid hInstance, causing the DLL to fail.
3. How can I build MFC and the runtime library? Isn't it complicated?
Answer: It's not very complicated. Microsoft in their wisdom included the source code to all of MFC and the runtime library, so it's possible to create new DLLs with minor modifications to the provided makefiles.
4. Why doesn't Microsoft provide these rebuilt DLLs? Isn't it a pretty common scenario?
Answer: For one thing, MSLU came out long after MFC did, and the team that created MSLU is separate from the one that maintains MFC. I'm sure if you encourage them they might consider maintaining a special build of MFC and the CRT. It's all a matter of priorities; the MFC and CRT teams at Microsoft are really busy.
5. Does MFC 7.0 Unicode version or CRT 7.0 (Visual Studio .NET) already link to MSLU? It just came out.
Answer: No it doesn't. MFC 7.0 was frozen/locked down in the summer of 2001, around the same time that MSLU was being released. See also the answer to question 4.
Before you start:
Install Visual Studio 6 with SP5 including all necessary files.
The first thing you need to do is make sure that when you install Visual Studio 6 you select both the Unicode MFC versions and the CRT source code. Both are turned off by default. After this, install Visual Studio 6 Service Pack 5.
Identify the folders and files you will be modifying.
If you installed to the default locations, all of the files we need to change are contained in the tree \Program Files\Microsoft Visual Studio\VC98. Find the MFC\SRC folder and the CRT\SRC folder.
Install the Platform SDK and copy the unicows.lib file to your VC98\LIB folder.
How to build the CRT with MSLU
The first thing we want to do is make a backup of our VC98\Lib folder. We will be replacing files in it, so if we need to go back (or switch between MSLU and non-MSLU versions of the CRT) we can always do that.
Secondly, let's copy the VC98\CRT\SRC folder to a comfortable place so we can change it and build from it. For example, we'll copy it to the root of C: so we have a folder called C:\SRC, available for quick access from the command line.
When building the CRT we are actually building three DLLs: MSVCRT.DLL, MSVCP60.DLL, and MSVCIRT.DLL. Since we are building both debug and release builds it makes a total of six DLLs we need to build.
There is a provided batch file bldwin95.bat that will build all the CRT DLLs but the makefile does not seem to be there. The Makefile IS provided, but in a hidden manner. The three files are ext_mkf, ext_mkf.inc, and ext_mkf.sub. If you copy these to makefile, makefile.inc, and makefile.sub then the batch file will work, i.e.
copy ext_mkf Makefile
copy ext_mkf.inc Makefile.inc
copy ext_mkf.sub Makefile.sub
At the top of makefile there is a section that controls the naming of the three DLLs to build. For this purpose we will use the name MSLU as a prefix to all of the DLLs instead of the standard name MSVC.
So the six names of the DLLs we will create are:
MSLURT.DLL
MSLURTD.DLL
MSLUP60.DLL
MSLUP60D.DLL
MSLUIRT.DLL
MSLUIRTD.DLL
Warning: Since most people who follow these steps will probably use the exact names given here, please be sure to keep these versions of the DLLs in your own private directory when you use them.
The default names provided in the makefile are _SAMPLE_, SAMPLE_I, and SAMPLE_P. There are associated RC and DEF files for each of these names, so we need to copy them to the new names, i.e.
copy _SAMPLE_.RC MSLURT.RC
copy SAMPLE_I.RC MSLUIRT.RC
copy SAMPLE_P.RC MSLUP60.RC
copy SAMPLE_I.DEF MSLUIRT.DEF
copy SAMPLD_I.DEF MSLUIRTD.DEF
copy SAMPLE_P.DEF MSLUP60.DEF
copy SAMPLD_P.DEF MSLUP60D.DEF
copy Intel\_SAMPLE_.DEF Intel\MSLURT.DEF
copy Intel\_SAMPLD_.DEF Intel\MSLURTD.DEF
Next we need to change the LIBRARY name in each of the above DEF files to match the name of the DEF file. Open up each file in notepad to make the change.
The provided makefile needs some minor changes to get it to work properly and link with Unicows.lib. Change the top block of defines to the following:
RETAIL_DLL_NAME=MSLURT
RETAIL_LIB_NAME=MSLURT
RETAIL_DLLCPP_NAME=MSLUP60
RETAIL_LIBCPP_NAME=MSLUP60
RETAIL_DLLIOS_NAME=MSLUIRT
RETAIL_LIBIOS_NAME=MSLUIRT
DEBUG_DLL_NAME=MSLURTD
DEBUG_LIB_NAME=MSLURTD
DEBUG_DLLCPP_NAME=MSLUP60D
DEBUG_LIBCPP_NAME=MSLUP60D
DEBUG_DLLIOS_NAME=MSLUIRTD
DEBUG_LIBIOS_NAME=MSLUIRTD
The path to the Visual Studio folder needs to be changed (the default is MSDEV):
V6TOOLS=C:\Program Files\Microsoft Visual Studio\VC98
Some of the commands need to be surrounded in quotes because the location of the VC98 folder has spaces in the path.
line 331 change to:
RC_INCS="-I$(V6TOOLS)\include"
line 1728, 1770, 1810, 1853, 1898, 1941 change to:
"$(V6TOOLS)\include\winver.h" \
We want to fix the PDB file creation.
line 381-383 change to:
RELEASE_DLL_DBG_PDB = $(PDBDIR_CPU)\$(DEBUG_DLL_NAME).pdb
RELEASE_DLLCPP_DBG_PDB = $(PDBDIR_CPU)\$(DEBUG_DLLCPP_NAME).pdb
RELEASE_DLLIOS_DBG_PDB = $(PDBDIR_CPU)\$(DEBUG_DLLIOS_NAME).pdb
line 474 change to:
$(CRT_RELDIR) $(RELDIR_CPU) :
line 987 change to:
xdll : $(OBJROOT) $(OBJCPUDIR) $(OBJDIR_DLL_DBG) $(RELDIR_CPU) xothers \
after line 1746 add:
-pdb:$(RELEASE_DLL_PDB)
after line 1789 add:
-pdb:$(RELEASE_DLLCPP_PDB)
after line 1829 add:
-pdb:$(RELEASE_DLLIOS_PDB)
We want to link to unicows.lib before any other lib files.
line 1750, 1794, 1835, 1878, 1925, 1969 change to:
unicows.lib kernel32.lib advapi32.lib user32.lib gdi32.lib shell32.lib comdlg32.lib version.lib mpr.lib rasapi32.lib winmm.lib winspool.lib vfw32.lib oleacc.lib oledlg.lib
Once we make these changes, we are ready to build the DLLs. It's simple - just do the following from a command prompt in the C:\SRC folder:
set V6TOOLS=C:\Program Files\Microsoft Visual Studio\VC98
"C:\Program Files\Microsoft Visual Studio\VC98\BIN\VCVARS32"
BLDWIN95
Once the DLLs finish building they will be in a subfolder called BUILD\INTEL. The Libs, PDBs, and Maps are also in that folder.
Now we've got 6 libs (3 debug, 3 release) we can link to. Let's copy those new libs back to the original names of the libs, e.g.
copy MSLURT.LIB MSVCRT.LIB
copy MSLURTD.LIB MSVCRTD.LIB
copy MSLUP60.LIB MSVCPRT.LIB
copy MSLUP60D.LIB MSVCPRTD.LIB
copy MSLUIRT.LIB MSVCIRT.LIB
copy MSLUIRTD.LIB MSVCIRTD.LIB
The reason we do this is so we can link our existing apps (and build MFC) without having to muck around with the libraries that they link to. The Libs still point to the newly named DLLs, even though they don't share the same names as the new ones anymore.
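The library-ordering change above (unicows.lib ahead of every other lib on each of the six link lines) is simple but easy to get wrong by hand. As a minimal Python sketch of the rule, with a made-up function name, and assuming you pass in just the space-separated library list with no makefile line-continuation backslash:

```python
def put_unicows_first(lib_line):
    """Return the library list with unicows.lib moved (or added) to the front."""
    libs = [lib for lib in lib_line.split() if lib.lower() != "unicows.lib"]
    return " ".join(["unicows.lib"] + libs)
```

The relative order of the other libraries is left untouched; only the position of unicows.lib changes.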
Now copy the 6 MSVC libs to the VC98\Lib folder (overwriting the existing ones). The CRT build is now done. Before proceeding any further we need to close the command prompt that we used to build the CRT because it created certain environment variables that will cause compile errors in the next step, building the Unicode version of MFC.
Building MFC Unicode version with MSLU
First we will make a backup of the following folders (and all of their subfolders): VC98\MFC\LIB and VC98\MFC\SRC, so we can restore them later if necessary.
Building the Unicode version of MFC is slightly easier than building the CRT. The Unicode version of MFC is actually 5 different DLLs:
MFC42U.DLL (Unicode Release)
MFC42UD.DLL (Unicode Debug)
MFCN42UD.DLL (Unicode Debug - Network classes)
MFCO42UD.DLL (Unicode Debug - OLE classes)
MFCD42UD.DLL (Unicode Debug - Database classes)
Notice that the release version is a single DLL whereas the debug version is split up into 4 different DLLs. Presumably, this was done to reduce load times when debugging. This makes it a little less straightforward than it should be.
To build MFC, there are 4 provided Makefiles in the VC98\MFC\SRC folder named:
MFCDLL.MAK
MFCNET.MAK
MFCOLE.MAK
MFCDB.MAK
MFCDLL.MAK builds the first two DLLs, and each of the others builds one of the rest.
First, we will change each of the 4 .MAK files to link to Unicows.lib.
In each file, after the line that states:
link @<<
insert the following two lines:
/nod:kernel32.lib /nod:advapi32.lib /nod:user32.lib /nod:gdi32.lib /nod:shell32.lib /nod:comdlg32.lib /nod:version.lib /nod:mpr.lib /nod:rasapi32.lib /nod:winmm.lib /nod:winspool.lib /nod:vfw32.lib /nod:secur32.lib /nod:oleacc.lib /nod:oledlg.lib /nod:sensapi.lib
unicows.lib kernel32.lib advapi32.lib user32.lib gdi32.lib shell32.lib comdlg32.lib version.lib mpr.lib rasapi32.lib winmm.lib winspool.lib vfw32.lib oleacc.lib oledlg.lib
They must go in that position; if we don't do this then a library reference will be included causing unicows.lib to be linked after kernel32.lib (which will then cause the unicows.dll load to fail). Other DLLs in the wrong order will simply cause APIs in those specific DLLs to not be called. The line numbers to insert the above two lines after are:
MFCDLL.MAK - line 206
MFCNET.MAK - line 134
MFCOLE.MAK - line 134
MFCDB.MAK - line 140
Now, we will decide what to name our new DLLs. We do not want to use the standard name(s) for the same reasons we did not use the standard names for the CRT. So we will come up with a simple naming convention: we'll add an "L" to the name. So the new names will be:
MFC42LU.DLL
MFC42LUD.DLL
MFCN42LUD.DLL
MFCO42LUD.DLL
MFCD42LUD.DLL
We are going over the 8.3 naming convention here, but only for some of the debug builds which we will not be shipping, so we should be fine.
Now we need to change the following three files to match our naming. MFC does a LoadLibrary with hard-coded DLL names, so we need to change each of them to our new names in the following files: DLLDB.CPP, DLLNET.CPP, and DLLOLE.CPP.
In DLLDB.CPP, change lines 38, 39, 46, and 47, i.e.
#define MFC42_DLL "MFC42LUD.DLL"
#define MFCO42_DLL "MFCO42LUD.DLL"
#define MFC42_DLL "MFC42LU.DLL"
#define MFCO42_DLL "MFCO42LU.DLL"
In DLLNET.CPP, change lines 37 and 43, i.e.
#define MFC42_DLL "MFC42LUD.DLL"
#define MFC42_DLL "MFC42LU.DLL"
In DLLOLE.CPP, change lines 38 and 44, i.e.
#define MFC42_DLL "MFC42LUD.DLL"
#define MFC42_DLL "MFC42LU.DLL"
Now we need to change the hard-coded reference to the name of the CRT. Since we changed it when we rebuilt the CRT, we need to change the following file: DLLINIT.CPP.
In DLLINIT.CPP, change lines 371 and 373, i.e.
#define MSVCRT_DLL "MSLURTD.DLL"
#define MSVCRT_DLL "MSLURT.DLL"
While we're in DLLINIT.CPP we need to get rid of the block that prevents loading of MFC (if we haven't already done so). Change line 391 from #ifdef _UNICODE to #if 0 (this will prevent that piece of code from being included). For more info on this issue and other MFC/MSLU issues, click here.
Since we renamed the DLLs, we need to rename the DEF files that are linked to the DLLs. They are located in the VC98\MFC\SRC\INTEL folder, i.e.
copy MFC42U.DEF MFC42LU.DEF
copy MFC42UD.DEF MFC42LUD.DEF
copy MFCN42UD.DEF MFCN42LUD.DEF
copy MFCO42UD.DEF MFCO42LUD.DEF
copy MFCD42UD.DEF MFCD42LUD.DEF
We need to open each of the above new DEF files in notepad and change the LIBRARY line to match the name of the DEF file.
Now we're ready to build the versions of MFC. Create a new batch file called buildmfc.bat in the MFC\SRC folder with the following content:
nmake -f mfcdll.mak libname=MFC42L DEBUG=0 UNICODE=1 /a
nmake -f mfcdll.mak libname=MFC42L DEBUG=1 UNICODE=1 /a
nmake -f mfcdb.mak libname=MFCD42L DEBUG=1 UNICODE=1 /a
nmake -f mfcnet.mak libname=MFCN42L DEBUG=1 UNICODE=1 /a
nmake -f mfcole.mak libname=MFCO42L DEBUG=1 UNICODE=1 /a
Above we are building release and debug versions of the main DLL (MFC42LU(D).DLL), and debug versions of the rest. Run the batch file, and then you will be done. If you need to rebuild any time in the future you now have a convenient batch file to do so. The DLL files will be created in the VC98\MFC\SRC folder. The LIB and PDB files will be created in the MFC\LIB folder.
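Of the manual edits above, changing the LIBRARY line in each copied DEF file is probably the easiest to script if you would rather not open five files in notepad. A minimal Python sketch follows; the function name is made up, and it assumes each DEF file contains a LIBRARY statement with the name as the first token after the keyword.

```python
import re


def rename_library(def_text, new_name):
    """Rewrite the LIBRARY statement of a module-definition file.

    Only the name after the LIBRARY keyword changes; everything else,
    including any trailing attributes on that line, is kept as-is.
    """
    pattern = re.compile(r"^(\s*LIBRARY\s+)\S+", re.IGNORECASE | re.MULTILINE)
    # A function replacement avoids any backslash handling in new_name.
    new_text, count = pattern.subn(lambda m: m.group(1) + new_name,
                                   def_text, count=1)
    if count == 0:
        raise ValueError("no LIBRARY statement found")
    return new_text
```

You would read each new DEF file, pass its text through rename_library with the matching name (for example MFC42LU for MFC42LU.DEF), and write the result back.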
After the building is done, we need to copy the created LIBs in the VC98\MFC\LIB folder back to their original names (overwriting what's there) so that any of our apps that we link will use the new DLLs, i.e.
copy MFC42LU.LIB MFC42U.LIB
copy MFC42LUD.LIB MFC42UD.LIB
copy MFCN42LUD.LIB MFCN42UD.LIB
copy MFCO42LUD.LIB MFCO42UD.LIB
copy MFCD42LUD.LIB MFCD42UD.LIB
The reason we do this is the same as for the CRT: we don't have to worry about changing any linker options in our projects to link to the new version of MFC.
Now we're ready to do a test build of an application. Create a new SDI MFC application using the AppWizard, choose dynamic MFC, create a Unicode Debug and Release build, change the settings to link to unicows.lib, copy the newly created CRT and MFC DLLs to your DEBUG or RELEASE build folder(s) and then run the application. It should all work.
Use Dependency Walker to make sure that everything is linked properly and the proper DLLs are being loaded (run a profile in Dependency Walker). There should be no references to the old names of either the CRT or MFC DLLs.
Switching between non MSLU builds and MSLU builds
Because we have done all of the above, any Unicode build on your machine will now link to MSLU. We may not want this necessarily, or we may want to link back to the original CRT and MFC DLLs. This is what we made the backups for. To restore the system, simply restore your VC98\LIB and VC98\MFC\LIB folders. You could even make a simple batch file that copies older or newer versions of the LIBs back to the LIB folders depending on what you want to build.
This document is divided into 4 parts:
In all instructions below, the assumption is a default install path and an en-US copy of Windows; if either is not the case, make sure you replace paths such as C:\Program Files\Microsoft Visual Studio with the appropriate install location.
Answer: The MFC Unicode version was originally meant for NT based operating systems only. With the advent of MSLU, it is now possible to write a single binary that runs on all platforms. If you have an existing Unicode MFC application or simply want to take advantage of the MFC in a Unicode app you will need to rebuild both the CRT and MFC. You need to build the CRT because MFC MUST be linked dynamically to the CRT.
Answer: No, not all CRT functions work on all platforms. The CRT does run on Windows 9x platforms but there are certain wide versions of functions, e.g. wfullpath, that call wide Windows API functions, e.g. GetFullPathNameW, which are only available on NT based systems. So therefore if calling one of these functions, as MFC does, the function will fail. This affects MFC in a major way, the MFC startup code passes in an invalid hInstance, causing the DLL to fail.
Answer: It's not very complicated. Microsoft in their wisdom included the source code to all of MFC and the runtime library, so it's possible to create new DLLs with minor modifications to the provided makefiles.
Answer: For one thing, MSLU came out long after MFC did, and the team that created MSLU is separate from the one that maintains MFC. I'm sure if you encourage them they might consider maintaining a special build of MFC and the CRT. It's all a matter of priorites, the MFC and CRT teams at Microsoft are really busy.
Answer: No it doesn't. MFC 7.0 was frozen/locked down in the summer of 2001, around the same time that MSLU was being released. See also the answer to question 4.
The first thing you need to do is make sure that when you install Visual Studio 6 that you check on both the Unicode MFC versions, and the CRT source code. Both are turned off by default. After this install Visual Studio 6 Service Pack 5.
If you installed to the default locations, all of the files we need to change are contained in the tree \Program Files\Microsoft Visual Studio\VC98. Find the MFC\SRC folder and the CRT\SRC folder.
The first thing we want to do is make a backup of our VC98\Lib folder. We will be replacing files in it, so if we need to go back (or switch between MSLU and non-MSLU version of the CRT) we can always do that.
Secondly, let's copy the VC98\CRT\SRC folder to a comfortable place so we can change it and build from it. For example, we'll copy it to the root of C: so we have a folder called C:\SRC, available for quick access from the command line.
When building the CRT we are actually building three DLLs: MSVCRT.DLL, MSVCP60.DLL, and MSVCIRT.DLL. Since we are building both debug and release builds it makes a total of six DLLs we need to build.
There is a provided batch file bldwin95.bat that will build the all CRT DLLs but the makefile does not seem to be there. The Makefile IS provided, but it a hidden manner. The three files are ext_mkf, ext_mkf.inc, and ext_mkf.sub. If you copy these to makefile, makefile.inc, and makefile.sub then the batch file will work, i.e.
copy ext_mkf Makefile copy ext_mkf.inc Makefile.inc copy ext_mkf.sub Makefile.sub
At the top of makefile there is a section that controls the naming of the three DLLs to build. For this purpose we will use the name MSLU as a prefix to all of the DLLs instead of the standard name MSVC. So the six names of the DLLs we will create are:
MSLURT.DLL MSLURTD.DLL MSLUP60.DLL MSLUP60D.DLL MSLUIRT.DLL MSLUIRTD.DLL
The default names provided in the makefile are _SAMPLE_, SAMPLE_I, and SAMPLE_P. There are associated RC and DEF files for each of these names, so we need to copy them to the new names, i.e.
copy _SAMPLE_.RC MSLURT.RC copy SAMPLE_I.RC MSLUIRT.RC copy SAMPLE_P.RC MSLUP60.RC copy SAMPLE_I.DEF MSLUIRT.DEF copy SAMPLD_I.DEF MSLUIRTD.DEF copy SAMPLE_P.DEF MSLUP60.DEF copy SAMPLD_P.DEF MSLUP60D.DEF copy Intel\_SAMPLE_.DEF Intel\MSLURT.DEF copy Intel\_SAMPLD_.DEF Intel\MSLURTD.DEF
RETAIL_DLL_NAME=MSLURT RETAIL_LIB_NAME=MSLURT RETAIL_DLLCPP_NAME=MSLUP60 RETAIL_LIBCPP_NAME=MSLUP60 RETAIL_DLLIOS_NAME=MSLUIRT RETAIL_LIBIOS_NAME=MSLUIRT DEBUG_DLL_NAME=MSLURTD DEBUG_LIB_NAME=MSLURTD DEBUG_DLLCPP_NAME=MSLUP60D DEBUG_LIBCPP_NAME=MSLUP60D DEBUG_DLLIOS_NAME=MSLUIRTD DEBUG_LIBIOS_NAME=MSLUIRTD
V6TOOLS=C:\Program Files\Microsoft Visual Studio\VC98
line 331 change to: RC_INCS="-I$(V6TOOLS)\include" line 1728, 1770, 1810, 1853, 1898, 1941 change to: "$(V6TOOLS)\include\winver.h" \
line 331 change to:
RC_INCS="-I$(V6TOOLS)\include"
line 1728, 1770, 1810, 1853, 1898, 1941 change to:
"$(V6TOOLS)\include\winver.h" \
line 381-383 change to: RELEASE_DLL_DBG_PDB = $(PDBDIR_CPU)\$(DEBUG_DLL_NAME).pdb RELEASE_DLLCPP_DBG_PDB = $(PDBDIR_CPU)\$(DEBUG_DLLCPP_NAME).pdb RELEASE_DLLIOS_DBG_PDB = $(PDBDIR_CPU)\$(DEBUG_DLLIOS_NAME).pdb line 474 change to: $(CRT_RELDIR) $(RELDIR_CPU) : line 987 change to: xdll : $(OBJROOT) $(OBJCPUDIR) $(OBJDIR_DLL_DBG) $(RELDIR_CPU) xothers \ after line 1746 add: -pdb:$(RELEASE_DLL_PDB) after line 1789 add: -pdb:$(RELEASE_DLLCPP_PDB) after line 1829 add: -pdb:$(RELEASE_DLLIOS_PDB)
line 381-383 change to:
RELEASE_DLL_DBG_PDB = $(PDBDIR_CPU)\$(DEBUG_DLL_NAME).pdb RELEASE_DLLCPP_DBG_PDB = $(PDBDIR_CPU)\$(DEBUG_DLLCPP_NAME).pdb RELEASE_DLLIOS_DBG_PDB = $(PDBDIR_CPU)\$(DEBUG_DLLIOS_NAME).pdb line 474 change to: $(CRT_RELDIR) $(RELDIR_CPU) : line 987 change to: xdll : $(OBJROOT) $(OBJCPUDIR) $(OBJDIR_DLL_DBG) $(RELDIR_CPU) xothers \ after line 1746 add: -pdb:$(RELEASE_DLL_PDB) after line 1789 add: -pdb:$(RELEASE_DLLCPP_PDB) after line 1829 add: -pdb:$(RELEASE_DLLIOS_PDB)
line 474 change to:
$(CRT_RELDIR) $(RELDIR_CPU) :
line 987 change to:
xdll : $(OBJROOT) $(OBJCPUDIR) $(OBJDIR_DLL_DBG) $(RELDIR_CPU) xothers \
after line 1746 add:
-pdb:$(RELEASE_DLL_PDB)
after line 1789 add:
-pdb:$(RELEASE_DLLCPP_PDB)
after line 1829 add:
-pdb:$(RELEASE_DLLIOS_PDB)
line 1750, 1794, 1835, 1878, 1925, 1969 change to:
unicows.lib kernel32.lib advapi32.lib user32.lib gdi32.lib shell32.lib comdlg32.lib version.lib mpr.lib rasapi32.lib winmm.lib winspool.lib vfw32.lib oleacc.lib oledlg.lib
Once we make these changes, we are ready to build the DLLs. It's simple - just do the following from a command prompt in the C:\SRC folder:
set V6TOOLS=C:\Program Files\Microsoft Visual Studio\VC98
"C:\Program Files\Microsoft Visual Studio\VC98\BIN\VCVARS32"
BLDWIN95
Once the DLLs finish building they will be in a subfolder called BUILD\INTEL. The Libs, PDBs, and Maps are also in that folder.
copy MSLURT.LIB MSVCRT.LIB
copy MSLURTD.LIB MSVCRTD.LIB
copy MSLUP60.LIB MSVCPRT.LIB
copy MSLUP60D.LIB MSVCPRTD.LIB
copy MSLUIRT.LIB MSVCIRT.LIB
copy MSLUIRTD.LIB MSVCIRTD.LIB
The reason we do this is so we can link our existing apps (and build MFC) without having to muck around with the libraries that they link to. The Libs still point to the newly named DLLs, even though they don't share the same names as the new ones anymore.
First we will make a backup of the following folders (and all subfolders of): VC98\MFC\LIB, and VC98\MFC\SRC so we can restore them later if necessary.
Building the Unicode version of MFC is slightly easier than building the CRT. The Unicode version of MFC is actually 5 different DLLs:
Notice that the release version is a single DLL whereas the debug version is split up into 4 different DLLs. Presumably, this was done to reduce load times when debugging. This makes it a little less straightforward than it should be.
To build MFC, there are 4 provided Makefiles in the VC98\MFC\SRC folder named:
The MFCDLL.MAK builds the first two DLLs, and each of the others builds the rest.
First, we will change each of the 4 .MAK files to link to Unicows.lib. In each file, after the line that states:
insert the following two lines:
The line numbers to insert the above two lines after are:
Now, we will decide what to name our new DLL. We do not want to use the standard name(s) for the same reasons we did not use the standard names for the CRT. So we will come up with a simple naming convention: we'll add an "L" to the name. So the new names will be:
We are going over the 8.3 naming convention here, but only for some of the debug builds which we will not be shipping, so we should be fine.
Now we need to change the following three files to match our naming. MFC does a LoadLibrary with each of the original DLL names hard-coded, so we need to change them to our new names in the following files: DLLDB.CPP, DLLNET.CPP, and DLLOLE.CPP.
In DLLDB.CPP, change lines 38, 39, 46, and 47, i.e.
#define MFC42_DLL "MFC42LUD.DLL"
#define MFCO42_DLL "MFCO42LUD.DLL"
#define MFC42_DLL "MFC42LU.DLL"
#define MFCO42_DLL "MFCO42LU.DLL"
In DLLNET.CPP, change lines 37 and 43, i.e.
#define MFC42_DLL "MFC42LUD.DLL"
#define MFC42_DLL "MFC42LU.DLL"
In DLLOLE.CPP, change lines 38 and 44, i.e.
Now we need to change the hard-coded reference to the name of the CRT. Since we changed it when we rebuilt the CRT, we need to change the following file: DLLINIT.CPP.
In DLLINIT.CPP, change lines 371 and 373, i.e.
#define MSVCRT_DLL "MSLURTD.DLL"
#define MSVCRT_DLL "MSLURT.DLL"
While we're in DLLINIT.CPP we need to get rid of the block that prevents loading of MFC (if we haven't already done so). Change line 391 from #ifdef _UNICODE to #if 0 (this will prevent that piece of code from being included). For more info on this issue and other MFC/MSLU issues, click here.
Since we renamed the DLLs, we need to rename the DEF files that are linked to the DLLs. They are located in the VC98\MFC\SRC\INTEL folder. i.e.
copy MFC42U.DEF MFC42LU.DEF
copy MFC42UD.DEF MFC42LUD.DEF
copy MFCN42UD.DEF MFCN42LUD.DEF
copy MFCO42UD.DEF MFCO42LUD.DEF
copy MFCD42UD.DEF MFCD42LUD.DEF
We need to open each of the above new DEF files up in notepad and change the LIBRARY line to match the name of the DEF file.
Now we're ready to build the versions of MFC:
Create a new batch file called buildmfc.bat in the MFC\SRC folder with the following content:
nmake -f mfcdll.mak libname=MFC42L DEBUG=0 UNICODE=1 /a
nmake -f mfcdll.mak libname=MFC42L DEBUG=1 UNICODE=1 /a
nmake -f mfcdb.mak libname=MFCD42L DEBUG=1 UNICODE=1 /a
nmake -f mfcnet.mak libname=MFCN42L DEBUG=1 UNICODE=1 /a
nmake -f mfcole.mak libname=MFCO42L DEBUG=1 UNICODE=1 /a
Above we are building release and debug versions of the main DLL (MFC42LU(D).DLL), and debug versions of the rest.
Run the batch file, and then you will be done. If you need to rebuild any time in the future you now have a convenient batch file to do so. The DLL files will be created in the VC98\MFC\SRC folder. The LIB and PDB files will be created in the MFC\LIB folder.
After the building is done, we need to copy the created LIBs in the VC98\MFC\LIB folder back to their original names (overwriting what's there) so that any of our apps that we link will use the new DLLs. i.e.
copy MFC42LU.LIB MFC42U.LIB
copy MFC42LUD.LIB MFC42UD.LIB
copy MFCN42LUD.LIB MFCN42UD.LIB
copy MFCO42LUD.LIB MFCO42UD.LIB
copy MFCD42LUD.LIB MFCD42UD.LIB
The reason we do this is the same as for the CRT: we don't have to worry about changing any linker options in our projects to link to the new version of MFC.
Because we have done all of the above, any Unicode build on your machine will now link to MSLU. We may not want this necessarily, or we may want to link back to the original CRT and MFC DLLs. This is what we made the backups for. To restore the system, simply restore your VC98\LIB and VC98\MFC\LIB folders. You could even make a simple batch file that copies older or newer versions of the LIBs back to the LIB folders depending on what you want to build.
There is a great deal of confusion surrounding the meaning of these two different things in the .NET Framework, and when to use each. If you have suffered, are suffering, or think you may suffer in the future from such a confusion, then read on!
(Otherwise, I guess you can go away and come back another time)
The invariant culture's direct ancestor is the invariant locale. Officially added to the Windows source tree at 10:23am on May 12, 2001, its intention was not to be used as an actual locale (which would explain why no locale data was added until a month later; until then no one was using it in GetLocaleInfo!).
Originally, LOCALE_INVARIANT had just one noble purpose -- to allow one to use CompareString (and LCMapString with the LCMAP_SORTKEY flag) in a way that would only use the "Default" Windows sorting table as mentioned a little bit here and especially here. The results, as that second article mentioned, would not vary when the user or system locale settings did; they would be invariant within that installation of Windows.
The data was added for this locale a month later, as I said, for obvious reasons -- if you have an LCID that one function considers to be valid, you must have a very good reason if another will not. And it cannot duplicate any other locale, either. Much weird data was added so that no one would be tempted to try to act like they spoke a language called "Invariant" and then all was good.
Note that these string comparisons still had much linguistic value -- half of the locales in Windows use that default table, so an invariant sort would not only avoid varying, it would also look right to a lot of the world.
The .NET Framework had similar requirements (with the additional need for invariant parsing/formatting support) and thus CultureInfo.InvariantCulture was created. As with the locale, any string comparisons made with InvariantCulture's CompareInfo object would have linguistic validity in a lot of places, and would not vary within that installation of the .NET Framework.
So everyone had what they needed, right?
Well, no.
A bunch of people wanted a method of doing a more binary type of comparison, instead of one that would be based on the "linguistically appropriate" approach given a particular culture1.
The difference between what we had and what they wanted was akin to the difference between the C Runtime's strcoll/wcscoll versus strcmp/wcscmp (in the CRT documentation they refer to the difference as being locale based versus lexicographic).
The other advantage to such a "lexicographic" comparison is that it would be faster since a simple binary comparison of the code point values was being used.
To meet this need, the notion of an Ordinal sort was added and an Ordinal member was added to the CompareOptions enumeration. Selecting it would ignore all of those cultural collation features and give you a binary sort that would also, incidentally, not vary.
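To make the CRT analogy concrete, here is a minimal C sketch of what "ordinal" means in the strcmp/wcscmp sense: compare the code unit values and nothing else. It is not the .NET Framework implementation, and the helper name ordinal_compare is made up for this illustration.

```c
#include <wchar.h>

/* Hypothetical helper: an "ordinal" comparison in the wcscmp sense.
   Only the numeric values of the code units are examined; no locale
   or collation table is consulted. */
int ordinal_compare(const wchar_t *a, const wchar_t *b)
{
    /* Skip the common prefix. */
    while (*a != L'\0' && *a == *b) {
        ++a;
        ++b;
    }
    if (*a == *b)
        return 0;               /* both strings ended together */
    return (*a < *b) ? -1 : 1;  /* first differing value decides */
}
```

With this helper, ordinal_compare(L"\x03A9", L"\x03C9") is negative simply because 0x03A9 is less than 0x03C9, and ordinal_compare(L"Z", L"a") is negative because 0x5A is less than 0x61, whereas a linguistic sort would normally put "a" first.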
The only remaining problem at this point was that there were now two useful ways to do these different "niche" types of comparisons, but neither name really jumped out at the developers who were looking for such solutions.
That problem remains to this day, though every single time I speak at a conference or answer a question in a newsgroup or get someone to look at posts like this one, then there is at least one less developer who has this problem. Maybe this time it is you? :-)
Now the story does not end here; many people have wanted to do things in a case-insensitive way. Of course if you wanted a case-insensitive invariant comparison then you could have done that all along -- just use the InvariantCulture's CompareInfo methods with the CompareOptions.IgnoreCase flag passed in. Easy!
But some people wanted a case-insensitive ordinal comparison?!?
Now the closet linguist in me shudders at this concept since a casing operation is essentially a linguistic one while an ordinal one is specifically not -- it's lexicographic.
So people are asking for a linguistic non-linguistic support, a request that for me brings to mind the comedian Steven Wright's dog2.
However, the technical half of me understands the need and so I got over my linguistic fetish as one of my colleagues on the BCL team worked in Whidbey to add a new OrdinalIgnoreCase member to the CompareOptions enumeration.
The behavior is basically to do the casing operation using the default casing tables prior to doing the binary comparison. This feature has been in the "Whidbey" version of the .NET Framework for some time (first checked into the source code tree on February 7, 2003), so you can try it out today if you have just about any build of Whidbey underfoot.
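The "case first, then compare binary" idea can be sketched in portable C as follows. This is only an illustration under stated assumptions: towupper stands in for the "default casing tables" (and in the default "C" locale it can only be relied on for ASCII), and the function name is hypothetical. It is not how the .NET Framework implements OrdinalIgnoreCase.

```c
#include <wchar.h>
#include <wctype.h>

/* Hypothetical sketch of an "ordinal, ignore case" comparison:
   run each code unit through a casing table, then compare the
   results as plain numbers. towupper() is an assumption standing
   in for the default casing table. */
int ordinal_ignore_case_compare(const wchar_t *a, const wchar_t *b)
{
    for (;; ++a, ++b) {
        wint_t ca = towupper((wint_t)*a);
        wint_t cb = towupper((wint_t)*b);
        if (ca != cb)
            return (ca < cb) ? -1 : 1;  /* binary order of cased values */
        if (ca == L'\0')
            return 0;                   /* both strings ended together */
    }
}
```

Note that the result is still not linguistic: once the case difference is folded away, "Z" versus "a" is decided purely by 0x5A versus 0x41.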
Hopefully this post will help clear up some of the confusion about these two interesting comparison types.
1 - What can I say? Some people are Некультурные (uncultured) though not in the culturally offensive sense.
2 - Steven Wright claimed to have named his dog Stay so that he could call out "Come here, Stay! Come here, Stay!" and watch the dog walk toward him in a stuttery fashion.
This post brought to you by "Ω" (U+03a9, GREEK CAPITAL LETTER OMEGA)
I talked to Omega just before this post went live. She said that as the last letter in the Greek alphabet (who was pretty much always therefore last in the queue), she understood the cost of keeping letters in order. Any performance benefit is a good one, to her mind. Especially since a binary sort would let her come before her little sister (U+03c9, GREEK SMALL LETTER OMEGA) for once.
I had this conversation a little over two years ago in the Netherlands at the end of the last day at a conference. It may not be word for word, though I actually think it comes pretty close (it's not like I had a tape recorder). The cookies were Pepperidge Farm Mint Milanos, but I do not like mint (I love the non-mint varieties, I am not sure how I ended up with the ones I did - it might have been a mistake to mention I did not like them).
Oh, also the name of the woman I talked to is not really Andrea; I just like the name and do not mind the nod to Jubal Harshaw....
Me: Andrea, would you like a cookie?
Andrea: Actually, I would like to know what the "Korean Unicode sort" is.
Me: I'd actually rather give you one of these cookies. They are really good. Plus it's less embarrassing than the answer to your question.
Andrea: I know you hate mint, you said so yesterday at the luncheon. C'mon Michael!
(Short pause)
Andrea: Or is it Mike? Or maybe michka like your mails?
Me: Michael's best.
Andrea: Ok, no Russian bears. So tell me, why is the Korean Unicode sort embarrassing? I could not find it defined anywhere, except maybe I found a vague hint to the 'Unicode collation' setting that was used in SQL Server 7.0, which could be Korean. Is that it?
Me: No, that's not what it is. Though SQL Server does have a "Korean Unicode collation" of its own that matches the one that used to be on Windows.
Andrea: Grrr. You are infuriating, Michael. What is the Korean Unicode sort? The one that is in SQL Server, the one that used to be in Windows, the one that is still in the header files. What is it?
Me: Well, it's almost the same sort as the one we use for English.
Andrea: Almost? How close is almost? Sounds like almost hitting a home run, but what kind? Was it an almost home run that was a strike out, or an almost home run that was a triple?
Me: Ouch! Well, if you put it that way, I guess you could say it's a strike out.
(I have an embarrassed smile at this point)
Me: We move one character.
Andrea: One character?
Me: One character.
Andrea: What character is it? Something insulting to a government? Did Microsoft upset the Korean premier or something?
Me: No, nothing like that. It's U+005c, the "REVERSE SOLIDUS". Also known as the backslash. Not insulting at all.
Andrea: One of us has to be missing something, Michael. Maybe you had better give me a cookie.
(She eats a cookie, and tries to hand the package back. I shake my head)
Andrea: So please, explain to me why the backslash has to be moved for Korea.
Me: Well, because for Korean, it is also the Won sign (₩).
Andrea: You said in your talk today that there is room for over a million characters in Unicode. There is no room for a dedicated Won?
Me: Oh, there is a dedicated Won Sign at U+20a9. It's just that in most Korean fonts a character that looks like a Won is put in the slot for U+005c, and since the characters look the same we try to make sure that they are treated as if they were the same.
Andrea: Ok, I see that. But why is it called the Korean Unicode sort? If it's legacy then that would make it the Korean ANSI sort, right?
Me: Well, ANSI does not have Korean in it, and there is no Won.
Andrea: You know what I mean, Michael. Are you this exasperating when you talk with your girlfriend?
Me: Oh, I... I'm between girlfriends at the moment.
Andrea: I WONder why....
Me: Hey now!
(Andrea is wearing quite an impish grin at this point)
Andrea: Just kidding. But I was up too late last night and you already gave me the cookies. So I have no real need to flirt when I am teasing at this point.
Me: Hmmmm, no one ever used to have a need. Anyway, I know what you mean. It probably would have made more sense to tie it to the Korean standard, except that's encoding and not sorting. And they basically do put the won at 0x5c in their encoding standard, so MS is just trying to be consistent. It would have been really weird trying to tie to KSC-5601.
Andrea: I can definitely see that. So, what about the rest of the Hangul and Hanja and Jamo and whatnot that is used by Koreans?
Me: Well, now you understand why it was probably removed from Windows -- because it does not really do much for Korean.
Andrea: But it's still in SQL Server. They didn't get the memo?
Me: I know you think that I am a bigwig at Microsoft, but I'm not. I was offered a job there but I haven't even started yet. And I am definitely not "in the know" about what they do in SQL Server.
Andrea: No need to be shirty, dear. I understand. I apologize for thinking you were important.
(I grimace at this point)
Andrea: Ok, and I apologize for teasing you now. But back to the Korean thing.... do you have a guess?
Me: Oh, definitely. I just don't know if I am right.
Andrea: So what is the theory?
Me: My guess is that there is a serious worry about backward compatibility and sort orders in SQL Server, so they can't really get rid of something as easily, even if it is useless. I guess they could have hacked it since it's only different by one character, but they are a team that is astoundingly against hacks. That's something I can respect.
Andrea: So can I. Probably worth a KB article, at least.
Me: Maybe. If PSS gets customers wondering where good old 0x00010412 went, I'll suggest it.
(She eats another cookie)
Andrea: Ok. I'm sorry to monopolize your time like this.
Me: No worries, the group is gone, the conference is mostly over. Hell, I'd probably be flying out tonight if there were a flight. You can come out with us tonight if you want. Well, that is if we are going anywhere.
Andrea: Actually, you can come out with us. My friends are more socially adept than yours.
Me: Probably true. And more than me, too.
Andrea: One more question and we can head back to what's left of the group.
Me: Ok. What's the question?
Andrea: What's up with the Japanese (Unicode) sort?
Needless to say, the conversation devolved at that point. But Andrea did finish the cookies. I did go out with four of Andrea's friends that night and drank more than I should have. The flight home was harder with a hangover, and to be perfectly honest it was not until I sat down to try and remember the whole conversation earlier tonight that I remembered I was supposed to follow up with PSS.
Maybe the blog entry is good enough at this point? :-)
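For anyone curious about the "good old 0x00010412" mentioned in the conversation, here is a small C sketch of how that number breaks down. The arithmetic mirrors the MAKELANGID, MAKELCID and SORTIDFROMLCID macros from the Windows headers, but the lower-case helper names and the _ID constants are local to this example so that it builds without those headers; treat the constant values as my reading of the headers rather than a definitive reference.

```c
#include <stdint.h>

/* Local stand-ins for the Windows header constants (names are
   specific to this sketch). */
enum {
    LANG_KOREAN_ID         = 0x12, /* primary language: Korean      */
    SUBLANG_KOREAN_ID      = 0x01, /* sublanguage: Korean (Korea)   */
    SORT_KOREAN_KSC_ID     = 0x0,  /* the default Korean sort       */
    SORT_KOREAN_UNICODE_ID = 0x1   /* the "Korean Unicode sort"     */
};

/* Language ID: sublanguage in the upper 6 bits, primary in the lower 10. */
uint16_t make_langid(uint16_t primary, uint16_t sub)
{
    return (uint16_t)((sub << 10) | primary);
}

/* Locale ID: sort ID above the 16-bit language ID. */
uint32_t make_lcid(uint16_t langid, uint16_t sort_id)
{
    return ((uint32_t)sort_id << 16) | langid;
}

/* Pull the sort ID back out of a locale ID. */
uint16_t sort_id_from_lcid(uint32_t lcid)
{
    return (uint16_t)((lcid >> 16) & 0xF);
}
```

So make_langid(LANG_KOREAN_ID, SUBLANG_KOREAN_ID) gives 0x0412, and combining that with SORT_KOREAN_UNICODE_ID gives 0x00010412. The plain Korean locale, with sort ID 0, is just 0x00000412.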
Now, I would have thought that this dialog would have been easy for people to understand, but there has been a lot of confusion related to it. Therefore, this rude little Q&A is intended to cover just what this dialog does for you. As I describe each part, refer to this picture so you know exactly what I am talking about.
I will let you in on a secret: the people who understand this dialog will sometimes laugh their you-know-whats off every time they think about the people who just do not get it. Not the people who ask questions and then learn, but the people who ask questions, get answers, and then start confusing simple terms like BUTTON, DROPDOWN, and LISTBOX. So please, we want YOU to be one of us, the laughers, rather than the laughees.
You should assume that every word of this page is there to allow you to laugh at all the others who simply refuse to get it. If there happens to be a sentence that used to apply to you, just don't tell anyone; we will never know the difference. Now you can laugh with the rest of us -- welcome to the club!
First, the control types:
You press it, and then things happen. If it has three dots at the end of it, usually that means another dialog is going to open when you press it.
Ok, now to the dialog!
Default User Locale (DROPDOWN) - These are the preferences that you, the user, have for items like date formats, calendar, preferences for text sorting, etc. Now most of these settings can be handled individually by clicking on many of the tabs at the top of the dialog (Numbers, Currency, Time, Date). You can think of this dropdown combobox as a useful way to be lazy and have settings made automatically based on the locale you choose. And don't forget -- it's the DROPDOWN at the top of the dialog!
Default System Locale (BUTTON) - This setting is the one that controls, at the machine level, the locale that will be used for all conversions to and from Unicode for applications without Unicode support (like VB, for example). After you hit the BUTTON another dialog will appear. If you change the Default System Locale, you will be prompted to reboot afterwards (you may be prompted for your Windows 2000 CD first if you need to install some files). But I cannot stress it strongly enough: this is the BUTTON at the lower left hand corner of the dialog: the one that says "Set default..." You would not believe how many people mess this up! So think carefully and allow yourself to be one of the people laughing about the confusion, rather than one of the people being laughed at.
User Interface Language (DROPDOWN) - You may not have this control on your regional options at all; it is only there if you have MUI (the Multilingual User Interface) installed. This allows you to change the actual language of Windows itself. It has no effect, I repeat no effect, on your installation of Windows otherwise. At all. Period. If you think it will, then cure yourself of this delusion and realize that you do not need MUI to have a multilingual experience on Windows 2000!
Languages Your System Supports (CHECKED LISTBOX) -- This LISTBOX of languages, each of which has a CHECKBOX to the left, is what controls the installation of all the code pages, fonts, keyboards, etc. so that applications can support the particular language. You will notice that the list has several items that cover many languages; this is intentional! After all, many languages share all of the same information, and thus you can choose the one item and be done with it. You will probably be prompted for your Windows 2000 CD to install the files that you are in essence requesting.
Now, there is obviously more: there is that daunting "Advanced..." BUTTON in the lower right hand corner of the dialog. Let's just ignore that one for now, it is for advanced installation of various alternate legacy code pages (many of which are installed automatically when you add languages that your system supports).
There is also the last TAB of the dialog, which supports input locales. But that is one that will probably require its own page to explain, so that will be a job for another day.
The dead key mechanism in keyboard layouts is rooted in European typewriters. One would type the accent character and the typewriter's head would not advance, then one would type the base character and it would. The term dead key refers to the fact that the position is not advanced after typing the diacritic mark.
So since people are used to this from typewriters, adding a similar mechanism to Windows should be easy, right?
After all, if you are used to Unicode then you know that one first types the base character and then the diacritic. If you are used to typewriters, you expect to see the diacritic. And in both cases you expect that what is finally produced is always made up of the constituent parts that were typed.
In keyboard layouts on Windows, none of these assumptions are true. Nothing is visible after typing the dead key. The layout defines what is to appear when the combination of the dead key and the base character is typed and there is no rule guiding what must appear.
This is a mechanism that is very easy and intuitive if you know about it, otherwise it is as confusing as hell.
Also, you can only have each pairing of keystrokes produce a single UTF-16 code point. I would not have believed it, but this limitation is one of the most common questions I am asked about (not #1 but definitely in the top five -- most often to do with user-created Polytonic Greek keyboards since there is no precomposed form in Unicode). This is not an MSKLC limitation, it is a core limitation in the keyboard layout architecture, as one can find in the kbd.h header file from the Windows DDK (available to all for a mere $199 plus s/h):
/***************************************************************************\
*
* Dead Key (diaresis) tables
*
* LATER #####: supplant by an NLS API that composes Diacritic+Base -> WCHAR
*
\***************************************************************************/
typedef struct {
    DWORD  dwBoth;  // diacritic & char
    WCHAR  wchComposed;
    USHORT uFlags;
} DEADKEY, *KBD_LONG_POINTER PDEADKEY;
There is only room for a single WCHAR for the precomposed character. Sorry!
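To show why the struct forces a one-code-unit result, here is a simplified C sketch of a dead key lookup. It uses fixed-width stand-in types so it builds without the DDK headers, and the table contents, the names DeadKeyEntry, PACK_BOTH and compose_dead_key, and the choice to put the base character in the low 16 bits are all assumptions of this example rather than anything taken from kbd.h.

```c
#include <stdint.h>

/* Simplified stand-in for the DEADKEY struct quoted above. */
typedef struct {
    uint32_t both;      /* diacritic and base character packed together  */
    uint16_t composed;  /* the ONE UTF-16 code unit this pairing yields  */
    uint16_t flags;
} DeadKeyEntry;

/* Assumption of this sketch: base character in the low 16 bits,
   diacritic in the high 16 bits. */
#define PACK_BOTH(base, diacritic) \
    (((uint32_t)(diacritic) << 16) | (uint16_t)(base))

/* Hypothetical table with the acute accent (U+00B4) as the dead key. */
static const DeadKeyEntry dead_key_table[] = {
    { PACK_BOTH(0x0061, 0x00B4), 0x00E1, 0 },  /* a     -> U+00E1 */
    { PACK_BOTH(0x0065, 0x00B4), 0x00E9, 0 },  /* e     -> U+00E9 */
    { PACK_BOTH(0x0020, 0x00B4), 0x00B4, 0 },  /* space -> the accent itself */
    { 0, 0, 0 }                                /* terminator */
};

/* Returns the composed code unit, or 0 when the layout defines no
   result for this pairing. */
uint16_t compose_dead_key(uint16_t diacritic, uint16_t base)
{
    uint32_t key = PACK_BOTH(base, diacritic);
    const DeadKeyEntry *entry;

    for (entry = dead_key_table; entry->both != 0; ++entry) {
        if (entry->both == key)
            return entry->composed;
    }
    return 0;
}
```

Nothing in the lookup checks that the result is related to the two characters that were typed; the table is the only rule. And since the result field is 16 bits wide, there is no way for an entry to produce a base character followed by a combining mark.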
There is also a comment above the struct by someone who no longer works on keyboards (their name removed from above for obvious reasons!), and no NLS API was ever added for the sake of keyboards that would combine the two characters to create the dead key. Technically one already exists -- the FoldString API with the MAP_PRECOMPOSED flag. However, such an API could never be used at this point, since there have been many years of potential keyboard layouts shipped that allow one to attach to two unrelated characters a third unrelated character.
(I'll see about talking to someone about removing the comment!)
Now one thing that is possible in the Windows architecture is chaining dead keys together, so that a dead key plus a base character will then be treated as another dead key waiting for yet a third character. One could then chain that as well, and so on -- adding more and more keystrokes to produce in the end a single code point. This feature is not currently supported by MSKLC since the demand for combinations of three or more keystrokes always involves multiple characters being produced -- one is simply not enough here for anyone who has ever asked....
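Chaining can be sketched in C like this. Everything here is hypothetical: the flag name, the table layout, and the Greek example (diaeresis, then acute, then a vowel) were picked for illustration and are not taken from any shipping layout. The point is only that a pairing whose result is marked "dead" becomes the pending dead key for the next keystroke, and that the chain still ends in exactly one code unit.

```c
#include <stddef.h>
#include <stdint.h>

#define FLAG_RESULT_IS_DEAD 0x0001  /* hypothetical flag for this sketch */

typedef struct {
    uint16_t dead;      /* pending dead key                          */
    uint16_t base;      /* next character typed                      */
    uint16_t composed;  /* single code unit produced by the pairing  */
    uint16_t flags;
} ChainEntry;

/* Hypothetical table: diaeresis (U+00A8) + acute (U+00B4) gives the
   combined accent U+0385, which stays "dead" until a vowel arrives. */
static const ChainEntry chain_table[] = {
    { 0x00A8, 0x00B4, 0x0385, FLAG_RESULT_IS_DEAD },
    { 0x0385, 0x03B9, 0x0390, 0 },  /* + iota    -> U+0390 */
    { 0x0385, 0x03C5, 0x03B0, 0 },  /* + upsilon -> U+03B0 */
};

/* keys[0] is the initial dead key. Returns the final code unit, or 0
   if the sequence is undefined or ends with a dead key still pending. */
uint16_t resolve_chain(const uint16_t *keys, size_t count)
{
    size_t i, j;
    uint16_t pending;

    if (count == 0)
        return 0;
    pending = keys[0];

    for (i = 1; i < count; ++i) {
        const ChainEntry *match = NULL;
        for (j = 0; j < sizeof chain_table / sizeof chain_table[0]; ++j) {
            if (chain_table[j].dead == pending &&
                chain_table[j].base == keys[i]) {
                match = &chain_table[j];
                break;
            }
        }
        if (match == NULL)
            return 0;                       /* pairing not defined */
        if (!(match->flags & FLAG_RESULT_IS_DEAD))
            return (i == count - 1) ? match->composed : 0;
        pending = match->composed;          /* result is the next dead key */
    }
    return 0;  /* ran out of keystrokes while still waiting */
}
```

Three keystrokes go in and one code unit comes out, which is exactly why this does not help the people who ask for it: they want several characters at the end of the chain, not one.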
On the whole, my earlier words about dead keys sum up the situation best:
Luckily there are many ways to produce input with keyboards that is much more intuitive to potential users. Let's leave dead keys for the people who are used to them.
The quote in the title is an allusion to a quote from the Simpsons. Now the ANY key quote is not my favorite quote from Homer J. Simpson (actually, this is), but it's in the top five.
And it popped into my head as I prepared to talk about the AltGR key, which is a key that does not exist on the keyboard upon which I am typing right now.
Anyway....
In a comment from a couple of days ago, Norman Diamond suggested that AltGr stood for Alternate Graphics. That's fine as far as it goes, but only the first two words of his post suggest what AltGR stands for, and the words do not really explain what it means, though he talks a bit about what it does. So the question remains, where does ALTGR come from as a term?
Well, it does indeed stand for Alternate Graphics (or Alternative Grafiken). The original intended purpose of it was to have an easy way to get at the table-based graphical characters that were so handy to use in a console application, located on the right side of the spacebar.
It was, however, used quite actively for keyboards that needed extra keys (and there is no layout I know of today in Windows that supports the graphical characters except by accident in consoles, and they do not use ALTGR to get there). This does not apply to the US English keyboard hardware, so they just put a RIGHT ALT key there which will actually act as if it were the ALTGR key any time you switch to a layout that makes use of this extra shift state. Note that this extra shift state is also available by hitting <Ctrl>+<Alt>, but that's more work to type. So having a single key to type instead is much cooler.
Of course, this can cause problems since sometimes people make shortcuts using <Ctrl>+<Alt>, which screws with what people might want to actually do with a keyboard layout. In fact, Raymond Chen talked about Why Ctrl+Alt shouldn't be used as a shortcut modifier last March, explaining this fact.
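The equivalence between AltGr and <Ctrl>+<Alt>, and the shortcut collision it causes, can be sketched in C as follows. The modifier bit values and both function names are hypothetical and exist only for this illustration; this is not how Windows computes shift states internally.

```c
#include <stdbool.h>

/* Hypothetical modifier bits for this sketch. */
enum {
    MOD_SHIFT_BIT = 0x1,
    MOD_CTRL_BIT  = 0x2,
    MOD_ALT_BIT   = 0x4
};

/* Shift state as the layout sees it. When the active layout uses
   AltGr, the right Alt key is reported as Ctrl+Alt. */
unsigned layout_shift_state(bool shift, bool ctrl, bool left_alt,
                            bool right_alt, bool layout_uses_altgr)
{
    unsigned state = 0;

    if (shift)    state |= MOD_SHIFT_BIT;
    if (ctrl)     state |= MOD_CTRL_BIT;
    if (left_alt) state |= MOD_ALT_BIT;
    if (right_alt) {
        state |= MOD_ALT_BIT;
        if (layout_uses_altgr)
            state |= MOD_CTRL_BIT;  /* AltGr behaves as Ctrl+Alt */
    }
    return state;
}

/* A shortcut on Ctrl+Alt+<key> cannot be told apart from AltGr+<key>,
   so it shadows whatever the layout assigned there. */
bool shortcut_collides_with_altgr(unsigned shortcut_modifiers)
{
    unsigned altgr = MOD_CTRL_BIT | MOD_ALT_BIT;
    return (shortcut_modifiers & altgr) == altgr;
}
```

Pressing right Alt alone on an AltGr layout yields the same state as holding <Ctrl> and left Alt together, so an application that grabs <Ctrl>+<Alt>+E for itself also takes away whatever character the layout had put on AltGr+E.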
I would extend Raymond's very good advice to anybody who uses MSKLC to create custom keyboards (note that MSKLC warns about assigning <Ctrl>+<Shift> to keyboards since many shortcuts are assigned there). Or people who use Word to create shortcuts. Or people on the Microsoft Word team who created tons of "useful" shortcuts that do not mind stomping on what a keyboard layout may have assigned to a keystroke combination1. The key is to think about the keyboards and/or the shortcuts you create in the larger context of where you may either step on others or be stepped upon by them.
And if you create a custom keyboard with MSKLC, consider putting one of the graphical BOX DRAWING characters in the ALTGR state somewhere, so that you can be one of the cool people that makes the AltGR key meaningful again. It's like having an easter egg in software, but with an important recreational purpose!
1 - Every few months I start looking at the Word object model and its KeyBindings collection and related trivia to create a Word Add-in that will listen for keyboard changes and any time a WM_INPUTLANGCHANGE notification is received it would remove the Word shortcuts that conflicted with actual keyboard assignments. I find the undone project I was working on, get into it for a few hours, and then realize that this is something that the Word team or the Office team ought to put together and build into the product. So I send off some mails and they agree with me and then it seems to go nowhere. A few months later it starts over again. Maybe one day one of us will have a finished solution for this problem. :-)
This post brought to you by "╦" (U+2566, a.k.a. BOX DRAWINGS DOUBLE DOWN AND HORIZONTAL).
The competition in the BOX DRAWING block of Unicode to do a topical sponsorship of the post was fierce; it was finally chosen by the drawing of lots, in order to avoid violence.
In the future, an effort will be made to woo "appropriate sponsorship" from Unicode characters based on actual relevance to the specific post. Otherwise, it's like a celebrity endorsement for a product that the famous person does not use -- and I hate that.
The MSDN documentation generally recommends that you use the static versions of libraries like the C Runtime (CRT) or the Microsoft Foundation Classes (MFC). The reason for this is that the DLL versions have not been built for MSLU and thus have no knowledge of the need to use the Microsoft Layer for Unicode for Unicode APIs. However, many complex applications really need to use the DLL versions of these libraries. If you are the developer for one of these applications, you will need to rebuild them so they link with Unicows.lib. The following is a small guide on how to perform this task.
This document is divided into 3 parts:
How to build the C Runtime Library 7.0 with MSLU
How to build MFC 7.0 with MSLU
Switching between non MSLU builds and MSLU builds
The fine print
For more info on rebuilding MFC extension DLLs, see TN033: DLL Version of MFC, specifically the section entitled "Building the MFC DLL" towards the bottom of the article. Our steps here seem a lot nicer. :-)
All of these steps were used to build DLLs that were subsequently tested on Win98 SE. They are expected to work on all platforms.
Special thanks are owed to Ted W. for taking the time to do what we all knew was theoretically possible and making it technically possible for everyone. This document is mostly due to his efforts. Thanks, Ted!
In all instructions below, the assumption is a default install path and an en-US copy of Windows; if either is not the case, make sure you replace paths such as C:\Program Files\Microsoft Visual Studio .NET with the appropriate install location.
Also, special thanks to Tim Dowty of Music Match for the great text of step #5 under the MFC build!
Before you start:
Install Visual Studio .NET including all necessary files.
The first thing you need to do when you install Visual Studio .NET is make sure that both the Unicode MFC version and the CRT source code are installed.
Identify the folders and files you will be modifying.
If you installed to the default locations, all of the files we need to change are contained in the tree \Program Files\Microsoft Visual Studio .NET\VC7. Find the ATLMFC\SRC folder and the CRT\SRC folder.
Install the Platform SDK and copy the latest unicows.lib file to your VC7\PlatformSDK\LIB folder.
Since VC7 comes with a unicows.lib, this step is optional, although it is good to be on the latest unicows.lib from the most recent Platform SDK.
How to build the CRT 7.0 with MSLU
The first thing we want to do is make a backup of our VC7\Lib folder. We will be replacing files in it, so if we need to go back (or switch between MSLU and non-MSLU version of the CRT) we can always do that.
Secondly, let's copy the VC7\CRT\SRC folder to a comfortable place so we can change it and build from it. For example, we'll copy it to the root of C: so we have a folder called C:\SRC, available for quick access from the command line.
When building the CRT we are actually building three DLLs: MSVCR70.DLL, MSVCP70.DLL, and MSVCI70.DLL. Since we are building both debug and release builds it makes a total of six DLLs we need to build. In the SRC folder, there is a provided batch file, bldwin9x.bat, that will build all the CRT DLLs, and an associated makefile.
Now we will open up the makefile in notepad. At the top of the makefile there is a section that controls the naming of the three DLLs to build. For this purpose we will use the name MSLU as a prefix to all of the DLLs instead of the standard name MSVC. So the six names of the DLLs we will create are:
MSLUR70.DLL
MSLUR70D.DLL
MSLUP70.DLL
MSLUP70D.DLL
MSLUI70.DLL
MSLUI70D.DLL
Warning: Since most people who follow these steps will probably use the exact names given here, please be sure to keep these versions of the DLLs in your own private directory when you use them.
The default names provided in the makefile are _SAMPLE_, SAMPLE_I, and SAMPLE_P.
There are associated RC and DEF files for each of these names, so we need to copy them to the new names, i.e.
copy _SAMPLE_.RC MSLUR70.RC
copy SAMPLE_I.RC MSLUI70.RC
copy SAMPLE_P.RC MSLUP70.RC
copy SAMPLE_I.DEF MSLUI70.DEF
copy SAMPLD_I.DEF MSLUI70D.DEF
copy SAMPLE_P.DEF MSLUP70.DEF
copy SAMPLD_P.DEF MSLUP70D.DEF
copy Intel\_SAMPLE_.DEF Intel\MSLUR70.DEF
copy Intel\_SAMPLD_.DEF Intel\MSLUR70D.DEF
Next we need to change the LIBRARY name in each of the above DEF files to match the name of the DEF file. Open up each file in notepad to make the change.
The provided makefile needs some minor changes to get it to work properly and link with Unicows.lib. Change the top block of defines to the following:
RETAIL_DLL_NAME=MSLUR70
RETAIL_LIB_NAME=MSLUR70
RETAIL_DLLCPP_NAME=MSLUP70
RETAIL_LIBCPP_NAME=MSLUP70
RETAIL_DLLIOS_NAME=MSLUI70
RETAIL_LIBIOS_NAME=MSLUI70
DEBUG_DLL_NAME=MSLUR70D
DEBUG_LIB_NAME=MSLUR70D
DEBUG_DLLCPP_NAME=MSLUP70D
DEBUG_LIBCPP_NAME=MSLUP70D
DEBUG_DLLIOS_NAME=MSLUI70D
DEBUG_LIBIOS_NAME=MSLUI70D
The VCTOOLS path should be changed to point to the path where you installed Visual Studio .NET, e.g.
VCTOOLS=C:\Program Files\Microsoft Visual Studio .NET\VC7
We want to link to unicows.lib before any other lib files. On lines 1527, 1580, 1630, 1678, 1728, and 1775, change kernel32.lib to:
unicows.lib kernel32.lib advapi32.lib user32.lib gdi32.lib shell32.lib comdlg32.lib version.lib mpr.lib rasapi32.lib winmm.lib winspool.lib vfw32.lib oleacc.lib oledlg.lib
Once we make these changes, we are ready to build the DLLs. It's simple: launch a Visual Studio .NET command prompt (Start menu > Programs > Visual Studio .NET > Visual Studio .NET Tools > Visual Studio Command Prompt) and then go to the C:\SRC folder and type:
set VCTOOLS=C:\Program Files\Microsoft Visual Studio .NET\VC7
BLDWIN9X
Once the DLLs finish building they will be in a subfolder called BUILD\INTEL. The Libs, PDBs, and Maps are also in that folder. Now we've got 6 libs (3 debug, 3 release) we can link to.
Let's copy those new libs back to the original names of the libs, e.g.
copy MSLUR70.LIB MSVCRT.LIB
copy MSLUR70D.LIB MSVCRTD.LIB
copy MSLUP70.LIB MSVCPRT.LIB
copy MSLUP70D.LIB MSVCPRTD.LIB
copy MSLUI70.LIB MSVCIRT.LIB
copy MSLUI70D.LIB MSVCIRTD.LIB
The reason we do this is so we can link our existing apps (and build MFC) without having to change the libraries that they link to. The Libs still point to the newly named DLLs, even though they don't share the same names as the new ones anymore.
Now copy the 6 MSVC libs to the VC7\Lib folder (overwriting the existing ones).
The CRT build is now done. Before proceeding any further we need to close the command prompt that we used to build the CRT because it created certain environment variables that will cause compile errors in the next step, building the Unicode version of MFC.
Building MFC 7.0 Unicode version with MSLU
First we will make a backup of the following folders (and all subfolders of): VC7\ATLMFC\LIB, and VC7\ATLMFC\SRC so we can restore them later if necessary.
Building the Unicode version of MFC is slightly easier than building the CRT. The Unicode version of MFC is 2 different DLLs (unlike the 5 different DLLs that we had to worry about when building MFC 6.0):
MFC70U.DLL (Unicode Release)
MFC70UD.DLL (Unicode Debug)
There is also a static component to even a DLL build of MFC, named as follows:
MFCS70U.LIB (Unicode Release, static library, deprecated classes)
MFCS70UD.LIB (Unicode Debug, static library, deprecated classes)
To build MFC, there is one master Makefile in the VC7\ATLMFC folder named:
ATLMFC.MAK
And there is one Makefile in the VC7\ATLMFC\SRC\MFC folder named:
MFCDLL.MAK
First, we will change the MFCDLL.MAK file to link to Unicows.lib.
In each file, after the line that states:
link @<<
insert the following lines:
/nod:kernel32.lib /nod:advapi32.lib /nod:user32.lib /nod:gdi32.lib /nod:shell32.lib /nod:comdlg32.lib /nod:version.lib /nod:mpr.lib /nod:rasapi32.lib /nod:winmm.lib /nod:winspool.lib /nod:vfw32.lib /nod:secur32.lib /nod:oleacc.lib /nod:oledlg.lib /nod:sensapi.lib
unicows.lib kernel32.lib advapi32.lib user32.lib gdi32.lib shell32.lib comdlg32.lib version.lib mpr.lib rasapi32.lib winmm.lib winspool.lib vfw32.lib oleacc.lib oledlg.lib
They must go in that position; if we don't do this, a library reference will be included that causes unicows.lib to be linked after kernel32.lib (which will then cause the unicows.dll load to fail). Other DLLs in the wrong order will simply cause APIs in those specific DLLs to not be called.
The line number to insert the above two lines after is line 287.
Now, we will decide what to name our new DLL. We do not want to use the standard name(s) for the same reasons we did not use the standard names for the CRT. So we will come up with a simple naming convention: we'll add an "L" to the name. So the new names will be:
MFC70LU.DLL
MFC70LUD.DLL
Now we need to change the following three files to match our naming. MFC does a LoadLibrary with hard-coded DLL names, so each of those names needs to be changed to our new names in the following files: DLLDB.CPP, DLLNET.CPP, and DLLOLE.CPP.
In DLLDB.CPP, change lines 34 and 42, i.e.
#define MFC70_DLL "MFC70LUD.DLL"
#define MFC70_DLL "MFC70LU.DLL"
In DLLNET.CPP, change lines 33 and 39, i.e.
#define MFC70_DLL "MFC70LUD.DLL"
#define MFC70_DLL "MFC70LU.DLL"
In DLLOLE.CPP, change lines 34 and 40, i.e.
#define MFC70_DLL "MFC70LUD.DLL"
#define MFC70_DLL "MFC70LU.DLL"
Now we're ready to build the versions of MFC:
From a Visual Studio .NET Command Prompt, create a new batch file called buildmfc.bat in the ATLMFC\SRC folder with the following content:
nmake -f atlmfc.mak MFC libname=MFC70L
This will build all MFC libraries, not just the Unicode DLLs, but it will save us the effort of figuring out how to use the MFCDLL.MAK makefile.
Run the batch file. If you need to rebuild any time in the future you now have a convenient batch file to do so. The DLL and PDB files will be created in the VC7\ATLMFC\SRC\MFC\INTEL folder. The LIB files will be created in the ATLMFC\LIB\INTEL folder.
There is one crucial step missing from the supplied MFC makefiles. If you take a look at line 425 of …vc7\mfc\makefile, you'll see that one of the options passed to the compiler is /Zc:wchar_t, which causes wchar_t to become an implicit type. This may be what you want, but if the application you're linking the lib to wasn't compiled with this same option (and, therefore, has wchar_t #defined to unsigned short), you will get unresolved externals when you link. Your program is looking for function signatures with unsigned shorts in them, but the lib only exports wchar_t in the function signatures. You could remove the /Zc:wchar_t from the makefile, but this solution isn't universal; it would still prevent linking with programs compiled with the /Zc:wchar_t switch.
A better solution is to do what Microsoft did in the original mfc70 libraries: include alias records in the library so that you can link both implicit wchar_t and unsigned short programs.
Alias records allow a library to export multiple function signatures that resolve down to the same object code.
So how do you add alias records to your newly-built MFC libraries? For both the debug and release MFC libraries you need to do the following:
a) Extract all of the alias records from the corresponding retail MFC library
b) Create a new library comprising only these alias records
c) Merge your new Unicows-compliant MFC library with the associated alias-record library
Step A)
This one requires a small detour because lib.exe only allows you to extract one object at a time. We want to automate this step by creating a batch file to do all of the extractions. First, get a command prompt and make …\Vc7\atlmfc\lib your current directory. Next, create a list of all of the alias records in both debug and release MFC libs using the following two command lines:
lib /LIST mfc70ud.lib > mfc70ud.lib.lst
lib /LIST mfc70u.lib > mfc70u.lib.lst
You should now have the two .lib.lst files that each contain a list of library objects, one per line. Now, we will create a perl script to build a pair of batch files from the .lib.lst files (if you don't already have perl, it's freely available from several sources. You can find Perl here). Start up a text editor and enter the following text:
#!/usr/bin/perl
# builds a batch file to extract all alias records
# in the input file (input file created with lib.exe /LIST)
$targetLib = "mfc70ud.lib";
$outDir = "_aliasRecordsD";
print "md .\\$outDir\n";
while (<>) {
    # find alias record name
    if (/_alias[0-9]+\.obj/) {
        # chomp strips only a trailing newline, so a last line without one stays intact
        chomp;
        print "LIB /EXTRACT:$_ /OUT:.\\$outDir\\$_ $targetLib\n";
    }
}
Save the text as BuildAliasExtractBatchD.pl. Now edit the text so that the $targetLib variable is changed as follows:
$targetLib = "mfc70u.lib";
also change $outDir as shown:
$outDir = "_aliasRecords";
Save the edited text as BuildAliasExtractBatch.pl.
Now run the two perl scripts as follows from the command prompt:
perl BuildAliasExtractBatchD.pl mfc70ud.lib.lst > BuildAliasExtractD.bat
perl BuildAliasExtractBatch.pl mfc70u.lib.lst > BuildAliasExtract.bat
At this point you have two batch files, one of which will extract the alias records from the debug library and one that will extract from the release library. To complete step a) all that's left is to run the batch files. Note that there are about 2000 alias records in each MFC library, and extracting them one by one is a slow process; each library extraction took about 4 hours on a fast PC. At the completion of this step, you will have two new directories under …\vc7\atlmfc\lib each of which contains extracted alias records. Each extracted alias record is a file with a name of the form _alias*.obj where * is one to four decimal digits.
Step B)
For Step b), we want to create a new library from the extracted records. Fortunately, this can be done in two simple steps; in contrast to Step a) we can use a response file with lib.exe to simplify our operation. First, we create a pair of perl scripts that will build the response files. Use your favorite text editor to enter the following text:
#!/usr/bin/perl
# builds a response file for lib.exe to build a library of
# alias records. (input file created with lib.exe /LIST)
$outLib = "mfc70udAlias.lib";
$aliasDir = "_aliasRecordsD";
print "/OUT:$outLib";
while (<>) {
    # find alias record name
    if (/_alias[0-9]+\.obj/) {
        # chomp strips only a trailing newline, so a last line without one stays intact
        chomp;
        print " .\\$aliasDir\\$_";
    }
}
Save the file as CreateAliasLibD.pl. Now edit the variable declarations so they read:
$outLib = "mfc70uAlias.lib";
$aliasDir = "_aliasRecords";
and save the file as CreateAliasLib.pl.
Run the perl scripts from the command line.
Note that we reuse the .lib.lst files we created in Step A) as input here:
perl CreateAliasLibD.pl mfc70ud.lib.lst > mfc70udAlias.rsp
perl CreateAliasLib.pl mfc70u.lib.lst > mfc70uAlias.rsp
With the response files made, we now use them with lib.exe to create the alias libraries:
lib @mfc70udAlias.rsp
lib @mfc70uAlias.rsp
At the completion of this step you will have two new libraries: mfc70udAlias.lib and mfc70uAlias.lib. They will each contain their respective alias records.
Step C)
In this step, we simply merge our custom-built MFC libraries with the alias libraries we just made. While we're at it, we'll also rename the libraries so they'll replace the original libraries. Note that we get our custom-built libraries directly from their output locations.
lib /OUT:mfc70ud.lib .\Intel\MFC70LUD.lib mfc70udAlias.lib
lib /OUT:mfc70u.lib .\Intel\MFC70LU.lib mfc70uAlias.lib
After the building is done, we need to copy the rest of the created LIBs in the VC7\ATLMFC\LIB\INTEL folder back to their original names in VC7\ATLMFC\LIB (overwriting what's there) so that any of our apps that we link will use the new DLLs, i.e.
copy MFC70LSU.LIB MFCS70U.LIB
copy MFC70LSUD.LIB MFCS70UD.LIB
The reason we do this is the same as for the CRT: we don't have to worry about changing any linker options in our projects to link to the new version of MFC.
We should also copy the PDB files from the LIB\INTEL folder back to the LIB folder.
Now we're ready to do a test build of an application. Create a new SDI MFC application using the AppWizard, choose dynamic MFC, create a Unicode Debug and Release build, change the settings to link to unicows.lib, copy the newly created CRT and MFC DLLs to your DEBUG or RELEASE build folder(s) and then run the application. It should all work. Use dependency walker to make sure that everything is getting linked properly and the proper DLLs are being loaded (run a profile in dependency walker).
There should be no references to the old names of the DLLs for either the CRT or MFC.
Switching between non MSLU builds and MSLU builds
Because we have done all of the above, any Unicode build on your machine will now link to MSLU. We may not necessarily want this, or we may want to link back to the original CRT and MFC DLLs. This is what we made the backups for. To restore the system, simply restore your VC7\LIB and VC7\ATLMFC\LIB folders. You could even make a simple batch file that copies older or newer versions of the LIBs back to the LIB folders depending on what you want to build.
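The "simple batch file" idea for switching LIB sets could look something like the following. This is only a sketch in Python rather than batch, and it assumes you keep each set of LIBs (original and MSLU) in its own saved folder; the function name switch_libs and any folder layout you pass to it are hypothetical, not something the steps above create for you.

```python
import shutil
from pathlib import Path


def switch_libs(saved_dir, lib_dir):
    """Copy every .lib file from saved_dir into lib_dir, overwriting.

    Returns the sorted list of file names that were copied.
    """
    saved_dir = Path(saved_dir)
    lib_dir = Path(lib_dir)
    copied = []
    for src in sorted(saved_dir.iterdir()):
        # Match .lib and .LIB alike, since Windows file names vary in case.
        if src.is_file() and src.suffix.lower() == ".lib":
            shutil.copy2(src, lib_dir / src.name)
            copied.append(src.name)
    return copied
```

You would call it once per folder you backed up (VC7\LIB and VC7\ATLMFC\LIB), pointing saved_dir at whichever saved set you want to build against. Only .lib files are touched, so PDBs would need the same treatment if you want them to match.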
Now, as those who know me personally will attest, I am not a linguist. But I often cannot help but act as if I know something about it. Yet you can tell I am not a linguist (or someone who ever got better than a B+ in grammar!) as I am about to go out on a limb and describe something based on what I think is meant. The reader is therefore warned! :-)
So, the question is more politely "what are genitive dates?". Well, to answer that question, we'll first start with the dictionary definition of genitive. We'll go with the very first definition since it describes the intended usage:
Adj - Of, relating to, or being the grammatical case expressing possession, measurement, or source
In this particular situation, it has to do with that 'possessive' usage.
When in English (in the US) you say 'December 25' aloud you usually say "December twenty-fifth", and this is really a shorter form of "the twenty-fifth of December". It's not the traditional way one thinks of a possessive (after all, December does not "own" the twenty-fifth day in the same sense that one would talk about 'my dignity' or 'the dignity of me', before this posting of course!). But there is a possessive usage going on here, and in some sense December does indeed own 31 days, the twenty-fifth being one of them (while poor February, the 90-pound weakling of the calendar, owns a mere 28.25!).
Anyway, this form of "December" is the genitive form.
In English, 'December' on its own and the genitive forms such as those above are the same words, which is why you may never have learned about most of this in grade school (well, I did not in my grade school -- we could just always blame Beachwood elementary schools if everyone else learned this stuff). But this is not true of all languages. In Czech, for example, the twelfth month on its own is 'prosinec' and the genitive form is 'prosince'. In Greek the difference is 'Δεκέμβριος' versus 'Δεκεμβρίου', in Polish it's 'grudzień' versus 'grudnia'. In Russian it's 'Декабрь' versus 'декабря'. And so on for Belarusian (aka Byelorussian), Ukrainian, Slovak, Latvian, Lithuanian, and others.
Lest any of my English speaking colleagues find this too confusing, they should probably consider trying to explain the differences between the words 'I' and 'me' and their genitive forms 'my' and 'mine' and when each is used, plus the capitalization of 'I'. They will be busy for a long time trying to summarize that. Japanese has numerous ways to do counting which vary with the thing being counted, yet many other things are simpler, like gender-neutral honorifics. Honestly, I suspect that every language to a non-native speaker has some things that are easier and some other things that are harder, yet a native speaker just handles them without even thinking. So perhaps we can forgive those with a different word used in genitive dates since the languages have done nothing wrong? :-)
At this point, one may be looking at the LOCALE_SMONTHNAME* flags in the Locale Information used by GetLocaleInfo or the MonthNames array in the .NET Framework's DateTimeFormatInfo and wondering why I am going on about genitive dates when it looks like Windows does not support them. But if any of my readers have used Czech or any of those other languages, then they can attest that GetDateFormat and the various formatting and parsing functions in the .NET Framework support them quite well; there is simply no method to obtain the raw data. It's one of those cool stealth features that speakers of other languages do not have and thus do not understand and do not expect, while speakers of those languages do not really think about it since everything seems to be working. This has been working properly in Microsoft products since NT 4.0 and Windows 95 and probably earlier and has been in every version of the .NET Framework that has ever shipped.
In fact, the upcoming version of the .NET Framework (code name Whidbey) includes new properties to set and retrieve the genitive form of the month names, so it is no longer really a "hidden" feature anyway. And it was never really hidden to be difficult; it was more that it is very hard to describe to anyone who does not use different forms for months. And things that are hard to document make stuff more confusing for everyone.
If nothing else, it is yet another reason to use the built-in functions and methods for formatting and parsing rather than trying to write one's own!
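To make the idea a bit more concrete, here is a minimal sketch of what a formatter has to do: pick the genitive form when a day number accompanies the month, and the plain form when the month stands alone. It assumes that simple rule and uses only the December names quoted above; the table, the function names, and the "day then month" layout are all hypothetical illustrations, not how GetDateFormat or the .NET Framework actually stores or formats anything.

```python
# (stand-alone form, genitive form) for the twelfth month, taken from the text above.
DECEMBER = {
    "en": ("December", "December"),
    "cs": ("prosinec", "prosince"),
    "el": ("Δεκέμβριος", "Δεκεμβρίου"),
    "pl": ("grudzień", "grudnia"),
    "ru": ("Декабрь", "декабря"),
}


def december_name(lang, with_day):
    """Return the genitive form when a day number is present, else the plain form."""
    standalone, genitive = DECEMBER[lang]
    return genitive if with_day else standalone


def format_december(lang, day=None):
    """Format 'December' alone, or a hypothetical 'day month' date."""
    if day is None:
        return december_name(lang, with_day=False)
    return f"{day} {december_name(lang, with_day=True)}"


print(format_december("pl"))      # month on its own
print(format_december("pl", 25))  # month as part of a date
```

For English the two calls differ only by the day number, which is exactly why English speakers rarely notice that there is a decision being made at all.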
This post brought to you by "ᠲ" (U+1832, a.k.a MONGOLIAN LETTER TA)
People looking at wingdi.h often notice the following constant definitions, somewhere around line 1292:
#define HANGEUL_CHARSET 129
#define HANGUL_CHARSET 129
and they wonder -- which is the right one? People usually assume that one is the older one and the other is preferred.
Well, it's a funny question. The short answer is that it does not matter. They are just simple #defines and they end up being the same value anyway.
The longer answer may be of interest to some, so I'll give that too. :-)
Back in the late 1930s, George McCune and Edwin Reischauer put together a system to represent 한글 (Han'gŭl) in a romanized form using the Latin script. This system (after many years of being used around the world) became the official romanized form used by South Korea from 1984 until 2000 (a very good summary of the system can be found here). In that form the first syllable 한 (Han') is combined with 글 (gŭl) to produce Han'gŭl. People would often skip all accents/diacritics and thus Hangul is the most common way people saw it (especially in identifiers like constants which cannot contain diacritics, but also in general usage). The problem with the information that is lost in the names is a real one, however, and for many years people struggled with an imperfect system.
Then starting in the mid 1990s work was started to try to produce a romanization system that would not have all of those diacritics, and although much of this new standard was communicated earlier, in the year 2000 it was officially published as the official system by South Korea. In that standard the 'ŭ' is actually represented by 'eu', and thus the official romanization of 한글 becomes Hangeul. Given that change, there was really no good reason to not add a HANGEUL_CHARSET constant.
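The two spellings can be derived mechanically from the McCune-Reischauer form, which may help show why both ended up as identifiers. The sketch below is only that: it drops diacritics and the apostrophe to get the old identifier-friendly spelling, and applies the single 'ŭ' to 'eu' substitution mentioned above to get the new one. The function names are made up, and this is in no way a general Korean romanization converter.

```python
import unicodedata

MCCUNE_REISCHAUER = "Han'g\u016dl"  # Han'gŭl, with U+016D LATIN SMALL LETTER U WITH BREVE


def strip_diacritics(text):
    """Decompose the text and drop every combining mark."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def old_identifier_spelling(text):
    """Skip diacritics and the apostrophe, as an identifier has to."""
    return strip_diacritics(text).replace("'", "")


def revised_spelling(text):
    """Apply only the one revised-romanization change discussed here: ŭ becomes eu."""
    return text.replace("\u016d", "eu").replace("'", "")


print(old_identifier_spelling(MCCUNE_REISCHAUER))
print(revised_spelling(MCCUNE_REISCHAUER))
```

Upper-case the two results, add the _CHARSET suffix, and you get the pair of names found in wingdi.h.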
Now there have been some criticisms of the "Revised Romanization of Korean" both inside and outside of Korea (summarized on the government site here with all of the changes) and its ability to properly represent Korean in a completely reversible form, and thus there are people outside of Korea who continue to work with the original McCune-Reischauer romanization. Of course, the existing constant (HANGUL_CHARSET) could not be removed anyway without breaking existing code. Also the constant is not really mentioned explicitly in documentation much since the *_CHARSET constants are not used much in the world of modern font linking and fallback. In the end, it was just easier to leave it in as is, but add the new constant so that people could use the "new" name if they wanted it.
Korean as a language is best represented by using actual Hangeul syllables rather than the romanized form anyway, so neither form really should affect much other than trying to use the term in situations where you need to describe the language in English anyway. Given the poor reversibility, the best way to store Korean text is to not try to romanize it at all if one can avoid it. Other solutions to this problem have been proposed such as the Korean Romanization for Data Applications (KORDA), but there has not been a high demand for this solution in Windows or the .NET Framework since there are no APIs that would make good use of transliterated forms and collation is not really set up to support it either.
A final piece of the puzzle is what happens in North Korea. Essentially, the original McCune-Reischauer form is used for romanization, but the name 조선글 (Chosŏn'gŭl) is preferred. However, the preferred ordering for Jamos in North Korea (and thus by extension for the full syllables that are made up of Jamos) is different than that of South Korea. Therefore, the expected sort for North Korea is not directly available since there is no North Korean locale support in Windows or the .NET Framework, although proper rendering will be achievable if one has appropriate fonts.
This post brought to you by "ᅅ" (U+1145, a.k.a. HANGUL CHOSEONG IEUNG-SIOS)
Old joke, updated for Windows XP:
Q -- Where is the locale?
A -- It's invariant.
Q -- Where is variant?
A -- It's ten miles south of communicado, and five miles east of Cognito.
The invariant locale is pretty weird. Let's take a look at its interesting characteristics.
So why is it there?
Well, it all comes back to collation. Like everything else that is worthwhile in life. :-)
In collation for Windows, there is a default table that gives the ordering of every code point in Unicode. As I noted in my article about how Microsoft does not use the UCA, the default table has been around for a long time, adding code points from version to version as more languages have become supported by Windows. And the thing about the default table is that it supports every language that can co-exist without conflicts -- like English and Greek and Arabic and German (not a complete list!). It is not that they have the same sort -- they don't. It is that they do not have anything in their sorts that conflict with the others on the list (because they either do not share characters at all or they do not sort any of the characters that are shared differently). They all have the non-conflicting rules for the following characters:
A
a
Å
o
Ö
Z
α
β
γ
ب
ح
د
Now you can contrast the way that Swedish would look at the same characters (differences are marked in red):
So obviously, Swedish cannot be handled in the default table due to the fact that in Swedish "Å" and "Ö" are both considered to be separate letters that sort after "Z", rather than being treated as an "A" and an "O" with accessories (diacritics), like they are in English or like something that sorts after but near to "A" and "O" like in German. So it is not handled by the default table, like those other languages are.
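A rough sketch of the conflict may help here. The two key functions below are toy stand-ins, not the real Windows tables: the "default" one treats an accented letter as its base letter plus a diacritic, while the "Swedish" one assumes å, ä and ö are letters of their own that follow z. All names and weights are invented for illustration.

```python
import unicodedata

SWEDISH_EXTRA_LETTERS = "\u00e5\u00e4\u00f6"  # å ä ö, in Swedish alphabet order


def split_base(ch):
    """Return (base letter, combining marks) for one precomposed character."""
    decomposed = unicodedata.normalize("NFD", ch)
    return decomposed[0], decomposed[1:]


def default_key(text):
    """Toy default table: base letter first, then diacritics, then case."""
    primary, secondary, tertiary = [], [], []
    for ch in unicodedata.normalize("NFC", text):
        base, marks = split_base(ch)
        primary.append((0, base.lower()))
        secondary.append(marks)
        tertiary.append(ch.isupper())
    return primary, secondary, tertiary


def swedish_key(text):
    """Toy Swedish exception: å, ä, ö get primary weights of their own after z."""
    primary, secondary, tertiary = [], [], []
    for ch in unicodedata.normalize("NFC", text):
        lower = ch.lower()
        if lower in SWEDISH_EXTRA_LETTERS:
            primary.append((1, str(SWEDISH_EXTRA_LETTERS.index(lower))))
            secondary.append("")
        else:
            base, marks = split_base(ch)
            primary.append((0, base.lower()))
            secondary.append(marks)
        tertiary.append(ch.isupper())
    return primary, secondary, tertiary


letters = ["Z", "\u00d6", "A", "\u00c5", "o", "a"]  # Z Ö A Å o a
print(sorted(letters, key=default_key))
print(sorted(letters, key=swedish_key))
```

The same six characters come out in two different orders, and no single table can give both answers at once, which is the whole reason Swedish needs its own exceptions on top of the default table.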
(For fuller and cooler examples of this sort of thing, see Appendix D from the first edition of Developing International Software for Windows 95 and Windows NT. Though not attributed, Cathy Wissink did those tables and it was how I got to be "impressed in advance" about her work at Microsoft. Even though I had no idea who she was, and would not find out for another half decade.)
Now often people would get into trouble trying to use LOCALE_USER_DEFAULT or LOCALE_SYSTEM_DEFAULT for sorts that were not supposed to change. Either of those, however, would change any time a setting was changed by the user. And that would cause bugs in people's code. On the NLS team, we would recommend that people use MAKELCID(MAKELANGID(LANG_ENGLISH, SUBLANG_ENGLISH_US), SORT_DEFAULT), not because we were trying to be provincial (after all, German or Arabic or Russian or Greek or many other languages would have been fine), but to force an unchanging result on sorts that were not supposed to change on the whim of the user's settings.
Of course, people would often look at using 0x0409 (also known as US English) as Microsoft just being a provincial US corporation. So rather than fight that perception, since the only real goal anyway was to "use the default table" for sorting, a new locale was added. One that would not change, would not vary. It would be.... INVARIANT. And thus, LOCALE_INVARIANT was born in Windows XP, the 136th locale added to Windows.
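For anyone curious where 0x0409 comes from, the arithmetic behind the MAKELANGID and MAKELCID macros is easy to mimic. The sketch below reproduces the bit packing in Python purely for illustration; the constant values are the ones from the Windows headers as I know them, and the lower-case function names are stand-ins for the real macros.

```python
# Values as defined in the Windows SDK headers.
LANG_ENGLISH = 0x09
SUBLANG_ENGLISH_US = 0x01
LANG_INVARIANT = 0x7F
SUBLANG_NEUTRAL = 0x00
SORT_DEFAULT = 0x0


def make_langid(primary, sub):
    """Mimic MAKELANGID: sublanguage in the upper 6 bits, primary language in the lower 10."""
    return (sub << 10) | primary


def make_lcid(langid, sort_id):
    """Mimic MAKELCID: sort ID above the 16-bit language ID."""
    return (sort_id << 16) | langid


us_english = make_lcid(make_langid(LANG_ENGLISH, SUBLANG_ENGLISH_US), SORT_DEFAULT)
invariant = make_lcid(make_langid(LANG_INVARIANT, SUBLANG_NEUTRAL), SORT_DEFAULT)
print(hex(us_english), hex(invariant))
```

Neither value depends on anything the user can change in Regional and Language Options, which is the property that mattered for those sorts.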
Not really such a bad thing to do, since that's all that the folks on the NLS team were trying to do anyway, right?
The same thing exists in the .NET Framework, with its static member CultureInfo.InvariantCulture. It is just as weird for everything in its locale fields for dates and numbers and such. But it has consistent results for sorting that use the default table.
A sort key is basically an array of bytes. The intention of the sort key is to make for faster comparisons of strings, so that if you compare the sort key values for two strings you will get the same results as comparing the strings themselves. They abstract out all of the irrelevant data (for example, if you use NORM_IGNORECASE or CompareOptions.IgnoreCase, then the binary sort keys for "AAAA", "AaAa", and "aaaa" will all be identical). As such, sort keys make a great basis for an index of string values, like you would have in a database engine.
But how are they structured to allow this to happen?
They have the same architecture in both managed code (via the SortKey class) and unmanaged code (via LCMapString with the LCMAP_SORTKEY flag). The structure is described in the LCMapString topic in the Platform SDK:
[all Unicode sort weights] 0x01 [all Diacritic weights] 0x01 [all Case weights] 0x01 [all Special weights] 0x00
Note that the sort key is null-terminated. This is true regardless of the value of cchSrc. Also note that, even if some of the sort weights are absent from the sort key, due to the presence of one or more ignore flags in dwMapFlags, the 0x01 separators and the 0x00 terminator are still present.
The reason for this structure is that the primary weights (called the Unicode weights, above) need to take priority over secondary weights (called Diacritic weights, above), which themselves have to take priority over the tertiary weights (called Case weights, above), and so forth. In this way, all of the following examples are true when using the invariant locale/culture, as described in the last post:
AAAA < AAAB (primary difference)
AAAA < AÃAA (secondary difference)
aaaa < AAAA (tertiary difference)
AÃAA < AAAB (primary difference, secondary difference ignored)
AAAA < aaaB (primary difference, tertiary difference ignored)
aaaaab < aaab (primary difference, length and tertiary difference ignored)
AAA < aaà (secondary difference, special width and tertiary difference ignored)
And so forth. For that to work, the four different categories need to be kept separate and each one needs to be put in the sort key in its entirety, and if any type of weight is ignored then that whole section will be empty.
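To make the layering concrete, here is a minimal sketch in Python of a toy sort key with the same overall layout. It is not the real LCMapString or SortKey algorithm: the function name toy_sort_key, the weight values, and the restriction to the letters a-z with at most one combining mark are all assumptions made for illustration, and real Windows sort keys use different weights and additional rules.

```python
import unicodedata

SEPARATOR = b"\x01"   # sits between the weight sections
TERMINATOR = b"\x00"  # ends the key


def toy_sort_key(text, ignore_case=False):
    """Build a toy key: primary 01 diacritic 01 case 01 special 00.

    Hypothetical weights, chosen only so that no weight collides with
    the 0x01 separator or the 0x00 terminator (every weight is >= 2).
    """
    primary = bytearray()
    diacritic = bytearray()
    case = bytearray()
    for ch in text:
        decomposed = unicodedata.normalize("NFD", ch)
        base, marks = decomposed[0], decomposed[1:]
        lowered = base.lower()
        if not "a" <= lowered <= "z":
            raise ValueError("toy model only handles the letters a-z")
        # Primary weight: which letter it is, ignoring case and marks.
        primary.append(ord(lowered) - ord("a") + 2)
        # Secondary weight: 2 for "no mark", otherwise based on the mark.
        if marks:
            if not 0x0300 <= ord(marks[0]) <= 0x036F:
                raise ValueError("toy model only handles U+0300..U+036F")
            diacritic.append(3 + ord(marks[0]) - 0x0300)
        else:
            diacritic.append(2)
        # Tertiary weight: lowercase sorts before uppercase.
        case.append(3 if base.isupper() else 2)
    if ignore_case:
        case = bytearray()  # the section is empty, the separator stays
    special = b""           # nothing in the toy model uses this section
    return (bytes(primary) + SEPARATOR + bytes(diacritic) + SEPARATOR
            + bytes(case) + SEPARATOR + special + TERMINATOR)


# Plain byte comparison of the keys reproduces the layered ordering.
print(toy_sort_key("AAAA") < toy_sort_key("AAAB"))
print(toy_sort_key("AÃAA") < toy_sort_key("AAAB"))
print(toy_sort_key("aaaaab") < toy_sort_key("aaab"))
```

Because every primary weight comes before the first separator, a primary difference decides the comparison before any diacritic or case weight is even looked at, which is the whole point of keeping the sections separate.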
You can take the sort key, this structured array of bytes, and use it as an index for the string. Comparisons of two byte arrays will always be faster than comparing the strings themselves.
This of course assumes that the sort keys are pre-calculated, like in an index. If they are not, and you are choosing between calculating and then comparing the sort key values for two strings versus comparing the strings themselves, the string comparison will almost certainly be faster. The reason is that the sort key calculation has to analyze the entire string (and that still does not include the actual comparison), whereas a string comparison will exit as soon as it can come up with an answer to the question of which one comes first.
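The two approaches can be sketched side by side. This example uses Python's locale.strxfrm and locale.strcoll as stand-ins: they wrap the C library's collation functions, not LCMapString or the SortKey class, so treat it as an analogy under that assumption. The helper names build_index and sort_by_direct_comparison are made up for the example.

```python
import functools
import locale


def build_index(strings):
    """Compute each key once, then keep (key, string) pairs in key order."""
    pairs = [(locale.strxfrm(s), s) for s in strings]
    return sorted(pairs, key=lambda pair: pair[0])


def sort_by_direct_comparison(strings):
    """Compare the strings themselves, one pair at a time."""
    return sorted(strings, key=functools.cmp_to_key(locale.strcoll))


words = ["pear", "Apple", "fig", "apple", "Fig"]

index = build_index(words)
from_keys = [s for _, s in index]
from_compare = sort_by_direct_comparison(words)

# Both routes give the same order; only the amount of work per
# comparison differs.
print(from_keys == from_compare)
```

Once the index exists, every later lookup or merge only compares stored keys, so the cost of calculating them is paid once per string rather than once per comparison.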
I was doing a presentation a few years ago and it occurred to me that looking at direct string comparison versus sort key calculation/comparison was like looking at the "retail" version of collation vs. the "wholesale" one. Only some people in the crowd felt it was an illuminating analogy, and I once again learned that I should not blurt out "good ideas that suddenly occur to me" when I am in the middle of a presentation. :-)
One last thought -- no, there is not an Ordinal type of sort key. Because Ordinal comparisons are already done in a binary manner!
This post brought to you by "р" (U+0440, a.k.a. CYRILLIC SMALL LETTER ER)