The format of string resources

Unlike the other resource formats, where the resource identifier is the same as the value listed in the *.rc file, string resources are packaged in "bundles". There is a rather terse description of this in Knowledge Base article Q196774. Today we're going to expand that terse description into actual code.

The strings listed in the *.rc file are grouped together in bundles of sixteen. So the first bundle contains strings 0 through 15, the second bundle contains strings 16 through 31, and so on. In general, bundle N contains strings (N-1)*16 through (N-1)*16+15.
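As a quick check on that arithmetic, here is a minimal sketch in portable C++. The helper names (BundleNumberFromId, IndexWithinBundle, FirstIdInBundle) are made up for illustration and are not part of any Windows header; only the groups-of-sixteen rule comes from the description above.

```cpp
#include <cstdint>

// Hypothetical helpers; only the arithmetic comes from the text above.

// Bundle numbers start at 1: IDs 0..15 are in bundle 1, 16..31 in bundle 2.
constexpr std::uint32_t BundleNumberFromId(std::uint32_t id)
{
    return id / 16 + 1;
}

// Position of the string inside its bundle (0..15).
constexpr std::uint32_t IndexWithinBundle(std::uint32_t id)
{
    return id & 15;
}

// First string ID stored in bundle n (n >= 1).
constexpr std::uint32_t FirstIdInBundle(std::uint32_t n)
{
    return (n - 1) * 16;
}

static_assert(BundleNumberFromId(16) == 2, "string 16 starts bundle 2");
static_assert(FirstIdInBundle(2) + 15 == 31, "bundle 2 ends at string 31");
```

So string 31 is the last entry (index 15) of bundle 2, and string 32 is the first entry of bundle 3.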

The strings in each bundle are stored as counted UNICODE strings, not null-terminated strings. If there are gaps in the numbering, null strings are used. So for example if your string table had only strings 16 and 31, there would be one bundle (number 2), which consists of string 16, fourteen null strings, then string 31.

(Note that this means there is no way to tell the difference between "string 20 is a string that has length zero" and "string 20 doesn't exist".)
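To make the layout concrete, here is a small sketch that builds one bundle in memory as a flat array of 16-bit units. BuildBundle is a hypothetical helper, not something rc.exe or Windows exposes, and it assumes every string is shorter than 65536 characters so that the count fits in a single unit.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: lays out one 16-entry bundle as described above.
// Each entry is a count followed by that many characters, with no null
// terminator. IDs missing from 'strings' become a count of zero.
std::vector<std::uint16_t> BuildBundle(
    std::uint32_t bundleNumber,
    const std::map<std::uint32_t, std::u16string>& strings)
{
    std::vector<std::uint16_t> bundle;
    const std::uint32_t firstId = (bundleNumber - 1) * 16;
    for (std::uint32_t id = firstId; id < firstId + 16; ++id) {
        auto it = strings.find(id);
        if (it == strings.end()) {
            bundle.push_back(0);  // gap in the numbering: a null string
            continue;
        }
        // Assumption: the string length fits in 16 bits.
        bundle.push_back(static_cast<std::uint16_t>(it->second.size()));
        for (char16_t ch : it->second) {
            bundle.push_back(static_cast<std::uint16_t>(ch));
        }
    }
    return bundle;
}
```

With string 16 set to "Hi" and string 31 set to "Bye", bundle 2 comes out as 21 units: a count of 2 plus two characters, fourteen zero counts, then a count of 3 plus three characters. Adding an empty string 20 to the input produces exactly the same array, which is the ambiguity noted above.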

The LoadString function is rather limiting in a few ways:

  • You can't pass a language ID. If your resources are multilingual, you can't load strings from a nondefault language.
  • You can't query the length of a resource string.

Let's write some functions that remove these limitations.

LPCWSTR FindStringResourceEx(HINSTANCE hinst,
 UINT uId, UINT langId)
{
 // Convert the string ID into a bundle number
 LPCWSTR pwsz = NULL;
 HRSRC hrsrc = FindResourceEx(hinst, RT_STRING,
                     MAKEINTRESOURCE(uId / 16 + 1),
                     langId);
 if (hrsrc) {
  HGLOBAL hglob = LoadResource(hinst, hrsrc);
  if (hglob) {
   pwsz = reinterpret_cast<LPCWSTR>
              (LockResource(hglob));
   if (pwsz) {
    // okay now walk the string table
    // parentheses are required: < binds more tightly than &
    for (UINT i = 0; i < (uId & 15); i++) {
     pwsz += 1 + (UINT)*pwsz;
    }
    UnlockResource(pwsz);
   }
   FreeResource(hglob);
  }
 }
 return pwsz;
}

After converting the string ID into a bundle number, we find the bundle, load it, and lock it. (That's an awful lot of paperwork just to access a resource. It's a throwback to the Windows 3.1 way of managing resources; more on that in a future entry.)

We then walk through the table skipping over the desired number of strings until we find the one we want. The first WCHAR in each string entry is the length of the string, so adding 1 skips over the count and adding the count skips over the string.

When we finish walking, pwsz is left pointing to the counted string.
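The same walk can be sketched without any Windows types, which makes it easy to try on a hand-built array. FindInBundle is a hypothetical name, and the bounds checks are an addition of this sketch: it assumes the caller knows how many 16-bit units the bundle occupies, whereas FindStringResourceEx simply trusts the resource data.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical, portable stand-in for the walk in FindStringResourceEx.
// 'bundle' points at 'size' 16-bit units holding counted strings.
// Returns a pointer to the count word of entry (id & 15), or nullptr if
// the data ends before that entry is complete.
const std::uint16_t* FindInBundle(const std::uint16_t* bundle,
                                  std::size_t size, std::uint32_t id)
{
    std::size_t pos = 0;
    for (std::uint32_t i = 0; i < (id & 15); ++i) {
        if (pos >= size) return nullptr;  // no count word left to read
        // Adding 1 skips the count; adding the count skips the characters.
        pos += 1 + static_cast<std::size_t>(bundle[pos]);
    }
    // The target entry must fit entirely inside the buffer.
    if (pos >= size) return nullptr;
    if (static_cast<std::size_t>(bundle[pos]) > size - pos - 1) return nullptr;
    return bundle + pos;
}
```

On return, the first unit is the character count and the characters follow immediately after it. FindStringResourceEx performs the same skip-the-count-then-skip-the-characters step directly on the locked resource.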

With this basic function we can create fancier functions.

The function FindStringResource is a simple wrapper that searches for the string in the default thread language.

LPCWSTR FindStringResource(HINSTANCE hinst, UINT uId)
{
 return FindStringResourceEx(hinst, uId,
     MAKELANGID(LANG_NEUTRAL, SUBLANG_NEUTRAL));
}

The function GetStringResourceLengthEx returns the length of the corresponding string plus one, so the result includes room for a null terminator.

UINT GetStringResourceLengthEx(HINSTANCE hinst,
 UINT uId, UINT langId)
{
 LPCWSTR pwsz = FindStringResourceEx
                       (hinst, uId, langId);
 return 1 + (pwsz ? *pwsz : 0);
}

And the function AllocStringFromResourceEx loads the entire string resource into a heap-allocated memory block.

LPWSTR AllocStringFromResourceEx(HINSTANCE hinst,
 UINT uId, UINT langId)
{
 LPCWSTR pwszRes = FindStringResourceEx
                       (hinst, uId, langId);
 if (!pwszRes) pwszRes = L"";
 LPWSTR pwsz = new WCHAR[(UINT)*pwszRes+1];
 if (pwsz) {
   pwsz[(UINT)*pwszRes] = L'\0';
   CopyMemory(pwsz, pwszRes+1,
              *pwszRes * sizeof(WCHAR));
 }
 return pwsz;
}

(Writing the non-Ex functions GetStringResourceLength and AllocStringFromResource is left as an exercise.)

Note that we must explicitly null-terminate the string since the string in the resource is not null-terminated. Note also that the string returned by AllocStringFromResourceEx must be freed with delete[]. For example:

LPWSTR pwsz = AllocStringFromResource(hinst, uId);
if (pwsz) {
  ... use pwsz ...
  delete[] pwsz;
}

Mismatching vector "new[]" and scalar "delete" is an error I'll talk about in a future entry.
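If you are writing ordinary C++ rather than SDK-style code, one way to sidestep the mismatch is to let std::unique_ptr<T[]> own the buffer, since it always releases with delete[]. The following is only a sketch: CopyCountedString is a hypothetical helper that takes a pointer to a counted string (a count followed by that many 16-bit characters), and it uses std::uint16_t in place of WCHAR so that it compiles off Windows.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>

// Hypothetical portable variant of AllocStringFromResourceEx.
// 'entry' points at a count word followed by that many 16-bit characters,
// or is nullptr, which is treated as an empty string.
// unique_ptr<T[]> frees the buffer with delete[], so the caller cannot
// accidentally pair it with scalar delete.
std::unique_ptr<std::uint16_t[]> CopyCountedString(const std::uint16_t* entry)
{
    const std::size_t length = entry ? *entry : 0;
    std::unique_ptr<std::uint16_t[]> copy(new std::uint16_t[length + 1]);
    if (length != 0) {
        std::memcpy(copy.get(), entry + 1, length * sizeof(std::uint16_t));
    }
    copy[length] = 0;  // the stored string has no terminator, so add one
    return copy;
}
```

The buffer is released when the returned unique_ptr goes out of scope. Note that this trades away the "no language-specific constructs" style of the functions above, so it suits application code better than a library boundary.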

Exercise: Discuss how the /n flag to rc.exe affects these functions.
Published Friday, January 30, 2004 7:00 AM by oldnewthing

Comments

# The format of string resources

Friday, January 30, 2004 11:41 AM by HMK's Spurious Thoughts
The Old New Thing talks about format of string resources (Windows)....

# re: The format of string resources

Friday, January 30, 2004 9:06 AM by Frederik Slijkerman
It would be nicer to return a std::wstring from AllocStringFromResource(), so you don't have to use delete[] at all.

# re: The format of string resources

Friday, January 30, 2004 9:38 AM by Reuben Harris
The VC6 resource compiler doesn't like strings with more than 256 characters, and will truncate those that are and give you a warning. This can be annoying if you're doing verbose UI labels, such as a wizard page.

Is this a restriction of the way Win32 expects resource strings to be stored, or is rc.exe just dim?

(Great blog by the way... I've just read through your archive and have fallen into at least half the traps you warn about!)

# re: The format of string resources

Friday, January 30, 2004 9:49 AM by Dan Crevier
Doesn't AllocStringFromResourceEx suffer from the integer overflow problems you've been talking about recently?

# re: The format of string resources

Friday, January 30, 2004 10:23 AM by runtime
Windows CE's LoadString() resource API has a nice "underdocumented" feature to use resource strings without allocating extra memory or copying strings. According to Douglas Boling's book "Programming Microsoft Windows CE", if you pass a NULL lpBuffer parameter to LoadString(), the API will return a read-only pointer to the string. Since the resource strings are not null-terminated, the string length is stored in the word preceding the start of the resource string.

The book also says you can request that resource strings be stored as null-terminated strings if you invoke the resource compiler with the -r command line switch. I don't know if that feature is Windows CE specific.

# re: The format of string resources

Friday, January 30, 2004 11:03 AM by Mike Dunn
One of my favorite tricks for loading strings is to use the little-known CString (MFC or WTL) constructor trick:

CString str ((LPCTSTR) IDS_SOME_STRING);

Then you can make a macro to do that on the fly:

#define _S(id) (CString(LPCTSTR(id)))

and use it inline, such as:

MessageBox ( _S(IDS_BAD_ERROR), _S(IDS_MSGBOX_TITLE), MB_ICONERROR );

Of course, the _S definition above is only for release mode. In debug mode, it's a real function that does a LoadString and asserts if the string can't be loaded.

# re: The format of string resources

Friday, January 30, 2004 12:22 PM by B.Y.
Since the EXE is always mapped into memory, if they designed the format so that strings are always terminated with a zero, can't you get read-only pointers to the strings without the need to allocate memory?

# re: The format of string resources

Friday, January 30, 2004 3:00 PM by Joe
Dan: No, it doesn't. Because the size is coming from a short (0 - 65535), that gets cast to a UINT (0 - 2^32-1), and then has 1 added. There's no way to overflow the allocator because 2 * 65536 < 2^32 - 1.

# re: The format of string resources

Friday, January 30, 2004 3:02 PM by runtime
BY, you would think so, but maybe Microsoft wanted to save the "wasted" space of the null-terminator character? I bet 90% of the time, resource strings are used without modification. Microsoft should have optimized this common case with an API that just returned an easy to use, zero-copy, read-only pointer to the null-terminated resource string.

# re: The format of string resources

Friday, January 30, 2004 6:05 PM by Mike Dunn
runtime - remember that when these APIs were designed and written, they had to work on machines with 4 MB of RAM (the lowest Win95 would run in). That's four MEGAbytes. Lots of null bytes hanging around can add up if there are a lot of strings in the string table.
Sure, now we don't give it a second thought when a .NET app requires a 20MB download and uses 40MB of memory (that's what SharpReader is at right now on my system). In 1993/94/95, things were a *lot* different.

# re: The format of string resources

Friday, January 30, 2004 9:01 PM by Raymond Chen
Frederik: There was a discussion of std::wstring in previous blog comments: http://weblogs.asp.net/oldnewthing/archive/2004/01/21/61101.aspx

I wrote these functions as if they were part of the Platform SDK. This means no language-specific constructs, and certainly no compiler-specific constructs. (std::wstring is not guaranteed to be compatible from one compiler to the next or even from one compiler VERSION to the next. The contract for std::wstring is at the source code level, not the ABI.)

B.Y.: Try solving the exercise.

runtime/Mike Dunn: I actually have a discussion of the historical basis for resource formats scheduled for a future entry. It's even weirder than you think.

# re: The format of string resources

Saturday, January 31, 2004 11:17 AM by asdf
new is a language specific construct. And a non-throwing new is a compiler-specific construct (or a standards compliant compiler with exceptions disabled via a flag).

Using /n it looks like the string length gets reported 1 WCHAR longer than it actually is.

# re: The format of string resources

Saturday, January 31, 2004 11:47 AM by Mike Dunn
Note: there's a bug in the for loop in FindStringResourceEx(), the condition should be "i < (uId & 15)"

Raymond, I noticed that FindStringResourceEx() returns a pointer to the block of memory occupied by the resource, but you call UnlockResource() and FreeResource(), which presumably might free that memory. Is this safe?
OTOH, the docs on LockResource() say: "The pointer returned by LockResource is valid until the module containing the resource is unloaded." That implies that UnlockResource/FreeResource can't free the memory because it would break LockResource(). So who's right?

As for the exercise, adding /n doesn't break your functions, it just makes them allocate one extra WCHAR. When the strings are 0-terminated, the lengths are increased as well, so the code to walk the strings and find a particular one still works.
The code assumes string table entries are not 0-terminated. They become 0-terminated with /n, so the string returned by AllocStringFromResourceEx() has two 0 chars at the end. Mostly harmless.

# re: The format of string resources

Saturday, January 31, 2004 2:58 PM by Raymond Chen
Yeah, I broke my own rule with new[]; I should have used LocalAlloc.

Good catch on the precedence bug.

UnlockResource and FreeResource are NOPs on Win32. More information to come in that promised future blog entry.

# re: The format of string resources

Saturday, January 31, 2004 11:06 PM by dru

How is the langId used?

I couldn't find a good reference to that on my MSDN CD via FindResourceEx.

# re: The format of string resources

Sunday, February 01, 2004 5:18 AM by Raymond Chen
Like the documentation says, it specifies the language of the resource you want to access. You can use the LANGUAGE directive in the *.rc file to provide resources in multiple languages.

# re: The format of string resources

Sunday, February 01, 2004 5:36 AM by Frederik Slijkerman
The best solution might be to introduce a new function ReleaseStringFromResource that would take the pointer from AllocStringFromResourceEx and free it properly, with delete[] or whatever.

That way, you also reserve the right to change the allocation mechanism without breaking backwards compatibility.

# re: The format of string resources

Monday, February 02, 2004 2:39 AM by Mike Dimmick
UnlockResource is a total no-op in the current SDK headers - it's a macro which evaluates the argument, then discards the result.

LockResource is a slightly more substantial no-op, because it's implemented as a function. However, the implementation is basically:

PVOID LockResource(HGLOBAL hGlob)
{
return (PVOID) hGlob;
}

(dumpbin /disasm is your friend...)

# re: The format of string resources

Monday, February 02, 2004 7:51 PM by Lonnie McCullough
I have a question about the MAKELANGID macro (actually about langids in general). What is the difference between (LANG_NEUTRAL, SUBLANG_NEUTRAL) and (LANG_NEUTRAL, SUBLANG_DEFAULT)? Will the first map to the second if there are resources present in the user's default language? Is there an algorithm for falling back from the user's language to other languages in the resource file? I guess I'm just not sure how all this stuff is really handled and would like to know more (trying to do internationalization the right way if at all possible). Even a pointer to a resource would help greatly in clearing up the confusion in my head over how NEUTRAL,NEUTRAL contrasts with NEUTRAL,DEFAULT. Thanks for the great blog.

Go Pats!!!

# re: The format of string resources

Monday, February 02, 2004 9:35 PM by Raymond Chen
Lonnie: I'm going to have to defer on your question. I am not an internationalization expert and I wouldn't want to give the wrong answer.

# The management of memory for resources in 16-bit Windows

Sunday, February 08, 2004 12:04 PM by The Old New Thing

# Mismatching scalar and vector new and delete

Sunday, February 08, 2004 12:06 PM by The Old New Thing

# re: The format of string resources

Wednesday, February 18, 2004 2:11 AM by David Kemp
Lonnie:
(LANG_NEUTRAL, SUBLANG_NEUTRAL) = Language Neutral
(LANG_NEUTRAL, SUBLANG_DEFAULT) = User's Default Language
A language neutral string is different from one in a user's default language.
You can mark a resource as Language Neutral by using "LANGUAGE LANG_NEUTRAL, SUBLANG_NEUTRAL" in the resource file.
I'd imagine you'd want to make a distinction between Language Neutral and the Default Language of your application, though. For example, you might want to default to American English if you don't have resources for the user's preferred language, but you might want to check the system's default language first. You'd want to use Language Neutral only for resources that are truly language neutral (so, probably never, as I doubt such a thing exists).
[The MSDN reference for MAKELANGID is:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/nls_97vo.asp]

# re: The format of string resources

Wednesday, February 18, 2004 7:09 AM by Raymond Chen
Examples of language-neutral resources would be most icons and bitmaps. These are things that don't change regardless of the language. (Of course you have to make sure your icon/bitmap doesn't contain locale-sensitive imagery.)

# re: The format of string resources

Saturday, February 21, 2004 6:57 AM by floyd
Raymond Chen: "Examples of language-neutral resources would be most icons and bitmaps. These are things that don't change regardless of the language."

I'd have to disagree here. Although you cannot translate images in the same way as text resources, there are still potential language specific facets. Just think about bitmaps with text on them. With this being an exception that is easily perceived there are also less obvious nuances: a green coloured UI element would signal a successful operation to those living in western civilizations -- if your product ships to asian countries you would rather change this colour to red.

.f

p.s.: Thanks a lot for sharing your experience. I stumbled across your blog today, almost by accident, and I like it already :)

# re: The format of string resources

Saturday, February 21, 2004 8:02 AM by Raymond Chen
True, bitmaps with text and culturally-dependent images would need to be localized. But I ruled that out in my parenthetical.

It's a good idea to avoid locale-sensitive bitmaps because professional translators tend not also to be accomplished graphic artists.

# re: The format of string resources

Saturday, February 21, 2004 8:48 AM by floyd
Don't get me wrong, I wasn't going to challenge you in any way. I merely meant to illustrate that it isn't always as easy as it may appear to decide whether a resource is language-dependent. I would agree with you that locale-dependent images are generally a bad idea, unless you have tool support to track those as well as good reasons to go for that approach in the first place.

I just didn't want anyone reading this thread take it as a fact that images are generally locale-independent. With that said, I also have a not so obvious string resource that is in fact language-independent: let's say you are writing an image processing application and need to support CMY color space -- if you translate Cyan-Magenta-Yellow into local names, you will run into major trouble when it comes to printing your work.

Anyway, I'm not an expert in this field either. But with all those bits and pieces I picked up along the way the only thing I can say is this: localization is a beast to master.

.f

# re: The format of string resources

Saturday, February 21, 2004 8:57 AM by Raymond Chen
Agreed. Designing your code to be localizable is a lot of work and contains many pitfalls.

# re: The format of string resources

Monday, June 28, 2004 3:37 PM by Raymond Chen
Commenting on this entry has been closed.

# LoadString() C++ class

Sunday, August 28, 2005 6:14 AM by hardwarefetish.com
To localize a WIN32 application into several language versions, there are, in addition to the localized forms, the string tables, which live in the program resources.
So if you want to make your program multilingual, all hardcoi...

# Which language strings to load? How to load them?

Saturday, May 13, 2006 1:17 PM by Sorting It All Out
The SZ (a.k.a. Steffen) asked in the suggestion box:

What is the prefered way to select the "most...

# Don't trust the return address, no really

Thursday, August 17, 2006 10:00 AM by The Old New Thing
No really, you can't.

# GetLocaleInfo for other languages?

Friday, July 13, 2007 5:46 PM by Sorting It All Out

Serdar asked: Hi, Is it possible to call GetLocaleInfo in a different language? What I’m trying to do

# Nektra Advanced Computing Blog » Blog Archive » Windows Live Messenger Internals
