Today's post will provide a basic introduction to the Windows Embedded CE Platform Builder Catalog. If you don't know what that is, consider yourself lucky. If you do have to deal with the catalog data, today's post might be helpful. We'll write a quick C# utility that demonstrates the catalog object model and also provides some useful diagnostic information that might be helpful to catalog file authors. The tool will display any errors or warnings that PB encounters while loading the catalog. It will also list all of the catalog files that it was able to load.
I don't really know how it happened, but somehow about 5 years ago, through no fault of my own, I ended up owning the PB "catalog". For those who don't know, the catalog is a repository for information about the CE operating system, and PB uses this information to help developers customize their CE builds. For example, if you have the environment variable "SYSGEN_AUDIO" set when you build a CE operating system, your operating system will include sound system support. The catalog has a list of these "features", along with a friendly title, description, and some metadata that helps PB ensure that the environment variable is properly exposed to the developer.
There were some significant issues with the catalog's design, and maintaining the catalog data (the CEC files) was not a fun experience. Maintaining the code that worked with the catalog was also problematic. For PB 6.0, I finally had the opportunity to start from scratch with a new catalog format, although a certain amount of backwards compatibility was required (the new catalog needed to provide essentially the same data as the old one, and there would need to be an upgrade path). So I designed a new catalog system, written in C# and based on XML files. While it isn't perfect, I think it is an improvement.
From a development perspective, one thing that is nice about the new catalog is that it has a pretty clean object model (an interface for developers who want to use the catalog). We've never published the interface for the object model, we don't support it for external use, and everything I say here comes without warranty or guarantee of future compatibility. Even so, it isn't too hard to get started with basic tasks using the catalog object model. The reflection and metadata included in every managed assembly make this possible.
Before I go too far, I want to be clear that the design for the catalog object model emphasized the most common use of the catalog as a read-only repository. Loading and reading data from the catalog is (hopefully) reasonably simple and efficient. Using the object model to create or edit catalog files is a bit more of a challenge and not quite as efficient.
Loading a catalog means parsing about 2 megabytes of XML and loading the data into memory. While this happens in less than 1/10 of a second, we don't want to do this too often. In addition, we don't usually need (or want) multiple copies of the catalog in memory. To help with this, the catalog object model provides a cache of read-only loaded catalogs. When loading a catalog via the object model, you can choose to get your own copy of the catalog or you can ask for a read-only copy, which may come from the cache. (The cache is based on the .NET garbage collector via weak references. As long as anybody is using a catalog, it won't be evicted from the cache. As soon as nobody is using it, it becomes eligible for garbage collection. If it re-enters use before it is collected, it is no longer eligible for collection.)
The first step in using the catalog from your own code is to reference the catalog library. This code is in the Microsoft.PlatformBuilder.Catalog assembly, which (assuming you've installed Platform Builder 6 into C:\Program Files) can be found in C:\Program Files\Microsoft Platform Builder\6.00\cepb\idevs\Microsoft.PlatformBuilder.Catalog.dll. In Visual Studio, add a reference to this file. For convenience, you can add a "using" declaration to the top of your code to import the Catalog's namespace:
using Microsoft.PlatformBuilder.Catalog;
Next, to load a catalog, you use the CatalogManager class. The CatalogManager is a collection of catalog files. It handles searching through a winceroot for catalog files to load, loading all of the catalog files into memory, and maintaining the organization of the data in the catalog. It "flattens" the contents of the files, merging the data from all of the catalog files into a single collection of catalog elements. It also groups the data by type (so I can get a list of BSPs without going through all elements in the whole catalog), and generates indexes (so I can quickly look up a BSP by name).
When creating a CatalogManager, you can either create your own (writable) copy or you can load a read-only cached copy that might be shared by other components in the same process. To create a writable copy, just use the CatalogManager's constructor:
CatalogManager catalog = new CatalogManager();
The default constructor for the CatalogManager will load only the "Global" catalog files. These are the catalog files in C:\Program Files\Microsoft Platform Builder\6.00\Catalog, and need to be available to PB whether or not it is working with a CE OS tree. To load the catalog files from an OS tree as well as the global catalog files, just add the winceroot path to the constructor call:
CatalogManager catalog = new CatalogManager(@"C:\wince600");
If you don't need to make changes to the catalog manager's data and don't need precise control over the load sequence, you can use the CatalogManager factory method. This will return a cached catalog if one is available, so it is usually more efficient for repeated use than creating and loading a new catalog each time.
CatalogManager catalog = CatalogManager.GetCatalogManager(@"C:\wince600");
Next, we want to know if the catalog manager had any trouble while loading the catalog. For normal operation, developers generally don't want a defective catalog file to prevent PB from working correctly, so the catalog manager will silently catch catalog load errors. However, when working on catalog files, it is helpful to know what PB really thinks about your catalog. The catalog manager keeps track of warnings and errors generated during catalog load. You can access the errors via the catalog manager's Messages property, which is a read-only collection of CatalogMessage objects.
The catalog manager also provides a utility method that formats the error or warning messages and writes them to any TextWriter (e.g. a StreamWriter or StringWriter). For example, to print all error or warning messages to stdout, you could do this:
CatalogManager.WriteCatalogMessages(catalog.Messages, Console.Out);
Finally, we want to access some of the data in the catalog. Let's print out a list of all of the catalog files that were successfully loaded:
foreach (CatalogFile file in catalog.Files) { Console.WriteLine("CatalogFile: {0} - \"{1}\"", file.FileInformation.Title, file.FullName); }
Similar collections exist for the BSPs, CPUs, and "Items" in the catalog.
Here is a simple command-line program that pulls this all together. You can build this via a Visual Studio C# command-line app project, or you can compile this at the command line with the csc.exe C# compiler. The only trick is to add a reference to the Microsoft.PlatformBuilder.Catalog.dll assembly (from C:\Program Files\Microsoft Platform Builder\6.00\cepb\idevs).
Have fun!
using System;
using System.IO;
using Microsoft.Win32;
using Microsoft.PlatformBuilder.Catalog;

namespace CatalogErrors
{
    public class Program
    {
        public static void Main(string[] args)
        {
            try
            {
                string winceroot = null;
                if (args.Length > 0)
                {
                    // A help switch leaves winceroot null, which prints the usage text below.
                    string arg0 = args[0].ToLowerInvariant();
                    if (arg0 != "/?" && arg0 != "-?" && arg0 != "/h" && arg0 != "-h" && arg0 != "--help")
                    {
                        winceroot = args[0];
                    }
                }
                else
                {
                    winceroot = GetDefaultWinceroot();
                }

                if (winceroot != null && !Directory.Exists(winceroot))
                {
                    Console.Error.WriteLine(
                        "ERROR: Winceroot \"{0}\" not found.", winceroot);
                    winceroot = null;
                }

                if (winceroot == null)
                {
                    Console.Error.WriteLine(
                        "Usage: CatalogErrors [winceroot]");
                    Console.Error.WriteLine(
                        "Displays any CE catalog load warnings and errors.");
                    Console.Error.WriteLine(
                        "If winceroot is not specified, uses default from registry.");
                }
                else
                {
                    Run(winceroot);
                }
            }
            catch (Exception ex)
            {
                Console.Error.WriteLine(
                    "ERROR: {0}: {1}{2}", ex.GetType().FullName, ex.Message, ex.StackTrace);
                ex = ex.InnerException;
                while (ex != null)
                {
                    Console.Error.WriteLine(
                        " : {0}: {1}{2}", ex.GetType().FullName, ex.Message, ex.StackTrace);
                    // Advance to the next inner exception so the loop terminates.
                    ex = ex.InnerException;
                }
            }
        }

        private static string GetDefaultWinceroot()
        {
            string winceroot = null;

            // Prefer the per-user setting.
            using (RegistryKey localDir = Registry.CurrentUser.OpenSubKey(
                @"Software\Microsoft\Platform Builder\6.00\Directories"))
            {
                if (localDir != null)
                {
                    winceroot = localDir.GetValue("OS Install Dir") as string;
                }
            }

            // Fall back to the machine-wide setting. (Assumes the same key path
            // exists under HKEY_LOCAL_MACHINE; reading HKEY_CURRENT_USER a second
            // time would make this fallback a no-op.)
            if (winceroot == null)
            {
                using (RegistryKey globalDir = Registry.LocalMachine.OpenSubKey(
                    @"Software\Microsoft\Platform Builder\6.00\Directories"))
                {
                    if (globalDir != null)
                    {
                        winceroot = globalDir.GetValue("OS Install Dir") as string;
                    }
                }
            }

            return winceroot;
        }

        private static void Run(string winceroot)
        {
            // Use the winceroot chosen in Main rather than a hard-coded path.
            CatalogManager catalog = CatalogManager.GetCatalogManager(winceroot);
            CatalogManager.WriteCatalogMessages(catalog.Messages, Console.Out);

            foreach (CatalogFile file in catalog.Files)
            {
                Console.WriteLine("CatalogFile: {0} - \"{1}\"",
                    file.FileInformation.Title, file.FullName);
            }
        }
    }
}
|
-
Steve Ball posted an article about some "glitching" issues in Vista. I can't resist adding my two cents.
For me, Vista definitely glitches a LOT more than previous versions of Windows. As a fairly experienced developer, I think I understand the reasons pretty well, so I can explain it away. But as a user, when my laptop audio is glitchy, I want to find the developer responsible and (censored for mild descriptions of hypothetical violence).
I've read a lot of comments raising various theories, some that call into question the sanity of the Windows developers. I can't say I blame them. There is definitely some room for improvement in the way things work. However, in the interest of fairness and progress, the attention should be focused where it will do the most good. That means we shouldn't simply blame the Windows developers unless there is really something they can do about the problem. And that means that before we start placing blame, we probably ought to figure out where the problem really lies.
The first complaints always mention something about the "lame" and "brain-dead" Windows NT scheduler. However, I'm pretty convinced that this is not the problem. In fact, my audio sounds BETTER when my system is under load (more on this later). I agree with the statement that audio glitches under CPU load are usually the fault of the OS and the scheduler, but I've seen very little correlation between CPU load and audio glitching. I haven't seen any evidence that the CPU scheduler is at fault.
Hard disk load is occasionally an issue, but that is generally fairly obvious and easy to fix at the application level. The application simply didn't buffer enough sound samples and ran out of music to play while waiting for the hard drive to load the next bit of music. Either tweak the buffering algorithm of the application or get a faster hard drive (or network).
Memory can also be an issue. Some buffer or code needed to play the music might be paged out because you're running low on free memory, and it didn't get paged in quickly enough once the application tried to access it. This could possibly be blamed on the OS if the OS is too aggressive in trimming the working set. If an app pre-loads 60 seconds of audio, that means it won't touch the last page of the buffer for 60 seconds, which might be long enough for the OS to page out the buffer. Here, you would probably get better results by buffering only 5 seconds worth of music. In any case, I haven't had any significant trouble with this on Vista (except once in a while when I let Firefox run too long, it eats up 1.5 GB of RAM, and my system goes into memory panic mode).
Drivers are a much more significant issue. Traditional OSes (XP, Vista, Linux) can't really schedule a driver's activity. Once the driver starts doing something it thinks is important, it can only be interrupted by a higher-priority driver. On a single-CPU system, if a driver takes over, no new audio can be buffered until the driver returns. Many drivers written for XP or earlier systems (where the audio buffer was somewhat more forgiving -- more on this later) cause trouble by doing too much processing at once. In testing (under XP), the driver's latency wasn't a problem, but on Vista, the latency requirements are much less lenient. I've seen significant Vista audio glitch issues go away after upgrading from an XP-era driver to a newly released Vista-compliant driver.
Even if the driver only does 1 millisecond of work at a time (or whatever the Vista latency recommendations are), if it has to do this 1000 times a second, it will still use up all available CPU time. Drivers have priority over all applications, so on a single-CPU system, this leaves no CPU for the audio application and mixer. On a multi-CPU system, this can still be a problem if the driver holds certain locks that are needed for audio processing. This is why Vista throttles network activity when the audio channel is open -- network packet bursts can easily use enough CPU to cause audio glitches. Probably a good idea overall, though it seems that the throttling algorithm is a bit too aggressive and has some room for improvement.
Another issue is power management. This turned out to be the major problem on my laptop. My laptop's motherboard (CPU and chipset) goes into sleep mode whenever it detects that it is "idle". That's actually a pretty good thing because it means I can get 2 or 3 hours of use out of the battery instead of 20 minutes. This happens hundreds of times per second -- it sleeps for 2 milliseconds, wakes up to handle a keystroke, sleeps for another 2 milliseconds, wakes up to handle the calculations for an animation, sleeps a bit more, wakes to fill an audio buffer, etc. But it sometimes doesn't wake up quickly enough to buffer the next bit of audio. If it ever takes longer than 9 ms to wake up, there will be a glitch. This was a real problem when my laptop was new. Recent drivers have improved this a lot, but there's still a bit of scritch-scratch during some games or media.
As an experiment, I wrote a very simple application to prevent the motherboard from going to sleep. It starts a low-priority thread that does nothing but spin in a busy-wait loop. A Sleep(1) loop didn't help -- it gave the motherboard a chance to go to sleep. While the busy wait makes my laptop get very hot, it also completely stops the glitching.
#include <windows.h>
#include <stdio.h>

// This probably doesn't really do anything. At such a low priority, the
// process usually terminates before the thread exits. But this is an easy
// way to avoid certain compiler warnings. Without it, some compilers warn
// that the "return 0" below is unreachable. If I remove the "return 0",
// other compilers warn that I don't return a value from DoNothingQuickly.
volatile BOOL g_stopNow = FALSE;

DWORD WINAPI DoNothingQuickly(LPVOID /* unused */)
{
    while (!g_stopNow)
    {
        // Sleep(1) didn't fix the glitching. Sleep(0) just spends all
        // the time context switching, which is probably as bad or
        // worse than a busy wait in terms of impact on the rest of
        // the system. So I'll just do a busy wait.
    }
    return 0; // Usually never reached.
}

int main()
{
    int returnCode;
    DWORD dwThreadId;
    HANDLE hThread;

    hThread = CreateThread(
        NULL,
        0,
        DoNothingQuickly,
        NULL,
        CREATE_SUSPENDED,
        &dwThreadId);
    if (hThread != NULL)
    {
        // We want to keep one CPU wide-awake and leave any other
        // CPU(s) idle.
        SetThreadAffinityMask(hThread, 1);
        // We don't want to get in the way of any useful work.
        SetThreadPriority(hThread, THREAD_BASE_PRIORITY_IDLE);
        ResumeThread(hThread);
        CloseHandle(hThread);
        printf("Caffeine: Now running. Press <Enter> to quit...");
        getchar();
        g_stopNow = TRUE; // Probably useless, but might as well...
        returnCode = 0;
    }
    else
    {
        printf("Caffeine: Unable to create thread. Exiting.\n");
        returnCode = GetLastError();
    }
    return returnCode;
}
Drivers seem to be a big part of the problem here -- they either spend too much time working, or they take too long to wake up after going into a sleep state. Hopefully this means that audio problems will go away as the drivers improve. Computer retailers like Dell and HP will probably ensure that their new hardware meets the Vista latency requirements before putting it on the market. Unfortunately, owners of older hardware might be out of luck.
Hindsight is 20/20. I've seen how these kinds of issues come up, and I've been involved in some mistakes myself, so I don't want to sound like I'm smarter than anybody on the Windows audio team. However, there is certainly some room for improvement in the way this issue has played out. While the changes in the audio stack are technically admirable and the problems can generally be blamed on drivers, that's little comfort to those enduring static on their speakers. Things work in XP and don't work in Vista. That sounds like a regression, not an improvement. It's getting better, but there really shouldn't have been a problem in the first place.
What's wrong? Well, Vista aimed for a technically superior audio experience. Latency has been significantly reduced in Vista -- when you fire your machine gun in your favorite game, you'll hear the sound effect a bit more quickly. For gamers and audio professionals, this reduction in latency can make a big difference. For people trying to listen to their MP3s, this probably doesn't matter much. The downside to reducing latency is that it reduces the margin for error. Vista cannot tolerate any delays longer than 5 or 10 milliseconds without glitching, while XP could usually tolerate a much longer delay with no problem. Assuming all of the drivers do their part, modern hardware actually has no problem meeting the deadlines. But if anything goes wrong on Vista, you can hear it.
What was the mistake? The core of the issue is that a change was made that was detrimental to some customers. In the long term, the change is probably a step (or two) in the right direction, but for many people, the change causes trouble and offers no immediate benefit. Obviously the severity of the problem was underestimated (contrary to popular belief, Windows developers do care about their customers and would never have done this had they known the outcome). I've never been a big fan of taking away without giving something in return.
What would have prevented the problem? I don't know how hard this would have been to implement, but I would love to have some kind of adjustment knob to control the amount of latency I want on my system. That would have sidestepped the whole issue by allowing the customer to pick their priorities, i.e. lower latency on my desktop where there aren't power issues and where I want my games to sound great, higher latency on my laptop where I want additional power savings, the drivers aren't as good (and are also optimized for power savings), and where I'm never using the audio stack for anything other than music or videos anyway. This would also have been a great mitigation for driver issues during the transition period.
This is probably a good lesson for developers in general -- be sure to consider the transition period between the old system and the new system when designing the new system.
In the meantime, I've finally gotten my audio problems worked out. After the latest motherboard chipset upgrade, I no longer have to run caffeine.exe, and I only run into minor static when running a few specific programs. Hopefully things are improving for everybody else, too.
|
-
A common complaint about PB 5.0 is that the OS Design View tab will sometimes mysteriously disappear. The best answer I have is that people should be using PB 6.0. Unfortunately, that answer tends to make people want to punch me. In the interest of my own safety, here is what I know about alternative methods of resolving the problem.
The OS Design View tab is one of the places where the old PB shell code (VS 6.0 vintage 1998) meets up with the new PB object model (totally re-written and designed for integration with the VS 2005 shell). So this is more evidence that you can't put new wine in old bottles -- you'll almost always end up with leaks.
In the case of the OS Design View, most of the issues I've resolved were due to an exception leaking. Exception handling is a great thing, but it has the good/bad quality that it is often easy to forget that you're calling a function that can throw an exception (good when you don't need to handle it, bad when you do...).
Ideally, the exception will be caught by PB and PB will properly recover and/or display a useful error message, but this doesn't always happen, in which case PB just silently fails or crashes. Luckily, the exception type we use does a little bit of logging. The exception constructor will send a quick message to the system debug stream. Sometimes this message is useless, but every once in a while, the exception has enough information to help you track down the problem.
There are several utilities that can monitor the debug stream. DBMon.exe is a simple console tool that is part of the Microsoft Platform SDK. DebugView can be downloaded from the SysInternals website. If you run cepb.exe under a debugger such as WinDbg or Visual Studio, you'll also see the messages from the debug stream.
In the case of Missing OS Design View tabs, I have occasionally been able to track down the problem by monitoring the system debug stream while opening an OS Design (pbxml). For example, I saw an "Access Denied" exception that occurred while trying to open a file in an OS Design. The exception message included the file name. Removing the "Read Only" attribute from the file allowed me to successfully open the OS Design.
Obviously, this is a problem with PB. PB should probably be able to do its job with read-only access to the file. And if it fails, it should at least pop up a useful error message indicating what the problem was. Hopefully, we'll do better in the future. In the meantime, you can try to identify the file that is causing trouble for PB and remove the read-only attribute. While this might cause trouble for your source control system, you'll hopefully at least be able to get back to work.
- Start a debug stream monitor utility such as DbMon.exe or DebugView, then start cepb.exe. Alternatively, launch cepb.exe under a debugger such as Visual Studio or WinDbg.
- Recreate the scenario that is causing trouble. For example, if the OS Design View tab is missing when you open a particular OS Design, open that OS Design.
- Examine the messages logged to the debug output stream.
- If you find any messages that look interesting, try to resolve the problem they indicate. For example, if you see that one of the messages refers to an access denied failure when opening a particular file, verify that the file exists and that you can access the file. If the file is read-only, try removing the read-only attribute from the file using Explorer (right-click on the file and select "Properties") or the DOS Attrib command.
- Leave a comment below or contact me to let me know if this does or does not work for you!
|
-
Home-grown hash functions considered harmful...
About 7 years ago, when I was a high-minded computer science senior, a lowly computer science freshman buddy mentioned the trouble he was having with his homework. He had to design a hash table, and he was getting too many collisions for his design to be accepted for the assignment. I suggested that he always stick with a prime number of buckets. He tried that and -- like magic -- his collision rate went down dramatically. He asked me why it worked, and (being a member of the CS elite) I just scoffed. Isn't it obvious? Prime numbers are just better!
Now that I've been taken down a notch or two (six years of real industry experience will do that), I've learned that contrary to common practice, it really isn't obvious. I've also learned that magic solutions to hard problems should be examined with suspicion.
The rest of this post is probably pretty boring, so here's the quick summary:
- If you have to write a hash function for something important (e.g. a .NET GetHashCode method or a C++ hash traits class), don't design your own. Steal one that has been well-tested. One of my favorites for speed, simplicity, and quality is the "Jenkins One-At-A-Time hash" (aka joaat hash - "one-at-a-time" refers to one byte hashed per iteration, though the algorithm is probably still mostly ok if you use a 16-bit char per iteration instead). You can find a good description of it and many others here. For more information, see the Wikipedia entry on hash tables.
- If you see any locally-developed hash table implementation that requires a prime number of buckets, you might want to track down the author and suggest a better hash function that doesn't place a prime number constraint on the number of buckets. Prime numbers are so 20th century! (Keep in mind that the hash table author might be aware of better ways but is remaining backwards-compatible with previous releases or is protecting the clients of the hash table from their own bad hash functions.)
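For reference, here is a minimal C sketch of the one-at-a-time hash mentioned in the summary above. The function name joaat_hash and its signature are my own choices for illustration; the mixing steps follow the algorithm as commonly published, and the code assumes a 32-bit hash and byte-at-a-time input.

```c
#include <stddef.h>
#include <stdint.h>

/* Jenkins one-at-a-time hash over a buffer of bytes.
   (The name and signature are illustrative, not from any library.) */
uint32_t joaat_hash(const unsigned char *key, size_t length)
{
    uint32_t hash = 0;
    size_t i;

    /* Mix in one byte per iteration. */
    for (i = 0; i < length; i++) {
        hash += key[i];
        hash += hash << 10;
        hash ^= hash >> 6;
    }

    /* Final avalanche so the last few bytes affect every output bit. */
    hash += hash << 3;
    hash ^= hash >> 11;
    hash += hash << 15;
    return hash;
}
```

Calling joaat_hash on the single byte "a" yields 0xca2e9442, which is an easy value to check an implementation against. To pick a bucket, take the result modulo the number of buckets.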
The Goal: Fast Lookup
(This is review for those who aren't familiar with hash tables.)
A common task (in both real life and computer programming) is looking things up. Given input "A" (the "key"), I want to find a corresponding value "B" (the "value"). Think finding a word in a dictionary (given a word, I want to find the definition) or finding the page of a book given a page number.
If the key is easily enumerable (for the purpose of this discussion, enumerable means I am able to easily map each key onto a unique non-negative integer) and all valid keys are densely packed into a small range in the enumeration, a numbered list works quite well. For example, if the key is an integer, and valid keys are all in the range 200 to 300, and all possible keys from 200 to 300 are valid (have associated values), then the input is easily enumerable and the enumeration is dense. For computer programs, the logical data structure for this situation might be an array or a vector, mapping each key (200..300) to an array index (0..100) by subtracting 200 and rejecting any key outside of the range 200 to 300.
If the key is not easy to enumerate or if the enumeration is not dense, looking up the value might be more difficult. If the key is enumerable but not dense (e.g. a string with up to 256 Unicode characters), you could theoretically create an array with indexes corresponding to all possible inputs, but unfortunately, most of the slots in the array will be empty and wasted. In the case of 256 Unicode characters, at one bit per value, your array of values will require more bits of storage than there are atoms in the known universe.
One good solution is to make a list of items sorted by the associated key. The list can then be searched using a binary search. This works quite well for a large class of problems, and assuming that the list allows for constant-time lookup of item a (where a is a number referring to the a'th item), finding an arbitrary item is O(log N) (where N is the number of items in the list). (To allow quick insertions and deletions, the list is typically implemented as a tree, but that's just an implementation detail.) The STL std::map class uses this method.
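As a rough sketch of the dense case, the 200..300 example might look like this in C. The names KEY_MIN, KEY_MAX, g_values, and dense_lookup are made up for illustration, and the values are assumed to be plain ints.

```c
#include <stddef.h>

#define KEY_MIN 200
#define KEY_MAX 300

/* One slot per valid key: index 0 holds key 200, index 100 holds key 300. */
static int g_values[KEY_MAX - KEY_MIN + 1];

/* Returns a pointer to the slot for key, or NULL if key is out of range. */
int *dense_lookup(int key)
{
    if (key < KEY_MIN || key > KEY_MAX) {
        return NULL;
    }
    return &g_values[key - KEY_MIN];
}
```

The lookup is a range check plus one subtraction, which is about as cheap as a lookup can get. The rest of this post is about what to do when the keys aren't this cooperative.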
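As a rough sanity check on that claim, assume each character is a 16-bit code unit (65,536 possible values; that is my assumption, not something stated above), count only the strings of exactly 256 characters, and use the commonly cited estimate of about 10^80 atoms in the observable universe:

```latex
\[
65536^{256} = \left(2^{16}\right)^{256} = 2^{4096} \approx 10^{1233} \gg 10^{80}
\]
```

Shorter strings only add to the total, so the conclusion holds under these assumptions with plenty of room to spare.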
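Here is a minimal sketch of the sorted-list approach, using a plain sorted array and a hand-written binary search. The Entry type and the sorted_find name are hypothetical, and the keys are assumed to be C strings sorted in strcmp order.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *key;
    int value;
} Entry;

/* entries must be sorted by key in strcmp order.
   Returns the matching entry, or NULL if the key is absent. */
const Entry *sorted_find(const Entry *entries, size_t count, const char *key)
{
    size_t lo = 0;
    size_t hi = count; /* search the half-open range [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int cmp = strcmp(key, entries[mid].key);
        if (cmp == 0) {
            return &entries[mid];
        }
        if (cmp < 0) {
            hi = mid;     /* key sorts before entries[mid] */
        } else {
            lo = mid + 1; /* key sorts after entries[mid] */
        }
    }
    return NULL;
}
```

Each iteration halves the remaining range, which is where the O(log N) comes from. It only works because entries[mid] can be reached directly by index.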
Another good solution is to split the set of values into many small groups or "buckets" based on the key. Each possible key always maps to the same bucket. Now if I want to find an item, I don't have to search the whole list -- I just have to search through the items in the correct bucket. Assuming a good mapping that evenly distributes items among buckets and a number of buckets proportional to the number of items, this method can allow us to find an item (or know that it is not present) in O(1) time. This method is often called the hash table.
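A bare-bones sketch of the bucket idea in C follows. All of the names (Table, Node, table_put, table_get, BUCKET_COUNT) are invented for this example. To keep it short, the keys are unsigned integers and the bucket is simply the key modulo the bucket count, which is only reasonable if the keys are already evenly spread. Anything else needs a real hash function first.

```c
#include <stdlib.h>

#define BUCKET_COUNT 8u

typedef struct Node {
    unsigned key;
    int value;
    struct Node *next;
} Node;

typedef struct {
    Node *buckets[BUCKET_COUNT]; /* each bucket is a singly linked list */
} Table;

/* Every key always maps to the same bucket. */
static unsigned bucket_of(unsigned key)
{
    return key % BUCKET_COUNT;
}

/* Inserts or overwrites. Returns 0 on success, -1 if allocation fails. */
int table_put(Table *t, unsigned key, int value)
{
    unsigned b = bucket_of(key);
    Node *n;

    for (n = t->buckets[b]; n != NULL; n = n->next) {
        if (n->key == key) {
            n->value = value;
            return 0;
        }
    }
    n = malloc(sizeof *n);
    if (n == NULL) {
        return -1;
    }
    n->key = key;
    n->value = value;
    n->next = t->buckets[b];
    t->buckets[b] = n;
    return 0;
}

/* Searches only the one bucket the key maps to. NULL means "not present". */
const int *table_get(const Table *t, unsigned key)
{
    const Node *n;

    for (n = t->buckets[bucket_of(key)]; n != NULL; n = n->next) {
        if (n->key == key) {
            return &n->value;
        }
    }
    return NULL;
}
```

A Table must start out zeroed (for example, Table t = {{0}};). The sketch never frees its nodes or grows the bucket array, both of which a real implementation would need to do to keep the O(1) behavior as items are added.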
The Challenge: Hash Functions
The mapping of input keys to bucket numbers is called the hash mapping. A good hash mapping will quickly map any input key A to an integer m from 0..M-1. It will do this in a way that evenly distributes the items in the hash table so that most of the buckets have about the same number of items. An ideal hash mapping will result in each bucket having the same number of items. "Perfect" hashing is where there are N items, N = M buckets, and each bucket contains exactly one item. In general, perfection is only possible when the items are known before the hash mapping is chosen.
A hash function is (for this discussion) a mapping of input keys to non-negative integers. This is an important step of a hash mapping. Given a hash function, the hash mapping can be completed by performing any mapping of non-negative integers onto the range 0..M-1. This is often done by dividing the result of the hash function by M and taking the remainder.
Assuming that we have to pick a hash function ahead of time, desirable qualities include speed (minimize the time spent turning a key into its hash), simplicity (don't waste space in the memory of the computer, be easy to understand and debug for the benefit of the developer), quality (evenly distribute typical inputs into available buckets), and flexibility (work with any kind of input and any number of buckets).
Developing good hash functions is hard. In fact, if you have to pick a hash function before the items are known, it is provably impossible to avoid the worst-case scenario where all items end up in the same bucket and your search time becomes O(N). However, assuming that your inputs are not generated by a malicious attacker with knowledge of your hash function, it is possible to come up with some good hash functions that are fast, simple, flexible, and of high quality. It just isn't easy, especially because literature and common practice are full of examples of poorly designed hash functions.
Bad hash functions have persisted partly because it really doesn't matter much for many tasks. A fairly bad hash function might make your algorithm perform somewhat more slowly, but as long as your hash table correctly handles the collision cases, you might not even notice. In many cases where N is small or bounded, even the worst case of O(N) might not be a problem.
In addition, while good hash functions are hard and require careful analysis, in most cases, it is relatively easy to randomly pick some unsigned integer operations (set unsigned int x = something, then for each byte b of input set x = x (OP) b) and come up with a hash function that looks "pretty random". The problems don't show up right away, so the code gets checked in. In the rare case that the hash table ends up being a performance bottleneck, a liberal application of prime-number sauce makes the worst issues go away.
The Magic: Prime Numbers
A hash function designed "at random" without careful analysis will almost certainly have a very uneven distribution for the likely set of keys. Often there will be periodic patterns in the mapping of input to output. Inputs following a pattern will result in hash codes that follow some other pattern. The ideal hash function would evenly distribute all keys among all buckets (no matter how the keys might be related), so any common pattern leading to an uneven distribution is a flaw. Sometimes the flaw will go unnoticed, but sometimes the flaw will cause uneven bucket distribution leading to possible performance issues.
For example (strawman, I know, but not entirely far-fetched), assume I invent a hash function with the flaw that the result of the hash function is always a multiple of 48 if all inputs are ASCII digits (48..57). This might be ok for some item sets, but it suddenly becomes a problem when the items are "all zip codes in the US" with 96,000 buckets. Instead of O(1), the lookup becomes O(48) with 48 items in every 48th bucket and 0 items in all others.
So let's say I get a bug assigned to me to fix the performance issue. I determine that the problem could not possibly be with my hash algorithm -- I designed it at random, therefore the output must be perfectly random! (Not true, by the way.)
I play around with some data sets, and notice that the problem is really bad when the number of buckets is a nice round number, but seems to go away when I use a more "random" number of buckets. This is because given an input that is divisible by 48, the remainder when dividing by a nice round number is likely to preserve my hash function's distribution flaw:
M (the number of buckets) = 96000 = 48 x 2000
m (the output of my hash function) = 59259216
m mod M = (48 x 1234567) mod (48 x 2000) = 48 x (1234567 mod 2000) = 48 x 567 = 27216. (The output of my hash mapping - still divisible by 48!)
On the other hand, dividing by a number that doesn't have any prime factors in common with the period of my distribution flaw (48 = 2x2x2x2x3) will do a nice job of hiding the flaw. Prime numbers don't have any prime factors other than themselves, so they magically hide a number of periodic hash function flaws.
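To see the effect on a smaller scale, here is a minimal sketch. The helper name distinct_buckets is made up for illustration; it simply feeds hash values that are all multiples of some period through the modulo step and counts how many distinct buckets end up being used.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only.
// Simulates a flawed hash whose outputs are always multiples of `period`
// (hash = period * k for k = 0 .. n-1) and counts how many distinct
// buckets are hit after reducing each hash modulo `bucket_count`.
std::size_t distinct_buckets(std::uint32_t period,
                             std::uint32_t bucket_count,
                             std::uint32_t n)
{
    std::vector<bool> used(bucket_count, false);
    std::size_t count = 0;
    for (std::uint32_t k = 0; k < n; ++k)
    {
        // 64-bit multiply so that period * k cannot overflow.
        const std::uint64_t hash = static_cast<std::uint64_t>(period) * k;
        const std::size_t bucket = static_cast<std::size_t>(hash % bucket_count);
        if (!used[bucket])
        {
            used[bucket] = true;
            ++count;
        }
    }
    return count;
}
```

With a period of 48 and 1000 keys, 96 buckets (which shares factors with 48) leaves everything in buckets 0 and 48, while 97 buckets (a prime) ends up using all 97.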
Not really understanding the logic, I do a few searches through the available literature (i.e. Google) and see references to prime numbers in relation to hash functions. I try a few prime numbers for the bucket sizes, and it looks like all of the distribution issues magically go away. So I resolve the bug as "somebody else's fault", indicating that the client needs to use a prime number for the number of buckets.
This is inconvenient for the client, but it solves the problem, so it gets written into the documentation for my hash table. The prime-number-of-buckets idiom continues to propagate. (Oh my!)
The Right Way: Steal
While prime numbers smooth over a number of flaws, they really don't solve the problem. The problem is that poorly developed hash functions aren't well distributed. Using a prime number of buckets eliminates one class of issues (which seems to be the most common class), making the performance acceptable in more cases, but other issues remain. In addition, requiring a prime number of buckets causes some inconvenience -- the client now needs a table of prime numbers, hash computations require a division operation, hash table resizing becomes more complex, etc.
It turns out that with some careful design, it is possible to create hash functions that do not contain these periodic flaws. These high-quality hash functions are very good at producing even distributions over nearly any number of buckets given nearly any input. The result is that there need be no hash-function-induced constraint on the number of buckets.
Since careful design of hash functions is not something most developers (myself included) know how to do, the best advice I have is this: steal. The best code is the code you don't have to write yourself. There are many good hash algorithms out there that have undergone extensive analysis and have excellent distributions for nearly all kinds of input. Some are even extremely simple. All of them will almost certainly knock the socks off of anything you can develop on your own in terms of even distributions. Most of them will run more quickly than your home-grown solution. And they're free.
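Here is a very simple hash function (Jenkins One-At-A-Time) that is better in almost every way than anything I saw in college (assume ub4 is a typedef for a 4-byte unsigned integer):
ub4 one_at_a_time(const char *key, ub4 len, ub4 hash = 0)
{
    ub4 i;
    // Mix each input byte into the running hash.
    for (i=0; i<len; ++i)
    {
        // Cast so bytes >= 0x80 are not sign-extended where char is signed.
        hash += (unsigned char)key[i];
        hash += (hash << 10);
        hash ^= (hash >> 6);
    }
    // Final mixing steps spread the last bytes across the whole result.
    hash += (hash << 3);
    hash ^= (hash >> 11);
    hash += (hash << 15);
    return hash;
}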
You can easily use this for any size of hash table by dividing and taking the remainder. Even better, use it with power-of-two hash tables by masking off the top bits of the hash. You can hash multiple parts of a structure by calling the function repeatedly and chaining the result each time.
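As a rough sketch of those last two suggestions, the snippet below repeats the function so that it stands alone, then adds two helpers that are not part of the original code: bucket_index_pow2 (masking for power-of-two tables) and hash_point (chaining over a made-up two-field Point structure). Treat the names and the structure as assumptions for illustration.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint32_t ub4;

// Jenkins One-At-A-Time, as shown above.
ub4 one_at_a_time(const char *key, ub4 len, ub4 hash = 0)
{
    for (ub4 i = 0; i < len; ++i)
    {
        hash += static_cast<unsigned char>(key[i]);
        hash += (hash << 10);
        hash ^= (hash >> 6);
    }
    hash += (hash << 3);
    hash ^= (hash >> 11);
    hash += (hash << 15);
    return hash;
}

// Hypothetical helper: maps a hash to a bucket when bucket_count is a
// power of two by keeping the low bits (masking off the top bits).
ub4 bucket_index_pow2(ub4 hash, ub4 bucket_count)
{
    assert(bucket_count != 0 && (bucket_count & (bucket_count - 1)) == 0);
    return hash & (bucket_count - 1);
}

// Hypothetical structure used only to demonstrate chaining.
struct Point
{
    std::int32_t x;
    std::int32_t y;
};

// Hash each field in turn, feeding the previous result in as the seed.
ub4 hash_point(const Point &p)
{
    ub4 h = one_at_a_time(reinterpret_cast<const char *>(&p.x),
                          static_cast<ub4>(sizeof p.x));
    return one_at_a_time(reinterpret_cast<const char *>(&p.y),
                         static_cast<ub4>(sizeof p.y), h);
}
```

Hashing the fields one at a time (rather than the whole structure in one call) avoids accidentally hashing any padding bytes between them.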
|
-
When Platform Builder interacts with a CE OS build tree ("winceroot", e.g. C:\WINCE600), it needs to know a little bit about the contents of that build tree. For example, it needs to know what SYSGEN variables are available and what they mean. It needs to know what BSPs are available and what options they provide. It needs to know what CPUs have been installed. We refer to this information as the winceroot metadata.
This data is either hardcoded into PB, or encoded into the "catalog". The catalog was historically used when data changed so often that code changes for the data got to be annoying. In any case, the data was still essentially hardcoded into PB since the catalog lived in PB's Program Files directory.
As PB has matured, we've tried to move in the direction of decoupling PB from the winceroot. One consequence of this is that we try to move as much metadata out of PB and into the winceroot itself. The first step was to move the catalog from PB's Program Files directory into the winceroot. Then we moved some of the hardcoded information into various metadata files in the winceroot or figured out some way to dynamically infer that metadata from existing files in the winceroot. Finally, in PB 6.0, we rewrote the "catalog" system using XML instead of "CEC" files. One of the goals was to make it usable with all kinds of new metadata. We consolidated a few of the other metadata files into the catalog as well as including all of the metadata from the old catalog. The next version of PB will have many additional metadata types in the catalog.
One immediate disadvantage of this is that while we want PB to be decoupled from the winceroot, we have immediately created a hard break between PB 5.0's metadata format and PB 6.0's metadata format. This is why PB 6.0 doesn't work with the CE 5.0 build tree as shipped. Since PB n had never really worked well with the CE n-1 build tree before, we didn't consider this to be a particularly significant problem. (PB 6.0 includes a tool called "CecImport.exe" that can be used to extract the bare-bones 5.0-format metadata from a CE 5.0 build tree and write it in the 6.0 format. This is unsupported, but it may work if you really want to use PB 6.0 tools with a CE 5.0 tree. Additional tweaking beyond the automatic conversion may be required for best results.)
Another disadvantage is that PB n will choke on the metadata from PB n+1. This probably could have been avoided had we used a somewhat more relaxed mechanism for parsing the metadata files. We decided that we wanted the schema to be as tight as possible so that catalog authoring errors would show up quickly. Instead of ignoring unrecognized metadata, we raise an error.
The new format brings several advantages. As XML, it is a lot easier to manipulate with general development tools (Visual Studio's XML editor, XmlSpy, or even Notepad) than the old catalog database. The schema (PbcXml600.xsd, which may or may not be renamed for future versions) contains documentation for each element and attribute type (aka "what in the world was Doug thinking when he invented this element?"). Adding new data to the catalog is as easy as dropping a new XML file into the appropriate Catalog directory in the winceroot.
The manipulation of the catalog is managed by a C# assembly named "Microsoft.PlatformBuilder.Catalog.dll". This includes the object model for the catalog (classes corresponding to each XML element), helper functions for loading and querying the data, a system for determining where a catalog element should show up if the catalog "tree" is shown, a "catalog control" displaying the catalog as a tree view, and classes that know how to parse some of the CE 5.0 metadata and turn it into PB 6.0 catalog data. (While we don't publish this object model, we might do so in the future if there is sufficient demand. In the meantime, you are welcome to examine it with ILdasm or Reflector.)
We also included a rudimentary catalog editor with PB. This editor does not allow editing of the full range of metadata supported by the catalog, but it hopefully handles the most common tasks for component and BSP developers. To be honest, I never use it -- I just open the catalog files in Visual Studio's XML editor. Visual Studio finds the PbcXml600.xsd schema and that provides all the editing help I need.
When loading the catalog, the catalog's client generally requests "please give me a catalog for this winceroot". The catalog loader then enumerates all *.PbcXml files from the following locations and all subdirectories:
- PB's Program Files "catalog" folder. This folder contains metadata about the current PB installation that is specific to PB and not winceroot-dependent.
- winceroot\PUBLIC\*\Catalog
- winceroot\PLATFORM\*\Catalog
I think there are actually one or two more places that the system checks for catalog files, but the exact list escapes me at the moment, and the locations listed above are the ones most commonly containing catalog files.
All subdirectories of the listed location are checked. If there is a "Catalog" directory, then all *.PbcXml files from that directory and all subdirectories are loaded. For example, if there is a C:\WINCE600\public\ie\catalog folder with subdirectories 1033 and 1041, then the loader will load catalog\*.pbcxml, catalog\1033\*.pbcxml, and catalog\1041\*.pbcxml.
The CE 6.0 catalog includes metadata for the following (as far as I remember -- I don't have access to the schema right now):
- WinceRoot. Basic metadata about the current winceroot, such as the version of CE it contains.
- Core OSes. There is one CoreOS element for each Core OS (_TGTPLAT) supported by the winceroot.
- BSPs. There is one Bsp element for each BSP installed in the winceroot. The PB catalog editor contains support for editing BSPs.
- CPUs. There is one Cpu element for each CPU supported by PB. There is one CpuInstalled element for each CPU for which support has been installed.
- OS Design Templates. There is one element for each OS Design Template. The OS Design Template is what drives the "New OS Design Wizard". It contains information about what features should be enabled by default for the new OS Design and what features should be shown in the series of option screens in the wizard.
- Catalog Items. There is one Item element for each independent "feature" or "driver" in the catalog. The PB catalog editor contains support for editing catalog items.
Let me know if you have specific questions about the catalog or if there are areas you would like me to blog about.
|
-
Hey, PB expert guy, I have a question.
Shoot.
Can I run PB 5.0 side-by-side with PB 6.0?
Short answer or long answer?
Um, short answer, please.
Kind-of.
What kind of answer is that? This is a yes-or-no question!
Ok. Then the answer is no.
That’s not what I wanted to hear.
Oh, sorry. The customer is always right, I forgot. Yes, yes, of course you can.
Are you lying to me?
Technically, no.
I want the truth.
You can’t handle the truth.
Ha ha. Very funny.
Well, you can run them side by side, but you may or may not be happy with the results.
Hmm. You’re weaseling out of the question, aren’t you?
Yup. Weaseling is what separates us from the animals. (Except the weasel.)
Ok, then the long answer please?
Sure. But first, recognize that the following information is offered without any warranty or guarantee. You follow these instructions at your own risk. Neither Microsoft nor dcook accepts any responsibility for the use or misuse of these instructions. Use at your own risk. Back up your data first. I’m sharing this information because it has helped others, but it may or may not help you.
Yes, I accept. (click.)
Platform Builder 5.0 and 6.0 tools do not always work perfectly when both are installed on the same computer. (Note that this refers to the PB IDE and tools, not to the OS build tree. The build tree is fairly self-contained.) Most things work, but there are a few challenges. There are also a few steps you can take to minimize the challenges.
The first thing to know is that you must install PB 5.0 first. If you’ve already installed PB 6.0, I’m sorry, but you'll have to uninstall it. PB 5.0 broke some installation rules and will clobber some PB 6.0 settings. And there are a few things that PB 6.0 had to do that aren’t pretty, so a repair of PB 6.0 might not work either. If you’ve already installed PB 6.0 and want to install PB 5.0, your best bet is to uninstall PB 6.0 first. Here is the recommended order of installation:
- Install PB 5.0.
- Make a backup copy of all files in C:\Program Files\Common Files\Microsoft Shared\Windows CE Tools\Platman\bin. (Only the files in the bin folder need to be backed up. The subfolders do not need to be saved.)
- Install Visual Studio 2005 and all associated service packs (as of this writing, SP1, and if you’re running Vista, the SP1-Vista-GDR).
- Install PB 6.0 and all associated service packs (as of this writing, SP1).
- Compare the current set of files in the Platman\bin folder with the set of files you backed up. Restore any missing files. (Do NOT overwrite any existing files.)
- If you have any Connection Manager device settings that were created with PB 5.0, you’ll need to delete them and recreate them.
At this point, PB 5.0 and PB 6.0 should both be mostly working – or at least as well as can be expected.
What’s that supposed to mean?
Well, the above steps worked for me. But there are all kinds of things that could go wrong, and I may not have tested some aspect of PB that is important to you, so this may not work perfectly for everybody.
The largest trouble spots are with the x86 Emulator, the Remote Tools (especially PerfMon and CeCap), and with CoreCon (download and KITL transports).
Why is this such a problem in the first place? Why don’t you just fix this mess?
I would love nothing better than to fix this mess. Unfortunately, as long as we support cemgr.exe, we’re going to have some side-by-side troubles with the Platman\bin folder. Back in PB 4.0, when the best practices for MSI development were not nearly as well-known as they are today (and they still aren't very well-known), a few mistakes were made in the development of the PB 4.0 Platman install. The result is that if a file was part of the PB 4.0 Platman install in the Platman\bin folder, and we want that file to still exist after PB 6.0 is installed, we have to include that file in PB 6.0. Due to various security and legal issues, there are a few files that were part of PB 4.0 that we can no longer ship in PB 6.0. As a result, installing PB 6.0 causes these files to mysteriously disappear.
Any other issues?
Yeah. If you uninstall either PB 5.0 or PB 6.0, you’ll probably have to repair the remaining version of PB before it runs correctly (Add/Remove Programs, select PB, then select Repair). Certain registry settings for COM interfaces can't really be properly shared between versions of PB.
Also, the CoreCon datastore is a bit more fragile than I would have liked. Sometimes things go wrong with the datastore resulting in problems when you launch Connection Settings from PB or when you try to create a Visual Studio for Devices project. (You may also run into problems while installing PB 6.0 if your datastore is corrupt.) If you have trouble with the CoreCon datastore, try the following:
- 1. You can safely delete your local (per-user) CoreCon datastore settings. (See http://msdn2.microsoft.com/en-us/library/ms184403(VS.80).aspx for details.) This sometimes fixes issues with launching Connection Settings or with creating VSD projects. Go to your user’s application data directory, navigate to the Microsoft folder, and delete the CoreCon directory. For example, 'rd /s "C:\Documents and Settings\YourUserName\Local Settings\Application Data\Microsoft\CoreCon" '. (Warning: DO NOT delete the CoreCon settings under All Users!) The next time you activate CoreCon, it will rebuild your per-user datastore. You will lose any devices that you may have configured, but it shouldn't be too hard to recreate them.
- 2. If the above does not help, you may have a problem with your global (all users) CoreCon datastore settings. This is a hard problem to solve. These files are in the All Users application data area (i.e. C:\Documents and Settings\All Users\Application Data\Microsoft\CoreCon\1.0, or C:\ProgramData\Microsoft\CoreCon\1.0 on Vista). You can’t just delete these files and hope they come back (because they won’t). This might work:
- Make a backup copy of the data in the CoreCon\1.0 folder.
- No, seriously. Make a backup copy. If you lose the data in this folder, it may be quite difficult to get it back.
- Rename the 1.0 folder to something else, like “1.0_original”.
- If a friend or co-worker has a working installation with a similar set of programs (i.e. they successfully installed PB 5.0, VS 2005, and PB 6.0), try copying the data from the other computer to your computer. Ensure that you’ve deleted the per-user CoreCon data (step 1 above), then launch PB 5.0 or VS 2005 and try to open Connection Settings.
- If this doesn’t work, rename this folder out of the way (i.e. rename it to 1.0_from_other_computer), then try a repair install of PB 5.0, then VS 2005, then PB 6.0.
Yikes!
Yeah. Tell me about it.
Some people have had better luck just installing PB 5.0 and 6.0 on separate machines, dual-booting, or even installing PB 5.0 into a Virtual PC image.
There are also ways to make PB 6.0 work with OS trees from CE 5.0, but that is a more complicated subject (and while it generally works, it is definitely unsupported by Microsoft).
|
-
I've learned some interesting facts about MSI-based setups in the past few days. Working together, they can cause some really tricky and nasty issues if you are unaware of them. (In other words, this is why I'm still at work at 2:00 in the morning.)
This is more proof that MSI setup is not something you can pick up overnight -- you really need to be aware of a lot of "gotchas".
- If a feature's install level is 0 during InstallExecuteSequence, it cannot be installed or uninstalled. Even if the rest of the product is uninstalled, if a feature's level is 0 during the uninstall, the feature will be left behind.
- AppSearch is never run more than once in an install. If AppSearch runs during the UI sequence, it will be skipped in the Execute sequence.
- In a non-elevated setup (a setup launched by a non-admin user but permitted by group policy or approved via UAC), if the setup is run in maintenance mode, properties from the UI sequence are ignored by the Execute sequence unless they are "Secure" properties. Properties that would normally propagate from the UI sequence to the Execute sequence are simply left at their default values (usually NULL).
Fact #1 is scary all by itself. If the feature's level is 0, it will not be touched by the installer. This may or may not be what you want.
Setting the level to 0 is a commonly used method to hide a feature under certain conditions. The logic is: IF (condition) THEN set Level=0. However, you must be absolutely certain that the condition is FALSE if the product is being uninstalled. If the condition evaluates to TRUE, and the feature has been installed somehow, the feature will be left behind (orphaned) when the rest of the product is uninstalled. There will be no way to uninstall the feature. The feature's components are forever stuck on the user's computer. You've just screwed up your customer's machine. Bad programmer, no donut.
As a quick example, consider the following condition (if true, the feature's level will be set to 0):
NOT DOTNET20INSTALLED
The intent is probably to hide the feature if .NET 2.0 is not installed (.NET 2.0 is a prerequisite for the feature). However, if the user uninstalls .NET 2.0 before uninstalling the product, the condition becomes true during uninstall, the feature's level is set to 0, and the feature is orphaned. Bang, you're dead.
Now let's assume that your users are smart enough to not uninstall .NET 2.0 before uninstalling your product. Being the devious users that they are, they'll probably find some other way to break your install. This is where facts #2 and #3 come in.
Assume your user is using Vista with UAC enabled. The user installs your product as usual. Later on, the user wants to uninstall. He goes to "Programs and Features" (formerly Add/Remove Programs), right-clicks on your product, and selects "Change", then from the installer's UI selects "Remove". Here's what happens:
- The install started without Administrator rights, so the UI sequence runs as the normal user.
- The DOTNET20INSTALLED property is initialized via AppSearch during the UI sequence.
- At the transition to the Execute sequence, the user receives a UAC prompt and approves the install. This enables the Execute sequence to run as a privileged user.
- The Execute sequence notes that the UI sequence ran as a limited user, and therefore disallows any non-Secure properties. The DOTNET20INSTALLED property is disallowed, so it remains at the default value of NULL in the Execute sequence.
- AppSearch already ran in the UI sequence, so it doesn't run a second time in the Execute sequence.
- The condition "NOT DOTNET20INSTALLED" evaluates to true, so your feature's level is set to 0. Bang, you're dead. No donut.
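The chain of events above can be captured in a toy model. This is plain C++ that mimics the logic described in the steps; it does not call any Windows Installer API, and every name in it (Properties, properties_seen_by_execute, feature_level, simulate_uninstall) is made up for the example.

```cpp
#include <map>
#include <set>
#include <string>

typedef std::map<std::string, std::string> Properties;

// Toy model of the UI -> Execute hand-off: if the UI sequence ran as a
// limited user, only properties marked Secure are carried over.
Properties properties_seen_by_execute(const Properties &ui_properties,
                                      const std::set<std::string> &secure_names,
                                      bool ui_ran_as_limited_user)
{
    Properties result;
    for (Properties::const_iterator it = ui_properties.begin();
         it != ui_properties.end(); ++it)
    {
        if (!ui_ran_as_limited_user || secure_names.count(it->first) != 0)
            result.insert(*it);
    }
    return result;
}

// The feature condition from the text: when "NOT DOTNET20INSTALLED" is
// true, the feature's level is forced to 0.
int feature_level(const Properties &execute_properties, int authored_level)
{
    const bool dotnet_installed =
        execute_properties.count("DOTNET20INSTALLED") != 0;
    return dotnet_installed ? authored_level : 0;
}

// Walks through the uninstall scenario: AppSearch sets the property in
// the UI sequence and does not run again in the Execute sequence.
int simulate_uninstall(bool ui_ran_as_limited_user, bool property_marked_secure)
{
    Properties ui_properties;
    ui_properties["DOTNET20INSTALLED"] = "1"; // set by AppSearch in the UI sequence

    std::set<std::string> secure_names;
    if (property_marked_secure)
        secure_names.insert("DOTNET20INSTALLED");

    const Properties execute_properties = properties_seen_by_execute(
        ui_properties, secure_names, ui_ran_as_limited_user);
    return feature_level(execute_properties, 1);
}
```

In this model, simulate_uninstall(true, false) returns 0 (the orphaned-feature case), while running the UI sequence as an administrator or marking the property as Secure keeps the level at 1.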
Solutions to these issues will be considered in my next blog post. Stay tuned!
|
-
In Visual C++ 7.1 and earlier, "catch(...)" would catch all exceptions, both C++ and SEH. The behavior has changed with Visual C++ 8.0. This has caused some confusion.
The details get a bit tricky, but the generally accepted wisdom among the C++ gurus that have advised me is:
- 1. try/catch is for C++ exceptions.
  - Corollary: Don't catch C++ exceptions with __except.
- 2. __try/__except is for Windows Structured Exception Handling (SEH) exceptions.
  - Corollary: Don't catch SEH exceptions with catch.
  - Corollary: Don't use _set_se_translator. Use __except, then throw your C++ exception from the __except block.
  - Corollary: Don't use /EHa. The only reason to use /EHa is to catch SEH exceptions with catch or via _set_se_translator, which you shouldn't be doing in the first place.
- 3. Don't use both C++ exception handling and SEH exception handling in the same function.
  - Creating an instance of an object that has a destructor implicitly generates a try/catch block, so don't create destructible objects in a function that uses __try/__except or might directly raise an SEH exception.
- 4. Don't catch anything you don't know how to handle.
  - You don't know how to handle Access Violations.
  - Corollary: Don't catch Access Violations. Ever.
- 5. It was always kind of weird (many would say horrible and bad) that the C++ catch would sometimes catch SEH exceptions. This was never dependable. Don't rely on this behavior.
- 6. It is a compiler implementation detail that __except can catch C++ exceptions. Don't rely on this behavior.
- 7. Visual C++ 8.0 changed the semantics of "catch(...)".
  - catch(...) no longer catches all SEH exceptions. This is a good thing (see rules 4 and 5).
  - If you were following the above rules, you didn't see any changes in your program's behavior.
|
-
The consumer versions of 32-bit Windows XP and Vista have a stated limit of 4 GB RAM, but a practical limit of about 3.1 GB. A lot of partial explanations have been floating around, so I thought I would try my hand at clearing up the issue. (Wish me luck!)
The design of the Intel 386 architecture supported access to up to 4 GB of physical memory (32-bit physical addresses) and unlimited virtual memory (4 GB at a time via 32-bit virtual addresses). 4 GB of physical memory seemed quite unthinkable at the time the chip was released, so the actual CPU did not have enough address pins to actually do this. Back then a 32-bit address space seemed extravagant for anything less than a supercomputer or mainframe. Nowadays, you can get 4 GB for under $400, and what was unthinkable in 1986 is within reach of anybody thinking about a new computer.
So at least I can access 4 GB, right? Nope.
The original IBM PC’s processor could access 1024 KB of physical address space, but you could only use 640 KB for RAM. The remaining 384 KB of address space was reserved for memory-mapped hardware and ROM. A similar situation exists with current systems: hardware reserves large chunks of the upper 1 GB of physical address space. Because of these reserved areas, a system with a 32-bit physical address space will be limited to somewhere around 3.1-3.5 GB of RAM.
To overcome the 32-bit limitation, recent x86 CPUs (Pentium Pro and later) have 36 address pins and can address 64 GB of RAM. The original design of the x86 32-bit protected mode only provided access to 32-bit addresses, so PAE (Physical Address Extension) mode was created to allow access to 36-bit addresses.
PAE mode changes the layout of the page tables. Page tables map virtual addresses to physical addresses. Without PAE, the 32-bit virtual addresses map through 2 levels of page tables (1 level for huge pages) and are translated to 32-bit physical addresses. With PAE, the 32-bit virtual addresses map through 3 levels of page tables (2 levels for huge pages), and the page table entries grow to 64 bits so that they can hold physical addresses wider than 32 bits (36 bits on the original PAE processors).
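For the curious, here is a small sketch of how a 32-bit virtual address is carved up in each mode for ordinary 4 KB pages: 10 + 10 + 12 bits without PAE, and 2 + 9 + 9 + 12 bits with PAE. The struct and function names are invented for the example.

```cpp
#include <cstdint>

// Index fields for a 4 KB page without PAE: 10 + 10 + 12 bits.
struct LegacyIndices
{
    std::uint32_t directory; // index into the page directory (1024 entries)
    std::uint32_t table;     // index into the page table (1024 entries)
    std::uint32_t offset;    // byte offset within the 4 KB page
};

// Index fields for a 4 KB page with PAE: 2 + 9 + 9 + 12 bits.
struct PaeIndices
{
    std::uint32_t pdpt;      // index into the page directory pointer table (4 entries)
    std::uint32_t directory; // index into the page directory (512 entries)
    std::uint32_t table;     // index into the page table (512 entries)
    std::uint32_t offset;    // byte offset within the 4 KB page
};

LegacyIndices split_legacy(std::uint32_t va)
{
    return { va >> 22, (va >> 12) & 0x3FFu, va & 0xFFFu };
}

PaeIndices split_pae(std::uint32_t va)
{
    return { va >> 30, (va >> 21) & 0x1FFu, (va >> 12) & 0x1FFu, va & 0xFFFu };
}

// Bytes addressable with the given number of physical address bits.
std::uint64_t addressable_bytes(unsigned address_bits)
{
    return std::uint64_t(1) << address_bits;
}
```

The PAE tables hold half as many entries each because every entry is twice as wide, which is why a third level is needed to cover the same 4 GB of virtual address space.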
PAE doesn’t do anything to the virtual memory limit. Pointers are still 32 bits, so a process can only access 4 GB of address space at a time. However, using PAE, two or more processes could each access a different 4 GB of physical memory. With proper operating system support (i.e. AWE on Windows operating systems) PAE also allows a process to allocate additional memory outside its normal address space, then swap portions of that additional memory into its address space as needed.
So PAE is the answer, right? Well, maybe…
One thing that can prevent access to more than 4 GB of RAM is motherboard design. PAE can only access 64 GB of memory if all 36 address pins are properly wired up on the motherboard. This is not always the case, since those extra 4 wires make the motherboard just a little bit more expensive to design and manufacture (and use just a little bit more power). Many motherboards (especially on laptops) only have 32 address pins connected. If that is the case, no OS will be able to access more than 4 GB of address space.
Another hardware limitation is the ability of the chipset to remap RAM. If you have 4 GB of RAM, and 600 MB of address space is used up by PCI/AGP reserved areas, the only way to access the top 600 MB of RAM is to remap it into the addresses above the 4 GB boundary. Not all chipsets are able to do this, so some systems will just waste any RAM that happens to be shadowed by a PCI/AGP reserved region.
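The arithmetic is simple enough to write down. The following is a deliberately rough model (real memory maps are messier, and the function name usable_ram is made up): if the shadowed RAM cannot be remapped above 4 GB and reached there, whatever sits under the reserved region is lost.

```cpp
#include <algorithm>
#include <cstdint>

const std::uint64_t kMB = 1024ull * 1024ull;
const std::uint64_t kGB = 1024ull * kMB;

// Rough model of how much installed RAM is usable when `reserved` bytes
// of the 4 GB physical address space are claimed by devices.
// `remapped_and_reachable` means the chipset can remap the shadowed RAM
// above 4 GB and the rest of the system can actually address it there.
std::uint64_t usable_ram(std::uint64_t installed,
                         std::uint64_t reserved,
                         bool remapped_and_reachable)
{
    if (remapped_and_reachable)
        return installed;
    const std::uint64_t below_4gb = 4 * kGB - reserved;
    return std::min(installed, below_4gb);
}
```

For the example in the text, usable_ram(4 * kGB, 600 * kMB, false) comes out to 3496 MB, or roughly 3.4 GB.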
My BIOS reports 4 (or more) GB of RAM, I’ve enabled PAE, and I still only see 3.1 GB. What gives?
Unless you’re running one of the advanced server varieties of Windows, you won’t see more than 4 GB of physical memory. This is a limitation of Windows designed (I assume) to encourage people building expensive servers to pay more for Windows than those who are using it for normal day-to-day activities.
As for that last 0.9 GB, it all comes down to drivers and system stability. Not all drivers behave well in the presence of 64 bit physical addresses. Many driver authors assume that only the bottom 32 bits of the physical address are valid. Others don't properly handle the creation of bounce buffers when necessary (they’re needed when transferring data from a hardware device to/from a buffer that is above the 4 GB mark in physical memory).
Windows XP originally supported a full 4 GB of RAM. You would be limited to 3.1-3.5 GB without PAE, but if you enabled PAE on a 4 GB system with proper chipset and motherboard support, you would have access to the full 4 GB. As more people began to take advantage of this feature using commodity (read: cheapest product with the features I want) hardware, Microsoft noticed a new source of crashes and blue screens. These were traced to drivers failing to correctly handle 64-bit physical addresses. A decision was made to improve system stability at a cost of possibly wasting memory. XP SP2 introduced a change such that only the bottom 32 bits of physical memory will ever be used, even if that means some memory will not be used. (This is also the case with 32-bit editions of Vista.) While this is annoying to those who want that little bit of extra oomph, and while I would have liked a way to re-enable the memory “at my own risk”, this is probably the right decision for 99.9% of the general population of Windows users (and probably saves Dell millions in support costs). See the relevant KB article and a TechNet article for details.
Some of the server Operating Systems still allow the use of larger amounts of memory. I’m guessing that this is done with the assumption that higher quality parts will be used and drivers will be more likely to have been tested in PAE mode with large addresses.
Side-note: PAE is also related to page execution protection, called "hardware DEP" (Microsoft term), "NX" (AMD term), and "XD" (Intel term). In 32-bit x86 processors, this can only be used in PAE mode. This is why you might see PAE mode used even on systems with less than 4 GB of memory.
Performance note: 3-level page table lookups are inherently slower than 2-level page table lookups. However, the processor has substantial dedicated circuitry that usually eliminates most of the performance impact.
|
-
In my previous post, I described two issues encountered after updating our build system to Nmake 8.0 (the version from Visual Studio 2005) from earlier versions. Both issues turned out to have essentially the same root cause.
Nmake's job is to execute a sequence of commands to bring targets up-to-date. Some of those commands do real work, such as "cl -c MyProgram.cpp" to run the compiler. Other commands just set things up, such as "set TARGETDIR=c:\obj". Some commands just output status information, such as "echo Starting compile".
In order to improve build speed (as well as to supply certain semantics expected by the developer), Nmake emulates some commands instead of actually executing them. For example, if Nmake were to run "cmd /c set TARGETDIR=c:\obj", nothing would happen (the environment of the new cmd.exe process would be modified, but then the process would exit and the change to the environment would be lost).
It seems that between Nmake 7.1 and Nmake 8.0, the mechanism for executing some of cmd.exe's internal commands was changed. Previously, "echo", "type" and other built-in commands were always executed with predictable semantics. (I'm guessing this is because they were emulated by Nmake instead of being passed to cmd.exe.) In Nmake 8.0, the "echo" and "type" commands are apparently executed exactly as if you typed them into the cmd.exe command line.
On the surface, this isn't such a big deal. Most of the time, this change is completely transparent. Once in a while...
In the first case described in the previous blog post, it turned out that "echo.exe" was a program on the user's path. This is a tool written to make the "echo" command in cmd.exe behave more like the "echo" command in Unix. Nmake was simply calling the user's echo.exe instead of using cmd.exe's built-in echo command.
In the second case, "type.exe" was a program on the user's path.
|
-
I've run into this issue twice now (in different forms) after upgrading build systems from old versions of Nmake to Nmake 8.0 (the version from Visual Studio 2005), so I think that means it's time to blog about it.
Scenario 1:
Your stuff builds ok, but the output is totally wrong. Characters are missing from the output, and during the build, your computer keeps beeping at you.
Compiling .._koho.c ..\gaimem.c ..\k_area.c ..\k_clearn.c ..\k_grbg.c ..\k_hnkn.c ..\k_moji.c ..\k_regist.c . evconv.c ..♂lbmgr.c ..nvdbg.c ..\k_frame.c ..\k_kenti.c ..\k_disp.c ..\k_atwid.c ..\k_crsr.c ..\k_kkti.c .. ngmgr.c ..\k_koho.c ..\k_bun mk.c ..comment.c ..utotune.c ..
Investigating, you find that all of the missing characters are C-style escape sequences - "\a", "\r", "\n", etc., and the escape sequences have been replaced by the corresponding control code - beep, CR, LF, etc. So why did nmake suddenly decide to start processing escape codes?
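To make the symptom concrete, here is a small Python sketch of what happens when something interprets C-style escapes in a Windows path. It handles only a handful of escapes, and the second and third file names are hypothetical, chosen because they start with an escape letter. It illustrates the effect only and is not a reconstruction of any particular tool.

```python
import re

# Control characters for a few C-style escapes (a subset chosen for illustration).
ESCAPES = {"a": "\a", "n": "\n", "r": "\r", "t": "\t", "v": "\v"}

def interpret_escapes(text):
    """Replace backslash + known letter with its control character; leave the rest."""
    return re.sub(
        r"\\([anrtv])",
        lambda match: ESCAPES[match.group(1)],
        text,
    )

if __name__ == "__main__":
    # The first name is from the build output above; the others are hypothetical.
    for path in [r"..\k_area.c", r"..\audio.c", r"..\net.c"]:
        print(repr(path), "->", repr(interpret_escapes(path)))
```

A path like "..\k_area.c" survives because "\k" is not an escape in this table, while a name that happens to start with "a" or "n" loses its backslash and first letter to a beep or a line feed.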
Scenario 2:
The following is something of an idiom for makefiles. It shows the command line that is about to be executed, then executes it. (This is a bit of a simplification. As presented, this serves no purpose, but the pattern becomes useful when inline response files are involved.)
.cpp.obj:
    @type <<
$(CL_COMMAND_LINE)
<<NOKEEP
    $(CL_COMMAND_LINE)
After upgrading to Nmake 8.0, this starts causing an error. Type complains that it can't find the temporary inline file. Replacing "type" with "cmd /c type" fixes the problem.
Both issues have essentially the same root cause. Tune in next time for the solution.
|
-
Around Microsoft, people talk about making things performant. They mean "fast" or "performs well".
It is a real word, but according to the OED, it means a person who performs something.
In any case, I don't think it really matters whether it is a "real word". I know what they mean, they know what I mean, and isn't that the point?
|
-
I work in the Windows Embedded CE group on the "PB IDE tools" team. That might take a bit of explaining. (Sometimes, I'm not entirely clear myself.)
CE is the "miniature" version of Windows. (The big version, referred to as "NT" on this blog, is familiar to most people -- nearly everybody knows what I mean when I talk about Windows 2000, XP, or Vista.) When talking to family members and friends, I get the most success by explaining it as "the operating system that runs Pocket PCs and SmartPhones". That's accurate (assuming they don't confuse it with PalmOS or their Nokia cell phone), but not a complete description. The CE operating system is actually used in a lot of other things such as industrial automation, GPS navigation systems, and routers. While skiing with my sister's husband, I was trying to explain this and I noticed that the little barcode scanners they were using to scan our lift tickets were running CE, so that made it a bit easier to explain. In any case, CE tends to be used in computerized components that are used as "devices" or "appliances" and not "computers".
I am a member of the "IDE Tools" team. Officially, this means that my team is responsible for the Platform Builder shell and project system used to develop for CE, but unofficially it means "if it runs on the desktop and it isn't owned by any other team, it belongs to the IDE Tools team". As a result, I get to stick my fingers in nearly every aspect of development for the desktop - compilers, debuggers, build systems, setup, C#, C++, scripting, you name it. If it runs on a Windows NT platform (Windows 2000, XP, Vista, etc.), I'm supposed to know all about it. It's my job.
Platform Builder ("PB") is the IDE for developing the CE operating system. This also takes a bit of explaining.
CE is different from NT in a lot of ways. You don't go out, buy a copy of CE, and install it on your phone/PDA/router/refrigerator/toaster. You buy a phone or a PDA with CE already built into it. That copy of CE was customized and tweaked to make it a (hopefully) perfect fit for the phone or PDA onto which it was installed. Typically HP, Samsung, HTC, or some other hardware vendor will design and develop some hardware to meet a need, then use the PB tools to tweak the CE operating system to work well with their hardware. The hardware and the CE operating system are then sold as a single item. So you won't see a copy of "Windows CE" on the shelf at your local software retailer.
PB has a catalog that allows the vendor to choose the parts of the operating system they need. A minimal CE operating system with just a kernel and basic device drivers is about 300K. A fully featured CE OS with a web browser, web server, solitaire, .NET Compact Framework, media player, etc. can be 64 MB or more. Somewhere in the middle is an OS that (hopefully) has all of the features you need for your device. This is important because each additional megabyte of flash or RAM adds to the cost of the hardware, reduces battery life, and increases device size and weight. These costs become especially significant if you plan to ship 1 million CE-based talking toasters. (Are you sure you don't want a nice English muffin right now?)
Most devices running CE are proprietary hardware designs with a unique combination of microchips designed to fill a specific niche in the market. This usually means that the CE operating system needs to be customized somewhat to run on the specific hardware. Often, device manufacturers have to write their own device drivers since they're using proprietary hardware. PB provides many tools to help with this process. PB provides an integrated development environment (IDE) to allow developers to edit and compile their code (driver, application, etc.), copy it to the device ("download"), run their code and control its execution (remote shell), and debug their code (debugger).
Four different teams work on PB - debugger/connectivity, remote tools, build, and shell/project system (that's my team!). The responsibilities of the other teams are fairly self-explanatory and well-defined. The debugger/connectivity team works on the PB kernel debugger, the KITL connectivity layer, and the platform manager/connection manager transport layers. The remote tools team works on tools that run on the desktop to provide data about what is going on within a device (remote registry editor, remote file viewer, kernel tracker, etc.). The build team (currently one person, recently split off from the shell/project system team) is responsible for improving the CE build system. The responsibilities of my team (shell/project system) are a bit more nebulous, and we tend to wind up doing anything not covered by any other team (this drives my manager nuts).
The exact boundaries are sometimes hazy, but since the goal is "get the product working well", not "make my team look good", the boundaries don't have to be rock-solid.
|