Larry Osterman's WebLog

Confessions of an Old Fogey
The Windows command line is just a string...

Yesterday, Richard Gemmell left the following comment on my blog (I've trimmed to the critical part):

I was referring to the way that IE can be tricked into calling the Firefox command line with multiple parameters instead of the single parameter registered with the URL handler.

I saw this comment and was really confused for a second, until I realized the disconnect.  The problem is that *nix and Windows handle command line arguments totally differently.  On *nix, you launch a program using the execve API (or its cousins execv, execvp, execl, execlp, and execle).  The interesting thing about these APIs is that they allow the caller to specify each of the command line arguments individually - the signature for execve is:

int execve(const char *filename, char *const argv [], char *const envp[]);

In *nix, the shell is responsible for turning the string provided by the user into the argv parameter to the program[1].
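As a rough illustration of the *nix side, here is a minimal sketch of a caller building argv itself and handing it to execve. The helper names spawn_argv and demo_one_argument_with_space are made up for this example, and it assumes a POSIX system where /bin/sh exists; /bin/sh is used only as a convenient child that can report how many arguments it received.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper (not from the post): run argv[0] with exactly this
   argument vector and return the child's exit code, or -1 on failure. */
int spawn_argv(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* No shell and no string parsing here: each element of argv
           becomes one argument in the child, byte for byte. */
        execve(argv[0], argv, environ);
        _exit(127); /* reached only if execve itself failed */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Passes "two words" as a single argv element and asks the child whether
   it received exactly one positional parameter ($# in sh).
   Returns the child's exit code: 0 means yes. */
int demo_one_argument_with_space(void)
{
    char *const argv[] = {
        "/bin/sh", "-c", "test \"$#\" -eq 1", "sh", "two words", NULL
    };
    return spawn_argv(argv);
}
```

Calling demo_one_argument_with_space() should return 0: the embedded space needs no quoting, because the caller never flattened the arguments into a string in the first place.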

 

On Windows, the command line doesn't work that way.  Instead, you launch a new program using the CreateProcess API, which takes the command line as a single string (the lpCommandLine parameter to CreateProcess).  It's considered the responsibility of the newly started application to call the GetCommandLine API to retrieve that command line and parse it (possibly using the CommandLineToArgvW helper function).

So when Richard talked about IE "tricking" Firefox by calling it with multiple parameters, he was apparently thinking about the *nix model where an application launches a new application with multiple command line arguments.  But that model isn't the Windows model - instead, in the Windows model, the application is responsible for parsing its own command line arguments, and thus IE can't "trick" anything - it's just asking the shell to pass a string to the application, and it's the application's job to figure out how to handle that string.
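To make the child-side parsing concrete, here is a deliberately simplified sketch of the kind of splitting a Windows program (or its C runtime) does on the string it gets back from GetCommandLine. It is portable C rather than Win32 code, the names split_command_line and has_flag are invented for this example, and it only understands whitespace and double quotes; the real CommandLineToArgvW and C runtime rules also handle backslash escapes, which this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified splitter: breaks cmdline into arguments in
   place. Spaces and tabs separate arguments unless they appear inside
   double quotes; the quotes themselves are dropped. Returns argc. */
int split_command_line(char *cmdline, char *argv[], int max_args)
{
    assert(cmdline != NULL && argv != NULL && max_args >= 0);

    int argc = 0;
    char *r = cmdline; /* read cursor */
    char *w = cmdline; /* write cursor; never gets ahead of r */

    for (;;) {
        while (*r == ' ' || *r == '\t')
            r++;
        if (*r == '\0' || argc == max_args)
            break;

        argv[argc++] = w;
        int in_quotes = 0;
        while (*r != '\0' && (in_quotes || (*r != ' ' && *r != '\t'))) {
            if (*r == '"')
                in_quotes = !in_quotes; /* toggle, do not copy the quote */
            else
                *w++ = *r;
            r++;
        }

        /* Check for the end before terminating: w may alias r here. */
        int at_end = (*r == '\0');
        *w++ = '\0';
        if (at_end)
            break;
        r++; /* step past the separator we just consumed */
    }
    return argc;
}

/* Returns 1 if any parsed argument equals flag, 0 otherwise. */
int has_flag(int argc, char *argv[], const char *flag)
{
    for (int i = 0; i < argc; i++)
        if (strcmp(argv[i], flag) == 0)
            return 1;
    return 0;
}
```

Suppose a handler is registered with the made-up template `app.exe -url "%1"`. If the text substituted for %1 is `x" -chrome "y z`, the child receives the single string `app.exe -url "x" -chrome "y z"`, which this splitter breaks into five arguments, and has_flag reports -chrome as present. The caller only ever passed one string; the extra "parameters" exist solely because of how the child chose to parse it.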

We can discuss the relative merits of that decision, but it was a decision made over 25 years ago (in MS-DOS 2.0).

 

[1] Yes, I know that the execl() API lets you pass the arguments as a variable-length parameter list instead of an array, but the execl() API just gathers those parameters into argv before calling execve.

  • > it was a decision made over 25 years ago (in MS-DOS 2.0).

    I am pretty sure that was actually inherited from CP/M; the (MS-,PC-,DR-,Q-)DOS COM file format used the same memory layout so that the "thousands" of existing CP/M programs could be ported over more easily. It also explains the 127-character limitation for DOS command lines that still exists today; the command tail (not including the command name) started at 0x81 and the program loaded at 0x100.

  • Dave, that's entirely possible.  OTOH, for OS versions before 2.0, launching a new program was actually a function of command.com - there was no OS API for launching a new process.

  • It is of course worth noting that if you link your C program with mainCRTStartup or wmainCRTStartup, the C runtime decodes into argc/argv and calls your main or wmain function respectively.

    It's unusual, but not forbidden, for a Windows application (i.e. an application that registers and uses its own window classes, rather than a console) to do this. The bit governing whether or not a console is created for the application is an independent setting, set in the PE header by the linker (/SUBSYSTEM:CONSOLE vs /SUBSYSTEM:WINDOWS). Visual Studio sets its defaults so console applications use (w)main, and Windows applications use (w)WinMain, but it's not required. I don't know what Firefox does but I'd take a guess that they might be using (w)main for portability.

  • Mike: Absolutely.  I actually had a paragraph in the post describing that but edited it out (because I thought it made the narrative flow awkward).

  • And even if you use WinMain, you can still make use of the C runtime's argument decoding by accessing __argc and __argv.

    In other words, the following are all completely orthogonal to each other:

    * Whether you are /SUBSYSTEM:CONSOLE or /SUBSYSTEM:WINDOWS

    * Whether your entry point is mainCRTStartup (calls main) or WinMainCRTStartup (calls WinMain)

    * Whether you access arguments via __argc/__argv or as a raw string from GetCommandLine

    * Whether your program creates a GUI or calls console APIs (or both)

  • Steve, at the Win32 level, all of that is irrelevant.

    Win32 applications get their command line from the GetCommandLine() API; what they do with it after that is their business.  The entrypoint to the process may do preprocessing (mainCRTStartup or WinMainCRTStartup) or it might not.  But the key takeaway is that in the Win32 model, command line processing is handled by the child process, while in *nix, command line processing is handled by the parent process.

  • This manual provides info on how programs were loaded in early versions of DOS. Be warned that most of the numbers are in decimal, NOT hex:

    http://www.patersontech.com/Dos/Docs/86_dos_prog.pdf

  • Alun Jones expanded on this on his blog back when the fires were still raging:

    http://msmvps.com/blogs/alunj/archive/2007/07/23/firefoxurl-part-ii.aspx

  • You have a bit of an odd phrasing here which threw me for a loop.  ("In *nix, the shell is responsible for turning the string provided by the user into the argv parameter to the program.")

    I'd say the caller is responsible, rather than the shell.  A shell is only involved if you're in a shell, or if your code calls system(), or popen(), or some other hugely dangerous system call, like pwnme().

  • When you use an obsolete command-line, you get obsolete command-line parsing. PowerShell is fast becoming the new command-line for Windows (it is designed to be). With it the arguments are parsed by the shell.

  • Adam: You're right, my bad.

    Mitch: What PowerShell hands to its applets is irrelevant.  I'm describing the Win32 command line handling semantics.  PowerShell doesn't use Win32 command line semantics when interacting with applets, that's fine - I did say that this was an implementation decision.

    If PowerShell launches a Win32 process, it passes the arguments as a single string, because that's the way that Win32 works.  PowerShell can't change it, because it's just a shell.

  • Theorem :

    A subset of a true phrase is not necessarily a true phrase.

    Proof :

    "PowerShell is fast becoming the new command-line".

    -> Probably true.

    "PowerShell is fast"

    -> AHAHAHAHAH! :(

    Really.. I don't know how anyone could use it given its speed... :(

  • The "problem" is that POSIX functions (except for deprecated, highly-insecure functions like system()) take arguments explicitly as arguments. They will never take a series of characters to mean something other than what it means.

    Windows, on the other hand, tries to find meaning in a string. Meanings which may be very unwanted and/or can be horribly insecure. There's a reason why system() is so hated... it has the same problems as the GetCommandLine API and things have. system() gives special meaning to a string.

    In this case, Windows should put all of these functions in the banned functions file (like strcpy) and make new, explicit APIs that treat process paths and arguments as very, very different things. Security should trump backcompat if the old methods are clearly of a very borked design.

  • At the ISO C89 level, the main() function has well-defined arguments, and there is a de facto method for escaping parts of the command line in order to present those arguments to a program using the system-supplied C runtime. The firefoxurl vulnerability came about because there doesn't appear to be any way for a URL handler to take advantage of that encoding -- which, given that it takes its argument from a URL and uses it to form a command-line, is quite inexcusable. Ultimately, as Rosyna says, this is a threat in CreateProcess itself.

  • Rosyna, I'm not sure that I understand the difference between the two paradigms, or why one is better than the other.

    In one paradigm, an application (the shell) parses a string and converts it to arguments.  In the other paradigm, an application (the application being called) parses a string and converts it to arguments.

    The only significant difference is that in the *nix paradigm, the called application doesn't have to interpret the intent of the parent - but there's also an opportunity for mischief there, because the parent can produce strings that are impossible for the shell to create (and thus may not have been tested by the application).
