Issues with Windows PowerShell syntax


REI recently posted some comments/requests about Windows PowerShell syntax at:

http://blogs.msdn.com/powershell/archive/2006/04/25/583273.aspx#675133

Let's go through a few of the points.

...the syntax was just way too cryptic and unintuitive. Often it's even dangerous. Like this:

#PowerShell's syntax causes dangerous problems generating incorrect results and no error for seemingly innocent expressions:
function Pow($var, $exp) { [Math]::Pow($var, $exp) }
Pow(2, 8) + 3 #invalid (you'd expect it to work)
Pow 2 8 + 3 #valid, BUT INCORRECT RESULT (256)
(Pow 2 8) + 3 #valid (259)
...........
#PowerShell requires you to know whether a function is a cmdlet/script or a .NET method. Even though this method does the exact same thing as the method above, the syntax differs:
[Math]::Pow 2 8 + 3 #invalid
[Math]::Pow(2, 8) + 3 #valid


There are really 2 issues here: parsing modes and function calling syntax.  In order to make Windows PowerShell serve 2 masters, Rich Scripting Engine & Interactive Shell, we made a number of design decisions that you need to be aware of.  We realize that whenever we have one of these, it is going to require our users to invest their time and energy to learn, so we treat these issues with incredible respect.  We travail over each of them, demanding to understand whether it is truly necessary or whether we just haven't thought about the problem hard enough.  I'm really proud of the team's dedication to work through these issues, and there are a bunch of them that we were able to design away after enough creative thought and hard work.  In the end, we are trying to perform a slightly unnatural act, so a couple remain.

With all that as a preamble, let me be quick to note that this issue exists because we are providing a uniform surface to an incredibly wide range of functions.  In the typical shell experience, the user has dozens of these issues as the shell does a limited set of functions and then you have to rely upon separate and sundry utilities and commands to accomplish a task.

The first of those is Parsing Modes:

PSMDTAG:PARSER: The parser has both a COMMAND and an EXPRESSION parsing mode. 

EXPRESSION Mode:
PS> 8 + 3
11

COMMAND Mode:
PS> Write-Output 8 + 3
8
+
3

Parentheses allow you to use expressions in a statement that looks like it would be evaluated in COMMAND Mode:

PS> Write-Output (8 + 3)
11
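
The same rule accounts for the 256 that REI reported. As a minimal sketch, using REI's Pow function from the quote above (the variable names $commandMode and $expressionMode are only illustrative):

```powershell
# REI's function, copied from the quoted comment.
function Pow($var, $exp) { [Math]::Pow($var, $exp) }

# COMMAND mode: 2, 8, + and 3 are four separate arguments.
# Only the first two bind to $var and $exp; the extras end up in $args.
$commandMode = Pow 2 8 + 3

# Parentheses run the command first; its result is then used in an expression.
$expressionMode = (Pow 2 8) + 3

"Command mode result:    $commandMode"
"Expression mode result: $expressionMode"
```

The first call never sees an addition at all, which is why it returns 256 without raising an error.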

Bruce Payette, the Language dev-lead, is writing a great book, Windows PowerShell in Action (http://manning.com/payette/), which goes into this topic in great detail regarding both how it works and why we do this.  Sadly, you'll have to wait until this fall to purchase it.

 

The second issue is function call syntax.  A number of people have been tripped up by this because a number of environments call functions putting the arguments within parentheses.  e.g. FOO(2,8)

PSMDTAG:PHILOSOPHY: Command abstractions are the greatest good.

PSMDTAG:PHILOSOPHY: The world already has lots of great tools for PROGRAMMERS, Windows PowerShell is focused on USERs.

While we provide access to a wide range of functions, our core belief is that commands represent the greatest good.  The essence of a COMMAND is the fact that someone took time to think about the best way to surface a function to a USER (vs a programmer).  The industry already has lots of great ways to surface functions to programmers; our mission is to address the great unmet need of users.

As such, we view functions as just another implementation choice for writing COMMANDS (not methods).  This is why we surface them using COMMAND syntax.  That said, we've brainstormed the idea of allowing either syntax because a number of people have stumbled on this point.   I believe that one of the reasons why this comes up is that we have stopped short on making functions be true equivalents to Commands (something we all dislike but "to ship is to choose").  I'm inclined to address this ASAP and use better documentation to get people over this point.

PSMDTAG:PHILOSOPHY:  No matter what, there will always be some number of key concepts that you have to learn to be productive in a new environment.  You want to keep that number as small as possible, but if it is zero, there is a good chance that you haven't produced anything worthwhile.

 

 

REI continues:

#Certain universally recognized aliases for common tasks don't exist:
ls #aliased
dir #aliased
new #not aliased (new-object)

#Calling a default constructor requires its own special syntax:
new-object collections.arraylist(5) #valid
new-object collections.arraylist() #invalid
new-object collections.arraylist #valid

Great point.  Here is what is going on.  We decided against providing an alias for NEW because we decided to make NEW a language keyword in a future version.  Thus in the future, you'll be able to do either:

New-Object Collections.ArrayList 5

or

new Collections.ArrayList(5)

 

REI continues:

Also, the no plurals thing: it doesn't work. The convention in .NET is to signify collections as plurals, and PowerShell's no-plurals policy is a real lump in the pudding. get-member would make a lot more sense if it were called get-members.

The problem is that plurals are an inconsistent syntax in English.  Sure, we can all agree that Get-Members makes more sense than Get-Member, but then how is someone from Korea supposed to guess that it should be Get-ChildREN vs. Get-ChildS?  Predictability is critical for a command line environment (there is a theory of operations involved here that I should take time to document, but this is a core pillar of Windows PowerShell).  The other reason why we stick with the singular form is that it is often unknown whether a command provides a value or a set of values.

Get-Process Notepad

Returns a singleton if there is only one Notepad running and a set of values if more than one Notepad is running.
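
A script that has to cope with either outcome can wrap the call in the array subexpression operator @( ), which turns whatever the command emits into an array. This is a minimal sketch; the process name notepad and the variable $procs are only illustrative, and the code also runs when no such process exists:

```powershell
# @( ) always yields an array, whether the command emits zero, one, or many objects.
# -ErrorAction SilentlyContinue suppresses the error raised when nothing matches.
$procs = @(Get-Process -Name notepad -ErrorAction SilentlyContinue)

"Found $($procs.Count) Notepad process(es)"

# The loop body is the same for a single process and for several.
foreach ($p in $procs) {
    "Process Id: $($p.Id)"
}
```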

 

Yeah, I know, this isn't the right place to be rambling about this....

Nope - there you are absolutely wrong.   We love people telling us what they don't like.  10,000 thanks for taking the time to articulate the things that you didn't like/found confusing.  These really help us understand where our issues are and it forces us to be crisp about our thinking.  e.g. is this something we are hard-core about and need to do a better job of documenting or is it something we are open to changing and if so, what can we do to improve things.

I encourage other people to chime in with the things that they've found confusing. 

Jeffrey Snover [MSFT]
Windows PowerShell/Aspen Architect
Visit the Windows PowerShell Team blog at:    http://blogs.msdn.com/PowerShell
Visit the Windows PowerShell ScriptCenter at:  http://www.microsoft.com/technet/scriptcenter/hubs/msh.mspx

  • Hello

    I am not sure if there is a bug register for PowerShell, and I hope you do not mind me posting it here.

    $a = "6"
    $a = 0 + $a + 3

    Now, $a is eq 9, as expected

    $a = "6"
    $a = 0 + $a * 3

    Now, $a is eq 666, not 18 as expected

    Is this a bug?

    Thanks
  • Thanks, Jeffrey. This is precisely the sort of detailed, insider explanation that administrators are unaccustomed to getting from Microsoft. Even if we don't necessarily agree with a design decision, it's so incredibly useful to know about the scenario and backstory that resulted in that decision. This sort of transparency is deeply refreshing.
  • David, one issue with your example is that you are not using integers, but strings.

    $a = "6"
    is a string as denoted by the quotes.
    $a = 6
    is an integer.

    I believe the other issue involves parsing modes and operator precedence.

    When PowerShell sees $a = 0 + $a + 3
    PowerShell processes from left to right, first evaluating 0+$a, which puts PowerShell in Expression mode and gives the integer value of 6. Then it adds 3 to 6 and assigns the result to $a.

    However, when PowerShell sees $a = 0 + $a * 3
    PowerShell first processes $a * 3 because the multiplication operator has higher precedence. This puts PowerShell into Command (or Argument) mode and the result is a string operation since $a is the first token and it is a string. This gives you the string "666". Then 0 + $a is evaluated. 0 is first so PowerShell evaluates in Expression mode, turning the string "666" into the integer value 666. Likewise, if the 0 had been a 1, you'd have gotten the integer value 667.
  • Forgot to add that you can change the order of evaluation by using parentheses. So if you had $a = (0 + $a) * 3
    then 0 + $a would be evaluated first rather than $a * 3, so PowerShell would have evaluated the expression in Expression mode as in your first example because the integer 0 is the first token. Thus $a would be converted to an integer and you would get the integer value 18 rather than the string value "666".
  • http://blogs.msdn.com/powershell/archive/2006/07/23/Issues_with_Windows_PowerShell_syntax.aspx
  • Hello again

    The example

    $a = "6"
    $a = 0 + $a + 3

    is in the documentation for Powershell. I would never use this sort of code. I was just trying to better understand the consequences of this sort of code. I thought it was interesting that  each of these:

    $a = 0 + $a + 3
    $a = 0 + $a - 3
    $a = 0 + $a / 3

    results in the expression mode, and the calculation being performed. But this :

    $a = 0 + $a * 3

    results in command mode, and the strings being concatenated, and the zero being dropped off as well. Shouldn't the "0 +" cause an exception when it is performed after the string manipulation, in command mode? It sounds like it just forgets the "0 +"

    So I can see why the multiply-equals results in the string multiplication. I can see why the divide results in the expression mode (there is no /= function for strings). But I would say that the documentation is a little bit wrong. It states:

    "In an assignment, the first element of the right hand side determines whether strings or integers are the result.  If an explicit requirement for an integer exists, preface the right hand side with "0 +" or cast the right hand side to a number, followed by the rest of the expression."

    According to that statement, "$a = 0 + $a * 3" should result in  expression mode.

    Anyway, I think this code is dangerous, and should not be done, firstly because it is not good practice, and secondly because it is not explicit, and thirdly because this proves that the outcome is unpredictable.

    Thanks
  • OK, I get what you're saying now.

    In the statement $a = 0 + $a * 3, PowerShell won't evaluate the 0 first. Due to higher precedence for multiplication it will evaluate $a * 3 first, get "666", then evaluate 0 + $a (or 0 + "666"). The zero is the first element in the evaluation so PowerShell uses Expression mode and the result is an integer which is why the result is 666 and not "6660". If you instead use the expression ($a * 3) + 0, then the evaluation after $a * 3 would be "666" + 0. Here, the string "666" is the first element, so PowerShell evaluates the expression in Command mode and the result is "6660".

    You're right. The docs could be clearer about order of evaluation due to precedence, and how the existence (or lack thereof) of certain operators can affect parsing and evaluation. PowerShell will always take these factors into account and the first element in an expression will not always be the first element to be evaluated.

    I don't think it's a bug. It's similar to what would occur in C++, C#, etc., if you evaluated 2 + 6 * 3 vs (2 + 6) * 3. In the first example, 6*3 is first evaluated, whereas 2+6 is evaluated first in the second example, each yielding different results due to operator precedence. This type of pitfall affects more scenarios in PowerShell, however, due to its greater flexibility in type conversion and different parsing modes. In most programming languages, you'd have to perform the type conversion explicitly.
  • Thanks n4cer

    Yes, I think it is an issue with the documentation, not a bug. But it points out the inconsistencies within a language that does not have strict type checking. Good luck debugging stuff like this. I prefer a strict type checking language so that you know what you are getting.

    I totally understand the need to have a more flexible option in a command line interface. It just means the coder has to know what they are doing.

    Like the documentation says, the confusion can be avoided by applying a strict type to the variable when it is created:

    [int]$a = "6"
    $a = $a * 3

    results in the calculation being performed, even with the implicit type conversion.

    Thanks
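
The results discussed in this exchange can be reproduced directly. In the sketch below (all variable names are illustrative), every line is an ordinary expression, so parsing mode does not seem to be what differs between them. What changes is the type of the left-hand operand of each operator, which decides between arithmetic and a string operation:

```powershell
$a = "6"

$sum      = 0 + $a + 3     # 0 + "6" -> 6, then 6 + 3
$repeated = 0 + $a * 3     # "6" * 3 -> "666" (string repeated), then 0 + "666"
$grouped  = (0 + $a) * 3   # 0 + "6" -> 6, then 6 * 3
$appended = ($a * 3) + 0   # "666" + 0 -> string concatenation

# A type constraint converts the string once, when it is assigned.
[int]$n = "6"
$typed = $n * 3

"sum=$sum repeated=$repeated grouped=$grouped appended=$appended typed=$typed"
```

With a string on the left, * repeats and + concatenates; with a number on the left, the right-hand string is converted to a number first.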
  • Thanks for the reply, Jeff. I'm just as glad as you that you've taken the time to give a real explanation as to how things are done.

    I don't like to beat a dead horse, but let me explain my thoughts further...


    "As such, we view functions as just another implementation choice for writing COMMANDS (not methods).  This is why we surface them using COMMAND syntax.  That said, we've brainstormed the idea of allowing either syntax because a number of people have stumbled on this point."

    So will there be a command or registry value or something that I can change to switch modes? What happens when I'm reading someone's code and trying to understand it, or when I run someone else's script when I'm using the other setting?


    That's great that you're making new a keyword. In the future, will I be able to say arraylist(), or are the brackets going to throw it off the way it does right now?

    The problem with the current situation is that we get an error saying that an expression was expected after the '('... which, to most of us OOP programmers, is an all too familiar error, but makes no sense in this context. I stumbled around for quite a while until I realized that constructors can't have the () when it doesn't take any arguments.


    "Sure, we can all agree that Get-Members makes more sense than Get-Member but then how is someone from Korea  suppose to guess that it should be Get-ChildREN vs get-ChildS ?"

    As a Japanese-English translator, I guarantee that although you will often hear Japanese and Koreans mistaking singulars and plurals, this is rarely a problem for a computer user's purposes. Two reasons for this:
    1. Even though most Japanese people have a hard time knowing _how_ to use plurals, most of them know one when they see one.
    2. Because most of the time the plural form of a word is made by adding a character or two to the end of the word, the tab key solves this problem very well. Hungarian notation was thrown out in .NET thanks to Intellisense and is now frowned upon, so problems such as these that can be solved by UI advances are bound to suffer the same fate in the near future.


    "Predictability is critical for a command line environment."

    Unfortunately, the no-plurals policy doesn't resolve lack of predictability enough to be meaningful at all. Synonyms are so abundant in English that we'll end up looking up the documentation (or hitting tab) anyway.

    I don't mean to alienate non-English users, but I think being consistent with standard English as well as with the rest of the .NET platform would make it more predictable.


    "The other reason why we stick with the singular tense is that it is often unknown whether a command provides a value or a set of values."

    True, but in the get-process example, if there are two instances of notepad, a method expecting a Diagnostics.Process object would break. It would be much more informative to know when a method _might_ return multiple objects, in which case we will know to either 1. make sure the call would yield only a single result -- perhaps by specifying a handle rather than a name, or 2. make sure the receiving function/script/etc can handle collections.

    Without knowing that the command can return multiple values, we are more likely to accidentally use it in a way that might, on the off chance, return multiple values. I think both users and developers would be set back by this.


    Thanks again for the detailed response. I'd love to hear from you more sometime, by email if another blog entry on the same topic would be inconvenient. My address is ragingrei hotmail.
  • > So will there be a command or registry value or something that I can change to switch modes? What happens when I'm reading someone's code and trying to understand it, or when I run someone else's script when I'm using the other setting?

    No.  If we supported this, we would just allow either syntax.

    > In the future, will I be able to say arraylist()

    yes. You'll be able to say:
     $a = new ArrayList()

    > Even though most Japanese people have a hard time knowing _how_ to use plurals, most of them know one when they see one.

    Therein lies the problem.  With environments like Visual Studio, you will SEE things (via intellisense) and be able to RECOGNIZE them.  With a command line environment, you have to RECALL what to type.  RECOGNITION VS RECALL is the central theoretical split between GUI and CLI environments.

    > I think being consistent with standard English as well as with the rest of the .NET platform would make it more predictable.

    If our primary customers were developers, I'd agree with you; however, our primary customers are Admins/IT Pros who we assume will not have much/any knowledge of .NET.  Now that said, one of our explicit goals was to meet the needs of those users in a way that provided a smooth glide path to .NET, but that is a secondary design consideration.


    Cheers!
    Jeffrey Snover [MSFT]
    Windows PowerShell/Aspen Architect
  • I am having a problem trying to figure out how to pass a variable to "where."

    Scenario: I have an XML file that contains all my Microsoft Product IDs. I want a function that will parse the XML file for a given product and return the PID.

    Example:
    Function GetPIDS
    {
     $pids = [xml] (Get-Content \\NDISDEV\Data$\AdminTools\Pids.xml)
     $pids.Product_Keys.PID | where {$_.Name -like $args} | Format-Table Name,Key
    }

    Problem: where doesn't seem to like $args. I have tried $args, "$args", and '$args'. It fails with all of them.
  • This is a bug. If you assign $args to another variable like

    $myargs = $args

    and then use that variable in the scriptblock  instead of $args it should work ok. What's happening is that where-object invokes the scriptblock with $sb.Invoke() so $args is reset in the scriptblock.

    -bruce

    Bruce Payette [MSFT]
    PowerShell Technical Lead
  • Assign $args to a local variable and use it:

    Function GetPIDS
    {
    $x = $args
    $pids = [xml] (Get-Content \\NDISDEV\Data$\AdminTools\Pids.xml)
    $pids.Product_Keys.PID | where {$_.Name -like $x[0]} | Format-Table Name,Key
    }

    Jeffrey Snover [MSFT]
    Windows PowerShell/Aspen Architect

  • I encountered a problem with the globbing syntax that I couldn't solve. Filenames containing [] characters are almost impossible to manipulate because the names are treated as wildcard patterns. I'm not trying to match a character class: the filename really contains square brackets.

    The worst thing is that often such mistakes are completely silent: some files just don't get processed. It's really quite dangerous.

    After wasting two hours by constructing various alternatives involving pipelines to try to prevent interpretation of strings as filespecs, of trying to escape the filenames before they're interpreted, and of reading documentation to try to find a way of avoiding wildcard expansion, I gave up with PowerShell and haven't used it since. I'd be really interested in knowing the solution, though.
  • I thought I'd provide an example of the problem. In an empty directory, try this:

      new-item [test] -type file
      get-childitem | remove-item

    The file [test] is not deleted. Of course, in this case, remove-item * is an acceptable substitute, but in my case I had 'where-object' in the pipeline (and I was actually trying to archive the items somewhere safe).

    It was a surprise to find that a pipe of objects wasn't equivalent to specifying those same objects with wildcards. A lot of programs use suffixes like [1] on filenames, and this is causing me problems.

    Another observation. If you try:
      move-item -path * -destination newname

    You get the error:
      Move-Item : Cannot move item because item at 'Some\Path\[test]' does not exist.

    Hm. I'm stuck.
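
For readers who hit the same wall: in PowerShell versions that provide it, the -LiteralPath parameter takes a path exactly as written, with no wildcard interpretation, so square brackets are treated as ordinary characters. The following is a minimal sketch under that assumption; the scratch directory name bracket-demo and the variable names are hypothetical:

```powershell
# Work in a scratch directory so nothing else is touched.
$dir = Join-Path ([IO.Path]::GetTempPath()) "bracket-demo"
New-Item -ItemType Directory -Path $dir -Force | Out-Null

# Create a file whose name really contains square brackets.
$file = Join-Path $dir "[test].txt"
[IO.File]::WriteAllText($file, "demo")

# -LiteralPath treats [ and ] as plain characters rather than a character class.
$before = Test-Path -LiteralPath $file

# Pass each item's full name explicitly instead of relying on wildcard-aware -Path binding.
Get-ChildItem -LiteralPath $dir |
    ForEach-Object { Remove-Item -LiteralPath $_.FullName }

$after = Test-Path -LiteralPath $file
"Existed before: $before, exists after: $after"

# Clean up the (now empty) scratch directory.
Remove-Item -LiteralPath $dir
```

Move-Item and Copy-Item accept -LiteralPath in the same way, which covers the archiving case described above.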