-
Notifications
You must be signed in to change notification settings - Fork 513
Powershell Parser can't tell the difference between En dash and dash #1308
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
When the file created above is opened in PowerShell ISE line 2 is highlighted and the En dash is shown as: Vscode displays the above as: |
Thanks for this! Just curious, what do you expect to happen? That PowerShell would be able to handle the En dash? Or that VSCode was more outward on the fact that it is an en dash? |
For me if I use an en dash or em dash it does break highlighting which is an indicator of the wrong dash being used although not super in your face. Seems like something that should perhaps be handled by PSScriptAnalyzer though as an en/em dash could be valid in a string and with positional parameters "<en-dash>something" might be a string not a parameter which would be valid. PSScriptAnalyzer could have some sort of "potential illegal dash" warning that could be suppressed when it's intentional. |
That's a great idea @omniomi 👍 We should open an issue on: to create such a rule. |
To clarify, is the problem that an em-dash is working like a dash, or that it's not working like a dash? Or is the problem that the encoding is bad and we're displaying it not as a dash but as a nasty character sequence? PowerShell accepts all dash variants equally by design. So I guess there's a problem if we're not doing that. |
If I type: Get-Item —Path ~ And execute it with F8, it works. But we don't get any completions on it, so we should definitely fix the regex/completions mechanism there. @gbuktenica Can you explain what the hold-Alt functionality does and what provides it? |
@rjmholt The ALT function is just to show you how to recreate the problem. A developer would never normally do this. Normally this issue arises when a segment of code is copied from an external source, say github, stackoverflow or more likely a blog and during the copy and paste a valid dash is converted to an en dash. Potentially the original hosting platform had already converted a dash. |
@omniomi and @tylerl0706 I have raised a request on PSScriptAnalyzer. |
So I've done some research and think I finally understand the Alt-code thing as a Windows CP1252 code-point entry feature, except that I can't find any specification of how it's supposed to function these days (i.e. how much code-point translation it does). In CP1252, en-dash is code-point 150, but it looks like VSCode converts that to UTF-8, where it is encoded as Because I haven't been able to reproduce the Alt-code input, I don't know if VSCode is displaying But I did copy and paste an em-dash (—) into the buffer and it worked properly — recognised as a dash by the PowerShell parser/tokenizer, so that In terms of other applications converting characters and then copying them back into VSCode, I don't really see how we can filter characters copied into the document. Even if VSCode allows the extension to do it, PowerShell supports unicode strings, so backquotes and other things generated by MSWord are legitimate string contents. Also, PowerShell already supports using left/right double quotes and en/em-dashes where ASCII quotes and dashes are conventionally used respectively. That is, they're not wrong or invalid — you're allowed to use them. |
Also @gbuktenica, hello from a fellow sandgroper. |
@rjmholt Yes, I included the Alt-code thing so that a Windows developer recreate the Unicode character I had the issue with (em-dash). |
Ah, I see! Well, for completeness, VSCode lets you change your file encoding. And it also looks like there's a way to make the ISE use UTF-8. |
So now that we have an issue on PSSA, do we have any action to take in the extension? I think no - and we can probably close this. |
@rjmholt and I are discussing this. I think we generally agree that this is already being handled correctly in PowerShell, and that we shouldn't automatically transform what people paste into VS Code. That being said, it does make sense to me to have a PSSA rule that can help detect where these characters are showing up in your file (and said PSSA rule can have a "quick fix" in VS Code that allows you to snap to one specific dash universally). However, I wouldn't place it at a particularly high priority. |
Issue Type: Bug
This issue will happen when code is copied from an external source such as the internet.
Error returned:
Extension version: 1.7.0
VS Code version: Code 1.22.2 (3aeede733d9a3098f7b4bdc1f66b63b0f48c1ef9, 2018-04-12T16:38:45.278Z)
OS version: Windows_NT x64 10.0.15063
System Info
The text was updated successfully, but these errors were encountered: