
Incorrect handling of Umlaut when debugging with PowerShell Extension #1744


Closed
MarcusLerch opened this issue Feb 5, 2019 · 7 comments

@MarcusLerch

Issue Type: Bug

Debugging this code

$MyHashTable = [ordered]@{}
$MyHashTable.Add("Key1","Value1")
$MyHashTable.Add("KeyÄ2","ValueÄ2")
$MyHashTable

in VSCode with PowerShell Extension, incorrectly throws the following exceptions:

At C:\Untitled-1.ps1:3 char:24
+ $MyHashTable.Add("KeyÄ2","ValueÄ2")
+                        ~
Missing ')' in method call.

At C:\Untitled-1.ps1:3 char:24
+ $MyHashTable.Add("KeyÄ2","ValueÄ2")
+                        ~~~~~~~~~~~~~
Unexpected token '2","ValueÄ2"' in expression or statement.

At C:\Untitled-1.ps1:3 char:37
+ $MyHashTable.Add("KeyÄ2","ValueÄ2")
+                                     ~
Unexpected token ')' in expression or statement.

However, the code runs fine in PowerShell and debugs correctly in ISE.

Extension version: 1.11.0
VS Code version: Code 1.30.2 (61122f88f0bf01e2ac16bdb9e1bc4571755f5bd8, 2019-01-07T22:54:13.295Z)
OS version: Windows_NT x64 10.0.17763

System Info

Item               Value
----               -----
CPUs               Intel(R) Core(TM) i7-6820HQ CPU @ 2.70GHz (8 x 2712)
GPU Status         2d_canvas: enabled
                   checker_imaging: disabled_off
                   flash_3d: enabled
                   flash_stage3d: enabled
                   flash_stage3d_baseline: enabled
                   gpu_compositing: enabled
                   multiple_raster_threads: enabled_on
                   native_gpu_memory_buffers: disabled_software
                   rasterization: enabled
                   video_decode: enabled
                   video_encode: enabled
                   webgl: enabled
                   webgl2: enabled
Memory (System)    31.92GB (23.73GB free)
Process Argv
Screen Reader      no
VM                 0%

@SydneyhSmith
Collaborator

Thanks @MarcusLerch this seems like the same issue as #1680
We need to figure out how to properly configure the encoding settings and then document that.

@SydneyhSmith
Collaborator

This occurs because your PowerShell encoding and VSCode encoding are not configured the same way: your script file is encoded in UTF-8 and PowerShell is trying to read it as Latin-1, which is a setting that the PowerShell extension can't see or change. Can you try this #1680 (comment) and let us know if it works for you?
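
To make the mismatch concrete, here is a minimal sketch of what happens to a single 'Ä' when its UTF-8 bytes are decoded with a legacy single-byte code page. It assumes Windows-1252 is the code page in play (the thread mentions Latin-1/CP-1252; your system's default may differ), and the variable names are only illustrative.

```powershell
# Build the character from its code point so this snippet does not itself
# depend on how the script file is encoded.
$umlaut = [string][char]0x00C4          # 'Ä'

# A UTF-8 file stores 'Ä' as two bytes.
$utf8Bytes = [System.Text.Encoding]::UTF8.GetBytes($umlaut)
$hex = ($utf8Bytes | ForEach-Object { $_.ToString('X2') }) -join ' '

# Decode the same two bytes as Windows-1252 (assumed legacy default).
$cp1252 = [System.Text.Encoding]::GetEncoding(1252)
$decoded = $cp1252.GetString($utf8Bytes)
$codePoints = ($decoded.ToCharArray() | ForEach-Object { 'U+{0:X4}' -f [int]$_ }) -join ' '

"UTF-8 bytes : $hex"          # C3 84
"Decoded as  : $codePoints"   # U+00C3 U+201E
```

The second decoded character, U+201E, is a low double quotation mark. As far as I can tell PowerShell's tokenizer accepts typographic double quotes as string delimiters, which would explain why the parser in the report sees the string end right after "Key" plus one character and complains at char 24.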

@MarcusLerch
Author

@SydneyhSmith that actually fixes the problem. If I manually set the file encoding to UTF8, debugging and running give the same correct output. VSCode opened the file as UTF8 with BOM, which then produced the error.
So I'm closing this thread.
Thanks for your support!

@rjmholt
Contributor

rjmholt commented Feb 8, 2019

@MarcusLerch would you be able to share the output of $PSVersionTable? I'm interested in the fact that the BOM seemed to cause rather than resolve the problem.

My reasoning is:

  • UTF-8 is generally a good choice for non-ASCII characters, since the extended ASCII encoding for each character set is often different (i.e. Latin-1 might solve the problem for 'ü' but not for Cyrillic characters, whereas UTF-8 solves it once and for all).
  • UTF-8 without BOM has become the dominant standard encoding scheme, but looks just like ASCII or Latin-1 to things like Windows PowerShell. You can try to change Windows PowerShell's encoding settings, but I see mixed reports about that working.
  • A lot of .NET BCL reader types (like System.IO.StreamReader) look for a BOM, and Windows PowerShell automatically "just works" when it sees a BOM like this (possibly why the configuration settings are so hard to make work).
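
The BOM behaviour described in the last bullet can be reproduced outside the debugger. The sketch below is not how PowerShell itself reads scripts; it only uses System.IO.StreamReader with BOM detection switched on and Windows-1252 as an assumed fallback encoding. Read-WithBomDetection is a hypothetical helper name.

```powershell
function Read-WithBomDetection {
    param([byte[]]$Bytes)

    # Assumed fallback: Windows-1252, standing in for the legacy default.
    $fallback = [System.Text.Encoding]::GetEncoding(1252)
    $stream   = [System.IO.MemoryStream]::new($Bytes)
    # Third argument: detectEncodingFromByteOrderMarks.
    $reader   = [System.IO.StreamReader]::new($stream, $fallback, $true)
    try { $reader.ReadToEnd() } finally { $reader.Dispose() }
}

# 'KeyÄ2', built from a code point so the snippet is encoding-independent.
$text = "Key$([char]0x00C4)2"
$body = [System.Text.UTF8Encoding]::new($false).GetBytes($text)
$bom  = [byte[]](0xEF, 0xBB, 0xBF)     # UTF-8 byte order mark

$withBom    = Read-WithBomDetection -Bytes ($bom + $body)
$withoutBom = Read-WithBomDetection -Bytes $body

"With BOM    : $withBom"       # read back intact
"Without BOM : $withoutBom"    # 'Ä' turns into two wrong characters
```

With the BOM present the reader switches to UTF-8 and drops the marker; without it, the same bytes go through the fallback encoding and the umlaut is mangled.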

@MarcusLerch
Author

MarcusLerch commented Feb 11, 2019

@rjmholt sure, here you go

PS C:\> $PSVersionTable

Name                           Value
----                           -----
PSVersion                      5.1.17763.134
PSEdition                      Desktop
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}
BuildVersion                   10.0.17763.134
CLRVersion                     4.0.30319.42000
WSManStackVersion              3.0
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1

Taken from the VSCode PowerShell Terminal.
If I can check anything else to help find the cause, just let me know.

@MarcusLerch
Author

MarcusLerch commented Feb 11, 2019

Just tried to reproduce the behavior and @rjmholt you are correct.
UTF8 with BOM solves the problem, not UTF8.
I saved the file as UTF8 with BOM and now it runs and debugs correctly!
I then saved it as UTF8 and debugging no longer works.
So here are the steps to reproduce the error:

  1. Create a new file in VSCode
  2. Paste the following PowerShell code:
$MyHashTable = [ordered]@{}
$MyHashTable.Add("Key1","Value1")
$MyHashTable.Add("KeyÄ2","ValueÄ2")
$MyHashTable
  3. Save the file as a PS1 file; VSCode automatically saves it with UTF8 encoding
  4. Set a breakpoint and start debugging -> you get the error
  5. Save the file with encoding set to UTF8 with BOM -> everything works fine
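
For anyone who wants to check or fix a file from the shell rather than through the VSCode encoding picker, the last step above could be sketched roughly as follows. Test-Utf8Bom and ConvertTo-Utf8Bom are hypothetical helper names, and the conversion assumes the file already contains valid UTF-8. The .NET file APIs are used here because the meaning of -Encoding UTF8 on the built-in cmdlets differs between Windows PowerShell and PowerShell 6+.

```powershell
function Test-Utf8Bom {
    param([string]$Path)
    # True when the file starts with the UTF-8 byte order mark EF BB BF.
    $bytes = [System.IO.File]::ReadAllBytes($Path)
    return ($bytes.Length -ge 3 -and
            $bytes[0] -eq 0xEF -and $bytes[1] -eq 0xBB -and $bytes[2] -eq 0xBF)
}

function ConvertTo-Utf8Bom {
    param([string]$Path)
    # Assumption: the existing content is UTF-8, with or without a BOM.
    $text = [System.IO.File]::ReadAllText($Path, [System.Text.UTF8Encoding]::new($false))
    [System.IO.File]::WriteAllText($Path, $text, [System.Text.UTF8Encoding]::new($true))
}

# Demo on a temporary file that mimics a BOM-less script saved by VSCode.
$path = [System.IO.Path]::GetTempFileName()
$script = "`$MyHashTable.Add('Key$([char]0x00C4)2','Value$([char]0x00C4)2')"
[System.IO.File]::WriteAllText($path, $script, [System.Text.UTF8Encoding]::new($false))

$before = Test-Utf8Bom -Path $path
ConvertTo-Utf8Bom -Path $path
$after  = Test-Utf8Bom -Path $path
$roundTrip = [System.IO.File]::ReadAllText($path)
Remove-Item -LiteralPath $path

"BOM before: $before, BOM after: $after"
```

Pass full paths to these helpers, since the .NET file methods resolve relative paths against the process working directory rather than the current PowerShell location.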

@rjmholt
Contributor

rjmholt commented Feb 14, 2019

I then saved it as UTF8 and debugging no longer works.

Yeah I thought this might be the case -- basically the PowerShell tokenizer will detect the BOM and reconfigure itself. The .NET StreamReader classes all do something like this, and it often causes headaches.

When there's no BOM, PowerShell assumes by default that the encoding is latin-1/CP-1252 and that's when things break.

You can try to configure PowerShell's encoding (@rkeithhill being a StackOverflow legend as always!), but in versions prior to 6, trying to get PowerShell to keep and honour your encoding settings is quite a dance.

So a BOM is the easiest way to be sure.

PowerShell 6+ defaults to UTF-8 without a BOM (but will happily accept a BOM as well), so you'll likely find this isn't a problem in PowerShell 6+.
