Unicode support #44

Open
fmang opened this issue Oct 28, 2020 · 2 comments

Comments

@fmang
Contributor

fmang commented Oct 28, 2020

The task that elevated_shell.ps1 creates runs cmd.exe, which calls PowerShell with its output redirected. Because cmd.exe performs the redirections, the output files are encoded in the OEM code page.

If we dropped the cmd step and ran PowerShell directly, the whole data pipeline would have full Unicode support.

Performing redirections in PowerShell with Write-Host involved is a nightmare, especially if we want to support older PowerShell versions like 4. I haven’t had any luck, but if someone manages to have the task spawn PowerShell directly without breaking the test suite, glory to them!

Related to #43, which fixes encoding corruption but limits the supported character set to OEM/ANSI.
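
To make the OEM limitation concrete, here is a minimal sketch of what happens to text that has to pass through an OEM code page. Code page 437 is only an assumed example (the real OEM code page depends on the system locale), and the variable names are illustrative.

```powershell
# Illustrative only: code page 437 is an assumed example of an OEM code page.
$oem = [System.Text.Encoding]::GetEncoding(437)

# Build the sample from code points so the script file's own encoding does not matter.
$sample = "caf" + [char]0x00E9 + " " + [char]0x4E2D   # "cafe" with an acute accent, a space, a CJK character

# Round-trip the text through the OEM code page, as a file written in OEM would.
$roundTrip = $oem.GetString($oem.GetBytes($sample))

# The accented letter exists in code page 437, but the CJK character does not,
# so the round-tripped string no longer matches the original.
$survived = $roundTrip -eq $sample
"Survived OEM round trip: $survived"
```

Anything outside the code page is lost at the encoding step, which is why keeping the cmd.exe redirection caps the supported character set.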

@mwrock
Member

mwrock commented Nov 4, 2020

It's been so long that I really can't remember why we went through cmd first instead of calling PowerShell directly. I'm pretty sure we tried and ran into issues similar to yours.

@fmang
Contributor Author

fmang commented Nov 5, 2020

I see. The trick with the cmd wrapper is that it merges the multitude of PowerShell streams into stdout/stderr. Since PowerShell wouldn’t collaborate, I thought I could use System.Diagnostics.Process directly to emulate cmd’s behavior, and I managed to get an echo with Unicode to work.

Here’s the task code:

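# Use UTF-8 for everything the console reads and writes.
$OutputEncoding = [console]::InputEncoding = [console]::OutputEncoding = [System.Text.Encoding]::UTF8
$pinfo = New-Object System.Diagnostics.ProcessStartInfo
$pinfo.FileName = "PowerShell"
$pinfo.RedirectStandardOutput = $true
$pinfo.RedirectStandardError = $true
$pinfo.UseShellExecute = $false
$pinfo.Arguments = "{arguments}"
$p = New-Object System.Diagnostics.Process
$p.StartInfo = $pinfo
# Discard the boolean returned by Start() so it does not leak into the task output.
$p.Start() | Out-Null
# Drain both pipes before waiting; waiting first can deadlock once a pipe buffer fills up.
$stderrTask = $p.StandardError.ReadToEndAsync()
$stdout = $p.StandardOutput.ReadToEnd()
$p.WaitForExit()
$stdout | Out-File {out_file} -NoNewline
$stderrTask.Result | Out-File {err_file} -NoNewline
exit $p.ExitCode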

Here it is in action: fmang@c86fd08

One issue remains: ipconfig, and presumably most cmd-ish programs, won’t output UTF-8 even when the console is configured that way, but PowerShell will still try to decode their output as UTF-8, causing corruption. Given that special characters didn’t work in the first place, we could specify that for old commands to work, one must explicitly convert the output, like ipconfig | Convert-From-OEM-To-Unicode, though I don’t know what PowerShell command can do the conversion.
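
One caveat with converting after the pipe: by then PowerShell has already decoded the bytes as UTF-8, and invalid sequences have typically been replaced with U+FFFD, so the original bytes may not be recoverable. A possible alternative, sketched below, is to switch the console encoding only around the legacy command. Invoke-WithOemEncoding is a hypothetical helper name, not an existing cmdlet, and the sketch assumes the legacy program writes in the current culture's OEM code page.

```powershell
# Hypothetical helper (not a built-in cmdlet): run a command while PowerShell
# decodes native program output as OEM, then restore the previous encoding.
function Invoke-WithOemEncoding {
    param([Parameter(Mandatory = $true)][scriptblock]$Command)

    $previous = [Console]::OutputEncoding
    # Assumption: the legacy program writes in the current culture's OEM code page.
    $oemCodePage = [System.Globalization.CultureInfo]::CurrentCulture.TextInfo.OEMCodePage
    try {
        [Console]::OutputEncoding = [System.Text.Encoding]::GetEncoding($oemCodePage)
        & $Command
    }
    finally {
        # Restore the caller's encoding (UTF-8 in the task above) even if the command fails.
        [Console]::OutputEncoding = $previous
    }
}
```

Usage would then look like Invoke-WithOemEncoding { ipconfig }. I haven’t tried this inside the scheduled task itself, so treat it as a starting point rather than a verified fix.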
