Everybody likes pipes. Ever since Douglas McIlroy introduced the concept in 1972, pipes have been an irreplaceable tool for hooking programs together. Pipes just work. Well... at least in UNIX. But you know, this is a blog about me being stuck in Windows.
Real Pipes in UNIX
The main idea of a pipe is that you take the output of one program and connect it to the input of another program. The pipe itself does nothing more than forward data. When the output of one program does not fit the input of another, you can first pipe it through a filter program that transforms the data as needed. But the pipes themselves remain only a transport layer, carefully carrying data from one program to another and not changing a bit on the way.
The same holds true for real-world pipes. A good pipe is one that doesn't change the aroma on its way from bowl to your mouth. It's the tobacco you want to smoke, not the pipe.
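To make the "not changing a bit" part concrete, here is a minimal Python sketch of two processes joined by an ordinary OS pipe. The helper `pipe_bytes` and the two tiny stand-in programs are hypothetical, made up for illustration; they are not the PHP script or MySQL client from this post.

```python
import subprocess
import sys


def pipe_bytes(producer_cmd, consumer_cmd):
    """Run `producer | consumer` over an OS-level pipe and return the consumer's stdout."""
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(
        consumer_cmd, stdin=producer.stdout, stdout=subprocess.PIPE
    )
    # Drop the parent's handle on the pipe; only the two children use it now.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output


# Hypothetical stand-ins: one program writes fixed bytes, the other copies stdin to stdout.
PAYLOAD = b"unix line\nDOS line\r\n\x00\xff binary"
WRITER = [sys.executable, "-c", "import sys; sys.stdout.buffer.write(%r)" % PAYLOAD]
COPIER = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())",
]

received = pipe_bytes(WRITER, COPIER)
print(received == PAYLOAD)
```

The payload deliberately mixes both kinds of line endings with bytes that are not valid text, since those are exactly the things a transport layer should leave alone.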
I had a PHP script that generated test data for a MySQL database. In UNIX I would have used it as follows:
php create-test-data.php | mysql dbname
I thought that this should also work with PowerShell.
Water Pipes in PowerShell
PowerShell pipes work more like water pipes. The smoke that comes in is sucked through water, changing its aroma and softening the bitter taste. Water pipes are great, but you shouldn't try to sell them labeled as normal pipes.
When I ran the above command in PowerShell, the data that the MySQL database received wasn't exactly what the script had generated. That is, if the database received it at all, because the thing crashed along the way.
Actually, I didn't even have to pipe it to another program. Just redirecting the output to a file changed it considerably:
php create-test-data.php > test-data.sql
Let Me Encode This for You
The first problem was encoding. The output of the script was UTF-8 encoded. PowerShell wanted to convert the text into its internal UTF-16 representation and then convert it to yet another encoding when saving to a file, because for PowerShell the > operator is equivalent to piping your output to Out-File:
php create-test-data.php | Out-File test-data.sql
Out-File has an -Encoding parameter, which accepts values such as unicode, utf8, ascii, default and oem. I tried all of them, and the only one that preserved my encoding was oem, which designates a single-byte encoding.
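One plausible reading of why only oem survived can be sketched in Python. This is a model under stated assumptions, not a description of PowerShell internals: it assumes the shell decodes the script's bytes with the console's OEM code page (taken here to be cp437) and then re-encodes the resulting text when writing the file.

```python
# Assumption: the console's OEM code page is cp437 (it varies by locale).
OEM = "cp437"

# The UTF-8 bytes the script emits: "naive cafe" with i-diaeresis and e-acute.
original = "na\u00efve caf\u00e9".encode("utf-8")

# Step 1 (assumed): the shell decodes the bytes with the OEM code page, not UTF-8.
as_text = original.decode(OEM)

# Step 2: the text is re-encoded on its way to the file.
saved_as_unicode = as_text.encode("utf-16")
saved_as_utf8 = as_text.encode("utf-8")
saved_as_oem = as_text.encode(OEM)

print(saved_as_unicode == original)
print(saved_as_utf8 == original)
print(saved_as_oem == original)
```

In this model a single-byte code page that assigns a character to every byte value maps the bytes to text and back without loss, so the wrong decoding is undone by the matching encoding. Every other target encoding bakes the misreading into the file.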
Let Me Correct Those Lines
The philosophy of UNIX has been: it's all text. The input of every program is text and the output of every program is text. Except when it's not, and when it's not, you can't use all the common UNIX text-processing tools on it. Instead you have to use separate tools specific to your binary format. For example, you can use ImageMagick to apply all kinds of transformations to images.
This philosophy was recognized by the PowerShell team as one of the great weaknesses of the UNIX pipeline. And therefore the mantra of PowerShell has been: it's not all text. It has been the great promise of PowerShell that it will enable you to work more easily with all kinds of data, not just text.
And this all works out fine when your program outputs .NET objects. But when it doesn't, the output is treated as text. And it's a water pipe, as you remember, so the text isn't just left alone: it's transformed into a .NET array of lines.
That doesn't look too bad, does it? But the trick is that when the text is split into lines, the line separators are discarded, and when it is put back together, all lines are joined with \r\n. For example, this input:
my name is Rene.\r\n
I'm the author of this blog.\n
...will be converted to this output:
my name is Rene.\r\n
I'm the author of this blog.\r\n
First of all, this is dosification: all your nice UNIX line endings get converted to ugly DOS line endings. Really annoying. And not just annoying, but terrible. I think you already know where this is going: binary data.
The SQL generated by my script also included binary data. And you can imagine what this kind of conversion can do to binary data.
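The split-and-rejoin behavior described above can be modeled in a few lines of Python. The function `split_and_rejoin` is a hypothetical stand-in written for this illustration, and the byte sequence is arbitrary; the model assumes lines are split on \n or \r\n and that every line, including the last, gets \r\n appended.

```python
import re


def split_and_rejoin(data: bytes) -> bytes:
    """Model of the round trip: split into lines, drop separators, re-join with CRLF."""
    lines = re.split(rb"\r?\n", data)
    # A trailing newline yields one empty final piece; it is not a line of its own.
    if lines and lines[-1] == b"":
        lines.pop()
    return b"".join(line + b"\r\n" for line in lines)


text_in = b"my name is Rene.\r\nI'm the author of this blog.\n"
text_out = split_and_rejoin(text_in)

# A few arbitrary bytes standing in for binary data embedded in the SQL.
blob_in = bytes([0x89, 0x50, 0x0A, 0x1A, 0x0D, 0x0A, 0x00])
blob_out = split_and_rejoin(blob_in)

print(text_out)
print(blob_in.hex(), "->", blob_out.hex())
```

For the text sample the damage is only cosmetic. For the binary sample every lone 0x0A byte grows a 0x0D in front of it and an extra line ending appears at the end, so the data no longer has the same length, let alone the same content.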
There is no built-in mechanism in PowerShell to overcome this problem, although the problem is well known and there exist some ugly workarounds.
At the end of the day PowerShell had failed me. It was the first real job I wanted to do with PowerShell and it failed completely.
If PowerShell really wants to succeed, this behavior has to be corrected. Please, PowerShell, no water inside my pipe.
Thanks to paws22 for sharing the above photo on Flickr under a Creative Commons Attribution-NonCommercial-ShareAlike license.