Replies: 25 comments 14 replies
-
|
Basing it on what the current script uses isn't the greatest of ideas. What happens when one script calls another, and one is Unix-style LF and the other Windows-style CR LF?
|
-
|
I would like a global "don't corrupt my text data without my express permission" preference for PowerShell. It still confuses me that the current behavior isn't considered a bug. All output should be treated as binary unless I explicitly pipe it into a PowerShell command that expects its input to be objects rather than data. I.e., the behavior should be based on what is consuming the data, not based on having been invoked by PowerShell. If the data is being consumed by the filesystem (i.e., |
-
|
There has been some discussion on this issue particularly when piping from one native app to another - PS should step out of the way. However, the scenario that @btjwork brings up is a bit different. The |
-
|
Yes, I was aware. I was hinting that I wish I could override (Note that the title of the original issue includes "or file".) |
-
But this is exactly my point. I only use PowerShell because I have to in order to use prepackaged functionality that requires it. If library scripts I am calling expect the piping behavior to be generating CR LF (let's say because an internal tool or function would croak if passed data with only LF) then I don't want those to break. But I want my scripts to not be generating CR LF files, because I don't want them anywhere on the shared drives. The proposal has the possibility of converging to a place where if there are no CR LF scripts on a system, there will be no CR LF piping. I'm not saying it's a strategy that would please everyone. But in my world it allows for a way that people collaborating a project can make the decisions for themselves based on how a file is represented on their drive while using the same script code. If you choose to use this option then you can collaborate with people who insist on CR LF while being able to maintain LF-only in your own world. |
-
Yes, though as mentioned some prepackaged functionality may depend on the corruption to operate. :-/ It's in some sense a separate point from whether CR LF are generated from thin air. but PowerShell itself isn't a neutral party if it generates any text with newlines at all which can be redirected. (e.g. some kind of multi-line echo of material from within a script). If it injects CR LF in that without any way to stop it...even if the source script was LF only...then that is undesirable. So I'm proposing "Letting People Vote With Their Feet" ...e.g. how they represent line endings in their script on disk. This even permits individuals working with the same code to make different choices, and I think the Git autocrlf setting dovetails nicely with this. (My true wish is that along with standardizing on UTF-8 with no byte order mark (as it should be!), the cross-platform focus had taken that extra step to standardizing on LF only so the files could be exchanged between platforms without transformation. That would have been ideal, and solving the binary corruption would be a nice bonus of that. But I would guess that is a bridge too far, so this is a proposal that could let people inch stepwise toward that world.) |
-
|
If $PSDefaultParameterValues['Out-File:LineTerminator'] = 'Lf' # THIS PARAMETER DOES NOT EXIST ATM
This would work with both Currently, Out-File is calling through to .NET's Caveat - any other cmdlets that write to files e.g. |
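Until something like that parameter exists, one possible workaround is to render the pipeline to strings yourself and join them with an explicit LF. The sketch below is only an illustration: `Out-FileLf` is a hypothetical helper name (not a built-in cmdlet), and it assumes you want UTF-8 without a BOM and are running PowerShell 5.0 or later.

```powershell
# Hypothetical helper: writes pipeline output with LF line endings on any OS.
function Out-FileLf {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)]
        [string]$Path,

        [Parameter(ValueFromPipeline)]
        [object]$InputObject
    )
    begin   { $items = [System.Collections.Generic.List[object]]::new() }
    process { $items.Add($InputObject) }
    end {
        # Render the collected objects to text, one string per output line.
        $lines = @($items | Out-String -Stream)
        # Join with LF explicitly instead of relying on the OS default.
        $text = ($lines -join "`n") + "`n"
        # Resolve relative paths against the PowerShell location, not the process directory.
        $fullPath = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($Path)
        [System.IO.File]::WriteAllText($fullPath, $text, [System.Text.UTF8Encoding]::new($false))
    }
}

# Usage: 'alpha', 'beta' | Out-FileLf -Path ./lines.txt
```

Because this buffers everything before writing, it is only reasonable for modest amounts of output, and it does nothing for other cmdlets that write files.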
-
|
I like the direction @rkeithhill is going for this particular scenario - I think this should be an explicit setting that the user and/or script writer can control. And @AE1020, this would be consistent with the git behavior you use as an example: it doesn't "guess" what line ending to use based on other files on the same filesystem, it instead makes an explicit consistent decision based on user preference settings. This would also allow the top-level script to include the logic to "look around" and automatically initialize the setting based on whatever rules the developer chooses. For example, could literally look at the git settings, rather than trying to infer them. |
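To make the "look at the git settings" idea concrete, a top-level script could do something like the following. This is a sketch under assumptions: `$LineTerminatorPreference` is a hypothetical variable that PowerShell does not read today, `ConvertTo-LineTerminator` is an illustrative helper name, and the mapping from `core.autocrlf` values to terminators is one possible policy, not anything git or PowerShell prescribes.

```powershell
# Hypothetical mapping from git's core.autocrlf value to a line terminator.
function ConvertTo-LineTerminator {
    param([string]$AutoCrlf)
    switch ($AutoCrlf) {
        'true'  { return "`r`n" }                  # git converts to CR LF on checkout
        'input' { return "`n" }                    # git normalizes to LF on commit
        default { return [Environment]::NewLine }  # unset or 'false': use the OS default
    }
}

# Ask git for the effective setting, if git is available at all.
$autoCrlf = $null
if (Get-Command git -ErrorAction SilentlyContinue) {
    $autoCrlf = git config --get core.autocrlf
}

# Hypothetical preference variable; nothing in PowerShell consumes it today.
$LineTerminatorPreference = ConvertTo-LineTerminator -AutoCrlf $autoCrlf
```

The point is that the decision is made explicitly, once, by the developer's own rules, rather than inferred from the bytes of whichever script happens to be running.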
-
|
BTW, sorry to all if my first responses were confusing - I got here from @AE1020's comment on #1908, where it seemed that this was being suggested as an alternative solution to that scenario. I personally think this is an unrelated (though valid) scenario. I've been waiting years for #1908 to be resolved and didn't want to see it get side-tracked by this proposal. |
-
If you have a preference variable which is And since you will need to set that preference in a script, why ask PowerShell to examine the code and try to work out what it is? Why not just have $LineTerminatorPreference = Unix / $LineTerminatorPreference = Windows?
That isn't the behaviour. Windows files (like DOS files before them and CP/M files before that) use CR LF as a line separator. PowerShell doesn't have < to lash a file to std-in and ">" is (as already said) syntactic sugar for When you have a bunch of text strings and you want to send them to a non-PowerShell program, or to a file, PowerShell inserts the local OS's new line (LF or CR LF).
So you have a quest to prevent Windows programs writing output in the way Windows programs naturally write. If you work with a sufficiently small set you might be able to achieve that. PowerShell on Linux (or in WSL) would do what you want, but PowerShell on Windows thinks it is being helpful, and isn't doing what you want.
Like I said it works both ways. If you come against someone who demands all LF and you're running on a Windows machine you should be able to please them, and if you're on a linux machine and the demand is for all CRLF that should be possible as well. Right now if the default for the OS you're on isn't the desired one you're stuck. There are plenty of harder things to change. |
-
I'm proposing a global setting for powershell. (It doesn't make sense to put this in the scripts themselves.)
As the problem statement says that's what I'd like to see. And I proposed an implementation path for this that doesn't require putting anything in the scripts themselves. (Based on git crlf settings, the crlf setting isn't something considered "in the script" but an out-of-band characteristic that version control can manage invisibly.) If there's another way to do it, great. But I think my suggestion would work. |
-
I was thinking it works two ways. I can put a setting in my profile to ensure I always use one form, or I can set it at the command line if I am outputting for a server which requires one way. But if I'm writing scripts I'm going to live by the mantra "assumption is the mother of error" and ensure option is set at the start of the script. And when the option is on it wants to be on for everything, compiled cmdlets run from the prompt, 3rd party modules which use the other line ending etc. etc. |
-
|
Line-ending behavior comes from OS fundamentals: it is how the OS abstracts devices. Line endings are one of many such abstractions. Anyone thinking about wrapping fundamental OS abstractions should understand that it is very, very expensive. From a PowerShell design point of view, this would mean adding a thick layer of abstraction over OS fundamentals. While this looks tempting, I think it is unreasonably expensive to do this kind of thing in PowerShell. For this issue, it would mean that you should request improvements to the NFS client from the Windows team to support line-ending translation.
-
I wouldn't ask PowerShell to be changing anything that would be costly. I'm just asking it not to introduce any new CR LF sequences into the mix. CMD.EXE seems able to do this...if Program A emits LF and then you pipe it into Program B it will receive LF, not CR LF. So nothing expensive is being asked for...if anything, I'm asking it to be cheaper. |
-
|
While there have been negative responses to this, no one has proposed a solution to my issue: How am I to ask PowerShell not to produce CR LF in my scripts on file shares, which run from many different directories, while not disrupting existing packaged functionality (other people's scripts) that is part of the PowerShell distribution or otherwise (e.g. not on the shares, but relating to the Windows OS install)? I do not see why there is such a negative reaction to the idea that I would encode my desire by virtue of the CR LF disposition of the script that is running. It is only an option, after all. But I would accept another approach so long as it solved the problem. FWIW, approaches that require any editing of script code to say what option you want to use explicitly do not really give the feature I seek (e.g. to share the identical script code between a Windows user who wants all their piped products to involve CR LF, and a user who is in a hybrid environment and wants to use that script on a share where things are always LF even on Windows). The code being the same, but only checked out via git using different translation settings, is the point. As I understand it, PowerShell made a shift away from UTF-16 encoding of its redirection (with Byte Order Mark) to the much preferable UTF-8 with no Byte Order Mark. This was done to be more cross-platform friendly. But without going that extra mile of embracing LF-only, it really doesn't provide the desired ability of being able to use the same bytes on all platforms.
-
|
Last point first. Windows tried to move to Unicode for everything, and Windows PowerShell still defaults to a 2-byte character set with a byte order mark. PowerShell 6 and 7 default to UTF-8. Native Windows programs use CRLF line breaks. To be cross-platform, .NET (not PowerShell itself, which is just something built on .NET) says "On this OS, use this for line break" - so CRLF on Windows, LF on Linux - and well-behaved programs check this instead of inserting the combination for their author's OS. In PowerShell's case objects (not text) come down the pipeline, and when they need to be converted to text You are looking at this through the lens of the problem you have - "I run script.ps1 > file.txt, and I want a way to say that '>' should use a line separator which isn't the OS default. I don't want to run PowerShell on an OS (or OS subsystem) where that is the default. And I don't want to change the script, but I think how the script is encoded might be a good idea for toggling it". Which (a) is a corner case, so unlikely to get a change made, and (b) isn't an optimal solution: > is syntactic sugar for | Out-File, so the solution would be that Out-File has to check an option, look down the pipeline (ignoring where, sort or similar cmdlets) to see if there is a script, form a view on the line breaks it uses, output that line break between objects, and, if the output of the object-to-text conversion has wrapped over lines, replace any line breaks there as well. You can replace Or you can even just follow |
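For anyone who wants to see what .NET reports as "the line break for this OS", here is a small check. It only inspects `[Environment]::NewLine`; the variable names are illustrative.

```powershell
# Show the bytes of the platform line break that .NET exposes.
$newLineBytes = [System.Text.Encoding]::ASCII.GetBytes([Environment]::NewLine)
$hex = ($newLineBytes | ForEach-Object { $_.ToString('X2') }) -join ' '
"Environment.NewLine on this platform: $hex"
```

On Windows this should report the CR LF pair (0D 0A), and on Linux or macOS a single LF (0A), which is the difference this whole thread is about.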
-
You might think I am asking something more complicated than I am asking. My concept was that the Nothing would ever be sensing the wishes of some other script being piped into.
Indeed, to me a CRLF file and an LF file are fully distinct file formats. Because of this I choose to treat CRLF files to be as foreign as if they were encoded in any random codepage, as CR LF is an empirically worse format. It goes beyond just annoying Linux/Mac users. The number of programming edge cases introduced by having a combo-character as an end of line marker are many. You have to worry about what to do if you hit just a CR, or just an LF, among other things. Every piece of code that deals with line breaks is simpler and less prone to failure with a single character end of line marker.
The Windows kernel does not hardcode CR LF. CMD.EXE does not throw in CRLF in piping, and NOTEPAD.EXE can handle LF-only files. So if you have ten .EXE programs that between them use no LF line endings, it seems sensible that a shell which ties them together could operate agnostically. 🤷 |
-
| doesn't know where it is, and it is the PowerShell to the right of the | (out-file) which adds line breaks. When sending to a non-PowerShell program which doesn't understand date/user/process objects there's a shim on the right of | - objects go through the engine used when calling "Out-string" or "Out file" to reduce them to text and the lumps of text get OS standard line breaks between them.
I think that you were saying
Having dealt with this for over 40 years now I can tell you neither is "empirically" better or worse than the other. Some makers of printed paper terminals had separate operations for rolling the paper up a line and moving the print head all the way left and some merged them. Like little-endian/bigendian byte ordering, or US / everywhere else date formats or , and . swapping their roles in numbers, we cope.
40 years dealing with two systems... So I know. You need to go back and tell Gary Kildall that the OSes which influenced him in designing CP/M got it wrong. Tell the IBM execs who wanted a CP/M clone for the PC (or Gates and Co who were figuring out how to deliver it) that they needed to change it (though I think everything IBM used separate characters). Tell them again when OS/2 was in the works, or get the Idea of Dave Cutler when Windows NT was growing out of its ashes. Without the aid of a time machine - that boat has sailed. Every programmer is familiar with "What idiot chose to use the other way" and almost every programmer learns to handle it (and that handling is effort which could have been saved if the first Unix systems had been developed with a different kind of terminal, but like I said the boat has sailed).
Notepad will read LF-Only, (after years of lobbying) but creating one from scratch... you can set the encoding on save, but maybe I've missed an option for selecting LF-Only.
Except that doesn't happen. A program which runs on Windows and takes input from std-in (keyboard < or | ) but can't take CR LF input is pretty useless, because for a start if you just run it in CMD, the return key will be presented with CR LF. If the output came from another Windows Program, it will be CR-LF etc. PowerShell isn't exactly agnostic: it works the way the OS it finds itself on usually works. If it finds itself running with programs which can't work with the OS defaults, things will go wrong. The subtle difference between a file or program's output being one giant string with line break characters in it, and many strings with a line break marking the boundary between adjacent ones is more likely to cause a problem than saying "When I need to output many strings I mark the boundary between adjacent ones with the OS's line break" |
-
That seems unfortunate for anything that might like to debug scripts, or give error messages citing file/line of offending scripts. But this being the case, then my proposal as I intended it would not be technically possible. It was simply saying that if
I've been programming a long time as well, and I guess we will have to agree to disagree. Time machines aren't needed to take advantage of transitional moments to improve things. e.g. I think if PowerShell 6/7 was going to switch from piped output from UTF-16 with BOM to UTF-8 with no BOM in the service of multi-platform behavior, that would have made a nice time to canonize the line endings. (Also if I were writing a shell, the first time I noticed I couldn't pipe binary output, my reflex would be to force a reckoning in the design to not have that happen. The needs of line printers would not enter my considerations, unless I had important users working with line printers!)
Hopefully the desire w.r.t. line endings can be kept in mind as a factor when weighing solutions to things like #1908 And perhaps some feature of script debugging would also be implemented where a |
-
|
I don't like the OP proposal of the script or pipe behaving differently based on the script's LF endings. However the OP has a point, and for some reason, great minds are saying a solution shouldn't happen. As for if the pipe doesn't behave as it should (aka is it a bug), I say YES. If the newline is defined as CRLF, and a bare LF is found in the input stream / file, this should NOT be considered a line ending. And if it isn't considered a line ending, then it should be output as part of the stream unchanged. Any CRLF's encountered would be treated as EOL, and when the line hits the output, a CRLF is added meaning there is no change in the file (for this issue anyway). 4 Proposals
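A tiny sketch of the behaviour argued for above, assuming the newline is defined as CR LF: split only on the CR LF pair, leave bare LFs inside their record, and re-emit CR LF between records. This is just string handling to illustrate the idea, not how the PowerShell pipeline is implemented.

```powershell
# Input with one bare LF and two real CR LF line endings.
$inputText = "one`ntwo`r`nthree`r`n"

# Split only on the CR LF pair; the bare LF stays inside the first record.
$records = $inputText -split "`r`n"

# Re-emit CR LF between records; the bare LF passes through untouched.
$roundTrip = $records -join "`r`n"

"Records: $($records.Count), unchanged: $($roundTrip -ceq $inputText)"
```

With this rule the round trip is byte-for-byte identical for CR LF input, even when stray LFs are embedded in it.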
My 2c - let's see if this (or any other idea) goes somewhere. |
-
I had built the executable to test and make sure it wasn't introducing the CR's. So having that on hand, I was able to run these commands and yes, they seem to work.
Hopefully so! |
-
|
I have wide UNIX shell script experience and suggest that PowerShell is a completely different type of shell. In UNIX everything is a stream of bytes. Files, pipes, network connections, devices etc. However PowerShell really deals with objects with properties. The pipe operator in PowerShell is for passing an output object or collection from one component to another, it is not for passing on byte streams. For example it does not even support the "<" operator. I suggest to treat the files as files and get your components to read the file, process it and write the output to another file. Don't use the PowerShell pipe. You can still invoke cmd.exe from PowerShell to do piping in the traditional byte-stream way. |
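As a rough illustration of "treat the files as files", the following reads and writes raw bytes through .NET, so no line-ending translation can occur. The file names are placeholders in the temp directory, and the sample payload is made up for the demo.

```powershell
$tempDir = [System.IO.Path]::GetTempPath()
$source  = Join-Path $tempDir 'lf-roundtrip-source.bin'
$target  = Join-Path $tempDir 'lf-roundtrip-target.bin'

# Sample payload containing bare LF bytes (0x0A) plus non-text bytes.
[System.IO.File]::WriteAllBytes($source, [byte[]](0x61, 0x0A, 0x62, 0x0A, 0x00, 0xFF))

# Read, (optionally) process, and write as bytes, bypassing the object pipeline.
$bytes = [System.IO.File]::ReadAllBytes($source)
[System.IO.File]::WriteAllBytes($target, $bytes)

# Compare source and target byte-for-byte via their hex representations.
$same = [System.BitConverter]::ToString($bytes) -eq
        [System.BitConverter]::ToString([System.IO.File]::ReadAllBytes($target))
"Bytes preserved: $same"
```

This avoids the pipe entirely, at the cost of holding the whole file in memory.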
-
|
I would like to see something done in this sense. It is unacceptable that a shell generates different outputs depending on the underlying operating system. Powershell should provide a way to create text files in a consistent way either CRLF or LF. Having a line ending setting would be a good way to ensure a consistent way of generating text files independently from the underlying OS. |
-
|
So, I've lost track in the discussion drift...
|
-
|
I don't like the idea of using the line terminator that the script uses. What if a script running on Windows needs to process nonstandard STDOUT streams? For example, robocopy /Unicode produces lines terminated only by LF. The powershell script should be able to specify the line terminator to be used for input, for output and for parsing the output of an external command. |
-
Summary of the new feature / enhancement
I work with source code and files for cross-platform projects that are standardized on UNIX line endings (LF) instead of "historical Windows" ones (CR LF). (Notepad.exe has supported LF-only files since 2018.)
My files are kept on shared filesystems that are accessed simultaneously by Windows, Linux, and MacOS systems.
Where possible I use bash for Windows when scripting, falling back on CMD.EXE occasionally. Yet increasingly PowerShell is needed, e.g. deprecations of things like ODBCCONF.EXE mean I have to access "cmdlets" like Add-OdbcDsn.
But there appears to be no way to ask PowerShell not to use CR LF line endings on Windows when doing redirects. In piping it will translate incoming LF to CR LF, which also leads to binary file corruption (see #1908).
Personally I never want this on any platform. I consider LF endings canon, and certainly prefer no-op pipes produce the same bytes they get in.
However, I'd assume that a global setting asking for LF on Windows always would break expectations of some packaged functionality. That functionality is why I'm using PowerShell in the first place, so no good.
Proposed technical implementation details
I propose an OPTIONAL mode of operation in which the LF / CR LF usage of the source code of the running PowerShell script itself dictates its behavior. If the script source is LF-only, then it thinks it lives in an LF-only world and likely wants to live by LF-only expectations for piping. If the script is CR LF, then its redirection and piping favor that.
This would allow PowerShell's behavior to piggy-back on the existing git setting for autocrlf translation that cross-platform developers already use. Windows users who turn it on probably want CR LF in redirects when they run a script. Windows users who turn it off probably want just LF on their filesystem always.
I don't know how something like Add-OdbcDsn works, but if it did piping on some .INI file that is CR LF based on Windows, my hope would be it would still work under this strategy...as a library script presumably is a CR LF file itself. So I could use a packaged piece of functionality like that in a script which itself was written to use LF only.
Maybe if this option worked out well it could someday become a default. It would mean less conditional code between the Unix and Windows versions. As a first step I'd like it fine as something I could turn on globally and leave on.
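To illustrate the detection step the proposal relies on, here is a sketch of how a script file's line-ending style could be classified. `Get-LineEndingStyle` is a hypothetical helper, not an existing cmdlet, and how PowerShell itself would act on the result is left open; the path is assumed to be absolute (for example `$PSCommandPath` inside a running script).

```powershell
# Hypothetical helper: classify a file as 'LF', 'CRLF', 'Mixed', or 'None'.
function Get-LineEndingStyle {
    param(
        [Parameter(Mandatory)]
        [string]$Path
    )
    $text = [System.IO.File]::ReadAllText($Path)
    $crlfCount = [regex]::Matches($text, "`r`n").Count
    # Counts every LF, including the LF half of each CR LF pair.
    $lfCount = [regex]::Matches($text, "`n").Count

    if ($lfCount -eq 0)          { return 'None' }   # no line breaks at all
    if ($crlfCount -eq 0)        { return 'LF' }     # only bare LF
    if ($crlfCount -eq $lfCount) { return 'CRLF' }   # every LF is preceded by CR
    return 'Mixed'
}

# Usage inside a script: Get-LineEndingStyle -Path $PSCommandPath
```

A 'Mixed' result is the awkward case for this strategy, and the optional mode would need to define a fallback for it, presumably the OS default.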