The Artima Developer Community
Agile Buzz Forum
LineEndWhat?

James Robertson

LineEndWhat? Posted: Nov 14, 2005 1:45 PM

This post originated from an RSS feed registered with Agile Buzz by James Robertson.
Original Post: LineEndWhat?
Feed Title: Travis Griggs - Blog
Feed URL: http://www.cincomsmalltalk.com/rssBlog/travis-rss.xml
Feed Description: This TAG Line is Extra
Line end conventions, they're kind of cool. In VisualWorks, you can set one of five line end conventions on a stream. The basic idea is that while the OS world plays around with the LF character, VisualWorks internally just uses CRs, so a lineEndConvention maps something like the Windows CRLF standard to internal CRs. This notion is supported by the fact that you can only send the lineEndWhatever messages to instances of ExternalStream, not to a stream on a Smalltalk string object. To play with them, you can use a test fragment something like:

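"Write the file with no line end translation, so each CR and LF lands on disk as written."
ws := 'text.txt' asFilename writeStream.
ws lineEndTransparent.
ws nextPutAll: 'ABC'; cr.
ws nextPutAll: 'DEF'; cr; nextPut: Character lf.
ws nextPutAll: 'GHI'; nextPut: Character lf.
ws nextPutAll: 'XYZ'.
ws close.

"Read it back. Swap lineEndCR for another convention to compare the results."
('text.txt' asFilename readStream) lineEndCR; contents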

Varying the final lineEndThing (the one sent to the read stream, not the first one) allows you to play with it.

+ lineEndCR - This doesn't do much. It converts CRs into, well, CRs I guess.
+ lineEndLF - Converts LFs into CRs. So you get an empty line between DEF and GHI.
+ lineEndCRLF - Converts the CRLF sequence into a single CR, but does nothing to the final solitary LF.
+ lineEndTransparent - Does nothing. Kind of like lineEndCR, but clearer about it.
+ lineEndAuto - This basically scans any characters that might already be in memory. Upon the first encounter of an LF or a CR, it checks the next character and then determines whether the convention is CR, LF, or CRLF. In this case it's lineEndCR and there's no change. If we rearrange the lines, so that different line end patterns come first, we can watch lineEndAuto do different things.
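
For readers without a VisualWorks image handy, the behavior described above can be modeled in a few lines of Python. This is only a sketch of the conventions as this post describes them, not VisualWorks code: the names `detect_convention` and `read_with` are made up for illustration, and the fallback when no line end is found at all is an assumption.

```python
CR, LF = "\r", "\n"

def detect_convention(raw):
    """Guess the convention from the first line end seen (the lineEndAuto idea)."""
    for i, ch in enumerate(raw):
        if ch == CR:
            # A CR immediately followed by LF means CRLF; otherwise plain CR.
            return "CRLF" if raw[i + 1:i + 2] == LF else "CR"
        if ch == LF:
            return "LF"
    return "CR"  # assumption: with no line ends at all, nothing needs converting

def read_with(raw, convention):
    """Map the external line end onto the internal CR."""
    if convention == "Auto":
        convention = detect_convention(raw)
    if convention == "LF":
        return raw.replace(LF, CR)
    if convention == "CRLF":
        return raw.replace(CR + LF, CR)
    return raw  # "CR" and "Transparent" leave the characters alone

# The characters the test fragment writes under lineEndTransparent.
raw = "ABC" + CR + "DEF" + CR + LF + "GHI" + LF + "XYZ"

for mode in ("CR", "LF", "CRLF", "Transparent", "Auto"):
    print(mode, repr(read_with(raw, mode)))
```

Printing with `repr` keeps the CRs and LFs visible, so the doubled CR after DEF under the LF convention is easy to spot.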

I actually got burned by all this recently. We have these simple little text files for settings. Our technicians often author them on Windows and then move them over to the production machines, which run Linux. The Windows editors emit CRLF patterns. Sometimes we write them under Linux, in which case they have just LFs. And from time to time, I write them on my Mac, and those have CRs. Until recently, we used lineEndAuto on those files, and everything just worked.

But the other day, we had a case where a Windows-authored file was edited by a Linux editor. The result was a file that was mostly CRLF terminated, but one line ended with just a single LF. lineEndAuto decided it was a CRLF file, and so did not interpret the lone LF as a CR. Which meant we got a merged line and a parse error. So lineEndAuto didn't work so well. In the end, I switched to lineEndLF, because for our case, that is actually the most "transparent" thing to do. Most structured text files have to deal with empty lines anyway. By going with lineEndLF, all CRs are still CRs, and LFs become CRs too. Which means that I get a lot of "blank lines" when reading a Windows-authored file, but I don't care because I skip the empty lines anyway.
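
The failure is easy to reproduce outside Smalltalk. Here is a rough Python sketch, assuming a hypothetical three-line settings file (the keys and values are invented) and a parser that splits on CR and skips empty lines, which is how the post describes the reader.

```python
CR, LF = "\r", "\n"

# Hypothetical file: written on Windows (CRLF), then one line re-saved on Linux (lone LF).
raw = "port=8080" + CR + LF + "host=example" + LF + "debug=true" + CR + LF

def lines(text):
    """Split on the internal CR and drop empty lines."""
    return [line for line in text.split(CR) if line]

# What a CRLF reading (the one lineEndAuto settled on) produces:
as_crlf = lines(raw.replace(CR + LF, CR))

# What an LF reading produces: every CR stays, every LF becomes a CR.
as_lf = lines(raw.replace(LF, CR))

print(as_crlf)  # two entries; the second still has an LF buried in it
print(as_lf)    # three clean entries
```

Under the CRLF reading the lone LF survives inside one merged line, which is the parse error. Under the LF reading each CRLF turns into two CRs, and the extra empty lines get skipped.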

It does make you wonder why there's not a mode that just does all three of them. LFs are CRs. CRs are CRs. And CRLF sequences are reduced to single CRs. I guess there's the pathological case where someone wanted a blank line, but switched end convention right in the middle.
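
A mode like that is simple to sketch. The Python below is only an illustration of the idea, not something VisualWorks provides; `normalize_line_ends` is a made-up name.

```python
import re

CR, LF = "\r", "\n"

def normalize_line_ends(text):
    """Reduce CRLF to a single CR, turn any lone LF into a CR, leave lone CRs alone."""
    # The alternation tries CRLF first, so a pair is consumed as one line end.
    return re.sub(r"\r\n|\n", CR, text)

mixed = "ABC" + CR + "DEF" + CR + LF + "GHI" + LF + "XYZ"
print(repr(normalize_line_ends(mixed)))

# The pathological case: a CR-terminated line followed by an empty LF-terminated
# line looks exactly like one CRLF, so the intended blank line disappears.
print(repr(normalize_line_ends("A" + CR + LF + "B")))
```

The reverse order, an LF followed by a CR, is not a CRLF pair, so it still comes out as two line ends.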

I also find it "interesting" that when Apple "went Unix" they stuck with the CR line ending, rather than going with the Unix standard LF.

