
Re: [CSV] Inconsistent record separator behavior


I will try to look at the code and give a better answer over the weekend. At the risk of asking a silly question: would this mean that users cannot parse a CSV file unless its rows are separated by LF or CRLF? Some time ago I got a CSV file from a government website that was formatted in a very strange way. If I remember correctly, it was a small file with no LF or CRLF at all; I think it used | to separate the rows and , to separate the columns.
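
To make that case concrete, here is a minimal plain-Java sketch of reading such a file. It does not use Commons CSV; the class name PipeSeparatedCsv and the parse helper are made up for illustration, and the sketch assumes the data contains no quoting or escaping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class PipeSeparatedCsv {

    // Hypothetical helper: splits the text into records on recordSeparator,
    // then splits each record into fields on delimiter.
    // Assumption: no quoted fields and no escaped separators.
    static List<List<String>> parse(String text, String recordSeparator, String delimiter) {
        List<List<String>> records = new ArrayList<>();
        // Pattern.quote is needed because String.split takes a regex and | is a metacharacter.
        for (String record : text.split(Pattern.quote(recordSeparator))) {
            if (record.isEmpty()) {
                continue; // skip empty records, e.g. from doubled separators
            }
            // Limit -1 keeps trailing empty fields such as "a,b,".
            records.add(Arrays.asList(record.split(Pattern.quote(delimiter), -1)));
        }
        return records;
    }

    public static void main(String[] args) {
        // Rows separated by |, columns separated by , and no line breaks at all.
        System.out.println(parse("id,name|1,alice|2,bob", "|", ","));
        // prints [[id, name], [1, alice], [2, bob]]
    }
}
```

A file like this can only be read if the record separator is configurable on the parsing side as well, which is the concern behind the question above.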

A quick search turned up at least one other person with a similar issue: https://stackoverflow.com/questions/29903202/how-to-read-csv-on-python-with-newline-separator

I am not sure I understood the problem correctly, but in case it makes sense, my suggestion would be to check whether we could change CSVPrinter.printComment to accept other characters as line endings.



From: Benedikt Ritter <britter@xxxxxxxxxx>
To: Commons Developers List <dev@xxxxxxxxxxxxxxxxxx> 
Sent: Tuesday, 21 August 2018 7:13 PM
Subject: [CSV] Inconsistent record separator behavior


We have this strange handling of record separators / line endings in CSV:
Users can use whatever character sequence they like as a record separator.
I could, for example, use the ! character to mark the end of a record.
Then we have CSVPrinter.printComment(String). This inserts comments into a
CSV output. It detects CRLF and calls println() on the CSVFormat, which in
turn uses the record separator to indicate a new record...

So now I'm thinking: Does it make sense to use anything else but LF or CRLF
as record separator? Maybe we should deprecate
CSVFormat.recordSeparator(String) and introduce a LineEnding enum where
users can choose between LF and CRLF. This way we can make the behavior
between parsing and printing consistent.
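
The behavior described in this message can be illustrated with a small plain-Java model. This is a simplified sketch, not the Commons CSV source: the class CommentSeparatorSketch, its printComment method, and the LineEnding enum are hypothetical names, and the model assumes that CR, LF, and CRLF inside a comment are each treated as one line break.

```java
class CommentSeparatorSketch {

    // Hypothetical enum mirroring the LineEnding idea from the message.
    enum LineEnding {
        LF("\n"), CRLF("\r\n");

        private final String value;

        LineEnding(String value) {
            this.value = value;
        }

        String value() {
            return value;
        }
    }

    // Simplified model of comment printing: every line break found in the
    // comment text is written out as the configured record separator.
    static String printComment(String comment, char commentMarker, String recordSeparator) {
        StringBuilder out = new StringBuilder();
        out.append(commentMarker).append(' ');
        for (int i = 0; i < comment.length(); i++) {
            char c = comment.charAt(i);
            if (c == '\r' || c == '\n') {
                if (c == '\r' && i + 1 < comment.length() && comment.charAt(i + 1) == '\n') {
                    i++; // treat CRLF as a single line break
                }
                out.append(recordSeparator);
                out.append(commentMarker).append(' ');
            } else {
                out.append(c);
            }
        }
        out.append(recordSeparator);
        return out.toString();
    }

    public static void main(String[] args) {
        // With ! as record separator, the line break in the comment becomes a !.
        System.out.println(printComment("first\r\nsecond", '#', "!"));
        // prints # first!# second!

        // With a line-ending value, the comment stays on separate lines.
        System.out.print(printComment("first\r\nsecond", '#', LineEnding.LF.value()));
    }
}
```

In this model a two-line comment printed with ! as the record separator ends up on a single physical line, while restricting the separator to the enum values keeps one comment line per output line.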


