Creating LF, NEL line terminators by accident? (python3)


Hi,

I have a Python 3 (using 3.6.7) program that reads a TSV file, does
some churning with the data, and writes a TSV file out.

#v+
print('reading', options.input_file)
with open(options.input_file, 'r', encoding='utf-8-sig') as f:
    for line in f:
        row = line.split('\t')
        # DO STUFF WITH THE CELLS IN THE ROW

# ...

print('writing', options.output_file)
with open(options.output_file, 'w', encoding='utf-8') as f:
    # MAKE THE HEADER list of str
    f.write('\t'.join(header) + '\n')

    for doc_id in sorted(all_ids):
        # CREATE A ROW list of str FOR EACH DOCUMENT ID
        f.write('\t'.join(row) + '\n')
#v-

I noticed that the file command on the output returns "UTF-8 Unicode
text, with very long lines, with LF, NEL line terminators".

I'd never come across NEL terminators until now, and I've never
(AFAIK) created a file with them before.  Any idea why this is
happening?

(I tried changing the input encoding from 'utf-8-sig' to 'utf-8' but
got the same results with the output.)
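
One way to narrow this down is to check whether the data itself contains
U+0085 (NEL) characters. This is only a guess at the cause, not something
confirmed above. Python's text mode splits lines on '\n', '\r' and '\r\n'
only, so a NEL inside a cell would be read and written back unchanged.
Below is a minimal sketch of such a check; the names find_nel and
find_nel_in_file are hypothetical helpers, not part of the program.

```python
NEL = '\u0085'  # NEXT LINE control character; bytes C2 85 in UTF-8


def find_nel(lines):
    """Yield (line_number, column) for each NEL found in an iterable of str lines."""
    for line_no, line in enumerate(lines, start=1):
        col = line.find(NEL)
        while col != -1:
            yield line_no, col
            col = line.find(NEL, col + 1)


def find_nel_in_file(path, encoding='utf-8'):
    """Return every (line_number, column) position of NEL in a text file."""
    # Text mode does not treat NEL as a line ending, so it stays inside the line.
    with open(path, 'r', encoding=encoding) as f:
        return list(find_nel(f))
```

Running find_nel_in_file() on both the input and the output file would show
whether the NEL characters are already present in the input cells or only
appear on the way out.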

Thanks,
Adam


-- 
I am at the moment writing a lengthy indictment against our
century. When my brain begins to reel from my literary labors, I make
an occasional cheese dip.                        ---Ignatius J Reilly