Subject: Re: [bug #12660] byte-offset has an off-by-one

anonymous wrote:

Summary: byte-offset has an off-by-one error

Thanks for trying to help, but this is user error, not a bug.

Project: grep
Submitted by: None
Submitted on: Tue 04/12/2005 at 14:22

the byte offset reported when using the -b option is incorrect. It is off by
one *per line*. (except of course the first line, which correctly reports 0
offset). So the byte offset reported on the Nth line is incorrect by (N-1).

I reproduced this with grep v2.5 on a Solaris system and grep v2.5.1 on a
Linux system.
This can be easily demonstrated.
1. create a small file. e.g. echo '0123456789abcdefghijklmnopqrstuv' > foo
foo will now contain 32 bytes (including the final cr) on a unix system. (DOS will add a lf)

No, that's 32 printable characters, so 33 bytes including the final LF on a Unix system (and DOS will add a CR).
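
If you want to check the count without xxd, something like the following should do. It is only a sketch: count_bytes is a name I made up for illustration, and it pipes the same echo into wc instead of creating foo.

```shell
# Count the bytes that echo writes for the 32-character test string.
# count_bytes is a hypothetical helper name, not part of any tool.
count_bytes() {
    # echo appends one LF; wc -c counts bytes, not printable characters.
    # tr strips the padding that some wc implementations emit.
    echo '0123456789abcdefghijklmnopqrstuv' | wc -c | tr -d ' '
}

count_bytes
```

On a Unix system this should print 33: the 32 printable characters plus the LF.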

2. view foo with xxd to satisfy self that this is true. use 'xxd -c 8 foo' - this will show four rows of 16

16 whats? You mean 16 hexadecimal digits, plus the other information that it shows. And it shows a fifth row showing the final LF byte:

0000000: 3031 3233 3435 3637  01234567
0000008: 3839 6162 6364 6566  89abcdef
0000010: 6768 696a 6b6c 6d6e  ghijklmn
0000018: 6f70 7172 7374 7576  opqrstuv
0000020: 0a                   .

(xxd will display 2 characters per byte)
3. view output of 'xxd -p -c 8 foo' -- satisfy self that this outputs only the contents of the file in rows of 16 chars

It outputs this:

3031323334353637
3839616263646566
6768696a6b6c6d6e
6f70717273747576
0a


which is four rows each of sixteen printable characters plus one LF character, and one row of two printable characters plus one LF character.

4. pipe it to grep as follows:
'xxd -p -c 8 foo | grep -b ".*"'
5. view output - note that although there are only 16 chars per line, grep reports offsets of 0,17,34,51 ...

No, there are 17 characters per line including the LF, so the output is correct.
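
The arithmetic can be checked without xxd. In this sketch the five lines are typed in by hand as a stand-in for the xxd -p output (an assumption on my part), and show_offsets is just an illustrative name. It relies on the -b option of GNU grep.

```shell
# Feed grep five LF-terminated lines: four of 16 hex digits and one of 2,
# standing in for the output of 'xxd -p -c 8 foo' (typed by hand here).
# grep -b prefixes each matching line with its byte offset in the input;
# cut keeps only that offset field.
show_offsets() {
    printf '%s\n' \
        3031323334353637 \
        3839616263646566 \
        6768696a6b6c6d6e \
        6f70717273747576 \
        0a |
    grep -b '.*' |
    cut -d: -f1
}

show_offsets
```

Each full line occupies 16 + 1 = 17 bytes, so the offsets should come out as 0, 17, 34, 51 and 68, which is what grep reported.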

I have closed this bug as invalid.

Please discuss problems on this mailing list before filing a bug in the tracker.

- Julian