[Python-Dev] IDLE colorizer

On 4/1/2018 10:20 PM, Tim Peters wrote:
> [MRAB <python at mrabarnett.plus.com>]
>> A thread on python-ideas is talking about the prefixes of string literals,
>> and the regex used in IDLE.
>> Line 25 of Lib\idlelib\colorizer.py is:
>>      stringprefix = r"(?i:\br|u|f|fr|rf|b|br|rb)?"
>> which looks slightly wrong to me.

This must be a holdover from years ago, before I was involved.  I have 
wondered about it but left it as is.  Thanks for confirming that it is 
not right.

>> The \b will apply only to the first choice.
>> Shouldn't it be more like:
>>      stringprefix = r"(?:\b(?i:r|u|f|fr|rf|b|br|rb))?"
>> ?

See below.

> I believe the change would capture its real intent.  It doesn't seem
> to matter a whole lot, though - IDLE isn't a syntax checker, and
> applies heuristics to color on the fly based on best guesses.  As is,
> if you type this fragment into an IDLE shell:
> kr"sdf"
> only the last 5 characters get "string colored", presumably because of
> the leading \br in the original regexp.  But if you type in
> ku"sdf"
> the last 6 characters get "string colored", because - as you pointed
> out - the \b part of the original regexp has no effect on anything
> other than the r following \b.

I tested with uf versus ur, both of which look plausibly legal but are not.
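
The difference can be reproduced outside IDLE with a rough sketch.
The prefix pattern is the one quoted above; SIMPLE_BODY and the
colored() helper are illustrative assumptions, not the pattern or
code that colorizer.py actually uses for string bodies.

```python
import re

# Prefix pattern as quoted from line 25 of idlelib/colorizer.py.
ORIGINAL_PREFIX = r"(?i:\br|u|f|fr|rf|b|br|rb)?"

# Assumption: a simplified double-quoted string body for illustration,
# not IDLE's real string pattern.
SIMPLE_BODY = r'"[^"\n]*"'


def colored(prefix, text):
    """Return the text that prefix + SIMPLE_BODY would match, or None."""
    match = re.search(prefix + SIMPLE_BODY, text)
    return match.group(0) if match else None


# None of these fragments is legal Python; the point is which part matches.
for fragment in ('kr"sdf"', 'ku"sdf"', 'ur"sdf"', 'uf"sdf"'):
    print(fragment, "->", colored(ORIGINAL_PREFIX, fragment))
```

With this simplified body, the r after another letter is left out of
the match, while u and f in the same position are included.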

> But in neither case is the fragment legit Python.  If you do type in
> legit Python, it makes no difference (legit string literals always
> start at a word boundary, regardless of whether the regexp checks for
> that).

I want uniform behavior.  I decided to drop the \b because I prefer
coloring the maximal legal string rather than the minimum.  I think the
contrast between two characters that are each legal alone, but colored
differently when put together, makes the bug more obvious.
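
For comparison, here is a rough sketch of the two uniform options.
BOUNDARY_PREFIX is the pattern MRAB suggested; NO_BOUNDARY_PREFIX is
my reading of "drop the \b" and may not match the final code exactly.
SIMPLE_BODY and colored() are illustrative assumptions, not IDLE code.

```python
import re

# MRAB's suggestion: the word boundary applies to every alternative.
BOUNDARY_PREFIX = r"(?:\b(?i:r|u|f|fr|rf|b|br|rb))?"

# Assumed form of the pattern with the \b simply removed.
NO_BOUNDARY_PREFIX = r"(?i:r|u|f|fr|rf|b|br|rb)?"

# Assumption: a simplified double-quoted string body for illustration.
SIMPLE_BODY = r'"[^"\n]*"'


def colored(prefix, text):
    """Return the text that prefix + SIMPLE_BODY would match, or None."""
    match = re.search(prefix + SIMPLE_BODY, text)
    return match.group(0) if match else None


# Two illegal fragments and one legal literal.
for fragment in ('kr"sdf"', 'ku"sdf"', 'rb"sdf"'):
    print(fragment,
          "| with \\b:", colored(BOUNDARY_PREFIX, fragment),
          "| without \\b:", colored(NO_BOUNDARY_PREFIX, fragment))
```

Both variants treat kr and ku the same way.  With \b only the quoted
part matches; without it the trailing prefix letter is included.  The
legal literal matches in full either way.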


Terry Jan Reedy