"Data blocks" syntax specification draft
> -----Original Message-----
> I think it would be appropriate to propose an alternative to TQS for this
> specific purpose, namely making it easier to implement parsers and
> embedded syntaxes.
> So what do I have now with triple quoted strings - a simple example:
> if 1:
>     s = """\
> print ("\n") \\
> foo = 5
> """
> So there is a _possibility_, in the sense that it can be done: let's say I have a
> lib with a parser, etc. But a developer and a user will now face quite real
> problems:
> - TQS itself already has its specific purpose in many contexts,
> which may mean, for example, hard-coded syntax highlighting
> - there are a lot of things happening here: e.g. in the above example
> I use "\n", which I assume is part of the string, or \\ - but both are interpreted.
> There may be other issues regarding escaping. This particular
> issue may be a blocker for using TQS in some data cases,
> say if the target source text needs these very characters.
Yup, I can see this. I do use """ in a number of ways, often to comment out large chunks of code. (OK, I probably should not, but I do.)
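To make the escaping point concrete, here is a minimal sketch in current Python. It only shows how the same text differs between a regular and a raw triple-quoted string; the variable names are just for illustration.

```python
# Regular triple-quoted string: backslash escapes are interpreted,
# so "\n" becomes a real newline and "\\" becomes one backslash.
plain = """print ("\n") \\
foo = 5
"""

# Raw triple-quoted string: backslashes are kept exactly as typed.
raw = r"""print ("\n") \\
foo = 5
"""

print(repr(plain))
print(repr(raw))
```

A raw string sidesteps the escape processing, but the indentation and closing-quote concerns stay the same.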
> - indentation is part of a TQS. That is of course by design
> and quite logical, but it is hard-coded behaviour and thus
> does not make the presentation a natural part of the blocks containing
> this string.
> - appearance: imagine you have some small chunks of embedded
> code parts and you will still have the closing """ everywhere -
> that would be really hairy.
And yup, that does cause some challenges sometimes.
> Here I'll use the same symbol /// for the data entry point, but of course it can be
> changed if a better idea comes later. Also, for now, just for simplicity, the rule
> is that the contents of a block always start on a new line.
> So, e.g. this:
> data = /// s4
>     first line
>     last line
> the rest python code
> - will parse the block and knock out the leading 4 spaces of each line,
> i.e. if the first line has 5 leading spaces then 1 space will be left in the string.
> Block parsing terminates when the next line does not satisfy the indent
> sequence (4 spaces in this case).
> Another obvious type: tabs:
OK, I CAN see this as a potentially useful suggestion. There are a number of times where I would like to define a large chunk of text, but using tqs and having it suddenly move to the left is painful visually. Right now, I tend to either a) do it anyway, b) do it in a separate module and import the variables, or c) do it and parse the string to remove the extra spaces.
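For reference, option c) can be done today with the standard library's `textwrap.dedent`, which strips the whitespace common to all lines. A minimal sketch (the function name `make_text` is just for illustration):

```python
import textwrap


def make_text():
    # The block is indented to match the surrounding code ...
    text = """\
        first line
         second line, one extra space
        last line
        """
    # ... and the common leading whitespace is stripped afterwards.
    return textwrap.dedent(text)


result = make_text()
print(result)
```

This works at run time on every call, whereas the proposal would handle it in the parser.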
Personally though, I would not hard code it to knock out 4 leading spaces. I would have it handle spaces the same way that the existing parser does: if there are 4 spaces indenting the next line, then it removes 4 spaces, if there are 6 spaces, it removes 6 spaces, etc., ignoring additional spaces within the data-string object. Once it hits a line that has the same number of indenting spaces as the initial token, the data-string object is finished.
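A rough sketch of that rule, working on a list of source lines. Everything here is an assumption for illustration: the helper names `leading_ws` and `read_block` are made up, the bare `///` token without a parameter is hypothetical, and I simplified termination to "the first line that does not start with the detected indentation", which also ends the block on a blank line.

```python
def leading_ws(line):
    """Return the run of spaces/tabs at the start of a line."""
    return line[: len(line) - len(line.lstrip(" \t"))]


def read_block(lines, token_index):
    """Collect the indented block that follows lines[token_index].

    Returns (text, next_index), where next_index is the first line
    that is no longer part of the block.
    """
    body_start = token_index + 1
    if body_start >= len(lines):
        return "", body_start

    # The indentation of the first content line decides how much to strip.
    prefix = leading_ws(lines[body_start])
    token_indent = leading_ws(lines[token_index])
    # The block must be indented deeper than the line holding the token.
    if not (prefix.startswith(token_indent) and len(prefix) > len(token_indent)):
        return "", body_start

    collected = []
    index = body_start
    while index < len(lines) and lines[index].startswith(prefix):
        # Strip only the detected prefix; extra whitespace stays in the data.
        collected.append(lines[index][len(prefix):])
        index += 1
    return "\n".join(collected), index


source = [
    "if 1:",
    "    data = ///",
    "        first line",
    "          indented more",
    "        last line",
    "    print(data)",
]
text, next_index = read_block(source, 1)
print(text)
```

Here the block ends at `print(data)`, which is back at the indentation of the line holding the token.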
> data = /// t1
> 	first line
> 	last line
> the rest python code
> Will do the same but with one tabstop character.
Tabs / spaces should be handled as normal up to the point where the data-string object starts; after that, the parser pulls off the first x tabs or spaces and leaves anything else.
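For comparison, the quoted `s4` / `t1` form with an explicit parameter could be sketched like this. The function names are hypothetical, and I am assuming the indent is counted from column 0 and that a zero count is rejected; the draft does not say either way.

```python
import re


def parse_indent_spec(spec):
    """Turn 's4' into four spaces and 't1' into one tab."""
    match = re.fullmatch(r"([st])([1-9]\d*)", spec)
    if match is None:
        raise ValueError(f"unsupported indent spec: {spec!r}")
    char = " " if match.group(1) == "s" else "\t"
    return char * int(match.group(2))


def read_fixed_block(lines, token_index):
    """Read the block after a line ending in '/// <spec>'."""
    _, _, spec = lines[token_index].rpartition("///")
    prefix = parse_indent_spec(spec.strip())
    collected = []
    index = token_index + 1
    # The block ends at the first line that lacks the exact indent sequence.
    while index < len(lines) and lines[index].startswith(prefix):
        collected.append(lines[index][len(prefix):])
        index += 1
    return "\n".join(collected), index


spaces_source = [
    "data = /// s4",
    "    first line",
    "     five leading spaces",
    "    last line",
    "the rest python code",
]
spaces_text, spaces_next = read_fixed_block(spaces_source, 0)

tabs_source = ["data = /// t1", "\tfirst line", "\tlast line", "the rest python code"]
tabs_text, tabs_next = read_fixed_block(tabs_source, 0)

print(spaces_text)
print(tabs_text)
```

The line with 5 leading spaces keeps 1 space, as described in the quoted text.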
> Actually that's it!
> Some further ideas:
> data = /// ts
> - "any whitespace" (mimic current Python behaviour)
> data = /// s # or
> data = /// t
> - simply count the number of spaces (tabs) on the first
> line and require the same on each following line; otherwise terminate.
> data = /// "???"
> ??? abc foo bar
> - defines indent character by string: crazy idea but why not.
Nope, don't like this one... It's far enough from normal Python that it seems unlikely to get through, and (personally at least) I struggle to see the benefit.
> Language parameter, e.g.:
> data = /// t1."yaml"
> - this can be reserved for future usage by code analysis tools or dynamic
> syntax highlighting.
I can see where this might be interesting, but again, I just don't see the need. If the spec returns a string, you can use that string in any parser you want. If you want to customize how it's handled, then you can always create a custom object for it.
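As a small illustration of "use that string in any parser you want", here is a sketch with today's tools. JSON stands in for YAML only because `json` ships with Python; `textwrap.dedent` plays the role the proposed block syntax would take over.

```python
import json
import textwrap

# The embedded data is just a string once the indentation is removed ...
block = textwrap.dedent("""\
    {
        "name": "example",
        "values": [1, 2, 3]
    }
    """)

# ... so any parser that accepts a string can consume it.
config = json.loads(block)
print(config["values"])
```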
> That's just a rough specification.
> What should it give as result:
To me, this seems like simply an additional specification for a TQS, the only enhancement being that it's basically an indented TQS, so the return is a string.
> 1. No clash with current TQS rules - less worries
> about reserved characters.
> 2. Built-in indentation parsing parameter makes it a more or
> less natural continuation of Python blocks and is char-precise,
> which is very important here.
> 3. Independent of the indent of the containing block!
> 4. Parameter descriptor can be developed in such a manner
> that it allows more customisation and additions in the future.
I would not argue about this being in the spec, but it seems like unneeded complexity.
> Does seem to be more generalized problem-solving here.