Re: cassandra-stress HexStrings generator
Yes, I’m pretty sure you understood correctly (I wrote most of this, but it’s been a long time so I cannot remember much for certain).
It should be implemented like the Strings generator. It looks like both HexStrings and HexBytes are incorrect, and have been for a long time.
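To make the difference concrete, here is a rough, self-contained sketch. It is not the cassandra-stress code: the class HexSketch, its methods, and the clamped-normal stand-in for gaussian(1..100) are all hypothetical names invented for illustration. It contrasts drawing every character from the population distribution (the behaviour described in the question) with drawing one seed per value and deriving the whole string from that seed (the behaviour the question expected).

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;
import java.util.function.LongSupplier;

// Hypothetical illustration only; none of these names exist in cassandra-stress.
public class HexSketch {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    // Per-character draw: each hex digit comes from a fresh population draw,
    // so the number of distinct strings is not bounded by the population size.
    static String perCharacter(LongSupplier population, int size) {
        char[] chars = new char[size];
        for (int i = 0; i < size; i++) {
            chars[i] = HEX[(int) (population.getAsLong() & 0xF)];
        }
        return new String(chars);
    }

    // Per-value draw: a single population draw seeds a PRNG, and the whole
    // string is derived deterministically from that seed.
    static String perValue(LongSupplier population, int size) {
        Random rnd = new Random(population.getAsLong());
        char[] chars = new char[size];
        for (int i = 0; i < size; i++) {
            chars[i] = HEX[rnd.nextInt(16)];
        }
        return new String(chars);
    }

    // Assumed stand-in for gaussian(min..max): a normal draw centred on the
    // range and clamped to it. The real stress distribution may differ.
    static LongSupplier gaussian(long min, long max, long seed) {
        Random rnd = new Random(seed);
        double mean = (min + max) / 2.0;
        double stdev = (max - min) / 6.0;
        return () -> Math.max(min, Math.min(max, Math.round(mean + stdev * rnd.nextGaussian())));
    }

    // Counts distinct 32-character strings produced over the given number of rows.
    static int distinct(boolean wholeValue, int rows) {
        LongSupplier population = gaussian(1, 100, 42L);
        Set<String> seen = new HashSet<>();
        for (int i = 0; i < rows; i++) {
            seen.add(wholeValue ? perValue(population, 32) : perCharacter(population, 32));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("per-value distinct strings:     " + distinct(true, 10000));
        System.out.println("per-character distinct strings: " + distinct(false, 10000));
    }
}
```

With per-value seeding the distinct count can never exceed the 100 possible seeds, and each string's frequency follows the population distribution. With per-character draws only the digit frequencies follow it, and nearly every row is a new string.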
> On 12 Dec 2018, at 22:27, Saleil Bhat (BLOOMBERG/ 731 LEX) <sbhat39@xxxxxxxxxxxxx> wrote:
> I have a question about the behavior of the HexStrings value generator in the cassandra-stress tool, particularly concerning its population/identity distribution.
> Per the discussion in JIRA item CASSANDRA-6146 concerning the stress YAML profile, the population field in a columnspec “represents the total unique population distribution of that column across rows.”
> I interpreted this to mean that if I specify some distribution 'F' for a column, then the probability of occurrence for each potential value of that column is given by 'F'.
> So, for example, if I provided the following columnspec for a text column:
> name: fake_column
> size: fixed(32)
> population: gaussian(1..100)
> and then generated a large amount of data according to this specification,
> I would expect there to be 100 distinct values for ‘fake_column’, and that a histogram of the frequency of occurrence of each value would be roughly bell-shaped.
> However, the current implementation of the HexStrings generator deviates from this expectation. In the current implementation, each CHARACTER in the string is drawn from F, rather than the string as a whole. Therefore, if you plot the histogram of frequency of occurrence for each character, you get a bell-shaped curve, but the distribution of the occurrences of whole strings (the actual columns) is something else.
> My question is, is this the desired behavior for string columns? Was my expectation/interpretation incorrect? If so, can anyone give some insight as to why strings are designed to behave this way and what the use case is for this behavior?