Re: Drop 0. from the version
I think it's a good point. Culturally we have been willing to break
extension APIs for relatively small benefits. But we have generally been
unwilling to make breaking changes on the operations side quite so
liberally. Also, most cluster operators don't have their own custom
extensions, in my experience. So it does make sense to differentiate them.
I'm not sure how it makes sense to differentiate them, though. It could be
done through the version number (only increment the major version for
operations breaking changes) or it could be done through an "upgrading"
guide in the documentation (increment the major version for operations or
extension breaking changes, but have a guide that tells people which
versions have operations breaking changes to aid in upgrades).
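To make the second option concrete, here is a minimal sketch of what such an upgrading guide could encode. It assumes one reading of the rule in the quoted mail below: when moving from version X to version Z, every intermediate release with operations breaking changes has to be installed along the way, except the last one, which can be folded into the final hop. The Release type, the single-number versions, and the upgrade_stops helper are all hypothetical and are not part of Druid.

```python
from typing import List, NamedTuple


class Release(NamedTuple):
    version: int        # hypothetical single-number version, e.g. 14 for "Druid 14"
    ops_breaking: bool  # True if the release has operations breaking changes


def upgrade_stops(releases: List[Release], current: int, target: int) -> List[int]:
    """Return the versions to install, in order, to get from current to target.

    Assumption: a release with operations breaking changes stays compatible
    with older clusters only until the next such release, so only the last
    ops-breaking release on the path may be skipped.
    """
    if target <= current:
        return []
    breaking = sorted(
        r.version
        for r in releases
        if r.ops_breaking and current < r.version <= target
    )
    # Visit every ops-breaking release except the last, then go to the target.
    stops = breaking[:-1]
    stops.append(target)
    return stops


# Hypothetical release history, for illustration only.
RELEASES = [
    Release(13, True),
    Release(14, False),
    Release(15, True),
    Release(16, False),
]

print(upgrade_stops(RELEASES, current=12, target=16))
```

With this made-up history, an operator on 12 would be told to stop at 13 before going to 16, while an operator on 13 could go to 16 directly. Extension breaking changes do not appear in the calculation at all, which is the distinction being discussed.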
Coming back to the question in the subject of your mail: IMO, for
"graduation" out of 0.x, we should talk as a community about what that
means to us. It is a milestone that on the one hand, doesn't mean much, but
on the other hand, can be deeply symbolic. Some things that it has meant to
other projects:
1) Production readiness. Obviously Druid is well past this. If this is what
dropping the 0. means, then we should do it immediately.
2) Belief that the APIs have become relatively stable. Like you said, the
extension APIs don't seem particularly close to stable, but maybe that's
okay. However, the pace of breaking changes on the operations and query
side for non-experimental features has been relatively calm for the past
couple of years, so if we focus on that then we can make a case here.
3) Completeness of vision. This one is the most interesting to me. I
suspect that different people in the community have different visions for
Druid. It is also the kind of project that may never truly be complete in
vision (in principle, the platform could become a competitive data
warehouse, search engine, etc, …). For what it's worth, my vision of Druid
for the next year at least involves robust stream ingestion being a first
class ingestion method (Kafka / Kinesis indexing service style) and SQL
being a first class query language. These are both, today, still
experimental features. So are lookups. All three of these features, from what I
can see, are quite popular amongst Druid users despite being experimental.
For a 'completeness of vision' based 1.0 I would want to lift all of those
out of experimental status and, for SQL in particular, to have its
functionality rounded out a bit more (to support the native query features
it doesn't currently support, like multi-value dimensions, datasketches,
4) Marketing / timing. Like, doing a 1.0 around the time we graduate from
the Incubator. Not sure how much this really matters, but projects do it
Another question is, how often do we intend to rev the version? At the rate
we're going, we rev 2-3 major versions a year. Would we intend to keep that
up, or slow it down by making more of an effort to avoid breaking changes?
On Thu, Dec 20, 2018 at 2:17 PM Roman Leventov <leventov.ru@xxxxxxxxx> wrote:
> It may also make sense to distinguish "operations" breaking changes from
> API breaking changes. Operations breaking changes establish the minimum
> cadence of Druid cluster upgrades that allows rolling Druid versions back
> and forward. I.e., they relate to the segment format, the format of the data
> kept in ZooKeeper and the SQL database, or events such as stopping support
> of ZooKeeper for certain things (e.g. forcing the use of HTTP
> announcements). So Druid cluster operators cannot update Druid from version
> X to version Z skipping the version Y, if both Y and Z have some operations
> breaking changes. (Any such changes should support rollback options at
> least until the next version with operations breaking changes.)
> API breaking changes are just changes in Druid extensions APIs. Druid
> cluster operators could skip any number of releases with such breaking
> changes, as long as their extensions' code is updated for the latest
> version of the API.
> On Thu, 20 Dec 2018 at 20:03, Roman Leventov <leventov@xxxxxxxxxx> wrote:
> > It doesn't seem to me that Druid API is going to stabilize in the near
> > future (if ever), because there are so many extension points and
> > is broken in every release. On the other hand, Druid is not Hadoop or
> > Spark, which have application APIs. The Druid API is for extensions, not
> > applications. It is used by people who are closer to Druid development
> > fixing their extensions is routine.
> > With that, I think it makes sense to drop "0." from the Druid version and
> > call it Druid 14, Druid 15, etc.