[stackalytics] Reported numbers seem inaccurate
We use it to see company-wide contributions per release, or over some specific time period.
I usually look at patches, reviews, and time spent. This is for both individuals and companies.
From: Julia Kreger <juliaashleykreger at gmail.com>
Sent: Tuesday, April 16, 2019 9:13 AM
To: Thierry Carrez
Subject: Re: [stackalytics] Reported numbers seem inaccurate
On Tue, Apr 16, 2019 at 1:33 AM Thierry Carrez <thierry at openstack.org> wrote:
> Julia Kreger wrote:
> > [...]
> > Is it time for us, as a community, to whip up something that helps
> > provide basic insight?
> I think we should first agree on what we actually need.
I think this is vital. Tools like reviewstats seem useful for review activity, but there is also an aspect of activity that occurs between releases or between stable branches for project teams.
Statements like "We had x contributors land code in project z during release y" and "We observed an x percent change in activity over the past cycle, versus z percent the cycle before" help us determine where we are presently so we can chart our future course. Lots of graphs are pretty, but I think we can all turn number-based reporting into pretty aggregate graphs, as long as something is collecting the dimensions of raw data needed to count lines or to sum numbers in columns.
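The two statements above boil down to simple arithmetic over raw activity data. As a minimal sketch, assuming we already have per-cycle commit counts per contributor (the release names and figures below are made up for illustration):

```python
# Hypothetical per-cycle activity data: commits landed per contributor.
# All names and numbers here are invented for illustration.
activity = {
    "train": {"alice": 40, "bob": 12, "carol": 3},
    "ussuri": {"alice": 35, "dave": 8},
}

def contributors(cycle):
    """Number of distinct contributors who landed code in a cycle."""
    return len(activity[cycle])

def pct_change(prev_cycle, cycle):
    """Percent change in total commits between two cycles."""
    prev = sum(activity[prev_cycle].values())
    cur = sum(activity[cycle].values())
    return 100.0 * (cur - prev) / prev

print(contributors("ussuri"))                    # -> 2
print(round(pct_change("train", "ussuri"), 1))   # -> -21.8
```

The hard part is not this arithmetic, of course; it is collecting the raw data consistently.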
> A first level would be to extract the data about changes and reviews
> from Gerrit into a query-able system so that we don't have everyone
> hammering Gerrit with individual stats queries. Then people can share
> their "insights scripts" and run them on the same official data.
In my mind, the extracted data could just be data in text files that could be used with some simple scripting to create useful reporting.
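As a rough sketch of that idea: Gerrit's REST change-query responses are JSON behind an XSSI guard prefix, and each change record could be flattened into one tab-separated line that plain shell or Python scripting can then count and aggregate. The sample payload below is invented, but mimics the shape Gerrit returns for `GET /changes/?q=...`:

```python
import json

# Made-up sample in the shape of a Gerrit REST response: the ")]}'"
# XSSI guard prefix on the first line, then a JSON list of changes.
SAMPLE = """)]}'
[{"project": "openstack/ironic", "owner": {"name": "alice"}, "status": "MERGED"},
 {"project": "openstack/ironic", "owner": {"name": "bob"}, "status": "MERGED"}]
"""

def parse_gerrit(payload):
    """Strip the XSSI guard line and load the JSON body."""
    return json.loads(payload.split("\n", 1)[1])

def to_lines(changes):
    """Flatten each change to one tab-separated text line."""
    return ["\t".join((c["project"], c["owner"]["name"], c["status"]))
            for c in changes]

for line in to_lines(parse_gerrit(SAMPLE)):
    print(line)
```

Once the records are flat text files, the "insight scripts" become one-liners over them (sort, uniq -c, awk, or a few lines of Python) instead of everyone re-querying Gerrit.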
The moderate pain point is collecting all of that data. Where things start breaking is repositories belonging to projects that are released as-needed utilities: they are never branched, so branch points can't be used to compare velocity.
> A second level would be to come up with some common definition of
> "basic insight" and produce that data for all teams. Personally I
> think the first level would already give us a lot more confidence and
> consistency in the numbers we produce.
> As an aside, the OSF has been driving a proof-of-concept experiment to
> use Bitergia tooling (now ELK-based) for Kata Containers and
> StarlingX, which we could extend to OpenStack and all other OSF projects if successful.
> Historically we dropped the old Bitergia tooling because it was
> falling short with OpenStack complexity (groups of repositories per
> team) and release-timeframe data, and its visualization capabilities
> were limited. But the new version is much better-looking and flexible,
> so it might be a solution in the long term.
> Thierry Carrez (ttx)