[Numbers] Benchmark GSoC project (Was: Google Summer of Code)


Hi.

[As the mailing list is shared by many projects, don't forget
to prefix posts with a component's "identifier".]

On Tue, 10 Apr 2018 19:00:14 -0400, Brian Driscoll wrote:
Greg,

I'm sending this email to help explain Gilles's response to your GSoC
project and what you should send in response.

Gilles:  There is no structure for benchmarks in Commons Math (there
are home-made codes used there for "FastMath", which have shown that
"FastMath" is not always fast...).   Here the purpose is to use JMH.
[There are examples in "Commons RNG".]

Explanation: In your GSoC project you said that you would use
commons-math as a guideline to create benchmarks for commons-numbers.
Gilles is saying that the benchmarks in commons-math are not a good
place to start, because those benchmarks don't use a test framework to
run the benchmarks.  Your GSoC proposal is to do the work that's
documented in the NUMBERS-70 Jira ticket.  That ticket indicates
that the JMH test framework (openjdk.java.net/projects/code-tools/jmh)
should be used.  What Gilles is saying is to use commons-rng as the
example starting point for creating the commons-numbers benchmarks.
This is because commons-rng has benchmarks which are done in jmh.

I checked out commons-rng.  It's a library to generate random
numbers, which is a very important thing for encryption. You can find
it at commons.apache.org/proper/commons-rng.  The link "Source
Repository (current)" is an easy, if rudimentary, way to look at the
source.

commons-rng-examples/examples-jmh/src/main/java/org/apache/commons/rng/examples/jmh
contains the code which benchmarks commons-rng, using jmh to run the
tests.

Your response:  Thanks for your insights on the benchmarks.  I'll
change my project to use the benchmarks in commons-rng as the template
for commons-numbers benchmarks.  I found jmh benchmarks in

commons-rng/examples-jmh/src/main/java/org/apache/commons/rng/examples/jmh.
 I'm assuming those are the jmh benchmarks you were talking about.

Your project doc:  Update the Background section of your doc to
indicate the benchmarks in commons-rng will be used as the template
for the benchmarks for commons-numbers.  At the end of the doc add a
section titled CHANGE LOG. Below that put "04/10 - Changed Background
section to say that benchmarks will be based on commons-rng rather
than commons-math."

I did not mean that the actual benchmarking code should be
modeled after what exists in "Commons RNG": there, the number
of core methods is relatively small and the purpose was to
compare their relative performances.

Here, the reference (to compare with) will rather be similar
functionality in other languages (e.g. Python or C++).
Given the number of methods, we should perhaps explore how to
generate benchmark codes.
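
As a rough illustration of what generating benchmark code could look
like, here is a minimal Python sketch that writes one JMH method per
target from a list.  The target list, the "org.example" package and
class names, and the generate() helper are all hypothetical; only the
JMH annotations (@Benchmark, @State) are real.  It is a starting point
under those assumptions, not a proposed design.

```python
from string import Template

# Hypothetical (class, static method) pairs to benchmark; these names are
# placeholders for illustration, not an inventory of any real library.
TARGETS = [
    ("org.example.special.Gamma", "value"),
    ("org.example.special.Erf", "value"),
]

CLASS_TEMPLATE = Template('''\
package $package;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class $name {
    /** Non-final input so the JIT cannot constant-fold the call. */
    private double x = 1.5;

$methods}
''')

METHOD_TEMPLATE = Template('''\
    @Benchmark
    public double ${label}() {
        return $cls.$method(x);
    }
''')


def generate(targets, package="org.example.bench", name="GeneratedBenchmark"):
    """Return Java source for one JMH class with a method per target."""
    methods = []
    for cls, method in targets:
        simple = cls.rsplit(".", 1)[-1]
        # e.g. ("...Gamma", "value") -> "gammaValue"
        label = simple[0].lower() + simple[1:] + method.capitalize()
        methods.append(
            METHOD_TEMPLATE.substitute(label=label, cls=cls, method=method))
    return CLASS_TEMPLATE.substitute(
        package=package, name=name, methods="\n".join(methods))


if __name__ == "__main__":
    print(generate(TARGETS))
```

Each generated method returns the computed value so that JMH consumes
it; a real generator would also need to vary the input arguments.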

Gilles:  I'd suggest "apt" for the documentation format since it is
somewhat easier than "xdoc" for tables (as the likely output of the
benchmark project).

Explanation:  "xdoc" and "apt" are different documentation formats
for Doxia.  See maven.apache.org/doxia/index.html for more info about
Doxia.  Doxia is a tool for generating web documentation.  The way it
works is you write documentation in a format that Doxia understands,
then run Doxia to process those files to generate web pages to display
the documentation.  Doxia supports a bunch of different formats;
"xdoc" and "apt" are two of them.  See
maven.apache.org/doxia/references/index.html for a complete list of
the formats supported.  From what I can tell, the "apt" format seems
simpler and easier to use, while "xdoc" is a richer but more
complicated format.

Note that Doxia is part of the Apache Maven project.  Maven is a tool
to build (compile, etc.) a project from its source code and dependent
libraries.  Apache uses Maven to build many of their open source
projects.  For projects that have documentation in a Doxia format,
Maven runs the Doxia tool on the documentation files to generate the
finished documentation files that can be viewed via the web.

Your response:   I don't really know either the xdoc or apt formats
well.  Apt seems simpler & easier to use than xdoc.  xdoc looks like
it has more features but would be harder to use.  So using apt seems
like it would be easier, as long as it supports all the documentation
features that are needed. I was originally thinking the documentation
would be in xdoc because the commons-numbers/src/site/xdoc/userguide
contains the doc from commons-math and is in xdoc format.  I thought
this was done because people wanted the commons-numbers doc to use
xdoc and be similar to the commons-math doc.  Do you have any good
examples of apt doc that I could use as a starting point?

I don't know whether it's a good example, but the "Commons
RNG" userguide is written in APT format.
A section of "Commons Math" is also written in APT.
Actually, any format supported by Maven should be fine, if
you have another preference, since they are combined into
the generated HTML documents.


Gilles: Don't hesitate to open JIRA reports for each task that may
need interaction on the details.

Explanation:  Jira is the issue tracking system used by the Apache
organization.  It's a very common system and used by many
organizations.  Ullink uses it for the same thing.  Jira
tickets/issues are created for new features that need to be added,
bugs that need to be fixed, etc.  People put in the details of what
they are requesting. Using Jira, people can track the status of the
issue, see what's going on with it, what release it's fixed in, etc.
It's quite common that there is not enough information in the ticket
to implement the request.  It's common for people to ask questions to
clarify the details of things.  They can either be asked on the
existing ticket, which is NUMBERS-70 in your case, or a new ticket
linked to the original ticket.

Your response:   Okay.  I'm just getting familiar with Jira.  I'll
start with updating NUMBERS-70 and adding a comment with a link to my
GSoC project document.  When I need to get details worked out or have
questions, how should I do it in Jira?  Should I put them as comments
on NUMBERS-70?  Or should I create a new Jira issue linked to
NUMBERS-70 and if so what type, i.e. Task?

Yes; creating sub-tasks of the original issue would be fine.

Gilles:  At first sight, script(s) to convert from JMH's output to
"apt" would be welcome.

Explanation:  He's suggesting that a simple program be created which
reads the jmh benchmark test output and creates a doc in apt format
with the test results.  Then those results could be displayed on the
commons-numbers web site.  A simple program like this would typically
be written in a scripting language, such as Bourne Shell (which I
know), the command line language available on most Linux machines.
Python is another example of a scripting language, but it is more
complicated (I don't know it).  Perl is another scripting language
(which I know).  Typically scripting language programs don't need to
be compiled.  You run them by passing them to the interpreter for the
language, which parses and executes the commands in your program file.
Languages like Java, Haskell, etc. need to be compiled before they can
be run.
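
To make the idea concrete, here is a minimal Python sketch of such a
converter.  It assumes the benchmarks were run with JMH's CSV result
format ("-rf csv") and that the file has the column names shown in
the sample; the sample rows, benchmark names and the csv_to_apt()
helper are made up for illustration.  The output is a plain APT grid
table with left-aligned columns.

```python
import csv
import io

# Assumed shape of a JMH CSV result file; the rows are invented sample data.
SAMPLE_CSV = '''\
"Benchmark","Mode","Threads","Samples","Score","Score Error (99.9%)","Unit"
"org.example.bench.GammaBenchmark.value","avgt",1,5,12.345,0.678,"ns/op"
"org.example.bench.ErfBenchmark.value","avgt",1,5,8.9,0.12,"ns/op"
'''


def csv_to_apt(text, caption="Benchmark results"):
    """Convert JMH CSV text into an APT grid table (left-aligned columns)."""
    reader = csv.DictReader(io.StringIO(text))
    # The error column name includes the confidence level, so match by prefix.
    err_col = next(c for c in reader.fieldnames if c.startswith("Score Error"))
    table = [["Benchmark", "Mode", "Score", "Error", "Unit"]]
    for r in reader:
        # Keep only "Class.method" to make the table readable.
        short = ".".join(r["Benchmark"].split(".")[-2:])
        table.append([short, r["Mode"], r["Score"], r[err_col], r["Unit"]])
    widths = [max(len(row[i]) for row in table) for i in range(len(table[0]))]
    # In APT, "+" after a column's dashes marks that column as left-aligned.
    rule = "*" + "".join("-" * (w + 2) + "+" for w in widths)
    lines = [rule]
    for row in table:
        cells = " | ".join(cell.ljust(w) for cell, w in zip(row, widths))
        lines.append("| " + cells + " |")
        lines.append(rule)
    lines.append(caption)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(csv_to_apt(SAMPLE_CSV), end="")
```

The script is run by passing the file to the interpreter (for example
"python jmh2apt.py", where the file name is arbitrary); nothing needs
to be compiled first.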

Your response:  I've got experience with Java and Haskell, but don't
have much experience with scripting languages.  What scripting
language would you suggest for something like this, i.e. Bourne Shell,
Perl, Python? I'll give it a try. I'd have to keep it really simple.
I'd do it after I finish the benchmarks. It would be one of the last
things I would do.  But I may not have enough time to complete it, if
learning the scripting language and writing the script take me a
while.

JMH can generate several output formats.
The idea is to explore which is more suited to give a clear
picture (a table, I guess) of the benchmark results with respect
to some expectation (to be determined, e.g. by running similar
tests on another language/platform).
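
One possible shape for such a comparison, sketched in Python: read
JMH's JSON results ("-rf json") and relate each score to a reference
value.  The sample data, the REFERENCE numbers and the compare()
helper are hypothetical; the sketch also assumes average-time mode,
so that a smaller score is better.

```python
import json

# Trimmed, invented sample of JMH JSON results; only the fields used
# below are included.
SAMPLE_JSON = '''
[
  {"benchmark": "org.example.bench.GammaBenchmark.value", "mode": "avgt",
   "primaryMetric": {"score": 12.0, "scoreError": 0.5, "scoreUnit": "ns/op"}},
  {"benchmark": "org.example.bench.ErfBenchmark.value", "mode": "avgt",
   "primaryMetric": {"score": 8.0, "scoreError": 0.2, "scoreUnit": "ns/op"}}
]
'''

# Hypothetical reference timings in the same unit, e.g. measured for the
# equivalent functions on another language or platform.
REFERENCE = {"GammaBenchmark.value": 10.0, "ErfBenchmark.value": 10.0}


def compare(json_text, reference):
    """Return (name, score, reference, ratio) rows.

    With time-per-operation scores, a ratio above 1 means slower than
    the reference.  Benchmarks without a reference get a ratio of None.
    """
    rows = []
    for entry in json.loads(json_text):
        name = ".".join(entry["benchmark"].split(".")[-2:])
        score = entry["primaryMetric"]["score"]
        ref = reference.get(name)
        ratio = score / ref if ref else None
        rows.append((name, score, ref, ratio))
    return rows


if __name__ == "__main__":
    for name, score, ref, ratio in compare(SAMPLE_JSON, REFERENCE):
        shown = "n/a" if ratio is None else f"{ratio:.2f}"
        print(f"{name}: score={score} reference={ref} ratio={shown}")
```

The resulting rows could then be laid out as a table in whichever
documentation format is chosen.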

Regards,
Gilles


Hi.

On Fri, 6 Apr 2018 21:09:56 -0400, Greg Driscoll wrote:

Hello all,

I'm a computer science student that's really interested in doing a Google Summer of Code project working on the commons-numbers User Guide and benchmarks. In Jira it's https://issues.apache.org/jira/browse/NUMBERS-70.


Thanks for your interest, and welcome.

The link to my proposal is here https://docs.google.com/document/d/1i6yy2cW0x9MYbDOuLPdZrV0XA0eKO5q5N0SNg99mJfA/edit?usp=sharing



Looks good.
A few remarks:
 * There is no structure for benchmarks in "Commons Math" (there are
   home-made codes used there for "FastMath", which have shown that
   "FastMath" is not always fast...).
   Here the purpose is to use JMH. [There are examples in "Commons
   RNG".]
 * I'd suggest "apt" for the documentation format since it is
   somewhat easier than "xdoc" for tables (as the likely output
   of the benchmark project).
 * Don't hesitate to open JIRA reports for each task that may need
   interaction on the details.
 * At first sight, script(s) to convert from JMH's output to "apt"
   would be welcome.

Please let me know what you think about it. You can reply to this mailing
list, comment on the doc, or email me directly.


Let's keep discussion on this list so that everyone interested
can participate.

Best,
Gilles

Thanks.

