
[Python-Dev] Official citation for Python


I also see reproducibility and citation graphs as distinct concepts.

If it's reproducibility you're after, bibliographic citations are very
unlikely to enable someone else to assemble an identical build environment
from which the same conclusion should be repeatably derivable.

A ScholarlyArticle can be reproducible with no citations whatsoever.
A ScholarlyArticle may very likely have many citations and still be
woefully unreproducible.

This citation doesn't contain a URL, but it isn't useless, because there's
at least a DOI string (and the paper is excellent):

Sandve GK, Nekrutenko A, Taylor J, Hovig E (2013) Ten Simple Rules for
Reproducible Computational Research. PLoS Comput Biol 9(10): e1003285.
doi:10.1371/journal.pcbi.1003285
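
A DOI string like the one above can be turned into a dereferenceable URL by
prefixing it with the https://doi.org/ resolver. A minimal sketch (the
function name doi_to_url is just something I made up for illustration):

```python
from urllib.parse import quote


def doi_to_url(doi: str) -> str:
    """Turn a bare DOI (optionally prefixed with 'doi:') into a resolver URL."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Percent-encode unusual characters, but keep '/' because it separates
    # the DOI prefix from the suffix.
    return "https://doi.org/" + quote(doi, safe="/")


print(doi_to_url("doi:10.1371/journal.pcbi.1003285"))
```

This only builds the URL; it does not check that the DOI actually resolves.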

> Rule 3: Archive the Exact Versions of All External Programs Used

mybinder.org uses repo2docker to build Jupyter containers from git
repositories that contain any of these config files:

https://repo2docker.readthedocs.io/en/latest/config_files.html#configuration-files
"""
Dockerfile
environment.yml
requirements.txt
REQUIRE
install.R
apt.txt
setup.py
postBuild
runtime.txt
"""

Specifying the exact version of Python (and what package it was installed
from and/or what URL the source was obtained and built from) is no
substitute for hashes of the 'pinned' versions of said artifacts.

# includes the python version
$ conda env export -f environment.yml

# these do not include the python version
$ pip freeze -r requirements.txt --all
$ pipenv lock # > Pipfile.lock
$ pipenv sync # < Pipfile.lock
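
To illustrate the point about hashes: here is a rough sketch of recording
the interpreter version next to SHA-256 digests of whatever artifacts were
used (downloaded sdists, wheels, lock files, ...). The helper names
sha256_of and build_manifest and the manifest layout are my own invention,
not an existing tool or format:

```python
import hashlib
import platform
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths):
    """Describe the running interpreter and the given files (hypothetical layout)."""
    return {
        "python_version": platform.python_version(),
        "python_implementation": platform.python_implementation(),
        "files": {str(p): sha256_of(Path(p)) for p in paths},
    }
```

Something like json.dumps(build_manifest(["requirements.txt"]), indent=2)
could then be committed alongside the analysis scripts.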

Uploading a built container or VM image to e.g. Docker Hub / GitLab
Container Registry / Vagrant Cloud is another way to ensure that research
findings are reproducible.
- Dockerfile, docker-compose.yml
- Vagrantfile

> Rule 4: Version Control All Custom Scripts

https://mozillascience.github.io/code-research-object/ (FigShare + GitHub
=> DOI citation URI)

https://guides.github.com/activities/citable-code/ (Zenodo + GitHub => DOI
citation URI)

...

Is it necessary to cite Python (or all packages) if you're not building a
derivative of Python or said packages?

It's definitely a good idea to "Archive the Exact Versions of All External
Programs Used"; but IDK that those are best represented with bibliographic
citations. Really, links to the Homepage, Source, Docs, and Wikipedia page
are probably more helpful to a reviewer who isn't familiar with the
software; linking dereferenceable URLs also helps support linked data
(https://5stardata.info).

While out of scope and OT, it's worth mentioning that search engines index
https://schema.org/Dataset metadata; which is helpful for data reuse and
autodiscovering requisite premises for the argument presented in a
https://schema.org/ScholarlyArticle .
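
As a sketch of what such metadata could look like, the following builds a
JSON-LD description of an article that lists the software it used and the
datasets it is based on. The function name article_jsonld and all example
values are hypothetical; which schema.org properties fit best is a judgment
call (I've used citation and isBasedOn here):

```python
import json


def article_jsonld(title, software, datasets):
    """Build a schema.org ScholarlyArticle description as a plain dict.

    software: iterable of (name, version, url) tuples
    datasets: iterable of (name, url) tuples
    """
    return {
        "@context": "https://schema.org",
        "@type": "ScholarlyArticle",
        "name": title,
        "citation": [
            {
                "@type": "SoftwareApplication",
                "name": name,
                "softwareVersion": version,
                "url": url,
            }
            for name, version, url in software
        ],
        "isBasedOn": [
            {"@type": "Dataset", "name": name, "url": url}
            for name, url in datasets
        ],
    }


# Hypothetical example values, for illustration only.
doc = article_jsonld(
    "An Example Paper",
    software=[("CPython", "3.7.0", "https://www.python.org/")],
    datasets=[("Example dataset", "https://example.org/dataset")],
)
print(json.dumps(doc, indent=2))
```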

A citation for each MAJ.MIN.PATCH revision of CPython (and/or other
excellent packages) might be a bit much.

On Monday, September 10, 2018, Steven D'Aprano <steve at pearwood.info> wrote:

> On Mon, Sep 10, 2018 at 09:25:29PM +0200, Chris Barker via Python-Dev
> wrote:
> > I'd like to know what these citations are expected to be used for?
> >
> > i.e. -- usually, academic papers have a collection of citations to
> > acknowledge where you got an idea, or fact, or .... It serves both to
> > justify something and make it clear that it is not your own idea (i.e.
> > not plagiarism).
>
> [
> > That is about reproducible results, which is really a different thing
> > than the usual citations.
>
> I don't think it is. I think you are seeing a distinction that is not
> there. If citations were just about acknowledgement, we could say "I got
> this idea from Bob" and be done with it. Citations are about identifying
> the *exact* source so that anyone can reproduce the given ideas by
> checking not just "Bob" but the specific page number of a specific
> edition of a specific work.
>
> So the requirement for precision is no different between papers and
> software, and the academic standards for citing software already take
> that into account. There are challenges with software, to be sure --
> code is much more ephemeral, there may be literally hundreds of
> authors, etc. But in principle, the kinds of information needed to
> cite a software package are known. The major citation styles already
> include this. When you are using a specific style, this page:
>
> https://openresearchsoftware.metajnl.com/about/
>
> suggests a few formats, depending on how you got access to the software.
>
> The bottom line is, we don't have to guess what information to provide.
> People like Jacqueline can tell us what they need, and we'll just fill
> in the values.
>
> The people citing Python know what information they need, we just have
> to help them get it. I think that the best way to do that is to provide
> the correct information in a single place, in a single, standard format,
> and let them choose the appropriate citation style for their
> publication.
>
> Jackie, do I have that right?
>
>
>
> --
> Steve