Re: Language-independent and cross-language docs


I don't think we should attempt to create a documentation "super
project" that includes the generated API reference for all the
libraries in Apache Arrow. I do think that creating a documentation
"hub" project (with the low-level API docs being the "spokes") is a
good idea. Currently, the Jekyll project website serves as a very
crude hub. It would be better to build something more suited for
writing developer documentation.

So in other words, the subprojects would continue to generate API docs
using the current tools (Javadoc, GTK-Doc, Doxygen, Sphinx, etc.) but
the objective of the "top level docs" is to make the entire project
easier to navigate than it is now.
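
As a rough sketch of the hub-and-spoke idea, assuming Sphinx were chosen
for the hub (which is still undecided in this thread), the hub's conf.py
could link out to the spokes instead of absorbing them. Every URL and role
name below is a hypothetical placeholder, not an existing Arrow location.
intersphinx only works for spokes that are themselves built with Sphinx;
for the others, extlinks provides short link roles.

```python
# conf.py for a hypothetical top-level "hub" docs project (illustrative only).
project = "Apache Arrow"

extensions = [
    "sphinx.ext.intersphinx",  # cross-reference other Sphinx-built sites
    "sphinx.ext.extlinks",     # short roles for links to non-Sphinx API docs
]

# Spokes built with Sphinx can be cross-referenced by object name.
# The URL is a placeholder assumption; None means "use the default objects.inv".
intersphinx_mapping = {
    "pyarrow": ("https://docs.example.org/arrow/python/", None),
}

# Spokes built with Javadoc, Doxygen or GTK-Doc stay where they are;
# the hub only needs a URL pattern for each. All patterns are placeholders.
extlinks = {
    "javadoc": ("https://docs.example.org/arrow/java/%s", "%s"),
    "doxygen": ("https://docs.example.org/arrow/cpp/%s", "%s"),
    "gtkdoc": ("https://docs.example.org/arrow/c_glib/%s", "%s"),
}
```

With something like this, a hub page could point at a spoke with a role
such as :javadoc:`index.html`, while each subproject keeps generating its
API reference with its current tool.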

On Sun, May 20, 2018 at 3:15 AM, Kouhei Sutou <kou@xxxxxxxxxxxxxx> wrote:
> Hi,
>
>> I really like the Scala/Python tabs in the Spark docs [2].
>
>> [2]: http://spark.apache.org/docs/latest/quick-start.html#more-on-dataset-operations
>
> Oh, I also like it.
>
>> > - Should we do this at all (i.e. build up a central documentation system)?
>
> Yes.
>
>> > - Should we use Sphinx for it?
>
> I'm neutral.
>
> If we choose Sphinx or something similar, we need some work for
> Apache Arrow C. It uses GTK-Doc as its documentation
> system. We'll need to create a tool like
> https://github.com/pygobject/pgi-docgen . (It's a tool for
> Sphinx.)
>
> Apache Arrow C needs to keep using GTK-Doc style for API
> documentation because it's also used by GObject
> Introspection. GObject Introspection is very important in
> Apache Arrow C. For example, the Ruby bindings need GObject
> Introspection support. So we shouldn't drop GObject
> Introspection support.
>
> Other documentation, such as tutorials (they don't exist
> yet :<), doesn't need to use GTK-Doc style.
>
>
> We'll need to create a similar tool for Apache Arrow Ruby.
> Most of the Apache Arrow Ruby API is generated
> automatically by GObject Introspection support. We can reuse
> GTK-Doc style documentation in Apache Arrow C for Apache
> Arrow Ruby.
>
> We may be able to use
> https://github.com/ruby-gnome2/yard-gobject-introspection
> for Apache Arrow Ruby. It's not complete yet but we can
> improve it. (I'm one of the developers of it.)
>
>
> Thanks,
> --
> kou
>
> In <1526586273.3156930.1375962952.57744C0C@xxxxxxxxxxxxxxxxxxxxxxxxxxx>
>   "Re: Language-independent and cross-language docs" on Thu, 17 May 2018 21:44:33 +0200,
>   "Uwe L. Korn" <uwelk@xxxxxxxxxx> wrote:
>
>> Hello,
>>
>> I can second that we should move the documentation to a central one. As someone who contributes to both C++ and Python, I always find it hard to decide where a specific piece should be documented. We have a very small C++ documentation and a somewhat larger Python one. For some features, though, it would make sense to document them in both. IPC and in-process sharing is also a main part of the Arrow project. Documenting this separately for each language will be a lot of work and will probably leave blind spots in each language.
>>
>> Not everything in each language ecosystem can be directly included in Sphinx, but as Sphinx is becoming a very broadly used documentation system, there are many nice converters like Breathe [1] (Doxygen to Sphinx) available.
>>
>> To directly answer the questions:
>>
>> - Should we do this at all (i.e. build up a central documentation system)?
>>
>> Yes
>>
>> - Should we use Sphinx for it?
>>
>> Very much in favour. There is probably also a tendency that some people prefer Markdown (I do) but given the feature set of Sphinx, I would very much argue in favour of it.
>>
>>  - To what extent should our current docs be migrated to Sphinx (apart
>>  from the Python docs, which already use Sphinx)?  For example, should
>>  the specs (currently standalone pages written in Markdown) be migrated
>>  to Sphinx for better cross-referencing and navigation?  What about the
>>  C++ tutorial pages?  etc.
>>
>> I would definitely migrate the C++ documentation fully into that, as the C++ / Python relation is very tight. There are a lot of topics that either touch two languages or are general to the project; these should also go in there.
>>
>> - Should we preferably have a single Sphinx doctree, or several
>>  independent per-topic / per-language doctrees?
>>
>> I'm not 100% sure what the definition of a "Sphinx doctree" is, but as we will have many shared topics between the different implementations, I would expect that we should have a single documentation set with well-organized sections.
>>
>> Also, we will probably face the issue that we have documentation on a specific topic where only a small part differs between two implementations/setups/... I really like the Scala/Python tabs in the Spark docs [2]. There is a Sphinx extension that seems to do something similar to this [3]. This could be used either for documentation on how to construct things, where one switches between Ruby and Python, or for the main case where I would need it: setting up the build with slightly different package managers (e.g. conda vs pip in Python).
>>
>> Uwe
>>
>> [1]: https://breathe.readthedocs.io/en/latest/
>> [2]: http://spark.apache.org/docs/latest/quick-start.html#more-on-dataset-operations
>> [3]: http://sphinxcontrib-contentui.readthedocs.io/en/latest/tabs.html
>>
>>
>> On Sat, May 12, 2018, at 6:03 PM, Antoine Pitrou wrote:
>>>
>>> Hi,
>>>
>>> In the following PR discussion it was mentioned that we currently lack a
>>> central documentation system for cross-language topics:
>>> https://github.com/apache/arrow/pull/1575#issuecomment-364062240
>>>
>>> Sphinx looks like a reasonable contender for that purpose.  For those who
>>> don't know it, Sphinx is a documentation system initially developed for
>>> the Python language, which quickly became widely-used amongst Python
>>> projects, and is now being used by non-Python projects as well.  For
>>> example, the LLVM docs (https://llvm.org/docs/) and even the Linux
>>> kernel online docs are now written using Sphinx
>>> (https://www.kernel.org/doc/html/latest/index.html).
>>>
>>> Sphinx uses reStructuredText (a.k.a "reST") as its basic markup
>>> language, but with many extensions.  It allows for structured
>>> documentation with extensive cross-referencing (even between independent
>>> Sphinx sites, using the "intersphinx" extension).
>>>
>>> The questions here are:
>>>
>>> - Should we do this at all (i.e. build up a central documentation system)?
>>>
>>> - Should we use Sphinx for it?
>>>
>>> - To what extent should our current docs be migrated to Sphinx (apart
>>> from the Python docs, which already use Sphinx)?  For example, should
>>> the specs (currently standalone pages written in Markdown) be migrated
>>> to Sphinx for better cross-referencing and navigation?  What about the
>>> C++ tutorial pages?  etc.
>>>
>>> - Should we preferably have a single Sphinx doctree, or several
>>> independent per-topic / per-language doctrees?
>>>
>>> Regards
>>>
>>> Antoine.