Re: Language-independent and cross-language docs


Among other things, the columnar format specification files should
probably make their way into this new documentation project.

On Mon, May 21, 2018 at 5:19 PM, Wes McKinney <wesmckinn@xxxxxxxxx> wrote:
> I don't think we should attempt to create a documentation "super
> project" that includes the generated API reference for all the
> libraries in Apache Arrow. I do think that creating a documentation
> "hub" project (with the low-level API docs being the "spokes") is a
> good idea. Currently, the Jekyll project website serves as a very
> crude hub. It would be better to build something more suited for
> writing developer documentation.
>
> So in other words, the subprojects would continue to generate API docs
> using the current tools (Javadoc, GTK-Doc, Doxygen, Sphinx, etc.) but
> the objective of the "top level docs" is to make the entire project
> easier to navigate than it is now.
>
> On Sun, May 20, 2018 at 3:15 AM, Kouhei Sutou <kou@xxxxxxxxxxxxxx> wrote:
>> Hi,
>>
>>> I really like the Scala/Python tabs in the Spark docs [2].
>>
>>> [2]: http://spark.apache.org/docs/latest/quick-start.html#more-on-dataset-operations
>>
>> Oh, I also like it.
>>
>>> > - Should we do this at all (i.e. build up a central documentation system)?
>>
>> Yes.
>>
>>> > - Should we use Sphinx for it?
>>
>> I'm neutral.
>>
>> If we choose Sphinx or something similar, we need to do some
>> work for Apache Arrow C. It uses GTK-Doc as its documentation
>> system. We'll need to create a tool like
>> https://github.com/pygobject/pgi-docgen . (It's a tool for
>> Sphinx.)
>>
>> Apache Arrow C needs to keep using GTK-Doc style for API
>> documentation because that style is also used by GObject
>> Introspection. GObject Introspection is very important in
>> Apache Arrow C. For example, the Ruby bindings need GObject
>> Introspection support. So we shouldn't drop GObject
>> Introspection support.
>>
>> Other documentation such as tutorials (they don't exist
>> yet :<) doesn't need to use GTK-Doc style.
>>
>>
>> We'll need to create a similar tool for Apache Arrow Ruby.
>> Most of the Apache Arrow Ruby API is generated
>> automatically through GObject Introspection. We can reuse
>> GTK-Doc style documentation in Apache Arrow C for Apache
>> Arrow Ruby.
>>
>> We may be able to use
>> https://github.com/ruby-gnome2/yard-gobject-introspection
>> for Apache Arrow Ruby. It's not complete yet but we can
>> improve it. (I'm one of the developers of it.)
>>
>>
>> Thanks,
>> --
>> kou
>>
>> In <1526586273.3156930.1375962952.57744C0C@xxxxxxxxxxxxxxxxxxxxxxxxxxx>
>>   "Re: Language-independent and cross-language docs" on Thu, 17 May 2018 21:44:33 +0200,
>>   "Uwe L. Korn" <uwelk@xxxxxxxxxx> wrote:
>>
>>> Hello,
>>>
>>> I can second that we should move the documentation to a central one. As a C++ and Python contributor at the same time, it is always hard to decide where to document a specific piece. We have a very small C++ documentation and a somewhat larger Python one. For some features, though, it would make sense to have them in both. IPC and in-process sharing are also a main part of the Arrow project. Documenting this separately for each language will be a lot of work and will probably leave blind spots in each language.
>>>
>>> Not everything in each language ecosystem can be directly included in Sphinx, but as Sphinx is becoming a very broadly used documentation system, there are many nice converters available, like Breathe [1] (Doxygen to Sphinx).
>>>
>>> To directly answer the questions:
>>>
>>> - Should we do this at all (i.e. build up a central documentation system)?
>>>
>>> Yes
>>>
>>> - Should we use Sphinx for it?
>>>
>>> Very much in favour. There is probably also a tendency that some people prefer Markdown (I do) but given the feature set of Sphinx, I would very much argue in favour of it.
>>>
>>>  - To what extent should our current docs be migrated to Sphinx (apart
>>>  from the Python docs, which already use Sphinx)?  For example, should
>>>  the specs (currently standalone pages written in Markdown) be migrated
>>>  to Sphinx for better cross-referencing and navigation?  What about the
>>>  C++ tutorial pages?  etc.
>>>
>>> I would definitely migrate the C++ documentation fully into that; the C++ / Python relation is very tight. There are also a lot of topics that either touch two languages or are general to the project; these should go in there as well.
>>>
>>> - Should we preferably have a single Sphinx doctree, or several
>>>  independent per-topic / per-language doctrees?
>>>
>>> I'm not 100% sure what the definition of a "Sphinx doctree" is, but as we will have many shared topics between the different implementations, I would expect that we should have a single documentation with well-organized sections.
>>>
>>> Also, we will probably face the issue that we have documentation on a specific topic where only a small part differs between two implementations/setups/... I really like the Scala/Python tabs in the Spark docs [2]. There is a Sphinx extension that seems to do something similar to this [3]. This could be used either for documentation on how to construct things, where one switches between Ruby and Python, or for the main case where I would need it: setting up the build with slightly different package managers (e.g. conda vs pip in Python).
>>>
>>> Uwe
>>>
>>> [1]: https://breathe.readthedocs.io/en/latest/
>>> [2]: http://spark.apache.org/docs/latest/quick-start.html#more-on-dataset-operations
>>> [3]: http://sphinxcontrib-contentui.readthedocs.io/en/latest/tabs.html
>>>
>>>
>>> On Sat, May 12, 2018, at 6:03 PM, Antoine Pitrou wrote:
>>>>
>>>> Hi,
>>>>
>>>> In the following PR discussion it was mentioned that we currently lack a
>>>> central documentation system for cross-language topics:
>>>> https://github.com/apache/arrow/pull/1575#issuecomment-364062240
>>>>
>>>> Sphinx looks like a reasonable contender for that purpose.  For those who
>>>> don't know it, Sphinx is a documentation system initially developed for
>>>> the Python language, which quickly became widely-used amongst Python
>>>> projects, and is now being used by non-Python projects as well.  For
>>>> example, the LLVM docs (https://llvm.org/docs/) and even the Linux
>>>> kernel online docs are now written using Sphinx
>>>> (https://www.kernel.org/doc/html/latest/index.html).
>>>>
>>>> Sphinx uses reStructuredText (a.k.a "reST") as its basic markup
>>>> language, but with many extensions.  It allows for structured
>>>> documentation with extensive cross-referencing (even between independent
>>>> Sphinx sites, using the "intersphinx" extension).
>>>>
>>>> The questions here are:
>>>>
>>>> - Should we do this at all (i.e. build up a central documentation system)?
>>>>
>>>> - Should we use Sphinx for it?
>>>>
>>>> - To what extent should our current docs be migrated to Sphinx (apart
>>>> from the Python docs, which already use Sphinx)?  For example, should
>>>> the specs (currently standalone pages written in Markdown) be migrated
>>>> to Sphinx for better cross-referencing and navigation?  What about the
>>>> C++ tutorial pages?  etc.
>>>>
>>>> - Should we preferably have a single Sphinx doctree, or several
>>>> independent per-topic / per-language doctrees?
>>>>
>>>> Regards
>>>>
>>>> Antoine.
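
For readers unfamiliar with the "intersphinx" cross-referencing mentioned in the original message, here is a minimal sketch of a Sphinx conf.py fragment that enables it. It is only an illustration: the project name and the "pyarrow" URL are hypothetical placeholders, not settings proposed anywhere in this thread. Only the Python documentation URL is a real intersphinx target.

```python
# conf.py fragment for a hypothetical top-level ("hub") documentation project.
# Names and the Arrow URL are illustrative assumptions, not project settings.

project = "Arrow docs hub"  # hypothetical project name

# intersphinx ships with Sphinx and links to objects documented
# in other, independently built Sphinx sites.
extensions = ["sphinx.ext.intersphinx"]

# Each entry maps a short name to (base URL, inventory location).
# None means "fetch objects.inv from the base URL".
intersphinx_mapping = {
    "python": ("https://docs.python.org/3", None),
    # Hypothetical placeholder for a separately built per-language site:
    "pyarrow": ("https://example.invalid/arrow/python", None),
}
```

With a mapping like this, a reST role such as :py:class:`dict` that cannot be resolved locally is looked up in the listed inventories and rendered as a link to the external site.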