
Re: gplot eliminates NaN entries from plot data: msg#00019

gnu.octave.bugs

Subject: Re: gplot eliminates NaN entries from plot data

Hi,

sorry for the delay.

Here is what I found out about the treatment of NaN and Inf entries in datafiles in gnuplot:
  • First I looked into the source code of gnuplot. There I found
    • that gnuplot reads data using the C library function scanf. So if the C library can handle NaN and Inf correctly, those values will be loaded into gnuplot's memory as NaN and Inf values.
    • Then there is a macro, STORE_WITH_LOG_AND_UPDATE_RANGE, defined in src/axis.h of the gnuplot source code, containing this piece of code:
#define STORE_WITH_LOG_AND_UPDATE_RANGE(STORE, VALUE, TYPE, AXIS,      \
                       OUT_ACTION, UNDEF_ACTION)      \
do {                                      \
    /* HBB 20000726: new check, to avoid crashes with axis index -1 */      \
    if (AXIS==-1)                              \
    break;                                  \
    /* HBB 20040304: new check to avoid storing infinities and NaNs */      \
    if (! (VALUE > -VERYLARGE && VALUE < VERYLARGE)) {              \
    TYPE = UNDEFINED;                          \
    UNDEF_ACTION;                              \
    break;                                  \
    }                                      \
...
This macro is used in the plot routines of gnuplot to process the individual data points.
So NaN and Inf values will be treated as undefined (i.e., missing) and will thus interrupt lines (provided that the C library regards NaN and Inf as lying outside the [-VERYLARGE, VERYLARGE] interval).
  • Then, I also asked in the gnuplot newsgroup about this topic. There, I got the following information: Actually, gnuplot knows three cases in data files:
    • missing values, marked by a string explicitly defined by
        set datafile missing {"<string>"}
      In this case, gnuplot plots a gap in the line. The handling of these strings is implemented in the data file loading code of gnuplot.
    • illegal entries, i.e., any string that cannot be interpreted as a number. Such entries will lead to the kind of strange behaviour you wrote about, if plotted without a using parameter in the plot command. So this is to be avoided.
      If you provide a using option, e.g.
        plot "file" using 1:2
      the lines containing illegal entries will be completely ignored while reading the file. So there will be no gap in the plotted line (and no strange behaviour).
    • NaN and Inf values in the data file are the third case. They are not explicitly treated by the file reading routine in gnuplot (here, gnuplot completely relies on the scanf function), but rather by the actual plotting routines as shown above. 
What to do with this?
  • If all (or most) of the users of Octave use a gnuplot compiled with a modern version of the C library which knows about Inf and NaN, it could be ok to apply your patch (and remove the calls to strip_infnan).
  • If this is not the case, one option would be to replace the calls to strip_infnan by calls to a (new) replace_inf_by_nan function and add a
        set datafile missing "NaN"
    to the generated gnuplot commands.
  • Another option could be to add a "using ..." option to the generated gnuplot commands.
What do you think?


regards

Thorsten Meyer
    

    
Thorsten Meyer wrote:
John W. Eaton wrote:

On 31-Mar-2005, Thorsten Meyer <thorsten.meyier@xxxxxx> wrote:

| --------
| Bug report for Octave 2.1.67 configured for i386-pc-linux-gnu
| (or rather misfeature for my particular application)
|
| Description:
| -----------
|
| I am trying to program an octave function that plots 3d data (x,y,z) by
| placing symbols on the (x,y) coordinates, the size of which are
| determined by the z values.
| In gnuplot this can nicely be done by defining a line along all the
| symbol boundaries with nonnumerical values in between such that no
| connecting lines between the symbols are plotted.
| E.g., plotting this data file with gnuplot (version 4.0)
|      0     0
|      2     0
|      2     2
|      2     0
|      0     0
|    NaN   NaN
|     10    10
|     14    10
|     14    14
|     10    14
|     10    10
|    NaN   NaN
|      5     5
|      8     5
|      8     8
|      5     8
|      5     5
| will result in three squares of different size
|
| Unfortunately, the gplot function in octave (and consequently all the
| high level plot functions) cuts out all NaNs out of the data to be
| plotted.

As I recall, this feature was implemented because for many years,
gnuplot did not do anything useful with NaNs in the data.  Likewise
for Inf.

| Would you mind if the behaviour of gplot was changed not to filter the
| NaNs out of the data? And could somebody point out to me the part of the
| source code to be changed for this?

Look for the functions save_ascii_data_for_plotting and save_three_d
in the file src/ls-oct-ascii.cc.  If gnuplot can handle Inf and NaNs
in random locations (not just a complete row of NaN values) then the
change to Octave is trivial (see below for patch).

What does gnuplot 4.x do with Inf values, or data that looks like

    10    14
    10    10
   NaN     5
     5   NaN
     8     5
     8     8

If it does the "right thing" then I will apply the following changes.

Thanks,

jwe


src/ChangeLog:

2005-03-31  John W. Eaton  <jwe@xxxxxxxxxx>

    * ls-oct-ascii.cc (save_ascii_data_for_plotting, save_three_d):
    Don't strip Inf and NaN values from plot data.


Index: src/ls-oct-ascii.cc
===================================================================
RCS file: /cvs/octave/src/ls-oct-ascii.cc,v
retrieving revision 1.7
diff -u -r1.7 ls-oct-ascii.cc
--- src/ls-oct-ascii.cc    28 Dec 2004 02:43:01 -0000    1.7
+++ src/ls-oct-ascii.cc    31 Mar 2005 19:50:14 -0000
@@ -509,7 +509,7 @@
{
  bool infnan_warned = true;

-  return save_ascii_data (os, t, name, infnan_warned, true, false, 0);
+  return save_ascii_data (os, t, name, infnan_warned, false, false, 0);
}

// Maybe this should be a static function in tree-plot.cc?
@@ -540,7 +540,6 @@
        warning ("ignoring last %d columns", extras);

      Matrix tmp = tc.matrix_value ();
-      tmp = strip_infnan (tmp);
      nr = tmp.rows ();

      for (int i = 0; i < nc-extras; i += 3)
@@ -553,7 +552,6 @@
      else
    {
      Matrix tmp = tc.matrix_value ();
-      tmp = strip_infnan (tmp);
      nr = tmp.rows ();

      for (int i = 0; i < nc; i++)
 

Dear John,

my gnuplot (Debian, gnuplot version 4.0 patchlevel 0, no special user configuration) treats all non-numerical entries the same:
lines containing such entries are ignored and the plotted lines will not be connected across these values.

However, the manual at http://www.gnuplot.info/docs/gnuplot.html#set_datafile_missing still
predicts different behaviour depending on whether the missing value is in the x or the y column and also depending on the using option of the plot command. I tried the examples given there and they all give the same result: two disconnected lines.

Is there a bug in my gnuplot? Is the documentation of gnuplot no longer up to date?

Do you know a mailing list for gnuplot, where I could pose this question?

By the way, if you come to the decision to apply your patch, would you please also replace the

  bool infnan_warned = true;
by
  bool infnan_warned = false;
?

regards

Thorsten



-------------------------------------------------------------
Octave is freely available under the terms of the GNU GPL.

Octave's home on the web:  http://www.octave.org
How to fund new projects:  http://www.octave.org/funding.html
Subscription information:  http://www.octave.org/archive.html
-------------------------------------------------------------

