Subject: Re: Questions about the array interface. - msg#00112
List: python.numeric.general
--- Chris Barker <Chris.Barker@xxxxxxxx> wrote:
>
> I can see that it would, but then, we're stuck with checking for all
> these optional attributes. If I don't bother to check for it, one day,
> someone is going to pass a weird array in with an offset, and a strange
> bug will show up.
>
Everyone seems to think that an offset is so weird. I haven't looked at
the internals of Numeric/scipy.base in a while so maybe it doesn't apply
there. However, if you subscript an array and return a view to the data,
you need an offset or you need to create a new buffer that encodes the
offset for you.
    A = reshape(arange(9), (3,3))
        0, 1, 2
        3, 4, 5
        6, 7, 8
    B = A[2]        # create a view into A
        6, 7, 8     # shared with the data above
Unless you're going to create a new buffer (which I guess is what Numeric
is doing), the offset for B would be 6 in this very simple case. I think
specifying the offset is much more elegant than creating a new buffer
object with a hidden offset that refers to the old buffer object.
I guess all I'm saying is that I wouldn't assume the offset is zero...
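A minimal sketch of the bookkeeping described above, in plain Python with a list standing in for the shared buffer. The helper `element` and the `(offset, shape, strides)` tuples, all counted in elements rather than bytes, are illustrative assumptions and not part of any proposed interface:

```python
# Shared flat buffer holding the 3x3 array A in row-major order.
buf = list(range(9))

# Hypothetical view descriptions: (offset, shape, strides), in elements.
A = (0, (3, 3), (3, 1))
B = (6, (3,), (1,))   # B = A[2]: same buffer, offset 2 * 3 = 6


def element(buffer, view, index):
    """Look up one element of a view without copying the buffer."""
    offset, shape, strides = view
    pos = offset + sum(i * s for i, s in zip(index, strides))
    return buffer[pos]


row = [element(buf, B, (j,)) for j in range(3)]
print(row)  # [6, 7, 8]

# A write to the shared buffer is visible through both views.
buf[7] = 70
assert element(buf, A, (2, 1)) == element(buf, B, (1,)) == 70
```

A consumer that assumed an offset of zero would read 0, 1, 2 for B instead of 6, 7, 8.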
>
> Couldn't it be required, and return a reference to itself if that works?
>
> Maybe I'm just being lazy, but it feels clunky and prone to errors to
> keep having to check if an attribute exists, then use it (or not).
>
The problem is that you aren't being lazy enough. :-)
The fact that a lot of these attributes are optional should be hidden in
helper functions like those in Travis's array_interface.py module, or a
C/C++ include file (with inline functions).
In a short while, you shouldn't have to check any __array_metadata__
attributes directly. There should even be a helper function for getting
the array elements.
It wouldn't be a horrible mistake to have all the attributes be mandatory,
but it doesn't give array consumers any benefit that they can't get from a
well-written helper library, and it does add some burden to array
producers.
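As a rough sketch of what such a helper might look like on the Python side: the function name `array_info`, the attribute name `__array_offset__`, and the choice of defaults are assumptions made for illustration, not the contents of array_interface.py.

```python
def array_info(obj):
    """Collect array metadata, filling in defaults for optional attributes.

    __array_shape__, __array_typestr__ and __array_data__ are treated as
    required here; __array_strides__ and __array_offset__ (an assumed
    name) are treated as optional.
    """
    shape = tuple(obj.__array_shape__)
    typestr = obj.__array_typestr__
    # 'i4' -> 4, '<f8' -> 8: drop any byte-order mark, then the type code.
    itemsize = int(typestr.lstrip('<>|=')[1:])

    strides = getattr(obj, '__array_strides__', None)
    if strides is None:
        # Default: C-contiguous strides, in bytes.
        computed = []
        step = itemsize
        for n in reversed(shape):
            computed.insert(0, step)
            step *= n
        strides = tuple(computed)

    return {
        'shape': shape,
        'typestr': typestr,
        'itemsize': itemsize,
        'strides': tuple(strides),
        'offset': getattr(obj, '__array_offset__', 0),
        'data': obj.__array_data__,
    }


class Points:
    """Minimal producer that supplies only the required attributes."""
    __array_shape__ = (5, 2)
    __array_typestr__ = 'i4'
    __array_data__ = bytes(5 * 2 * 4)


info = array_info(Points())
print(info['strides'], info['offset'])  # (8, 4) 0
```

With something like this in place, the consumer never calls hasattr itself; it reads one complete record whether or not the producer supplied the optional pieces.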
Cheers,
-Scott
-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now.
http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click
Previous Message by Date:
Re: Questions about the array interface.
--- Travis Oliphant <oliphant@xxxxxxxxxx> wrote:
> >
> > 2) As __array_strides__ is optional, I'd kind of like to have a
> > __contiguous__ flag that I could just check, rather than checking for
> > the existence of strides, then calculating what the strides should be,
> > then checking them.
>
>
> I don't want to add too much. The other approach is to establish a set
> of helper functions in Python to check this sort of thing: Thus, if
> you can't handle a general array you check:
>
> ndarray.iscontiguous(obj)
>
> where obj exports the array interface.
>
> But, it could really go either way. What do others think?
>
I think this should definitely be done in the helper functions. Having
extra attributes encode redundant information is a recipe for trouble.
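One way such a helper could derive the answer from the strides, so that no separate flag has to be kept in sync. This is a sketch only: the signature below takes the shape, strides and item size directly, which differs from the `ndarray.iscontiguous(obj)` spelling quoted above, and it assumes strides are given in bytes.

```python
def iscontiguous(shape, strides, itemsize):
    """Return True if strides (in bytes) describe a C-contiguous layout.

    strides=None means the optional attribute was absent, which is taken
    here to mean the data is contiguous.
    """
    if strides is None:
        return True
    if 0 in shape:
        return True  # an empty array has no elements to misplace
    expected = itemsize
    for n, s in zip(reversed(shape), reversed(strides)):
        # The stride of a length-1 dimension is never used, so skip it.
        if n != 1 and s != expected:
            return False
        expected *= n
    return True


print(iscontiguous((3, 3), (12, 4), 4))  # True
print(iscontiguous((3, 3), (24, 4), 4))  # False
print(iscontiguous((3, 3), None, 4))     # True
```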
Cheers,
-Scott
Next Message by Date:
Re: Questions about the array interface.
Hi Chris, Travis, ...
Great conversation you've started. I have two questions at the
moment... I do love the idea that an abstraction can bring the
different but similar num* worlds together.
Which SourceForge CVS repository will the interface (and an
implementation) show up in first? My guess is numpy/numeric3.
I see Travis has been updating it while I sleep.
> def DrawPointList(self, points, pens=None):
>     ...
>     # (some checking code on the pens)
>     ...
>     if (hasattr(points, '__array_shape__') and
>         hasattr(points, '__array_typestr__') and
>         len(points.__array_shape__) == 2 and
>         points.__array_shape__[1] == 2 and
>         points.__array_typestr__ == 'i4'
>        ):  # this means we have a compliant array
>         # return the array protocol version
>         return self._DrawPointArray(points.__array_data__, pens, [])
>         # This needs to be written now!
This means that whenever you have some complex multivalued
multidimensional structure with the data you want to plot, you have to
reshape it into the above 'compliant' array before passing it on. I'm
a newbie, but is this reshape something where the data has to be
copied and take up memory twice? If not, then great, you would
painlessly reshape into something that had a different set of strides
that just accessed the data that complied in the big blob of data. If
the reshape is expensive, then maybe we need the array abstraction,
and then a second 'thing' that described which parts of the array to
use for the sequence of 2-tuples to use for plotting the x,y s of a
scatter plot. (or whatever)
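To illustrate the cheap case: a strided view can pick the x, y fields out of a larger blob without copying it. The sketch below uses a plain Python list as the blob; the record layout, the `points` helper, and the element-counted offset and strides are all made up for the example.

```python
# Flat blob of 3 records with 4 fields each: x, y, temperature, id.
blob = [10, 20, 7, 1,
        11, 21, 8, 2,
        12, 22, 9, 3]

# Hypothetical (N, 2) view onto the x, y fields, counted in elements:
# step 4 elements to reach the next record, 1 element to go from x to y.
offset, shape, strides = 0, (3, 2), (4, 1)


def points(buffer, offset, shape, strides):
    """Yield (x, y) pairs straight from the blob, without copying it."""
    for i in range(shape[0]):
        base = offset + i * strides[0]
        yield (buffer[base], buffer[base + strides[1]])


xy = list(points(blob, offset, shape, strides))
print(xy)  # [(10, 20), (11, 21), (12, 22)]
```

Such a view is not contiguous, so a consumer that only accepts contiguous data would still have to copy it.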
I do think we can accept more than just i4 for a datatype, especially
since a last-minute cast to i4 is inexpensive for almost every data
type.
>     else:
>         # return the generic python sequence version
>         return self._DrawPointList(points, pens, [])
>
> Then we'll need a function (in C++):
> _DrawPointArray(points.__array_data__, pens,[])
Looks great.
-Jim
Previous Message by Thread:
Re: Questions about the array interface.
Travis Oliphant wrote:
Scott Gilbert wrote:
--- "David M. Cooke" <cookedm@xxxxxxxxxxxxxxxxxxx> wrote:
Good point, but a pain. Maybe they should be required, that way I
don't have to first check for the presence of '<' or '>', then check
if they have the right value.
I'll second this. Pulling out more Python Zen: Explicit is better than
implicit.
I'll third.
O.K. It's done....
Here's a bit of weirdness which has prevented me from using '<' or '>'
in the past with the struct module. I'm not guru enough to know what's
going on, but it has prevented me from being explicit rather than implicit.
In [1]:import struct
In [2]:from numarray.ieeespecial import nan
In [3]:nan
Out[3]:nan
In [4]:struct.pack('<d',nan)
---------------------------------------------------------------------------
exceptions.SystemError                    Traceback (most recent call last)
/home/astraw/<console>
SystemError: frexp() result out of range
In [5]:struct.pack('d',nan)
Out[5]:'\x00\x00\x00\x00\x00\x00\xf8\xff'
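A small script for checking how a given interpreter handles this. It makes no claim about which Python versions show the failure above; it simply tries native and explicit byte orders and reports what happens. `float('nan')` stands in for the numarray constant used in the session.

```python
import math
import struct

nan = float('nan')

results = {}
for fmt in ('d', '<d', '>d'):
    try:
        packed = struct.pack(fmt, nan)
    except (SystemError, OverflowError, struct.error) as exc:
        # This is the kind of failure shown in the session above.
        results[fmt] = None
        print(fmt, 'failed:', exc)
        continue
    (value,) = struct.unpack(fmt, packed)
    results[fmt] = math.isnan(value)
    print(fmt, packed, 'round-trips as NaN:', results[fmt])
```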
Next Message by Thread:
Re: Questions about the array interface.
Scott Gilbert wrote:
> I think __array_version__ (or __array_protocol__?) is the
> better choice. How about have it optional and default to 1? If it's
> present and greater than 1 then it means there is something new going
> on...
Again, I'm uncomfortable with something that I have to check being
optional. If it is, we're encouraging people to not check it, and that's
a recipe for bugs later on down the road.
> Everyone seems to think that an offset is so weird. I haven't looked at
> the internals of Numeric/scipy.base in a while so maybe it doesn't apply
> there. However, if you subscript an array and return a view to the data,
> you need an offset or you need to create a new buffer that encodes the
> offset for you.
> I guess all I'm saying is that I wouldn't assume the offset is zero...
Good point. All the more reason to have the offset be mandatory.
> The fact that a lot of these attributes are optional should be hidden in
> helper functions like those in Travis's array_interface.py module, or a
> C/C++ include file (with inline functions).
Yes, if there is a C/C++ version of all these helper functions, I'll be
a lot happier. And you're right, the same information should not be
encoded in two places, so my "iscontiguous" attribute should be a helper
function or maybe a method.
> In a short while, you shouldn't have to check any __array_metadata__
> attributes directly. There should even be a helper function for getting
> the array elements.
Cool. How would that work? A C++ iterator? I'm thinking not, as this is
all C, no?
> It wouldn't be a horrible mistake to have all the attributes be mandatory,
> but it doesn't give array consumers any benefit that they can't get from a
> well written helper library, and it does add some burden to array
> producers.
Hardly any. I'm assuming that there will be a base_array class that can
be used as a base class or mixin, so it wouldn't be any work at all to
have a full set of attributes with defaults. It would take up a little
bit of memory. I'm assuming that the whole point of this is to support
large datasets, but maybe that isn't a valid assumption. After all,
small array support has turned out to be very important for Numeric.
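A sketch of what such a base class might look like. The class name `BaseArray`, the attribute name `__array_offset__`, and the default values are assumptions for illustration; nothing here is taken from an actual implementation.

```python
class BaseArray:
    """Hypothetical mixin supplying defaults for the optional attributes."""

    # Class-level defaults are stored once on the class, not per instance.
    __array_offset__ = 0    # assumed attribute name
    __array_version__ = 1

    @property
    def __array_strides__(self):
        # Default: C-contiguous strides in bytes, derived on demand from
        # the required shape and typestr attributes.
        itemsize = int(self.__array_typestr__.lstrip('<>|=')[1:])
        strides = []
        step = itemsize
        for n in reversed(self.__array_shape__):
            strides.insert(0, step)
            step *= n
        return tuple(strides)


class Image(BaseArray):
    """A producer that fills in only the required attributes."""

    def __init__(self, rows, cols):
        self.__array_shape__ = (rows, cols)
        self.__array_typestr__ = '<f8'
        self.__array_data__ = bytearray(rows * cols * 8)


img = Image(2, 3)
print(img.__array_strides__, img.__array_offset__, img.__array_version__)
# (24, 8) 0 1
```

Because the defaults live on the class and the strides are computed on request, the per-instance memory cost in this sketch is nil.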
As a rule of thumb, I think there will be more consumers of arrays than
producers, so I'd rather make it easy on the consumers than on the
producers, if we need to make such a trade-off. Maybe I'm biased,
because I'm a consumer.
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@xxxxxxxx