osdir.com
mailing list archive

Subject: RE: Getting the indexes of the myarray.min() - msg#00038

List: python.numeric.general

Date: Prev Next Index Thread: Prev Next Index
Hi,
Raymond D. Hettinger is writing a general statistics module 'statistics.py A
collection of functions for summarizing data' that is somewhere in a Python
CVS (I can not find the exact reference but it appeared in a fairly recent
Python thread). He uses a one-pass algorithm from Knuth for the variance that
has good numerical stability.

Below is a rather rough version modified from my situation (masked arrays) which
uses Knuth's algorithm for the variance. It lacks features like checking
dimensions (assumes variance can be computed) and documentation.

Regards
Bruce Southey

import numarray

def SummaryStats(Matrix):
mshape=Matrix.getshape()
nrows=mshape[0]
ncols=mshape[1]
#print nrows, ncols
# Create matrices to hold statistics
N_obs =numarray.zeros(ncols, type='Float64')
Sum =numarray.zeros(ncols, type='Float64')
Var =numarray.zeros(ncols, type='Float64')
Min =numarray.zeros(ncols, type='Float64')
Max =numarray.zeros(ncols, type='Float64')
Mean =numarray.zeros(ncols, type='Float64')
AdjM =numarray.zeros(ncols, type='Float64')
NewM =numarray.zeros(ncols, type='Float64')
DifM =numarray.zeros(ncols, type='Float64')

for row in range(nrows):
for col in range(ncols):
t_value=Matrix[row,col]
N_obs[col] = N_obs[col] + 1
Sum[col] = Sum[col] + t_value
if t_value > Max[col]:
Max[col]=t_value
if t_value < Min[col]:
Min[col]=t_value
if N_obs[col]==1:
Mean[col]=t_value
AdjM[col]=(t_value-Mean[col])/(N_obs[col])-DifM[col]
NewM[col]=Mean[col]+AdjM[col]
DifM[col]=(NewM[col]-Mean[col])-AdjM[col]
Var[col] = Var[col] +
(t_value-Mean[col])*(t_value-NewM[col])
Mean[col] = NewM[col]
print 'N_obs\n', N_obs
print 'Sum\n', Sum
print 'Mean\n', Mean
print 'Var\n', Var/(nrows-1)

if __name__ == '__main__':
MValues=numarray.array([[1,2,1],[3,2,2],[5,1,1],[4,3,2]])
SummaryStats(MValues)



---- Original message ----
>Date: Thu, 13 May 2004 15:42:30 -0400
>From: "Perry Greenfield" <perry@xxxxxxxxx>
>Subject: RE: [Numpy-discussion] Getting the indexes of the myarray.min()
>To: "Russell E Owen" <rowen@xxxxxxxxxxxxxxxx>, "numarray"
<numpy-discussion@xxxxxxxxxxxxxxxxxxxxx>
>
>> Russell E Owen wrote:
>>
>> At 9:27 AM -0400 2004-05-13, Perry Greenfield wrote:
>> >... One has to trade off the number of such functions
>> >against the speed savings. Another example is getting max and min values
>> >for an array. I've long thought that this is so often done they could
>> >be done in one pass. There isn't a function that does this yet though.
>>
>> Statistics is another area where multiple return values could be of
>> interest -- one may want the mean and std dev, and making two passes
>> is wasteful (since some of the same info needs to be computed both
>> times).
>>
>> A do-all function that computes min, min location, max, max location,
>> mean and std dev all at once would be nice (especially if the
>> returned values were accessed by name, rather than just being a tuple
>> of values, so they could be referenced safely and readably).
>>
>> -- Russell
>>
>We will definitely add something like this for 1.0 or 1.1.
>(but probably for min and max location, it will just be
>for the first encountered).
>
>Perry
>
>
>-------------------------------------------------------
>This SF.Net email is sponsored by: SourceForge.net Broadband
>Sign-up now for SourceForge Broadband and get the fastest
>6.0/768 connection for only $19.95/mo for the first 3 months!
>http://ads.osdn.com/?ad_id=2562&alloc_id=6184&op=click
>_______________________________________________
>Numpy-discussion mailing list
>Numpy-discussion@xxxxxxxxxxxxxxxxxxxxx
>https://lists.sourceforge.net/lists/listinfo/numpy-discussion


-------------------------------------------------------
This SF.Net email is sponsored by: SourceForge.net Broadband
Sign-up now for SourceForge Broadband and get the fastest
6.0/768 connection for only $19.95/mo for the first 3 months!
http://ads.osdn.com/?ad_id=2562&alloc_id=6184&op=click


Was this page helpful?
Yes No
Thread at a glance:

Previous Message by Date: click to view message preview

RE: Getting the indexes of the myarray.min()

On Thu, 13 May 2004, Russell E Owen wrote: > Statistics is another area where multiple return values could be of > interest -- one may want the mean and std dev, and making two passes > is wasteful (since some of the same info needs to be computed both > times). Single pass std deviations don't work very well if you've got a lot of data points and the std deviation is small compared to the average. I'm not arguing aginst including them, but maybe the documentation for such a function should include a caveat. Warren Focke ------------------------------------------------------- This SF.Net email is sponsored by: SourceForge.net Broadband Sign-up now for SourceForge Broadband and get the fastest 6.0/768 connection for only $19.95/mo for the first 3 months! http://ads.osdn.com/?ad_id=2562&alloc_id=6184&op=click

Next Message by Date: click to view message preview

RE: outer idiom (was Re: Getting the indexes of themyarray.min())

Álvaro Tejero Cantero wrote: > > but this gives > >>> subtract.outer(r,r).shape > (10, 3, 10, 3) > > that is, subtracts y coordinates to x coordinates which is not intended. > AFAIK the outer solution is MUCH faster than the nested for loops, so > what I do now is > > >>> r_rel = transpose(array([subtract.outer(r[:,0],r[:,0]), > subtract.outer(r[:,1],r[:,1]), > subtract.outer(r[:,2],r[:,2])])) > >>> r_rel.shape #as with the double loop > (10,10,3) > > > My question is then if there is any more elegant way to do this, > especially giving as a result independence of the number of dimensions). > Not that I can think of at the moment. > Maybe an "axis" (=0 in this case?) keyword for the outer function would > be useful in this context? > Perhaps. Right now these ufunc methods are pretty complicated so it may not be easy to do, but I understand that there is certainly utility in being able to do that. We'll look into it (but not right away so don't hold your breath). Perry ------------------------------------------------------- This SF.Net email is sponsored by: SourceForge.net Broadband Sign-up now for SourceForge Broadband and get the fastest 6.0/768 connection for only $19.95/mo for the first 3 months! http://ads.osdn.com/?ad_id=2562&alloc_id=6184&op=click

Previous Message by Thread: click to view message preview

Re: Getting the indexes of the myarray.min()

Álvaro Tejero Cantero wrote: Hello, I am new to numarray. I have been searching the documentation, and see no evident vectorial and concise way to get the indexes for the minimum of an array times.min(). If I understand your question correctly, you want argmin: >>> import numarray as na >>> from numarray import random_array >>> times = random_array.randint(0, 16, [4,4]) >>> times array([[15, 3, 11, 10], [ 1, 1, 15, 7], [ 4, 3, 5, 6], [10, 15, 12, 3]]) >>> na.argmin(times) # Takes min along axis 0 array([1, 1, 2, 3]) >>> na.argmin(times, 1) # Take min along axis 1 array([1, 1, 1, 3]) -tim Currently my two alternatives are: (times is NxN array) husband, wife = nonzero(equal(times.min(),times)) #gives tuple of 1x1 arrays, each containing one index and (even uglier) husband = compress(times.min()==times,indices([N,N])[0]) wife = compress(times.min()==times,indices([N,N])[1]) These are weird ways to get something as simple. I am surely missing something, but I have tried several slicing strategies before without success. For getting the minimum times in each row I use: choose(argmin(times),transpose(times)) What are the idioms in numpy for these tasks? Thank you very much in advance, á. I would ------------------------------------------------------- This SF.Net email is sponsored by Sleepycat Software Learn developer strategies Cisco, Motorola, Ericsson & Lucent use to deliver higher performing products faster, at low TCO. http://www.sleepycat.com/telcomwpreg.php?From=osdnemail3

Next Message by Thread: click to view message preview

puzzling compiler warning for numeric extension

I modified an existing numeric c extension today, adding a function that returns a new array. This is the first time I've used NA_NewArray. The code seems to run fine, but I get a disturbing compiler warning: [rowen:Shared Files/Guider/PyGuide] rowen% python setup.py build ... src/radProfModule.c: In function `Py_radSqByRadInd': src/radProfModule.c:378: warning: return from incompatible pointer type The associated code can be summarized as follows (and it is simple enough that I've also appended a copy). The error is thrown on the last line: static PyObject *Py_radSqByRadInd(PyObject *self, PyObject *args) { long nElt; Long *radSqByRadInd; if (!PyArg_ParseTuple(args, "l", &nElt)) return NULL; ... return NA_NewArray((void *)radSqByRadInd, tLong, 1, nElt); } So...am I actually doing something wrong, or is this warning normal? -- Russell P.S. the full code: static PyObject *Py_radSqByRadInd(PyObject *self, PyObject *args) { long nElt; Long radInd; char ModName[] = "radSqByRadInd"; Long *radSqByRadInd; if (!PyArg_ParseTuple(args, "l", &nElt)) return NULL; if (nElt < 0) { PyErr_Format(PyExc_ValueError, "%s: nPts < 0", ModName); return NULL; } // allocate max(3, nElt) elements (simplifies the computation // and avoids calling calloc on 0 elements) radSqByRadInd = calloc(MAX(nElt, 3), sizeof *radSqByRadInd); if (radSqByRadInd == NULL) { PyErr_Format(PyExc_MemoryError, "%s: insufficient memory", ModName); return NULL; } for (radInd=0; radInd<3; radInd++) { radSqByRadInd[radInd] = radInd; } for (radInd=3; radInd<nElt; radInd++) { radSqByRadInd[radInd] = (radInd - 1) * (radInd - 1); } return NA_NewArray((void *)radSqByRadInd, tLong, 1, nElt); } ------------------------------------------------------- This SF.Net email is sponsored by: SourceForge.net Broadband Sign-up now for SourceForge Broadband and get the fastest 6.0/768 connection for only $19.95/mo for the first 3 months! http://ads.osdn.com/?ad_id=2562&alloc_id=6184&op=click
Sign up for updates to this mailing list. email:
Loading Comments...
Home | News | Patents | Sitemap | FAQ | advertise

Advertising by