Re: [patch #5690] Clean up case code: msg#00080

statistics.pspp.devel

Subject: Re: [patch #5690] Clean up case code

John Darrington <john@xxxxxxxxxxxxxxx> writes:

> On Tue, Jan 30, 2007 at 10:30:43PM -0800, Ben Pfaff wrote:
>
> > So I'm not proposing to
> > encourage use of random access where it's not necessary.
>
> Would it therefore be worth having a flag passed to the casereader
> constructor which declares whether or not the casereader performs
> random access?

What's the intended usage?

> > Probably the most powerful thing to stack on top of a casereader
> > is what I'm tentatively calling a "casegrouper". A casegrouper
> > takes a casereader and a function that classifies consecutive
> > pairs of cases as in the same group or different groups. It then
> > hands you a sequence of casereaders, one by one, each of which
> > contains a single group. This is invaluable for SPLIT FILE, for
> > break groups on AGGREGATE or RANK or SORT CASES, and so on.
>
> Sounds good. I was thinking about looking at the percentiles code
> again. (The more I look at it the less I like it. Also, I'm not
> convinced that it gets the right answers in all cases. It needs more
> test cases.) But in view of the magnitude of the changes you're
> making, I think I'll wait. The new functions will probably make it
> simpler.

If the new functions will make it simpler, OK. Otherwise, please
go ahead and work on anything you like. I'll merge into my tree
as needed.
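
To make the casegrouper idea quoted above concrete, here is a minimal
sketch in C. It is not the PSPP implementation: every name in it
(toy_case, toy_grouper, toy_group, same_key) is hypothetical, a plain
in-memory array stands in for the underlying casereader, and a
(pointer, count) pair stands in for the per-group casereader that the
real design would hand back. It only illustrates the technique
described: compare consecutive pairs of cases with a caller-supplied
function and yield one group at a time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PSPP case: just one integer key. */
struct toy_case
  {
    int key;
  };

/* Classifier: returns true if two consecutive cases A and B belong
   to the same group, false if B starts a new group. */
typedef bool same_group_func (const struct toy_case *a,
                              const struct toy_case *b);

/* Hypothetical grouper over an in-memory array of cases. */
struct toy_grouper
  {
    const struct toy_case *cases;   /* All the input cases. */
    size_t n;                       /* Number of input cases. */
    size_t pos;                     /* First case not yet handed out. */
    same_group_func *same_group;    /* Pairwise classifier. */
  };

/* One group: a contiguous run of cases.  Stands in for the
   per-group casereader in the design under discussion. */
struct toy_group
  {
    const struct toy_case *first;
    size_t n;
  };

/* Initializes G to group the N cases in CASES using SAME_GROUP. */
void
toy_grouper_init (struct toy_grouper *g, const struct toy_case *cases,
                  size_t n, same_group_func *same_group)
{
  assert (same_group != NULL);
  g->cases = cases;
  g->n = n;
  g->pos = 0;
  g->same_group = same_group;
}

/* Stores the next group in *OUT and returns true, or returns false
   if the input is exhausted. */
bool
toy_grouper_next (struct toy_grouper *g, struct toy_group *out)
{
  size_t start, end;

  if (g->pos >= g->n)
    return false;

  /* Extend the group while each consecutive pair is classified as
     belonging together. */
  start = g->pos;
  end = start + 1;
  while (end < g->n && g->same_group (&g->cases[end - 1], &g->cases[end]))
    end++;

  out->first = &g->cases[start];
  out->n = end - start;
  g->pos = end;
  return true;
}

/* Example classifier: cases with equal keys are in the same group. */
bool
same_key (const struct toy_case *a, const struct toy_case *b)
{
  return a->key == b->key;
}
```

A caller would initialize the grouper once and then call
toy_grouper_next() in a loop until it returns false, processing each
group as it arrives. Because only adjacent cases are compared, the
input has to be sorted (or at least clustered) on the grouping key
beforehand, which is presumably also how break groups for SPLIT FILE
or AGGREGATE would be fed.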

> > * Write an extensive section for the manual describing
> >   best practices for data processing under PSPP. I'm
> >   confident that, with this set of changes, PSPP data
> >   processing will be mature enough that we can provide
> >   good guidance for future developers this way.
> >
> >   I might break this into a separate developers' guide,
> >   along with the existing chapter on q2c. What do you
> >   think?
>
> I think a developers' guide is a good idea. q2c docs really don't
> belong in the user manual, so should be moved, along with the *.sav
> file format description.

OK, I was thinking about moving the .sav and .por descriptions
too, so you've just confirmed it for me.

> > data_model is a really really generic name. It could be a name
> > for the model for any kind of data. The name datasheet calls to
> > my mind a spreadsheet, which more specifically describes what the
> > datasheet actually implements. So I'm not 100% happy with the
> > suggestion data_model.
>
> How about datasheetmodel or would that be too long?

It might work.
--
A bicycle is one of the world's beautiful machines, beautiful machines
are art, and art is civilisation, good living, and balm to the soul.
--Elisa Francesca Roselli

