Re: [patch #5690] Clean up case code (msg#00079, statistics.pspp.devel)
On Tue, Jan 30, 2007 at 10:30:43PM -0800, Ben Pfaff wrote:

> So I'm not proposing to encourage use of random access where it's not necessary.

Would it therefore be worth having a flag passed to the casereader constructor which declares whether or not the casereader performs random access?

> Probably the most powerful thing to stack on top of a casereader is what I'm tentatively calling a "casegrouper". A casegrouper takes a casereader and a function that classifies consecutive pairs of cases as in the same group or different groups. It then hands you a sequence of casereaders, one by one, each of which contains a single group. This is invaluable for SPLIT FILE, for break groups on AGGREGATE or RANK or SORT CASES, and so on.

Sounds good. I was thinking about looking at the percentiles code again. (The more I look at it the less I like it. Also, I'm not convinced that it gets the right answers in all cases. It needs more test cases.) But in view of the magnitude of the changes you're making, I think I'll wait. The new functions will probably make it simpler.

> What I have left:
>
> * Make the GUI compile and work again. Currently it does neither. As part of that, finish and test the datasheet implementation. I might need help or advice with some of the GUI stuff, but I don't know yet.

No worries.

> * Write an extensive section for the manual describing best practices for data processing under PSPP. I'm confident that, with this set of changes, PSPP data processing will be mature enough that we can provide good guidance for future developers this way. I might break this into a separate developers' guide, along with the existing chapter on q2c. What do you think?

I think a developers' guide is a good idea. The q2c docs really don't belong in the user manual, so they should be moved, along with the *.sav file format description.

> I'm really excited about this set of changes. It feels to me like one-third of the important PSPP implementation (the data processing) is finally coming together. The other two-thirds are syntax parsing and output formatting (see the end of the PSPP README), and I finally have ideas for those that I think will really work.

> data_model is a really really generic name. It could be a name for the model for any kind of data. The name datasheet calls to my mind a spreadsheet, which more specifically describes what the datasheet actually implements. So I'm not 100% happy with the suggestion data_model.

How about datasheetmodel, or would that be too long?

J'

-- 
PGP Public key ID: 1024D/2DE827B3
fingerprint = 8797 A26D 0854 2EAB 0285 A290 8A67 719C 2DE8 27B3
See http://pgp.mit.edu or any PGP keyserver for public key.
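
For readers following the casegrouper idea in the message above, here is a minimal sketch of the concept in C. It is not PSPP's actual API: `toy_case`, `toy_reader`, `toy_grouper`, and every function name below are hypothetical, and an in-memory array stands in for a real casereader. It only illustrates the described behaviour: a caller-supplied function classifies each consecutive pair of cases as same-group or different-group, and the grouper hands back one group at a time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PSPP case: a grouping key plus one value. */
struct toy_case {
    int key;
    double value;
};

/* Classifier: returns true if consecutive cases A and B are in the same group. */
typedef bool same_group_func(const struct toy_case *a, const struct toy_case *b);

/* Hypothetical stand-in for a casereader holding a single group:
   a read-only window onto N consecutive cases. */
struct toy_reader {
    const struct toy_case *cases;
    size_t n;
};

/* Hypothetical grouper state: the underlying cases, the current
   position, and the classifier. */
struct toy_grouper {
    const struct toy_case *cases;
    size_t n;
    size_t pos;
    same_group_func *same_group;
};

void toy_grouper_init(struct toy_grouper *g, const struct toy_case *cases,
                      size_t n, same_group_func *same_group)
{
    assert(same_group != NULL);
    g->cases = cases;
    g->n = n;
    g->pos = 0;
    g->same_group = same_group;
}

/* Stores the next group in *OUT and returns true,
   or returns false once every case has been handed out. */
bool toy_grouper_next(struct toy_grouper *g, struct toy_reader *out)
{
    if (g->pos >= g->n)
        return false;

    size_t start = g->pos;
    size_t end = start + 1;

    /* Extend the group while each consecutive pair is classified together. */
    while (end < g->n && g->same_group(&g->cases[end - 1], &g->cases[end]))
        end++;

    out->cases = &g->cases[start];
    out->n = end - start;
    g->pos = end;
    return true;
}

/* Example classifier: cases belong together when their keys match. */
bool same_key(const struct toy_case *a, const struct toy_case *b)
{
    return a->key == b->key;
}
```

Because only consecutive pairs are compared, a key that reappears later starts a new group, so input meant for SPLIT FILE-style processing would normally be sorted on the grouping key first.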
pspp-dev mailing list
pspp-dev@xxxxxxx
http://lists.gnu.org/mailman/listinfo/pspp-dev