
Re: anova.m: msg#00056

gnu.octave.bugs

Subject: Re: anova.m

Sorry for dropping this until now.

I made the change you suggested.

Thanks,

jwe


On 10-Feb-2004, Andy Adler wrote:

| The anovan.m code in octave-forge has the correct result in
| this case.
|
| > y = [1 3 4 2 1 5 3 5 6 7 4 5 7 10 11 3]';
| > g = [1 1 1 1 1 1 1 1 2 2 2 2 2 3 3 3]';
| > anovan(y, g)
| 1-way ANOVA Table (Factors A,):
|
| Source of Variation        Sum Sqr   df      MeanSS    Fval   p-value
| *********************************************************************
| Error                        62.80   13        4.83
| Factor A                     61.64    2       30.82   6.380  0.011737
|
| > anova(y, g)
|
| One-way ANOVA Table:
|
| Source of Variation     Sum of Squares    df  Empirical Var
| *********************************************************
| Between Groups                 71.5600     2        35.7800
| Within Groups                  62.8000    13         4.8308
| ---------------------------------------------------------
| Total                         134.3600    15
|
| Test Statistic f                7.4067
| p-value                         0.0071
|
|
| (Aside: I'm still looking for testers of my anovan code. It's quite
| complex, and I suspect there are still bugs.)
|
|
| Andy
|
|
| On Tue, 10 Feb 2004, toni saarela wrote:
|
| > Version: Octave 2.1.50 (i686-pc-linux-gnu)
| >
| > Description:
| >
| > I think there's a small bug in anova.m (which performs one-way analysis
| > of variance). It only occurs when using anova with two input arguments,
| > as in
| >
| > octave:1> anova (y,g)
| >
| > where y is a vector containing the data and g is a vector defining the
| > groups, and only with unequal group sizes.
| >
| > The total mean is calculated from the group means (see below). This
| > works fine if the group sizes are equal. However, if they are not, it
| > gives too much weight to smaller groups in calculation of total mean,
| > sometimes leading to too high estimates of between-groups variance (and
| > of total variance), and thus too high F- and too small p-values.
| >
| > Example:
| >
| > octave:1> y = [1 3 4 2 1 5 3 5 6 7 4 5 7 10 11 3]';
| > octave:2> g = [1 1 1 1 1 1 1 1 2 2 2 2 2 3 3 3]';
| > octave:3> anova (y, g)
| >
| > gives F=7.4067, p=0.0071 (ssq between groups = 71.5600)
| >
| > should be (please correct me if I'm wrong): F=6.3797, p=0.0117 (ssq
| > between groups = 61.6375)
| >
| > Fix:
| > ---
| >
| > Simply replacing the vector group_mean with y (input vector containing
| > all the data) in calculation of total_mean on line 83 should fix it:
| >
| > line 83:
| > total_mean = mean (group_mean);
| >
| > to:
| > total_mean = mean (y);
| >
| > Now the SSQ's produce the right result: (lines 84-86)
| > SSB = sum (group_count .* (group_mean - total_mean) .^ 2);
| > SST = sumsq (reshape (y, n, 1) - total_mean);
| > SSW = SST - SSB;
| >
| > (Or if group_mean is to be used, it should be weighted with relative
| > group sizes)
| >
| > Best regards,
| > Toni Saarela
|
| -------------------------------------------------------------
| Octave is freely available under the terms of the GNU GPL.
|
| Octave's home on the web: http://www.octave.org
| How to fund new projects: http://www.octave.org/funding.html
| Subscription information: http://www.octave.org/archive.html
| -------------------------------------------------------------
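
For anyone who wants to check the numbers in the report by hand, here is a minimal standalone sketch. It is not the code from anova.m: the variable names old_total_mean, new_total_mean and weighted_mean are made up for illustration, the groups are assumed to be labelled 1..k, and the p-value is computed with betainc (via the usual identity for the upper tail of the F distribution) rather than with whatever anova.m calls internally.

```octave
% Data from the report: three groups of sizes 8, 5 and 3.
y = [1 3 4 2 1 5 3 5 6 7 4 5 7 10 11 3]';
g = [1 1 1 1 1 1 1 1 2 2 2 2 2 3 3 3]';

n = numel (y);
k = max (g);                              % assumes labels 1..k
group_count = accumarray (g, 1);          % observations per group
group_mean  = accumarray (g, y) ./ group_count;

% Unweighted mean of the group means (the reported behaviour).
old_total_mean = mean (group_mean);
% Mean over all observations (the proposed fix).
new_total_mean = mean (y);
% Alternative mentioned in the report: weight by group size.
weighted_mean  = sum (group_count .* group_mean) / n;

% Between-groups sum of squares under each choice of total mean.
SSB_old = sum (group_count .* (group_mean - old_total_mean) .^ 2);
SSB_new = sum (group_count .* (group_mean - new_total_mean) .^ 2);

% Within-groups sum of squares, computed directly from the residuals.
SSW = sum ((y - group_mean(g)) .^ 2);

d1 = k - 1;                               % between-groups df
d2 = n - k;                               % within-groups df
F_old = (SSB_old / d1) / (SSW / d2);
F_new = (SSB_new / d1) / (SSW / d2);

% Upper-tail F probability via the regularized incomplete beta function.
p_old = betainc (d2 / (d2 + d1 * F_old), d2 / 2, d1 / 2);
p_new = betainc (d2 / (d2 + d1 * F_new), d2 / 2, d1 / 2);

printf ("old: SSB = %.4f  F = %.4f  p = %.4f\n", SSB_old, F_old, p_old);
printf ("new: SSB = %.4f  F = %.4f  p = %.4f\n", SSB_new, F_new, p_new);
```

With the data above, the "old" line should reproduce the 71.5600 / 7.4067 / 0.0071 figures and the "new" line the 61.6375 / 6.3797 / 0.0117 figures quoted in the thread. weighted_mean comes out equal to new_total_mean, which is why either form of the fix gives the same result, and the within-groups sum of squares (62.80) is the same in both cases.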






