On Thu, Sep 27, 2018 at 05:55:07PM +1200, Greg Ewing wrote:
> jab at math.brown.edu wrote:
> >I understand from
> >https://github.com/cosmologicon/pywat/pull/40#discussion_r219962259
> >that "to always round up... can theoretically skew the data"
>
> *Very* theoretically. If the number is even a whisker bigger than
> 2.5 it's going to get rounded up regardless:
>
> >>> round(2.500000000000001)
> 3
>
> That difference is on the order of the error you expect from
> representing decimal fractions in binary, so I would be surprised
> if anyone can actually measure this bias in a real application.

I think you may have misunderstood the nature of the bias. It's not
about individual roundings and it definitely has nothing to do with
binary representation.

Any one round operation will introduce an error. You had a number, say
2.3, and it gets rounded down to 2.0, introducing an error of -0.3. But
if you have lots of rounds, some will round up, and some will round
down, and we want the rounding errors to cancel.

The errors *almost* cancel using the naive rounding algorithm as most of
the digits pair up:

    .1 rounds down, error = -0.1
    .9 rounds up, error = +0.1

    .2 rounds down, error = -0.2
    .8 rounds up, error = +0.2

etc. If each digit is equally likely, then on average they'll cancel and
we're left with *almost* no overall error.

The problem is that while there are four digits rounding down (.1
through .4) there are FIVE which round up (.5 through .9). Two digits
don't pair up:

    .0 stays unchanged, error = 0
    .5 always rounds up, error = +0.5

Given that for many purposes, our data is recorded only to a fixed
number of decimal places, we're dealing with numbers like 0.5 rather
than 0.5000000001, so this can become a real issue. Across every ten
rounding operations the errors will sum to +0.5 on average, that is
+0.05 per operation, instead of cancelling out. Rounding introduces a
small but real bias.

The most common fix (and, in many experts' opinion, the best default
behaviour) is Banker's Rounding, or round-to-even.
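
The arithmetic above can be checked with a minimal sketch. It assumes
values recorded to one decimal place with every final digit equally
likely, and it uses the decimal module so that values like 0.5 are
exact. The helper name total_rounding_error is made up for this
illustration.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN


def total_rounding_error(values, mode):
    """Return the exact sum of (rounded - original) over all values.

    Hypothetical helper for illustration; `mode` is one of the
    rounding constants from the decimal module.
    """
    return sum(v.quantize(Decimal("1"), rounding=mode) - v for v in values)


# Assumption: 0.0, 0.1, ..., 9.9, so each final digit (.0 through .9)
# appears exactly ten times, 100 values in total.
values = [Decimal(n) / 10 for n in range(100)]

half_up = total_rounding_error(values, ROUND_HALF_UP)
half_even = total_rounding_error(values, ROUND_HALF_EVEN)

# Average error per rounding operation under each rule.
print("round-half-up:  ", half_up / len(values))
print("round-half-even:", half_even / len(values))
```

Under these assumptions the round-half-up average comes out at +0.05
per operation, matching the figure above, while the round-half-even
average is zero.
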
All the other digits round as per the usual rule, but .5 rounds UP half
the time and DOWN the rest of the time:

    0.5, 2.5, 4.5 etc round down, error = -0.5
    1.5, 3.5, 5.5 etc round up, error = +0.5

thus on average the .5 digit introduces no error and the bias goes away.

-- 
Steve
