

Re: MD5 in the read path


Thanks to open source, you can answer that yourself:
https://github.com/apache/cassandra/search?q=md5&unscoped_q=md5
At a glance, it looks like MD5 is used for digest verification and to get a
good hash distribution in the RandomPartitioner.
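
To make those two uses concrete, here is a rough Python sketch (not
Cassandra's actual code). The function names md5_token and digests_match are
made up for illustration, and the token derivation (absolute value of the
digest read as a signed big-endian integer) is my assumption about how
RandomPartitioner turns a hash into a token, so check it against the source
before relying on it.

```python
import hashlib


def md5_token(partition_key: bytes) -> int:
    """Hypothetical helper: map a key to a token via its MD5 digest.

    Assumption: the 16-byte digest is read as a signed big-endian integer
    and the absolute value is taken, giving a token in [0, 2**127].
    """
    digest = hashlib.md5(partition_key).digest()
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


def digests_match(replica_payloads) -> bool:
    """Hypothetical helper: digest-style comparison of replica responses.

    Instead of comparing full payloads, compare their 16-byte MD5 digests;
    any mismatch means at least one replica returned different data.
    """
    digests = {hashlib.md5(payload).digest() for payload in replica_payloads}
    return len(digests) <= 1


if __name__ == "__main__":
    print(md5_token(b"user:42"))
    print(digests_match([b"row-v1", b"row-v1", b"row-v1"]))
    print(digests_match([b"row-v1", b"row-v2", b"row-v1"]))
```

The point of the digest comparison is that only a small fixed-size value has
to be compared per replica, and the point of the token is that MD5 spreads
keys roughly evenly over the token range regardless of how the keys look.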

I haven't done the math, but I suspect CRC32 just isn't good enough either,
whether in terms of how evenly its hash values are distributed or in its
ability to catch multi-bit errors without accidental collisions (that is,
with only 32 bits of output, two different inputs are far more likely to
produce the same checksum).  There are other error-checking algorithms that
are probably less computationally complex, but thanks to years of hardware
and software optimizations geared towards MD5 specifically, you'd be
hard-pressed to find something that gives you comparable speed in practice
for a 128-bit result.
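
For what it's worth, the collision side of that math can be approximated
with the standard birthday bound, p ~ 1 - exp(-n^2 / 2^(b+1)) for n inputs
and a b-bit output. The Python sketch below assumes the outputs behave like
uniformly random values, which is an idealization for both CRC32 and MD5;
the function name collision_probability is made up for illustration. It says
nothing about speed.

```python
import hashlib
import math
import zlib


def collision_probability(num_inputs: int, output_bits: int) -> float:
    """Birthday-bound approximation of the chance that at least two of
    num_inputs distinct inputs share the same output_bits-bit hash value.

    Assumes hash outputs are uniformly distributed (an idealization).
    """
    exponent = (num_inputs ** 2) / (2 ** (output_bits + 1))
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-exponent)


# Output sizes of the two functions being compared.
crc_value = zlib.crc32(b"some row contents")       # unsigned 32-bit integer
md5_value = hashlib.md5(b"some row contents").digest()  # 16 bytes = 128 bits

if __name__ == "__main__":
    n = 77_163  # number of distinct inputs, chosen for illustration
    print(f"CRC32 output fits in 32 bits: {crc_value < 2**32}")
    print(f"MD5 output length in bits:    {len(md5_value) * 8}")
    print(f"32-bit collision chance:  {collision_probability(n, 32):.3f}")
    print(f"128-bit collision chance: {collision_probability(n, 128):.3e}")
```

Under that idealized model, a 32-bit checksum reaches roughly even odds of
at least one collision after only tens of thousands of distinct inputs,
while a 128-bit digest stays negligibly close to zero at the same count.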