[Python-Dev] bpo-28055: Fix unaligned accesses in siphash24(). (GH-6123)
13.05.18 20:42, Christian Heimes wrote:
> I was against the approach for a good reason. The PR adds additional CPU
> instructions and changes memory access pattern in a critical path of
> CPython. There is no reason to add additional overhead for the majority
> of users on X86 and X86_64 architectures. The memcpy() call should only
> be used on architectures that do not support unaligned memory access.
> See comment https://bugs.python.org/issue28055#msg276257
> At least for latest GCC, the change seems to be fine. GCC emits the same
> assembly code for X86_64 before and after your change. Did you check the
> output on other CPU architectures as well as clang and MSVC, too?
For the initial implementation of pyhash.c I proposed a patch that
used memcpy() conditionally to avoid overhead on Windows:
+ block.value = *(const Py_uhash_t*)p;
+ memcpy(block.bytes, p, SIZEOF_PY_UHASH_T);
(and similar code for FNV).
But many developers confirmed that all modern compilers, including the
latest versions of MS VS, optimize memcpy() with a constant size into a
single CPU instruction. It seems that avoiding memcpy() is no longer needed.
If using memcpy() adds overhead on some platforms, we can return to
compiler- or platform-dependent code.
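
To make the two options concrete, here is a minimal sketch of the idea, not
the actual pyhash.c code. The helper name load_u64() and the macro
EXAMPLE_ALLOW_UNALIGNED_CAST are hypothetical and exist only for this
illustration. By default the helper reads through memcpy(), which has no
alignment requirement; the direct cast is kept behind the macro as the
platform-dependent alternative.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration only (not CPython's code):
   read 8 bytes from an address that may not be 8-byte aligned. */
static uint64_t
load_u64(const void *p)
{
#ifdef EXAMPLE_ALLOW_UNALIGNED_CAST
    /* Platform-dependent path: a direct load through a cast pointer.
       This is undefined behavior in standard C when p is misaligned,
       so it would only be enabled on targets known to tolerate it. */
    return *(const uint64_t *)p;
#else
    /* Portable path: memcpy() with a constant size. Optimizing
       compilers usually lower this to a single load instruction
       where the hardware allows unaligned access. */
    uint64_t v;
    memcpy(&v, p, sizeof(v));
    return v;
#endif
}
```

Calling load_u64(buf + 1) on a byte buffer is safe with the default path
on any architecture. Whether both paths compile to the same instructions
still has to be checked per compiler and target, for example by comparing
the generated assembly.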