[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Python-Dev] bpo-28055: Fix unaligned accesses in siphash24(). (GH-6123)

13.05.18 20:42, Christian Heimes ????:
> I was against the approach a good reason. The PR adds additional CPU
> instructions and changes memory access pattern in a  critical path of
> CPython. There is no reason to add additional overhead for the majority
> of users or X86 and X86_64 architectures. The memcpy() call should only
> be used on architectures that do not support unaligned memory access.
> See comment https://bugs.python.org/issue28055#msg276257
> At least for latest GCC, the change seems to be fine. GCC emits the same
> assembly code for X86_64 before and after your change. Did you check the
> output on other CPU architectures as well as clang and MSVC, too?

For the initial implementation of pyhash.c [1] I proposed a patch that 
use memcpy() conditionally to avoid an overhead on Windows:

+#ifdef _MSC_VER
+        block.value = *(const Py_uhash_t*)p;
+        memcpy(block.bytes, p, SIZEOF_PY_UHASH_T);

(and similar code for FNV).

But many developers confirmed that all modern compilers including latest 
versions of MS VS optimize memcpy() with a constant size into a single 
CPU instruction. Seems avoiding to use memcpy() no longer needed.

If using memcpy() adds an overhead on some platforms we can return to 
using a compiler/platform depending code.

[1] https://bugs.python.org/issue19183