Apple Clang 14.0.3 (Xcode 14.3) miscompiles this builtin on AArch64,
causing the borrow flag to be set incorrectly. I have added a detailed
writeup on Qemu's issue tracker, where the same issue led to a hang when
emulating x86:
https://gitlab.com/qemu-project/qemu/-/issues/1659#note_1408275831
I don't know of any specific issue caused by this on Lagom, but better
safe than sorry.
GCC 14 (https://gcc.gnu.org/g:2b4e0415ad664cdb3ce87d1f7eee5ca26911a05b)
has added support for the previously Clang-specific add/subtract with
borrow builtins. Let's use `__has_builtin` to detect them instead of
assuming that only Clang has them. We should prefer these to the
x86-specific ones as architecture-independent middle-end optimizations
might deal with them better.
As an added bonus, this significantly improves codegen on AArch64
compared to the fallback implementation that uses
`__builtin_{add,sub}_overflow`.
For now, the code path with the x86-specific intrinsics stays so as to
avoid regressing the performance of builds with GCC 12 and 13.
Additionally, split it into two versions (for IsIntegral<T> -- asking
to place value into register and for !IsIntegral<T> -- asking to place
value into memory with memory clobber), so that Clang is no more
completely confused about `taint_for_optimizer(AK::StringView&)`.