Drop inline asm in favor of std::arch intrinsics whenever they are ready (avx512dq, avx512vl) #2

Open
martinothamar opened this issue Jul 29, 2023 · 0 comments

martinothamar commented Jul 29, 2023

avx512dq and avx512vl have not yet been implemented in std::arch

related PR: rust-lang/stdarch#954
issue: rust-lang/stdarch#1437

code in question:

if is_x86_feature_detected!("avx512dq") && is_x86_feature_detected!("avx512vl") {
    // With AVX512 DQ/VL we can use the instruction below
    // with both 512-bit and 256-bit vectors,
    // see https://www.felixcloutier.com/x86/vcvtuqq2pd
    let dst: __m256d;
    asm!(
        // Intel operand order: destination ({1}) first, source ({0}) second
        "vcvtuqq2pd {1}, {0}",
        in(ymm_reg) v,
        out(ymm_reg) dst,
        // PERF: 'nostack' tells the Rust compiler that our asm won't touch the stack.
        // If we don't include this, the compiler might inject additional
        // instructions to make the stack pointer 16-byte aligned in accordance with the x64 ABI.
        // If we were to 'call' in our inline asm, it would have to push the 8-byte return address
        // onto the stack, so the stack would have to be 16-byte aligned before this happened
        options(nostack),
    );
    dst
} else {

and

unsafe fn m512i_to_m512d(src: __m512i) -> __m512d {
    // This should be exposed through the '_mm512_cvtepu64_pd' C/C++ intrinsic,
    // but since I can't find it exposed in Rust anywhere,
    // we're doing it the inline asm way here
    // TODO: find out what happened in std::arch
    let dst: __m512d;
    asm!(
        // Intel operand order: destination ({1}) first, source ({0}) second
        "vcvtuqq2pd {1}, {0}",
        in(zmm_reg) src,
        out(zmm_reg) dst,
        // PERF: 'nostack' tells the Rust compiler that our asm won't touch the stack.
        // If we don't include this, the compiler might inject additional
        // instructions to make the stack pointer 16-byte aligned in accordance with the x64 ABI.
        // If we were to 'call' in our inline asm, it would have to push the 8-byte return address
        // onto the stack, so the stack would have to be 16-byte aligned before this happened
        options(nostack),
    );
    dst
}
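
When the asm is eventually swapped for intrinsics, a scalar reference could help check that both versions give the same results. Below is a minimal sketch of what `vcvtuqq2pd` computes per lane, assuming the default round-to-nearest rounding mode. The names `cvtepu64_pd_scalar` and `has_avx512_dq_vl` are hypothetical helpers for illustration, not part of this crate or of std::arch.

```rust
/// Scalar reference for the per-lane result of `vcvtuqq2pd`:
/// each unsigned 64-bit integer is converted to the nearest f64.
/// Hypothetical helper, not part of the crate or of std::arch.
fn cvtepu64_pd_scalar<const N: usize>(src: [u64; N]) -> [f64; N] {
    // An `as` cast from integer to float rounds to nearest, ties to even,
    // which is assumed here to match the instruction's default rounding mode.
    src.map(|x| x as f64)
}

/// Runtime check mirroring the condition used in the first snippet above.
/// Hypothetical helper name.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn has_avx512_dq_vl() -> bool {
    is_x86_feature_detected!("avx512dq") && is_x86_feature_detected!("avx512vl")
}

/// On other architectures the AVX512 path is never available.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn has_avx512_dq_vl() -> bool {
    false
}
```

With `N = 4` this corresponds to the 256-bit path and with `N = 8` to the 512-bit path, so the lanes of a `__m256d` or `__m512d` result could be compared against it whenever `has_avx512_dq_vl()` returns true.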