Using simd128 leads to unloadable Wasm on older browsers #94

Open
alecmocatta opened this issue Aug 9, 2024 · 2 comments

@alecmocatta

I believe I'm seeing the same as this issue with this crate:

Here's wasm2wat showing the functions at issue:

  (func $bytecount::simd::wasm::chunk_count::h678df4b5a0bba3e6 (type 13) (param i32 i32 i32) (result i32)
    (local i32 i32 i32 i32 i32 i32 v128 v128 v128 v128 v128 v128 v128 v128 v128)
    global.get 0
    i32.const 96
    i32.sub
    local.tee 3
    global.set 0
    local.get 2
    i8x16.splat
    local.set 9
    i32.const 4080
    local.set 4
    i32.const 0
    local.tee 2
    local.set 5
    local.get 2
    local.set 2
    block  ;; label = @1
      block  ;; label = @2
        block  ;; label = @3
          block  ;; label = @4
            block  ;; label = @5
              block  ;; label = @6
                block  ;; label = @7
                  block  ;; label = @8
                    loop  ;; label = @9
                      local.get 2
                      local.set 2
                      local.get 5
                      local.set 6
                      local.get 4
                      local.get 1
                      i32.gt_u
                      br_if 1 (;@8;)
                      local.get 2
                      local.set 2
                      i32.const 1
                      local.set 5
                      v128.const i32x4 0x00000000 0x00000000 0x00000000 0x00000000
                      local.tee 10
                      local.set 11
                      local.get 10
                      local.set 12
                      local.get 10
                      local.set 13
                      local.get 10
                      local.set 10
                      loop  ;; label = @10
                        local.get 10
                        local.set 10
                        local.get 13
                        local.set 13
                        local.get 12
                        local.set 12
                        local.get 11
                        local.set 11
                        local.get 5
                        local.set 4
                        local.get 3
                        local.get 0
                        local.get 1
                        local.get 2
                        local.tee 2
                        call $bytecount::simd::wasm::u8x16x4_from_offset::h99092da5ce490a0f
                        local.get 2
                        i32.const 64
                        i32.add
                        local.tee 7
                        local.get 2
                        i32.lt_u
                        br_if 4 (;@6;)
                        local.get 7
                        local.set 2
                        local.get 4
                        i32.const 1
                        i32.add
                        local.set 5
                        local.get 11
                        local.get 3
                        v128.load
                        local.get 9
                        i8x16.eq
                        i8x16.sub
                        local.tee 14
                        local.set 11
                        local.get 12
                        local.get 3
                        v128.load offset=16
                        local.get 9
                        i8x16.eq
                        i8x16.sub
                        local.tee 15
                        local.set 12
                        local.get 13
                        local.get 3
                        v128.load offset=32
                        local.get 9
                        i8x16.eq
                        i8x16.sub
                        local.tee 16
                        local.set 13
                        local.get 10
                        local.get 3
                        v128.load offset=48
                        local.get 9
                        i8x16.eq
                        i8x16.sub
                        local.tee 17
                        local.set 10
                        local.get 4
                        i32.const 255
                        i32.lt_u
                        br_if 0 (;@10;)
                      end
                      local.get 16
                      i16x8.extadd_pairwise_i8x16_u
                      local.get 17
                      i16x8.extadd_pairwise_i8x16_u
                      i16x8.add
                      local.get 15
                      i16x8.extadd_pairwise_i8x16_u
                      i16x8.add
                      local.get 14
                      i16x8.extadd_pairwise_i8x16_u
                      i16x8.add
                      i32x4.extadd_pairwise_i16x8_u
                      local.tee 11
                      i32x4.extract_lane 0
                      local.tee 4
                      local.get 11
                      i32x4.extract_lane 1
                      i32.add
                      local.tee 2
                      local.get 4
                      i32.lt_u
                      br_if 5 (;@4;)
                      local.get 11
                      i32x4.extract_lane 2
                      local.tee 4
                      local.get 11
                      i32x4.extract_lane 3
                      i32.add
                      local.tee 5
                      local.get 4
                      i32.lt_u
                      br_if 6 (;@3;)
                      local.get 2
                      local.get 5
                      i32.add
                      local.tee 4
                      local.get 2
                      i32.lt_u
                      br_if 7 (;@2;)
                      local.get 6
                      local.get 4
                      i32.add
                      local.tee 2
                      local.get 6
                      i32.lt_u
                      br_if 2 (;@7;)
                      local.get 7
                      i32.const 4080
                      i32.add
                      local.tee 6
                      local.set 4
                      local.get 2
                      local.set 5
                      local.get 7
                      local.set 2
                      local.get 6
                      local.get 7
                      i32.ge_u
                      br_if 0 (;@9;)
                    end
                    i32.const 4119884
                    call $core::panicking::panic_const::panic_const_add_overflow::ha5f2ad64652d0d2f
                    unreachable
                  end
                  local.get 1
                  local.get 2
                  i32.lt_u
                  br_if 2 (;@5;)
                  block  ;; label = @8
                    local.get 1
                    local.get 2
                    i32.sub
                    local.tee 4
                    i32.const 64
                    i32.ge_u
                    br_if 0 (;@8;)
                    v128.const i32x4 0x00000000 0x00000000 0x00000000 0x00000000
                    local.tee 11
                    local.set 12
                    local.get 11
                    local.set 13
                    local.get 11
                    local.set 10
                    local.get 11
                    local.set 11
                    local.get 2
                    local.set 2
                    br 7 (;@1;)
                  end
                  local.get 4
                  i32.const 6
                  i32.shr_u
                  local.set 8
                  local.get 2
                  local.set 2
                  local.get 4
                  i32.const 63
                  i32.gt_u
                  local.set 7
                  v128.const i32x4 0x00000000 0x00000000 0x00000000 0x00000000
                  local.tee 10
                  local.set 11
                  local.get 10
                  local.set 12
                  local.get 10
                  local.set 13
                  local.get 10
                  local.set 10
                  block  ;; label = @8
                    loop  ;; label = @9
                      local.get 10
                      local.set 10
                      local.get 13
                      local.set 13
                      local.get 12
                      local.set 12
                      local.get 11
                      local.set 11
                      local.get 7
                      local.set 4
                      local.get 3
                      local.get 0
                      local.get 1
                      local.get 2
                      local.tee 2
                      call $bytecount::simd::wasm::u8x16x4_from_offset::h99092da5ce490a0f
                      local.get 2
                      i32.const 64
                      i32.add
                      local.tee 5
                      local.get 2
                      i32.lt_u
                      br_if 1 (;@8;)
                      local.get 5
                      local.set 2
                      local.get 4
                      i32.const 1
                      i32.add
                      local.set 7
                      local.get 11
                      local.get 3
                      v128.load offset=48
                      local.get 9
                      i8x16.eq
                      i8x16.sub
                      local.tee 14
                      local.set 11
                      local.get 12
                      local.get 3
                      v128.load offset=32
                      local.get 9
                      i8x16.eq
                      i8x16.sub
                      local.tee 15
                      local.set 12
                      local.get 13
                      local.get 3
                      v128.load
                      local.get 9
                      i8x16.eq
                      i8x16.sub
                      local.tee 16
                      local.set 13
                      local.get 10
                      local.get 3
                      v128.load offset=16
                      local.get 9
                      i8x16.eq
                      i8x16.sub
                      local.tee 17
                      local.set 10
                      local.get 4
                      local.get 8
                      i32.lt_u
                      br_if 0 (;@9;)
                    end
                    local.get 17
                    local.set 12
                    local.get 16
                    local.set 13
                    local.get 15
                    local.set 10
                    local.get 14
                    local.set 11
                    local.get 5
                    local.set 2
                    br 7 (;@1;)
                  end
                  i32.const 4120012
                  call $core::panicking::panic_const::panic_const_add_overflow::ha5f2ad64652d0d2f
                  unreachable
                end
                i32.const 4120028
                call $core::panicking::panic_const::panic_const_add_overflow::ha5f2ad64652d0d2f
                unreachable
              end
              i32.const 4120044
              call $core::panicking::panic_const::panic_const_add_overflow::ha5f2ad64652d0d2f
              unreachable
            end
            i32.const 4119900
            call $core::panicking::panic_const::panic_const_sub_overflow::ha660620485d267ca
            unreachable
          end
          i32.const 4119836
          call $core::panicking::panic_const::panic_const_add_overflow::ha5f2ad64652d0d2f
          unreachable
        end
        i32.const 4119852
        call $core::panicking::panic_const::panic_const_add_overflow::ha5f2ad64652d0d2f
        unreachable
      end
      i32.const 4119868
      call $core::panicking::panic_const::panic_const_add_overflow::ha5f2ad64652d0d2f
      unreachable
    end
    local.get 2
    local.set 7
    block  ;; label = @1
      block  ;; label = @2
        block  ;; label = @3
          block  ;; label = @4
            block  ;; label = @5
              block  ;; label = @6
                local.get 13
                i16x8.extadd_pairwise_i8x16_u
                local.get 12
                i16x8.extadd_pairwise_i8x16_u
                i16x8.add
                local.get 10
                i16x8.extadd_pairwise_i8x16_u
                i16x8.add
                local.get 11
                i16x8.extadd_pairwise_i8x16_u
                i16x8.add
                i32x4.extadd_pairwise_i16x8_u
                local.tee 11
                i32x4.extract_lane 0
                local.tee 4
                local.get 11
                i32x4.extract_lane 1
                i32.add
                local.tee 2
                local.get 4
                i32.lt_u
                br_if 0 (;@6;)
                local.get 11
                i32x4.extract_lane 2
                local.tee 4
                local.get 11
                i32x4.extract_lane 3
                i32.add
                local.tee 5
                local.get 4
                i32.lt_u
                br_if 1 (;@5;)
                local.get 2
                local.get 5
                i32.add
                local.tee 4
                local.get 2
                i32.lt_u
                br_if 2 (;@4;)
                block  ;; label = @7
                  block  ;; label = @8
                    block  ;; label = @9
                      local.get 6
                      local.get 4
                      i32.add
                      local.tee 8
                      local.get 6
                      i32.lt_u
                      br_if 0 (;@9;)
                      local.get 1
                      local.get 7
                      i32.lt_u
                      br_if 1 (;@8;)
                      local.get 1
                      local.get 7
                      i32.sub
                      local.tee 2
                      i32.const 16
                      i32.ge_u
                      br_if 2 (;@7;)
                      v128.const i32x4 0x00000000 0x00000000 0x00000000 0x00000000
                      local.set 12
                      br 8 (;@1;)
                    end
                    i32.const 4119916
                    call $core::panicking::panic_const::panic_const_add_overflow::ha5f2ad64652d0d2f
                    unreachable
                  end
                  i32.const 4119932
                  call $core::panicking::panic_const::panic_const_sub_overflow::ha660620485d267ca
                  unreachable
                end
                local.get 2
                i32.const 4
                i32.shr_u
                local.set 6
                local.get 2
                i32.const 15
                i32.gt_u
                local.set 2
                v128.const i32x4 0x00000000 0x00000000 0x00000000 0x00000000
                local.set 11
                i32.const 0
                local.set 5
                block  ;; label = @7
                  loop  ;; label = @8
                    local.get 11
                    local.set 11
                    local.get 2
                    local.set 4
                    local.get 7
                    local.get 5
                    i32.const 4
                    i32.shl
                    i32.add
                    local.tee 2
                    local.get 7
                    i32.lt_u
                    br_if 1 (;@7;)
                    local.get 3
                    local.get 2
                    i32.store offset=72
                    local.get 2
                    i32.const 16
                    i32.add
                    local.tee 5
                    local.get 2
                    i32.lt_u
                    br_if 5 (;@3;)
                    local.get 5
                    local.get 1
                    i32.gt_u
                    br_if 6 (;@2;)
                    local.get 11
                    local.get 0
                    local.get 2
                    i32.add
                    v128.load align=1
                    local.get 9
                    i8x16.eq
                    i8x16.sub
                    local.tee 11
                    local.set 12
                    local.get 4
                    i32.const 1
                    i32.add
                    local.set 2
                    local.get 11
                    local.set 11
                    local.get 4
                    local.set 5
                    local.get 4
                    local.get 6
                    i32.ge_u
                    br_if 7 (;@1;)
                    br 0 (;@8;)
                  end
                end
                i32.const 4119996
                call $core::panicking::panic_const::panic_const_add_overflow::ha5f2ad64652d0d2f
                unreachable
              end
              i32.const 4119836
              call $core::panicking::panic_const::panic_const_add_overflow::ha5f2ad64652d0d2f
              unreachable
            end
            i32.const 4119852
            call $core::panicking::panic_const::panic_const_add_overflow::ha5f2ad64652d0d2f
            unreachable
          end
          i32.const 4119868
          call $core::panicking::panic_const::panic_const_add_overflow::ha5f2ad64652d0d2f
          unreachable
        end
        i32.const 4119620
        call $core::panicking::panic_const::panic_const_add_overflow::ha5f2ad64652d0d2f
        unreachable
      end
      local.get 3
      i32.const 88
      i32.add
      i32.const 65
      i32.store
      local.get 3
      i32.const 2
      i32.store offset=4
      local.get 3
      i32.const 4119648
      i32.store
      local.get 3
      i64.const 2
      i64.store offset=12 align=4
      local.get 3
      i32.const 65
      i32.store offset=80
      local.get 3
      local.get 1
      i32.store offset=92
      local.get 3
      local.get 3
      i32.const 76
      i32.add
      i32.store offset=8
      local.get 3
      local.get 3
      i32.const 92
      i32.add
      i32.store offset=84
      local.get 3
      local.get 3
      i32.const 72
      i32.add
      i32.store offset=76
      local.get 3
      i32.const 4119664
      call $core::panicking::panic_fmt::hfdaf3eddd0a11d4f
      unreachable
    end
    local.get 12
    local.set 11
    block  ;; label = @1
      block  ;; label = @2
        local.get 1
        i32.const 15
        i32.and
        local.tee 2
        br_if 0 (;@2;)
        local.get 11
        local.set 9
        br 1 (;@1;)
      end
      local.get 11
      local.get 2
      i32.const 4119948
      i32.add
      v128.load align=1
      v128.const i32x4 0x00000000 0x00000000 0x00000000 0x00000000
      local.get 0
      local.get 1
      i32.add
      i32.const -16
      i32.add
      v128.load align=1
      local.get 9
      i8x16.eq
      v128.bitselect
      i8x16.sub
      local.set 9
    end
    block  ;; label = @1
      block  ;; label = @2
        block  ;; label = @3
          block  ;; label = @4
            local.get 9
            i16x8.extadd_pairwise_i8x16_u
            i32x4.extadd_pairwise_i16x8_u
            local.tee 9
            i32x4.extract_lane 0
            local.tee 4
            local.get 9
            i32x4.extract_lane 1
            i32.add
            local.tee 2
            local.get 4
            i32.lt_u
            br_if 0 (;@4;)
            local.get 9
            i32x4.extract_lane 2
            local.tee 4
            local.get 9
            i32x4.extract_lane 3
            i32.add
            local.tee 7
            local.get 4
            i32.lt_u
            br_if 1 (;@3;)
            local.get 2
            local.get 7
            i32.add
            local.tee 4
            local.get 2
            i32.lt_u
            br_if 2 (;@2;)
            local.get 8
            local.get 4
            i32.add
            local.tee 2
            local.get 8
            i32.ge_u
            br_if 3 (;@1;)
            i32.const 4119980
            call $core::panicking::panic_const::panic_const_add_overflow::ha5f2ad64652d0d2f
            unreachable
          end
          i32.const 4119788
          call $core::panicking::panic_const::panic_const_add_overflow::ha5f2ad64652d0d2f
          unreachable
        end
        i32.const 4119804
        call $core::panicking::panic_const::panic_const_add_overflow::ha5f2ad64652d0d2f
        unreachable
      end
      i32.const 4119820
      call $core::panicking::panic_const::panic_const_add_overflow::ha5f2ad64652d0d2f
      unreachable
    end
    local.get 3
    i32.const 96
    i32.add
    global.set 0
    local.get 2)
  (func $bytecount::simd::wasm::u8x16x4_from_offset::h99092da5ce490a0f (type 15) (param i32 i32 i32 i32)
    (local i32 i32 i32 v128)
    global.get 0
    i32.const 48
    i32.sub
    local.tee 4
    global.set 0
    local.get 4
    local.get 3
    i32.store
    block  ;; label = @1
      block  ;; label = @2
        block  ;; label = @3
          block  ;; label = @4
            block  ;; label = @5
              local.get 3
              i32.const 64
              i32.add
              local.tee 5
              local.get 3
              i32.lt_u
              br_if 0 (;@5;)
              local.get 5
              local.get 2
              i32.gt_u
              br_if 1 (;@4;)
              local.get 3
              i32.const 16
              i32.add
              local.tee 2
              local.get 3
              i32.lt_u
              br_if 2 (;@3;)
              local.get 3
              i32.const 32
              i32.add
              local.tee 5
              local.get 3
              i32.lt_u
              br_if 3 (;@2;)
              local.get 3
              i32.const 48
              i32.add
              local.tee 6
              local.get 3
              i32.ge_u
              br_if 4 (;@1;)
              i32.const 4119772
              call $core::panicking::panic_const::panic_const_add_overflow::ha5f2ad64652d0d2f
              unreachable
            end
            i32.const 4119680
            call $core::panicking::panic_const::panic_const_add_overflow::ha5f2ad64652d0d2f
            unreachable
          end
          local.get 4
          i32.const 40
          i32.add
          i32.const 65
          i32.store
          local.get 4
          i32.const 2
          i32.store offset=8
          local.get 4
          i32.const 4119708
          i32.store offset=4
          local.get 4
          i64.const 2
          i64.store offset=16 align=4
          local.get 4
          i32.const 65
          i32.store offset=32
          local.get 4
          local.get 2
          i32.store offset=44
          local.get 4
          local.get 4
          i32.const 28
          i32.add
          i32.store offset=12
          local.get 4
          local.get 4
          i32.const 44
          i32.add
          i32.store offset=36
          local.get 4
          local.get 4
          i32.store offset=28
          local.get 4
          i32.const 4
          i32.add
          i32.const 4119724
          call $core::panicking::panic_fmt::hfdaf3eddd0a11d4f
          unreachable
        end
        i32.const 4119740
        call $core::panicking::panic_const::panic_const_add_overflow::ha5f2ad64652d0d2f
        unreachable
      end
      i32.const 4119756
      call $core::panicking::panic_const::panic_const_add_overflow::ha5f2ad64652d0d2f
      unreachable
    end
    local.get 1
    local.get 3
    i32.add
    v128.load align=1
    local.set 7
    local.get 0
    local.get 1
    local.get 2
    i32.add
    v128.load align=1
    v128.store offset=16
    local.get 0
    local.get 7
    v128.store
    local.get 0
    local.get 1
    local.get 6
    i32.add
    v128.load align=1
    v128.store offset=48
    local.get 0
    local.get 1
    local.get 5
    i32.add
    v128.load align=1
    v128.store offset=32
    local.get 4
    i32.const 48
    i32.add
    global.set 0)
@llogiq
Owner

llogiq commented Aug 14, 2024

Does anyone know a way to implement a fallback, so that we can use the intrinsics and fall back to the generic version if that fails? Otherwise we could add a feature that forces the generic version on wasm for such browsers. Users would then either have to supply a browser check to select the best version, or live with suboptimal performance on browsers that do support SIMD.

@alecmocatta
Author

@llogiq A Wasm binary that includes unsupported intrinsics can fail to parse, even if it never executes them. Unfortunately, this comment is accurate (BurntSushi/memchr#144 (comment)):

The current way of doing feature detection with WASM on browsers is to try to load a small WASM with the specific feature and see whether it fails. See for example this library from Google.

The route memchr took (BurntSushi/memchr#149) is to use intrinsics only under #[cfg(target_feature = "simd128")]. That way you can select the intrinsic or the generic version at compile time by building with or without RUSTFLAGS=-Ctarget-feature=+simd128. Alternatively, in .cargo/config.toml:

[target.wasm32-unknown-unknown]
rustflags = ["-Ctarget-feature=+simd128"]

Apps can then build multiple binaries, and use feature detection to serve the optimal one. For the foreseeable future this is the only portable option as far as I know.
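
For illustration, here is a minimal sketch of the compile-time gating described above. It is not bytecount's or memchr's actual code: the names count, count_simd and count_generic are hypothetical, and the SIMD body is a simplified byte counter rather than the chunked accumulator shown in the wasm2wat dump. The point is only that no v128 instruction ends up in the binary unless the build enables simd128.

```rust
/// Portable fallback: count occurrences of `needle` one byte at a time.
fn count_generic(haystack: &[u8], needle: u8) -> usize {
    haystack.iter().filter(|&&b| b == needle).count()
}

/// SIMD path (hypothetical name). Compiled only when the build targets wasm32
/// with `-Ctarget-feature=+simd128`; otherwise it is absent from the binary.
#[cfg(all(target_arch = "wasm32", target_feature = "simd128"))]
fn count_simd(haystack: &[u8], needle: u8) -> usize {
    use core::arch::wasm32::{i8x16_eq, u8x16_bitmask, u8x16_splat, v128, v128_load};

    let splat = u8x16_splat(needle);
    let mut chunks = haystack.chunks_exact(16);
    let mut total = 0usize;
    for chunk in &mut chunks {
        // SAFETY: `chunk` is exactly 16 readable bytes, and `v128_load`
        // accepts unaligned pointers.
        let v = unsafe { v128_load(chunk.as_ptr() as *const v128) };
        // Each matching lane sets one bit in the 16-bit mask.
        total += u8x16_bitmask(i8x16_eq(v, splat)).count_ones() as usize;
    }
    // Handle the trailing bytes that do not fill a 16-byte chunk.
    total + count_generic(chunks.remainder(), needle)
}

/// Public entry point: the choice of implementation is made at compile time.
pub fn count(haystack: &[u8], needle: u8) -> usize {
    #[cfg(all(target_arch = "wasm32", target_feature = "simd128"))]
    {
        count_simd(haystack, needle)
    }
    #[cfg(not(all(target_arch = "wasm32", target_feature = "simd128")))]
    {
        count_generic(haystack, needle)
    }
}
```

Building this once with and once without the simd128 flag yields the two binaries mentioned above; which one to serve is then decided outside Rust, by the page's feature detection.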
