Data transfer between memory spaces

Chung Leong edited this page Jun 2, 2024 · 2 revisions

From JavaScript to Zig

JavaScript uses garbage collection to manage object lifetime. In the process, the language engine sometimes needs to move data from one location to another. For example, after an object has survived multiple GC sweeps, the V8 engine relocates it from the "young generation" heap to the "old generation" heap, which is garbage-collected less frequently. That causes its memory address to change.

When you call a Zig function with a pointer as an argument, Zigar updates the address stored in the pointer with its target's current address in memory. If the target itself contains pointers, these get updated too.

If a pointer contains an address that does not meet Zig's alignment requirement, Zigar allocates a new buffer, copies the data there, and passes a correctly aligned pointer to the Zig function. Afterward, it copies the data back. Consider the following example:

const std = @import("std");

pub fn set(ptr1: *i16, ptr2: *i32) void {
    ptr2.* = 0x22222222;
    ptr1.* = 0x1111;
}

import { set } from './data-transfer-example-1.zig';

const buffer = new ArrayBuffer(16);
const int32 = new DataView(buffer, 1, 4);
const int16 = new DataView(buffer, 3, 2);
set(int16, int32);
console.log(buffer);

ArrayBuffer {
  [Uint8Contents]: <00 22 22 11 11 00 00 00 00 00 00 00 00 00 00 00>,
  byteLength: 16
}

Both int32 and int16 sit at odd-numbered addresses (an ArrayBuffer is aligned to at least 8 bytes). Since the alignment of i32 is 4 and that of i16 is 2, corrective measures are needed. Zigar is designed to handle aliasing pointers, that is, pointers that point to overlapping regions of memory. In the example, int16 sits within int32, which is why half of the latter gets overwritten.
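The copy-in, copy-out step described above can be illustrated in plain JavaScript. This is only a minimal sketch of the idea, not Zigar's actual implementation: the helpers isAligned() and withAlignedCopy() are hypothetical names, and the callback stands in for the Zig function. It handles a single view and ignores aliasing.

```javascript
// Hypothetical helper: true when the view starts at a multiple of `align`
// (relative to the start of its ArrayBuffer, which is itself well aligned).
function isAligned(view, align) {
  return view.byteOffset % align === 0;
}

// Hypothetical helper: run `fn` on an aligned copy of the view's bytes,
// then copy the (possibly modified) bytes back to the original location.
function withAlignedCopy(view, align, fn) {
  if (isAligned(view, align)) {
    return fn(view);
  }
  // The temporary buffer is used from offset 0, so it is aligned.
  const temp = new Uint8Array(view.byteLength);
  const original = new Uint8Array(view.buffer, view.byteOffset, view.byteLength);
  temp.set(original);                      // copy in
  const result = fn(new DataView(temp.buffer));
  original.set(temp);                      // copy back
  return result;
}

const buffer = new ArrayBuffer(16);
const int32 = new DataView(buffer, 1, 4);  // misaligned for a 4-byte integer
withAlignedCopy(int32, 4, (aligned) => aligned.setInt32(0, 0x22222222, true));
const bytes = Array.from(new Uint8Array(buffer, 0, 6));
console.log(bytes);
```

The caller never sees the temporary buffer; only the bytes inside the original view change.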

You'll get an error when the alignment requirements of all aliasing pointers cannot be satisfied at the same time:

import { set } from './data-transfer-example-1.zig';

const buffer = new ArrayBuffer(16);
const i32 = new DataView(buffer, 1, 4);
const i16 = new DataView(buffer, 4, 2);
try {
    set(i16, i32);
} catch (err) {
    console.log(err.message);
}

Unable to simultaneously align memory to 4-byte and 2-byte boundary
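Why the first example can be fixed while the second cannot comes down to simple arithmetic. The sketch below is an illustration, not Zigar's code: findCommonShift() is a hypothetical helper that searches for a byte shift placing every aliasing view on its required boundary. It assumes alignments are powers of two and that offsets are measured from the start of a well-aligned buffer.

```javascript
// Hypothetical helper: find the smallest shift (in bytes) that puts every
// view in an overlapping cluster on its required alignment boundary.
function findCommonShift(views) {
  // With power-of-two alignments, shifts repeat after the largest alignment.
  const maxAlign = Math.max(...views.map((v) => v.align));
  for (let shift = 0; shift < maxAlign; shift++) {
    if (views.every((v) => (v.offset + shift) % v.align === 0)) {
      return shift;
    }
  }
  throw new Error('No shift satisfies all alignment requirements');
}

// First example: i32 at offset 1, i16 at offset 3.
const shift = findCommonShift([
  { offset: 1, align: 4 },
  { offset: 3, align: 2 },
]);
console.log(shift);

// Second example: i32 at offset 1, i16 at offset 4.
let failure = null;
try {
  findCommonShift([
    { offset: 1, align: 4 },
    { offset: 4, align: 2 },
  ]);
} catch (err) {
  failure = err;
}
console.log(failure.message);
```

In the second case, any shift that moves offset 1 onto a multiple of 4 leaves offset 4 on an odd address, so no single copy of the overlapping region can satisfy both pointers.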

From Zig to JavaScript

After a function call, Zigar examines the pointers contained in the return value as well as all non-const pointers passed as arguments. When it sees a new address, it looks for it among the buffers given to the function and those created through the allocator (if there is one). If the search yields nothing, the address is assumed to lie outside the JavaScript memory space. It might be the address of a Zig variable (residing in the shared library's data segment), of a memory block allocated off the heap using one of Zig's allocators, or of a block obtained through malloc() if C is involved. In such cases an external buffer is created. Example:

const std = @import("std");

var gpa = std.heap.GeneralPurposeAllocator(.{}){};
var allocator = gpa.allocator();

pub fn floatToString(num: f64) ![]const u8 {
    return std.fmt.allocPrint(allocator, "{d}", .{num});
}

pub fn freeString(str: []const u8) void {
    allocator.free(str);
}

import { floatToString, freeString } from './data-transfer-example-2.zig';

const array = floatToString(Math.PI);
console.log(array.string);
freeString(array);
console.log(array.string);

3.141592653589793
Segmentation fault (core dumped)

Memory allocated outside of JavaScript must be freed manually. Once that happens, the external buffer continues to exist even though its pointer is now invalid. If you try to access its data, a segfault is the most likely outcome.
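One way to avoid touching freed memory is to extract what you need as an ordinary JavaScript value first, and free the external memory immediately afterward. The sketch below shows that pattern with a hypothetical helper, withExternal(), which is not part of Zigar. So that it runs on its own, fakeArray and fakeFree are stand-ins for the object returned by floatToString() and for freeString().

```javascript
// Hypothetical helper: use an externally allocated object, then free it,
// even when `use` throws. Only the value returned by `use` escapes.
function withExternal(object, free, use) {
  try {
    return use(object);
  } finally {
    free(object);
  }
}

// Stand-ins for the Zig-backed object and its free function (assumptions).
let freed = false;
const fakeArray = { string: '3.141592653589793' };
const fakeFree = () => { freed = true; };

// The extracted string is a regular JavaScript value, safe to keep around.
const text = withExternal(fakeArray, fakeFree, (a) => a.string);
console.log(text, freed);
```

With the real module, the call would look like withExternal(floatToString(Math.PI), freeString, (a) => a.string), assuming the exports shown in the example above.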

Data transfer when running as WebAssembly

While native code has full access to all of a process's memory, WebAssembly code runs in its own sandbox and has no access to memory buffers in JavaScript. Zigar works around this limitation by treating all pointers as misaligned: every buffer gets duplicated in WebAssembly memory, then the contents are copied back when the function call ends.
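The duplicate-then-copy-back cycle can be sketched with the standard WebAssembly.Memory API. This is an illustration under stated assumptions, not Zigar's implementation: callWithCopy() is a hypothetical helper, the fixed offset 16 stands in for an address a real implementation would obtain from an allocator, and the callback plays the role of the WASM function.

```javascript
// One 64 KiB page of WebAssembly memory.
const memory = new WebAssembly.Memory({ initial: 1 });

// Hypothetical helper: copy JS bytes into WASM memory, run `fn` with the
// address and length, then copy the bytes back out.
function callWithCopy(jsBytes, wasmOffset, fn) {
  new Uint8Array(memory.buffer, wasmOffset, jsBytes.length).set(jsBytes); // copy in
  fn(wasmOffset, jsBytes.length);
  // Create the view again: memory.buffer may have been replaced during the call.
  jsBytes.set(new Uint8Array(memory.buffer, wasmOffset, jsBytes.length)); // copy out
}

const jsBytes = new Uint8Array([1, 2, 3, 4]);
callWithCopy(jsBytes, 16, (ptr, len) => {
  // Stand-in for WASM code that modifies the data in place.
  const view = new Uint8Array(memory.buffer, ptr, len);
  for (let i = 0; i < len; i++) view[i] *= 2;
});
console.log(Array.from(jsBytes));
```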

When an unknown address is encountered, it's assumed to be in WASM memory space. A DataView is created referencing the ArrayBuffer given by the buffer property of the WASM VM's Memory object. This ArrayBuffer is ephemeral: when the WASM VM's memory is resized, the buffer becomes "detached". Zig data objects automatically reacquire views referencing the new buffer. This does not apply to views you obtained earlier through the special properties dataView and typedArray.
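The detachment behavior can be observed with the standard WebAssembly.Memory API alone. The sketch below does not involve Zigar. It shows a view created before the memory grows becoming unusable, while a view created afterward sees the same data.

```javascript
const memory = new WebAssembly.Memory({ initial: 1, maximum: 2 });

// A view created before the memory grows.
const staleView = new DataView(memory.buffer, 0, 4);
staleView.setUint32(0, 0xdeadbeef, true);

// Growing the memory detaches the old ArrayBuffer.
memory.grow(1);
const oldLength = staleView.buffer.byteLength;

let staleError = null;
try {
  staleView.getUint32(0, true);
} catch (err) {
  staleError = err; // reading through a detached buffer throws
}

// A view created from the current memory.buffer sees the preserved data.
const freshView = new DataView(memory.buffer, 0, 4);
const value = freshView.getUint32(0, true);
console.log(oldLength, staleError instanceof TypeError, value.toString(16));
```

Any dataView or typedArray you have stored in a variable is in the same position as staleView here, so fetch it again from the Zig data object after a call that may have resized the memory.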