-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix input_offset resetting in stream #25
Conversation
Amazing @myl7! Thanks for investigating this. Do you know how long the input has to be to trigger this issue? Is it possible to write a test to reproduce it, just to check this fix and to ensure we don't run into the same issue again in future? |
A ~570KB file is enough. How about to add a test with a 600KB file? |
Yes, that sounds good to me. I'm happy with either committing a 600KB file or just generating it at runtime (we can use We need to remember to confirm that any new test for this does definitely fail in the previous implementation without this fix 😄 |
Oops, sorry, that was closed automatically due to merging #26. |
My fault😂. I mistakenly used the term like "the fix # 25" in that. I checked the processing of When the instance requires more input and returns In the tests since Adding an extra step in the |
I see, thanks for the explanation, I think I understand the issue much more clearly. What's really going on here is that within brotli-wasm we're tracking state that tightly depends on how it's used in the stream wrapper - it's not really part of our Brotli streaming implementation at all. It's breaking here not because it's wrong, but because it doesn't match details of how that other implementation works. The fixed logic in this PR to update const buffer = /* ... some fixed buffer that's being written to elsewhere */
let inputOffset = 0;
while (true) {
const nextInput = buffer.slice(inputOffset, endOfBuffer);
output.append(this.stream.compress(nextInput, outputSize));
if (this.stream.result() === brotliWasm.BrotliStreamResult.NeedsMoreInput) {
// Wait for data from elsewhere to update endOfBuffer
await waitForMoreData();
// Last_input_offset will often be wrong here - it really should be endOfBuffer
// now, but in fact it will be 0, because it's assuming that buffer is reset.
}
} (Very rough code, but you get the idea) Even when it's correct, it's a bit 'too clever' and I think moving that into the stream wrapper (but making it easy to implement there) would be simpler & clearer. What would happen if we remove let inputOffset = 0;
do {
const input = chunk.slice(inputOffset);
const { output, consumedInputLength } = this.stream.compress(input, this.outputSize);
inputOffset += consumedInputLength;
controller.enqueue(output);
} while (this.stream.result() === brotliWasm.BrotliStreamResult.NeedsMoreOutput) What do you think? |
Other than keep them in the stream. New returned struct is added. Also change the error type to JsError. See rustwasm/wasm-bindgen#2710 .
700a90d
to
e983db1
Compare
True. Returning I have updated the stream Rust code with the tests. |
Looks great, thanks @myl7! I think this is definitely better and should make stream usage generally a little clearer, and fix the original bug en route too. I won't merge #27 for now, if you could add the matching changes in there that would be great, thank you! I'll do a release once that's all done. |
Close #22, close #23
The incorrect
input_offset
resetting causes that, while the stream outputs are still decompress-able, when the input is long enough, the output would become incorrect (means decompression can not output the original input). The PR fixes it.