Test failures for 1.5.0 on Alpine Linux #90

orhun · 2023-07-24T10:49:33Z

Hello! 🐻

I've been trying to package tere for Alpine Linux and I created a merge request here: https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/49128

As you can see from the failed pipelines, some tests are failing:

failures:
---- app_state::tests::test_case_sensitive_mode_change stdout ----
thread 'app_state::tests::test_case_sensitive_mode_change' panicked at 'assertion failed: `(left == right)`
  left: `[2]`,
 right: `[1]`', src/app_state.rs:1711:9
failures:
    app_state::tests::test_case_sensitive_mode_change
test result: FAILED. 90 passed; 1 failed; 1 ignored; 0 measured; 0 filtered out; finished in 0.05s

Also, I get the following failures even though I skip that one test:

failures:

---- panic_guard::tests::test_callback_called_before_panic_hook stdout ----
thread 'panic_guard::tests::test_callback_called_before_panic_hook' panicked at 'assertion failed: `(left == right)`
  left: `["cleanup", "hook", "hook", "hook"]`,
 right: `["cleanup", "hook"]`', src/panic_guard.rs:135:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- panic_guard::tests::test_callback_called_once_only_panic stdout ----
thread 'panic_guard::tests::test_callback_called_once_only_panic' panicked at 'called `Result::unwrap()` on an `Err` value: PoisonError { .. }', src/panic_guard.rs:99:36

---- panic_guard::tests::test_nested_callback stdout ----
thread 'panic_guard::tests::test_nested_callback' panicked at 'called `Result::unwrap()` on an `Err` value: PoisonError { .. }', src/panic_guard.rs:140:36

---- panic_guard::tests::test_nested_callback_hook_restored stdout ----
thread 'panic_guard::tests::test_nested_callback_hook_restored' panicked at 'called `Result::unwrap()` on an `Err` value: PoisonError { .. }', src/panic_guard.rs:180:36

---- panic_guard::tests::test_nested_callback_with_panic stdout ----
thread 'panic_guard::tests::test_nested_callback_with_panic' panicked at 'called `Result::unwrap()` on an `Err` value: PoisonError { .. }', src/panic_guard.rs:161:36


failures:
    panic_guard::tests::test_callback_called_before_panic_hook
    panic_guard::tests::test_callback_called_once_only_panic
    panic_guard::tests::test_nested_callback
    panic_guard::tests::test_nested_callback_hook_restored
    panic_guard::tests::test_nested_callback_with_panic

test result: FAILED. 85 passed; 5 failed; 1 ignored; 0 measured; 1 filtered out; finished in 0.05s

error: test failed, to rerun pass '--bin tere'

I am skipping these tests for now since I am not sure what is happening there.

Weirdly enough, all the tests (including those ones) pass on Arch Linux.

Let me know if I can provide more info to debug this.

Thanks for this great tool! 💯

The text was updated successfully, but these errors were encountered:

mgunyho · 2023-07-24T10:52:20Z

Hmm, interesting. I can't tell offhand what is causing those, but probably something to do with threading; the panic_guard tests should not run in parallel (they modify the global panic hook), so they use a mutex.

About the first failure, is it somehow possible that files in Alpine are not case-sensitive? Although that sounds like something very fundamental to Unix that shouldn't be different.

orhun · 2023-07-24T11:02:18Z

ping @nekopsykose (Alpine Linux maintainer), maybe they know something about it :/

nekopsykose · 2023-07-24T11:34:59Z

About the first failure, is it somehow possible that files in Alpine are not case-sensitive?

that's a filesystem thing and nothing in this case has casefolding

nekopsykose · 2023-07-24T11:35:51Z

running them locally they all pass, funny..

nekopsykose · 2023-07-24T11:38:09Z

ah, it's flaky. it toggles between:

all pass

these fail:

failures:
  panic_guard::tests::test_nested_callback_hook_restored
  panic_guard::tests::test_nested_callback_with_panic

these:

failures:
  panic_guard::tests::test_callback_called_before_panic_hook
  panic_guard::tests::test_callback_called_on_drop
  panic_guard::tests::test_callback_called_once_only_panic
  panic_guard::tests::test_nested_callback
  panic_guard::tests::test_nested_callback_hook_restored
  panic_guard::tests::test_nested_callback_with_panic

this just sounds like some ordering that is broken

nekopsykose · 2023-07-24T11:40:43Z

ah. the reason they cascade like that is because your mutex doesn't unlock on fail. so whatever test fails first, causes the rest to fail on lock (poisoned dead mutex)

the first failure is always

---- panic_guard::tests::test_callback_called_before_panic_hook stdout ----
thread 'panic_guard::tests::test_callback_called_before_panic_hook' panicked at 'assertion failed: `(left == right)`
  left: `["cleanup", "hook", "hook", "hook"]`,
 right: `["cleanup", "hook"]`', src/panic_guard.rs:135:9

--- panic_guard::tests::test_nested_callback_hook_restored stdout ----
thread 'panic_guard::tests::test_nested_callback_hook_restored' panicked at 'assertion failed: `(left == right)`
  left: `["inner inner cleanup", "inner cleanup", "outer hook", "outer hook"]`,
 right: `["inner inner cleanup", "inner cleanup", "outer hook"]`', src/panic_guard.rs:214:9

etc. seems that something races appending

nekopsykose · 2023-07-24T11:45:01Z

i don't know how to fix that, but maybe you have a clue.

your mutex doesn't unlock on fail

the default Mutex cannot unlock on a thread panic (what your failures do), it always gets poisoned. you should see if there's another implementation :)

mgunyho · 2023-07-24T11:48:14Z

Thanks! Yeah, the poisoning is kind of on purpose, if it happens then something is wrong anyway so I didn't bother cleaning it up.

I don't really know what's happening there that's causing the callbacks to be called multiple times. Maybe I'm doing something stupid with the locking/calls mutex inside the callbacks.

nekopsykose · 2023-07-24T11:51:10Z

Yeah, the poisoning is kind of on purpose, if it happens then something is wrong anyway so I didn't bother cleaning it up.

that makes sense. just a little noisy, but the first failure is all that mattered anyway.

mgunyho · 2024-04-05T18:39:59Z

Could this be resolved now? I believe the panic_guard tests shouldn't fail anymore after an update to rustc, and app_state::tests::test_case_sensitive_mode_change should have been fixed in 8cadf56.

orhun · 2024-04-05T21:56:13Z

Are you planning to create a release soon? It would be nice to activate the tests again along with it 😊

mgunyho · 2024-04-06T09:31:20Z

Ah yes I didn't realize that I would need to make a new release. However, these panic guard test failures still seem to appear with rustc version 1.77, both on develop and master 😩 I will really need to try and address this.

I can only reproduce this on my work computer (ubuntu 22) but not on my laptop (fedora 39), and it only happens occasionally. It happens a little bit more often if I force a re-link by removing the unit test binary, like

rm target/debug/deps/tere-25a48097920d5416 && cargo test

(the name of the binary is shown in the first few lines of the output of cargo test)

This suggests that it can have something to do with the CPU being busy, which would make sense as this seems to be due to a race condition.

mgunyho · 2024-09-15T18:10:41Z

@timon-schelling re: #104 (comment)

Do you know what the environment (distro / rustc version) in the codespace is? I would like to investigate this if it can be reliably reproduced. I tried now on Debian 11 (in a container) and there at least the issue doesn't appear.

mgunyho · 2024-09-15T18:35:48Z

I tried this also with an Alpine Linux container (using Distrobox) and I can't reproduce it there either (on my laptop, which I mentioned above that I couldn't reproduce anyway).

mgunyho · 2024-09-15T19:32:09Z

Okay, I was able to confirm that it's a data race issue by running the thread sanitizer. I installed the nightly toolchain, and then run with

RUSTFLAGS="-Zsanitizer=thread" cargo +nightly test --target=x86_64-unknown-linux-gnu

and it reports data races in those tests. That shouldn't happen in safe Rust...

timon-schelling · 2024-09-15T19:59:28Z

I'm unable to reproduce this, tried on multiple codespaecs and my nix machine.
But the codespace I head the issue on head rustc 1.83.0-nightly (c2f74c3f9 2024-09-09).

timon-schelling · 2024-09-15T20:03:55Z

and it reports data races in those tests. That shouldn't happen in safe Rust...

I guess a library that is using some unsafe code dose not have a save api?

timon-schelling · 2024-09-15T20:07:14Z

RUSTFLAGS="-Zsanitizer=thread" cargo +nightly test --target=x86_64-unknown-linux-gnu

But I don't see any non std or tere line in the warning. 🤔

mgunyho · 2024-09-15T20:10:49Z

The panic_guard module only uses std::sync::Arc and Mutex. I don't exactly know how the test are run, but I suppose messing with the panic hook in multiple tests running in parallel is pretty unusual, and maybe during a panic (where I'm doing stuff) some of the safety guarantees don't apply.

But I will try to investigate with an MWE to see if this is truly a bug in std.

mgunyho · 2024-09-23T20:41:53Z

I looked into this again. Basically, tests run in parallel and concurrently modify the panic hook, see for example these discussions:

If I restrict the tests to a single thread with cargo test -- --test-threads 1 or run only tests in the panic_guard module which is using the mutex to avoid data races, the issue doesn't occur. So the only way to avoid this would be to have a mutex across all tests, which is not feasible.

At least it seems like the issue is with the tests and not with my code being tested, which is a relief. But I don't know of a good way to resolve this.

mgunyho mentioned this issue Sep 17, 2023

Integration test failures during Arch Linux build #93

Closed

timon-schelling mentioned this issue Sep 10, 2024

Add skip first run prompt cli flag #104

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Test failures for 1.5.0 on Alpine Linux #90

Test failures for 1.5.0 on Alpine Linux #90

orhun commented Jul 24, 2023 •

edited

Loading

mgunyho commented Jul 24, 2023 •

edited

Loading

orhun commented Jul 24, 2023

nekopsykose commented Jul 24, 2023

nekopsykose commented Jul 24, 2023

nekopsykose commented Jul 24, 2023

nekopsykose commented Jul 24, 2023

nekopsykose commented Jul 24, 2023

mgunyho commented Jul 24, 2023

nekopsykose commented Jul 24, 2023

mgunyho commented Apr 5, 2024

orhun commented Apr 5, 2024

mgunyho commented Apr 6, 2024

mgunyho commented Sep 15, 2024

mgunyho commented Sep 15, 2024 •

edited

Loading

mgunyho commented Sep 15, 2024

timon-schelling commented Sep 15, 2024

timon-schelling commented Sep 15, 2024

timon-schelling commented Sep 15, 2024

mgunyho commented Sep 15, 2024

mgunyho commented Sep 23, 2024

Test failures for 1.5.0 on Alpine Linux #90

Test failures for 1.5.0 on Alpine Linux #90

Comments

orhun commented Jul 24, 2023 • edited Loading

mgunyho commented Jul 24, 2023 • edited Loading

orhun commented Jul 24, 2023

nekopsykose commented Jul 24, 2023

nekopsykose commented Jul 24, 2023

nekopsykose commented Jul 24, 2023

nekopsykose commented Jul 24, 2023

nekopsykose commented Jul 24, 2023

mgunyho commented Jul 24, 2023

nekopsykose commented Jul 24, 2023

mgunyho commented Apr 5, 2024

orhun commented Apr 5, 2024

mgunyho commented Apr 6, 2024

mgunyho commented Sep 15, 2024

mgunyho commented Sep 15, 2024 • edited Loading

mgunyho commented Sep 15, 2024

timon-schelling commented Sep 15, 2024

timon-schelling commented Sep 15, 2024

timon-schelling commented Sep 15, 2024

mgunyho commented Sep 15, 2024

mgunyho commented Sep 23, 2024

orhun commented Jul 24, 2023 •

edited

Loading

mgunyho commented Jul 24, 2023 •

edited

Loading

mgunyho commented Sep 15, 2024 •

edited

Loading