tick/rcu: Fix false positive "softirq work is pending" messages #172

commit 96c1fa0 upstream.

In commit 0345691 ("tick/rcu: Stop allowing RCU_SOFTIRQ in idle") the
new function report_idle_softirq() was created by breaking code out of the
existing can_stop_idle_tick() for kernels v5.18 and newer.

In doing so, the code essentially went from a single conditional:

	if (a && b && c)
		warn();

to three separate early-return checks:

	if (!a)
		return;
	if (!b)
		return;
	if (!c)
		return;
	warn();

But that conversion got the condition for the RT specific
local_bh_blocked() wrong. The original condition was:

	!local_bh_blocked()

but the conversion failed to negate it so it ended up as:

	if (!local_bh_blocked())
		return false;

This issue lay dormant until another fixup for the same commit was added
in commit a7e282c ("tick/rcu: Fix bogus ratelimit condition").
That fixup found that the ratelimit was effectively zero instead of ten,
so *no* softirq pending messages had ever been issued.

Once that fixup was backported via linux-stable, both the v6.1 and v6.4
preempt-rt kernels started printing ten instances of this message at boot:

  NOHZ tick-stop error: local softirq work is pending, handler #80!!!

Remove the negation and return when local_bh_blocked() evaluates to true to
bring the correct behaviour back.

Fixes: 0345691 ("tick/rcu: Stop allowing RCU_SOFTIRQ in idle")
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Reviewed-by: Wen Yang <wenyang.linux@foxmail.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20230818200757.1808398-1-paul.gortmaker@windriver.com
linosanfilippo-kunbus merged commit bd74d29 into RevolutionPi:revpi-6.1 on Sep 18, 2023
1 check passed