hammerlock - invert lockhammer measurement to a run limit number of timer ticks #68

Open
wants to merge 3 commits into
base: master


jty2
Contributor

@jty2 jty2 commented Oct 7, 2021

This PR adds the -O num_timer_ticks flag to have the benchmark run for a limited number of timer ticks.

The __LSE_CMPXCHG_CASE fix is needed to avoid mysterious bus errors caused by corruption from the legacy hard-coded register usage.

jty2 added 3 commits October 7, 2021 22:43
__cmpxchg_case_##name hard-coded x30 as a temp register, but the compiler flags
did not exclude the use of x30.  This has been causing heisenbugs when LSE was
in use.
Apply earlyclobber constraint on stxr status operand to avoid warnings such as:

/tmp/ccdYBT4m.s:2175: Warning: unpredictable: identical transfer and status registers --`stxr w0,x1,[x0]'
/tmp/ccdYBT4m.s:2716: Warning: unpredictable: identical transfer and status registers --`stxr w0,x1,[x0]'
/tmp/ccdYBT4m.s:4556: Warning: unpredictable: identical transfer and status registers --`stxr w3,x3,[x0]'
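The effect of the earlyclobber constraint can be sketched as follows. This is an illustrative helper, not the code from the patch: the name `store_exclusive` and the non-AArch64 fallback are assumptions made for the example. The `&` in `"=&r"` tells the compiler that the status operand is written before the inputs are fully consumed, so it must not share a register with the value or the address.

```c
#include <stdint.h>

/* Hypothetical helper (not lockhammer's code): one load-exclusive /
 * store-exclusive attempt. Returns 0 if the store succeeded. */
static inline uint32_t store_exclusive(uint64_t *ptr, uint64_t val)
{
#if defined(__aarch64__)
    uint32_t status;
    uint64_t old;
    __asm__ volatile(
        "ldxr %1, %2\n\t"       /* arm the exclusive monitor */
        "stxr %w0, %3, %2"      /* status register must differ from %3 and the address */
        : "=&r"(status),        /* earlyclobber: never allocated on top of an input */
          "=&r"(old),           /* written by ldxr before val is read */
          "+Q"(*ptr)
        : "r"(val)
        : "memory");
    (void)old;
    return status;
#else
    /* Assumed fallback so the sketch builds on other targets. */
    __atomic_store_n(ptr, val, __ATOMIC_RELAXED);
    return 0;
#endif
}
```

A caller would normally retry until the status is zero, for example `while (store_exclusive(&word, 42)) { }`.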
…er ticks

While lockhammer synchronizes the start of the locking threads, it does not
synchronize the completion of the threads.  Each thread acquires and releases
the lock for a specified number of iterations in a loop.  However, some threads
may finish all of their iterations earlier than others, which leads to less
contention for the threads that have not finished.  This results in a
performance measurement that does not reflect full all-thread contention.

This patch provides an option to use a different measurement loop that
terminates after a specified number of timer clock ticks has passed.  To
keep the overhead of reading the timer from dominating the measurement, the
loop only samples the timer after a flag-specified number of lock/release
iterations.

-O ticks	number of timer clock ticks (CNTVCT_EL0 or TSC)
-I inner_iters	inner iterations of measurement loop in between timer reads
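
The tick-limited loop described above can be sketched roughly as follows. This is a single-threaded illustration under stated assumptions, not the patch itself: `read_timer`, `lock_acquire`, `lock_release`, and `run_for_ticks` are hypothetical names, the lock is a plain test-and-set spinlock, and the `timespec_get` branch is an assumed fallback for targets without CNTVCT_EL0 or TSC. The `ticks` and `inner_iters` parameters stand in for the values given with -O and -I.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Read a free-running counter: CNTVCT_EL0 on AArch64, TSC on x86-64. */
static inline uint64_t read_timer(void)
{
#if defined(__aarch64__)
    uint64_t t;
    __asm__ volatile("isb\n\tmrs %0, cntvct_el0" : "=r"(t) : : "memory");
    return t;
#elif defined(__x86_64__)
    return __builtin_ia32_rdtsc();
#else
    /* Assumed fallback: wall-clock nanoseconds. */
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
#endif
}

/* Stand-in lock for the example: a test-and-set spinlock. */
static atomic_flag lock_word = ATOMIC_FLAG_INIT;

static inline void lock_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&lock_word, memory_order_acquire)) {
        /* spin */
    }
}

static inline void lock_release(void)
{
    atomic_flag_clear_explicit(&lock_word, memory_order_release);
}

/* Run lock/release pairs until at least `ticks` timer ticks have elapsed.
 * The timer is sampled only once per `inner_iters` iterations.
 * Returns the number of lock/release iterations completed. */
uint64_t run_for_ticks(uint64_t ticks, uint64_t inner_iters)
{
    assert(inner_iters > 0);
    uint64_t iterations = 0;
    const uint64_t start = read_timer();
    do {
        for (uint64_t i = 0; i < inner_iters; i++) {
            lock_acquire();
            lock_release();
        }
        iterations += inner_iters;
    } while (read_timer() - start < ticks);
    return iterations;
}
```

Because every thread would stop on elapsed ticks rather than on a fixed iteration count, the reported figure becomes iterations completed per thread in the time window. A larger `inner_iters` lowers timer overhead at the cost of overshooting the tick limit by up to one batch.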
@jty2 jty2 changed the title from "GitHub jty2.hammer lock" to "hammerlock - invert lockhammer measurement to a run limit number of timer ticks" Oct 7, 2021