hammerlock - invert lockhammer measurement to a run limit number of timer ticks #68

jty2 · 2021-10-07T22:58:13Z

This PR adds the -O num_timer_ticks flag to have the benchmark run for a limited number of timer ticks.

The __LSE_CMPXCHG_CASE fix is needed to avoid mysterious bus errors caused by corruption from the legacy hard-coded register usage.

__cmpxchg_case_##name hard-coded x30 as a temp register, but the compiler flags did not exclude the use of x30. This has been causing heisenbugs when LSE was in use.
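One way to avoid a hard-coded scratch register is to declare the temporary as an asm output operand, so the compiler picks a register it knows is free. The sketch below is an illustration only, not the code in this PR: the function name `cas64_demo` and its operand names are hypothetical, the LSE path is compiled only when the toolchain targets LSE (`__ARM_FEATURE_ATOMICS`), and other builds fall back to a compiler builtin.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compare-and-swap that returns the value found in *ptr.
 * The scratch register is the compiler-allocated operand [tmp], not x30. */
static inline uint64_t cas64_demo(uint64_t *ptr, uint64_t expected, uint64_t desired)
{
    /* The access must be naturally aligned. */
    assert(((uintptr_t)ptr & 7) == 0);
#if defined(__aarch64__) && defined(__ARM_FEATURE_ATOMICS)
    uint64_t tmp;
    __asm__ volatile(
        "mov   %[tmp], %[exp]\n\t"          /* tmp = expected value */
        "casal %[tmp], %[des], %[mem]"      /* tmp receives the old value */
        : [tmp] "=&r"(tmp), [mem] "+Q"(*ptr)
        : [exp] "r"(expected), [des] "r"(desired)
        : "memory");
    return tmp;
#else
    /* Fallback (assumption): on failure the builtin writes the observed
     * value into `expected`; on success `expected` already equals it. */
    __atomic_compare_exchange_n(ptr, &expected, desired, 0,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return expected;
#endif
}
```

Because `[tmp]` is an ordinary operand, the compiler will not hand the same register to a live value such as the return address, which is the kind of silent corruption a fixed x30 can cause.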

Apply earlyclobber constraint on stxr status operand to avoid warnings such as:

/tmp/ccdYBT4m.s:2175: Warning: unpredictable: identical transfer and status registers --`stxr w0,x1,[x0]'
/tmp/ccdYBT4m.s:2716: Warning: unpredictable: identical transfer and status registers --`stxr w0,x1,[x0]'
/tmp/ccdYBT4m.s:4556: Warning: unpredictable: identical transfer and status registers --`stxr w3,x3,[x0]'
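The effect of the earlyclobber modifier can be sketched with a small exclusive load/store loop. This is a hedged illustration, not the PR's code: `swap_exclusive_demo` is a hypothetical name, and non-AArch64 builds use a compiler builtin so the example still compiles. The `&` in `"=&r"` tells the compiler that the status register is written before the inputs are consumed, so it may not share a register with the stored value or with the address inside the memory operand.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: atomically store `val` to *ptr and return the old value. */
static inline uint64_t swap_exclusive_demo(uint64_t *ptr, uint64_t val)
{
    /* Exclusive accesses must be naturally aligned. */
    assert(((uintptr_t)ptr & 7) == 0);
#if defined(__aarch64__)
    uint32_t status;   /* stxr result: 0 on success */
    uint64_t old;
    __asm__ volatile(
        "1: ldxr %1, %2\n\t"
        "   stxr %w0, %3, %2\n\t"   /* status must not alias %3 or the base of %2 */
        "   cbnz %w0, 1b"           /* retry if the exclusive store failed */
        : "=&r"(status), "=&r"(old), "+Q"(*ptr)
        : "r"(val)
        : "memory");
    return old;
#else
    /* Fallback (assumption) for other architectures. */
    return __atomic_exchange_n(ptr, val, __ATOMIC_RELAXED);
#endif
}
```

With a plain `"=r"` on the status operand, the compiler would be free to emit something like `stxr w3,x3,[x0]`, which is the form the assembler warns about.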

…er ticks

While lockhammer synchronizes the start of the locking threads, it does not synchronize their completion. Each thread acquires and releases the lock for a specified number of iterations in a loop. However, some threads may finish all of their iterations earlier than others, which reduces contention for the threads that have not yet finished. The resulting performance measurement does not describe full all-thread contention.

This patch provides an option to use a different measurement loop that terminates after a specified number of timer clock ticks has passed. Because reading the timer may have high overhead, the measurement loop samples the timer only after a flag-specified number of lock/release iterations.

    -O ticks         number of timer clock ticks (CNTVCT_EL0 or TSC)
    -I inner_iters   inner iterations of measurement loop in between timer reads
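The tick-limited loop described above could look roughly like the following. This is a minimal sketch under stated assumptions, not lockhammer's actual implementation: `read_timer_ticks`, `measure_for_ticks`, `demo_acquire`, and `demo_release` are hypothetical names, and the `clock_gettime` path is only a fallback for architectures other than AArch64 and x86-64.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical timer read: CNTVCT_EL0 on AArch64, TSC on x86-64. */
static inline uint64_t read_timer_ticks(void)
{
#if defined(__aarch64__)
    uint64_t t;
    __asm__ volatile("isb\n\tmrs %0, cntvct_el0" : "=r"(t));
    return t;
#elif defined(__x86_64__)
    uint32_t lo, hi;
    __asm__ volatile("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    struct timespec ts;   /* fallback (assumption): nanoseconds as "ticks" */
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
#endif
}

/* Trivial test-and-set lock, used only to make the sketch self-contained. */
static void demo_acquire(void *p)
{
    while (atomic_flag_test_and_set_explicit((atomic_flag *)p, memory_order_acquire)) { }
}

static void demo_release(void *p)
{
    atomic_flag_clear_explicit((atomic_flag *)p, memory_order_release);
}

/* Runs acquire/release pairs until at least run_limit_ticks have elapsed.
 * The timer is read once per inner_iters pairs to bound its overhead.
 * Returns the number of completed acquire/release pairs. */
static uint64_t measure_for_ticks(uint64_t run_limit_ticks, uint64_t inner_iters,
                                  void (*acquire)(void *), void (*release)(void *),
                                  void *lock)
{
    assert(inner_iters > 0);
    uint64_t total = 0;
    uint64_t start = read_timer_ticks();
    do {
        for (uint64_t i = 0; i < inner_iters; i++) {
            acquire(lock);
            release(lock);
        }
        total += inner_iters;
    } while (read_timer_ticks() - start < run_limit_ticks);
    return total;
}
```

In this sketch `run_limit_ticks` plays the role of `-O ticks` and `inner_iters` the role of `-I inner_iters`. Each thread would report its own iteration count, and since all threads stop at about the same time, every counted iteration ran under full contention. A larger `inner_iters` lowers timer overhead but lets a thread overshoot the limit by up to one inner batch.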

jty2 added 3 commits October 7, 2021 22:43

lk_cmpxchg: do not use a hard-coded temp register when using LSE

b15a0fa

__cmpxchg_case_##name hard-coded x30 as a temp register, but the compiler flags did not exclude the use of x30. This has been causing heisenbugs when LSE was in use.

jty2 changed the title ~~GitHub jty2.hammer lock~~ hammerlock - invert lockhammer measurement to a run limit number of timer ticks Oct 7, 2021
