Add new cache management instruction support to drmemtrace #7111

derekbruening · 2024-12-09T16:47:02Z

My pending comment in #7109 which adds CLFLUSHOPT opcode decoding support:

drmemtrace needs to be udpated as well: it looks like CLFLUSHOPT also invalidates all cache levels and so should be treated like CLFLUSH in memtraces (or would a simulator want to know about the subtle differences?). So just adding to 2 spots in instru.cpp https://github.com/DynamoRIO/dynamorio/blob/master/clients/drcachesim/tracer/instru.cpp#L227 and https://github.com/DynamoRIO/dynamorio/blob/master/clients/drcachesim/tracer/instru.cpp#L244 should be enough (optionally extend invariant_checker_test). Hmm. Maybe enough questions this should be separated out.

derekbruening · 2024-12-09T16:49:03Z

We may also want to mark other cache ops such as CLWB and CLDEMOTE.

Adds handling to drmemtrace for the x86 CLFLUSHOPT opcode to treat it just like CLFLUSH; the two are similar enough for most uses. Augments the existing CLFLUSH invariant_checker test as a sanity test of the new opcode handling. Issue: #7111

derekbruening · 2024-12-11T22:01:20Z

In fact, e24dbb0 has CLWB marked as a load: which it is not. If we do nothing, the trace will have it labeled as a read==load record. That highlights that this affects all analyzers and not just those that want to model cache behavior: those looking at loads will do the wrong thing. So this is not just about augmenting the trace but correcting the trace.

Today a functional cache simulator like dr$sim doesn't need any architecture-specific code: everything that affects the caches is a first-class record type. If we don't do the same with these operations, we're forcing dr$sim and others to add decoding and architecture-specific handling of the various cache operations. The complexity of figuring out what these operations do has to live somewhere: the question is, does it live in the trace itself and the consumers like dr$sim remain arch-agnostic and simple, or does it live in the consumers and they become complicated? It could live in a library to share among analyzers.

We added the whole series of TRACE_TYPE_PREFETCH_INSTR_L1, TRACE_TYPE_PREFETCH_INSTR_L1_NT, etc. record types, along with cache flush types, back when we didn't have embedded encodings: so there was no simple way to get that info (though you could map the binaries and decode it was rarely done). So is there less of an argument to continue in the pre-existing approach of having all cache-related operations as first-class record types because it is now a little easier to decode?

Since we have to correct the marking of this and related operations as loads, we need some kind of new record type. Trying to boil this down to these choices:

A. Add a catch-all "cache operation" first-class record type and:
1. Add trace markers with the particular cache operations. This tries to avoid breaking analyzers/simulators by having to deal with new top-level record types: new markers are more ignorable as they are just hints.
2. Add a library that provides an enum of cache operations produced from decoding: basically putting the subtypes into a library's enum instead of trace record types.
3. Leave it up to each analyzer to decode with ISA-specific dispatch to figure out the exact semantics.
B. Add first-class record types for each type of operation: this aligns with what is already there for prefetches and cache flushes.

khuey · 2024-12-11T22:35:59Z

In fact, e24dbb0 has CLWB marked as a load: which it is not. If we do nothing, the trace will have it labeled as a read==load record. That highlights that this affects all analyzers and not just those that want to model cache behavior: those looking at loads will do the wrong thing. So this is not just about augmenting the trace but correcting the trace.

fwiw I have a similar issue. Our use of DR is confined to the decoding/encoding components. We examine blocks of instructions and capture their memory inputs so that the block can later be re-executed (with appropriate rewriting) outside of the original program by providing the initial register states and said memory inputs. We end up special casing the prefetch/cache flush instructions to avoid unnecessarily capturing their input memory operands precisely because they are not architectural loads.

derekbruening · 2024-12-12T16:09:32Z

Should there be an IR API routine to query whether a load actually issues a regular load? A different version of opc_is_not_a_real_memory_load(); and maybe export that one too.

khuey · 2024-12-12T22:05:16Z

Yeah that might be nice. I looked at adding an equivalent of instr_is_prefetch() for these cache flushing instructions but it looked non-trivial for ARM because there's not a separate opcode there.

derekbruening · 2024-12-12T22:43:58Z

Yeah that might be nice. I looked at adding an equivalent of instr_is_prefetch() for these cache flushing instructions but it looked non-trivial for ARM because there's not a separate opcode there.

Some of us have lobbied to separate the DR versions of some Arm opcodes to make these things easier. See the discussion at
#4386 (comment) on
opcode philosophy of separate opcodes for separate semantics; also xref #4388 on making drmemtrace's Arm prefetch operation dispatch cleaner.

derekbruening added the Component-DrMemtrace label Dec 9, 2024

derekbruening self-assigned this Dec 9, 2024

derekbruening mentioned this issue Dec 9, 2024

Add decode/encode support for additional Intel cache management instructions. #7109

Merged

derekbruening mentioned this issue Dec 10, 2024

i#7111 cache opcodes: Add CLFLUSHOPT handling to drmemtrace #7116

Merged

derekbruening changed the title ~~Add CLFLUSHOPT support to drmemtrace~~ Add new cache management instruction support to drmemtrace Dec 11, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add new cache management instruction support to drmemtrace #7111

Add new cache management instruction support to drmemtrace #7111

derekbruening commented Dec 9, 2024

derekbruening commented Dec 9, 2024

derekbruening commented Dec 11, 2024

khuey commented Dec 11, 2024

derekbruening commented Dec 12, 2024

khuey commented Dec 12, 2024

derekbruening commented Dec 12, 2024

Add new cache management instruction support to drmemtrace #7111

Add new cache management instruction support to drmemtrace #7111

Comments

derekbruening commented Dec 9, 2024

derekbruening commented Dec 9, 2024

derekbruening commented Dec 11, 2024

khuey commented Dec 11, 2024

derekbruening commented Dec 12, 2024

khuey commented Dec 12, 2024

derekbruening commented Dec 12, 2024