-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SleighContextBuild set_image with &[u8] slice without copy #7
Comments
Hey! I assume you're referring to the Other than the ergonomics of dealing with more lifetimes, there's no reason why On the Regarding the new impl, you could implement for The best performance fix would be to have |
Yes, I meant this https://github.com/toolCHAINZ/jingle/blob/main/jingle_sleigh/src/ffi/image.rs#L12, would love to have instead
and pass a reference to C++ code P.S: I think there is similar issue to |
Haha, yeah, and there's the fun, because this is not legal rust: #[derive(Debug, Clone)]
pub struct ImageSection {
pub(crate) data: &[u8],
pub(crate) base_address: usize,
pub(crate) perms: Perms,
} It would have to be: #[derive(Debug, Clone)]
pub struct ImageSection<'a> {
pub(crate) data: &'a [u8],
pub(crate) base_address: usize,
pub(crate) perms: Perms,
} Which would then mean that It's possible there's another source of the performance hit you're getting too: for every new I've also internally gated the jingle/jingle_sleigh/src/context/mod.rs Lines 80 to 90 in a17d75c
This is necessary because the routine that parses the sleigh definition is VERY not thread safe. So if you're trying to make lots of sleigh contexts within threads in the same process, you're going to get bottlenecked severely during context creation. Maybe in some future iteration of this we allow re-initializing an existing context with a new image, and pass images by reference, but that will probably require me understanding how |
Oh yeah, and definitely agreed that the |
Sure, there should definitely be a lifetime specified. The code I provided was more of a 'pseudo-Rust' example, just to convey the concept. I’ll be happy to submit a PR if I get something working. The only question is when - hopefully within a week or two, depends on time availability |
I kind of solved it by cloning. Not an optimal solution, but better than parse every time:
And after I just do:
|
So, yes there's the parsing occuring there but I'm actually referring to parsing on the C++ side. We do all that rust-based parsing to enumerate what architectures are available, how to configure them, and to derive the path for the
So in the future, we can maybe better serve your use-case by enabling mutation of a sleigh context and its |
I did some closer reading of the sleigh code and think I have a better understanding of what sleigh will easily support and not support. The I believe this As far as A constant reference however feels possible. The main difficulty there is wanting to avoid a lifetime argument on the sleigh context over in Rust. Maybe the context object could be split up and it could return a I'll definitely be making the first change to avoid reloading soon, and then might also think about making the reference change. |
That sounds great! With const ref, we can eliminate multiple copies. Sorry, I haven't had time to work on this yet |
So I haven't made the change necessary to just send a slice over (and after working with sleigh more, I'm less optimistic that it would work correctly). BUT I've made some changes that should greatly speed up usage of See this unit test as an example: #[test]
fn test_two_images() {
let mov_eax_0: [u8; 4] = [0x0f, 0x05, 0x0f, 0x05];
let nops: [u8; 9] = [0x90, 0x90, 0x90, 0x90, 0x0f, 0x05, 0x0f, 0x05, 0x0f];
let ctx_builder =
SleighContextBuilder::load_ghidra_installation("/Applications/ghidra").unwrap();
let sleigh = ctx_builder.build(SLEIGH_ARCH).unwrap();
let mut sleigh = sleigh.initialize_with_image(mov_eax_0.as_slice()).unwrap();
let instr1 = sleigh.instruction_at(0);
sleigh.set_image(nops.as_slice()).unwrap();
let instr2 = sleigh.instruction_at(0);
assert_ne!(instr1, instr2);
assert_ne!(instr1, None);
let instr2 = sleigh.instruction_at(4);
assert_ne!(instr1, instr2);
assert_ne!(instr2, None);
let instr3 = sleigh.instruction_at(8);
assert_eq!(instr3, None);
} A few things to note here: let sleigh = ctx_builder.build(SLEIGH_ARCH).unwrap();
let mut sleigh = sleigh.initialize_with_image(mov_eax_0.as_slice()).unwrap(); This first line creates a sleigh context without requiring that an image be loaded, which is nice for the modeling work in This sleigh.set_image(nops.as_slice()).unwrap(); This is just done with a mutable reference, so you can reuse the same context for multiple images. It will not re-parse the sleigh definitions; in my very quick performance testing, the reset seems to take on the order of a millisecond. Also, note that slices are passed directly into set image. These slices are still being cloned internally, but all the gross conversion to an |
Thank you for the update, this is great! However, I'm a bit confused. Why do you think it wouldn't be possible to provide just a reference to the bytes to lift, given that we're simply wrapping around the C++ API? |
Err, sorry I was conflating a couple things. The idea of a mutable slice isn't going to fly because of how sleigh caches disassembly. The difficulty with a slice is that my preference would be to also allow images with owned data. The most direct way I can think of to do this would be to make If you (or someone else) finds a way to do allow this without adding a lifetime into the context itself and still allowing owned data, I'd be happy to review/merge. I'll leave this open, but am not actively working on this for now (unless a good approach comes to me, or I decide to just suck it up and add a lifetime). |
I had a thought on another approach that might work, which I've detailed here: #15 I'll try this out when I get the chance. |
Done in #16 |
Hi!
I was wondering if there's a specific reason why Image works with Vec instead of using &[u8] directly. In my use case, I provide multiple slices to the lifter to obtain the corresponding pcode, but since Image is copying slice values to vector, I'm experiencing a significant performance hit.
Additionally, would it be possible for functions like pub fn set_image(mut self, img: Image) -> Self to take &mut self instead?
I suppose my solution would be to fork and implement set_image directly on &[u8], but I wanted to ask first if there's a particular reason for the current implementation.
Thank you!
The text was updated successfully, but these errors were encountered: