string type with no malloc (WiP for discussion) #128

Open
wants to merge 1 commit into
base: develop

Conversation

pgu-swir
Contributor

Hi - while I was tweaking my eRPC use, I needed to find a solution to send a few really large strings. Increasing the default buffer size does not seem to be a good idea, but before looking for a solution, I realized that the server shim always does a malloc and a copy for every received string.

Using malloc is usually not great, and I was wondering if there is a reason for not using pointers into the transport buffer. Of course, there are cases where it's not desirable because of memory isolation, but was there a reason beyond that?

This PR, which is just for discussion, tries to add an annotation @direct which, when used, makes the client shim send the string including its terminating NUL byte, so that the server shim can pass a pointer directly into the received buffer. When @direct is not used, code should be generated as usual.
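
To make the idea concrete, here is a minimal sketch in C of what sending the NUL byte buys the receiver. The wire layout (a 32-bit length, then the characters, then the NUL) and the function names `write_string_direct` and `read_string_direct` are assumptions for illustration only; they are not the actual eRPC codec or the code generated by this PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire layout for one string: [uint32 length][bytes][NUL].
 * The real eRPC codec may lay things out differently. */

/* Client side: write the length, the characters and the terminating NUL.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
static size_t write_string_direct(uint8_t *buf, size_t cap, const char *s)
{
    assert(buf != NULL && s != NULL);
    size_t len = strlen(s);
    if (len > UINT32_MAX || cap < sizeof(uint32_t) + 1 ||
        len > cap - sizeof(uint32_t) - 1) {
        return 0;
    }
    uint32_t len32 = (uint32_t)len;
    memcpy(buf, &len32, sizeof len32);
    memcpy(buf + sizeof len32, s, len + 1); /* len + 1 copies the NUL too */
    return sizeof len32 + len + 1;
}

/* Server side: return a pointer into the received buffer (no malloc, no copy).
 * Returns NULL if the data is truncated or not NUL-terminated. */
static const char *read_string_direct(const uint8_t *buf, size_t size,
                                      size_t *consumed)
{
    assert(buf != NULL && consumed != NULL);
    uint32_t len32;
    if (size < sizeof len32 + 1) {
        return NULL;
    }
    memcpy(&len32, buf, sizeof len32);
    size_t avail = size - sizeof len32; /* bytes after the length field */
    if (len32 > avail - 1 || buf[sizeof len32 + len32] != 0) {
        return NULL;
    }
    *consumed = sizeof len32 + len32 + 1;
    return (const char *)(buf + sizeof len32);
}
```

Because the terminator travels with the data, the returned pointer is a valid C string for as long as the message buffer stays alive, which is exactly the lifetime constraint the discussion below is about.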

Let me know what you think.

@Hadatko
Member

Hadatko commented Sep 20, 2020

Hi @pgu-swir, I really appreciate your activity here. This is not only about strings but about all pointer-type variables (lists, binaries, ...). A long time ago I had a similar opinion, and maybe I will find my notes on this. There can be more solutions depending on the architecture, but we were focusing on making it work generally for all platforms and setups. These solutions are effective for C/C++, but not that effective for Python, ...
Shared memory on a multicore system could be quite dangerous. You can send only the first pointer, or all pointers to shared structures, but then the user needs to be careful when using these shared structures.
Another solution could be to use the existing MessageBuffers created dynamically, so the data are safe to access on the second side. But the user needs to know that they must copy the data to their own variables if they want to use them after the eRPC call. Also, the endianness needs to be the same on both sides.

@Hadatko
Member

Hadatko commented Sep 20, 2020

I really need to find my notes. I think this solution took much more time to serialize on the sender side, but almost no time on the receiver side.

@MichalPrincNXP maybe you have my notes for eRPC, as I copied my folder to you. Or even better, I think they may be in OneNote for eRPC, if you are still using that. Not in the weekly meeting section but under some eRPC-related bookmark (there were 3-4 bookmarks in the eRPC tab).

@pgu-swir
Contributor Author

pgu-swir commented Sep 20, 2020

> Hi @pgu-swir, I really appreciate your activity here. This is not only about strings but about all pointer-type variables (lists, binaries, ...)

YW :-) I agree, that's also why I made the annotation @direct optional and disabled by default. I also agree it will concern all pointer-type variables, but I wanted to try one first and see whether the idea resonated with you.

> Shared memory on a multicore system could be quite dangerous. You can send only the first pointer, or all pointers to shared structures, but then the user needs to be careful when using these shared structures.

But here I'm not sure I follow you. The idea is that the client shim still puts the data to be exchanged in a buffer that is fully copied to the server side. Until now, the server (in the case of strings) allocated a temporary buffer for each string variable, copied the data from the message buffer into the respective string variable, and then called the real function.
All I'm doing now is passing pointers to locations inside the message buffer, thus avoiding a malloc/free and a copy. I'm not doing further optimizations; typically, the "bufferization" is untouched.

> Another solution could be to use the existing MessageBuffers created dynamically, so the data are safe to access on the second side. But the user needs to know that they must copy the data to their own variables if they want to use them after the eRPC call. Also, the endianness needs to be the same on both sides.

Yes, but the endianness problem already exists today, no? Also, if the user wants to keep using a pointer on the server side after the eRPC call, don't they have to annotate the IDL with @retain? Otherwise the allocated pointers are freed. In the example I made, zero-copy is disabled if @retain is set, even if @direct is set as well.
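
The interplay between the two annotations can be sketched as below. This is only an illustration in C of the rule described above (direct access unless the data must outlive the call); the type `decoded_string` and the function `decode_string` are hypothetical names, and in the real generator the choice would be made at code-generation time rather than through runtime flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type: the string handed to the user function, plus
 * whether it lives on the heap (and so must be freed or retained). */
typedef struct {
    const char *ptr;
    bool owned;
} decoded_string;

/* in_buffer points at a NUL-terminated string inside the received message
 * buffer. direct and retain stand in for the @direct and @retain annotations. */
static decoded_string decode_string(const char *in_buffer, bool direct,
                                    bool retain)
{
    assert(in_buffer != NULL);
    decoded_string out = { NULL, false };

    if (direct && !retain) {
        /* Zero-copy path: valid only while the message buffer is alive. */
        out.ptr = in_buffer;
        return out;
    }

    /* Usual path: malloc and copy, so the data can outlive the buffer. */
    size_t n = strlen(in_buffer) + 1;
    char *copy = (char *)malloc(n);
    if (copy != NULL) {
        memcpy(copy, in_buffer, n);
        out.ptr = copy;
        out.owned = true;
    }
    return out;
}
```

With `retain` set, the caller always receives heap memory it may keep after the call returns, so the zero-copy path is skipped even when `direct` is requested.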

@Hadatko
Member

Hadatko commented Sep 29, 2020

This was the idea. Basically, copy the data types one by one and recursively copy their sub-data types. Pointer addresses are replaced with offsets on the sender side and converted back to pointers on the receiver side:
image001
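
Since the image is not reproduced here, the following C sketch shows one way the pointer-to-offset scheme could look for a struct with a single string member. The types `record` and `wire_record` and the functions `flatten` and `unflatten` are hypothetical and are not taken from the notes or from eRPC; the sketch also assumes both sides share the same endianness and struct layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* In-memory form, as the user function sees it. */
typedef struct { uint32_t id; const char *name; } record;
/* Flattened form: the pointer is replaced by an offset from buffer start. */
typedef struct { uint32_t id; uint32_t name_off; } wire_record;

/* Sender: copy the struct, then its sub-data, storing an offset instead of
 * the address. Returns bytes written, or 0 if the buffer is too small. */
static size_t flatten(const record *r, uint8_t *buf, size_t cap)
{
    assert(r != NULL && r->name != NULL && buf != NULL);
    size_t name_len = strlen(r->name) + 1; /* include the NUL */
    if (cap < sizeof(wire_record) || name_len > cap - sizeof(wire_record)) {
        return 0;
    }
    wire_record w = { r->id, (uint32_t)sizeof(wire_record) };
    memcpy(buf, &w, sizeof w);
    memcpy(buf + w.name_off, r->name, name_len);
    return sizeof w + name_len;
}

/* Receiver: turn the offset back into a pointer into the received buffer.
 * Nothing is allocated; out->name is valid only while buf is alive. */
static bool unflatten(const uint8_t *buf, size_t size, record *out)
{
    assert(buf != NULL && out != NULL);
    wire_record w;
    if (size < sizeof w) {
        return false;
    }
    memcpy(&w, buf, sizeof w);
    if (w.name_off < sizeof w || w.name_off >= size) {
        return false; /* offset points outside the payload */
    }
    if (memchr(buf + w.name_off, '\0', size - w.name_off) == NULL) {
        return false; /* string is not terminated inside the buffer */
    }
    out->id = w.id;
    out->name = (const char *)(buf + w.name_off);
    return true;
}
```

The sender does most of the work (walking and copying the data), while the receiver only validates the offset and adds it to the buffer address, which matches the recollection that serialization was slower but deserialization took almost no time.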

@pgu-swir
Contributor Author

Thanks for sharing - this is what I have implemented for strings, but there are still fixes to be done for out strings with the @max_length annotation. I started to look at structures, binaries and lists, but it's getting more and more complex, so I was wondering what the appetite for that is?

@Hadatko
Member

Hadatko commented Sep 29, 2020

I think that maybe we can write another implementation of the shim code generator (just template files) for this case. Maybe it can be more transparent and also more generic. Not sure. I'm also not sure whether we want a completely new template file or only a version 2 of the eRPC function call implementation.

@kikass13

kikass13 commented Oct 7, 2021

Every step in the direction of zero-copy is a good step. I am slightly annoyed by the fact that eRPC still uses the heap (for no reason). It's nothing some static buffers can't handle, especially in the use case you have described here, where data can be "consumed" via reference/pointer access and copied (onto the stack if necessary) later.

3 participants