-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parallelisation for functors #3596
base: develop
Are you sure you want to change the base?
Conversation
Can parallelization be utilized in the Emscripten compiling process to the javascript version of verovio? It is interesting that |
Yes, maybe this is just my machine. It also depends on which other software is busy at the moment (I had |
At a certain point thread overhead overtakes work done. That’s a pretty normal phenomenon. is this true parallelisation or is it concurrency? JS has concurrency now(async/await) even though it’s still single threaded. |
It is true parallelisation, i.e. multi-threaded. Async/await like suspension corresponds to coroutines in C++ which is something different. |
@lpugin Do we want to proceed with this? The effect on total runtime is marginal at the moment. I guess we can currently achieve greater performance improvements by focusing on that the (single threaded) implementation is as smooth as possible (e.g. in the line of #3587 or #3630). Maybe just keep this as an idea. |
@DavidBauer1984 sorry for the delay on the follow up. I wanted to test the different bindings, and as far as I can see it does work as expected. I still want to investigate more precisely the impact (e.g., size) for the JS one. As you pointed out, the performance improvement remains small. One follow-up refactoring might be to turn the drawing process into a functor. That way we could enable the parallelization for some bounding-box drawing steps, which might lead to some additional gain. However, this would be a quite significant refactoring and might yield some memory access problems to deal with. What do you think? |
@lpugin I think the first step should be to move the beam and (initial) slur calculation from the drawing process into functors. This includes any other layout calculation that is done during drawing. Those calculations are currently done repeatedly and maybe it would be sufficient to do them just once or twice. It would also lead to a cleaner code structure. |
Yes, moving some layout calculation steps would already enable some parallelization for some jobs and would be good. I think I would like to keep the PR as draft so we can keep an eye on it and keep it "mergeable". |
This PR adds the functionality to process functors in parallel and enables parallelisation for a few functors where this is straightforward. It should be seen as a starting point to demonstrate how parallelisation can be exploited. To parallelise a significant part of the layout requires a lot more work.
A bit of profiling shows that we actually achieve mild runtime improvements by processing in parallel.
For a medium sized piece (10 pages):
For a large piece (266 pages):
Further remarks:
--max-threads
option.GetProcessingStrategy
andCloneFunctor
. The first one decides whether concurrent processing happens on measures (horizontal layout) or systems (vertical layout). The second one clones the functor before parallel processing and is quite similar toObject::Clone
.MergeFunctor
can be overridden and is designed to merge the results of the calculation on several functors into one functor again (after parallel processing). One example where this makes sense is them_crossStaffSlurs
flag inAdjustSlurs
.VisitScore
orVisitPage
) or transports information from one measure/system to the next (e.g.,PrepareTimeSpanning
orGenerateMIDI
). Another obstacle is if the functor changes global information (e.g. in the document). This might lead to undefined behaviour and requires mutex protection. One needs to carefully check which data is changed by the functor before adding parallelisation.Object
is currently duplicated for const and non-const functors. Not sure if we want to simplify this with templates.