Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accelerate insertion of large number of elements #149

Merged
merged 5 commits into from
Jan 20, 2024

Conversation

lopezca
Copy link
Contributor

@lopezca lopezca commented Sep 24, 2023

ObjectTransaction>>registeredObjectsDo: iterates over a collection and checks if the element belongs to another collection. This verification uses Array>>includes: which traverses the array until an element is found. For a small number of elements this approach is OK. but when the number of elements increases, then the verification algorithm is O(n^2).

Converting the search array into a set makes the behavior almost linear again for large sizes and has no impact on smaller sized batches.

See time improvement after change:
image

@codecov
Copy link

codecov bot commented Sep 24, 2023

Codecov Report

All modified lines are covered by tests ✅

Comparison is base (b0e9460) 77.75% compared to head (07cad4d) 77.77%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #149      +/-   ##
==========================================
+ Coverage   77.75%   77.77%   +0.01%     
==========================================
  Files         470      470              
  Lines       50331    50332       +1     
==========================================
+ Hits        39136    39144       +8     
+ Misses      11195    11188       -7     
Files Coverage Δ
src/Glorp/ObjectTransaction.class.st 94.33% <100.00%> (+0.03%) ⬆️

... and 2 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@eMaringolo
Copy link
Contributor

The change seems naively safe.
I guess the registered objects are written in random order in this case, do you see any side effect of that? As long as the dependencies are written in the right order I think this is okay to go.

My question is... why not having the registered objects being a Set itself?

The variable previousVersion stored an array of the keys belonging to the dictionary undoMap. This last line regenerated this array for no reason, since previousVersion is a local variable and is not mutated inside the message's body.
…uctures. This then forces the previousVersion array to be recreated. I added as Set to it as well.
@lopezca
Copy link
Contributor Author

lopezca commented Sep 27, 2023

This change does not modify any write order. When iterating over undoMap keys, the order stays the same as before. The change afects the block [:eachKey | (previousVersion includes: eachKey) ifFalse: [newAdditions add: eachKey]] where it checks if previous version contains the key.

I made another commit, because the final line of the block recreates the array with keys from undoMap so I made it a set as well.

I see your point, we could avoid regenerating previousVersion if we update it while adding new keys to the undo map for recursive structures. Let me try that in another PR.

@lopezca
Copy link
Contributor Author

lopezca commented Sep 28, 2023

the last commit improves by adding new elements to the previousVersions set instead of creating a whole new set.
The current implementation passes all tests in my own test suite, so I'm happy with this result.

Copy link
Member

@gcotelli gcotelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@gcotelli gcotelli merged commit c294e22 into pharo-rdbms:master Jan 20, 2024
30 checks passed
@lopezca lopezca deleted the feature/improve-insert-speed branch May 4, 2024 19:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants