-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
group mutate revamp #109
base: devel
Are you sure you want to change the base?
group mutate revamp #109
Conversation
…mutate rely on old mutate_core and mutate_grp functions
…med unnecessary to pass the test in the first place.
… of S4Vectors classes with mutate so everything is coerced to base::list type.
thanks @ppaxisa! I'm just doing a little package cleanup today, I plan to merge in your PR on Wednesday (i'm out of office Mon and Tue) |
Is this a breaking change when a user tries to mutate with a column that is an S4 class (one of the original reasons I didn't use base tidyverse for grouped mutate in plyranges, and why it was slow)? I would prefer an implementation that does allow for S4 columns but understand that maybe this is a rare use case and we are willing to make tradeoffs here. @lawremi do you have thoughts on this? |
I have not formally tested that but yes, I believe this would be a breaking change if attempting to use S4 columns on grouped object (S4 columns would still work on ungrouped object). |
I’ll can do testing on Wednesday (and add unit tests for this case). Maybe there is a solution so we deliver speed when we can but not break on S4. |
@sa-lee Pierre-Paul and I are considering: Upon
Thoughts? |
That sounds like a good approach to me. There are vectorised methods for a lot of S4 List classes that |
Had some things pop up so didn’t get a chance to implement anything but plan on it later this week or next. Thinking this will go to next Bioc cycle which is fine — important to get this working well, given the broad impact. |
If it helps: can you give me some pointers to try and write the convenience function that checks if there are S4 columns in the DataFrame? Not sure where to start but I can spend some time to move forward on this. |
To put a concrete case to the current discussion: |
I have time again to work on this (really!). I'll report back by end of week. |
So I think just a check here that there are no S4 columns is all we need. I can build out the rest. I was trying to follow the logic of this function and got a little lost. Should it just be an https://github.com/ImmuneAxisa/plyranges/blob/mutate_grp_revamp/R/dplyr-mutate.R#L88-L125 Another ask: because the code went through many iterations, could you make a new branch with your final changes, and can you mark the regions with comments, e.g. |
|
Ok I'll review the decision tree for which functions to use, comment that, and create a new branch. This might take a little while, I have limited bandwidth the next 3 weeks. I guess I'll have to close this pull request and submit a new one? |
That would be awesome. No time pressure and thanks again for your effort on this. You can just leave this PR open, and make a new one once you get to it? |
Just chiming into say I'm happy to review an PRs moving forward. |
When operating on an ungrouped object,
mutate
will use methods from{plyranges}
but when operating on a grouped object,mutate
will rely on vanilla tidyverse by converting the object to a tibble. The result is then coerced back toDataFrame
to update the metadata of the object. Of note, core columns can be used in this framework to create new metadata columns.