-
-
Notifications
You must be signed in to change notification settings - Fork 182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature]: New higher-performance Filter() operators #758
Comments
I think the filter with state option is inspired, and now that you have suggested I am surprised that no-one, myself included have ever to introduce it. I think it's a no-brainer that we should go with it 😃 Regardless the stateless filtering, I see a pitfall:
We could not deterministically specify that 3 should be a remove without knowing that it was originally a match. Alternatively we could always fire a remove, but that would be a load of misleading notifications. The same would be true when we determine an update after an add. I think most operators could handle this but I think it would take a lot of looking to be sure. |
Sure we can, we re-run the predicate for |
Of course. You're right. The third operator must assume that the object does not mutate and that the filter is a pure function. |
For clarity, I want to make sure I understand the use-cases, which might look something like this? class Person
{
public int Id { get; }
public string Name { get; set; }
public int Age { get; set; }
}
var cache = new SourceCache<Person, int>(p => p.Id);
var ageObservable = viewModel.WhenAnyValue(vm => vm.MinAgeFilter); // Connected to some UI, for example
var filtered = cache.Connect().Filter(ageObservable, static (age, p) => p.Age >= age)
.Bind(out this.filteredByAgeCollection)
.Subscribe(); And that would yield Am I understanding everything correctly? If so, I like it. |
That looks about right |
I think we should avoid emitting "misleading notifications" (the first event observed for a given Key should always be Add, the last should always be Remove, there shouldn't be another Add unless there has been a remove, and Update and Refresh should be appear between these two, etc.). We could even modify the Aggregator to validate that operators always have this behavior via testing. Operators should also be written to work-around/ignore misleading notifications as much as possible. These two approaches together should avoid bugs. But consumers would likely be surprised to receive a Remove for something for which there was never an Add. I know I would be.
I understand how applying the predicate to the
How do we find the item that was just dropped so that an Add event can now be omitted unless there's a local cache of all of the items? You have the same issue with any State observable change. I feel like you need the local cache of all items so that the filter can be re-applied ... Or am I missing something? |
The approach of applying the predicate to |
But it isn't stateless. The other observable is even named |
Yes, you're missing the fact that that overload won't be using the "just apply the predicate to |
Got it, thanks! |
Still need to implement the two |
What does that look like? |
Exactly what they look like in the original issue description. |
Are you going to implement those? I might be able to do it. |
I've got an implementation in progress, yeah, no worries. The cache version is done, and I was hoping for a little feedback from @RolandPheasant in Slack, with regard to the list version. |
Got a fun result here.
I worked out 3 different implementations for the List version of this operator, and the one that seems to be the clear winner for performance (Version 3) is actually a FUNCTIONAL improvement as well: it preserves item ordering, through the filter. E.G. if unfiltered items are Verision 1 is the closest implementation to the current List Version 2 attempts to improve on Version 1 by only keeping one list, internally, with each item being stored alongside its filtered index, if any. Removes can now be processed without a linear search for the item, but it instead requires a full iteration of the list to update the "filtered index" values, in accordance with the removal. Version 3 is similar to Version 2, with additional work put in for For extra fun, if we go with Version 3, I think we could theoretically re-implement all the existing List |
Regarding the Cache version of the operator...
All the slightly-different implementations I've tried have essentially the same performance. The different things I've tried include:
I'm planning on going with Version 5. |
Describe the functionality desired 🐞
Consider the following new filtering operators, which should provide performance improvements over existing operators:
Not set on the name for the third one, but it stems from a conversation with @RolandPheasant and @dwcullop in Slack. Alternatives for this would maybe be
.FilterSlim()
or.FilterDeterministic()
or.FilterWithoutCaching()
. Whatever the name, the goal would be to keep it consistent with other operators (like.Transform()
) that normally keep an internal items cache, but could have versions that don't.The steps the functionality will provide
For the first two operators, utilizing an
IObservable<TState>
parameter, the primary improvement operator is to reduce memory allocations required by existing operators that support dynamic filtering.The third operator would improve performance for static-filtering scenarios, where the filter predicate is relatively fast, and is known to be deterministic, I.E. collection items are immutable, at least from the perspective of filtering. Under this scenario,
.Filter()
can be implemented without the need for an internal cache of items, becoming allocation-free for all single-change ChangeSets.This operator would not be applicable as an
ObservableList
operator, as using deterministic-filtering does not eliminate the need for an internal items cache. An internal cache is still needed to track item indexes, so downstream indexes can be adjusted for items filtered out of the source.Considerations
.Filter()
with theIObservable<Func<TObject, bool>>
orIObservable<Func<TObject, TKey, bool>>
parameters to dynamically control filtering, but this requires either a newFunc<>
allocation for each change in filtering rules, or a decent bit of tedious code to reuse and re-emit a sharedFunc<>
instance each time, while instead mutating its closure..FilterOnObservable()
, but this requires multiple object allocations per-added-item, to create anIObservable<bool>
that controls filtering of that item, and to subscribe to that stream. This approach also doesn't really support collection items being immutable, at all, as replace operations force a teardown and reconstruction of the filtering stream for that item.The text was updated successfully, but these errors were encountered: