Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pre-resize lists with Add() calls inside loops with known bound #31

Open
3 of 6 tasks
dubiousconst282 opened this issue Dec 2, 2023 · 1 comment
Open
3 of 6 tasks

Comments

@dubiousconst282
Copy link
Owner

dubiousconst282 commented Dec 2, 2023

Lists are often instantiated and immediately filled with items from a source having a known length, and it isn't always convenient to explicitly initialize the list with a capacity:

var list = new List<TypeDesc>();

foreach (var itfHandle in typeInfo.GetInterfaceImplementations()) { // `C : IReadOnlyCollection<T>`
    // ...
    list.Add(...);
}
return list;

This could be rewritten to a constructor call with the known capacity, or to a later call to EnsureCapacity(). The first is only possible if there aren't any side effects in between the ctor and source def.

This can be naturally generalized to lists that aren't freshly created. must make sure that calls to EnsureCapacity() always double the list's capacity to avoid +1 resizes that lead to O(n^2) complexity. (This is guaranteed by list.EnsureCapacity(list.Count + addCount))


There's another case where the items are added inline, and an upper-bound could be determined by counting the number of Add() calls instead of a loop bound.

var list = new List<Item>();

if (condA) list.Add(itemA);
if (condB) list.Add(itemB);
if (condC) list.Add(itemC);
// ...

return list;

Note that Add() calls behind if conditions are problematic because we don't know anything about them, so I'm not sure it would be a good idea to presize based on a static probability factor as that could end up pessimizing the code by allocating extra memory (but could still be worth for non-escaping lists with an ToArray() call at the end?).

Worklist

  • Detect for..i and foreach(collection) loops
    • Support for sources based on struct collections (ImmutableArray/Span)
      LSR implements this by checking that the user chain of the struct address is non-escaping and not written inside the loop, which seems reasonable until/if we ever get proper infra like MemSSA.
    • Insert EnsureCapacity() calls and set count based on loops bounds
  • Replace add calls with direct array writes
  • Support for other known collections: ImmutableArray.Builder, Dictionary, HashSet, etc.
  • Add more tests
@neon-sunset
Copy link

If the bound is known, the list can be pre-sized and then instead of .Add() it could be written as

var span = CollectionsMarshal.AsSpan(list);
/* for ... */
CollectionsMarshal.SetCount(list, count);

(in .NET 8)

dubiousconst282 added a commit that referenced this issue Dec 11, 2023
dubiousconst282 added a commit that referenced this issue Dec 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants