Pre-resize lists with Add() calls inside loops with known bound #31

dubiousconst282 · 2023-12-02T20:09:36Z

Lists are often instantiated and immediately filled with items from a source having a known length, and it isn't always convenient to explicitly initialize the list with a capacity:

var list = new List<TypeDesc>();

foreach (var itfHandle in typeInfo.GetInterfaceImplementations()) { // `C : IReadOnlyCollection<T>`
    // ...
    list.Add(...);
}
return list;

This could be rewritten to a constructor call with the known capacity, or to a later call to EnsureCapacity(). The first is only possible if there aren't any side effects in between the ctor and source def.

This can be naturally generalized to lists that aren't freshly created. ~~must make sure that calls to EnsureCapacity() always double the list's capacity to avoid +1 resizes that lead to O(n^2) complexity.~~ (This is guaranteed by list.EnsureCapacity(list.Count + addCount))

There's another case where the items are added inline, and an upper-bound could be determined by counting the number of Add() calls instead of a loop bound.

var list = new List<Item>();

if (condA) list.Add(itemA);
if (condB) list.Add(itemB);
if (condC) list.Add(itemC);
// ...

return list;

Note that Add() calls behind if conditions are problematic because we don't know anything about them, so I'm not sure it would be a good idea to presize based on a static probability factor as that could end up pessimizing the code by allocating extra memory (but could still be worth for non-escaping lists with an ToArray() call at the end?).

Worklist

Detect for..i and foreach(collection) loops
- Support for sources based on struct collections (ImmutableArray/Span)
  LSR implements this by checking that the user chain of the struct address is non-escaping and not written inside the loop, which seems reasonable until/if we ever get proper infra like MemSSA.
- Insert EnsureCapacity() calls and set count based on loops bounds
Replace add calls with direct array writes
Support for other known collections: ImmutableArray.Builder, Dictionary, HashSet, etc.
Add more tests

The text was updated successfully, but these errors were encountered:

neon-sunset · 2023-12-02T20:34:01Z

If the bound is known, the list can be pre-sized and then instead of .Add() it could be written as

var span = CollectionsMarshal.AsSpan(list);
/* for ... */
CollectionsMarshal.SetCount(list, count);

(in .NET 8)

Contributes to #31

Contributes to #31, #32

dubiousconst282 mentioned this issue Dec 2, 2023

Copy elision for arrays and lists #32

Open

dubiousconst282 added a commit that referenced this issue Dec 11, 2023

Introduce PresizeLists pass

f414559

Contributes to #31

dubiousconst282 added a commit that referenced this issue Dec 14, 2023

PresizeLists: build over array directly when possible

5dacb95

Contributes to #31, #32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pre-resize lists with Add() calls inside loops with known bound #31

Pre-resize lists with Add() calls inside loops with known bound #31

dubiousconst282 commented Dec 2, 2023 •

edited

Loading

neon-sunset commented Dec 2, 2023

Pre-resize lists with Add() calls inside loops with known bound #31

Pre-resize lists with Add() calls inside loops with known bound #31

Comments

dubiousconst282 commented Dec 2, 2023 • edited Loading

Worklist

neon-sunset commented Dec 2, 2023

dubiousconst282 commented Dec 2, 2023 •

edited

Loading