Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix C++ concatStringsSep #11093

Merged
merged 28 commits into from
Jul 14, 2024
Merged

Fix C++ concatStringsSep #11093

merged 28 commits into from
Jul 14, 2024

Conversation

roberth
Copy link
Member

@roberth roberth commented Jul 13, 2024

Motivation

First off, builtins.concatStringsSep is unaffected 🎉

My original goal was to get #11021 mergeable.
This involved pruning the log output (#11091) and then replacing execvpe which is not available on darwin.
This in turn brought be to a bad PATH executable lookup implementation.
Fixing that required a new splitting function that isn't tokenizeString, which needed to be tested.
That's how I found that concatStringsSep does not behave as documented.
Specifically, it ignores empty strings and does not emit a separator until a non-empty string is encountered.
This doesn't happen often, and in the remaining cases, it may be ok to drop empty string items anyway.
Nonetheless, that's not guaranteed, and I've found and slightly improved a few things by reviewing and migrating to the improved concatStringsSep implementation.
.
I was able to check most call sites, which I've categorized by commit if you want to double-check.

To understand what's going on in the commits, my workflow could be summarized as:

  • Give the "broken" function a descriptive name
  • Assert that the corner case is not encountered
  • Review the call sites and migrate them to the improved function without the bad corner case
  • Remove the assert

By reviewing the call sites, I've found and solved an old bug in NIX_REMOTE_SYSTEMS.
This PR also breaks nix help "" "" "" "" "" build. Very space bar heating.

The remaining call sites can be audited async after this PR.
I'd like to merge this PR soon, so that we remove the possibility to accidentally use the bad function in new code. (and it going unnoticed, because I won't be reviewing new call sites that are created in PRs; sorry)

Context

Priorities and Process

Add 👍 to pull requests you find important.

The Nix maintainer team uses a GitHub project board to schedule and track reviews.

@roberth roberth added the bug label Jul 13, 2024
@github-actions github-actions bot added new-cli Relating to the "nix" command store Issues and pull requests concerning the Nix store repl The Read Eval Print Loop, "nix repl" command and debugger labels Jul 13, 2024
@roberth roberth force-pushed the fix-concatStringsSep branch from d7f1d21 to f6205bb Compare July 13, 2024 01:06
roberth added 21 commits July 13, 2024 03:06
The buggy version was previously renamed to
dropEmptyInitThenConcatStringsSep
… docs

These are non-critical, so their behavior is ok to change.
Dropping empty items is not needed and usually not expected.
…attr

The empty attribute name should not be dropped from attribute paths.
Rendering attribute paths with concatStringsSep is lossy and wrong,
but this is just a first improvement while dealing with the
dropEmptyInitThenConcatStringsSep problem.
Bug not reported in 6 years, but here you go.

Also it is safe to switch to normal concatStringsSep behavior
because tokenizeString does not produce empty items.
The sigs field is produced by tokenizeStrings, which does not return
empty strings.
…ty not feasible

I don't think it's completely impossible, but I can't construct
one easily as derivationStrict seems to (re)tokenize the outputs
attribute, dropping the empty output.

It's not a scenario we have to account for here.
…not be empty

(System) features are unlikely to be empty strings, but when they
come in through structuredAttrs, they probably can.
I don't think this means we should drop them, but most likely they
will be dropped after this because next time they'll be parsed with
tokenizeString.

TODO: We should forbid empty features.
…ributes with empty names

Empty attributes are probably not well supported, but the least we
could do is leave a hint.
Attribute path rendering and parsing should be done according to
Nix expression syntax in my opinion.
…"" "" "" build

Garbage in, error out. Experimental CLI. Zero derivations given.
The law has nothing to do with this, although I do feel like a
badass when I mess with the config.
I'm a conf artist.
…as already harmed

Considering that `value` was probably parsed with tokenizeString
prior, it's unlikely to contain empty strings, and we have no
reason to remove them either.
It's still wrong, but one step closer to correct.
Not that anyone should use "" or "." in their PATH, but that is not
for us to intervene.
roberth added 2 commits July 13, 2024 03:06
... but if they are, I'd like to see at least a hint of it so that
I'd know to fix it.
It's usually harmless, if it occurs at all.
@roberth roberth force-pushed the fix-concatStringsSep branch from f6205bb to 1c97718 Compare July 13, 2024 01:06
strings.push_back("");
strings.push_back("");

ASSERT_EQ(concatStringsSep(",", strings), ",");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would result in the empty string in the old version of concatStringsSep.

@@ -56,7 +65,7 @@ auto concatStrings(Parts && ... parts)
-> std::enable_if_t<(... && std::is_convertible_v<Parts, std::string_view>), std::string>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename this one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was unaffected and I've changed it to use the fixed concatMapStrings.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I missed that it uses the fixed one

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was good to change anyway, no problem/ thanks!

roberth added 3 commits July 14, 2024 11:50
When the separator is empty, no difference is observable.

Note that concatStringsSep has centralized definitions. This adds the
required definitions. Alternatively, `strings-inline.hh` could be
included at call sites.
... at call sites that are may be in the hot path.

I do not know how clever the compiler gets at these sites.
My primary concern is to not regress performance and I am confident
that this achieves it the easy way.
@roberth roberth enabled auto-merge July 14, 2024 10:22
@roberth roberth merged commit 9d7397c into master Jul 14, 2024
19 checks passed
@roberth roberth deleted the fix-concatStringsSep branch July 14, 2024 11:11
Ericson2314 added a commit to obsidiansystems/nix that referenced this pull request Aug 5, 2024
The test split matches PR NixOS#8920, so the utility files and tests files
are once again to 1-1. The string changes continues what was started in
PR NixOS#11093.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug new-cli Relating to the "nix" command repl The Read Eval Print Loop, "nix repl" command and debugger store Issues and pull requests concerning the Nix store
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants