Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify how stars work in file patterns #1675

Merged
merged 2 commits into from
Aug 30, 2023
Merged

Clarify how stars work in file patterns #1675

merged 2 commits into from
Aug 30, 2023

Conversation

nedbat
Copy link
Owner

@nedbat nedbat commented Aug 29, 2023

@webknjaz How does this look?

CHANGES.rst Outdated
<pull 1650_>`_ and graciously agreed to a pragmatic resolution.

.. _starbad: https://github.com/nedbat/coveragepy/issues/1407#issuecomment-1631085209
.. _issue 1646: https://github.com/nedbat/coveragepy/issues/1646
Copy link
Contributor

@webknjaz webknjaz Aug 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will fix the docs build crash @ https://github.com/nedbat/coveragepy/actions/runs/6018518753/job/16326834195?pr=1675#step:5:33

Suggested change
.. _issue 1646: https://github.com/nedbat/coveragepy/issues/1646
.. _pull 1650: https://github.com/nedbat/coveragepy/pull/1650

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

D'oh!

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

nomatches=["abc/xfoo/hi.py"],
),
globs_to_regex_params(
["*c/foo"], case_insensitive=False, partial=True,
matches=["abc/foo/hi.py"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this will also work:

Suggested change
matches=["abc/foo/hi.py"],
matches=["abc/foo/hi.py", "stuff/abc/foo/hi.py"],

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait, I see that a similar case is already in nomatches.. Let me reread the new doc explanation.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't match. The next line includes nomatches=["def/abc/foo/hi.py"]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I saw that after writing the comment.

doc/source.rst Outdated
Comment on lines 189 to 190
- ``*`` as the first or last path component matches anything including
directory separators: any number of path components, including none, like
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this kinda implies that "*thing/stuff/" would match "some/xthing/stuff/" because it's unclear what a “path component” refers to.
I think, this is because of the ambiguity — the match expression with a wildcard looks like "*thing/stuff/" and so I read this as "*thing" is the first path component, and it matches things like [a-zA-Z0-9/] basically.
So while you must've been thinking of the path of a discovered file during the filtering process, the reader is likely to think of what's in front of them — a file path pattern.

Maybe, we could rephrase this by specifically pointing out the “if a path pattern starts with */, this is equivalent to **/”? WDYT?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I had a few tries at how to explain this, and didn't love any of them. I think your idea is a good one.

doc/source.rst Outdated
- ``?`` matches a single file name character.

- ``**`` matches any number of nested directory names, including none.
- ``*`` as part of a path component matches any number of file name characters,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, we could clarify the distinction between "part of" and "the entire match pattern component" through a concrete example?

globs_to_regex_params(
["foo/x*"], case_insensitive=False, partial=False,
matches=["foo/x", "foo/xhi.py"],
nomatches=["foo/x/hi.py"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though, it's there in the source code, I forgot that this wouldn't match since some time has passed since I first opened my PR. So when I read the new explanation, I assumed this would match too. I think that adding more specific examples to the docs would really help to disambiguate them.

CHANGES.rst Outdated
leading or trailing star component matches any number of path components,
like a double star would. This is different than the behavior of a star in
the middle of a pattern. This discrepancy was `identified by Sviatoslav
Sydorenko <starbad_>`_, who `provided patient detailed diagnosis
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a somewhat unrelated note, I wanted to share one practice I came up with for crediting people in changelogs in projects using Towncrier fragments. I've added a :user: RST role to Sphinx (example: https://github.com/pypa/setuptools/blob/1ef36f2/docs/conf.py#L138) and encourage the contributors to add bylines with that role. It expands into a GitHub Sponsors link. When it's not activated, GH makes a redirect to the normal user page.
I borrowed part of the idea from Tox — the only link GH profiles. Here's what it looks like: https://setuptools.pypa.io/en/latest/history.html#v67-0-0.
I use it in some of the projects I maintain or just contribute to.

I figured, if this document is targeted at being consumed by Sphinx and not other tools/systems, you might want to implement something similar in the future.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a cool idea, I will look into it. This changelog becomes part of the GitHub releases, so I have to be careful about unusual features.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's one pain point. You need to convert to GitHub-Flavoured Markdown for GH Releases. I used pandoc for this in some workflows. Though, since pandoc doesn't know of Sphinx roles (only about the docutils' ones), I had to arm with regexes and pre-process those files via sed.

@nedbat nedbat force-pushed the nedbat/finish-1650 branch 2 times, most recently from e63f03c to 3c8265f Compare August 29, 2023 23:54
doc/source.rst Outdated
- ``*`` matches any number of file name characters, not including the directory
separator. As a special case, if a pattern starts with ``*/``, it is treated
as ``**/``, and if a pattern ends with ``/*``, it is treated as ``/**`.

- ``**`` matches any number of nested directory names, including none.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having reviewed other places, I'm starting to get a feeling that things like x/**z/y and s/x**z/y won't work the way it's implied here either. How about adding more examples to this as well?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x/**z/y and s/x**z/y are refused as invalid.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sounds reasonable. Is there a test for this? Should this be mentioned in a "non-working example"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this also clarify something like “must not be combined with any other characters in the same path frament of the match pattern”?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, what happens with "***/**"?

doc/source.rst Outdated
- ``?`` matches a single file name character.

- ``*`` matches any number of file name characters, not including the directory
separator. As a special case, if a pattern starts with ``*/``, it is treated
as ``**/``, and if a pattern ends with ``/*``, it is treated as ``/**`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
as ``**/``, and if a pattern ends with ``/*``, it is treated as ``/**`.
as ``**/``, and if a pattern ends with ``/*``, it is treated as ``/**``.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

We decided to leave trailing `*` meaning the same as `**` because it was
already in use.

Original description:

This patch demonstrates a corner case in the path glob matcher.
Specifically, it documents how a single trailing asterisk is supposed
to be treated as opposed to a double asterisk.
With [[1]], a trailing `/*` is interpreted as an equivalent of `/**`.
The commit add a case that shows that `/*` shouldn't be greedy as
described in the docs [[2]][[3]].

See also the observations in the bug report ticket [[4]].

[1]: ec6205a
[2]: https://coverage.rtfd.io/en/stable/source.html#file-patterns
[3]: https://coverage.rtfd.io/en/7.2.7/migrating.html#migrating-to-coverage-py-7-x
[4]: #1407 (comment)
Copy link
Owner Author

@nedbat nedbat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x

doc/source.rst Outdated
- ``?`` matches a single file name character.

- ``*`` matches any number of file name characters, not including the directory
separator. As a special case, if a pattern starts with ``*/``, it is treated
as ``**/``, and if a pattern ends with ``/*``, it is treated as ``/**`.
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@nedbat
Copy link
Owner Author

nedbat commented Aug 30, 2023

I hear what you are saying about examples in the docs. I'll have to think about how to do that.

@webknjaz
Copy link
Contributor

Thanks for picking this up! I've been meaning to add some docs myself, but you got to doing this first.
Short of the examples, I think the PR is on the right track towards clarity!

@webknjaz
Copy link
Contributor

@nedbat so I wanted to preview how the PR renders and realized that this repository doesn't have PR builds enabled on RTD.
Assuming, this isn't on purpose, could you enable this feature? It's just a single checkbox at the bottom of the first section on this page: https://readthedocs.org/dashboard/coverage/advanced/.
Ref: https://docs.readthedocs.io/en/stable/guides/pull-requests.html.

@nedbat nedbat force-pushed the nedbat/finish-1650 branch 8 times, most recently from 0bf5932 to 32383e0 Compare August 30, 2023 12:14
@nedbat
Copy link
Owner Author

nedbat commented Aug 30, 2023

I've added some examples. Here's the rendered page: https://coverage.readthedocs.io/en/nedbat-finish-1650/source.html#file-patterns

@webknjaz
Copy link
Contributor

webknjaz commented Aug 30, 2023

@nedbat I see… Though, it's not the same as PR builds. This only works for your personal branches created upstream. PR builds work for PRs coming from forks submitted by anybody.

@nedbat
Copy link
Owner Author

nedbat commented Aug 30, 2023

Right, I don't have pull requests configured to build docs on readthedocs, I created this one manually.

@webknjaz
Copy link
Contributor

Is that because you don't want to see those build statuses in PRs?

@nedbat
Copy link
Owner Author

nedbat commented Aug 30, 2023

It's because I don't usually use pull requests myself, and I don't get enough pull requests that update docs in significant ways, so it's never been important to set up.

@webknjaz
Copy link
Contributor

Fair enough. Though, from the contributor point, it's definitely useful. I'm so used to having this available that it even confused me a little. I think nowadays, all the new projects have it enabled by default and the only ones that don't are old projects that were created before 2018.

doc/source.rst Outdated
Comment on lines 181 to 182
File path patterns are used for include and omit, and for combining path
remapping. They follow common shell syntax:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a good place to reference the corresponding features:

Suggested change
File path patterns are used for include and omit, and for combining path
remapping. They follow common shell syntax:
File path patterns are used for :ref:`include <config_run_include>` and
:ref:`omit <config_run_omit>`, and for :ref:`combining path remapping
<cmd_combine_remapping>`. They follow common shell syntax:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shameless plug: I looked up the references through my little static site generator for intersphinx-enabled docs: https://webknjaz.github.io/intersphinx-untangled/coverage.rtfd.io/

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you force-push on top after that?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what you mean. The links are now on master.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, sorry about the noise! GitHub must've been showing me outdated diffs due to its infamous eventual consistency. And then, I've confused myself even more by clicking on the readme badge. Now I realize that the badge doesn't lead to the latest docs as it often happens in other projects.
Having understood that, I checked the correct place and I do indeed see the final change.

@webknjaz
Copy link
Contributor

I've added some examples.

Can we also have examples for */sub*/*, */*sub/test_*.py, */*sub/*sub/**. And maybe a separate one under an .. attention:: admonition for all the things that are supposed to error out?

@webknjaz
Copy link
Contributor

Review round complete.

@nedbat
Copy link
Owner Author

nedbat commented Aug 30, 2023

I've enabled PR builds on readthedocs.

@nedbat
Copy link
Owner Author

nedbat commented Aug 30, 2023

Thanks for all the help!

@nedbat nedbat merged commit 86955f2 into master Aug 30, 2023
62 checks passed
@nedbat nedbat deleted the nedbat/finish-1650 branch August 30, 2023 14:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants