Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Package Manager with Version for GitHub Actions #11043

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

kbukum1
Copy link
Contributor

@kbukum1 kbukum1 commented Dec 2, 2024

What are you trying to accomplish?

This PR introduces the implementation of a package manager with versioning support for GitHub Actions workflows. The addition ensures that version information can be extracted and managed for dependencies defined in uses keys within GitHub Actions workflow files.

This change is part of improving ecosystem compatibility and dependency management for GitHub Actions workflows.

Anything you want to highlight for special attention from reviewers?

  • The package_manager method in the file parser extracts the version from uses keys in workflow files.
  • The default version is set to "0.0.0" if no valid versions are found.
  • This approach supports the most common use cases for GitHub Actions versioning.

If there are alternate strategies for handling multiple versions or selecting the appropriate version, I welcome feedback.

How will you know you've accomplished your goal?

  • Versioned package manager initialization will function correctly with GitHub Actions workflows.
  • Tests have been added to validate that the correct version is extracted from workflows and passed to the package manager.
  • Successful integration within the Dependabot framework, ensuring accurate dependency resolution for GitHub Actions.

Checklist

  • I have run the complete test suite to ensure all tests and linters pass.
  • I have thoroughly tested my code changes to ensure they work as expected, including adding additional tests for new functionality.
  • I have written clear and descriptive commit messages.
  • I have provided a detailed description of the changes in the pull request, including the problem it addresses, how it fixes the problem, and any relevant details about the implementation.
  • I have ensured that the code is well-documented and easy to understand.

@github-actions github-actions bot added the L: github:actions GitHub Actions label Dec 2, 2024
# Initialize the PackageManager with the extracted version
PackageManager.new(default_version)
end, T.nilable(Dependabot::GithubActions::PackageManager))
end
Copy link
Contributor Author

@kbukum1 kbukum1 Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Tip: The GitHub Actions version is extracted from uses declarations in workflow files (e.g., actions/checkout@v2.3.4 captures v2.3.4). If no version is found, it defaults to 0.0.0, ensuring the PackageManager is initialized with a valid version for proper dependency management.

Copy link
Member

@jakecoffman jakecoffman Dec 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't make sense to me. GitHub Actions doesn't have a package manager. This is extracting dependency information and setting it as the PackageManager version?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh good catch Jake.

I was assuming this PR was creating a generic package manager and then extracting the dependency and setting the name/version on the dependency.

Creating a pseudo package manager for GitHub Actions so that we could code generic functionality across all package managers in core makes sense to me. But I agree with Jake that the version you're extracting shouldn't be tied to the package manager, but rather the dependency, which is updated by the package manager.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point. In this case it may be better to make package manager version not required since we don't have package manager and add another field to share this information that will show in metrics.

@kbukum1 kbukum1 self-assigned this Dec 3, 2024
@kbukum1 kbukum1 marked this pull request as ready for review December 3, 2024 22:07
@kbukum1 kbukum1 requested a review from a team as a code owner December 3, 2024 22:07
@kbukum1 kbukum1 marked this pull request as draft December 3, 2024 22:08
@@ -12,16 +12,14 @@ class FileFetcher < Dependabot::FileFetchers::Base
extend T::Sig
extend T::Helpers

FILENAME_PATTERN = /\.ya?ml$/
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just moved into constants. Here we are looking for all yml, yaml files.

@@ -49,9 +47,10 @@ def fetch_files
if incorrectly_encoded_workflow_files.none?
expected_paths =
if directory == "/"
File.join(directory, "action.yml") + " or /.github/workflows/<anything>.yml"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we were looking only for yml files

else
File.join(directory, "<anything>.yml")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we are looking only for yml files

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is actually looking for files, more like providing examples...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's correct.

else
workflows_dir = "."
end

@workflow_files +=
repo_contents(dir: workflows_dir, raise_errors: false)
.select { |f| f.type == "file" && f.name.match?(FILENAME_PATTERN) }
.select { |f| f.type == "file" && f.name.match?(MANIFEST_FILE_PATTERN) }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we were looking for yml and yaml files

(?<path>/[^\@]+)?
@(?<ref>.+)
}x

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just moved into PackageManager as a constant

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, this one feels even more generic, like it could be a top-level constant in utils or something. Package manager is fine too, just highlighting this is super generic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally try to add constants into package managers that will provide general information. But I can create another module that used for constants in ecosystem such as constants.rb

@@ -49,9 +49,10 @@ def fetch_files
if incorrectly_encoded_workflow_files.none?
expected_paths =
if directory == "/"
File.join(directory, "action.yml") + " or /.github/workflows/<anything>.yml"
File.join(directory,
MANIFEST_FILE_YML) + " or #{MANIFEST_FILE_YAML} or /#{CONFIG_YMLS} or /#{CONFIG_YAMLS}"
Copy link
Contributor Author

@kbukum1 kbukum1 Dec 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced to look for yml or yaml files.

else
File.join(directory, "<anything>.yml")
File.join(directory, ANYTHING_YML) + " or #{ANYTHING_YAML}"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replaced to look for yml or yaml files

@jeffwidman
Copy link
Member

jeffwidman commented Dec 4, 2024

I took a quick look, and the places you pointed don't appear (at first glance) to be actually looking for files, but rather providing examples, or places where they need a placeholder file string... in that case, it's fine to just do action.yml because you have to pick something. (although personally I always prefer .yaml over .yml for readability 😄 )

I haven't looked at the code in depth, but there may be times where we need a placeholder "directory/file" and then later we replace that with the actual file that we found. 🤷‍♂️

Per https://docs.github.com/en/actions/writing-workflows/workflow-syntax-for-github-actions#about-yaml-syntax-for-workflows, the file name can end with either .yaml or .yml, so you're completely right that we need to have full support for both... but only where we are actually finding/filtering...

The other reason I suspect this isn't actually a problem is that if it were, we'd be getting issues about it filed here in core, and I don't recall seeing that happening (unless it's a recent regression--I've been too busy the past six months to keep up with Core 😢 ).

@kbukum1 kbukum1 force-pushed the kamil/add_package_manager_versions_for_github_actions branch from 670e720 to fd0013d Compare December 4, 2024 00:53
@kbukum1
Copy link
Contributor Author

kbukum1 commented Dec 4, 2024

I took a quick look, and the places you pointed don't appear (at first glance) to be actually looking for files, but rather providing examples, or places where they need a placeholder file string... in that case, it's fine to just do action.yml because you have to pick something. (although personally I always prefer .yaml over .yml for readability 😄 )

I haven't looked at the code in depth, but there may be times where we need a placeholder "director/file" and then later we replace that with the actual file that we found. 🤷‍♂️

Per https://docs.github.com/en/actions/writing-workflows/workflow-syntax-for-github-actions#about-yaml-syntax-for-workflows, the file name can end with either .yaml or .yml, so you're completely right that we need to have full support for both... but only where we are actually finding/filtering...

The other reason I suspect this isn't actually a problem is that if it were, we'd be getting issues about it filed here in core, and I don't recall seeing that happening (unless it's a recent regression--I've been too busy the past six months to keep up with Core 😢 ).

Thanks for the explanation. I will check what is required and what is for place holder and come up with a solution.

@kbukum1 kbukum1 force-pushed the kamil/add_package_manager_versions_for_github_actions branch from a8b11d9 to d492e0f Compare December 5, 2024 18:39
@kbukum1 kbukum1 marked this pull request as ready for review December 9, 2024 22:15
@kbukum1 kbukum1 requested a review from jeffwidman December 9, 2024 22:15
@kbukum1 kbukum1 force-pushed the kamil/add_package_manager_versions_for_github_actions branch from 16ba652 to 2734df6 Compare December 9, 2024 22:20
Copy link
Member

@jeffwidman jeffwidman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think there's two things going on this PR:

  1. Adding a pseudo-package manager class to the GitHub Actions. In reality that ecosystem doesn't have a package manager, but across Dependabot Core there's "generic functionality we apply to package managers" so from that perspective it would be helpful if GitHub Actions had a package manager abstraction that we could operate on. 👍
  2. A stylistic change to abstract a bunch of strings to constants. While related, much of these code changes don't have to happen at the same time as the package manager abstraction. This PR would be simpler to review if the style/moving-strings-to-constants changes were in a separate PR. And this PR stayed focused on the logic changes of abstracting out the pseudo package manager itself.

Beyond that, as Jake points out, while the pseudo-package manager probably needs its own version, it's a dummy version to satisfy the package manager interface across ecosystems. The action/dependency version should be completely separate from the language/package manager versions.

Thoughts?

Comment on lines +23 to +24
"Repo must contain a #{WORKFLOW_DIRECTORY} directory with YAML files or " \
"an #{MANIFEST_FILE_YML} file"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 So on the one hand abstractions are nice and forward thinking, but on the other hand I an not really sure that workflows will ever be allowed to run from a location beyond .github/workflows... If that's true, then not having the abstraction is actually more clear when reading the code. Preparing for a possible future where they reside in another folder seems like premature optimization unless there's some other reason to have a constant here...

(?<path>/[^\@]+)?
@(?<ref>.+)
}x

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, this one feels even more generic, like it could be a top-level constant in utils or something. Package manager is fine too, just highlighting this is super generic.

# Initialize the PackageManager with the extracted version
PackageManager.new(default_version)
end, T.nilable(Dependabot::GithubActions::PackageManager))
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh good catch Jake.

I was assuming this PR was creating a generic package manager and then extracting the dependency and setting the name/version on the dependency.

Creating a pseudo package manager for GitHub Actions so that we could code generic functionality across all package managers in core makes sense to me. But I agree with Jake that the version you're extracting shouldn't be tied to the package manager, but rather the dependency, which is updated by the package manager.

sig { params(file: Dependabot::DependencyFile).returns(Dependabot::FileParsers::Base::DependencySet) }
def workfile_file_dependencies(file)
dependency_set = DependencySet.new

json = YAML.safe_load(T.must(file.content), aliases: true, permitted_classes: [Date, Time, Symbol])
return dependency_set if json.nil?

uses_strings = deep_fetch_uses(json.fetch("jobs", json.fetch("runs", nil))).uniq
uses_strings = deep_fetch_uses(json.fetch(JOBS_KEY, json.fetch(RUNS_KEY, nil))).uniq
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for this?

This style feels a little like Java where everything is formalized and abstracted... we don't need to do that until/unless we actually derive a specific benefit from adding the abstraction. Because there is a cost in code clarity.

But maybe I'm missing something and there is a reason for doing this now?


module Dependabot
module GithubActions
DOTCOM = "github.com"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you expect this URL to change sometime? If so, the naming shouldn't be DOTCOM because the domain may not use the .com TLD... perhaps something more like DOMAIN_ROOT or ACTIONS_HOST? 🤷‍♂️

But if you don't expect it to change, then I don't see the value in this abstraction? DOTCOM is a pretty generic name and the code would be more readable with github.com inlined... or if there's really a need for a constant but don't expect it to change, then what about using GITHUB_COM?

@@ -328,6 +328,11 @@ def find_container_branch(sha)
branches_including_ref.first
end
end

sig { params(sha: String).returns(String) }
def git_branches_containing(sha)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this needs to be extracted as a function? The type checking adds little to no value here IMO... Again, the code is more readable if the command is inlined...

@jeffwidman
Copy link
Member

jeffwidman commented Dec 13, 2024

I want to reiterate that I do see value in adding the "pseudo package manager" for ecosystems that don't already have one. I'm actually kinda surprised this isn't already present in Dependabot Core for GitHub Actions... maybe it is? I didn't check the source... but if it's not, then adding one would let us leverage various features over time.

That said, the primary things we use the package manager abstraction for are around versions of package managers... metrics on what versions are being used, then deprecating particular versions...

🤔 If the only benefits of a generic package manager abstraction are around managing versions, then since this pseudo package manager doesn't exist in reality, then maybe adding this abstraction actually has no value??

Are there other benefits to a pseudo package manager that don't involve versions?

@kbukum1
Copy link
Contributor Author

kbukum1 commented Dec 13, 2024

2. strings-to-constants changes were

Thanks for the points. I will try to come up with a better approach to handle dependencies but if in metrics we want to understand github actions dependencies are only thing we can use since there is no package manager version.

I will create another PR for clean up and keep that PR only focused on package manager.

@jeffwidman
Copy link
Member

jeffwidman commented Dec 13, 2024

but if in metrics we want to understand github actions, dependencies are only thing we can use since there is no package manager version.

Yes, but those metrics are about package manager versions... we use them for tracking popularity of various versions of a package manager. We don't report versions on the dependencies themselves.

Since GitHub Actions has no package manager, then there's no point in reporting a package manager version for it. If there is a dashboard somewhere showing package manager versions for all ecosystems and it's breaking because there's no value for GitHub Actions, then just hardcode a placeholder version like 1.0.0... do not hoist a dependency's version into the package manager version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
L: github:actions GitHub Actions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants