Block extensions disallowed by policy #3259
b440696 to a37508f
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #3259      +/-   ##
===========================================
+ Coverage    71.97%   72.77%   +0.79%
===========================================
  Files          103      114      +11
  Lines        15692    17081    +1389
  Branches      2486     2277     -209
===========================================
+ Hits         11295    12431    +1136
- Misses        3881     4107     +226
- Partials       516      543      +27
Did an initial review, not including tests.
    """
    # TODO: when CRP adds terminal error code for policy-related extension failures, set that as the default code.
    def __init__(self, msg, inner=None, code=-1):
        msg = "Extension is disallowed by agent policy and will not be processed: {0}".format(msg)
In the case where the agent fails to parse the policy, I'm not sure we should say 'Extension is disallowed by policy'. In that case, the extension is disallowed because there was an issue reading or parsing the policy.
I'm also hesitant about 'agent policy', since the policy is provided by the customer.
I could change this to "Extension will not be processed: "
Parsing errors (InvalidPolicyError) would look like "Extension will not be processed: customer-provided policy file (path) is invalid, please correct the following error..."
Extension disallowed errors (ExtensionPolicyError) would look like "Extension will not be processed: failed to enable extension CustomScript because extension is not specified in policy allowlist. To enable, add extension to the allowed list in policy file (path)."
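For reference, a minimal sketch of the two message shapes proposed above. The base class, the helper function names, and the example policy path are hypothetical stand-ins for illustration, not the agent's actual code; only the message wording and the `ExtensionPolicyError` constructor signature come from this thread.

```python
class ExtensionError(Exception):
    """Hypothetical stand-in for the agent's base extension error."""

    def __init__(self, msg, inner=None, code=-1):
        super(ExtensionError, self).__init__(msg)
        self.inner = inner
        self.code = code


class ExtensionPolicyError(ExtensionError):
    """Uses a prefix that is neutral about why the extension was blocked."""

    def __init__(self, msg, inner=None, code=-1):
        msg = "Extension will not be processed: {0}".format(msg)
        super(ExtensionPolicyError, self).__init__(msg, inner, code)


def invalid_policy_message(policy_path, detail):
    # Hypothetical helper: wording for a policy file that failed to parse.
    return ("customer-provided policy file ({0}) is invalid, "
            "please correct the following error: {1}").format(policy_path, detail)


def disallowed_message(operation, extension_name, policy_path):
    # Hypothetical helper: wording for an extension missing from the allowlist.
    return ("failed to {0} extension {1} because extension is not specified in "
            "policy allowlist. To {0}, add extension to the allowed list in "
            "policy file ({2}).").format(operation, extension_name, policy_path)


# The path below is an illustrative placeholder, not the agent's real default.
err = ExtensionPolicyError(
    disallowed_message("enable", "CustomScript", "/path/to/policy.json"))
```

With this shape, the parsing case and the allowlist case share one prefix, and neither message claims that the extension was "disallowed by agent policy".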
I've updated the error message as discussed above.
azurelinuxagent/ga/exthandlers.py
Outdated
policy_op, policy_err_code = policy_err_map.get(ext_handler.state)
if policy_error is not None:
    err = ExtensionPolicyError(msg="", inner=policy_error, code=policy_err_code)
    self.__handle_and_report_ext_handler_errors(handler_i, err,
Does this create .status files for single config extensions?
I added a new function __handle_and_report_policy_error() - this should create a status file for any extension with settings.
azurelinuxagent/ga/exthandlers.py
Outdated
ext_handler.name,
conf.get_policy_file_path())
err = ExtensionPolicyError(msg, code=policy_err_code)
self.__handle_and_report_ext_handler_errors(handler_i, err,
Same question here about the .status file for single-config extensions.
I added a new function __handle_and_report_policy_error() - this should create a status file for any extension with settings.
Changes look good. I'm going to spend tomorrow going through each e2e scenario and unit test, sorry for the slow review :/
azurelinuxagent/ga/exthandlers.py
Outdated
ExtensionRequestedState.Enabled: ('enable', ExtensionErrorCodes.PluginEnableProcessingFailed),
# TODO: CRP does not currently have a terminal error code for uninstall. Once CRP adds
# an error code for uninstall or for policy, use this code instead of PluginDisableProcessingFailed
# Note that currently, CRP waits for 90 minutes to time out for a failed uninstall operation, instead of
Could we add some more detail to this comment?
Something like:
Note that currently, CRP will poll until the agent does not report a status for an extension that should be uninstalled. In the case of a policy error, the agent will report a failed status on behalf of the extension, which will cause CRP to poll for the full timeout period, instead of failing fast.
Added
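To make the mapping under discussion concrete, here is a rough sketch of a requested-state to (operation, error code) lookup. The enum names and numeric values are placeholders (the real `ExtensionRequestedState` and `ExtensionErrorCodes` values are not shown in this thread), and the disable entry is an assumption. The sketch also guards the lookup, since unpacking the result of a plain `dict.get()` raises `TypeError` when the key is missing.

```python
from enum import Enum


class RequestedState(Enum):
    # Hypothetical stand-in for ExtensionRequestedState.
    ENABLED = "enabled"
    DISABLED = "disabled"
    UNINSTALL = "uninstall"


class ErrorCode(Enum):
    # Placeholder values; not the real ExtensionErrorCodes numbers.
    ENABLE_FAILED = 1
    DISABLE_FAILED = 2


POLICY_ERR_MAP = {
    RequestedState.ENABLED: ("enable", ErrorCode.ENABLE_FAILED),
    # Assumed entry: the disable mapping is not visible in the diff hunk.
    RequestedState.DISABLED: ("disable", ErrorCode.DISABLE_FAILED),
    # Per the TODO in the diff, no terminal uninstall code exists yet,
    # so the disable failure code is reused for uninstall.
    RequestedState.UNINSTALL: ("uninstall", ErrorCode.DISABLE_FAILED),
}


def lookup_policy_error(state):
    """Return (operation, error_code), failing loudly for unmapped states."""
    entry = POLICY_ERR_MAP.get(state)
    if entry is None:
        raise ValueError("no policy error mapping for state {0!r}".format(state))
    return entry


policy_op, policy_err_code = lookup_policy_error(RequestedState.UNINSTALL)
```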
azurelinuxagent/ga/exthandlers.py
Outdated
@@ -692,6 +734,26 @@ def __handle_and_report_ext_handler_errors(ext_handler_i, error, report_op, mess
            add_event(name=name, version=handler_version, op=report_op, is_success=False, log_event=True,
                      message=message)

    @staticmethod
    def __handle_and_report_policy_error(ext_handler_i, error, report_op, message, report=True, extension=None):
        # TODO: Consider merging this function with __handle_and_report_ext_handler_errors() above.
Could you please leave a comment explaining why we broke this into a separate function? For policy-related failures, we want to fail extensions fast. CRP will continue to poll for single-config extension status until timeout, so the agent should write a status for single-config extensions. The other function does not create that status, and we didn't want to touch it without further investigating the impact of that change.
Added

# Create status file for extensions with settings (single and multi config).
if extension is not None:
    ext_handler_i.create_status_file_if_not_exist(extension, status=ExtensionStatusValue.error, code=error.code,
create_status_file_if_not_exist() will not overwrite existing status file (for the current sequence number). Is this behavior acceptable?
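To illustrate the question, a simplified sketch of create-if-absent semantics. This is not the agent's implementation: the function name, its signature, the `<seq_no>.status` file naming, and the JSON layout are all assumptions made for the example. The point is only that a second write for the same sequence number is skipped, so a policy error would not replace a status the extension had already written.

```python
import json
import os
import tempfile


def create_status_file_if_absent(status_dir, seq_no, status, code, message):
    """Write <seq_no>.status only when no file exists for that sequence number.

    Returns True if a file was written, False if an existing one was kept.
    """
    path = os.path.join(status_dir, "{0}.status".format(seq_no))
    if os.path.exists(path):
        return False
    with open(path, "w") as handle:
        json.dump({"status": status, "code": code, "message": message}, handle)
    return True


def demo():
    status_dir = tempfile.mkdtemp()
    # The extension already reported a status for sequence number 0.
    first = create_status_file_if_absent(status_dir, 0, "success", 0, "enabled")
    # A later policy error for the same sequence number is not recorded.
    second = create_status_file_if_absent(status_dir, 0, "error", -1, "blocked by policy")
    with open(os.path.join(status_dir, "0.status")) as handle:
        return first, second, json.load(handle)["status"]
```

Under these assumptions, `demo()` leaves the original "success" status on disk, which is the case the question above is asking about.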
azurelinuxagent/ga/exthandlers.py
Outdated
ExtensionRequestedState.Enabled: ('enable', ExtensionErrorCodes.PluginEnableProcessingFailed),
# Note: currently, when uninstall is requested for an extension, CRP polls until the agent does not
# report status for that extension, or until timeout is reached. In the case of a policy error, the
# agent reports failed status on behalf of the extension, which will cause CRP to for the full timeout,
Suggested change:
- # agent reports failed status on behalf of the extension, which will cause CRP to for the full timeout,
+ # agent reports failed status on behalf of the extension, which will cause CRP to poll for the full timeout,
nit
Fixed, thanks!
@@ -3507,5 +3510,144 @@ def test_report_msg_if_handler_manifest_contains_invalid_values(self):
        self.assertIn("'supportsMultipleExtensions' has a non-boolean value", kw_messages[2]['message'])


class TestExtensionPolicy(TestExtensionBase):
Could we add a test case where the extension is allowed by policy?
Added
tests_e2e/test_suites/ext_policy.yml
Outdated
name: "ExtensionPolicy"
tests:
  - "ext_policy/ext_policy.py"
images: "random(endorsed)"
I think we should run this on more distros so we can get better coverage before releasing the changes
Will running on all endorsed distros add too much overhead to the daily runs?
# Prepare extensions to test
unique = str(uuid.uuid4())
test_file = f"waagent-test.{unique}"
custom_script = ExtPolicy.TestCase(
Could you please add a comment explaining why you chose each of these extensions (single-config, multi-config, etc.), so that it is obvious to anyone who reviews this test case in the future?
Added
    fail(f"The agent should have reported an error trying to {operation} {extension_case.extension.__str__()} "
         f"because the extension is disallowed by policy.")
except Exception as error:
    assert_that("Extension will not be processed" in str(error)) \
Could we also check for [ExtensionPolicyError] in the message, to confirm the failure was due to policy?
updated
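A small sketch of the combined check requested above, written with plain assertions rather than the `assert_that` helper used in the test. The function name is hypothetical; the two expected fragments are the ones mentioned in this thread.

```python
def assert_policy_failure(error):
    """Fail unless the error text shows the operation was blocked by policy."""
    message = str(error)
    expected = ("[ExtensionPolicyError]", "Extension will not be processed")
    missing = [fragment for fragment in expected if fragment not in message]
    if missing:
        raise AssertionError(
            "Error is missing expected fragment(s) {0}: {1}".format(missing, message))


# Passes: both the error type tag and the neutral prefix are present.
assert_policy_failure(Exception(
    "[ExtensionPolicyError] Extension will not be processed: not in allowlist"))
```

Checking for both fragments separates a policy failure from an unrelated error that merely happens to contain the generic prefix.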
@@ -630,6 +630,70 @@ def test_it_should_handle_and_report_enable_errors_properly(self):
        }
        self._assert_extension_status(sc_handler, expected_extensions)

    def test_it_should_handle_and_report_disallowed_extensions_properly(self):
Could you also please add a case for a multi-config extension allowed by policy?
Added
name: "ExtPolicyWithDependencies"
tests:
  - "ext_policy/ext_policy_with_dependencies.py"
images: "random(endorsed)"
Same comment here: we should get more coverage than 1 run per day. Maybe consider running on all endorsed images, or on 5-10 endorsed images per day.
Updated to all endorsed, but I can change to 5-10 if this adds too much overhead.
    }
}
self._create_policy_file(policy)
self._operation_should_succeed("enable", custom_script)
Could we also add a test case where a single-config extension should fail due to policy?
Is the test case on line 189 sufficient, or did you mean something else?
(I will also add comments to make it clearer that this case is being tested)
e909568 to 86de0c5
Description
Issue #
PR #2 for the policy engine allowlist feature: