Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Response Ops][Task Manager] Changing task manager schedule.interval schema to string from duration #204413

Merged
merged 5 commits into from
Dec 17, 2024

Conversation

ymao1
Copy link
Contributor

@ymao1 ymao1 commented Dec 16, 2024

Summary

Currently, when task manager schema changes are migrated down (which can occur during a release when there are multiple Kibana nodes running and they get updated one after another), we are running into this bug: #204395 where the task schedule.interval gets mutated from expected values (like 60s) to unallowed (but still valid for the schema) values (like PT1M).

This changes the schema for schedule.interval to use a string with validation function.

To verify

  1. Run Kibana on main and create some rules that run frequently.
  2. "Upgrade" to this PR branch and verify that rules continue to run.
  3. "Downgrade" back to main and verify that rules continue to run.

@ymao1 ymao1 changed the title Changing task manager schema to string from duration [Response Ops][Task Manager] Changing task manager schedule.interval schema to string from duration Dec 16, 2024
@ymao1 ymao1 self-assigned this Dec 16, 2024
@ymao1 ymao1 added release_note:skip Skip the PR/issue when compiling release notes Feature:Task Manager Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v9.0.0 backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) v8.18.0 labels Dec 16, 2024
@ymao1 ymao1 marked this pull request as ready for review December 16, 2024 18:47
@ymao1 ymao1 requested a review from a team as a code owner December 16, 2024 18:47
@elasticmachine
Copy link
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

@@ -7,6 +7,26 @@

import { schema } from '@kbn/config-schema';

const SECONDS_REGEX = /^[1-9][0-9]*s$/;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use one of the functions from here, instead? x-pack/plugins/task_manager/server/lib/intervals.ts

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 7413a36

@ymao1 ymao1 requested a review from rudolf December 16, 2024 23:40
@elasticmachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

cc @ymao1

Copy link
Member

@pmuellr pmuellr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

cc @ymao1

@mikecote mikecote merged commit ccdd662 into elastic:main Dec 17, 2024
8 checks passed
@kibanamachine
Copy link
Contributor

Starting backport for target branches: 8.x

https://github.com/elastic/kibana/actions/runs/12382698240

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Dec 17, 2024
…` schema to string from duration (elastic#204413)

## Summary

Currently, when task manager schema changes are migrated down (which can
occur during a release when there are multiple Kibana nodes running and
they get updated one after another), we are running into this bug:
elastic#204395 where the task
`schedule.interval` gets mutated from expected values (like `60s`) to
unallowed (but still valid for the schema) values (like `PT1M`).

This changes the schema for `schedule.interval` to use a string with
validation function.

## To verify
1. Run Kibana on `main` and create some rules that run frequently.
2. "Upgrade" to this PR branch and verify that rules continue to run.
3. "Downgrade" back to `main` and verify that rules continue to run.

(cherry picked from commit ccdd662)
@kibanamachine
Copy link
Contributor

💚 All backports created successfully

Status Branch Result
8.x

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Dec 18, 2024
…le.interval` schema to string from duration (#204413) (#204669)

# Backport

This will backport the following commits from `main` to `8.x`:
- [[Response Ops][Task Manager] Changing task manager
`schedule.interval` schema to string from duration
(#204413)](#204413)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Ying
Mao","email":"ying.mao@elastic.co"},"sourceCommit":{"committedDate":"2024-12-17T22:41:20Z","message":"[Response
Ops][Task Manager] Changing task manager `schedule.interval` schema to
string from duration (#204413)\n\n## Summary\r\n\r\nCurrently, when task
manager schema changes are migrated down (which can\r\noccur during a
release when there are multiple Kibana nodes running and\r\nthey get
updated one after another), we are running into this
bug:\r\nhttps://github.com//issues/204395 where the
task\r\n`schedule.interval` gets mutated from expected values (like
`60s`) to\r\nunallowed (but still valid for the schema) values (like
`PT1M`).\r\n\r\nThis changes the schema for `schedule.interval` to use a
string with\r\nvalidation function.\r\n\r\n## To verify\r\n1. Run Kibana
on `main` and create some rules that run frequently.\r\n2. \"Upgrade\"
to this PR branch and verify that rules continue to run.\r\n3.
\"Downgrade\" back to `main` and verify that rules continue to
run.","sha":"ccdd662a053f9d02fe1ea04afe2bdd80827459c0","branchLabelMapping":{"^v9.0.0$":"main","^v8.18.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","Feature:Task
Manager","Team:ResponseOps","v9.0.0","backport:prev-minor","v8.18.0"],"title":"[Response
Ops][Task Manager] Changing task manager `schedule.interval` schema to
string from
duration","number":204413,"url":"https://github.com/elastic/kibana/pull/204413","mergeCommit":{"message":"[Response
Ops][Task Manager] Changing task manager `schedule.interval` schema to
string from duration (#204413)\n\n## Summary\r\n\r\nCurrently, when task
manager schema changes are migrated down (which can\r\noccur during a
release when there are multiple Kibana nodes running and\r\nthey get
updated one after another), we are running into this
bug:\r\nhttps://github.com//issues/204395 where the
task\r\n`schedule.interval` gets mutated from expected values (like
`60s`) to\r\nunallowed (but still valid for the schema) values (like
`PT1M`).\r\n\r\nThis changes the schema for `schedule.interval` to use a
string with\r\nvalidation function.\r\n\r\n## To verify\r\n1. Run Kibana
on `main` and create some rules that run frequently.\r\n2. \"Upgrade\"
to this PR branch and verify that rules continue to run.\r\n3.
\"Downgrade\" back to `main` and verify that rules continue to
run.","sha":"ccdd662a053f9d02fe1ea04afe2bdd80827459c0"}},"sourceBranch":"main","suggestedTargetBranches":["8.x"],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/204413","number":204413,"mergeCommit":{"message":"[Response
Ops][Task Manager] Changing task manager `schedule.interval` schema to
string from duration (#204413)\n\n## Summary\r\n\r\nCurrently, when task
manager schema changes are migrated down (which can\r\noccur during a
release when there are multiple Kibana nodes running and\r\nthey get
updated one after another), we are running into this
bug:\r\nhttps://github.com//issues/204395 where the
task\r\n`schedule.interval` gets mutated from expected values (like
`60s`) to\r\nunallowed (but still valid for the schema) values (like
`PT1M`).\r\n\r\nThis changes the schema for `schedule.interval` to use a
string with\r\nvalidation function.\r\n\r\n## To verify\r\n1. Run Kibana
on `main` and create some rules that run frequently.\r\n2. \"Upgrade\"
to this PR branch and verify that rules continue to run.\r\n3.
\"Downgrade\" back to `main` and verify that rules continue to
run.","sha":"ccdd662a053f9d02fe1ea04afe2bdd80827459c0"}},{"branch":"8.x","label":"v8.18.0","branchLabelMappingKey":"^v8.18.0$","isSourceBranch":false,"state":"NOT_CREATED"}]}]
BACKPORT-->

Co-authored-by: Ying Mao <ying.mao@elastic.co>
JoseLuisGJ pushed a commit to JoseLuisGJ/kibana that referenced this pull request Dec 19, 2024
…` schema to string from duration (elastic#204413)

## Summary

Currently, when task manager schema changes are migrated down (which can
occur during a release when there are multiple Kibana nodes running and
they get updated one after another), we are running into this bug:
elastic#204395 where the task
`schedule.interval` gets mutated from expected values (like `60s`) to
unallowed (but still valid for the schema) values (like `PT1M`).

This changes the schema for `schedule.interval` to use a string with
validation function.

## To verify
1. Run Kibana on `main` and create some rules that run frequently.
2. "Upgrade" to this PR branch and verify that rules continue to run.
3. "Downgrade" back to `main` and verify that rules continue to run.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) Feature:Task Manager release_note:skip Skip the PR/issue when compiling release notes Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v8.18.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants