Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: added retry on files sync error #9261

Conversation

idsulik
Copy link
Contributor

@idsulik idsulik commented Jan 18, 2024

Description

Hi! I've been using skaffold for the last 2 years and from time to time I see error - "Skipping deploy due to sync error" and you can't do nothing with it in it was executed in auto sync mode.
Within this PR I added retry mechanism if sync returns error

Copy link

google-cla bot commented Jan 18, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@pull-request-size pull-request-size bot added size/S and removed size/M labels Jan 18, 2024
@idsulik
Copy link
Contributor Author

idsulik commented Apr 13, 2024

@ericzzzzzzz Hi!
Should I do anything else or I just need to wait?

@ericzzzzzzz
Copy link
Contributor

Hi @idsulik I think this needs rebase, also do we know what errors are retryable and what not? I'm thinking we should distinguish those if we implement a retry mechanism for sync error

@idsulik
Copy link
Contributor Author

idsulik commented Apr 13, 2024

@ericzzzzzzz I think we should retry sync regardless of the error because in the case of auto-sync mode, you can't force it to sync again. there are several cases:

  • pre/post hooks error.
  • copying/deleting docker files.

So if something goes wrong it can retry it because otherwise I have to use a workaround - checkout a different branch, checkout back to force it to sync changes

@idsulik
Copy link
Contributor Author

idsulik commented Apr 13, 2024

do you know any error related to the sync when we shouldn't retry ?

@idsulik idsulik force-pushed the feature/added-retry-on-sync-error branch from d2c616b to 96bdc3e Compare April 13, 2024 19:14
@ericzzzzzzz
Copy link
Contributor

@idsulik You can find a few errors which are not recoverable without manual work using the query in "sync error" in:title.

@idsulik
Copy link
Contributor Author

idsulik commented Apr 14, 2024

@ericzzzzzzz what do you mean?
Search for "sync error" in the project gives me only 3 lines:
image

I've already specified these errors above. Could you please specify 1 sync error that is not recoverable without manual action?

@ericzzzzzzz
Copy link
Contributor

ericzzzzzzz commented Apr 15, 2024

I mean you can use the query on this github repo, to find the related issues. Like permission error.

@@ -85,9 +86,13 @@ func (r *SkaffoldRunner) doDev(ctx context.Context, out io.Writer) error {
r.changeSet.ResetSync()
r.intents.ResetSync()
}()

// todo: make this configurable
Copy link
Contributor

@ericzzzzzzz ericzzzzzzz Apr 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you leave this here intentionally while working on this feature? Were you planning to incorporate any changes before committing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, not in this PR, I left it in case if anyone wants to do it, maybe I'll do it by myself, but not now

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to implement configurable re-try for sync and default it to 0 -- not re-try to keep the user experience the same as before, if we want to have a sync retry, but I am still not convinced that we need this.

While // todo comments might seem like helpful reminders, they often create more problems than they solve:

Limited Visibility: Only those who stumble upon the specific code section see the comment, potentially leaving others unaware of the task.
Unclear Ownership and Status: There's no indication of who's responsible for the task, whether it's assigned, in progress, or forgotten.
Lack of Discussion: Meaningful conversations are difficult without a dedicated issue or pull request, hindering collaboration.
Prioritization Challenges: Maintainers struggle to assess the urgency and importance of tasks scattered throughout the codebase.
Ambiguity in Pull Requests: It's unclear if // todo marks unfinished work or instructions for future changes, causing confusion during reviews.
These issues often lead to // todo items being overlooked or forgotten.

A more effective approach is to create an issue for each task:

  • Issues provide a centralized location for tracking, assigning, and discussing tasks.
  • They allow for clear prioritization and status updates.
  • Discussions within issues create better communication and collaboration.

@ericzzzzzzz
Copy link
Contributor

@ericzzzzzzz I think we should retry sync regardless of the error because in the case of auto-sync mode, you can't force it to sync again. there are several cases:

  • pre/post hooks error.
  • copying/deleting docker files.

So if something goes wrong it can retry it because otherwise I have to use a workaround - checkout a different branch, checkout back to force it to sync changes

  • pre/post hooks error.
    Do you mean hooks error will cause sync fail and we should re-try sync in this case?
    Cloud you give me a specific example why re-try sync would help ?
  • copying/deleting docker files.
    Wouldn't this trigger a rebuild?

@idsulik
Copy link
Contributor Author

idsulik commented Jul 11, 2024

  • Cloud you give me a specific example why re-try sync would help ?

while skaffold is trying to sync files something happened with the connection, skaffold will fail and when the connection become stable, you have to somehow trigger skaffold to start syncing again. if we have retry, on second/third retry it can sync the files successfully .

I have such cases a lot(VPN issue, quick checkout between branches lead to sync error etc.)

@ericzzzzzzz
Copy link
Contributor

  • Cloud you give me a specific example why re-try sync would help ?

while skaffold is trying to sync files something happened with the connection, skaffold will fail and when the connection become stable, you have to somehow trigger skaffold to start syncing again. if we have retry, on second/third retry it can sync the files successfully .

I have such cases a lot(VPN issue, quick checkout between branches lead to sync error etc.)

@idsulik thank you for the explanation! Could you please fix the ci issue. I'll approve the change.

@idsulik idsulik force-pushed the feature/added-retry-on-sync-error branch from 96bdc3e to c09fa6c Compare July 12, 2024 15:12
@idsulik
Copy link
Contributor Author

idsulik commented Jul 12, 2024

@ericzzzzzzz pushed the fix. thank you!

@ericzzzzzzz ericzzzzzzz added the kokoro:force-run forces a kokoro re-run on a PR label Jul 12, 2024
@kokoro-team kokoro-team removed the kokoro:force-run forces a kokoro re-run on a PR label Jul 12, 2024
@ericzzzzzzz ericzzzzzzz merged commit 2cbeb78 into GoogleContainerTools:main Jul 12, 2024
13 checks passed
idsulik added a commit to idsulik/skaffold that referenced this pull request Jul 19, 2024
* feat: added retry on files sync error

* reverted code style changes

* fix(dev.go): change package order

Signed-off-by: Suleiman Dibirov <idsulik@gmail.com>

---------

Signed-off-by: Suleiman Dibirov <idsulik@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants