feat: add DRY_RUN config option #97
Conversation
Force-pushed from ba27050 to b25a214
pkg/timelock/scheduler.go (Outdated)
@@ -112,13 +112,17 @@ func (tw *Worker) updateSchedulerDelay(t time.Duration) {

// addToScheduler adds a new CallSchedule operation safely to the store.
func (tw *Worker) addToScheduler(op *contract.TimelockCallScheduled) {
	tw.mu.Lock()
These changes to the scheduler.go module are due to a race condition uncovered by the new integration test.
There are other places in this module where other race conditions are present, but I opted to leave them as is to avoid adding too much scope to this PR.
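For context, here is a minimal, self-contained sketch of the pattern the diff above relies on: guarding a shared operation store with a mutex so concurrent goroutines cannot race on it. The simplified `op` type and method names are illustrative only, not the actual scheduler.go code.

```go
package main

import (
	"fmt"
	"sync"
)

type op struct{ id string }

type worker struct {
	mu    sync.Mutex
	store map[string]op
}

// addToStore takes the lock so concurrent goroutines (e.g. an event
// listener and a scheduler loop) cannot race on the shared map.
func (w *worker) addToStore(o op) {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.store[o.id] = o
}

// delFromStore removes an operation under the same lock.
func (w *worker) delFromStore(id string) {
	w.mu.Lock()
	defer w.mu.Unlock()
	delete(w.store, id)
}

func main() {
	w := &worker{store: map[string]op{}}
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			id := fmt.Sprintf("op-%d", i)
			w.addToStore(op{id: id})
			w.delFromStore(id)
		}(i)
	}
	wg.Wait()
	fmt.Println("operations left in store:", len(w.store))
}
```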
Force-pushed from b25a214 to 26394ab
@@ -55,7 +55,7 @@ func (tl testLogger) LastMessage() string {
	tl.mutex.Lock()
	defer tl.mutex.Unlock()

-	return (*tl.messages)[tl.NumMessages()-1]
+	return (*tl.messages)[len(*tl.messages)-1]
this is to avoid a deadlock uncovered by the new integration test
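A minimal sketch of that deadlock pattern, assuming `NumMessages()` also acquires `tl.mutex` (that part is not shown in the diff above); Go's `sync.Mutex` is not reentrant, so re-locking it from the same goroutine blocks forever.

```go
package main

import (
	"fmt"
	"sync"
)

type testLogger struct {
	mutex    *sync.Mutex
	messages *[]string
}

// NumMessages locks the mutex before reading (assumed behaviour).
func (tl testLogger) NumMessages() int {
	tl.mutex.Lock()
	defer tl.mutex.Unlock()
	return len(*tl.messages)
}

func (tl testLogger) LastMessage() string {
	tl.mutex.Lock()
	defer tl.mutex.Unlock()

	// Deadlocks: NumMessages() tries to lock a mutex this goroutine
	// already holds.
	// return (*tl.messages)[tl.NumMessages()-1]

	// Safe: read the slice directly while the lock is held.
	return (*tl.messages)[len(*tl.messages)-1]
}

func main() {
	msgs := []string{"first", "last"}
	tl := testLogger{mutex: &sync.Mutex{}, messages: &msgs}
	fmt.Println(tl.LastMessage()) // prints "last"
}
```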
pkg/timelock/timelock.go (Outdated)
@@ -449,7 +451,9 @@ func (tw *Worker) handleLog(ctx context.Context, log types.Log) error {

	if !isDone(ctx, tw.contract, cs.Id) && isOperation(ctx, tw.contract, cs.Id) {
		tw.logger.Info().Hex(fieldTXHash, cs.Raw.TxHash[:]).Uint64(fieldBlockNumber, cs.Raw.BlockNumber).Msgf("%s received", eventCallScheduled)
-		tw.addToScheduler(cs)
+		if !tw.dryRun {
I feel this changes the nature of the function: now a flag completely changes its output. That can make it confusing and hard to maintain.
The goal of the PR is to have an option to avoid side effects, right? That's not on the Worker side but on the Scheduler, so why don't we create another type of scheduler that doesn't execute?
func NewTimelockWorker(..., dryRun bool) (*Worker, error) {
	scheduler := *newScheduler(time.Duration(pollPeriod) * time.Second)
	if dryRun {
		// This could be exactly the same, except the execute, which does nothing
		scheduler = *newDryRunScheduler(time.Duration(pollPeriod) * time.Second)
	}
	tWorker := &Worker{
		...
		scheduler: scheduler,
	}
	return tWorker, nil
}
> The goal of the PR is having an option to avoid side effects right?

It's more about "testing the event listener". I'll give your suggestion a try, but I have a feeling it will be harder to test and maintain in the end.
The changes look good to me; how do you feel about it?
Force-pushed from e080275 to fd8a997
LGTM! Might be worth creating a ticket as part of repo standardization to add a t.Parallel linter rule to make sure unit tests run in parallel, as I think the t.Parallel()
	}
	if !(reflect.TypeOf(tt.want) == reflect.TypeOf(got)) {
		t.Errorf("NewTimelockWorker() = %v, want %v", got, tt.want)
	args := defaultArgs
might be worth adding t.Parallel()
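For illustration, a hypothetical sketch of what a `t.Parallel()`-enabled table-driven test could look like; the test name and cases below are made up, not taken from this repo.

```go
package timelock_test

import "testing"

func TestNewTimelockWorker_parallel(t *testing.T) {
	t.Parallel() // let this test run alongside other parallel tests

	tests := []struct {
		name    string
		wantErr bool
	}{
		{name: "valid config", wantErr: false},
		{name: "invalid config", wantErr: true},
	}

	for _, tt := range tests {
		tt := tt // capture the range variable for the parallel subtest
		t.Run(tt.name, func(t *testing.T) {
			t.Parallel()   // subtests also run concurrently with each other
			_ = tt.wantErr // ...call NewTimelockWorker with tt's inputs here
		})
	}
}
```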
I opted to leave it as is for now. Parallelizing integration tests -- which often rely on a shared data store, such as the geth instance -- can get tricky. It might even work as is, but I didn't explicitly design the tests with parallelism in mind. So I'd rather take the time to think it through thoroughly if/when it makes sense.

Never mind. I got confused and thought this comment was in tests/integration/timelock_test.go.
Done.
ef13b61
Force-pushed from 86207a0 to 0838722
@@ -153,6 +157,48 @@ func Test_dumpOperationStore(t *testing.T) {
	assert.Equal(t, wantRead, gotRead)
}

+func Test_scheduler_concurrency(t *testing.T) {
After the deadlocks I saw with the earlier version, I decided to add this unorthodox test to validate that the scheduler can simultaneously add and consume operations.
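As a rough illustration of the idea only (not the actual Test_scheduler_concurrency added in this PR): one goroutine keeps adding operations while another consumes them, and the test fails by timing out if the two ever block each other.

```go
package timelock_test

import (
	"sync"
	"testing"
	"time"
)

func Test_addAndConsumeConcurrently(t *testing.T) {
	add := make(chan int)
	done := make(chan struct{})
	var wg sync.WaitGroup

	// Consumer: drain the channel until it is closed.
	wg.Add(1)
	go func() {
		defer wg.Done()
		for range add {
		}
	}()

	// Producer: schedule a batch of operations, then signal completion.
	wg.Add(1)
	go func() {
		defer wg.Done()
		for i := 0; i < 100; i++ {
			add <- i
		}
		close(add)
	}()

	go func() {
		wg.Wait()
		close(done)
	}()

	// Fail instead of hanging forever if producer and consumer deadlock.
	select {
	case <-done:
	case <-time.After(5 * time.Second):
		t.Fatal("scheduler add/consume deadlocked")
	}
}
```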
Force-pushed from 0838722 to ba738cb
	tw.logger.Debug().Msgf("scheduling operation: %x", op.Id)
	tw.add <- op
	tw.logger.Debug().Msgf("operations in scheduler: %v", len(tw.store))
This `len(tw.store)` call is causing a deadlock, so I'm removing it from `addToScheduler` and `delFromScheduler`.
The alternative would be to refactor the scheduler implementation, which I've started doing in this branch. It seems to work as expected but it's a risky change which I'd rather not worry about right now. Once we've enhanced our test suite I'll give it another go.
Add a `DRY_RUN` configuration option that controls whether the service executes operations or simply logs the received events. This is a temporary feature, needed to help test the recently added logic that monitors contract events over HTTP connections without causing any side effects.
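For illustration only, here is one way a `DRY_RUN` environment variable could be parsed into a boolean config field; the variable name matches the PR, but the config struct and parsing code below are assumptions, not the repo's actual implementation.

```go
package main

import (
	"log"
	"os"
	"strconv"
)

type config struct {
	DryRun bool
}

// loadConfig reads DRY_RUN from the environment; anything that does not
// parse as a bool (including an unset variable) falls back to false, i.e.
// the service keeps executing operations.
func loadConfig() config {
	dryRun, err := strconv.ParseBool(os.Getenv("DRY_RUN"))
	if err != nil {
		dryRun = false
	}
	return config{DryRun: dryRun}
}

func main() {
	cfg := loadConfig()
	log.Printf("dry run enabled: %v", cfg.DryRun)
}
```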
Instead of using the `dryRun` flag to control whether or not operations are added to the (standard) scheduler, we now select the type of scheduler passed to the timelock worker service:

* if `dryRun` is false, use the standard scheduler
* if `dryRun` is true, use the new "nop" scheduler, which only logs the calls but does not do anything

In practice the "standard scheduler" is a new type + interface as well, since the existing implementation defined the schedule as a simple data type associated with the timelock worker via implicit composition (though all the schedule-related methods were defined on the timelock worker type).
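A sketch of the described approach; the interface and method names below are assumptions rather than the ones used in the repo, but it shows the shape of a standard scheduler that executes operations next to a "nop" scheduler that only logs them.

```go
package main

import "log"

type operation struct{ id string }

// scheduler is the behaviour the timelock worker depends on.
type scheduler interface {
	addToScheduler(op operation)
}

// standardScheduler queues operations so they can be executed later.
type standardScheduler struct {
	add chan operation
}

func (s *standardScheduler) addToScheduler(op operation) {
	s.add <- op
}

// nopScheduler only logs the call and never executes anything; this is
// what a DRY_RUN configuration would select.
type nopScheduler struct{}

func (s *nopScheduler) addToScheduler(op operation) {
	log.Printf("dry run: would schedule operation %s", op.id)
}

// newScheduler picks the implementation based on the dryRun flag.
func newScheduler(dryRun bool) scheduler {
	if dryRun {
		return &nopScheduler{}
	}
	return &standardScheduler{add: make(chan operation, 16)}
}

func main() {
	s := newScheduler(true)
	s.addToScheduler(operation{id: "0xabc"})
}
```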
Force-pushed from ba738cb to a9143ad
friendly bump on this @RodrigoAD @jkongie @graham-chainlink @ecPablo
not super confident but LGTM