
When a test framework setup fails, no tests are run but the run succeeds #2098

Open
tnozicka opened this issue Aug 27, 2024 · 6 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@tnozicka
Member

tnozicka commented Aug 27, 2024

When any assertion in the Framework fails, no tests are executed but the run succeeds. It can be simulated by modifying the framework constructor like this:

func NewFramework(namePrefix string) *Framework {
	o.Expect(true).To(o.BeFalse())
	// ...
}

This goes through ginkgo.GinkgoRecover and calls the fail handler correctly, but my suspicion is that when no test gets registered, it forgets to check the fail handler. I've tried failing the individual tests directly, before the framework creation, and that has correctly failed the run.
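
For contrast, a minimal sketch of that second experiment (an assumed shape, using the usual g and o aliases for ginkgo and gomega; not the actual suite code): a failing assertion placed inside a spec body is reported and fails the run as expected.

var _ = g.It("fails as expected", func() {
	o.Expect(true).To(o.BeFalse()) // runs at spec time, so the fail handler is honored
})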

go run ./cmd/scylla-operator-tests run all --ingress-controller-address=$( kubectl -n haproxy-ingress get svc haproxy-ingress --template='{{ .spec.clusterIP }}' ) --loglevel=2 --parallelism=1 --progress --delete-namespace-policy=OnSuccess --feature-gates='AutomaticTLSCertificates=true' --artifacts-dir=/tmp/so-artifacts --fail-fast --loglevel=5
Flag --delete-namespace-policy has been deprecated, --delete-namespace-policy is deprecated - please use --cleanup-policy instead
I0827 19:39:14.798450  632799 tests/tests.go:74] maxprocs: Leaving GOMAXPROCS=[16]: CPU quota undefined
I0827 19:39:14.799273  632799 tests/tests_run.go:213] "scylla-operator-tests run" version "unknown"
I0827 19:39:14.799288  632799 flag/flags.go:64] FLAG: --artifacts-dir="/tmp/so-artifacts"
I0827 19:39:14.799292  632799 flag/flags.go:64] FLAG: --burst="75"
I0827 19:39:14.799297  632799 flag/flags.go:64] FLAG: --cleanup-policy="OnSuccess"
I0827 19:39:14.799300  632799 flag/flags.go:64] FLAG: --color="true"
I0827 19:39:14.799303  632799 flag/flags.go:64] FLAG: --delete-namespace-policy="OnSuccess"
I0827 19:39:14.799304  632799 flag/flags.go:64] FLAG: --dry-run="false"
I0827 19:39:14.799306  632799 flag/flags.go:64] FLAG: --fail-fast="true"
I0827 19:39:14.799307  632799 flag/flags.go:64] FLAG: --feature-gates="AutomaticTLSCertificates=true"
I0827 19:39:14.799321  632799 flag/flags.go:64] FLAG: --flake-attempts="0"
I0827 19:39:14.799322  632799 flag/flags.go:64] FLAG: --focus="[]"
I0827 19:39:14.799328  632799 flag/flags.go:64] FLAG: --gcs-service-account-key-path=""
I0827 19:39:14.799330  632799 flag/flags.go:64] FLAG: --help="false"
I0827 19:39:14.799332  632799 flag/flags.go:64] FLAG: --ingress-controller-address="10.111.86.222"
I0827 19:39:14.799335  632799 flag/flags.go:64] FLAG: --ingress-controller-custom-annotations="[]"
I0827 19:39:14.799340  632799 flag/flags.go:64] FLAG: --ingress-controller-ingress-class-name=""
I0827 19:39:14.799343  632799 flag/flags.go:64] FLAG: --kubeconfig="[]"
I0827 19:39:14.799347  632799 flag/flags.go:64] FLAG: --label-filter=""
I0827 19:39:14.799349  632799 flag/flags.go:64] FLAG: --loglevel="5"
I0827 19:39:14.799351  632799 flag/flags.go:64] FLAG: --object-storage-bucket=""
I0827 19:39:14.799353  632799 flag/flags.go:64] FLAG: --parallel-loglevel="0"
I0827 19:39:14.799355  632799 flag/flags.go:64] FLAG: --parallel-server-address=""
I0827 19:39:14.799357  632799 flag/flags.go:64] FLAG: --parallel-shard="0"
I0827 19:39:14.799359  632799 flag/flags.go:64] FLAG: --parallelism="1"
I0827 19:39:14.799361  632799 flag/flags.go:64] FLAG: --progress="true"
I0827 19:39:14.799363  632799 flag/flags.go:64] FLAG: --qps="50"
I0827 19:39:14.799367  632799 flag/flags.go:64] FLAG: --quiet="false"
I0827 19:39:14.799369  632799 flag/flags.go:64] FLAG: --random-seed="1724780354"
I0827 19:39:14.799372  632799 flag/flags.go:64] FLAG: --s3-credentials-file-path=""
I0827 19:39:14.799375  632799 flag/flags.go:64] FLAG: --scyllacluster-clients-broadcast-address-type="PodIP"
I0827 19:39:14.799388  632799 flag/flags.go:64] FLAG: --scyllacluster-node-service-type="Headless"
I0827 19:39:14.799390  632799 flag/flags.go:64] FLAG: --scyllacluster-nodes-broadcast-address-type="PodIP"
I0827 19:39:14.799393  632799 flag/flags.go:64] FLAG: --scyllacluster-storageclass-name=""
I0827 19:39:14.799395  632799 flag/flags.go:64] FLAG: --skip="[]"
I0827 19:39:14.799398  632799 flag/flags.go:64] FLAG: --timeout="24h0m0s"
I0827 19:39:14.799401  632799 flag/flags.go:64] FLAG: --v="5"
I0827 19:39:14.799548  632799 tests/tests_run.go:299] "Running specs"
Running Suite: Scylla operator E2E tests - /home/dev/dev/go/src/github.com/scylladb/scylla-operator
===================================================================================================
Random Seed: 1724780354 - will randomize all specs

Will run 0 of 0 specs

Ran 0 of 0 Specs in 0.000 seconds
SUCCESS! -- 0 Passed | 0 Failed | 0 Pending | 0 Skipped
@tnozicka tnozicka added kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Aug 27, 2024
@rzetelskik
Member

rzetelskik commented Aug 27, 2024

When any assertion in the Framework fails, no tests are executed but the run succeeds. It can be simulated by modifying the framework constructor like this:

func NewFramework(namePrefix string) *Framework {
	o.Expect(true).To(o.BeFalse())
	// ...
}

This goes through ginkgo.GinkgoRecover and calls the fail handler correctly, but my suspicion is that when no test gets registered, it forgets to check the fail handler.

This is by design (as in we're using ginkgo incorrectly) - we run framework initialisation in a container node, so any failing assertions there will fail at the spec tree construction phase, not when running specs, see https://onsi.github.io/ginkgo/#no-assertions-in-container-nodes. Iirc putting GinkgoRecover there only causes it to (unintentionally from our perspective) pass silently on a failed assertion - if you remove it you'll see that the spec tree construction fails. GinkgoRecover shouldn't be called in the container nodes in the first place, see e.g. onsi/ginkgo#931 (comment).
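
For illustration, a minimal sketch of the problematic pattern (an assumed shape, using the usual g and o aliases for ginkgo and gomega; not the actual suite code):

var _ = g.Describe("framework setup", func() {
	// This closure is a container node body, so NewFramework runs during
	// spec tree construction, not when specs execute; a failed assertion
	// here fails tree construction rather than any spec, and with a
	// deferred GinkgoRecover inside NewFramework it passes silently.
	f := NewFramework("example")

	g.It("uses the framework", func() {
		o.Expect(f).NotTo(o.BeNil())
	})
})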

Imo to properly fix this we'd have to move all initialisation in the framework from the container nodes to setup nodes. I think I actually tried to do this at some point but it required quite a few changes given how the framework and tests are set up now.
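
A minimal sketch of that direction (an assumption about the eventual shape, not an actual patch):

var _ = g.Describe("framework setup", func() {
	var f *Framework

	// Setup nodes run when the spec executes, so a failed assertion inside
	// NewFramework fails the spec (and therefore the run) instead of being
	// swallowed during tree construction.
	g.BeforeEach(func() {
		f = NewFramework("example")
	})

	g.It("uses the framework", func() {
		o.Expect(f).NotTo(o.BeNil())
	})
})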

@tnozicka
Member Author

tnozicka commented Aug 28, 2024

yeah, I guess we should put stuff that can fail (init) into a BeforeEach

the weird thing is that it works if not all nodes fail...

@rzetelskik
Member

the weird thing is that it works if not all nodes fail...

Can you give an example? I'm not sure what that means.

@scylla-operator-bot bot
Contributor

The Scylla Operator project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 30d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out

/lifecycle stale

@scylla-operator-bot scylla-operator-bot bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 27, 2024
@scylla-operator-bot bot
Contributor

The Scylla Operator project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 30d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out

/lifecycle rotten

@scylla-operator-bot scylla-operator-bot bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Oct 28, 2024
@tnozicka
Member Author

/remove-lifecycle rotten
/triage accepted

@scylla-operator-bot scylla-operator-bot bot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Oct 29, 2024