[Flake] ScyllaCluster [It] should allow to build connection pool using shard aware ports #2189

Open
tnozicka opened this issue Nov 18, 2024 · 4 comments
Labels
kind/flake Categorizes issue or PR as related to a flaky test. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.


@tnozicka
Member

Link to the job that flaked.

https://prow.scylla-operator.scylladb.com/view/gs/scylla-operator-prow/logs/ci-scylla-operator-latest-e2e-gke-arm64-parallel/1858389818862473216#1:test-build-log.txt%3A823

Snippet of what failed.

 • [FAILED] [300.390 seconds]
ScyllaCluster [It] should allow to build connection pool using shard aware ports
github.com/scylladb/scylla-operator/test/e2e/set/scyllacluster/scyllacluster_shardawareness.go:29
  Timeline >>
  STEP: Creating a new namespace @ 11/18/24 06:21:48.127
  Nov 18 06:24:32.851: INFO: Connecting to 10.27.193.91:9042 using 0 source port
  Nov 18 06:24:32.852: INFO: Connecting to 10.27.193.91:19042 using 32771 source port
  Nov 18 06:24:32.852: INFO: Connecting to 10.27.193.91:19042 using 32769 source port
  Nov 18 06:24:32.852: INFO: Connecting to 10.27.193.91:19042 using 32768 source port
  Nov 18 06:24:32.852: INFO: Connecting to 10.27.193.91:19042 using 32772 source port
  [FAILED] in [It] - github.com/scylladb/scylla-operator/test/e2e/set/scyllacluster/scyllacluster_shardawareness.go:100 @ 11/18/24 06:24:37.853
  STEP: Collecting events from namespace "e2e-test-scyllacluster-qgl6f-0-xjfzj". @ 11/18/24 06:24:37.872
  STEP: Found 47 events. @ 11/18/24 06:24:37.882
  Nov 18 06:24:37.882: INFO: At 2024-11-18 06:21:48 +0000 UTC - event for basic-7ltzh: {controllermanager } NoPods: No matching pods found
  Nov 18 06:24:37.882: INFO: At 2024-11-18 06:21:48 +0000 UTC - event for basic-7ltzh: {scyllaclustermigration-controller } ScyllaDBDatacenterCreated: ScyllaDBDatacenter e2e-test-scyllacluster-qgl6f-0-xjfzj/basic-7ltzh created
  Nov 18 06:24:37.882: INFO: At 2024-11-18 06:21:48 +0000 UTC - event for basic-7ltzh: {scylladbdatacenter-controller } PodDisruptionBudgetCreated: PodDisruptionBudget e2e-test-scyllacluster-qgl6f-0-xjfzj/basic-7ltzh created
  Nov 18 06:24:37.883: INFO: At 2024-11-18 06:22:20 +0000 UTC - event for basic-7ltzh-us-east-1-us-east-1a-0: {kubelet gke-so-74a6779a-0ef5-41d8-9e4-workers-9332f80a-49l8} Pulled: Container image "docker.io/scylladb/scylla-operator@sha256:21e38512cdc63260fc0317dc72bec8501a47b7694b38ef6be225fa5450a0d150" already present on machine
  Nov 18 06:24:37.883: INFO: At 2024-11-18 06:22:21 +0000 UTC - event for basic-7ltzh-us-east-1-us-east-1a-0: {kubelet gke-so-74a6779a-0ef5-41d8-9e4-workers-9332f80a-49l8} Created: Created container scylla-manager-agent
  Nov 18 06:24:37.883: INFO: At 2024-11-18 06:22:21 +0000 UTC - event for basic-7ltzh-us-east-1-us-east-1a-0: {kubelet gke-so-74a6779a-0ef5-41d8-9e4-workers-9332f80a-49l8} Pulled: Container image "docker.io/scylladb/scylla-manager-agent:3.3.3@sha256:40e31739e8fb1d48af87abaeaa8ee29f71607964daa8434fe2526dfc6f665920" already present on machine
  Nov 18 06:24:37.883: INFO: At 2024-11-18 06:22:21 +0000 UTC - event for basic-7ltzh-us-east-1-us-east-1a-0: {kubelet gke-so-74a6779a-0ef5-41d8-9e4-workers-9332f80a-49l8} Started: Started container scylladb-ignition
  Nov 18 06:24:37.883: INFO: At 2024-11-18 06:22:21 +0000 UTC - event for basic-7ltzh-us-east-1-us-east-1a-0: {kubelet gke-so-74a6779a-0ef5-41d8-9e4-workers-9332f80a-49l8} Started: Started container scylla-manager-agent
  Nov 18 06:24:37.883: INFO: At 2024-11-18 06:22:22 +0000 UTC - event for basic-7ltzh-us-east-1-us-east-1a-0: {kubelet gke-so-74a6779a-0ef5-41d8-9e4-workers-9332f80a-49l8} Unhealthy: Readiness probe failed: Get "http://10.27.193.91:42081/readyz": dial tcp 10.27.193.91:42081: connect: connection refused
  Nov 18 06:24:37.883: INFO: At 2024-11-18 06:22:22 +0000 UTC - event for basic-7ltzh-us-east-1-us-east-1a-0: {kubelet gke-so-74a6779a-0ef5-41d8-9e4-workers-9332f80a-49l8} Unhealthy: Readiness probe failed: dial tcp 10.27.193.91:8080: connect: connection refused
  Nov 18 06:24:37.883: INFO: At 2024-11-18 06:22:22 +0000 UTC - event for basic-7ltzh-us-east-1-us-east-1a-0: {kubelet gke-so-74a6779a-0ef5-41d8-9e4-workers-9332f80a-49l8} Unhealthy: Readiness probe failed: dial tcp 10.27.193.91:10001: connect: connection refused
  Nov 18 06:24:37.883: INFO: At 2024-11-18 06:22:22 +0000 UTC - event for basic-7ltzh-us-east-1-us-east-1a-0: {kubelet gke-so-74a6779a-0ef5-41d8-9e4-workers-9332f80a-49l8} Unhealthy: Startup probe failed: Get "http://10.27.193.91:8080/healthz": dial tcp 10.27.193.91:8080: connect: connection refused
  Nov 18 06:24:37.883: INFO: At 2024-11-18 06:22:22 +0000 UTC - event for nodeconfig-podinfo-d4d2ee60-a550-4439-b462-6487855288f3: {NodeConfigCM-controller } ConfigMapUpdated: ConfigMap e2e-test-scyllacluster-qgl6f-0-xjfzj/nodeconfig-podinfo-d4d2ee60-a550-4439-b462-6487855288f3 updated
  Nov 18 06:24:37.883: INFO: At 2024-11-18 06:24:32 +0000 UTC - event for basic-7ltzh-us-east-1-us-east-1a-0: {scylladbdatacenter-controller } ServiceUpdated: Service e2e-test-scyllacluster-qgl6f-0-xjfzj/basic-7ltzh-us-east-1-us-east-1a-0 updated
  STEP: Collecting dumps from namespace "e2e-test-scyllacluster-qgl6f-0-xjfzj". @ 11/18/24 06:24:37.883
  STEP: Destroying namespace "e2e-test-scyllacluster-qgl6f-0-xjfzj". @ 11/18/24 06:24:38.452
  STEP: Waiting for namespace "e2e-test-scyllacluster-qgl6f-0-xjfzj" to be removed. @ 11/18/24 06:24:38.464
  << Timeline
  [FAILED] Unexpected error:
      <context.deadlineExceededError>: 
      context deadline exceeded
      {}
  occurred
  In [It] at: github.com/scylladb/scylla-operator/test/e2e/set/scyllacluster/scyllacluster_shardawareness.go:100 @ 11/18/24 06:24:37.853
  Full Stack Trace
    github.com/scylladb/scylla-operator/test/e2e/set/scyllacluster.init.func14.1()
    	github.com/scylladb/scylla-operator/test/e2e/set/scyllacluster/scyllacluster_shardawareness.go:100 +0x8f4
------------------------------
• [300.302 seconds] 
@tnozicka tnozicka added kind/flake Categorizes issue or PR as related to a flaky test. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Nov 18, 2024
@zimnx
Collaborator

zimnx commented Nov 18, 2024

Nov 18 06:24:32.843: INFO: Connecting to 10.27.193.91:9042 using 0 source port
Nov 18 06:24:32.844: INFO: Connecting to 10.27.193.91:9042 using 0 source port
Nov 18 06:24:32.851: INFO: Connecting to 10.27.193.91:9042 using 0 source port
Nov 18 06:24:32.852: INFO: Connecting to 10.27.193.91:19042 using 32771 source port
Nov 18 06:24:32.852: INFO: Connecting to 10.27.193.91:19042 using 32769 source port
Nov 18 06:24:32.852: INFO: Connecting to 10.27.193.91:19042 using 32768 source port
Nov 18 06:24:32.852: INFO: Connecting to 10.27.193.91:19042 using 32772 source port

This indicates the image doesn't contain the recent #2175, which was merged 3 weeks ago (!). Either we haven't had a successful run for that long, or there's something wrong with CI.
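
For readers unfamiliar with the log lines above: on ScyllaDB's shard-aware port (19042 in this log), the node routes a connection to the shard given by the client's source port modulo the node's shard count, which is why the test dials from specific source ports. A source port of 0 on the regular port 9042 presumably means the OS picks an ephemeral port. The following is only a minimal sketch of that arithmetic, not the operator's test code; the function names and the shard count of 4 are hypothetical, since the thread does not state the node's shard count.

```python
def shard_for_source_port(source_port: int, shard_count: int) -> int:
    """Shard that a shard-aware port would route this source port to."""
    return source_port % shard_count


def source_ports_for_shard(shard: int, shard_count: int,
                           start: int = 32768, end: int = 65535) -> range:
    """All source ports in [start, end] that map to the given shard."""
    # Smallest port >= start whose remainder modulo shard_count equals shard.
    first = start + (shard - start) % shard_count
    return range(first, end + 1, shard_count)


if __name__ == "__main__":
    SHARD_COUNT = 4  # hypothetical; not taken from the failing job
    # Source ports copied from the log excerpt above.
    for port in (32771, 32769, 32768, 32772):
        print(port, "-> shard", shard_for_source_port(port, SHARD_COUNT))
```

Under the assumed shard count of 4, ports 32768 and 32772 would land on the same shard, so a client that needs one connection per shard has to choose its source ports deliberately rather than take consecutive ones.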

@zimnx
Collaborator

zimnx commented Nov 18, 2024

Indeed, the last successful ci-scylla-operator-master-images build was on Oct 27, while #2175 was merged on Oct 31.

@tnozicka
Member Author

tnozicka commented Nov 18, 2024

yeah, well, we know the arm builds fail, which means the promotion doesn't happen
