[Bug]: Segfault when doing explicit update on hypertable #7516

Open
martindaehn23 opened this issue Dec 4, 2024 · 4 comments

@martindaehn23

What type of bug is this?

Crash

What subsystems and features are affected?

Data ingestion, Platform/OS, Query executor, Replication

What happened?

We are operating a Patroni cluster. The following UPDATE query on a hypertable leads to a segfault. All the information we could gather is below.

TimescaleDB version affected

2.17.2

PostgreSQL version used

14.14

What operating system did you use?

Ubuntu 22.04.5 LTS

What installation method did you use?

Deb/Apt

What platform did you run on?

On prem/Self-hosted

Relevant log output and stack trace

## dmesg
[1266168.466753] postgres[729933]: segfault at 7ef2fd228b48 ip 00007ef6457812c2 sp 00007fff32a4a930 error 4 in timescaledb-2.17.2.so[7ef64573f000+61000]
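
As a rough aid for reading the dmesg line above, the following sketch does the arithmetic on the values it contains. It assumes an x86-64 Linux host (the report does not state the architecture), where the `error` field is the page-fault error code and the bracketed pair is the start and length of the mapping that contains the faulting instruction. The variable names are only illustrative.

```shell
#!/usr/bin/env bash
# Values copied from the dmesg line in this report.
ip=0x00007ef6457812c2      # faulting instruction pointer
map_start=0x7ef64573f000   # start of the mapping shown in brackets
map_len=0x61000            # length of that mapping
err=4                      # page-fault error code (x86 assumed)

# Offset of the faulting instruction inside the reported mapping.
offset=$(printf '0x%x' $(( $ip - $map_start )))

# Sanity check: the offset should fall inside the mapping.
inside=$(( ($ip - $map_start) < $map_len ))

# Decode the low three bits of the x86 page-fault error code.
present=$(( err & 1 ))         # 0 = page not present, 1 = protection violation
write=$(( (err >> 1) & 1 ))    # 0 = read access, 1 = write access
user=$(( (err >> 2) & 1 ))     # 0 = kernel mode, 1 = user mode

echo "offset in mapping: $offset (inside mapping: $inside)"
echo "present=$present write=$write user=$user"
```

For this report the offset comes out as 0x422c2, and error 4 decodes to a user-mode read of a page that was not mapped. The coredump output reports the same instruction pointer as `+ 0x5d2c2`; the two offsets are presumably measured from different base addresses, so they are not expected to match.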

## postgres log
In the postgres log we see the following:
2024-12-04 13:22:10 CET [4110337]: [142-1] user=,db=,app=,client= 'LOG:  server process (PID 3198125) was terminated by signal 11: Segmentation fault
2024-12-04 13:22:10 CET [4110337]: [143-1] user=,db=,app=,client= 'DETAIL:  Failed process was running: UPDATE "data_meters_ts" SET "updated_at" = $1, "migrated_at" = $2 WHERE "data_meters_ts"."device_uid" = $3 /*application:Core*/
2024-12-04 13:22:10 CET [4110337]: [144-1] user=,db=,app=,client= 'LOG:  terminating any other active server processes
2024-12-04 13:22:10 CET [4110337]: [145-1] user=,db=,app=,client= 'LOG:  all server processes terminated; reinitializing
2024-12-04 13:22:12 CET [3199061]: [1-1] user=,db=,app=,client= 'LOG:  database system was interrupted; last known up at 2024-12-04 13:10:45 CET
2024-12-04 13:22:12 CET [3199062]: [1-1] user=[unknown],db=[unknown],app=[unknown],client=10.1.0.4 'LOG:  connection received: host=10.1.0.4 port=56874
2024-12-04 13:22:12 CET [3199063]: [1-1] user=[unknown],db=[unknown],app=[unknown],client=127.0.0.1 'LOG:  connection received: host=127.0.0.1 port=30558
2024-12-04 13:22:12 CET [3199064]: [1-1] user=[unknown],db=[unknown],app=[unknown],client=127.0.0.1 'LOG:  connection received: host=127.0.0.1 port=30560
2024-12-04 13:22:12 CET [3199065]: [1-1] user=[unknown],db=[unknown],app=[unknown],client=127.0.0.1 'LOG:  connection received: host=127.0.0.1 port=30570
2024-12-04 13:22:12 CET [3199066]: [1-1] user=[unknown],db=[unknown],app=[unknown],client=127.0.0.1 'LOG:  connection received: host=127.0.0.1 port=30584
2024-12-04 13:22:12 CET [3199063]: [2-1] user=pgbouncer,db=postgres,app=[unknown],client=127.0.0.1 'FATAL:  the database system is in recovery mode
2024-12-04 13:22:12 CET [3199064]: [2-1] user=xxx_ug,db=xxx_ug,app=[unknown],client=127.0.0.1 'FATAL:  the database system is in recovery mode
2024-12-04 13:22:12 CET [3199065]: [2-1] user=xxx_ug_grafana,db=xxx_ug_grafana,app=[unknown],client=127.0.0.1 'FATAL:  the database system is in recovery mode
2024-12-04 13:22:12 CET [3199066]: [2-1] user=xxx_mw_grafana,db=xxx_mw_grafana,app=[unknown],client=127.0.0.1 'FATAL:  the database system is in recovery mode
2024-12-04 13:22:12 CET [3199062]: [2-1] user=replicator,db=[unknown],app=[unknown],client=10.1.0.4 'FATAL:  the database system is in recovery mode
2024-12-04 13:22:12 CET [3199061]: [2-1] user=,db=,app=,client= 'LOG:  database system was not properly shut down; automatic recovery in progress
2024-12-04 13:22:12 CET [3199061]: [3-1] user=,db=,app=,client= 'LOG:  redo starts at 3E3/8C0CBD28
2024-12-04 13:22:12 CET [3199061]: [4-1] user=,db=,app=,client= 'LOG:  invalid record length at 3E3/8CFB6A38: wanted 24, got 0
2024-12-04 13:22:12 CET [3199061]: [5-1] user=,db=,app=,client= 'LOG:  redo done at 3E3/8CFB6A00 system usage: CPU: user: 0.04 s, system: 0.01 s, elapsed: 0.05 s
2024-12-04 13:22:12 CET [3199061]: [6-1] user=,db=,app=,client= 'LOG:  checkpoint starting: end-of-recovery immediate
2024-12-04 13:22:12 CET [3199069]: [1-1] user=[unknown],db=[unknown],app=[unknown],client=[local] 'LOG:  connection received: host=[local]
2024-12-04 13:22:12 CET [3199069]: [2-1] user=postgres,db=postgres,app=[unknown],client=[local] 'FATAL:  the database system is in recovery mode
2024-12-04 13:22:12 CET [3199070]: [1-1] user=[unknown],db=[unknown],app=[unknown],client=[local] 'LOG:  connection received: host=[local]
2024-12-04 13:22:12 CET [3199070]: [2-1] user=postgres,db=postgres,app=[unknown],client=[local] 'FATAL:  the database system is in recovery mode
2024-12-04 13:22:12 CET [3199071]: [1-1] user=[unknown],db=[unknown],app=[unknown],client=[local] 'LOG:  connection received: host=[local]
2024-12-04 13:22:12 CET [3199071]: [2-1] user=postgres,db=postgres,app=[unknown],client=[local] 'FATAL:  the database system is in recovery mode
2024-12-04 13:22:12 CET [3199061]: [7-1] user=,db=,app=,client= 'LOG:  checkpoint complete: wrote 5448 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.100 s, sync=0.003 s, total=0.105 s; sync files=431, longest=0.002 s, average=0.001 s; distance=15275 kB, estimate=15275 kB
2024-12-04 13:22:12 CET [4110337]: [146-1] user=,db=,app=,client= 'LOG:  database system is ready to accept connections

### coredump
coredumpctl dump 835417
           PID: 835417 (postgres)
           UID: 115 (postgres)
           GID: 122 (postgres)
        Signal: 11 (SEGV)
     Timestamp: Wed 2024-12-04 15:29:42 UTC (6min ago)
  Command Line: $'postgres: postgres-cluster: postgres xxx_review1 [local] UPDATE' "" "" "" ...
    Executable: /usr/lib/postgresql/14/bin/postgres
 Control Group: /system.slice/patroni.service
          Unit: patroni.service
         Slice: system.slice
       Boot ID: 8f7a9197d6a24b36be6cfe2fa2c9ba42
    Machine ID: 43fbdea8349742b6a40ec9ce6c69d36f
      Hostname: bukavu
       Storage: /var/lib/systemd/coredump/core.postgres.115.8f7a9197d6a24b36be6cfe2fa2c9ba42.835417.1733326182000000.zst (truncated)
     Disk Size: 363.0M
       Message: Process 835417 (postgres) of user 115 dumped core.

                Found module /usr/lib/postgresql/14/bin/postgres with build-id: c030f54129901e3e919853e042807a20aafbeaba
                Found module /usr/lib/postgresql/14/lib/timescaledb-tsl-2.17.2.so with build-id: 64d7777a078da90d42fb71649694e3ad8ded22c9
                Found module /usr/lib/postgresql/14/lib/timescaledb-2.17.2.so with build-id: 2de6a8207f4ed2ea128011d4fa537786ceb2f4a3
                Stack trace of thread 835417:
                #0  0x00007ef6457812c2 ExecInitUpdateProjection (/usr/lib/postgresql/14/lib/timescaledb-2.17.2.so + 0x5d2c2)

How can we reproduce the bug?

## Query
We are able to reproduce the issue with any explicit UPDATE on a hypertable, in any database. We have already done a failover; it happens on both servers.

UPDATE hyper_table SET attr = 1 WHERE id = 1;

Updates on non-hypertables work fine.
The following also still works, even on hypertables:

INSERT INTO ... VALUES ON CONFLICT ... DO UPDATE SET ...

@erimatnor
Contributor

Looks like a duplicate of #7497

@martindaehn23 Is the table using compression?

@martindaehn23
Author

No, we don't use any compression. The mentioned issue looks similar, except for this part of the core dump:

  Command Line: $'postgres: postgres-cluster: postgres xxx_review1 [local

Any idea?

@svenklemm
Member

Can you add a fully self-contained reproducer so that we can reproduce this in a fresh database? The table definitions and hypertable definitions are all missing from the ticket.
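
A self-contained reproducer of the kind requested could have roughly the following shape. This is only a sketch: the table layout, column types, sample value and the `reproducer_sql` helper are hypothetical, since the real definitions are not in the ticket. Only the table name and the column names in the UPDATE are taken from the postgres log above, and there is no claim that this exact script triggers the crash.

```shell
#!/usr/bin/env bash
# Sketch of a self-contained reproducer for a fresh database.
# ASSUMPTION: the schema below is invented for illustration; only the table
# name and the updated/filtered column names come from the reported query.

reproducer_sql() {
  cat <<'SQL'
CREATE EXTENSION IF NOT EXISTS timescaledb;

-- Hypothetical table definition (the real one is missing from the ticket).
CREATE TABLE data_meters_ts (
  ts          timestamptz NOT NULL,
  device_uid  text        NOT NULL,
  updated_at  timestamptz,
  migrated_at timestamptz
);

-- Turn the plain table into a hypertable partitioned on ts.
SELECT create_hypertable('data_meters_ts', by_range('ts'));

-- One row so that the UPDATE has something to touch.
INSERT INTO data_meters_ts (ts, device_uid) VALUES (now(), 'dev-1');

-- Same shape as the statement from the postgres log.
UPDATE data_meters_ts SET updated_at = now(), migrated_at = now() WHERE device_uid = 'dev-1';
SQL
}
```

It could be run against a scratch database (the name here is a placeholder) with `reproducer_sql | psql -X -v ON_ERROR_STOP=1 -d scratch_db`, replacing the invented schema with the real table and hypertable definitions.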

@svenklemm
Member

@martindaehn23 could you post the stacktrace from the core dump?
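
One way to get a stack trace out of a systemd-coredump entry is sketched below, assuming `coredumpctl` and `gdb` are installed on the host. The PID and binary path are the ones quoted in the report; the `run` and `dump_backtrace` helpers and the output file name are illustrative. Note that the coredump listing above marks the stored core as truncated, so gdb may not be able to unwind it fully, and without debug symbols for PostgreSQL and TimescaleDB the frames will mostly be raw addresses.

```shell
#!/usr/bin/env bash
# Sketch: write a stored core to a file and print its backtrace with gdb.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# dump_backtrace PID BINARY
dump_backtrace() {
  local pid="$1" binary="$2" core="core.$1"
  # Extract the core belonging to PID into a regular file.
  run coredumpctl --output="$core" dump "$pid"
  # Load binary and core, print the backtrace, then exit.
  run gdb -batch -ex bt "$binary" "$core"
}
```

For the crash in this report the call would be `dump_backtrace 835417 /usr/lib/postgresql/14/bin/postgres`; prefixing it with `DRY_RUN=1` only prints the two commands.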
