Skip to content

Database

Cedric Bosdonnat edited this page Feb 7, 2023 · 10 revisions

PostgreSQL Tips and Tricks

HOWTO monitor currently active queries

spacewalk-sql --select-mode - <<<"SELECT pid, age(clock_timestamp(), query_start), usename, query FROM pg_stat_activity WHERE query != '<IDLE>' AND query NOT ILIKE '%pg_stat_activity%' ORDER BY query_start DESC;"

To continuously log active queries to a file, use:

while sleep 1; do spacewalk-sql --select-mode - <<<"SELECT pid, age(clock_timestamp(), query_start), usename, query FROM pg_stat_activity WHERE query != '<IDLE>' AND query NOT ILIKE '%pg_stat_activity%' ORDER BY query_start DESC;" | tee -a /tmp/query.log; done

HOWTO monitor currently open Postgres connections

TCP socket connections

netstat --numeric-ports | grep localhost:5432 | awk '{print $5}' | grep -v localhost:5432 | awk -F ':' '{print $2 "/tcp"}' | xargs fuser 2>/dev/null | tr ' ' '\n' | xargs -L 1 ps -o args --no-headers -p

Detail:

  • get the active TCP connections to Postgres: netstat --numeric-ports | grep localhost:5432;
  • filter only the outgoing ones (being on localhost you will see both incoming and outgoing, so you basically have to discard one half of the lines): awk '{print $5}' | grep -v localhost:5432;
  • get the (unique, source) port numbers: awk -F ':' '{print $2 "/tcp"}';
  • get the PIDs of processes using those ports: xargs fuser 2>/dev/null | tr ' ' '\n';
  • get those PIDs full command line: xargs -L 1 ps -o args --no-headers -p.

Those should all be Java processes coming from Taskomatic, Tomcat or the search process. Each of those processes has a pool with 5 (minimum) to 20 (maximum) connections. See: /usr/share/rhn/config-defaults/rhn_hibernate.conf

Unix socket connections

fuser /tmp/.s.PGSQL.5432 2>/dev/null | tr ' ' '\n' | xargs -L 1 ps -o args --no-headers -p

Note that the kernel will not let you know who is using the socket besides Postgres itself. For reference:

Credits: http://unix.stackexchange.com/questions/16300/whos-got-the-other-end-of-this-unix-socketpair

2-3 of these are taken by jabberd and the rest by httpd (Python and Perl server-side components). Every httpd subprocess consumes one.

HOWTO log slow queries

Add log_min_duration_statement to the configuration file and reload it:

echo "log_min_duration_statement = 1000" >> /var/lib/pgsql/data/postgresql.conf 
spacewalk-sql --select-mode - <<<"SELECT pg_reload_conf();"

Queries taking longer than 1 second will appear in logs with their durations. To check:

tail -f /var/lib/pgsql/data/log/`ls -t /var/lib/pgsql/data/log/ | head -1`

HOWTO analyze a query plan with PEV2

A very nice tool to get a visual, browseable representation of a query plan is PEV2.

Easiest way to use it is:

  • prepare a file with the query with an EXPLAIN preamble:
cat >explain.sql <<EOF
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)
  SELECT ...;
EOF
  • run the query via psql to capure the output in JSON format:
export PGUSER=`grep db_user /etc/rhn/rhn.conf | grep -v "report" | grep -oP "db_user ?= ?\K.*"`
export PGPASSWORD=`grep db_password /etc/rhn/rhn.conf | grep -v "report" | grep -oP "db_password ?= ?\K.*"`
export PGDATABASE=`grep db_name /etc/rhn/rhn.conf | grep -v "report" | grep -oP "db_name ?= ?\K.*"`
psql --quiet --no-align --tuples-only --file=explain.sql > explain.json
  • drag explain.json into

https://explain.dalibo.com

HOWTO list database tables by disk size occupation

Copy and paste:

date

printf "\n\n\n**** DATABASE SIZES\n"
du -h /var/lib/pgsql/data

printf "\n\n\n**** TABLE SIZES\n"
spacewalk-sql --select-mode - <<<"
  WITH table_sizes AS (
    SELECT CAST(relname AS TEXT) AS table_name,
        pg_size_pretty(pg_total_relation_size(C.oid)) AS size,
        pg_total_relation_size(C.oid) AS ordering
      FROM pg_class C
      LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
      WHERE nspname NOT IN ('pg_catalog', 'information_schema')
        AND C.relkind <> 'i'
        AND nspname !~ '^pg_toast'
      ORDER BY ordering DESC
      LIMIT 20
  ),
  total_size AS (
    SELECT CAST('**TOTAL**' AS TEXT) as table_name,
      pg_size_pretty(CAST(SUM(pg_total_relation_size(C.oid)) AS BIGINT)) AS size,
      -1 AS ordering
    FROM pg_class C
    LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
    WHERE nspname NOT IN ('pg_catalog', 'information_schema')
      AND C.relkind <> 'i'
      AND nspname !~ '^pg_toast'
  ),
  all_sizes AS (
    SELECT * FROM table_sizes UNION ALL (SELECT * FROM total_size)
  )
  SELECT table_name, size
    FROM all_sizes
    ORDER BY ordering DESC;
"

HOWTO get a database dump

There are different ways to get database dumps:

  1. pg_dump: creates a big SQL file that can be run to re-create the DB. This is a compact representation and it can be restored on any later Postgres version, so it's ideal for data analysis. Copy and paste:
mkdir /tmp/backup
chown postgres:postgres /tmp/backup
chmod 700 /tmp/backup
su - postgres -c "pg_dump --create `grep -oP '^db_name ?= ?\K.*' /etc/rhn/rhn.conf` | gzip > /tmp/backup/pg_dump.gz"
  1. pg_basebackup through smdba: creates copies of internal PostgreSQL files that can be used as-is. It is less compact than pg_dump, it requires the same PostgreSQL version but it preserves some lower level properties, like previous versions of rows, stale space in tables, etc. It is thus more useful in case of performance issues. Moreover many users already use smdba for their own backup purposes, so it might be easier to obtain one.
mkdir /tmp/backup
chown postgres:postgres /tmp/backup
chmod 700 /tmp/backup
smdba backup-hot --enable=on --backup-dir=/tmp/backup
smdba backup-hot --enable=off
  1. tar of the data directory with the service down. Simplest, closest to reality, requires downtime.
rcpostgresql stop
tar czvf /tmp/raw_copy.tar.gz /var/lib/pgsql

HOWTO restore a database dump taken from pg_dump

Create a server with the matching version (eg. using sumaform and stop all services:

spacewalk-service stop
rcsalt-master stop

Then take a look at the dump and see who the main user is (in this example uyuni):

zless pg_dump.gz

...
ALTER SCHEMA logging OWNER TO uyuni;
...

Create that user with a password (susemanager in the example), and re-create the destination database ('susemanager' in the example, which is sumaform's default):

su - postgres

psql
CREATE USER susemanager WITH ENCRYPTED PASSWORD 'susemanager';
ALTER ROLE susemanager superuser;
DROP DATABASE susemanager;
CREATE DATABASE susemanager;
\q

Restore the dump to the destination database ('susemanager' in the example):

gzip -d -c /tmp/pg_dump.gz | psql susemanager --set ON_ERROR_STOP=on

BIG DUMP NOTE: if your dump is in the gigabytes of size, copying it to the server and decompressing it might be very slow and require a lot of disk space. In those cases it is more practical to do the decompression on your host with pigz, which decompresses gzipped files using multiple cores/threads, and transfer the result to a socket to be fed directly into psql.

On the destination server run: nc -l 8123 | psql susemanager --set ON_ERROR_STOP=on

(nc comes from the netcat-openbsd package and waits for data on port 8123. More about netcat)

On your host run: unpigz <pg_dump.gz | nc -N server.tf.local 8123

(unpigz comes from the pigz package and decompresses gzipped files using multiple cores/threads. More about pigz)

You can omit the --set ON_ERROR_STOP=on if you are OK with a non-perfect restore (eg. for data analysis purposes you might not want to stop the process if pl/tcl is missing).

User passwords and email addresses

Now change all registered users' passwords to admin and all email addresses to a dummy address, otherwise the user will get bombed with error emails.

psql susemanager
BEGIN;
UPDATE web_contact set password='$1$IrobqEPV$9lT.npWHmOezFreAJg8Dw/';
UPDATE web_user_personal_info set email = 'dummy@dummy.dummy';
COMMIT;
SELECT
  wc.login,
  string_agg(g.name, ', ') AS groups
FROM web_contact wc
  LEFT JOIN rhnUserGroupMembers gm ON gm.user_id = wc.id
  LEFT JOIN rhnUserGroup g ON g.id = gm.user_group_id
GROUP BY login
ORDER BY login;
\q

Note usernames if you want to log in later.

If you need to use the Web UI, adjust /etc/rhn/rhn.conf connection parameters and /var/lib/pgsql/data/pg_hba.conf according to the possibly changed username, password and database name. Example:

vim /etc/rhn/rhn.conf
...
db_backend = postgresql
db_user = susemanager
db_password = susemanager
db_name = susemanager
db_host = localhost
db_port = 5432


# check pg_hba.conf. There should be lines like the following (username is the third column):
vim /var/lib/pgsql/data/pg_hba.conf
local   susemanager     susemanager     md5
host    susemanager     susemanager     127.0.0.1/8     md5
host    susemanager     susemanager     ::1/128 md5
...

rcpostgresql restart
rcsalt-master start
spacewalk-service start

HOWTO Restore a database dump from an smdba or a plain tarball

spacewalk-service stop
rcsalt-master stop
rcpostgresql stop

# move away existing data
mv /var/lib/pgsql/data /var/lib/pgsql/data_orig
mkdir /var/lib/pgsql/data

# replace with data from archive
cd /var/lib/pgsql/data
tar xvf ~/base.tar.gz
chown -R postgres:postgres /var/lib/pgsql/data
chmod -R 0700 /var/lib/pgsql/data

# replace base tuning configuration, hardware is likely different
cp /var/lib/pgsql/data/postgresql.conf /var/lib/pgsql/data/postgresql.orig
smdba system-check autotuning --max_connections=50

# allow access from other hosts
echo listen_addresses = \'*\' >> /var/lib/pgsql/data/postgresql.conf
echo host    all     all       0.0.0.0/0      md5 >> /var/lib/pgsql/data/pg_hba.conf

rcpostgresql start

# list Postgres users and change the password to 'susemanager'
su -l postgres
psql
\du

# there should be two roles: "postgres" and a user one (typically "susemanager")
# from here on, it is assumed "susemanager" is the user name. If it is not, please adapt
# following instructions accordingly

ALTER USER susemanager WITH ENCRYPTED PASSWORD 'susemanager';
ALTER ROLE susemanager superuser;
\q

# check pg_hba.conf. There should be lines like the following (username is the third column):
# local   susemanager     susemanager     md5
# host    susemanager     susemanager     127.0.0.1/8     md5
# host    susemanager     susemanager     ::1/128 md5
# ...

vim /var/lib/pgsql/data/pg_hba.conf

# change all SUSE Manager users' passwords to 'admin'
# and all email addresses. Otherwise the user will get bombed with error emails
PGPASSWORD=susemanager psql -h localhost -d susemanager -U susemanager
BEGIN;
UPDATE web_contact set password='$1$IrobqEPV$9lT.npWHmOezFreAJg8Dw/';
UPDATE web_user_personal_info set email = 'dummy@dummy.dummy';
COMMIT;

SELECT
  wc.login,
  string_agg(g.name, ', ') AS groups
FROM web_contact wc
  LEFT JOIN rhnUserGroupMembers gm ON gm.user_id = wc.id
  LEFT JOIN rhnUserGroup g ON g.id = gm.user_group_id
GROUP BY login
ORDER BY login;
\q

exit

# double check that database credentials are correct (in particular DB username and password)
vim /etc/rhn/rhn.conf

rcsalt-master start
spacewalk-service start

BIG DUMP NOTE: if your dump is in the gigabytes of size, copying it to the server and decompressing it might be very slow and require a lot of disk space. In those cases it is more practical to do the decompression on your host with pigz, which decompresses gzipped files using multiple cores/threads, and transfer the result to a socket to be fed directly into psql.

On the destination server run: nc -l 8123 | tar -xv

(nc comes from the netcat-openbsd package and waits for data on port 8123. More about netcat)

On your host run: unpigz <pg_dump.gz | nc -N server.tf.local 8123

(unpigz comes from the pigz package and decompresses gzipped files using multiple cores/threads. More about pigz)

HOWTO: Change the Database name

  1. stop services

spacewalk-service stop

  1. access the database not from the one you want to rename

sudo -u postgres psql

  1. list all databases available and check there is the one you want to rename

\list

  1. rename the database

alter database FOO rename to BAR;

  1. check that the database has been renamed

\list

  1. logout from postgres

\q

  1. change the database name in the rhn.conf file

vim /etc/rhn/rhn.conf

  1. change the database name in pg_hba.conf

vim /var/lib/pgsql/data/pg_hba.conf

  1. restart services

spacewalk-service start

HOWTO Check VACUUM/ANALYZE status

Postgres tables need to be periodically VACUUMed (to reclaim space from DELETEd rows) and ANALYZEd (to update statistics that will be used by the query planner).

Postgres >= 9 has a daemon called autovacuum that automatically runs both in the background, when needed, enabled by default.

Autovacuum has some tweakable settings:

  • naptime: minimum time between a check and another;
  • max_workers: number of concurrent instances;
  • threshold and scale_factor: the deamon will run VACUUM only if the table changed row estimate is >= threshold + row count estimate * scale_factor;
  • analyze_threshold and analyze_scale_factor: same, for ANALYZE.

You can check those settings with the following commands (can be run with spacewalk-sql -i, defaults in comments):

SHOW autovacuum_naptime; -- 1min
SHOW autovacuum_max_workers; -- 3
SHOW autovacuum_analyze_threshold; -- 50
SHOW autovacuum_analyze_scale_factor; -- 0.1

Those are typically reasonable but can be inadequate for very large tables depending on the access pattern (in that case, per table settings can be added).

You can check the latest autovacuum date/time with the following query:

SELECT relname, last_analyze, last_autoanalyze FROM pg_stat_user_tables;

You can check the (in)accuracy of row count estimates with the following script:

CREATE FUNCTION count_rows(name) RETURNS INTEGER AS $$
DECLARE
  table_name ALIAS FOR $1;
  result INTEGER;
BEGIN
    EXECUTE 'SELECT COUNT(*) FROM ' || table_name INTO result;
    RETURN result;
END;
$$ LANGUAGE plpgsql;

SELECT pg_stat_user_tables.relname,
    count_rows(pg_stat_user_tables.relname) AS count,
    reltuples AS estimated_count
    FROM pg_stat_user_tables
        JOIN pg_class ON
            pg_stat_user_tables.relid = pg_class.oid
    ORDER BY count DESC;

To go deeper, run the postgres_autovacuum_stats.sh script, which will return:

  • the configured autovacuum threshold for each table
  • whether that threshold was crossed (hence an autovacuum visit is expected soon)
  • the timestamp of the last autovacuum visit
  • the timestamp of the last (manual) vacuum
  • the count of autovacuum visits
  • the count of (manual) vacuum visits
  • all of the above data for analyze/autoanalyze as well

HOWTO: Find all tables that can have dependent rows to be deleted

Use the following script to find out all tables that have FOREIGN KEY... ON DELETE NO ACTION constraints on a table, run in spacewalk-sql -i:

    SELECT DISTINCT (to_table.relname)
    FROM
      pg_trigger
        JOIN pg_class from_table
          ON (pg_trigger.tgrelid = from_table.oid)
        JOIN pg_class to_table
          ON (pg_trigger.tgconstrrelid = to_table.oid)
        JOIN pg_proc
          ON (pg_trigger.tgfoid = pg_proc.oid)
        JOIN pg_constraint
          ON pg_trigger.tgconstraint = pg_constraint.oid
    WHERE
      pg_trigger.tgname ~ '^RI_' AND
      pg_proc.proname like '%noaction_del%' AND 
      from_table.relname = 'rhnserver';

Replace 'rhnserver' with the table you are interested in.

HOWTO: Get a view on locks

The postgres_locks.sh shell script will output:

  • a table with a list of processes blocking one another like the following:
 blocked_pid | blocked_user | blocking_pid | blocking_user |               blocked_statement               | current_statement_in_blocking_process
-------------+--------------+--------------+---------------+-----------------------------------------------+---------------------------------------
        6049 | spacewalk    |         5997 | spacewalk     | update rhnchannelpackage set channel_id = 1 ; | delete from rhnchannelpackage;
  • the output of ps for all processes (you can look up by PIDs above)
  • the output of netstat with process information (you can look up which sockets were used by the PIDs above, and from them the process originating the query, eg. Tomcat or Taskomatic). This works for TCP connections to the database
  • the output of ss for Unix sockets

HOWTO: diff databases data

You might want to observe differences between two datbases sharing the same schema, for example before and after an operation. This is one way to get a list of any tables that changed.

Collect a dump with full, individual INSERT lines from the first database:

mkdir /tmp/backup
chown postgres:postgres /tmp/backup
chmod 700 /tmp/backup
su - postgres -c "pg_dump --data-only --inserts --rows-per-insert=1 --column-inserts `grep -oP "db_name ?= ?\K.*" /etc/rhn/rhn.conf` > /tmp/backup/before.sql"

Repeat the procedure with the second database:

mkdir /tmp/backup
chown postgres:postgres /tmp/backup
chmod 700 /tmp/backup
su - postgres -c "pg_dump --data-only --inserts --rows-per-insert=1 --column-inserts `grep -oP "db_name ?= ?\K.*" /etc/rhn/rhn.conf` > /tmp/backup/after.sql"

Now collect before.sql and after.sql, they can be diff-ed as is or, in the steps below, used to capture only the list of changed tables:

sort before.sql >before_sorted.sql
sort after.sql >after_sorted.sql
diff before_sorted.sql after_sorted.sql > diff.sql
cat diff.sql | grep -i "insert into" | awk '{print $4}' | sort -u > changed_table_list

Guideline: Use a PL/SQL debugger

Step-by-step debugging of PL/SQL stored procedures exists, but it needs a PostgreSQL plugin which is not packaged at the moment, and was last tested in 9.1, so there are no precise instructions at this time.

At a high level:

  • install the postgresql-$VERSION-devel package, gcc, make. Other libraries might be needed
  • get the full Postgres sources, which are needed, from ftp://ftp.postgresql.org/pub/source
  • compile Postgres itself
  • clone git://git.postgresql.org/git/pldebugger.git in the contrib/ directory
  • compile pldebugger
  • add "shared_preload_libraries = '$libdir/plugin_debugger' to /var/lib/pgsql/data/postgresql.conf in order to load the plugin
  • restart Postgres
  • run CREATE EXTENSION pldbgapi;

At this point it is possible to connect a PostgreSQL client with debug support - for example pgAdmin 4.

Clone this wiki locally