Releases · pganalyze/collector
v0.40.0
- Update to pg_query_go v2.0.4
- Normalize: Don't touch "GROUP BY 1" and "ORDER BY 1" expressions, keep original text
- Fingerprint: Cache list item hashes to fingerprint complex queries faster
(this change also significantly reduces memory usage for complex queries)
- Install script: Support CentOS in addition to RHEL
v0.39.0
- Docker: Use Docker's USER instruction to set the user, to support running as non-root
- This enables the collector container to run in environments that require the
whole container to run as a non-root user, which previously was not the case.
- For compatibility reasons the container can still be run as root explicitly,
in which case the setpriv command is used to drop privileges. setpriv replaces
gosu since it's available for installation in most distributions directly, and
fulfills the same purpose here.
- Selfhosted: Support running log discovery with non-localhost db_host settings
- Previously this was prevented by a fixed check against localhost/127.0.0.1,
but sometimes one wants to refer to the local server by a non-local IP address
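For illustration, a minimal sketch of this setup in the collector's INI config, with hypothetical server name and addresses (the collector runs on the database server itself, but refers to it by a routable IP):

```ini
# pganalyze-collector.conf (sketch; names and addresses are made up)
[local_via_ip]
db_host = 10.0.1.5         # non-localhost address that refers to the local server
db_name = mydb
db_username = pganalyze
# Log discovery now runs for this server as well, instead of requiring
# db_host to be localhost/127.0.0.1
```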
- AWS: Add support for AssumeRoleWithWebIdentity
- This is useful when running the collector inside EKS in order to access
AWS resources, as recommended by AWS: https://docs.aws.amazon.com/eks/latest/userguide/specify-service-account-role.html
- Statement stats retrieval: Get all rows first, before fingerprinting queries
- This avoids showing a bogus ClientWrite event on the Postgres server side whilst
the collector is running the fingerprint method. There is a trade-off here,
because we now need to retrieve all statement texts (for the full snapshot) before
doing the fingerprint, leading to a slight increase in memory usage. Nonetheless,
this improves debuggability, and avoids bogus statement timeout issues.
- Track additional meta information about guided setup failures
- Fix reporting of replication statistics for more than 1 follower
v0.38.1
- Update to pg_query_go 2.0.2
- Normalize: Fix handling of two subsequent DefElems (resolves rare crashes)
- Redact primary_conninfo setting if present and readable
- This can contain sensitive information (full connection string to the
primary), and pganalyze does not do anything with it right now. In the
future, we may partially redact this and use primary hostname
information, but for now, just fully redact it.
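As an illustration of why this is redacted, primary_conninfo on a standby typically looks something like the following (hypothetical values):

```ini
# postgresql.conf on a standby (illustrative values only)
primary_conninfo = 'host=primary.internal port=5432 user=replicator password=SECRET sslmode=require'
```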
v0.38.0
- Update to pg_query 2.0 and Postgres 13 parser
- This is a major upgrade in terms of supported syntax (Postgres 10 to 13),
as well as a major change in the fingerprints, which are now shorter and
not compatible with the old format.
- When you upgrade to this version of the collector you will see a break
in statistics, that is, you will see new query entries in pganalyze after
adopting this version of the collector.
- Amazon RDS: Support long log events beyond 2,000 lines
- Resolves edge cases where very long EXPLAIN plans would be ignored since
they exceeded the previous 2,000-line limit
- We now ensure that we go back up to 10 MB in the file with each log
download that happens, with support for log events that exceed the RDS API
page size limit of 10,000 log lines
- Self-managed: Also check for the process name "postmaster" when looking for
Postgres PID (fixes data directory detection for RHEL-based systems)
v0.37.1
- Docker builds: Increase stack size to 2MB to prevent rare crashes
- Alpine has a very small stack size by default (80kb), which is less than
the default that Postgres expects (100kb). Since there is no good reason
to reduce it to such a small amount, increase it to the common Linux
default of 2MB stack size.
- This would have surfaced as a hard crash of the Docker container with
error code 137 or 139, easily confused with out-of-memory errors, but
clearly distinct from them.
- Reduce timeout for accessing EC2 instance metadata service
- Previously we were re-using our shared HTTP client, which has a rather
high timeout (120 seconds) that causes the HTTP client to wait around
for a long time. This is generally intentional (since it includes the
time spent downloading a request body), but is a bad idea when running
into EC2's IMDSv2 service that has a network-hop based limit. If that
hop limit is exceeded, the requests just go to nowhere, causing the
client to wait for a multiple of 120 seconds (~10 minutes were observed).
- Don't use pganalyze query marker for "--test-explain" command
- The marker means the resulting query gets hidden from the EXPLAIN plan
list, which we don't want for this test query - it's intentional
that we can see the EXPLAIN plan we're generating for the test.
v0.37.0
- Add support for receiving logs from remote servers over syslog
- You can now specify the new "db_log_syslog_server" config setting, or
"LOG_SYSLOG_SERVER" environment variable in order to setup the collector
as a syslog server that can receive logs from a remote server via syslog
to the server that runs the collector. - Note that the format of this setting is "listen_address:port", and its
recommended to use a high port number to avoid running the collector as root. - For example, you can specify "0.0.0.0:32514" and then send syslog messages
to the collector's server address at port 32514. - Note that you need to use protocol RFC5424, with an unencrypted TCP
connection. Due to syslog not being an authenticated protocol it is
recommended to only use this integration over private networks.
- You can now specify the new "db_log_syslog_server" config setting, or
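Putting the above together, a minimal sketch of the collector config (the section name and connection settings are hypothetical; the syslog setting and port come from the notes above):

```ini
# pganalyze-collector.conf (sketch)
[remote_server]
db_host = 10.0.0.12
db_name = mydb
db_username = pganalyze
# Listen on all interfaces, port 32514, for RFC5424 syslog messages over
# unencrypted TCP (only use this over private networks)
db_log_syslog_server = 0.0.0.0:32514
```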
- Add support for "pid=%p,user=%u,db=%d,app=%a,client=%h " and
"user=%u,db=%d,app=%a,client=%h " log_line_prefix settings- This prefix misses a timestamp, but is useful when sending data over syslog.
- Log parsing: Correctly handle %a containing commas/square brackets
- Note that this does not support all cases since Go's regexp engine
does not support negative lookahead, so we can't handle an application
name containing a comma if the log_line_prefix has a comma following %a.
- Ignore CSV log files in log directory #83
- Some Postgres installations are configured to log both standard-format
log files and CSV log files to the same directory, but the collector
previously read all files specified in a db_log_location, which worked
poorly with this setup.
- Tweak collector sample config file to match setup instructions
- Improvements to "--discover-log-location"
- Don't keep running if there's a config error
- Drop the log_directory helper command and just fetch the setting from Postgres
- Warn and only show relative location if log_directory is inside
the data directory (this requires special setup steps to resolve)
- Improvements to "--test-logs"
- Run privilege drop test when running log test as root, to allow running
"--test-logs" for a complete log setup test, avoiding the need to run
a full "--test"
- Update pg_query_go to incorporate memory leak fixes
- Check whether pg_stat_statements exists in a different schema, and give a
clear error message
- Drop support for Postgres 9.2
- Postgres 9.2 has been EOL for almost 4 years
- Update to Go 1.16
- This introduces a change to Go's certificate handling, which may break
certain older versions of Amazon RDS certificates, as they do not
include a SAN. When this is the case you will see an error message like
"x509: certificate relies on legacy Common Name field". - As a temporary workaround you can run the collector with the
GODEBUG=x509ignoreCN=0 environment setting, which ignores these incorrect
fields in these certificates. For a permanent fix, you need to update
your RDS certificates to include the correct SAN field: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.SSL-certificate-rotation.html
v0.36.0
- Config parsing improvements:
- Fail fast when pganalyze section is missing in config file
- Ignore duplicates in db_name config setting
- Previously this could cause malformed snapshots that would be submitted
correctly but could not be processed
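For instance, with hypothetical database names (assuming the comma-separated list form of db_name), a duplicated entry like this is now ignored rather than producing a malformed snapshot:

```ini
# pganalyze-collector.conf (sketch)
[server1]
db_name = mydb, reporting, mydb   # the duplicate "mydb" is now ignored
```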
- Validate db_url parsing to avoid collector crash with invalid URLs
- Include pganalyze-collector-setup program (see 0.35 release notes) in supported packages
- Rename <unidentified queryid> query text placeholder to <query text unavailable>
- This makes it clearer what the underlying issue is
- Revert to using <truncated query> instead of <unparsable query> in some situations
- When a query is cut off due to pg_stat_activity limit being reached,
show <truncated query>, to make it clear that increasing track_activity_query_size
would solve the issue
- Ignore I/O stats for AWS Aurora utility statements
- AWS Aurora appears to report incorrect blk_read_time and blk_write_time values
for utility statements (i.e., non-SELECT/INSERT/UPDATE/DELETE); we zero these out for now
- Fix log-based EXPLAIN bug where query samples could be dropped if EXPLAIN failed
- Add U140 log event (inconsistent range bounds)
- e.g.: ERROR: range lower bound must be less than or equal to range upper bound
- Fix issue where incomplete schema information in snapshots was not marked correctly
- This could lead to schema objects disappearing and being re-created
- Fix trailing newline handling for GCP and self-hosted log streams
- This could lead to queries being poorly formatted in the UI, or some queries
with single-line comments being ignored
- Include additional collector configuration settings in snapshot metadata for diagnostics
- Ignore "insufficient privilege" queries w/o queryid
- Previously, these could all be aggregated together yielding misleading stats
v0.35.0
- Add new "pganalyze-collector-setup" program that streamlines collector installation
- This is initially targeted for self-managed servers to make it easier to set up
the collector and required configuration settings for a locally running Postgres
server
- To start, this supports the following environments:
- Postgres 10 and newer, running on the same server as the collector
- Ubuntu 14.04 and newer
- Debian 10 and newer
- Collector test: Show server URLs to make it easier to access the servers in
pganalyze after the test
- Collector test+reload: In case of errors, return exit code 1
- Ignore manual vacuums if the collector can't access pg_stat_progress_vacuum
- Don't run log test for Heroku, instead provide info message
- Also fixes "Unsupported log_line_prefix setting: ' sql_error_code = %e '"
error on Heroku Postgres
- Also fixes "Unsupported log_line_prefix setting: ' sql_error_code = %e '"
- Add pganalyze system user to adm group in Debian/Ubuntu packages
- This gives the collector permission to read Postgres log files in a default
install, simplifying Log Insights setup
- Handle NULL parameters for query samples correctly
- Add a skip_if_replica / SKIP_IF_REPLICA option (#117)
- You can use this to configure the collector in a no-op mode on
replicas (we only check whether the monitored database is a replica), and
automatically switch to active monitoring when the database is no
longer a replica.
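A hedged sketch of how this might look in the config file (section and connection settings are hypothetical; the true/false syntax is an assumption based on the collector's other boolean settings):

```ini
# pganalyze-collector.conf (sketch)
[standby1]
db_host = 10.0.0.21
db_name = mydb
db_username = pganalyze
# No-op while this server is a replica; monitoring starts automatically
# after promotion
skip_if_replica = true
```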
- Stop building packages for CentOS 6 and Ubuntu 14.04 (Trusty)
- Both of these systems are now end of life, and the remaining survivor
of the CentOS 6 line (Amazon Linux 1) will be EOL on December 31st 2020.
v0.34.0
- Check and report problematic log collection settings
- Some Postgres settings almost always cause a drastic increase in log
volume for little actual benefit. They tend to cause operational problems
for the collector (due to the load of additional log parsing) and the
pganalyze service itself (or indeed, likely for any service that would
process collector snapshots), and do not add any meaningful insights.
Furthermore, we found that these settings are often turned on
accidentally.
- To avoid these issues, add some client-side checks in the collector to
disable log processing if any of the problematic settings are on.
- The settings in question are:
- log_min_duration_statement less than 10ms
- log_statement set to 'all'
- log_duration set to 'on'
- log_error_verbosity set to 'verbose'
- If any of these are set to these unsupported values, all log collection will be
disabled for that server. The settings are re-checked every full snapshot, and can be
explicitly re-checked with a collector reload.
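For reference, postgresql.conf values that stay within these limits look like this (example values based on the thresholds above, not tuning recommendations):

```ini
# postgresql.conf - settings that keep collector log processing enabled
log_min_duration_statement = 1000   # must be at least 10ms (here: 1s); -1 disables
log_statement = 'none'              # 'all' disables log collection
log_duration = off                  # 'on' disables log collection
log_error_verbosity = default       # 'verbose' disables log collection
```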
- Log Insights improvements
- Self-managed server: Process logs every 3 seconds, instead of on-demand
- Self-managed server: Improve handling of multi-line log events
- Google Cloud SQL: Always acknowledge Pub Sub messages, even if collector doesn't handle them
- Optimize stitching logic for reduced CPU consumption
- Explicitly close temporary files to avoid running out of file descriptors
- Multiple changes to improve debugging in support situations
- Report collector config in full snapshot
- This reports certain collector config settings (except for passwords/keys/credentials)
to the pganalyze servers to help with debugging.
- Print collector version at beginning of test for better support handling
- Print collection status and Postgres version before submitting snapshots
- Change panic stack trace logging from Verbose to Warning
- Add support for running the collector on ARM systems
- Note that we don't provide packages yet, but with this the collector
can be built on ARM systems without any additional patches.
- Introduce API system scope fallback
- This fallback is intended to allow changing the API scope, either based
on user configuration (e.g. moving the collector between different
cloud provider accounts), or because of changes in the collector's system
identification logic.
- The new "api_system_scope_fallback" / PGA_API_SYSTEM_SCOPE_FALLBACK config
variable is intended to be set to the old value of the scope. When the
pganalyze backend receives a snapshot with a fallback scope set, and there
is no server created with the regular scope, it will first search the
servers with the fallback scope. If found, that server's scope will be
updated to the (new) regular scope. If not found, a new server will be
created with the regular scope. The main goal of the fallback scope is to
avoid creating a duplicate server when changing the scope value.
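A sketch of a scope migration using this mechanism (the api_system_scope setting name for the regular scope is an assumption here, and the scope values are made up):

```ini
# pganalyze-collector.conf (sketch)
[pganalyze]
api_key = YOUR_API_KEY
# New (regular) scope after moving the collector to another account
api_system_scope = aws-account-b
# Old scope, so the pganalyze backend can find and migrate the existing server
api_system_scope_fallback = aws-account-a
```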
- Use new fallback scope mechanism to change scope for RDS databases
- Previously we identified RDS databases by their ID and region only, but
the ID does not have to be unique within a region, it only has to be
unique within the same AWS account in that region. Thus, adjust the
scope to include both the region and AWS Account ID (if configured or
auto-detected), and use the fallback scope mechanism to migrate existing
servers.
- Add support for GKE workload identity (Yash Bhutwala, #91)
- Add support for assuming AWS instance roles
- Set the role to be assumed using the new aws_assume_role / AWS_ASSUME_ROLE
configuration setting. This is useful when the collector runs in a different
AWS account than your database.
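For example, a hedged sketch with a made-up role ARN and instance ID (aws_db_instance_id as the setting for identifying the RDS instance is an assumption):

```ini
# pganalyze-collector.conf (sketch)
[rds_server]
aws_db_instance_id = my-rds-instance
# Role in the database's AWS account that the collector assumes
aws_assume_role = arn:aws:iam::123456789012:role/pganalyze-monitoring
```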
v0.33.1
- Ignore internal admin databases for GCP and Azure
- This avoids collecting data from these internal databases, which produces
unnecessary errors when using the all databases setting.
- Add log_line_prefix check to GCP self-test
- Schema stats handling: Avoid crash due to nil pointer dereference
- Add support for "%m [%p]: [%l-1] db=%d,user=%u " log_line_prefix