This repository has been archived by the owner on Oct 23, 2024. It is now read-only.

internal-metrics

Monitor Type: internal-metrics (Source)

Accepts Endpoints: Yes

Multiple Instances Allowed: Yes

Overview

Emits metrics about the internal state of the agent. Useful for debugging performance issues with the agent and for ensuring that the agent isn't overloaded.

This monitor can also scrape any HTTP endpoint that exposes metrics as a JSON array of JSON-formatted SignalFx datapoint objects. It is roughly analogous to the prometheus-exporter monitor, except that it handles SignalFx datapoints.

monitors:
  - type: internal-metrics

Configuration

To activate this monitor in the Smart Agent, add the following to your agent config:

monitors:  # All monitor config goes under this key
 - type: internal-metrics
   ...  # Additional config

For a list of monitor options that are common to all monitors, see Common Configuration.

| Config option | Required | Type | Description |
| --- | --- | --- | --- |
| host | no | string | Defaults to the top-level internalStatusHost option |
| port | no | integer | Defaults to the top-level internalStatusPort option |
| path | no | string | The HTTP request path to use to retrieve the metrics (default: /metrics) |
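
As a rough sketch of how these options fit together, the config below runs two instances of the monitor: one that relies on the top-level defaults, and one that points at a separate endpoint. The host address, port numbers, and the /custom/metrics path are illustrative assumptions, not values taken from this document; substitute whatever your environment actually uses.

```yaml
# Illustrative values only: every host, port, and path here is an assumption.
internalStatusHost: localhost   # used when a monitor instance omits `host`
internalStatusPort: 8095        # used when a monitor instance omits `port`

monitors:
  # Instance 1: no host/port/path given, so the top-level options above
  # and the default path (/metrics) apply.
  - type: internal-metrics

  # Instance 2: scrapes a hypothetical service that exposes SignalFx
  # datapoints as a JSON array at a custom path.
  - type: internal-metrics
    host: 10.0.0.5
    port: 9090
    path: /custom/metrics
```

Because multiple instances are allowed, the second entry does not replace the first; each one is scraped independently.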

Metrics

These are the metrics available for this monitor. This monitor emits all metrics by default; however, none are categorized as container/host -- they are all custom.

  • sfxagent.active_monitors (gauge)
    The total number of monitor instances actively working

  • sfxagent.active_observers (gauge)
    The number of observers configured and running

  • sfxagent.configured_monitors (gauge)
    The total number of monitor configurations

  • sfxagent.correlation_updates_client_errors (cumulative)
    The number of HTTP status code 4xx responses received while updating trace host correlations

  • sfxagent.correlation_updates_invalid (cumulative)
    The number of trace host correlation updates attempted against invalid dimensions

  • sfxagent.correlation_updates_retries (cumulative)
    The total number of times trace host correlation requests have been retried

  • sfxagent.datapoint_channel_len (gauge)
    The total number of datapoints that have been emitted by monitors but have yet to be accepted by the writer. This number should be 0 most of the time. This will max out at 3000, at which point no datapoints will be generated by monitors. If it does max out, it indicates a bug or extreme CPU starvation of the agent.

  • sfxagent.datapoint_requests_active (gauge)
    The total number of outstanding requests to ingest currently active. If this is consistently hovering around the writer.maxRequests setting, that setting should probably be increased to give the agent more bandwidth to send datapoints.

  • sfxagent.datapoints_failed (cumulative)
    The total number of datapoints that the agent writer tried to send but could not since it last started. This can be due to network failures or an incorrect access token, among other things.

  • sfxagent.datapoints_filtered (cumulative)
    The total number of datapoints that were filtered out in the writer. This does not include datapoints filtered by monitor-specific filters.

  • sfxagent.datapoints_in_flight (gauge)
    The total number of datapoints that have been sent out in a request to ingest but have yet to receive confirmation from ingest that they have been received (i.e. the HTTP response has not yet arrived).

  • sfxagent.datapoints_received (cumulative)
    The total number of non-filtered datapoints received by the agent writer since it last started. This number should generally equal sfxagent.datapoints_sent + sfxagent.datapoints_waiting + sfxagent.datapoints_in_flight, although sampling timing issues might cause a temporary mismatch.

  • sfxagent.datapoints_sent (cumulative)
    The total number of datapoints sent by the agent writer since it last started

  • sfxagent.datapoints_waiting (gauge)
    The total number of datapoints that have been accepted by the writer but have yet to be sent out to ingest over HTTP. If this continues to grow, it indicates that datapoints are not being sent out fast enough and that the writer.maxRequests setting should be increased.

  • sfxagent.dim_request_senders (gauge)
    Current number of worker goroutines active that can send dimension updates.

  • sfxagent.dim_updates_completed (cumulative)
    Total number of dimension property updates successfully completed

  • sfxagent.dim_updates_currently_delayed (gauge)
    Current number of dimension updates that are being delayed to avoid sending spurious updates due to flappy dimension property sets.

  • sfxagent.dim_updates_dropped (cumulative)
    Total number of dimension property updates that were dropped, due to an overfull buffer of dimension updates pending.

  • sfxagent.dim_updates_failed (cumulative)
    Total number of dimension property updates that failed for some reason. The failures should be logged.

  • sfxagent.dim_updates_flappy_total (cumulative)
    Total number of dimension property updates that ended up replacing a dimension property set that was being delayed.

  • sfxagent.dim_updates_started (cumulative)
    Total number of dimension property updates requests started, but not necessarily completed or failed.

  • sfxagent.discovered_endpoints (gauge)
    The number of discovered service endpoints. This includes endpoints that do not have any matching monitor configuration discovery rule.

  • sfxagent.events_buffered (gauge)
    The total number of events that have been emitted by monitors but have yet to be sent to SignalFx

  • sfxagent.events_sent (cumulative)
    The total number of events sent by the agent since it last started

  • sfxagent.go_frees (cumulative)
    Total number of heap objects freed throughout the lifetime of the agent

  • sfxagent.go_heap_alloc (gauge)
    Bytes of live heap memory (memory that has been allocated but not freed)

  • sfxagent.go_heap_idle (gauge)
    Bytes of memory that consist of idle spans (that is, completely empty spans of memory)

  • sfxagent.go_heap_inuse (gauge)
    Size in bytes of in use spans

  • sfxagent.go_heap_released (gauge)
    Bytes of memory that have been returned to the OS. This is quite often 0. sfxagent.go_heap_idle - sfxagent.go_heap_released is the memory that Go is retaining for future heap allocations.

  • sfxagent.go_heap_sys (gauge)
    Virtual memory size in bytes of the agent. This will generally reflect the largest heap size the agent has ever had in its lifetime.

  • sfxagent.go_mallocs (cumulative)
    Total number of heap objects allocated throughout the lifetime of the agent

  • sfxagent.go_next_gc (gauge)
    The target heap size -- GC tries to keep the heap smaller than this

  • sfxagent.go_num_gc (gauge)
    The number of GC cycles that have happened in the agent since it started

  • sfxagent.go_num_goroutine (gauge)
    Number of goroutines in the agent

  • sfxagent.go_stack_inuse (gauge)
    Size in bytes of spans that have at least one goroutine stack in them

  • sfxagent.go_total_alloc (cumulative)
    Total number of bytes allocated to the heap throughout the lifetime of the agent

The agent does not do any built-in filtering of metrics coming out of this monitor.
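
The descriptions of sfxagent.datapoint_requests_active and sfxagent.datapoints_waiting both suggest raising the writer.maxRequests setting when the writer cannot keep up. A minimal sketch of that change is shown below, assuming the dotted name maps to a maxRequests key nested under a top-level writer key. The value 20 is a hypothetical number chosen for illustration, not a recommendation from this document.

```yaml
writer:
  # Hypothetical value: increase gradually while watching
  # sfxagent.datapoints_waiting and sfxagent.datapoint_requests_active.
  maxRequests: 20

monitors:
  # Keep this monitor enabled so the effect of the change stays observable.
  - type: internal-metrics
```

If sfxagent.datapoints_waiting stops growing after the change, the writer is keeping up again. If it keeps growing, the bottleneck is probably elsewhere, for example the network failures mentioned under sfxagent.datapoints_failed.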