Performance of ondemand_exporter and polling frequency

I haven’t had to look into this too deeply, but how frequently does ondemand_exporter run? And are there any configuration options to change the behavior? I’d like to keep using it for stats collection, but maybe with less frequent polling.

While troubleshooting performance with many active users, I noticed a significant difference in dashboard responsiveness between having the exporter running and having it disabled.

I see many calls to /usr/bin/ruby /opt/rh/ondemand/root/usr/sbin/passenger-status --show=xml --pid-identifier

From a quick glance at the code base, it seems the poll frequency comes from another library and there are no flags to change it.

Maybe @tdockendorf knows more.

The frequency is determined by the Prometheus scrape interval; it’s not set by the exporter. Example:

- job_name: ondemand
  scrape_timeout: 50s
  relabel_configs:
  - source_labels: [__address__,host]
    regex: "([^.]+)..+;$"
    replacement: "$1"
    target_label: host
  file_sd_configs:
  - files:
    - "/etc/prometheus/file_sd_config.d/ondemand_*.yaml"

In our case we don’t define a job-specific scrape interval but instead rely on the global default:

global:
  scrape_interval: 1m
  scrape_timeout: 10s
  evaluation_interval: 1m
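
If only this exporter should be scraped less often than other targets, a job can carry its own scrape_interval, which overrides the global value for that job. A minimal sketch, reusing the job from above and assuming a 3m interval (an illustrative value, not taken from our config):

```yaml
- job_name: ondemand
  # Assumed value for illustration; overrides the global scrape_interval for this job only
  scrape_interval: 3m
  # The timeout must not be longer than the scrape interval
  scrape_timeout: 50s
  file_sd_configs:
  - files:
    - "/etc/prometheus/file_sd_config.d/ondemand_*.yaml"
```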

I don’t recommend scraping any less frequently than every 3-4 minutes; otherwise Prometheus will expire the metrics and you’ll have gaps in the graphs when using something like Grafana. If you want something like every 10 or 15 minutes but don’t want gaps, you can run the exporter scrape with cron and dump the metrics to a file picked up by the node exporter (prometheus/node_exporter on GitHub), for example through its textfile collector.


Thanks! That makes sense then. We’re not running Prometheus, but pulling the data every minute and putting it into Graphite. I had some calls to the /metrics page from the dashboard and Passenger apps that I believe were adding to the load.

For the record, the Passenger analytics collection sleep time might affect the usefulness of scraping passenger-status too often. See the new default 30s sleep time introduced in OSC/ondemand-packaging pull request #300, “Passenger patches for analytics to avoid dynamic sleeping by default”, by CSC-swesters.

I know Matt was part of the PR review already, but I wanted to link this here in case anyone finds this thread later on :)

If one wishes to set the sleep interval to something else, use the OOD_OVERRIDE_PASSENGER_ANALYTICS_COLLECTION_SLEEP_TIME_SECONDS environment variable in nginx_stage.yml. See this comment for an example.
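
As a rough sketch of what that could look like, assuming the variable is passed through the pun_custom_env mapping in nginx_stage.yml and that 60 seconds is the desired value (both are assumptions for illustration; the linked comment is the authoritative example):

```yaml
# /etc/ood/config/nginx_stage.yml (assumed default location)
pun_custom_env:
  # Assumed value for illustration: collect Passenger analytics every 60 seconds
  OOD_OVERRIDE_PASSENGER_ANALYTICS_COLLECTION_SLEEP_TIME_SECONDS: "60"
```

A change here would normally only apply to a user's PUN after it has been restarted.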


Simon
