Project author: simplifi

Project description:
Anemometer is a tool for running SQL queries and pushing results as metrics to Datadog.
Language: Go
Project URL: git://github.com/simplifi/anemometer.git
Created: 2020-05-27T17:35:44Z
Project community: https://github.com/simplifi/anemometer

License: MIT License


Anemometer

Anemometer is a tool for running SQL queries and pushing results as metrics to Datadog.

Why “Anemometer”

An anemometer is a device used for measuring wind speed and direction.

This project was originally created to help us monitor some tables in Airflow, but was later updated so it could work
generically with any database.

Supported Databases

We currently support the following databases:

  • Postgres
  • Vertica

Adding support for another database

Support for any of the databases listed here can be added fairly easily!

Setup

The latest version of Anemometer can be found on the Releases tab.

Example Configuration:

    statsd:
      address: 127.0.0.1:8125
      tags:
        - environment:production
    monitors:
      - name: airflow-dag-disabled
        database:
          type: postgres
          uri: postgresql://username:password@localhost:5432/database?sslmode=disable
        sleep_duration: 300
        metric: airflow.dag.disabled
        sql: >
          SELECT dag_id AS dag_id,
                 CASE WHEN is_paused AND NOT is_subdag THEN 1 ELSE 0 END AS metric
          FROM dag
      - name: airflow-task-queued-seconds
        database:
          type: postgres
          uri: postgresql://username:password@localhost:5432/database?sslmode=disable
        sleep_duration: 300
        metric: airflow.task.queued_seconds
        sql: >
          SELECT dag_id AS dag_id,
                 task_id AS task_id,
                 EXTRACT(EPOCH FROM (current_timestamp - queued_dttm)) AS metric
          FROM task_instance
          WHERE state = 'queued'
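
To make the shape of this file concrete, here is a minimal sketch of Go types that mirror the example configuration, plus a few sanity checks. The type names, the `Validate` method, and the specific checks are assumptions made for illustration and are not taken from Anemometer's source. The `yaml` struct tags only indicate how a YAML library could map keys to fields; the sketch itself does no YAML parsing.

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a hypothetical mirror of the top-level keys in the example file.
type Config struct {
	StatsD   StatsDConfig    `yaml:"statsd"`
	Monitors []MonitorConfig `yaml:"monitors"`
}

// StatsDConfig holds the StatsD destination and optional default tags.
type StatsDConfig struct {
	Address string   `yaml:"address"`
	Tags    []string `yaml:"tags"` // optional
}

// DatabaseConfig corresponds to database.type and database.uri.
type DatabaseConfig struct {
	Type string `yaml:"type"`
	URI  string `yaml:"uri"`
}

// MonitorConfig describes one entry under monitors.
type MonitorConfig struct {
	Name          string         `yaml:"name"`
	Database      DatabaseConfig `yaml:"database"`
	SleepDuration int            `yaml:"sleep_duration"` // seconds
	Metric        string         `yaml:"metric"`
	SQL           string         `yaml:"sql"`
}

// Validate applies assumed sanity checks; the real tool may check more or less.
func (c Config) Validate() error {
	if c.StatsD.Address == "" {
		return errors.New("statsd.address is required")
	}
	if len(c.Monitors) == 0 {
		return errors.New("at least one monitor is required")
	}
	for _, m := range c.Monitors {
		switch m.Database.Type {
		case "postgres", "vertica": // the two types the README lists as supported
		default:
			return fmt.Errorf("monitor %q: unsupported database.type %q", m.Name, m.Database.Type)
		}
		if m.Database.URI == "" || m.Metric == "" || m.SQL == "" {
			return fmt.Errorf("monitor %q: database.uri, metric and sql are required", m.Name)
		}
		if m.SleepDuration <= 0 {
			return fmt.Errorf("monitor %q: sleep_duration must be positive", m.Name)
		}
	}
	return nil
}

func main() {
	// Built by hand here; a real program would decode the YAML file instead.
	cfg := Config{
		StatsD: StatsDConfig{Address: "127.0.0.1:8125", Tags: []string{"environment:production"}},
		Monitors: []MonitorConfig{{
			Name:          "airflow-dag-disabled",
			Database:      DatabaseConfig{Type: "postgres", URI: "postgresql://username:password@localhost:5432/database?sslmode=disable"},
			SleepDuration: 300,
			Metric:        "airflow.dag.disabled",
			SQL:           "SELECT dag_id AS dag_id, 1 AS metric FROM dag",
		}},
	}
	if err := cfg.Validate(); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Printf("config OK: %d monitor(s), StatsD at %s\n", len(cfg.Monitors), cfg.StatsD.Address)
}
```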

statsd

This is where you tell Anemometer where to send StatsD metrics.

  • address - The address:port on which StatsD is listening (usually 127.0.0.1:8125)
  • tags - Default tags to send with every metric, optional

monitors

This is where you configure the monitor(s) that Anemometer runs.

  • name - The name of this monitor, mainly used in logging
  • database.type - The type of database connection to be used (postgres and vertica are currently supported)
  • database.uri - The URI connection string used to connect to the database (usually follows protocol://username:password@hostname:port/database)
  • sleep_duration - How long to wait between pushes to StatsD (in seconds)
  • metric - The name of the metric to be sent to StatsD
  • sql - The SQL query to execute when populating the metric’s values/tags (see SQL Query Structure)

SQL Query Structure

Anemometer makes the following assumptions about the results of your query:

  • Exactly one column will be named metric, and its value is convertible to float64 (no strings)
  • All other columns will be aggregated into tags and sent to StatsD
  • The tags will take the form of column_name:value

Query Example

Single row result

To monitor the number of records in your users table you might do something like this:

    SELECT 'production' AS environment,
           'users' AS table_name,
           COUNT(0) AS metric
    FROM users

Resulting in the following:

     environment | table_name | metric
    -------------+------------+--------
     production  | users      |     99

Assuming we named our metric table.records, this would result in the following data being sent to StatsD:

    table.records:99|g|#environment:production,table_name:users

Multiple row result

To monitor the number of queries each user is running in your database you might do something like this:

    SELECT 'production' AS environment,
           usename AS user_name,
           COUNT(0) AS metric
    FROM pg_stat_activity
    WHERE query != '<IDLE>'
    GROUP BY usename

Resulting in the following:

     environment | user_name | metric
    -------------+-----------+--------
     production  | cjonesy   |    160
     production  | postgres  |      6

Assuming we named our metric database.queries, this would result in the following data being sent to StatsD:

    database.queries:160|g|#environment:production,user_name:cjonesy
    database.queries:6|g|#environment:production,user_name:postgres

Notice that one metric is sent for each row in the query.
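
The datagram lines shown in these examples follow the pattern `name:value|g|#tag,tag`. As a rough sketch of how one such line per row could be assembled, the hypothetical `formatGauge` helper below builds the text by hand. It is an illustration only; the real project may rely on a StatsD client library instead.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// formatGauge renders one gauge datagram in the form shown in the examples:
// name:value|g|#tag1,tag2. The tag section is omitted when there are no tags.
func formatGauge(name string, value float64, tags []string) string {
	line := name + ":" + strconv.FormatFloat(value, 'f', -1, 64) + "|g"
	if len(tags) > 0 {
		line += "|#" + strings.Join(tags, ",")
	}
	return line
}

func main() {
	// The two rows from the multiple row example, already split into
	// a metric value and tags.
	rows := []struct {
		value float64
		tags  []string
	}{
		{160, []string{"environment:production", "user_name:cjonesy"}},
		{6, []string{"environment:production", "user_name:postgres"}},
	}
	// One datagram per row.
	for _, r := range rows {
		fmt.Println(formatGauge("database.queries", r.value, r.tags))
	}
}
```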

Usage

Basic Usage

    Anemometer (A SQL -> StatsD metrics generator)

    Usage:
      anemometer [command]

    Available Commands:
      help        Help about any command
      start       Start the Anemometer agent
      version     Print the version number

    Flags:
      -h, --help   help for anemometer

    Use "anemometer [command] --help" for more information about a command.

To start the agent:

```shell script
anemometer start -c /path/to/your/config.yml
```

Development

Testing locally

If you want to test this out locally you can run the following to start Anemometer:

```shell script
anemometer start -c /path/to/config.yml
```

You can see the metrics that would be sent by watching the statsd port on localhost:

```shell script
nc -u -l 8125
```

Compiling

```shell script
make build
```

Running Tests

To run all the standard tests:

```shell script
make test
```

Releasing

This project is using goreleaser (https://goreleaser.com). GitHub release creation is automated using Travis CI. New releases are automatically created when new tags are pushed to the repo.

```shell script
TAG=0.1.0 make tag
```

How to contribute

This project has some clear Contribution Guidelines and expectations that you can read here (CONTRIBUTING).

The contribution guidelines outline the process that you’ll need to follow to get a patch merged.

And you don’t just have to write code. You can help out by writing documentation, tests, or even by giving feedback about this work.

Thank you for contributing!