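windows_exporter makes it very easy to give Prometheus the ability to monitor Windows Server.
In most cases the default configuration is enough to monitor CPU, memory, network, and services. In some situations, however, such as a server running the SafeDog (安全狗) security software, the status of certain services may not be retrievable under some configurations. A custom configuration is then needed, for example one that monitors only specific services.
windows_exporter configuration notes
Source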
https://github.com/prometheus-community/windows_exporter
Description
A Prometheus exporter for Windows machines.
Compatibility
windows_exporter supports Windows Server 2008 R2 and later, and desktop Windows 7 and later.
Deployment
Download the exporter:
https://github.com/prometheus-community/windows_exporter/releases/download/v0.16.0/windows_exporter-0.16.0-amd64.exe
You can run the .exe file directly or start it with custom options. Running it directly uses the default configuration.
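Once the exporter is running, a quick way to confirm that it works is to read its metrics page and look at the windows_exporter_collector_success samples. The following is a minimal sketch in Python, assuming the exporter listens on the default address (port 9182, path /metrics) on the local machine; the function names are hypothetical helpers, not part of windows_exporter.

```python
import urllib.request

# Assumption: default --telemetry.addr (":9182") and --telemetry.path ("/metrics").
DEFAULT_URL = "http://localhost:9182/metrics"


def parse_collector_success(metrics_text):
    """Return {collector_name: bool} from Prometheus text-format output."""
    prefix = 'windows_exporter_collector_success{collector="'
    status = {}
    for line in metrics_text.splitlines():
        if not line.startswith(prefix):
            continue  # skips "# HELP" / "# TYPE" comments and all other metrics
        name, _, rest = line[len(prefix):].partition('"')
        # The sample value is the first field after the closing brace.
        value = float(rest.rpartition("}")[2].split()[0])
        status[name] = value == 1.0
    return status


def fetch_metrics(url=DEFAULT_URL, timeout=5):
    """Download the raw metrics page from a running exporter."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling parse_collector_success(fetch_metrics()) on the monitored host returns one entry per enabled collector. A False value points at a collector that failed to gather data, which is the situation the custom configuration below is meant to address.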
Custom configuration
flags:
  -h, --help
      show context-sensitive help (also try --help-long and --help-man).
  --collectors.dfsr.sources-enabled="connection,folder,volume"
      comma-separated list of dfsr perflib sources to use.
  --collectors.exchange.list
      list the collectors along with their perflib object name/ids
  --collectors.exchange.enabled=""
      comma-separated list of collectors to use. defaults to all, if not specified.
  --collector.iis.site-whitelist=".+"
      regexp of sites to whitelist. site name must both match whitelist and not match blacklist to be included.
  --collector.iis.site-blacklist=collector.iis.site-blacklist
      regexp of sites to blacklist. site name must both match whitelist and not match blacklist to be included.
  --collector.iis.app-whitelist=".+"
      regexp of apps to whitelist. app name must both match whitelist and not match blacklist to be included.
  --collector.iis.app-blacklist=collector.iis.app-blacklist
      regexp of apps to blacklist. app name must both match whitelist and not match blacklist to be included.
  --collector.logical_disk.volume-whitelist=".+"
      regexp of volumes to whitelist. volume name must both match whitelist and not match blacklist to be included.
  --collector.logical_disk.volume-blacklist=""
      regexp of volumes to blacklist. volume name must both match whitelist and not match blacklist to be included.
  --collector.msmq.msmq-where=collector.msmq.msmq-where
      wql 'where' clause to use in wmi metrics query. limits the response to the msmqs you specify and reduces the size of the response.
  --collectors.mssql.classes-enabled="accessmethods,availreplica,bufman,databases,dbreplica,genstats,locks,memmgr,sqlstats,sqlerrors,transactions"
      comma-separated list of mssql wmi classes to use.
  --collectors.mssql.class-print
      if true, print available mssql wmi classes and exit. only displays if the mssql collector is enabled.
  --collector.net.nic-whitelist=".+"
      regexp of nic:s to whitelist. nic name must both match whitelist and not match blacklist to be included.
  --collector.net.nic-blacklist=""
      regexp of nic:s to blacklist. nic name must both match whitelist and not match blacklist to be included.
  --collector.process.whitelist=".*"
      regexp of processes to include. process name must both match whitelist and not match blacklist to be included.
  --collector.process.blacklist=""
      regexp of processes to exclude. process name must both match whitelist and not match blacklist to be included.
  --collector.service.services-where=""
      wql 'where' clause to use in wmi metrics query. limits the response to the services you specify and reduces the size of the response.
  --collector.smtp.server-whitelist=".+"
      regexp of virtual servers to whitelist. server name must both match whitelist and not match blacklist to be included.
  --collector.smtp.server-blacklist=collector.smtp.server-blacklist
      regexp of virtual servers to blacklist. server name must both match whitelist and not match blacklist to be included.
  --collector.textfile.directory="c:\\program files\\windows_exporter\\textfile_inputs"
      directory to read text files with metrics from.
  --config.file=config.file
      yaml configuration file to use. values set in this file will be overridden by cli flags.
  --web.config.file=""
      [experimental] path to configuration file that can enable tls or authentication.
  --telemetry.addr=":9182"
      host:port for exporter.
  --telemetry.path="/metrics"
      url path for surfacing collected metrics.
  --telemetry.max-requests=5
      maximum number of concurrent requests. 0 to disable.
  --collectors.enabled="cpu,cs,logical_disk,net,os,service,system,textfile"
      comma-separated list of collectors to use. use '[defaults]' as a placeholder for all the collectors enabled by default.
  --collectors.print
      if true, print available collectors and exit.
  --scrape.timeout-margin=0.5
      seconds to subtract from the timeout allowed by the client. tune to allow for overhead or high loads.
  --log.level="info"
      only log messages with the given severity or above. valid levels: [debug, info, warn, error, fatal]
  --log.format="logger:stderr"
      set the log target and format. example: "logger:syslog?appname=bob&local=7" or "logger:stdout?json=true"
  --version
      show application version.
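To monitor only specific services, the --collector.service.services-where flag takes a WQL 'where' clause. The sketch below shows one way to build such a clause and the matching argument list from a list of service names. It assumes that joining Name='...' conditions with OR is acceptable for the WMI query, and it simply rejects names containing quotes or backslashes instead of trying to escape them. Both functions are hypothetical helpers.

```python
def build_services_where(service_names):
    """Build a WQL WHERE clause matching any of the given service names."""
    if not service_names:
        raise ValueError("at least one service name is required")
    clauses = []
    for name in service_names:
        # Assumption: no escaping is attempted; unusual names are rejected.
        if "'" in name or "\\" in name:
            raise ValueError(f"unsupported character in service name: {name!r}")
        clauses.append(f"Name='{name}'")
    return " OR ".join(clauses)


def build_exporter_args(service_names, collectors=("cpu", "cs", "net", "service")):
    """Return an argument list for windows_exporter.exe limited to the given services."""
    return [
        r".\windows_exporter.exe",
        "--collectors.enabled=" + ",".join(collectors),
        "--collector.service.services-where=" + build_services_where(service_names),
    ]
```

For example, build_services_where(["nginx", "MySQL"]) yields Name='nginx' OR Name='MySQL'. Passing the list from build_exporter_args to subprocess.run without a shell avoids having to quote the clause by hand.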
Using a configuration file
A YAML configuration file can be specified with the --config.file flag. For example:
.\windows_exporter.exe --config.file=config.yml
The format of config.yml is shown below. Adjust the contents according to the configuration documentation:
collectors:
  enabled: cpu,cs,net,service
collector:
  service:
    services-where: "name='windows_exporter'"
log:
  level: warn
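When the same configuration has to be rolled out to several servers with different service filters, the file can be generated rather than edited by hand. This is a small sketch that renders the layout shown above as text using only the Python standard library. The function name and parameters are hypothetical, and it covers only the three sections used in this example.

```python
def render_config(collectors, services_where, log_level="warn"):
    """Render a windows_exporter config.yml as text, mirroring the example layout."""
    # Escape backslashes and double quotes for a YAML double-quoted scalar.
    escaped = services_where.replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        "collectors:",
        "  enabled: " + ",".join(collectors),
        "collector:",
        "  service:",
        f'    services-where: "{escaped}"',
        "log:",
        f"  level: {log_level}",
    ]
    return "\n".join(lines) + "\n"
```

Writing the result of render_config(["cpu", "cs", "net", "service"], "name='windows_exporter'") to config.yml reproduces the example file, which can then be passed to the exporter with --config.file.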
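Rules configuration reference
The rules below cover alerts for CPU usage above 90%, memory usage above 90%, disk usage above 90%, windows_exporter collector failures, and Windows service status. As mentioned at the start, all services are monitored when no filter is configured, but in many cases monitoring only specific services is enough.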
groups:
- name: windowsserver
  rules:
  - alert: windowsservercpuusage
    expr: 100 - (avg by (instance) (rate(windows_cpu_time_total{mode="idle"}[2m])) * 100) > 90
    for: 0m
    labels:
      severity: warning
    annotations:
      summary: windows server cpu usage (instance {{ $labels.instance }})
      description: "cpu usage is more than 90%\n value = {{ $value }}\n labels = {{ $labels }}"
  - alert: windowsservermemoryusage
    expr: 100 - ((windows_os_physical_memory_free_bytes / windows_cs_physical_memory_bytes) * 100) > 90
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: windows server memory usage (instance {{ $labels.instance }})
      description: "memory usage is more than 90%\n value = {{ $value }}\n labels = {{ $labels }}"
  - alert: windowsserverdiskspaceusage
    expr: 100.0 - 100 * ((windows_logical_disk_free_bytes / 1024 / 1024 ) / (windows_logical_disk_size_bytes / 1024 / 1024)) > 90
    for: 2m
    labels:
      severity: critical
    annotations:
      summary: windows server disk space usage (instance {{ $labels.instance }})
      description: "disk usage is more than 90%\n value = {{ $value }}\n labels = {{ $labels }}"
  - alert: windowsservercollectorerror
    expr: windows_exporter_collector_success == 0
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: windows server collector error (instance {{ $labels.instance }})
      description: "collector {{ $labels.collector }} was not successful\n value = {{ $value }}\n labels = {{ $labels }}"
  - alert: windowsserverservicestatus
    expr: windows_service_status{status="ok"} != 1
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: windows server service status (instance {{ $labels.instance }})
      description: "windows service state is not ok\n value = {{ $value }}\n labels = {{ $labels }}"
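The arithmetic behind the three usage alerts can be checked outside Prometheus. The sketch below mirrors the CPU, memory, and disk expressions as plain Python functions. It assumes the inputs are raw sample values, with the CPU function taking the per-core idle rate (idle seconds per second, between 0 and 1) that rate() would return. The function names are hypothetical.

```python
def cpu_usage_percent(idle_rates):
    """Mirror of 100 - avg(rate(idle)) * 100 for a non-empty list of per-core idle rates."""
    return 100 - (sum(idle_rates) / len(idle_rates)) * 100


def memory_usage_percent(free_bytes, total_bytes):
    """Mirror of 100 - (free / total) * 100."""
    return 100 - (free_bytes / total_bytes) * 100


def disk_usage_percent(free_bytes, size_bytes):
    """Mirror of 100 - 100 * (free / size); the / 1024 / 1024 factors cancel out."""
    return 100.0 - 100 * (free_bytes / size_bytes)


def should_alert(usage_percent, threshold=90):
    """The rules fire only when usage is strictly above the threshold."""
    return usage_percent > threshold
```

With 2 GiB free out of 16 GiB, memory_usage_percent(2 * 1024**3, 16 * 1024**3) gives 87.5, which stays below the 90% threshold. A disk with 1/16 of its space free gives 93.75 and would trigger the alert once the condition has held for the 2 minutes set by the rule's for: field.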
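With Prometheus it is very simple to set up monitoring for a web server cluster or a database cluster. This monitoring not only shows the state of the cluster in real time, it also provides the data needed to tune the servers, especially database parameters. 月萌api will share more articles on this topic in the future.
Reference: https://blog.csdn.net/qq_43021786/article/details/118809772
Copyright belongs to 月萌api (www.moonapi.com). Please credit the source when reposting.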