Auto-generating the Prometheus configuration with kube-state-metrics

The nodes of the k8s clusters I operate change frequently, and having to update the Prometheus config file by hand every time is a pain. So I looked into doing service discovery for the nodes, so that the list of monitoring targets updates automatically.

First, the existing config-maintenance approach

Prometheus loads targets dynamically via file_sd_config; a Python script talks to each cluster's apiserver to fetch the node list, then generates the corresponding JSON config files.

This approach could be reused as-is, but it isn't flexible to configure. I don't want to maintain each cluster's credentials inside a script; I just want to quietly update Prometheus's own configuration. Pass.

Next, the official Prometheus approach

Prometheus ships kubernetes_sd_config: you declare each cluster's credentials in prometheus.yml, and Prometheus periodically queries each apiserver for the node list to monitor. After half a day of fiddling in a test environment, I found this works very nicely for a Prometheus deployed inside the cluster. For a single Prometheus shared across multiple clusters, however, maintaining the client certificates is tedious, and it's easy to end up in the awkward situation where a cluster's credentials have been rotated but the copy in Prometheus hasn't. Credential rotation isn't frequent, but having to restart Prometheus on every rotation is still inconvenient.
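
To make the pain concrete, here is a minimal sketch of a per-cluster kubernetes_sd_configs job (the apiserver address and credential file paths are placeholders I made up); every cluster needs its own copy of this stanza, and every credential file has to be kept in sync by hand:

    - job_name: 'cluster-a-nodes'
      scheme: https
      tls_config:
        ca_file: /etc/prometheus/cluster-a/ca.crt         # must track the cluster CA
      bearer_token_file: /etc/prometheus/cluster-a/token  # must track cluster auth
      kubernetes_sd_configs:
        - role: node
          api_server: 'https://cluster-a.example.com:6443'  # placeholder address
          tls_config:
            ca_file: /etc/prometheus/cluster-a/ca.crt
          bearer_token_file: /etc/prometheus/cluster-a/token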

So although this is arguably the most elegant approach, it was passed over too.

My approach

Finally, while digging through the various Prometheus monitoring targets, I found that Kubernetes officially provides detailed node-level metrics through kube-state-metrics, and these scrape targets can likewise be loaded dynamically via file_sd_config. That led to the following scheme:

Manually maintain the kube-state-metrics target files groups/kube-state-metrics/*.json, and use a periodically executed script that reads the monitoring data from the local Prometheus (localhost) to update the node lists for the other jobs, as illustrated below.
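
For illustration, a hand-maintained target file under groups/kube-state-metrics/ might look like the following (the address and cluster name are placeholders; port 8080 assumes kube-state-metrics' default). The cluster label attached here propagates to every scraped series, which is what the update script later keys on:

    [
        {
            "targets": ["10.0.0.11:8080"],
            "labels": {
                "cluster": "cluster-a"
            }
        }
    ]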

Prometheus deployment

  • Required settings in prometheus.yml:

    # my global config
    global:
      scrape_interval: 60s # Set the scrape interval to every 60 seconds. Default is every 1 minute.
      evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
      scrape_timeout: 60s # The default scrape_timeout is 10s.

      # Attach these labels to any time series or alerts when communicating with
      # external systems (federation, remote storage, Alertmanager).
      external_labels:
        monitor: 'k8s-prometheus-monitor'

    alerting:
      alertmanagers:
        - static_configs:
            - targets: ["localhost:9093"]

    # Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
    rule_files:
      # - "first.rules"
      # - "second.rules"
      - /home/server/prometheus/rule.yml

    # A scrape configuration containing exactly one endpoint to scrape:
    # Here it's Prometheus itself.
    scrape_configs:
      # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
      - job_name: 'prometheus'
        # metrics_path defaults to '/metrics'
        # scheme defaults to 'http'.
        static_configs:
          - targets: ['localhost:9090']

      - job_name: 'kube-state-metrics'
        scrape_interval: 180s
        scrape_timeout: 30s
        file_sd_configs:
          - files: ['groups/kube-state-metrics/*.json']
        metric_relabel_configs:
          - source_labels: [__name__]
            regex: "(kube_node_status_condition|kube_node_labels|kube_node_info|kube_pod_container_resource_requests_cpu_cores|kube_node_status_allocatable_cpu_cores|kube_pod_container_resource_requests_memory_bytes|kube_node_status_allocatable_memory_bytes)"
            action: keep

      - job_name: 'cadvisor'
        scrape_interval: 90s
        scrape_timeout: 30s
        file_sd_configs:
          - files: ['groups/cadvisor/*.json']
        metric_relabel_configs:
          - source_labels: [__name__]
            regex: "(container_cpu_usage_seconds_total|container_memory_rss|container_memory_usage_bytes|container_spec_memory_limit_bytes|container_spec_cpu_quota|container_memory_swap|container_memory_cache|container_network_receive_bytes_total|container_network_transmit_bytes_total|container_cpu_cfs_throttled_periods_total|container_cpu_cfs_periods_total|container_cpu_user_seconds_total|container_cpu_system_seconds_total|container_memory_failures_total|container_fs_reads_bytes_total|container_fs_writes_bytes_total|container_cpu_cfs_throttled_seconds_total|container_memory_working_set_bytes|kube_deployment_spec_replicas|kube_node_status_capacity_cpu_cores|kube_pod_container_resource_limits|kube_pod_container_resource_limits_cpu_cores|kube_pod_container_resource_limits_memory_bytes|kube_pod_container_resource_requests|kube_pod_container_resource_requests_cpu_cores|kube_replicationcontroller_spec_replicas|kube_replicationcontroller_status_replicas)"
            action: keep

      # kubernetes > 1.13 runs cAdvisor standalone, with different label names
      - job_name: 'cadvisor-standalone'
        scrape_interval: 90s
        scrape_timeout: 30s
        file_sd_configs:
          - files: ['groups/cadvisor-standalone/*.json']
        metric_relabel_configs:
          - source_labels: ['container_label_io_kubernetes_pod_name']
            target_label: 'pod_name'
          - source_labels: ['container_label_io_kubernetes_container_name']
            target_label: 'container_name'
          - source_labels: [__name__]
            regex: "(container_cpu_usage_seconds_total|container_memory_rss|container_memory_usage_bytes|container_spec_memory_limit_bytes|container_spec_cpu_quota|container_memory_swap|container_memory_cache|container_network_receive_bytes_total|container_network_transmit_bytes_total|container_cpu_cfs_throttled_periods_total|container_cpu_cfs_periods_total|container_cpu_user_seconds_total|container_cpu_system_seconds_total|container_memory_failures_total|container_fs_reads_bytes_total|container_fs_writes_bytes_total|container_cpu_cfs_throttled_seconds_total|container_memory_working_set_bytes|kube_deployment_spec_replicas|kube_node_status_capacity_cpu_cores|kube_pod_container_resource_limits|kube_pod_container_resource_limits_cpu_cores|kube_pod_container_resource_limits_memory_bytes|kube_pod_container_resource_requests|kube_pod_container_resource_requests_cpu_cores|kube_replicationcontroller_spec_replicas|kube_replicationcontroller_status_replicas)"
            action: keep

      - job_name: 'node-exporter'
        scrape_interval: 90s
        scrape_timeout: 30s
        file_sd_configs:
          - files: ['groups/node-exporter/*.json']
        metric_relabel_configs:
          - source_labels: [__name__]
            regex: "(node_cpu_seconds_total|node_memory_MemAvailable_bytes|node_memory_MemTotal_bytes|node_load1|node_load5)"
            action: keep
  • docker-compose.yaml configuration:

    prometheus:
      image: prom/prometheus:v2.24.1
      net: host
      restart: always
      environment:
        - TZ=Asia/Shanghai
      volumes:
        - /etc/localtime:/etc/localtime:ro
        - ./prometheus/:/etc/prometheus/
        - /home/data/prometheus_data/:/prometheus_data/:rw
      command:
        - '--config.file=/etc/prometheus/prometheus.yml'
        - '--storage.tsdb.path=/prometheus_data/'
        - '--storage.tsdb.retention.time=2d'
        - '--storage.tsdb.max-block-duration=2h'
        - '--storage.tsdb.min-block-duration=2h'
        - '--query.max-samples=100000000'
        - '--web.console.libraries=/usr/share/prometheus/console_libraries'
        - '--web.console.templates=/usr/share/prometheus/consoles'
      ports:
        - 9090:9090

    alertmanager:
      image: prom/alertmanager
      ports:
        - 9093:9093
      volumes:
        - ./alertmanager/:/etc/alertmanager/
      net: host
      restart: always
      command:
        - '--config.file=/etc/alertmanager/config.yml'
        - '--storage.path=/alertmanager'
  • Startup script (creates the required directories):

    #!/bin/bash

    mkdir -p /home/server/prometheus/groups/{kube-state-metrics,node-exporter,cadvisor,cadvisor-standalone}
    mkdir -p /home/data/prometheus_data/
    mkdir -p /home/server/alertmanager/
  • Python script to generate the node target files (a sample schedule for running it periodically is sketched after the script):

    # coding: utf-8

    import json
    import time

    import requests


    def send_alarm(msg):
        # Hook this up to an alerting channel (mail, IM, ...).
        pass


    def simple_query(query='kube_node_status_condition{status="true"}==1'):
        """Fetch monitoring data as of 5 minutes ago from the local Prometheus."""
        step = 20
        now = int(time.time())
        start = now - 300
        end = now - 300  # start == end: query a single instant 5 minutes back
        params = (
            ('query', str(query)),
            ('start', str(start)),
            ('end', str(end)),
            ('step', str(step)),
        )

        response = requests.get(
            'http://localhost:9090/api/v1/query_range', params=params)

        if response:
            return response.json()
        else:
            return None


    def json_node_exporter_cadvisor():
        kube_node_status_condition = simple_query(
            query='kube_node_status_condition{status="true"}==1')
        kube_node_labels = simple_query(query='kube_node_labels')
        kube_node_info = simple_query(query='kube_node_info')

        # Count nodes per cluster in the config currently on disk.
        try:
            cluster_count_old = {}
            with open("/home/server/prometheus/groups/node-exporter/config_node_exporter.json") as f:
                config = json.loads(f.read())
            for node in config:
                cluster = node["labels"]["cluster"]
                if cluster not in cluster_count_old:
                    cluster_count_old[cluster] = 1
                else:
                    cluster_count_old[cluster] += 1
        except Exception:
            cluster_count_old = {}

        try:
            nodes_ready = [i["metric"]["node"]
                           for i in kube_node_status_condition["data"]["result"]]
            node_label_dict = {
                i["metric"]["node"]: {
                    "cluster": i["metric"]["cluster"],
                    "group": i["metric"].get("label_group")
                } for i in kube_node_labels["data"]["result"]
            }
            node_version_dict = {
                i["metric"]["node"]: str(
                    i["metric"]["kubelet_version"]).strip("v").split(".")
                for i in kube_node_info["data"]["result"]
            }
            config_node_exporter = []
            config_cadvisor = []
            config_cadvisor_standalone = []
            cluster_count_new = {}
            for n in nodes_ready:
                targets_9100 = [str(n) + ":9100"]  # node-exporter
                targets_4194 = [str(n) + ":4194"]  # cAdvisor
                version = node_version_dict[n]
                cluster = node_label_dict[n]["cluster"]
                if cluster not in cluster_count_new:
                    cluster_count_new[cluster] = 1
                else:
                    cluster_count_new[cluster] += 1
                item_node = {"labels": node_label_dict[n], "targets": targets_9100}
                item_cadvisor = {
                    "labels": node_label_dict[n], "targets": targets_4194}
                config_node_exporter.append(item_node)
                # kubernetes >= 1.14 runs cAdvisor standalone
                if int(version[0]) == 1 and int(version[1]) >= 14:
                    config_cadvisor_standalone.append(item_cadvisor)
                else:
                    config_cadvisor.append(item_cadvisor)

            cluster_config_change = {
                cluster: cluster_count_new.get(cluster, 0) - cluster_count_old[cluster]
                for cluster in cluster_count_old
            }
            # On the first run there is no old config to compare against.
            change_min = min(cluster_config_change.values()) if cluster_config_change else 0
            if change_min >= -3:  # do not auto-update if a cluster lost more than 3 nodes
                with open("/home/server/prometheus/groups/node-exporter/config_node_exporter.json", "w") as f:
                    f.write(json.dumps(config_node_exporter, indent=4))
                with open("/home/server/prometheus/groups/cadvisor/config_cadvisor.json", "w") as f:
                    f.write(json.dumps(config_cadvisor, indent=4))
                with open("/home/server/prometheus/groups/cadvisor-standalone/config_cadvisor_standalone.json", "w") as f:
                    f.write(json.dumps(config_cadvisor_standalone, indent=4))
            else:
                # Write *.json.new instead and alert a human to confirm.
                with open("/home/server/prometheus/groups/node-exporter/config_node_exporter.json.new", "w") as f:
                    f.write(json.dumps(config_node_exporter, indent=4))
                with open("/home/server/prometheus/groups/cadvisor/config_cadvisor.json.new", "w") as f:
                    f.write(json.dumps(config_cadvisor, indent=4))
                with open("/home/server/prometheus/groups/cadvisor-standalone/config_cadvisor_standalone.json.new", "w") as f:
                    f.write(json.dumps(config_cadvisor_standalone, indent=4))
                msg = "node count change in prometheus config: {}, please verify: {}".format(
                    str(change_min), str(cluster_config_change))
                send_alarm(msg)
            return cluster_config_change
        except Exception as e:
            print(e)


    if __name__ == "__main__":
        res = json_node_exporter_cadvisor()
        print(res)
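
Putting the pieces together: once the startup script has created the directories and docker-compose up -d has started the stack, the update script just needs to run on a schedule. A hypothetical crontab entry (the script path, log path, and 10-minute interval are my own choices, not from the original setup):

    */10 * * * * python /home/server/prometheus/update_nodes.py >> /var/log/update_nodes.log 2>&1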