
Cloud-Native 09: Kubernetes and Prometheus

A Deep Dive into Prometheus

Introduction to Prometheus

  • Prometheus configuration reference: https://prometheus.io/docs/prometheus/latest/configuration/configuration/
  • Exporters for components monitored by Prometheus: https://prometheus.io/docs/instrumenting/exporters/
  • Kubernetes service discovery example for Prometheus: https://github.com/prometheus/prometheus/blob/release-2.31/documentation/examples/prometheus-kubernetes.yml

Prometheus is an open-source systems monitoring and alerting toolkit for collecting, storing, and querying time-series data. It focuses on monitoring the performance and state of applications and infrastructure, and provides a rich query language and a flexible alerting mechanism. The basics:

  1. Data model: Prometheus stores monitoring data as time series. Each series is identified by a unique metric name plus a set of key-value labels, and represents the value of that metric at a point in time. This model is well suited to tracking changes and trends in metrics.
  2. Data collection: Prometheus supports several collection methods. It can scrape metrics directly from instrumented applications, or obtain system- and network-level metrics through exporters and integrations. Collected data is transferred to the Prometheus server over HTTP for storage and processing.
  3. Storage and querying: Prometheus stores the collected time series on local disk. It provides a flexible and efficient query language, PromQL, for real-time queries and aggregations over the stored data, which drive charts, reports, and alerts.
  4. Alerting rules: Prometheus has powerful alerting built in. Rules can be defined on thresholds, expressions, and durations; when a rule's condition holds, Prometheus fires the alert and sends notifications, e.g. by email, SMS, or an integrated notification service.
  5. Visualization and integration: Prometheus ships with basic query and graphing features, and also integrates with other tools such as Grafana for richer dashboards and visualization.
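
The data model in point 1 can be made concrete with a sample in Prometheus's text exposition format: one metric name, a set of label key-value pairs, and a floating-point value (the timestamp is implicit at scrape time). The metric and labels below are taken from the node-exporter output shown later in this document:

```
# <metric name>{<label>="<value>", ...} <sample value>
node_cpu_seconds_total{cpu="0",mode="idle"} 27319.3
```

In PromQL, the selector node_cpu_seconds_total{mode="idle"} would then match this series on every CPU of every scraped node.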

Prometheus features

  1. Multidimensional data model: Prometheus uses a multidimensional time-series model in which every series consists of a metric name and a set of label key-value pairs. This lets users query and aggregate monitoring data across arbitrary dimensions for accurate, fine-grained metrics.
  2. Efficient data collection: Prometheus supports flexible collection methods. Metrics can be exposed directly from applications via client libraries (e.g. the official Prometheus client libraries), or gathered from systems, networks, and third-party services through exporters and integrations. Collection is efficient and reliable, and scales to monitoring setups of varying size and complexity.
  3. Powerful query language: Prometheus provides PromQL for real-time queries and aggregation over the stored data. PromQL supports range queries, aggregation functions, arithmetic, and vector operations, making it easy to analyze and extract the metrics you need.
  4. Dynamic monitoring and service discovery: Prometheus can automatically detect and monitor targets that join the cluster, such as newly deployed application instances or newly added nodes. With suitable discovery rules, it picks up new targets promptly, keeping the monitoring configuration and management dynamic.
  5. Flexible alerting: users define alerting rules that fire on thresholds, expressions, and durations. Prometheus sends notifications promptly, by email, SMS, or API call, so that operators can respond quickly to potential problems.
  6. Ecosystem and integrations: Prometheus has a rich ecosystem and broad integration capability: Grafana for visualization, Alertmanager for alert routing and notification, and exporters for collecting metrics from systems that do not expose the Prometheus format natively. These integrations make it possible to build a comprehensive monitoring solution.
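
As an illustration of points 3 and 5, here is a minimal, hypothetical alerting-rule file; the rule name, threshold, and labels are examples for this sketch, not part of the deployment below:

```yaml
groups:
- name: node-alerts
  rules:
  # Fire when average non-idle CPU exceeds 80% for 5 minutes
  - alert: HighCpuUsage
    expr: 100 * (1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))) > 80
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "CPU usage above 80% on {{ $labels.instance }}"
```

Rule files like this are referenced from rule_files in prometheus.yml, and firing alerts are forwarded to Alertmanager for notification.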

Components of the Prometheus ecosystem

(figure: architecture diagram of the Prometheus ecosystem components)

19.2 Install node-exporter to collect node resource metrics

  • node-exporter guide: https://prometheus.io/docs/guides/node-exporter/
  • node-exporter on GitHub: https://github.com/prometheus/node_exporter/

  • About node-exporter:

node-exporter is an official Prometheus exporter that collects and exposes metrics about the operating system and hardware resources. It runs on the target host and provides a wide range of system-level metrics, such as CPU utilization, memory usage, disk space, and network traffic.


# Check the taints on the control-plane node
$ kubectl describe node xuegod63 | grep Taints
Taints:             node-role.kubernetes.io/control-plane:NoSchedule
$ kubectl create ns monitor-sa

day19/node-exporter.yaml

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitor-sa
  labels:
    name: node-exporter
spec:
  selector:
    matchLabels:
      name: node-exporter
  template:
    metadata:
      labels:
        name: node-exporter
    spec:
      hostPID: true
      hostIPC: true
      hostNetwork: true
      containers:
      - name: node-exporter
        image: prom/node-exporter:v0.16.0
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 9100
        resources:
          requests:
            cpu: 0.15
        securityContext:
          privileged: true
        args:
        - --path.procfs
        - /host/proc
        - --path.sysfs
        - /host/sys
        - --collector.filesystem.ignored-mount-points
        - '^/(sys|proc|dev|host|etc)($|/)'
        volumeMounts:
        - name: dev
          mountPath: /host/dev
        - name: proc
          mountPath: /host/proc
        - name: sys
          mountPath: /host/sys
        - name: rootfs
          mountPath: /rootfs
      tolerations:
      - key: "node-role.kubernetes.io/control-plane"
        operator: "Exists"
        effect: "NoSchedule"
      volumes:
      - name: dev
        hostPath:
          path: /dev
      - name: proc
        hostPath:
          path: /proc
      - name: sys
        hostPath:
          path: /sys
      - name: rootfs
        hostPath:
          path: /

$ kubectl apply -f node-exporter.yaml
$ kubectl get pod -n monitor-sa -o wide
NAME                  READY   STATUS    RESTARTS   AGE   IP              NODE       NOMINATED NODE   READINESS GATES
node-exporter-72cvt   1/1     Running   0          28m   192.168.59.64   xuegod64   <none>           <none>
node-exporter-76b9b   1/1     Running   0          28m   192.168.59.62   xuegod62   <none>           <none>
node-exporter-sjvp6   1/1     Running   0          28m   192.168.59.63   xuegod63   <none>           <none>
$ curl 192.168.59.63:9100/metrics
$ curl 192.168.59.62:9100/metrics
$ curl 192.168.59.64:9100/metrics

CPU time spent in each mode

$ curl -s localhost:9100/metrics | grep node_cpu_seconds
# HELP node_cpu_seconds_total Seconds the cpus spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 27319.3
node_cpu_seconds_total{cpu="0",mode="iowait"} 156.72
node_cpu_seconds_total{cpu="0",mode="irq"} 3425.22
node_cpu_seconds_total{cpu="0",mode="nice"} 6.84
node_cpu_seconds_total{cpu="0",mode="softirq"} 665.04
node_cpu_seconds_total{cpu="0",mode="steal"} 0
node_cpu_seconds_total{cpu="0",mode="system"} 7969.3
node_cpu_seconds_total{cpu="0",mode="user"} 1909.82
node_cpu_seconds_total{cpu="1",mode="idle"} 27707.98
node_cpu_seconds_total{cpu="1",mode="iowait"} 172.91
node_cpu_seconds_total{cpu="1",mode="irq"} 3534.76
node_cpu_seconds_total{cpu="1",mode="nice"} 7.72
node_cpu_seconds_total{cpu="1",mode="softirq"} 543.61
node_cpu_seconds_total{cpu="1",mode="steal"} 0
node_cpu_seconds_total{cpu="1",mode="system"} 7674.42
node_cpu_seconds_total{cpu="1",mode="user"} 1806.81
node_cpu_seconds_total{cpu="2",mode="idle"} 27818.07
node_cpu_seconds_total{cpu="2",mode="iowait"} 256.02
node_cpu_seconds_total{cpu="2",mode="irq"} 3421.07
node_cpu_seconds_total{cpu="2",mode="nice"} 9.74
node_cpu_seconds_total{cpu="2",mode="softirq"} 560.75
node_cpu_seconds_total{cpu="2",mode="steal"} 0
node_cpu_seconds_total{cpu="2",mode="system"} 7634.81
node_cpu_seconds_total{cpu="2",mode="user"} 1893.19
node_cpu_seconds_total{cpu="3",mode="idle"} 27926.25
node_cpu_seconds_total{cpu="3",mode="iowait"} 164.71
node_cpu_seconds_total{cpu="3",mode="irq"} 3421.2
node_cpu_seconds_total{cpu="3",mode="nice"} 6.43
node_cpu_seconds_total{cpu="3",mode="softirq"} 656.97
node_cpu_seconds_total{cpu="3",mode="steal"} 0
node_cpu_seconds_total{cpu="3",mode="system"} 7640.92
node_cpu_seconds_total{cpu="3",mode="user"} 1822.63

The TYPE line marks node_cpu_seconds_total as a counter: a metric whose value only ever increases (resetting to zero only when the process restarts).

Mode     Meaning
user     Time the CPU spent executing user-space processes (applications).
system   Time spent in the kernel executing system calls (I/O, memory management, etc.).
idle     Time the CPU was idle with no runnable tasks.
nice     Time spent on low-priority user processes (nice value greater than 0).
iowait   Time the CPU was idle while outstanding disk I/O requests were pending.
irq      Time spent servicing hardware interrupts, e.g. from NICs or disks.
softirq  Time spent servicing software interrupts (bottom-half handlers: softirqs/tasklets).
steal    In virtualized environments, CPU time "stolen" by other virtual machines (VMs only).
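
Because node_cpu_seconds_total is a counter, its raw value is only meaningful as a difference over time. PromQL's rate() does this automatically; the shell sketch below mimics the core arithmetic with two hypothetical samples taken 60 seconds apart (the numbers are made up for illustration, and real rate() additionally handles counter resets):

```shell
# Two hypothetical samples of node_cpu_seconds_total{mode="idle"}, 60 s apart
t0=27319.3
t1=27376.9
# Per-second rate = (later sample - earlier sample) / scrape interval
rate=$(awk -v a="$t0" -v b="$t1" 'BEGIN { printf "%.2f", (b - a) / 60 }')
echo "idle seconds per wall-clock second: $rate"
```

A value close to 1.0 means the CPU was almost entirely idle over the window; the equivalent PromQL is rate(node_cpu_seconds_total{mode="idle"}[1m]).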

$ curl -s localhost:9100/metrics | grep node_load
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.62
# HELP node_load15 15m load average.
# TYPE node_load15 gauge
node_load15 1.02
# HELP node_load5 5m load average.
# TYPE node_load5 gauge
node_load5 0.87

gauge is a metric type for instantaneous values; unlike a counter, it can go up or down.

Reading the sample output:

node_load1 0.62    # 1-minute load average
node_load5 0.87    # 5-minute load average
node_load15 1.02   # 15-minute load average

This tells us:

The load has been slowly decreasing: the 1-minute average is below the 5- and 15-minute averages.
The current load is low, well under the CPU core count (on a 4-core CPU, a load of 1 corresponds to roughly 25% utilization).
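
The rule of thumb above can be sketched in shell: compare the 1-minute load with the core count (the values are the sample readings above; on a real host you would read them from /proc/loadavg and nproc):

```shell
cores=4       # assumed 4-core machine, matching the cpu="0".."3" series above
load1=0.62    # sample node_load1 reading from the output above
# Load above the core count means runnable tasks are queueing for CPU time
status=$(awk -v l="$load1" -v c="$cores" 'BEGIN { s = (l > c) ? "overloaded" : "ok"; print s }')
echo "1m load $load1 on $cores cores: $status"
```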

19.3 Install the Prometheus server in the k8s cluster

19.3.1 Create a service account

# Run on the k8s control node: create a service account
$ kubectl create serviceaccount monitor -n monitor-sa
# Bind the monitor service account to the cluster-admin clusterrole via a clusterrolebinding
$ kubectl create clusterrolebinding monitor-clusterrolebinding --clusterrole=cluster-admin --serviceaccount=monitor-sa:monitor
# Note: if the command above fails with an authorization error, use this form instead
# (the --user form is spelled system:serviceaccount:<namespace>:<name>):
$ kubectl create clusterrolebinding monitor-clusterrolebinding-1 --clusterrole=cluster-admin --user=system:serviceaccount:monitor-sa:monitor

19.3.2 Create the data directory

# Create the data directory on the xuegod64 worker node:
$ mkdir /data -p
$ chmod 777 /data/

19.3.3 Install the Prometheus server

Run all of the following steps on the k8s control node. First, create a ConfigMap to hold the Prometheus configuration.

The scrape_configs section is taken from https://raw.githubusercontent.com/prometheus/prometheus/refs/heads/release-2.31/documentation/examples/prometheus-kubernetes.yml

day19/prometheus-cfg.yaml

--- 
kind: ConfigMap
apiVersion: v1
metadata:
  labels:
    app: prometheus
  name: prometheus-config
  namespace: monitor-sa
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      scrape_timeout: 10s
      evaluation_interval: 1m
    scrape_configs:
    - job_name: "kubernetes-apiservers"

      kubernetes_sd_configs:
        - role: endpoints

      # Default to scraping over https. If required, just disable this or change to
      # `http`.
      scheme: https

      # This TLS & authorization config is used to connect to the actual scrape
      # endpoints for cluster components. This is separate to discovery auth
      # configuration because discovery & scraping are two separate concerns in
      # Prometheus. The discovery auth config is automatic if Prometheus runs inside
      # the cluster. Otherwise, more config options have to be provided within the
      # <kubernetes_sd_config>.
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        # If your node certificates are self-signed or use a different CA to the
        # master CA, then disable certificate verification below. Note that
        # certificate verification is an integral part of a secure infrastructure
        # so this should only be disabled in a controlled environment. You can
        # disable certificate verification by uncommenting the line below.
        #
        # insecure_skip_verify: true
      authorization:
        credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token

      # Keep only the default/kubernetes service endpoints for the https port. This
      # will add targets for each API server which Kubernetes adds an endpoint to
      # the default/kubernetes service.
      relabel_configs:
        - source_labels:
            [
              __meta_kubernetes_namespace,
              __meta_kubernetes_service_name,
              __meta_kubernetes_endpoint_port_name,
            ]
          action: keep
          regex: default;kubernetes;https

    # Scrape config for nodes (kubelet).
    #
    # Rather than connecting directly to the node, the scrape is proxied though the
    # Kubernetes apiserver.  This means it will work if Prometheus is running out of
    # cluster, or can't connect to nodes for some other reason (e.g. because of
    # firewalling).
    - job_name: "kubernetes-nodes"

      # Default to scraping over https. If required, just disable this or change to
      # `http`.
      scheme: https

      # This TLS & authorization config is used to connect to the actual scrape
      # endpoints for cluster components. This is separate to discovery auth
      # configuration because discovery & scraping are two separate concerns in
      # Prometheus. The discovery auth config is automatic if Prometheus runs inside
      # the cluster. Otherwise, more config options have to be provided within the
      # <kubernetes_sd_config>.
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        # If your node certificates are self-signed or use a different CA to the
        # master CA, then disable certificate verification below. Note that
        # certificate verification is an integral part of a secure infrastructure
        # so this should only be disabled in a controlled environment. You can
        # disable certificate verification by uncommenting the line below.
        #
        # insecure_skip_verify: true
      authorization:
        credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token

      kubernetes_sd_configs:
        - role: node

      relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)

    # Scrape config for Kubelet cAdvisor.
    #
    # This is required for Kubernetes 1.7.3 and later, where cAdvisor metrics
    # (those whose names begin with 'container_') have been removed from the
    # Kubelet metrics endpoint.  This job scrapes the cAdvisor endpoint to
    # retrieve those metrics.
    #
    # In Kubernetes 1.7.0-1.7.2, these metrics are only exposed on the cAdvisor
    # HTTP endpoint; use the "/metrics" endpoint on the 4194 port of nodes. In
    # that case (and ensure cAdvisor's HTTP server hasn't been disabled with the
    # --cadvisor-port=0 Kubelet flag).
    #
    # This job is not necessary and should be removed in Kubernetes 1.6 and
    # earlier versions, or it will cause the metrics to be scraped twice.
    - job_name: "kubernetes-cadvisor"

      # Default to scraping over https. If required, just disable this or change to
      # `http`.
      scheme: https

      # Starting Kubernetes 1.7.3 the cAdvisor metrics are under /metrics/cadvisor.
      # Kubernetes CIS Benchmark recommends against enabling the insecure HTTP
      # servers of Kubernetes, therefore the cAdvisor metrics on the secure handler
      # are used.
      metrics_path: /metrics/cadvisor

      # This TLS & authorization config is used to connect to the actual scrape
      # endpoints for cluster components. This is separate to discovery auth
      # configuration because discovery & scraping are two separate concerns in
      # Prometheus. The discovery auth config is automatic if Prometheus runs inside
      # the cluster. Otherwise, more config options have to be provided within the
      # <kubernetes_sd_config>.
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        # If your node certificates are self-signed or use a different CA to the
        # master CA, then disable certificate verification below. Note that
        # certificate verification is an integral part of a secure infrastructure
        # so this should only be disabled in a controlled environment. You can
        # disable certificate verification by uncommenting the line below.
        #
        # insecure_skip_verify: true
      authorization:
        credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token

      kubernetes_sd_configs:
        - role: node

      relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)

    # Example scrape config for service endpoints.
    #
    # The relabeling allows the actual service scrape endpoint to be configured
    # for all or only some endpoints.
    - job_name: "kubernetes-service-endpoints"

      kubernetes_sd_configs:
        - role: endpoints

      relabel_configs:
        # Example relabel to scrape only endpoints that have
        # "example.io/should_be_scraped = true" annotation.
        #  - source_labels: [__meta_kubernetes_service_annotation_example_io_should_be_scraped]
        #    action: keep
        #    regex: true
        #
        # Example relabel to customize metric path based on endpoints
        # "example.io/metric_path = <metric path>" annotation.
        #  - source_labels: [__meta_kubernetes_service_annotation_example_io_metric_path]
        #    action: replace
        #    target_label: __metrics_path__
        #    regex: (.+)
        #
        # Example relabel to scrape only single, desired port for the service based
        # on endpoints "example.io/scrape_port = <port>" annotation.
        #  - source_labels: [__address__, __meta_kubernetes_service_annotation_example_io_scrape_port]
        #    action: replace
        #    regex: ([^:]+)(?::\d+)?;(\d+)
        #    replacement: $1:$2
        #    target_label: __address__
        #
        # Example relabel to configure scrape scheme for all service scrape targets
        # based on endpoints "example.io/scrape_scheme = <scheme>" annotation.
        #  - source_labels: [__meta_kubernetes_service_annotation_example_io_scrape_scheme]
        #    action: replace
        #    target_label: __scheme__
        #    regex: (https?)
        - action: labelmap
          regex: __meta_kubernetes_service_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_service_name]
          action: replace
          target_label: kubernetes_name

    # Example scrape config for probing services via the Blackbox Exporter.
    #
    # The relabeling allows the actual service scrape endpoint to be configured
    # for all or only some services.
    - job_name: "kubernetes-services"

      metrics_path: /probe
      params:
        module: [http_2xx]

      kubernetes_sd_configs:
        - role: service

      relabel_configs:
        # Example relabel to probe only some services that have "example.io/should_be_probed = true" annotation
        #  - source_labels: [__meta_kubernetes_service_annotation_example_io_should_be_probed]
        #    action: keep
        #    regex: true
        - source_labels: [__address__]
          target_label: __param_target
        - target_label: __address__
          replacement: blackbox-exporter.example.com:9115
        - source_labels: [__param_target]
          target_label: instance
        - action: labelmap
          regex: __meta_kubernetes_service_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_service_name]
          target_label: kubernetes_name

    # Example scrape config for probing ingresses via the Blackbox Exporter.
    #
    # The relabeling allows the actual ingress scrape endpoint to be configured
    # for all or only some services.
    - job_name: "kubernetes-ingresses"

      metrics_path: /probe
      params:
        module: [http_2xx]

      kubernetes_sd_configs:
        - role: ingress

      relabel_configs:
        # Example relabel to probe only some ingresses that have "example.io/should_be_probed = true" annotation
        #  - source_labels: [__meta_kubernetes_ingress_annotation_example_io_should_be_probed]
        #    action: keep
        #    regex: true
        - source_labels:
            [
              __meta_kubernetes_ingress_scheme,
              __address__,
              __meta_kubernetes_ingress_path,
            ]
          regex: (.+);(.+);(.+)
          replacement: ${1}://${2}${3}
          target_label: __param_target
        - target_label: __address__
          replacement: blackbox-exporter.example.com:9115
        - source_labels: [__param_target]
          target_label: instance
        - action: labelmap
          regex: __meta_kubernetes_ingress_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_ingress_name]
          target_label: kubernetes_name

    # Example scrape config for pods
    #
    # The relabeling allows the actual pod scrape to be configured
    # for all the declared ports (or port-free target if none is declared)
    # or only some ports.
    - job_name: "kubernetes-pods"

      kubernetes_sd_configs:
        - role: pod

      relabel_configs:
        # Example relabel to scrape only pods that have
        # "example.io/should_be_scraped = true" annotation.
        #  - source_labels: [__meta_kubernetes_pod_annotation_example_io_should_be_scraped]
        #    action: keep
        #    regex: true
        #
        # Example relabel to customize metric path based on pod
        # "example.io/metric_path = <metric path>" annotation.
        #  - source_labels: [__meta_kubernetes_pod_annotation_example_io_metric_path]
        #    action: replace
        #    target_label: __metrics_path__
        #    regex: (.+)
        #
        # Example relabel to scrape only single, desired port for the pod
        # based on pod "example.io/scrape_port = <port>" annotation.
        #  - source_labels: [__address__, __meta_kubernetes_pod_annotation_example_io_scrape_port]
        #    action: replace
        #    regex: ([^:]+)(?::\d+)?;(\d+)
        #    replacement: $1:$2
        #    target_label: __address__
        - action: labelmap
          regex: __meta_kubernetes_pod_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_pod_name]
          action: replace
          target_label: kubernetes_pod_name

$ kubectl apply -f prometheus-cfg.yaml 
configmap/prometheus-config created
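
The kubernetes-apiservers job above keeps only targets whose joined source labels match default;kubernetes;https. The shell sketch below mimics that keep action on two hypothetical joined label strings (Prometheus joins source_labels with ";" and tests the result against the regex; non-matching targets are dropped):

```shell
# "keep" retains targets whose joined source labels match the regex
for joined in "default;kubernetes;https" "kube-system;kube-dns;dns-tcp"; do
  if echo "$joined" | grep -qE '^default;kubernetes;https$'; then
    echo "keep $joined"
  else
    echo "drop $joined"
  fi
done
```

In effect, this job scrapes only the API server endpoints of the default/kubernetes service, no matter how many other endpoints discovery returns.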

Suppose a Kubernetes node carries the following discovery labels:

__meta_kubernetes_node_label_app: frontend
__meta_kubernetes_node_label_env: production
__meta_kubernetes_node_label_region: us-west

With action: labelmap and the regex __meta_kubernetes_node_label_(.+), each of these labels matches, and for each match a new label is created from the capture group: app: frontend, env: production, region: us-west. In the monitoring system you can then identify and query targets by these labels; for example, you can filter out all nodes carrying app: frontend and apply specific monitoring configuration or alerting rules to them. The purpose of relabeling is flexible, targeted label handling, so that the monitoring system's query capabilities can be used to full effect.
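
The labelmap action itself is just a regex rename over label names; the sed sketch below reproduces what __meta_kubernetes_node_label_(.+) does to the three example labels:

```shell
# labelmap keeps the regex capture group as the new label name
for name in __meta_kubernetes_node_label_app \
            __meta_kubernetes_node_label_env \
            __meta_kubernetes_node_label_region; do
  echo "$name" | sed -E 's/^__meta_kubernetes_node_label_(.+)$/\1/'
done
```

The label values are carried over unchanged; only the names are rewritten.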


Deploy Prometheus with a Deployment

day19/prometheus-deploy.yaml

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-server
  namespace: monitor-sa
  labels:
    app: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
      component: server
    #matchExpressions:
    #- {key: app, operator: In, values: [prometheus]}
    #- {key: component, operator: In, values: [server]}
  template:
    metadata:
      labels:
        app: prometheus
        component: server
      annotations:
        prometheus.io/scrape: 'false'
    spec:
      nodeName: xuegod64
      serviceAccountName: monitor
      containers:
      - name: prometheus
        image: prom/prometheus:v2.33.5
        imagePullPolicy: IfNotPresent
        command:
          - prometheus
          - --config.file=/etc/prometheus/prometheus.yml
          - --storage.tsdb.path=/prometheus
          - --storage.tsdb.retention=720h
          - --web.enable-lifecycle
        ports:
        - containerPort: 9090
          protocol: TCP
        volumeMounts:
        - mountPath: /etc/prometheus
          name: prometheus-config
        - mountPath: /prometheus/
          name: prometheus-storage-volume
      volumes:
      - name: prometheus-config
        configMap:
          name: prometheus-config
      - name: prometheus-storage-volume
        hostPath:
          path: /data
          type: Directory

Create a layer-4 proxy (NodePort Service)

day19/prometheus-svc.yaml

apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: monitor-sa
  labels:
    app: prometheus
spec:
  type: NodePort
  ports:
  - port: 9090
    targetPort: 9090
    protocol: TCP
  selector:
    app: prometheus
    component: server
$ kubectl -n monitor-sa get svc
NAME         TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
prometheus   NodePort   10.103.84.66   <none>        9090:30633/TCP   49s
$ curl localhost:30633
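
The 30633 in the Service output is auto-allocated. As a quick sanity check, NodePorts come from a fixed range unless the apiserver's --service-node-port-range has been changed:

```shell
# Kubernetes allocates NodePorts from 30000-32767 by default
port=30633
if [ "$port" -ge 30000 ] && [ "$port" -le 32767 ]; then
  echo "$port is in the default NodePort range"
fi
```

The Prometheus web UI is now reachable on any node's IP at this port, e.g. http://192.168.59.64:30633.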
