OpenTelemetry Operator Best Practices

小夏 · Education · Updated 2024-01-30

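The OpenTelemetry Operator is an implementation of a Kubernetes Operator.

It manages the following:

OpenTelemetry Collector

Auto-instrumentation: automatically instrumenting workloads with the OpenTelemetry instrumentation libraries.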

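DataKit, the collector for Guance (观测云), adopts the OpenTelemetry design and is compatible with the OTLP protocol. You can therefore bypass the OpenTelemetry Collector and push data directly to DataKit, or you can set the OpenTelemetry Collector's exporter to an OTLP address that points to DataKit.

We will integrate APM data into Guance in two scenarios:

APM data is pushed to Guance through the OpenTelemetry Collector.

APM data is pushed directly to Guance.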

Prerequisites:

A Kubernetes (k8s) environment.

A Guance account.

opentelemetry-operator.yaml

wget
Install opentelemetry-operator.yaml

[root@k8s-master ~]# kubectl apply -f opentelemetry-operator.yaml
namespace/opentelemetry-operator-system created
customresourcedefinition.apiextensions.k8s.io/instrumentations.opentelemetry.io created
customresourcedefinition.apiextensions.k8s.io/opentelemetrycollectors.opentelemetry.io created
serviceaccount/opentelemetry-operator-controller-manager created
role.rbac.authorization.k8s.io/opentelemetry-operator-leader-election-role created
clusterrole.rbac.authorization.k8s.io/opentelemetry-operator-manager-role created
clusterrole.rbac.authorization.k8s.io/opentelemetry-operator-metrics-reader created
clusterrole.rbac.authorization.k8s.io/opentelemetry-operator-proxy-role created
rolebinding.rbac.authorization.k8s.io/opentelemetry-operator-leader-election-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/opentelemetry-operator-manager-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/opentelemetry-operator-proxy-rolebinding created
service/opentelemetry-operator-controller-manager-metrics-service created
service/opentelemetry-operator-webhook-service created
deployment.apps/opentelemetry-operator-controller-manager created
certificate.cert-manager.io/opentelemetry-operator-serving-cert created
issuer.cert-manager.io/opentelemetry-operator-selfsigned-issuer created
mutatingwebhookconfiguration.admissionregistration.k8s.io/opentelemetry-operator-mutating-webhook-configuration created
validatingwebhookconfiguration.admissionregistration.k8s.io/opentelemetry-operator-validating-webhook-configuration created
Check the pod

[root@k8s-master df-demo]# kubectl get pod -n opentelemetry-operator-system
NAME                                                         READY   STATUS    RESTARTS   AGE
opentelemetry-operator-controller-manager-7b4687df88-9s967   2/2     Running   0          26h
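Collector and Instrumentation resources pass through the operator's admission webhooks, so it can help to block until the operator Deployment has finished rolling out before applying them. The following is a minimal sketch: the namespace and Deployment name come from the output above, while the function name wait_for_operator and the 120s timeout are illustrative assumptions.

```shell
# Wait until the operator Deployment reports a successful rollout.
# Assumption: 120s is an arbitrary timeout chosen for illustration.
wait_for_operator() {
  kubectl rollout status deployment/opentelemetry-operator-controller-manager \
    -n opentelemetry-operator-system --timeout=120s
}

# Usage: wait_for_operator && kubectl apply -f opentelemetry-collector.yaml
```

Chaining the apply with && means the collector manifest is only submitted once the rollout check has succeeded.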
opentelemetry-collector.yaml

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: demo
spec:
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      memory_limiter:
        check_interval: 1s
        limit_percentage: 75
        spike_limit_percentage: 15
      batch:
        send_batch_size: 10000
        timeout: 10s
    exporters:
      logging:
      otlp:
        endpoint: "" # export trace data to the Guance platform
        tls:
          insecure: true
        compression: none # do not enable gzip
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [logging, otlp]
        metrics:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [logging]
        logs:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [logging]

Apply opentelemetry-collector.yaml

kubectl apply -f opentelemetry-collector.yaml
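Check the pod

[root@k8s-master ~]# kubectl get pod
NAME                              READY   STATUS    RESTARTS   AGE
demo-collector-59b9447bf9-dz47k   1/1     Running   0          61m

The OpenTelemetry Operator can inject and configure the OpenTelemetry auto-instrumentation libraries. It currently supports Apache HTTPD, DotNet, Go, Java, NodeJS and Python. To use auto-instrumentation, configure an Instrumentation resource with the settings for the SDK and the instrumentation.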

opentelemetry-instrumentation.yaml

apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: my-instrumentation
spec:
  exporter:
    endpoint: http://demo-collector:4317 # OpenTelemetry Collector address
    # endpoint: # Guance DataKit OpenTelemetry collector address
  propagators:
    - tracecontext
    - baggage
    - b3
  #sampler:
  #  type: parentbased_traceidratio
  #  argument: "0.25"
  java:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest
  nodejs:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:latest
  python:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:latest

exporter: the address that data is reported to. It can be the OpenTelemetry Collector or any other collector that accepts OTLP data.

propagators: the trace context propagators.

sampler: the sampler.

java, nodejs, python: the agents for the different languages. Fill these in according to what the project actually needs.

Apply opentelemetry-instrumentation.yaml

kubectl apply -f opentelemetry-instrumentation.yaml
Check the instrumentation

[root@k8s-master ~]# kubectl get instrumentation
NAME                 AGE   ENDPOINT                     SAMPLER   SAMPLER ARG
my-instrumentation   65m   http://demo-collector:4317

Alternatively, use the kubectl get otelinst command.

[root@k8s-master ~]# kubectl get otelinst
NAME                 AGE   ENDPOINT                     SAMPLER   SAMPLER ARG
my-instrumentation   71m   http://demo-collector:4317
In datakit.yaml, add the following volumeMounts to the DaemonSet:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: daemonset-datakit
  name: datakit
  namespace: datakit
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: daemonset-datakit
  template:
    metadata:
      labels:
        app: daemonset-datakit
    spec:
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
        ...
        volumeMounts:
        - mountPath: /usr/local/datakit/conf.d/opentelemetry/opentelemetry.conf
          name: datakit-conf
          subPath: opentelemetry.conf
        ...
In datakit.yaml, add opentelemetry.conf to the data section of the ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: datakit-conf
  namespace: datakit
data:
  opentelemetry.conf: |
    [[inputs.opentelemetry]]
      [inputs.opentelemetry.grpc]
        trace_enable = true
        metric_enable = true
        addr = "0.0.0.0:4319" # default 4317
      [inputs.opentelemetry.http]
        enable = false
        http_status_ok = 200

Restart DataKit.

Here is a Java application, springboot-server.

springboot-server.yaml

apiVersion: v1
kind: Service
metadata:
  name: springboot-server
  labels:
    app: springboot-server
spec:
  selector:
    app: springboot-server
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080
      nodePort: 31010
  type: NodePort
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: springboot-server
spec:
  selector:
    matchLabels:
      app: springboot-server
  replicas: 1
  template:
    metadata:
      labels:
        app: springboot-server
      annotations:
        sidecar.opentelemetry.io/inject: "true"
        instrumentation.opentelemetry.io/inject-java: "true"
    spec:
      containers:
        - name: app
          image: registry.cn-shenzhen.aliyuncs.com/lr_715377484/springboot-server
          ports:
            - containerPort: 8080
              protocol: TCP

Apply springboot-server.yaml

kubectl apply -f springboot-server.yaml
Check the pod

[root@k8s-master ~]# kubectl get pod -owide
NAME                                 READY   STATUS    RESTARTS   AGE   IP                NODE        NOMINATED NODE   READINESS GATES
demo-collector-59b9447bf9-dz47k      1/1     Running   0          24h   100.111.156.98    k8s-node1   <none>           <none>
springboot-server-64b78f4487-9hv9r   1/1     Running   0          24h   100.111.156.108   k8s-node1   <none>           <none>
Inspect the pod details.

[root@k8s-master ~]# kubectl describe pod springboot-server-64b78f4487-9hv9r
Name:         springboot-server-64b78f4487-9hv9r
Namespace:    default
Priority:     0
Node:         k8s-node1/172.31.22.247
Start Time:   ...
Labels:       app=springboot-server
              pod-template-hash=64b78f4487
Annotations:  cni.projectcalico.org/containerID: 5700e2ab666a8bbc32b1ac84cc3d98137a7e186ca5cf4b0b6e7407ac8139d391
              cni.projectcalico.org/podIP: 100.111.156.108/32
              cni.projectcalico.org/podIPs: 100.111.156.108/32
              instrumentation.opentelemetry.io/inject-java: true
              sidecar.opentelemetry.io/inject: true
Status:       Running
IP:           100.111.156.108
IPs:
  IP:           100.111.156.108
Controlled By:  ReplicaSet/springboot-server-64b78f4487
Init Containers:
  opentelemetry-auto-instrumentation:
    Container ID:  containerd://c5747d8217b43fcb1a8eac00fbd33d70c7b25d1a3f0faaccdacea94c8b1e016b
    Image:         ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest
    Image ID:      ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java@sha256:f903e6eb067f28cba1f37b6ac592b511c61ce0bf2a73f6e7619359ac5d500d85
    Port:          <none>
    Host Port:     <none>
    Command:
      cp
      /javaagent.jar
      /otel-auto-instrumentation/javaagent.jar
    ...
    Mounts:
      /otel-auto-instrumentation from opentelemetry-auto-instrumentation (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lbmf6 (ro)
Containers:
  app:
    Container ID:   containerd://0db185a75e9eeb5eed97aaf8e707f4bd30f210e404f5fae98fc0d55a300a4470
    Image:          registry.cn-shenzhen.aliyuncs.com/lr_715377484/springboot-server
    Image ID:       registry.cn-shenzhen.aliyuncs.com/lr_715377484/springboot-server@sha256:bf394ec31566653bc6aa0e56dfc94a602bde3d95dfb08ac96d7f33c5dc00005e
    Port:           8080/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      ...
    Ready:          True
    Restart Count:  0
    Environment:
      JAVA_TOOL_OPTIONS:                    -javaagent:/otel-auto-instrumentation/javaagent.jar
      OTEL_SERVICE_NAME:                   springboot-server
      OTEL_EXPORTER_OTLP_ENDPOINT:         http://demo-collector:4317
      OTEL_RESOURCE_ATTRIBUTES_POD_NAME:   springboot-server-64b78f4487-9hv9r (v1:metadata.name)
      OTEL_RESOURCE_ATTRIBUTES_NODE_NAME:   (v1:spec.nodeName)
      OTEL_PROPAGATORS:                    tracecontext,baggage,b3
      OTEL_RESOURCE_ATTRIBUTES:            k8s.container.name=app,k8s.deployment.name=springboot-server,k8s.namespace.name=default,k8s.node.name=$(OTEL_RESOURCE_ATTRIBUTES_NODE_NAME),k8s.pod.name=$(OTEL_RESOURCE_ATTRIBUTES_POD_NAME),k8s.replicaset.name=springboot-server-64b78f4487
    Mounts:
      /otel-auto-instrumentation from opentelemetry-auto-instrumentation (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lbmf6 (ro)
Init Containers: during pod initialization, the opentelemetry-auto-instrumentation container runs and copies javaagent.jar into the shared /otel-auto-instrumentation volume.

The environment variables injected into the Java application by default:

Environment:
  JAVA_TOOL_OPTIONS:                    -javaagent:/otel-auto-instrumentation/javaagent.jar
  OTEL_SERVICE_NAME:                   springboot-server
  OTEL_EXPORTER_OTLP_ENDPOINT:         http://demo-collector:4317
  OTEL_RESOURCE_ATTRIBUTES_POD_NAME:   springboot-server-64b78f4487-9hv9r (v1:metadata.name)
  OTEL_RESOURCE_ATTRIBUTES_NODE_NAME:   (v1:spec.nodeName)
  OTEL_PROPAGATORS:                    tracecontext,baggage,b3
  OTEL_RESOURCE_ATTRIBUTES:            k8s.container.name=app,k8s.deployment.name=springboot-server,k8s.namespace.name=default,k8s.node.name=$(OTEL_RESOURCE_ATTRIBUTES_NODE_NAME),k8s.pod.name=$(OTEL_RESOURCE_ATTRIBUTES_POD_NAME),k8s.replicaset.name=springboot-server-64b78f4487
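The $(...) references inside OTEL_RESOURCE_ATTRIBUTES are Kubernetes dependent environment variables: when the container starts, they are replaced with the values of the pod-name and node-name variables defined earlier in the list. The sketch below only emulates that substitution with sed to show what the agent ends up seeing. The pod and node values are taken from the describe output above, and the shortened template is an illustrative assumption.

```shell
# Values taken from the describe output above (Node: k8s-node1).
OTEL_RESOURCE_ATTRIBUTES_POD_NAME="springboot-server-64b78f4487-9hv9r"
OTEL_RESOURCE_ATTRIBUTES_NODE_NAME="k8s-node1"

# Single quotes keep $(...) literal, as it appears in the pod spec.
# Assumption: a shortened attribute list, for readability.
template='k8s.container.name=app,k8s.node.name=$(OTEL_RESOURCE_ATTRIBUTES_NODE_NAME),k8s.pod.name=$(OTEL_RESOURCE_ATTRIBUTES_POD_NAME)'

# Emulate the $(VAR) substitution with sed, for illustration only.
resolved=$(printf '%s\n' "$template" | sed \
  -e "s/[$](OTEL_RESOURCE_ATTRIBUTES_NODE_NAME)/$OTEL_RESOURCE_ATTRIBUTES_NODE_NAME/" \
  -e "s/[$](OTEL_RESOURCE_ATTRIBUTES_POD_NAME)/$OTEL_RESOURCE_ATTRIBUTES_POD_NAME/")

# Print one key=value pair per line.
printf '%s\n' "$resolved" | tr ',' '\n'
```

Each resulting key=value pair becomes a resource attribute on the spans that the Java agent exports.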
At this point, opentelemetry-auto-instrumentation has been successfully injected into the Java application.

On the host, run the following command to generate trace data.

[root@k8s-master ~]# curl
Besides using the pod IP, you can also generate trace data by accessing the port exposed by the Service.

[root@k8s-master ~]# curl http://localhost:31010/gateway
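A single request produces a single trace. To have more data to look at, the request can be repeated in a loop. This is a minimal sketch: the function name generate_traces, the default of 5 requests, and the 1-second pause are illustrative assumptions, while the URL in the usage line is the one shown above.

```shell
# Send a handful of requests so that several traces are produced.
# Assumption: 5 requests by default, with a 1-second pause between them.
generate_traces() {
  url="$1"
  count="${2:-5}"
  i=1
  while [ "$i" -le "$count" ]; do
    # -s silences progress output; -w prints only the HTTP status code.
    curl -s -o /dev/null -w '%{http_code}\n' "$url"
    i=$((i + 1))
    sleep 1
  done
}

# Usage: generate_traces http://localhost:31010/gateway 5
```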
[root@k8s-master ~]# kubectl logs -f springboot-server-64b78f4487-9hv9r
...
2023-08-* 16:34:17.454 [http-nio-8080-exec-8] info c.z.o.s.c.servercontroller - [auth,74] traceid=a1b510158fc09c55c04de2d9472d10d7 spanid=61b6bd8264f7d8b1 - this is auth
2023-08-* 16:34:17.456 [http-nio-8080-exec-5] info c.z.o.s.f.corsfilter - [dofilter,32] traceid=a1b510158fc09c55c04de2d9472d10d7 spanid=62370160a0fc0738 - url:/billing,header:
accept :application/json, application/*+json
traceparent :00-a1b510158fc09c55c04de2d9472d10d7-057f9e068e9cc007-01
b3 :a1b510158fc09c55c04de2d9472d10d7-057f9e068e9cc007-1
user-agent :java/1.8.0_212
host :localhost:8080
connection :keep-alive
2023-08-* 16:34:17.456 [http-nio-8080-exec-5] info c.z.o.s.c.servercontroller - [billing,82] traceid=a1b510158fc09c55c04de2d9472d10d7 spanid=9514404368a2d4fd - this is method3,null

The output shows that trace-related headers have been generated: traceparent and b3. The log lines also carry traceid and spanid. For how to correlate logs with traces, see the log correlation documentation.
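Both headers encode the same trace. The W3C traceparent header has the form version-traceid-parentid-flags, and the single-header b3 format has the form traceid-spanid-sampled. The following sketch splits the two header values copied from the log above and checks that they carry the same trace ID. The variable names are illustrative.

```shell
# Header values copied from the log output above.
traceparent="00-a1b510158fc09c55c04de2d9472d10d7-057f9e068e9cc007-01"
b3="a1b510158fc09c55c04de2d9472d10d7-057f9e068e9cc007-1"

# W3C traceparent: version-traceid-parentid-flags
tp_version=$(printf '%s\n' "$traceparent" | cut -d- -f1)
tp_trace_id=$(printf '%s\n' "$traceparent" | cut -d- -f2)
tp_parent_id=$(printf '%s\n' "$traceparent" | cut -d- -f3)
tp_flags=$(printf '%s\n' "$traceparent" | cut -d- -f4)

# B3 single header: traceid-spanid-sampled
b3_trace_id=$(printf '%s\n' "$b3" | cut -d- -f1)
b3_span_id=$(printf '%s\n' "$b3" | cut -d- -f2)
b3_sampled=$(printf '%s\n' "$b3" | cut -d- -f3)

echo "traceparent: version=$tp_version trace_id=$tp_trace_id parent_id=$tp_parent_id flags=$tp_flags"
echo "b3:          trace_id=$b3_trace_id span_id=$b3_span_id sampled=$b3_sampled"

# Both headers should carry the trace ID that appears as traceid= in the log.
if [ "$tp_trace_id" = "$b3_trace_id" ]; then
  echo "trace ids match"
fi
```

The traceparent flags value 01 and the b3 sampled value 1 both mark the trace as sampled.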

Adjust the exporter endpoint in opentelemetry-instrumentation.yaml, which is injected into the pod as OTEL_EXPORTER_OTLP_ENDPOINT.

[root@k8s-master ~]# kubectl describe pod springboot-server-64b78f4487-t7gph | grep OTEL_EXPORTER_OTLP_ENDPOINT
OTEL_EXPORTER_OTLP_ENDPOINT:

Re-apply the YAML files with the following commands:

kubectl delete -f opentelemetry-instrumentation.yaml
kubectl apply -f opentelemetry-instrumentation.yaml
kubectl delete -f springboot-server.yaml
kubectl apply -f springboot-server.yaml
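Recreating the Deployment gives the pod a new name, so the endpoint check is easier to repeat with a label selector than with a hard-coded pod name. This is a minimal sketch: the label app=springboot-server comes from springboot-server.yaml, the function name show_otlp_endpoint is an illustrative assumption, and it assumes the application is the first container in the pod, as in the describe output earlier.

```shell
# Print the OTLP endpoint injected into the first pod matching the app label.
# Assumption: the app container is containers[0], as in the describe output.
show_otlp_endpoint() {
  kubectl get pod -l app=springboot-server \
    -o jsonpath='{.items[0].spec.containers[0].env[?(@.name=="OTEL_EXPORTER_OTLP_ENDPOINT")].value}'
}

# Usage: show_otlp_endpoint
```

An empty result suggests that the pod was created before the Instrumentation resource was updated, since the variables are injected only when the pod is created.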
Access the application URL to generate trace data.

Log in to your Guance account and open the trace view.

opentelemetry-operator-demo:
