DevOps & SRE notes
Helpful articles and tools for DevOps & SRE. WhatsApp: https://whatsapp.com/channel/0029Vb79nmmHVvTUnc4tfp2F For paid consultation (RU/EN), contact @tutunak. All ways to support: https://telegra.ph/How-support-the-channel-02-19
Post archive
Networking within container orchestration can often seem like a black box to developers. This explanation aims to demystify Kubernetes CNI providers and how they manage connectivity.
https://medium.com/@csinclair11/demystifying-kubernetes-cni-providers-5ed79569c797
The article details how to implement production-grade distributed tracing for complex multi-agent AI workflows using OpenTelemetry.
https://developers.redhat.com/articles/2026/04/06/distributed-tracing-agentic-workflows-opentelemetry#
Many organizations are looking for more efficient logging solutions than the traditional stack. This comparison highlights a modern alternative to ELK that aims to reduce complexity and resource usage.
https://osuite.io/articles/modern-alternative-to-elk
This informative post details a clever method for securing Grafana dashboards when using Google Cloud Identity-Aware Proxy. You will learn how to seamlessly integrate these two powerful technologies for enhanced access control.
https://www.vidbregar.com/blog/grafana-gcp-iap
Managing expenses in the cloud requires a strategic approach beyond just looking at bills. A senior engineer shares valuable insight into optimizing costs effectively in this detailed read.
https://medium.com/@razkevich8/cloud-cost-optimization-a-senior-engineers-guide-d49ed4606de1
OpenEBS is a popular, widely deployed open source container-native storage platform for stateful persistent applications on Kubernetes.
https://github.com/openebs/openebs
The observability market is shifting from volume-based data ingestion to a value-driven model due to the unsustainable costs of scaling cloud-native and AI workloads. Driven by innovations like Chronosphere’s "Logs 2.0" and its subsequent acquisition by Palo Alto Networks, the industry is prioritizing "signal discipline"—retaining only actionable telemetry—and integrating observability directly into broader AI and security platforms.
https://siliconangle.com/2026/02/05/observability-cost-ai-scale-chronosphere-opensourcesummit/
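To make "signal discipline" concrete, here is a minimal sketch of a retention filter that keeps only telemetry someone could act on. The record fields (`severity`, `linked_alert`) and the WARNING threshold are assumptions made for illustration; they do not come from the article or from any vendor's product.

```python
# Hypothetical severity ordering; real pipelines define their own levels.
SEVERITY_RANK = {"DEBUG": 0, "INFO": 1, "WARNING": 2, "ERROR": 3}


def is_actionable(record: dict, min_severity: str = "WARNING") -> bool:
    """Return True if the record is worth retaining under the assumed policy."""
    # Unknown or missing severities are treated as INFO.
    rank = SEVERITY_RANK.get(record.get("severity", "INFO"), SEVERITY_RANK["INFO"])
    # Keep anything at or above the threshold, plus anything tied to an alert.
    return rank >= SEVERITY_RANK[min_severity] or bool(record.get("linked_alert"))


def retain(records: list[dict]) -> list[dict]:
    """Drop records that nobody would act on; preserve input order."""
    return [r for r in records if is_actionable(r)]
```

A real pipeline would likely sample or aggregate the dropped records rather than discard them outright.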
┌──────────┐ ┌──────────┐ ┌──────────┐
) CC ✻ ┊ ( ) CC ✻ ┊ ( ) CC ✻ ┊ (
└──────────┘ └──────────┘ └──────────┘
Claude Code gave me three "tickets" for a free week. You can grab them using this link: https://claude.ai/referral/NXtyf-cgbQ
Uber engineered an automated approach to migrate its massive Java monorepo (over 600,000 tests, 15 million lines of code) from the deprecated JUnit 4 to JUnit 5. Facing challenges like the lack of native JUnit 5 support in their Bazel build system and custom test configurations, they successfully migrated over 75,000 test classes and 1.25 million lines of code in just four months without disrupting developer workflows.
https://www.uber.com/us/en/blog/junit-migration/
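To give a feel for the mechanical part of such a migration, here is a minimal sketch that rewrites common JUnit 4 imports and annotations to their JUnit 5 (Jupiter) equivalents. It is a regex-based illustration only and not Uber's tooling; `migrate_source` and both mapping tables are hypothetical helpers.

```python
import re

# Common JUnit 4 imports and their JUnit 5 (Jupiter) equivalents.
IMPORT_MAP = {
    "org.junit.Test": "org.junit.jupiter.api.Test",
    "org.junit.Before": "org.junit.jupiter.api.BeforeEach",
    "org.junit.After": "org.junit.jupiter.api.AfterEach",
    "org.junit.BeforeClass": "org.junit.jupiter.api.BeforeAll",
    "org.junit.AfterClass": "org.junit.jupiter.api.AfterAll",
    "org.junit.Ignore": "org.junit.jupiter.api.Disabled",
}

# Annotation renames; @Test keeps its name and only its import changes.
ANNOTATION_MAP = {
    "Before": "BeforeEach",
    "After": "AfterEach",
    "BeforeClass": "BeforeAll",
    "AfterClass": "AfterAll",
    "Ignore": "Disabled",
}


def migrate_source(java_source: str) -> str:
    """Rewrite JUnit 4 imports and annotations in one Java source string."""

    def swap_import(match: re.Match) -> str:
        name = match.group(1)
        return f"import {IMPORT_MAP.get(name, name)};"

    def swap_annotation(match: re.Match) -> str:
        name = match.group(1)
        return "@" + ANNOTATION_MAP.get(name, name)

    # Only top-level org.junit.X imports match, so Jupiter imports are left alone.
    result = re.sub(r"import\s+(org\.junit\.\w+);", swap_import, java_source)
    # Whole annotation names are matched, so @BeforeClass is not read as @Before.
    return re.sub(r"@(\w+)\b", swap_annotation, result)
```

This sketch does not handle cases such as `@Test(expected = ...)`, JUnit 4 rules, custom runners, or annotation-like text inside comments and strings. A migration at monorepo scale would need AST-based tooling for those.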
The article explains that while Kubernetes excels at scheduling and isolating workloads, it lacks the context to secure Large Language Models (LLMs), which process untrusted natural language inputs. Highlighting four key risks from the OWASP Top 10 for LLMs, the author argues that security controls shouldn't live within the model runtime (like Ollama). Instead, organizations need a dedicated, LLM-aware policy layer (such as LiteLLM, Kong AI Gateway, or Portkey) in front of the model to enforce validation, filtering, and authorization.
https://www.cncf.io/blog/2026/03/30/llms-on-kubernetes-part-1-understanding-the-threat-model/
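The idea of a policy layer in front of the model can be sketched as a check that runs before any request reaches the runtime. Everything here is hypothetical: the roles, model names, size limit, and blocked pattern are illustrative assumptions, and this is not the API of LiteLLM, Kong AI Gateway, or Portkey.

```python
import re
from dataclasses import dataclass


@dataclass
class PolicyDecision:
    allowed: bool
    reason: str


# Hypothetical policy data; a real gateway would load this from configuration.
ALLOWED_MODELS_BY_ROLE = {
    "analyst": {"small-model"},
    "admin": {"small-model", "large-model"},
}
MAX_PROMPT_CHARS = 4000
BLOCKED_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
]


def check_request(role: str, model: str, prompt: str) -> PolicyDecision:
    """Apply authorization, validation, and filtering before calling the model."""
    # Authorization: is this role allowed to use this model at all?
    if model not in ALLOWED_MODELS_BY_ROLE.get(role, set()):
        return PolicyDecision(False, "role not authorized for model")
    # Validation: reject oversized inputs.
    if len(prompt) > MAX_PROMPT_CHARS:
        return PolicyDecision(False, "prompt too long")
    # Filtering: a naive deny-list, shown only to mark where filtering would sit.
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(prompt):
            return PolicyDecision(False, "prompt matched blocked pattern")
    return PolicyDecision(True, "ok")
```

Pattern matching like this is easy to bypass. The point is the placement: the checks live outside the model runtime, so they apply regardless of which model serves the request.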
kubefwd: bulk port forwarding of Kubernetes services for local development.
https://github.com/txn2/kubefwd
The article features an interview with Landon Clipp, who built a multi-tenant GPU-based Containers-as-a-Service (CaaS) platform. Topics covered:
- Bypassing the NVIDIA GPU Operator
- Why gVisor Fails for GPUs
- VM Boot Delays
- Firmware and Memory Security
- Ideal Workload
https://kube.fm/gpu-containers-as-a-service-landon
Kubernetes Goat is a "Vulnerable by Design" cluster environment to learn and practice Kubernetes security using an interactive hands-on playground 🚀
https://github.com/madhuakula/kubernetes-goat
Any user with the Argo CD application get permission can extract real Kubernetes Secret values, including service account tokens, TLS certificates, database credentials, and API keys. On Applications where IncludeMutationWebhook=true is already set, exploitation requires only read-only Argo CD access.
https://github.com/argoproj/argo-cd/security/advisories/GHSA-3v3m-wc6v-x4x3
🚀 2.3x faster than MinIO for 4KB object payloads. RustFS is an open-source, S3-compatible, high-performance object storage system supporting migration and coexistence with other S3-compatible platforms such as MinIO and Ceph.
https://github.com/rustfs/rustfs
