DevOps & SRE notes

Helpful articles and tools for DevOps & SRE
WhatsApp: https://whatsapp.com/channel/0029Vb79nmmHVvTUnc4tfp2F
For paid consultation (RU/EN), contact: @tutunak
All ways to support: https://telegra.ph/How-support-the-channel-02-19

📈 Analytical overview of Telegram channel DevOps & SRE notes

The channel DevOps & SRE notes (@devops_sre_notes) is active in the English-language segment. It currently has 12 657 subscribers and ranks 10 040 in the Technologies & Applications category and 2 978 in the USA region.

📊 Audience metrics and dynamics

Since its creation (date unknown), the project has gathered an audience of 12 657 subscribers.

According to the latest data from 11 June 2026, the channel shows stable activity: the number of subscribers grew by 228 over the last 30 days and by 17 over the last 24 hours.

  • Verification status: Not verified
  • Engagement rate (ER): The average audience engagement rate is 17.75%. Within the first 24 hours after publication, a post typically engages 4.84% of all subscribers.
  • Post reach: On average, each post receives 2 247 views. Within the first day, a publication typically gains 612 views.
  • Reactions and interaction: The average number of reactions per post is 3.
  • Thematic interests: Content is focused on key topics such as Kubernetes, cluster, author, engineering, monitoring.

πŸ“ Description and content policy

The author describes the channel as follows:
"Helpful articles and tools for DevOps & SRE. WhatsApp: https://whatsapp.com/channel/0029Vb79nmmHVvTUnc4tfp2F For paid consultation (RU/EN), contact: @tutunak All ways to support: https://telegra.ph/How-support-the-channel-02-19"

The channel is updated frequently (latest data received on 12 June 2026), which keeps its content current and its publication reach steady. Analytics show that the audience interacts with the content, making the channel a point of influence in the Technologies & Applications category.

Subscribers: 12 657
Last 24 hours: +17
Last 7 days: +67
Last 30 days: +228
Posts Archive
Richard Artoul explores the distinctions between "shared nothing" and "shared storage" architectures, particularly within data streaming contexts. He highlights how shared storage systems, by decoupling data from metadata, offer enhanced flexibility and scalability compared to traditional shared-nothing models. https://www.warpstream.com/blog/the-case-for-shared-storage

In his article "TTR: the out-of-control metric," Lorin Hochstein critiques the application of the Time-to-Resolve (TTR) metric in incident management. He argues that since incidents represent periods when systems are out of control, applying statistical analyses to TTR is ineffective and does not lead to meaningful improvements. https://surfingcomplexity.blog/2024/11/23/ttr-the-out-of-control-metric/

OpenTofu / Terraform / Terragrunt and Atmos version manager https://github.com/tofuutils/tenv

An operator to manage ephemeral Kubernetes resources 🐝 https://github.com/NCCloud/mayfly

The blog post highlights security risks in automating the Terraform lifecycle. It shows how malicious actors can exploit Terraform automation platforms, such as HashiCorp Cloud and Atlantis, by creating custom providers or using data sources to execute malicious code during the terraform plan phase. This can expose sensitive cloud credentials and compromise entire cloud environments. The article emphasizes the need for secure defaults and validation mechanisms in these platforms to mitigate such risks. https://snyk.io/blog/gitflops-dangers-of-terraform-automation-platforms/

The Medium article "Autoscaling with Keda and Prometheus Using Custom Metrics in Go" is a detailed guide to autoscaling in Kubernetes with KEDA and Prometheus. It demonstrates creating custom Prometheus metrics in a Go application, deploying it on Kubernetes, and configuring Prometheus to scrape these metrics. It then shows how to integrate KEDA with Prometheus to scale pods based on custom metrics, such as the number of HTTP requests or product orders, so that resources follow varying traffic. https://medium.com/vakifbank-teknoloji/autoscaling-with-keda-and-prometheus-using-custom-metrics-in-go-558a64668fc4
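To make the scaling step concrete, here is a minimal Python sketch of the replica arithmetic that sits behind metric-based autoscaling. It assumes an average-value target: the total metric (for example, requests per second summed over all pods) is divided by the per-pod threshold and rounded up, then clamped to the configured bounds. The function and parameter names are hypothetical and are not taken from the article or from KEDA; the real Horizontal Pod Autoscaler also applies a tolerance and stabilization windows that this sketch ignores.

```python
import math


def desired_replicas(metric_total: float, target_per_pod: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Estimate the replica count for an average-value metric target.

    Hypothetical helper: a simplification of the autoscaler's arithmetic,
    without tolerance or stabilization handling.
    """
    if target_per_pod <= 0:
        raise ValueError("target_per_pod must be positive")
    # Each pod should carry at most target_per_pod of the total load.
    raw = math.ceil(metric_total / target_per_pod)
    # Clamp to the bounds configured on the scaler.
    return max(min_replicas, min(max_replicas, raw))
```

For example, 450 requests per second against a threshold of 100 per pod asks for 5 replicas, while zero traffic falls back to the minimum.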

Repost from N/a
🚀 Golang Notes 🐹
Looking for a place to level up your Go skills? Join Golang Notes and stay ahead in the world of Golang! ✨
What you'll find:
🔹 Best practices and coding tips
🔹 Latest updates from the Go ecosystem
🔹 Useful tools, snippets, and guides
🔹 Community discussions and expert insights
👨‍💻 Whether you're a beginner or an experienced developer, this channel has something for you!
🔗 Join now

Kuzco reviews your Terraform and OpenTofu resources, compares them to the provider schema to detect unused parameters, and uses AI to suggest improvements and fixes https://github.com/RoseSecurity/Kuzco

Retry a command with exponential backoff and jitter (+ Starlark expressions) https://github.com/dbohdan/recur
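The idea behind the tool can be illustrated with a short Python sketch of exponential backoff with "full jitter". This is not recur's implementation (recur is a command-line tool); the function name, parameters, and defaults below are hypothetical. The delay cap doubles after every failed attempt, and the actual sleep is a random fraction of that cap so that many clients do not retry in lockstep.

```python
import random
import time


def retry(fn, max_attempts=5, base_delay=0.5, max_delay=30.0,
          sleep=time.sleep, rng=random.random):
    """Call fn until it succeeds or max_attempts is reached.

    sleep and rng are injectable so the behaviour can be tested
    without waiting or relying on real randomness.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Exponential cap: base, 2*base, 4*base, ... up to max_delay.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: sleep a random fraction of the cap.
            sleep(rng() * cap)
```

Injecting `sleep` and `rng` is a design choice for testability; in normal use the defaults are enough, for example `retry(lambda: fetch_config())` with a hypothetical `fetch_config`.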

The author provides a comprehensive guide to building a REST API hosted on AWS API Gateway with a backend on AWS Lambda and a database on DynamoDB. The guide includes setting up AWS services using Terraform, creating a Lambda function to perform CRUD operations on DynamoDB, and implementing authentication with Amazon Cognito to secure certain routes https://awstip.com/a-step-by-step-guide-on-deploying-rest-api-using-api-gateway-lambda-cognito-terraform-f277814d048e

The blog post addresses the challenge engineering managers face in maintaining their technical skills amid busy schedules. Instead of trying to dedicate a significant portion of their time to hands-on technical work, it suggests that managers use their team's diversity and projects to stay up to date. This involves guiding team members through experimental projects, learning from their experiences, and teaching junior engineers, which helps maintain a technical edge without compromising work-life balance. https://medium.com/engineering-managers-journal/real-ways-to-maintain-your-technical-edge-as-an-engineering-manager-25652fa1495c

Goliat - Dashboard is an open-source tool for managing, visualizing, and optimizing Terraform deployments, with integration to Terraform Cloud and a custom provider. https://github.com/danieljsaldana/goliat-dashboard

Stateless cluster local OCI registry mirror. https://github.com/spegel-org/spegel

The article delves into the intricacies of Kubernetes resource management, specifically focusing on requests and limits. It explains how these settings impact pod scheduling, resource allocation, and performance, highlighting the importance of correctly configuring them to ensure efficient use of cluster resources and prevent overcommitting or underutilization. Understanding these concepts is crucial for optimizing application performance and reliability in Kubernetes environments. https://thenewstack.io/how-kubernetes-requests-and-limits-really-work/
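The scheduling side of this can be sketched in a few lines of Python. The sketch assumes the simplified rule that the scheduler places a pod on a node only when the sum of resource requests, not limits, stays within the node's allocatable capacity; limits may add up to more than the node has, which is what overcommitting means here. The class and function names are hypothetical, and real scheduling considers many more factors (affinity, taints, extended resources).

```python
from dataclasses import dataclass


@dataclass
class Resources:
    cpu_millicores: int
    memory_mib: int


def fits(allocatable: Resources, scheduled_requests: list,
         pod_request: Resources) -> bool:
    """Return True if the pod's requests fit next to those already placed."""
    used_cpu = sum(r.cpu_millicores for r in scheduled_requests)
    used_mem = sum(r.memory_mib for r in scheduled_requests)
    return (used_cpu + pod_request.cpu_millicores <= allocatable.cpu_millicores
            and used_mem + pod_request.memory_mib <= allocatable.memory_mib)


def memory_limit_overcommit(allocatable: Resources, limits: list) -> float:
    """Ratio of summed memory limits to allocatable memory (> 1.0 means overcommitted)."""
    return sum(l.memory_mib for l in limits) / allocatable.memory_mib
```

A ratio above 1.0 is not an error by itself, but it means pods can be killed for memory if they all approach their limits at once.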

The author discusses strategies for significantly reducing the startup time of AWS EKS Windows nodes. The author achieved this by using Karpenter for dynamic node provisioning, optimizing PowerShell scripts, and pre-caching images with AWS Image Builder. Key optimizations included uninstalling unnecessary PowerShell modules and rewriting the bootstrap script in C# for better performance, resulting in startup times under 90 seconds https://hackernoon.com/how-i-reduced-eks-windows-node-start-time-from-5-min-to-90s

🔥 Critical vulnerability in the ingress-nginx controller (CVSS 9.8/10) 🔥
https://github.com/advisories/GHSA-mgvx-rpfc-9mpv

If you're running Kubernetes with the ingress-nginx controller and are affected by the vulnerability described in GHSA-mgvx-rpfc-9mpv (CVE-2025-1974), you face several serious security risks.

Critical Security Risks

This vulnerability, published on March 25, 2025, is part of a set of critical flaws collectively named "IngressNightmare" with a CVSS score of 9.8[6]. The specific issues include:
- Unauthenticated Remote Code Execution (RCE): An attacker with access to the pod network can execute arbitrary code in the context of the ingress-nginx controller without authentication[1][2].
- Cluster-wide Secret Exposure: The vulnerability allows attackers to access and steal all secrets accessible to the controller. In default installations, the controller can access all secrets across all namespaces in the cluster[1][3].
- Complete Cluster Takeover: Due to the elevated privileges of the admission controller, successful exploitation could lead to full compromise of your Kubernetes environment[3][6].
- Public Exposure Risk: Over 6,500 clusters with publicly accessible admission controllers are at immediate risk, including those operated by Fortune 500 companies[8].

How the Vulnerability Works

The attack targets the admission controller component of the ingress-nginx controller:
1. The vulnerability allows attackers to inject arbitrary NGINX configuration remotely by sending a malicious ingress object directly to the admission controller[3].
2. When the controller processes this malicious object during validation, it causes the NGINX validator to execute malicious code[6][8].
3. The admission controller's elevated privileges and network accessibility create a critical escalation path, allowing an attacker to access sensitive resources across the entire cluster[3].

Required Action

To mitigate this issue, you should:
- Update immediately to one of the patched versions: 1.12.1, 1.11.5, or 1.10.7[6].
- Ensure your admission webhook endpoint is not exposed externally[6].
- Limit access to the admission controller to only the Kubernetes API Server[6].
- Temporarily disable the admission controller component if it's not needed[6].

This vulnerability affects approximately 43% of cloud environments, making it a widespread and serious threat to Kubernetes deployments[6].
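As a rough aid for the "update immediately" step, the following Python sketch checks whether a controller version string is at or above the patched release for its minor line. The patched versions are copied from the post above and should be verified against the advisory before relying on them. The helper names are hypothetical, and the sketch assumes plain MAJOR.MINOR.PATCH strings (an optional leading "v" is accepted, pre-release suffixes are not). Treating minor lines newer than 1.12 as patched and lines older than 1.10 as unpatched is also an assumption of this sketch.

```python
# Patched release per minor line, as listed in the post (verify against the advisory).
PATCHED = {(1, 12): (1, 12, 1), (1, 11): (1, 11, 5), (1, 10): (1, 10, 7)}


def parse_version(version: str) -> tuple:
    """Parse 'v1.12.1' or '1.12.1' into (1, 12, 1)."""
    parts = version.lstrip("v").split(".")
    return tuple(int(p) for p in parts[:3])


def is_patched(version: str) -> bool:
    """Return True if the version is at or above the fix for its minor line."""
    v = parse_version(version)
    line = v[:2]
    if line in PATCHED:
        return v >= PATCHED[line]
    # Assumption: newer minor lines contain the fix, older ones do not.
    return line > max(PATCHED)
```

A version check only covers the first mitigation; the network exposure of the admission webhook still has to be reviewed separately.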

Repost from Golang notes
A PostgreSQL database explorer TUI (Terminal User Interface) application written in Go. https://github.com/ddoemonn/go-dot-dot

The incredible HULL - Helm Uniform Layer Library - is a Helm library chart to improve Helm chart based workflows https://github.com/vidispine/hull

The article focuses on the importance of handling termination signals gracefully in applications deployed in orchestrated environments like Kubernetes. Graceful shutdowns are crucial to prevent data loss and system instability that can occur with abrupt terminations, ensuring that applications can exit cleanly and maintain consistency even when they are stopped or scaled down. https://packagemain.tech/p/graceful-shutdowns-k8s-go
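The linked article works in Go; the same pattern can be sketched in Python for illustration. Kubernetes sends SIGTERM to the container and follows up with SIGKILL once the termination grace period expires, so the process should stop taking new work on SIGTERM, finish what is in flight, and clean up. The class and method names below are hypothetical and are not taken from the article.

```python
import signal
import threading


class GracefulWorker:
    """Processes jobs until a termination signal asks it to stop."""

    def __init__(self):
        self.stop_requested = threading.Event()
        self.processed = []
        self.cleaned_up = False

    def handle_signal(self, signum, frame):
        # Only record the request; the main loop decides when to exit.
        self.stop_requested.set()

    def install_handlers(self):
        # Must be called from the main thread of the process.
        signal.signal(signal.SIGTERM, self.handle_signal)
        signal.signal(signal.SIGINT, self.handle_signal)

    def run(self, jobs):
        for job in jobs:
            if self.stop_requested.is_set():
                break  # stop accepting new work
            self.processed.append(job)  # placeholder for real work
        self.cleanup()

    def cleanup(self):
        # Placeholder: flush buffers, close connections, and so on.
        self.cleaned_up = True
```

Keeping the handler trivial and doing the real shutdown work in the main loop avoids running complex code inside a signal handler.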

Repost from Python notes
The recursive internet scanner for hackers. 🧡 https://github.com/blacklanternsecurity/bbot