DevOps&SRE Library

A library of articles on DevOps and SRE. Advertising: @ostinostin Content: @mxssl RKN: https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3

📈 Analytical overview of Telegram channel DevOps&SRE Library

The DevOps&SRE Library channel (@devopslibrary) is an active participant in the English language segment. The community currently has 19 407 subscribers and ranks 6 952 in the Technologies & Applications category and 34 858 in the Russia region.

📊 Audience metrics and dynamics

Since its creation (date unknown), the project has grown to an audience of 19 407 subscribers.

According to the latest data from 11 June, 2026, the channel demonstrates stable activity. Although there has been a change in the number of participants by 162 over the last 30 days and by 13 over the last 24 hours, overall reach remains high.

Verification status: Not verified
Engagement rate (ER): The average audience engagement rate is 15.12%. Within the first 24 hours after publication, content typically collects 7.09% reactions from the total number of subscribers.
Post reach: On average, each post receives 2 932 views. Within the first day, a publication typically gains 1 376 views.
Reactions and interaction: The audience actively supports content: the average number of reactions per post is 1.
Thematic interests: Content is focused on key topics such as Kubernetes, cluster, infrastructure, storage, configuration.

📝 Description and content policy

The author describes the resource as a platform for expressing subjective opinions:
“A library of articles on DevOps and SRE. Advertising: @ostinostin Content: @mxssl RKN: https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3”

Thanks to the high frequency of updates (latest data received on 12 June, 2026), the channel maintains relevance and a high level of publication reach. Analytics show that the audience actively interacts with content, making it an important point of influence in the Technologies & Applications category.

Subscribers: 19 407
+13 in 24 hours
+18 in 7 days
+162 in 30 days

Post views: 2 932
~1 376 in 24 hours
~1 731 in 48 hours

Engagement rate: 15.12%

Posts per day: ~2

Posts Archive

19 411

ayaFlow

A high-performance, eBPF-based network traffic analyzer written in Rust. Designed to run as a sidecarless DaemonSet in Kubernetes, providing kernel-native visibility into node-wide network traffic with minimal overhead.

https://github.com/DavidHavoc/ayaFlow

CloudnativePG: postgres database the modern way

https://medium.com/@pascal.toepke/cloudnativepg-postgres-database-the-modern-way-6a3bc9cf5eba

Audit logs in the cloud are a separate distributed system with its own requirements for reliability and storage cost, not just "a table of events". The MWS Cloud Platform team has published a detailed breakdown of its service's architecture: from the library that cloud services plug in, to storage on Apache Iceberg and the StarRocks engine, with an explanation of why they chose this particular set of technologies and where the non-obvious pitfalls are hidden. Useful for anyone who builds security tooling, works with large volumes of events, or is simply interested in cloud security tools. Read the article on Habr

Vibe Coding a Kubernetes Media Server: What I Learned About AI-First Engineering

https://medium.com/@fl.lettner/vibe-coding-a-kubernetes-media-server-what-i-learned-about-ai-first-engineering-88a38224ea5e

Repost from Performance matters!

On benchmarking, part 1

Benchmarking takes up a significant part of my day-to-day work, so it is important to understand its fundamentals, types, and limitations. In essence, it is a way to evaluate system performance under load. Brendan Gregg introduced clear terminology for it:
* Passive Benchmarking
* Active Benchmarking

---

Passive Benchmarking is a simple and common approach: you set up the environment, run the load, get numbers out, and rely on them from then on. But this is exactly the case where "simple" does not mean "better". The problems with such measurements:
* the risk of measuring something other than what was originally intended
* it is unclear what exactly limits performance
* systematic deviation cannot be told apart from noise (more on this later)
* there is no answer as to why these particular results were obtained
* the benchmarks themselves may contain bugs that stay hidden from us

"You benchmark A, actually measure B, and draw conclusions about C." ©

As a result, decisions based on such data can turn out even worse than if there were no data at all.

---

Active Benchmarking (AB) is the opposite approach. Besides setting up the environment, it requires actively observing the system during the run. The goal is to understand:
* whether we are measuring what we intended
* what limits performance
* whether the observed behavior agrees with our model of the system
* what needs to change to improve the result

AB can give a more reliable result. The price is higher demands on how the experiment is run: you need not only to set up the environment, but also to interpret what you observe correctly.

Algorithm:
1. Collect data on how the system behaves, with tooling to help (perf, bcc, iostat, bpftrace, tcpdump, ...)
2. Interpret how the system responds (USE, RED, off-CPU, TSA methodologies, ...)
3. Apply in a loop:

run
 └─ observe
    └─ form a hypothesis about the bottleneck
       └─ verify it in the next run
          └─ repeat

Each step on its own has a limited effect, but together they let you move forward faster and with more confidence.

---

Conclusion

Raw numbers from Passive Benchmarking can look plausible and still lead to wrong and expensive decisions. By trusting them blindly, we are effectively hoping that we got the setup right on the first try and accounted for every nuance. That does not look like a reliable strategy.

Active Benchmarking, by contrast, helps avoid the "benchmark A, measure B, draw conclusions about C" trap. Numbers obtained this way can be explained, challenged, and reproduced, and they can be relied on when making engineering decisions.

---

Further reading
- Active Benchmarking
- CPU Benchmarks and Bad Tinder Dates
- Performance Methodologies
- Producing Wrong Data Without Doing Anything Obviously Wrong (the note on reading white papers may help here)

To be continued...

---

Support with a like on LinkedIn.
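The "observe while the benchmark runs" step from the post above can be illustrated with a minimal sketch. It is not the author's tooling (the post names perf, bcc, iostat, bpftrace, and tcpdump for that); it only compares wall-clock time with CPU time for a single run, using the Python standard library, to suggest whether a workload is mostly on-CPU or mostly waiting. The names `RunObservation`, `observe_run`, `hypothesis`, and the 0.5 threshold are assumptions made for this illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class RunObservation:
    wall_s: float  # elapsed wall-clock time for the run
    cpu_s: float   # CPU time consumed by this process during the run

    @property
    def cpu_ratio(self) -> float:
        # Share of wall time spent on-CPU; low values mean the run mostly waited.
        return self.cpu_s / self.wall_s if self.wall_s > 0 else 0.0


def observe_run(workload: Callable[[], None]) -> RunObservation:
    """Run the workload once and record wall time and process CPU time."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    workload()
    cpu_s = time.process_time() - cpu_start
    wall_s = time.perf_counter() - wall_start
    return RunObservation(wall_s=wall_s, cpu_s=cpu_s)


def hypothesis(obs: RunObservation, threshold: float = 0.5) -> str:
    """Form a first hypothesis about the bottleneck (threshold is an assumption)."""
    return "on-CPU" if obs.cpu_ratio >= threshold else "off-CPU"


def busy_loop() -> None:
    # Hypothetical CPU-bound workload.
    total = 0
    for i in range(2_000_000):
        total += i * i


def mostly_waiting() -> None:
    # Hypothetical workload that spends its time blocked rather than computing.
    time.sleep(0.2)


if __name__ == "__main__":
    for name, workload in (("busy_loop", busy_loop), ("mostly_waiting", mostly_waiting)):
        obs = observe_run(workload)
        print(f"{name}: wall={obs.wall_s:.3f}s cpu={obs.cpu_s:.3f}s "
              f"ratio={obs.cpu_ratio:.2f} -> {hypothesis(obs)}")
```

A low ratio on a benchmark that was meant to stress the CPU is a hint that the run is measuring something else (sleeping, I/O, lock waits), which is the kind of hypothesis to check in the next run with proper system-level tools.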

Keeping Your Security Model Intact When Running VMs in Kubernetes

https://medium.com/@dillon_b/migrating-vms-on-kubernetes-without-losing-nsx-level-security-1c4027396568

A stream on container security that you can't miss

The duckling is there to grab your attention. On 28 May at 11:00, the stream will cover threats to container environments and where to catch them on the way from code to cluster. We will show how the new features in Kaspersky Container Security change the game in container protection: from deep image analysis using AI to custom policies in a couple of clicks. A special guest from the «Штурвал» platform will provide a comprehensive view of the topic. Register so you don't miss it.

We brought Skew Protection to your Kubernetes

We're excited to share a new experimental feature for Platformatic: Skew Protection in the Intelligent Command Center (ICC). This brings Vercel-style deployment safety to Kubernetes, letting you deploy without downtime and avoid version-mismatch problems.

https://blog.platformatic.dev/skew-protection-for-kubernetes

DocumentDB on Kubernetes: Resilient, Highly Available Databases with Automatic Failover

https://itnext.io/documentdb-on-kubernetes-resilient-highly-available-databases-with-automatic-failover-74c1a3ec882e

Autoscaling Hid Our LLM Cost Regression (85% → 4% Cache Hit Rate)

Why we moved capacity engineering into CI and started gating on prefix-cache efficiency

https://medium.com/@nroan/autoscaling-hid-our-llm-cost-regression-85-4-cache-hit-rate-b4beab5df240

Why post-mortem action items die

https://incident.io/blog/why-post-mortem-action-items-die

The Incident Hero Trap

https://uptimelabs.io/articles/the-incident-hero-trap

What does using AI for post-mortems actually mean?

https://incident.io/blog/what-does-using-ai-for-post-mortems-actually-mean

Why LLMs Write Incorrect SQL (and What That Means for Your Database)

Most LLM-generated SQL doesn't fail. It runs and returns results, and that's exactly what makes it dangerous. The errors don't surface until they're already in your data.

https://readyset.io/blog/why-llms-write-incorrect-sql-and-what-that-means-for-your-database

The Human Infrastructure: How Netflix Built the Operations Layer Behind Live at Scale

In the three years since our first Live show, Chris Rock: Selective Outrage, we have witnessed an incredible expansion of our live content slate and the live operations that support it. From modest beginnings of streaming just one show per month, we are now capable of streaming over nine shows in a single day, reaching tens of millions of concurrent members. This post pulls back the curtain on the Live Operations teams that enable this rapid scale.

https://netflixtechblog.com/the-human-infrastructure-how-netflix-built-the-operations-layer-behind-live-at-scale-33e2a311c597

codeburn

CodeBurn tracks token usage, cost, and performance across 19 AI coding tools. It breaks down spending by task type, model, tool, project, and provider so you can see exactly where your budget goes.

https://github.com/getagentseal/codeburn

openhare

openhare is an AI-powered, cross-platform desktop SQL client with multi-database support, built for everyday development, data analysis, and DBA management workflows.

https://github.com/sjjian/openhare

eraser

Eraser helps Kubernetes admins remove a list of non-running images from all Kubernetes nodes in a cluster.

https://github.com/eraser-dev/eraser

Finding zombies in our systems: A real-world story of CPU bottlenecks

https://medium.com/pinterest-engineering/finding-zombies-in-our-systems-a-real-world-story-of-cpu-bottlenecks-ea4722e552eb

Get started with Kubernetes painlessly using Managed Kubernetes from MWS Cloud Platform.

On 27 May at 16:00, Aleksandr Kurasov, technical product owner at MWS Cloud Platform, will show how to deploy a cluster in minutes at the webinar "Quick Start with Managed Kubernetes in the MWS Cloud". We will go over the service's architecture and its integration with IAM, networks, and load balancers. You will see how the managed service takes over administration of the master nodes and makes life easier.

Of interest to:
♦ DevOps engineers who want to simplify their work with Kubernetes
♦ Backend developers who need to deploy a service quickly
♦ Platform engineers building cloud-native infrastructure
♦ Tech leads and architects choosing Kubernetes in the cloud

➡ Register