DevOps&SRE Library

A library of articles on DevOps and SRE. Advertising: @ostinostin Content: @mxssl RKN: https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3

📈 Analytical overview of Telegram channel DevOps&SRE Library

The channel DevOps&SRE Library (@devopslibrary) is listed in the English-language segment. It currently has 19 409 subscribers, ranking 6 929 in the Technologies & Applications category and 34 717 in the Russia region.

📊 Audience metrics and dynamics

The channel's creation date is unknown. To date, it has gathered an audience of 19 409 subscribers.

According to the latest data from 20 June, 2026, the channel shows stable activity: the subscriber count changed by +109 over the last 30 days and by -1 over the last 24 hours.

Verification status: Not verified
Engagement rate (ER): The average audience engagement rate is 14.80%. Within the first 24 hours after publication, a post typically reaches 7.24% of the total number of subscribers.
Post reach: On average, each post receives 2 873 views. Within the first day, a publication typically gains 1 405 views.
Reactions and interaction: The average number of reactions per post is 1.
Thematic interests: Content is focused on key topics such as Kubernetes, clusters, infrastructure, storage, and configuration.
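
The percentages in this list appear to be view counts divided by the subscriber count. The page does not state its formula, so this is an assumption, but it reproduces both figures. A minimal sketch follows; the helper name `percent_of_subscribers` is hypothetical and not part of any analytics API:

```python
def percent_of_subscribers(views: int, subscribers: int) -> float:
    """Return views as a percentage of subscribers, rounded to 2 decimals."""
    if subscribers <= 0:
        raise ValueError("subscribers must be positive")
    return round(100 * views / subscribers, 2)


# Figures taken from the overview above.
subscribers = 19_409
avg_views_per_post = 2_873
avg_views_first_24h = 1_405

# Assumed definition: ER = average views per post / subscribers.
er = percent_of_subscribers(avg_views_per_post, subscribers)
# Assumed definition: 24-hour share = first-day views / subscribers.
reach_24h = percent_of_subscribers(avg_views_first_24h, subscribers)

print(f"ER: {er:.2f}%")
print(f"24-hour share: {reach_24h:.2f}%")
```

Under this assumption, the 7.24% figure is a share of views rather than of reactions, which fits the average of 1 reaction per post.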

📝 Description and content policy

The author describes the channel as follows:
“A library of articles on DevOps and SRE. Advertising: @ostinostin Content: @mxssl RKN: https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3”

Thanks to the high frequency of updates (latest data received on 21 June, 2026), the channel maintains relevance and a high level of publication reach. Analytics show that the audience actively interacts with content, making it an important point of influence in the Technologies & Applications category.

Subscribers: 19 409 (-1 in 24 hours, -4 in 7 days, +109 in 30 days)
Post views: 2 873 (~1 405 in 24 hours, ~1 751 in 48 hours)
Engagement rate: 14.80%
Posts per day: ~2

Posts Archive

HashiTalks 2024: Mastering Terraform Testing, a layered approach to testing complex infrastructure

https://mattias.engineer/posts/hashitalks-2024

Executing Cron Scripts Reliably At Scale

Cron scripts are responsible for critical Slack functionality. They ensure reminders execute on time, email notifications are sent, and databases are cleaned up, among other things. Over the years, both the number of cron scripts and the amount of data these scripts process have increased. While generally these cron scripts executed as expected, over time the reliability of their execution has occasionally faltered, and maintaining and scaling their execution environment became increasingly burdensome. These issues led us to design and build a better way to execute cron scripts reliably at scale.

https://slack.engineering/executing-cron-scripts-reliably-at-scale

📬 Building a mail server from scratch

Mikhail, a DevOps engineer at Nixys, is writing a series of articles on building a full-fledged mail server from scratch. In the first part, he covers the main components of a mail server (Exim4, Dovecot, PostfixAdmin, and RainLoop) and shows how to configure Exim and ensure reliable mail delivery. The article will be useful to all beginner administrators. You can read the article here.

A Distributed Systems Reading List

This document contains various resources and quick definition of a lot of background information behind distributed systems. It is not complete, even though it is kinda sorta detailed. I had written it some time in 2019 when coworkers at the time had asked for a list of references, and I put together what I thought was a decent overview of the basics of distributed systems literature and concepts. Since I was asked for resources again recently, I decided to pop this text into my blog. I have verified the links again and replaced those that broke with archive links or other ones, but have not sought alternative sources when the old links worked, nor taken the time to add any extra content for new material that may have been published since then. It is meant to be used as a quick reference to understand various distsys discussions, and to discover the overall space and possibilities that are around this environment.

https://ferd.ca/a-distributed-systems-reading-list.html

🖥 Not using Docker in your work? That's a big mistake!

🫵 Join us on March 18 at 20:00 MSK for the free webinar "Docker Internals" from Otus. The webinar is part of the full online course "DevOps Practices and Tools".

At the webinar, you will gain a deep understanding of working with Docker, namely:
✅ study namespaces, cgroups, data storage, container networking, and image builds;
✅ look at practical cases;
✅ learn about the pitfalls;
✅ work through examples of problem solving.

➡ Webinar registration: https://vk.cc/cvpX0b

During the webinar, the instructor will explain each of these cases in detail, describe the pitfalls, and show examples of problem solving. All participants will be able to ask questions, discuss the problems they encounter, and get recommendations on using Docker in their projects.

🤝 Sign up now, and we will remind you later. Participation is free.

Advertisement. ООО «Отус онлайн-образование» (Otus Online Education LLC), OGRN 1177746618576, www.otus.ru, erid: 2VtzqufXX4p

connect() - why are you so slow?

https://blog.cloudflare.com/linux-transport-protocol-port-selection-performance

SRE Archetypes

Different hats that SREs wear in the industry: Admin, Architect, Toolsmith, and Firefighter

https://blog.alexewerlof.com/p/sre-archetypes

The importance of SEV-1 call leaders

Incidents come in different shapes and sizes. The most severe incidents require special handling that is unlike their less-critical variants. These SEV-1 (aka CRITICAL) incidents can have material financial impact for a company and create a challenging environment for any incident commander, creating a need for specially designated SEV-1 call leaders.

https://argoday.medium.com/sev-1-call-leaders-8fdc0ae5f6be

The Single Pain of Glass

How do we create better dashboards?

https://medium.com/site-reliability-engineering-leadership/the-single-pain-of-glass-6e42930e966

VK Kubernetes Conf: the first Kubernetes conference of 2024

⏰ 28.03.2024, 14:00 MSK

On March 28, VK Cloud will hold a new conference, VK Kubernetes Conf, so that teams that work with the orchestrator and have accumulated extensive experience can share it with the community.

The main topic of discussion within the community has been, and remains, the complexity of the orchestrator: the number of tools is growing, architectural patterns are becoming more complex, and requirements for information security and administration are rising, so Kubernetes security is becoming priority No. 1.

The first K8s event of the year will be devoted to fault tolerance and disaster recovery, cluster vulnerabilities, and how to deal with them. Speakers and participants will discuss Kubernetes use cases that have been gaining momentum in recent years, including data workloads and ML.

The conference speakers will be experts from Tinkoff, VK, Wildberries, VK Cloud, Gazprombank, beeline, and other companies. The talks will help solve typical problems faced by users of the orchestrator and will also broaden attendees' horizons, so that in unforeseen situations every specialist understands the range of approaches to solving an emergency or an everyday task on their projects.

VK Kubernetes Conf is an occasion to meet the community and exchange experience, get advice from specialists, and simply have a good time with people who speak the same language as you.

The conference will be of interest to developers, site reliability engineers (SRE), DevOps and DevSecOps engineers, architects, testers, and everyone who works with Kubernetes.

Registration

Staying in the Zone: How DoorDash used a service mesh to manage data transfer, reducing hops and cloud spend

There have been many benefits gained through DoorDash’s evolution from a monolithic application architecture to one that is based on cells and microservices. The new architecture has reduced the time required for development, test, and deployment and at the same time has improved scalability and resiliency for end-users including merchants, Dashers, and consumers. As the number of microservices and back-ends has grown, however, DoorDash has observed an uptick in cross-availability zone (AZ) data transfer costs. These data transfer costs — incurred on both send and receive — allow DoorDash to provide its end users a highly available service that can withstand degradations of one or more AZs. The cost increase prompted our engineering team to investigate alternative ways to provide the same level of service more efficiently. In this blog post, we describe the journey DoorDash took using a service mesh to realize data transfer cost savings without sacrificing service quality.

https://doordash.engineering/2024/01/16/staying-in-the-zone-how-doordash-used-a-service-mesh-to-manage-data-transfer-reducing-hops-and-cloud-spend

How to speed up SQL projections from 7 hours to 7 minutes

And also how to quickly develop MVPs and launch your pet projects on .Net. Specialists from Tinkoff will cover all of this at a meetup in Moscow, held together with Moscow .Net.

The meetup will take place on March 18 at the new headquarters at Belorusskaya. Cases will be discussed first, and afterwards you can stay to network.

📆 March 18 in Moscow. Register and bring your colleagues!

erid:2Vtzqw5L4Ub Advertisement. АО "Тинькофф Банк" (Tinkoff Bank JSC), INN 7710140679, Central Bank of Russia license No. 2673

testkube

Testkube natively integrates test orchestration and execution into Kubernetes and your CI/CD/GitOps pipeline. It decouples test artifacts and execution from CI/CD tooling; tests are meant to be part of your cluster's state and can be executed as needed:
- Kubectl plugin
- Externally triggered via API (CI, external tooling, etc.)
- Automatically on deployment of annotated/labeled services/pods/etc. (WIP)

Testkube advantages:
- Avoids vendor lock-in for test orchestration and execution in CI/CD pipelines
- Makes it easy to orchestrate and run any kind of tests (functional, load/performance, security, compliance, etc.) in your clusters, without having to wrap them in Docker images or provide network access
- Makes it possible to decouple test execution from build processes; engineers should be able to run specific tests whenever needed
- Centralizes all test results in a consistent format for "actionable QA analytics"
- Provides a modular architecture for adding new types of tests and executors

https://github.com/kubeshop/testkube

erid: 2VtzqxVzBdD

We invite you to tomorrow's webinar: Dynamic environments for stateless and stateful services

Tomorrow at 12:00, the teams from KTS and Yandex.Cloud will hold a webinar on dynamic environments.

Here are some of the topics that will be covered:
- What problems do dynamic environments solve?
- What problems do they create?
- Dynamic environments for the frontend and the backend: differences and subtleties.
- Differences between test environments for monolithic and microservice applications.

A more detailed program is available on the webinar page.

👉 The link to the stream is available in our bot.

See you tomorrow!

Advertisement. ООО "СТУДИЯ КТС" (KTS Studio LLC), INN 7733257480

Key metrics for monitoring etcd

https://www.datadoghq.com/blog/etcd-key-metrics

Tools for collecting etcd metrics and logs

https://www.datadoghq.com/blog/etcd-monitoring-tools

AWS Extended EKS Support: A Costly Band-Aid for Kubernetes Clusters

Amazon Web Services (AWS) recently announced extended support for Amazon Elastic Kubernetes Service (EKS) versions (starting April, 2024), allowing customers to use older versions of Kubernetes for an additional 12 months. While this may seem like a convenient option, it comes with a hefty price tag and several drawbacks that customers should carefully consider before opting for it.

https://medium.com/@talkimhi/aws-extended-eks-support-a-costly-band-aid-for-kubernetes-clusters-120b8d537abe

Ansible vs Terraform: Choose One or Use Both?

https://www.env0.com/blog/ansible-vs-terraform-when-to-choose-one-or-use-them-together

Starting SRE at startups and smaller organizations

Most of the original thinking behind SRE focuses on implementing it in large-scale systems. I believe that any organization that has software at the foundation of its core business should at the very least pay attention to SRE principles. You can always pare hyperscale ideas down to your level of need, which we will explore later in this article.

https://www.srepath.com/starting-sre-at-startups-and-smaller-organizations

10 Tips for Onboarding New SRE Hires

There’s more than one way to mess up your new SRE hire and get them stuck in a loop. Here are 6 ways new hires will know you’ve made this mistake:
1. unclear role requirements
2. going too advanced too soon
3. not having any tangible, measurable things to do in the first few months
4. not feeling connected with the rest of the SRE team
5. no clarity on how SRE fits into the wider organization
6. little to no collaboration with teams outside of SRE

This article will unpack these 6 sticking points and show how to solve them.

https://www.srepath.com/10-tips-for-onboarding-new-sre-hires

Rebuilding Netflix Video Processing Pipeline with Microservices

https://netflixtechblog.com/rebuilding-netflix-video-processing-pipeline-with-microservices-4e5e6310e359