Kube Builders

News and links on infrastructure and building Kubernetes clusters curated by the @Learnk8s team

1,578 subscribers
Post archive
Repost from N/a
SREs are drowning in logs. The real error is buried under thousands of noise messages — and AI tools are starting to fix that. Mario Fahlandt sees AI's biggest value in reducing white noise: filtering log floods so engineers can focus on the actual issue instead of sifting through patterns manually. But he draws a clear line — automated AI debugging of clusters is still a step too far. The human component matters. AI should handle the groundwork. Humans should make the calls. Watch the full interview: https://ku.bz/36wSy63cC This interview is a reaction to Isala Piyarisi's episode https://ku.bz/kJjXQlmTw

This blog post describes how the Render team:
- tracked down Kubernetes memory waste caused by many DaemonSet namespace watches,
- fixed config issues,
- and freed over 7 TiB of memory across clusters by reducing unnecessary listwatch overhead.
More: https://ku.bz/2vS0QsvjY

Kubectl OpenAI plugin is a kubectl plugin to generate and apply Kubernetes manifests using OpenAI GPT. More: https://ku.bz/fxBdsk7Kf

Repost from LearnKube news
Pumba lets you kill, pause, and stress containers while injecting network delays, packet loss, and corruption. You can deploy it as a DaemonSet for cluster-wide chaos engineering. More: https://ku.bz/K7_RB9tSq

Repost from N/a
Billy Thompson, DevOps Platform Engineering - Office of the CTO @ Akamai, discusses the strategic decision between building custom Kubernetes tools versus adopting existing CNCF projects. The discussion provides a practical framework for evaluating time investment, maintenance capacity, and the broader impact of tooling decisions in Kubernetes environments. Watch the full interview: https://ku.bz/Jk2xSwXHp This interview is a reaction to Alessandro Pomponio's episode https://ku.bz/5sK7BFZ-8

This article shows why Grafana becomes slow on Kubernetes when multiple replicas share SQLite over EFS, and explains why a single replica on block storage or a real external database is the correct fix. More: https://ku.bz/JGj7gl5wt

Repost from LearnKube news
This week on Learn Kubernetes Weekly 186:
🔥 1 Million Tokens Per Second: Qwen 3.5 27B on GKE with B200 GPUs
🤖 How I Built Kernel: An AI-Powered IT Helpdesk That Deflects 80% of Support Tickets
⚙️ Ansible AWX: Infrastructure Automation on Top of Kubernetes
🛡️ I Setup Kubermatic SecureGuard Before It Even Existed
🔐 SRE: Secrets Management in Kubernetes
Read it now: https://kube.today/issues/186
⭐️ This newsletter is brought to you by StormForge by CloudBolt. Stop setting Kubernetes requests. Let ML handle rightsizing: https://ku.bz/2wYKp0Q2Y

CloudNativePG is the Kubernetes operator that covers the entire lifecycle of a highly available PostgreSQL database cluster with a primary/standby architecture, using native streaming replication. More: https://ku.bz/n6gpgcYtf

Cyphernetes lets you query the Kubernetes API as if it were a graph database and discover relationships between resources. More: https://ku.bz/5vrBXrCHN

Repost from LearnKube news
📣 New on LearnKube: "The mechanics of Kubernetes RBAC and how it connects users to permissions."
Kubernetes RBAC can feel confusing because the object names sound broader than the scope they actually grant. A ClusterRole does not always mean cluster-wide access. If you bind a ClusterRole with a RoleBinding, the permissions apply only in the namespace where the RoleBinding lives.
The article walks through:
- why direct user-to-permission mappings do not scale
- how Roles and ClusterRoles group permissions into reusable sets
- how RoleBindings and ClusterRoleBindings connect identities to permissions
- how to test access with kubectl auth can-i
Read the full guide: https://learnkube.com/rbac-kubernetes
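The scoping rule in this post (a ClusterRole bound through a RoleBinding applies only in that binding's namespace) can be illustrated with a toy model. This is a minimal sketch, not the Kubernetes authorizer: the `Rule` and `Binding` classes, the `can_i` function, the `pod-reader` rule set, the user `jane`, and the namespaces `dev` and `prod` are all hypothetical names made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    verbs: frozenset
    resources: frozenset


@dataclass(frozen=True)
class Binding:
    subject: str
    rules: tuple               # rules of the referenced Role or ClusterRole
    namespace: Optional[str]   # RoleBinding namespace; None models a ClusterRoleBinding


def can_i(bindings, subject, verb, resource, namespace):
    """Return True if any binding grants the request (permissions are additive)."""
    for b in bindings:
        if b.subject != subject:
            continue
        # A RoleBinding applies only inside its own namespace,
        # even when the rules it references come from a ClusterRole.
        if b.namespace is not None and b.namespace != namespace:
            continue
        if any(verb in r.verbs and resource in r.resources for r in b.rules):
            return True
    return False


# Hypothetical rule set standing in for a ClusterRole named "pod-reader".
pod_reader = (Rule(frozenset({"get", "list"}), frozenset({"pods"})),)

via_role_binding = [Binding("jane", pod_reader, namespace="dev")]
via_cluster_role_binding = [Binding("jane", pod_reader, namespace=None)]

print(can_i(via_role_binding, "jane", "list", "pods", "dev"))
print(can_i(via_role_binding, "jane", "list", "pods", "prod"))
print(can_i(via_cluster_role_binding, "jane", "list", "pods", "prod"))
```

On a real cluster the equivalent check would be something like `kubectl auth can-i list pods --namespace prod --as jane`, assuming your own account is allowed to impersonate that user.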

Repost from N/a
Mike Stefaniak, Head of Product, Kubernetes and Registries at Amazon Web Services (AWS), discusses the challenges of operating across multiple Kubernetes clusters and environments without requiring custom scripting or multiple kubeconfig files. Mike outlines AWS's strategy to host the MCP server centrally, providing AWS with context for all clusters across accounts and regions. This architectural shift transforms troubleshooting from a single-cluster operation to fleet-wide visibility, eliminating the need for users to configure access to individual clusters manually. Watch the full interview: https://ku.bz/PzjrglcZJ

Repost from Kubesploit
This tutorial shows how to run Cloudflare Tunnels as a DaemonSet to expose services with zero open inbound ports, using liveness probes, Kubernetes Secrets, and GitOps with ArgoCD. More: https://ku.bz/RYlKnctWf

Repost from N/a
What is Udi Hofesh bringing to KCD New York? Kubernetes was never easy, and AI workloads just turned the difficulty up to eleven. Udi will break down why operations are getting harder, where the cost pressure is coming from, and how AI SRE is a practical answer—not a buzzword. We also have 10 free tickets available—email hello@kube.events to grab one. KCD website: https://ku.bz/JkjmffBzw

This tutorial shows how to migrate Amazon EKS VPC CNI from a self-managed DaemonSet to an AWS managed add-on by preserving custom env settings, moving permissions to IRSA, and avoiding downtime during adoption. More: https://ku.bz/HLl9fhxc7

Repost from N/a
What does Stacey Potter have in store for KCD New York? A practical conversation about why open-source security still creates too much cognitive load, why secure-by-design can't succeed without broad adoption, and how projects like SLSA and Sigstore help make security resources more useful and accessible—not just academically correct. If you're interested in open-source security, software supply chain security, cloud-native infrastructure, platform engineering, and community-driven security practices, this session is a strong reason to get your ticket. We also have 10 free tickets available—email hello@kube.events to grab one before they're gone. 🌎 https://ku.bz/JkjmffBzw

Repost from Kube Architect
Percona vs MongoDB Community vs KubeDB vs Atlas — which operator should you run for MongoDB on Kubernetes? Full breakdown + architecture + PITR guide → https://ku.bz/2n-smMsxC

KubeSolo is a single-node Kubernetes distribution optimized for edge, IoT and embedded devices. It eliminates clustering and etcd, uses SQLite via Kine, and runs in under 200MB RAM while remaining OCI-compliant and Helm-ready. More: https://ku.bz/SPpVGdZ5Y

Repost from N/a
GPU requests often run 2-3x higher than actual consumption in inference workloads. Why? Andrew Hillier explains the core problem: inference is transactional, not batch. GPUs sit idle between requests, but you still have to size for peak load. Unlike CPUs with Linux schedulers filling utilization gaps, GPUs run monolithic models — what you allocate is what you get. The fix? MIGs to partition GPUs, or time slicing for less critical workloads. Both help squeeze more value out of expensive hardware. Watch the full interview: https://ku.bz/wL-0d1X0y
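The gap between requested and consumed GPU capacity comes down to simple arithmetic: if you size for peak but traffic is bursty, the ratio of allocation to average use grows. A minimal sketch, assuming made-up demand samples (the `demand` list and the `peak_sizing` helper are hypothetical and not taken from the interview):

```python
def peak_sizing(demand):
    """Size for peak demand and compare that allocation with average use."""
    allocated = max(demand)                 # what you must request to survive the peak
    average = sum(demand) / len(demand)     # what the workload consumes on average
    return allocated, average, allocated / average


# Hypothetical GPU demand per interval, as a fraction of one GPU.
demand = [0.2, 0.3, 0.9, 0.2, 0.4, 0.4]

allocated, average, ratio = peak_sizing(demand)
print(f"allocated={allocated:.2f} average={average:.2f} ratio={ratio:.2f}x")
```

With these illustrative numbers the allocation ends up a little over twice the average consumption. Partitioning or time slicing, as mentioned in the post, aims to let other workloads use the idle share.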

This article explains how a team deployed Ansible AWX on K3s and extended it for OpenStack inventory, dynamic SSH users, execution nodes, custom execution environments, and air-gapped installs. More: https://ku.bz/6Ms2R5RTk

Repost from LearnKube news
This week on Learn Kubernetes Weekly 185:
🔥 A One-Line Kubernetes Fix That Saved 600 Hours a Year
🔐 Why Kubernetes Has No Login — And How We Solved It for AuditRadar
⚙️ Durable Workflows Beyond Vercel: Version-Safe Orchestration for Kubernetes
🧩 The Missing Layers in Your Kubernetes Operator
🚨 Why Your KServe InferenceService Won't Become Ready: Four Production Failures and Fixes
Read it now: https://kube.today/issues/185
⭐️ This issue is brought to you by Qodo, the AI code integrity platform helping teams review, test, and ship reliable infrastructure code faster: https://ku.bz/NvLHsnl-6