Devops and aws interview preparation Hyderabad


Anyone interested in learning Linux, AWS, and DevOps: I will teach you and support you until you get a job, with no need to go for a proxy after training. For training, call me at 9154078579. For serious learners, I will share my entire knowledge.

Question:-89 Memory usage is normal but containers are killed. What would you check first?
Standard monitoring tools (like docker stats or basic Prometheus metrics) often fail to capture sudden, millisecond-level memory spikes, non-RSS memory overhead, or host-level resource constraints.
1. When containers are killed despite normal-looking memory usage, the most likely culprit is the kernel's Out-Of-Memory (OOM) killer operating at the host level.
2. Inspect the stopped container for exit code 137 (128 + 9). This indicates that the container was forcefully terminated by an external SIGKILL, usually because it exceeded a set memory limit.
3. Check the kernel log with dmesg by running dmesg -T | grep -i -E 'oom|kill'. This shows exactly which process was terminated and why.
4. Check whether the orchestrator or runtime (like Docker or Kubernetes) has strict memory limits configured. A container's memory limit might be set too low, so the container is killed long before overall system memory registers high usage.
5. A failing liveness probe can cause the platform to restart a container even if the application process itself is not out of memory. (A failing readiness probe only takes the Pod out of Service endpoints; it does not restart the container.)
To learn AWS and DevOps from scratch, ping me at 9154078579.
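
The dmesg check in step 3 can be automated with a small parser. This is only a sketch: it assumes the "Killed process <pid> (<name>)" wording that recent Linux kernels print, and the function name and the sample log lines are made up for illustration.

```python
import re

# Assumed kernel message shapes (recent kernels):
#   "Out of memory: Killed process 99 (node) ..."                (host-wide OOM)
#   "Memory cgroup out of memory: Killed process 4321 (java) ..." (cgroup limit hit)
OOM_KILL = re.compile(
    r"(?P<scope>Memory cgroup out of memory|Out of memory): "
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)"
)

def find_oom_kills(log_text):
    """Return one dict per OOM kill found in dmesg-style text."""
    kills = []
    for line in log_text.splitlines():
        match = OOM_KILL.search(line)
        if match is None:
            continue
        hit_cgroup_limit = match.group("scope").startswith("Memory cgroup")
        kills.append({
            "pid": int(match.group("pid")),
            "name": match.group("name"),
            "scope": "cgroup limit" if hit_cgroup_limit else "host",
        })
    return kills

# Illustrative input, not real kernel output.
SAMPLE_LOG = (
    "[Mon Jan  1 10:00:00 2024] Memory cgroup out of memory: "
    "Killed process 4321 (java) total-vm:1024kB\n"
    "[Mon Jan  1 10:05:00 2024] Out of memory: Killed process 99 (node) total-vm:2048kB\n"
    "[Mon Jan  1 10:06:00 2024] eth0: link up\n"
)
```

A "cgroup limit" result points at the container's own memory limit (step 4), while "host" points at memory pressure on the node as a whole.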

Question:-88 Explain the complete request flow from Route 53 → ALB → EKS/EC2
1. Route 53, using a hosted zone, looks up the DNS records for our domain. The domain is typically configured with an Alias record that points directly to the ALB's DNS name.
2. Route 53 returns the IP addresses of the ALB nodes to the user's browser.
3. The user's browser initiates a connection to one of the ALB's IP addresses. The ALB (which is deployed in public subnets) acts as the primary, secure entry point for external traffic.
4. The ALB accepts the request on one of its configured listeners (e.g., port 443 for HTTPS), provided its security group rules allow the traffic, and terminates TLS using the listener's certificate.
5. The ALB evaluates host- and path-based routing rules (e.g., /api/* or service1.example.com) to determine which target group the request should go to.
6. For EKS: the ALB forwards the traffic to the NodePorts of our EKS worker nodes (or directly to Pod IPs if we use the AWS VPC CNI with IP targets). Inside the cluster, an Ingress controller (like NGINX) or a Kubernetes Service (ClusterIP) receives the traffic and load-balances it to a specific, healthy application Pod.
7. For EC2: the ALB forwards traffic to its registered target group, which consists of backend EC2 instances running in private subnets.
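
The rule evaluation in step 5 can be pictured with a small model. This is a simplified sketch and not how the ALB is implemented: the rule list, the target group names and the function name are hypothetical, and Python's fnmatch is only an approximation of ALB wildcard matching. What it does reflect is that rules are tried in priority order (lowest number first) and that a default action applies when nothing matches.

```python
from fnmatch import fnmatchcase

# Hypothetical listener rules: (priority, host pattern, path pattern, target group).
RULES = [
    (20, "*", "/api/*", "tg-api"),
    (10, "service1.example.com", "*", "tg-service1"),
]
DEFAULT_TARGET_GROUP = "tg-default"  # stands in for the listener's default action

def pick_target_group(host, path, rules=RULES):
    """Return the target group of the first matching rule, lowest priority number first."""
    for _priority, host_pattern, path_pattern, target_group in sorted(rules):
        # Host names are case-insensitive, paths are matched case-sensitively.
        if fnmatchcase(host.lower(), host_pattern) and fnmatchcase(path, path_pattern):
            return target_group
    return DEFAULT_TARGET_GROUP
```

For example, a request for service1.example.com/api/users matches both rules, but the rule with priority 10 wins, so it goes to tg-service1.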

Question:-87 Explain the Kubernetes networking architecture
Kubernetes networking architecture provides a flat, routable network where all Pods can communicate with each other and with Services, regardless of their host node.
1. Every Pod is assigned its own IP address.
2. Each node runs a root network namespace that bridges the Pod interfaces. This allows all Pods to communicate with each other using their IP addresses.
3. Pod-to-Pod communication does not depend on Network Address Translation (NAT), which reduces complexity and improves portability.
4. Each Pod gets its own network namespace and interface. All communication with a Pod goes through its assigned interface.
5. The cluster-level network layer connects the node-level namespaces, allowing traffic to be routed correctly across nodes.
The three main types of network components in Kubernetes are Pod networking, Service networking, and Ingress/Egress networking.
1. Pod networking: handles direct communication between Pods. Each Pod gets a unique IP address, and Pods communicate without NAT, using CNI plugins for routing and IP management.
2. Service networking: provides stable virtual IPs (ClusterIP) for accessing a group of Pods. kube-proxy routes traffic to healthy Pods using iptables or IPVS; some CNI plugins (such as Cilium) replace it with eBPF.
3. Ingress/Egress networking: manages external access to the cluster. Ingress controllers handle HTTP/S routing to internal Services, while Egress defines how Pods reach resources outside the cluster, often using NAT or egress gateways.

Question:-84 How do you safely rotate IAM credentials in production without breaking active sessions?
To safely rotate IAM access keys in production without downtime, we generate a second key, distribute and validate it, deactivate the old key, and finally delete it.
1. Log into the AWS IAM Console or use the AWS CLI to generate a new key while the old one remains active. AWS permits up to two access keys per IAM user at any given time.
2. Distribute the new key to our applications, configuration files, or AWS Secrets Manager. Reload or restart services gracefully so they pick up the updated environment variables without dropping active API sessions.
3. Observe the applications to confirm they are running successfully with the new key.
4. Once all traffic uses the new key, change the old key's status to Inactive in the AWS console or using the CLI.
5. After a sufficient buffer period (typically 14 to 90 days), delete the inactive key to minimize our security footprint.
6. Store credentials in encrypted parameter stores or use AWS Secrets Manager so applications pull credentials dynamically without code changes.
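
The order of operations matters more than the tooling, so here is a minimal sketch that only builds the AWS CLI commands for the rotation in the order described above. It does not run anything. The function name, user name and key ID are hypothetical, and the get-access-key-last-used check is one possible way to do the observation in step 3, not the only one.

```python
import shlex

def rotation_commands(user, old_key_id):
    """Return (description, command) pairs for a key rotation, in safe order."""
    steps = [
        ("create a second key while the old one stays active",
         ["aws", "iam", "create-access-key", "--user-name", user]),
        ("after rollout, confirm the old key is no longer being used",
         ["aws", "iam", "get-access-key-last-used", "--access-key-id", old_key_id]),
        ("deactivate the old key (reversible)",
         ["aws", "iam", "update-access-key", "--user-name", user,
          "--access-key-id", old_key_id, "--status", "Inactive"]),
        ("delete the old key after the buffer period (irreversible)",
         ["aws", "iam", "delete-access-key", "--user-name", user,
          "--access-key-id", old_key_id]),
    ]
    return [(description, shlex.join(command)) for description, command in steps]
```

Deactivating before deleting is the safety net: if something still depends on the old key, setting its status back to Active restores access immediately.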

Question:-83 etcd is healthy but kubectl commands freeze intermittently. Why?
When etcd reports a healthy state but kubectl commands freeze intermittently, the issue typically lies in the communication path between kubectl and the kube-apiserver.
1. The kube-apiserver acts as a proxy for aggregated APIs and admission webhooks. When we run a command like kubectl get, the server may block the response while waiting for a down or slow external service to reply.
2. If the API server is hitting its resource limits (its own CPU or memory limits, or the capacity of its node), it can intermittently stall while processing its connection queue.
3. kubectl frequently queries the API server's discovery endpoints. If our local DNS server or corporate VPN intermittently delays resolving the API server's endpoint, kubectl will freeze for several seconds before executing.
4. Identify whether any registered API extensions are currently in a failed state: kubectl get apiservice | grep -v True
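
The grep in step 4 works on the table output. The same check can be done on structured output, for example the result of kubectl get apiservice -o json parsed into a dict. A minimal sketch, assuming the standard APIService status condition of type "Available"; the function name and the sample data are made up:

```python
def unavailable_apiservices(apiservice_list):
    """Return names of APIServices that do not report Available=True."""
    broken = []
    for item in apiservice_list.get("items", []):
        conditions = item.get("status", {}).get("conditions", [])
        available = any(
            c.get("type") == "Available" and c.get("status") == "True"
            for c in conditions
        )
        if not available:
            broken.append(item["metadata"]["name"])
    return broken

# Illustrative data shaped like `kubectl get apiservice -o json` output.
SAMPLE = {
    "items": [
        {"metadata": {"name": "v1.apps"},
         "status": {"conditions": [{"type": "Available", "status": "True"}]}},
        {"metadata": {"name": "v1beta1.metrics.k8s.io"},
         "status": {"conditions": [{"type": "Available", "status": "False",
                                    "reason": "MissingEndpoints"}]}},
    ]
}
```

Any name returned here is an aggregated API that the kube-apiserver cannot reach, which is a common reason for discovery calls to hang.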

Question:-85 Do you maintain a single CI/CD pipeline for all environments (Development, SIT, UAT, and Production), or do you use separate pipelines? How do you manage environment-specific configurations and deployments?
Maintaining separate, disconnected pipelines for each environment introduces a high risk of configuration drift, environment-specific script bugs, and deployment inconsistencies, so we maintain a single CI/CD pipeline for all environments (Development, SIT, UAT, and Production).
1. A single main pipeline file uses distinct, sequential stages or jobs to isolate environments. This enables the foundational DevOps rule: build once, deploy many times.
2. The pipeline compiles code, runs unit tests, and creates a single immutable build artifact (e.g., a Docker image, a ZIP archive, or a compiled binary) just once.
3. The pipeline promotes the exact same artifact sequentially through the environments. We do not rebuild code from source for Production, which ensures that what we tested in UAT is identical to what goes live.
4. Moving from Dev to SIT might be fully automatic after successful automated testing, while advancing to UAT and Production requires manual approval gates from QA leads or product owners.
5. Platforms like GitHub Actions and GitLab CI/CD allow us to define "Environments". Variables and secrets (like DB_HOST or API_SECRET) are mapped to specific environments and are only injected when that specific pipeline stage runs.

Hi, I am going to start a new AWS and DevOps session on 22-06-26 at 7 AM IST. The main intention of the training is to help you clear interviews on your own. If you are interested in joining the session, ping me at 9154078579.

Question:-82 How would you design and manage a highly available, secure, and scalable cloud platform supporting thousands of applications and millions of users globally?
Designing and managing a global, enterprise-grade cloud platform requires a multi-layered architectural approach that relies on automation, strict isolation boundaries, and decoupled systems.
1. Distribute applications across at least three geographically distinct cloud regions.
2. Use latency-based routing (LBR) and failover routing via global server load balancers (GSLB).
3. Use synchronous replication within regions and asynchronous replication across regions for database clusters.
4. Isolate workloads using Virtual Private Clouds (VPCs), strict security groups, and service meshes (e.g., Istio).
5. Encrypt all data at rest (AES-256) and in transit (TLS 1.3), using customer-managed keys.
6. Deploy Web Application Firewalls (WAF) and continuous DDoS mitigation layers at the cloud perimeter.
7. Use Kubernetes (EKS/GKE) with Horizontal Pod Autoscalers (HPA) and Cluster Autoscalers to handle traffic spikes.
8. Cache static assets and cacheable API responses at edge locations using a Content Delivery Network (CDN).
9. Decouple microservices using event-driven architectures with message brokers like Apache Kafka.
10. Manage application state through Git repositories using automated reconciliation tools like ArgoCD.

Hi, I am going to start a new AWS and DevOps session from scratch at 7 AM IST in one week. If you are interested in joining, ping me at 9154078579. The main intention of the training is to help you clear interviews on your own.

Question:-81 How do you fix a corrupted Terraform state?
1. Never run terraform apply or terraform destroy while the state is malformed or corrupted.
2. Verify whether the corruption is structural (invalid JSON) or semantic (bad values) using local command-line tools: cat terraform.tfstate | jq empty
3. If we store our state remotely with versioning enabled (e.g., AWS S3, Azure Blob, Google Cloud Storage), rolling back is the safest approach.
4. Overwrite the broken state with the last healthy version, identified by its version ID.
5. If we run Terraform locally or backend versioning is disabled, we can fall back on the local backup. Move the broken file aside with mv terraform.tfstate terraform.tfstate.corrupt, then restore from terraform.tfstate.backup, which Terraform writes next to a local state file.
6. If the file is totally wiped or unrecoverable and no backups exist, we must re-import our cloud infrastructure:
A. Initialize a blank state.
B. Use native declarative import blocks (Terraform 1.5 or later) inside our .tf files:
   import {
     to = aws_instance.web
     id = "i-0123456789abcdef0"
   }
C. Run terraform plan repeatedly to check our progress until the plan shows nothing to add, change, or destroy.
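
The structural versus semantic check in step 2 can be sketched as follows. The structural part is plain JSON parsing, which is the same test jq empty performs. The semantic part is only a heuristic: the list of expected top-level keys is an assumption based on the version 4 state format, and the function name is made up.

```python
import json

# Top-level keys assumed to be present in a version 4 state file (heuristic).
EXPECTED_KEYS = ("version", "serial", "lineage", "resources")

def classify_state(text):
    """Return 'structural', 'semantic', or 'ok' for the contents of a state file."""
    try:
        state = json.loads(text)
    except json.JSONDecodeError:
        return "structural"  # not valid JSON at all
    if not isinstance(state, dict):
        return "semantic"
    missing = [key for key in EXPECTED_KEYS if key not in state]
    if missing or not isinstance(state["resources"], list):
        return "semantic"  # valid JSON, but not shaped like a state file
    return "ok"
```

A result of "ok" only means the file looks like a state file. It does not prove that the recorded resources still match what exists in the cloud, which is what terraform plan is for.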

Question:-80 Your container cannot reach the internet. What could be wrong?
When a container cannot reach the internet, the issue usually stems from misconfigured DNS, firewall and routing issues on the host machine, or proxy settings.
1. Check /etc/resolv.conf inside the container. If it lists a loopback address (e.g., nameserver 127.0.0.x) that nothing inside the container answers on, the container cannot resolve names. The exception is 127.0.0.11, which is Docker's embedded DNS server on user-defined networks. We can resolve this by adding public DNS servers to our Docker configuration (for example with the --dns flag).
2. Check whether IP forwarding is enabled on the host by running cat /proc/sys/net/ipv4/ip_forward. If it returns 0, enable it by running sysctl net.ipv4.ip_forward=1.
3. System updates or firewall tools (like ufw on Linux) often reset or override Docker's Network Address Translation (NAT) rules, so check that Docker's iptables rules are still in place.
4. If our machine is behind a corporate proxy, the container will lack internet access unless the proxy is explicitly passed to it. We can pass the necessary HTTP_PROXY and HTTPS_PROXY environment variables when running the container.
5. Containers attached to the default bridge network can sometimes exhibit communication issues. We can create a custom network by running docker network create my-custom-net and attach our container to it using --network my-custom-net.
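
The resolv.conf check in step 1 is easy to script. A minimal sketch, assuming we have the file's text (for example from docker exec <container> cat /etc/resolv.conf); the function name is made up, and treating 127.0.0.11 as healthy reflects Docker's embedded DNS on user-defined networks.

```python
import ipaddress

DOCKER_EMBEDDED_DNS = "127.0.0.11"  # expected on user-defined Docker networks

def loopback_nameservers(resolv_conf_text):
    """Return nameserver entries that point at a loopback address."""
    suspicious = []
    for line in resolv_conf_text.splitlines():
        parts = line.split()
        if len(parts) < 2 or parts[0] != "nameserver":
            continue  # comments, search/options lines, blank lines
        try:
            address = ipaddress.ip_address(parts[1])
        except ValueError:
            continue  # not a plain IP address, skip it
        if address.is_loopback and parts[1] != DOCKER_EMBEDDED_DNS:
            suspicious.append(parts[1])
    return suspicious
```

An empty list means the DNS configuration is probably not the culprit, and the remaining checks (IP forwarding, NAT rules, proxy) are the next place to look.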

Question:-79 Docker image build takes too long. How do you optimize it?
1. Most build delays occur because the package manager reinstalls everything from scratch whenever any source file changes. Copying the dependency manifests separately solves this.
Not good:
COPY . .
RUN npm install
Good:
COPY package.json package-lock.json ./
RUN npm ci   # cached unless the manifests change
COPY . .
2. Without a .dockerignore file, Docker sends our entire local directory as build context, including heavy directories like .git, local node_modules, or massive database storage folders.
3. Compilers, test suites, and temporary SDK tooling drastically bloat images and add unnecessary overhead to the runtime image. We can separate the build environment from the runtime image using multi-stage builds.
4. Every RUN, COPY, or ADD instruction generates an immutable layer in the image. Chaining commands in a single RUN prevents intermediate bloat.
Not good:
RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*   # this clean-up does not shrink the image because the previous layers are already locked
Best practice:
RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*
5. Avoid large general-purpose base tags like ubuntu:latest or node:latest. Use stripped-down variants like -alpine.
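
Why the instruction order in point 1 matters can be shown with a toy model of layer caching. This is a deliberately simplified sketch and not Docker's actual cache implementation: each step gets a key derived from its parent's key, its instruction text and, for COPY, the content of the copied files. All names are made up, and only package.json is modelled as the manifest.

```python
import hashlib

def layer_keys(steps, files):
    """Return one cache key per build step for the given file contents."""
    keys, parent = [], ""
    for kind, arg in steps:
        digest = hashlib.sha256(parent.encode())
        digest.update(f"{kind} {arg}".encode())
        if kind == "COPY":
            # "." copies every file, otherwise only the named file.
            names = sorted(files) if arg == "." else [arg]
            for name in names:
                digest.update(name.encode())
                digest.update(files[name].encode())
        parent = digest.hexdigest()
        keys.append(parent)
    return keys

def reused_layers(steps, before, after):
    """Count the leading steps whose cache key is identical in both builds."""
    count = 0
    for old, new in zip(layer_keys(steps, before), layer_keys(steps, after)):
        if old != new:
            break
        count += 1
    return count

SLOW = [("COPY", "."), ("RUN", "npm install")]
FAST = [("COPY", "package.json"), ("RUN", "npm ci"), ("COPY", ".")]

# Two builds that differ only in one source file.
BEFORE = {"package.json": '{"dependencies": {}}', "app.js": "console.log(1)"}
AFTER = {"package.json": '{"dependencies": {}}', "app.js": "console.log(2)"}
```

With these inputs, the SLOW ordering reuses no layers after the source change, while the FAST ordering reuses the first two, so the dependency install step is served from cache.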

Question:-78 What's the difference between metrics, logs, and traces, and when do you start with each?
Metrics tell us what is wrong, traces show where it broke down, and logs explain why it happened.
Metrics: numeric measurements of system health, like CPU usage, request counts, or error rates, aggregated over time. Use case: setting up active alerts and monitoring long-term performance trends.
Traces: visual maps that follow a single user request as it travels through multiple microservices, networks, and databases. Use case: troubleshooting latency and mapping out complex dependencies.
Logs: discrete, time-stamped text records of specific events, such as a stack trace or an "Order Processed" notification. Use case: digging into the root cause of an error.

Question:-77 We are not able to create new files in a partition. We have enough disk space and permissions. What can be the possible reason?
If our partition has available storage space and correct permissions but we still cannot create new files, the issue is typically caused by a hidden limit on metadata (inodes), a read-only file system, or process-level locks.
1. Run df -i in the terminal to check inode usage. If it shows 100%, we must delete excess small files or clear cached data.
2. Run mount (or lsblk) to verify the partition status. If it is marked as ro (read-only), remount it in read-write mode (e.g., mount -o remount,rw <mount-point>).
3. Run sudo lsof +L1 to find unlinked files that are still open. We can restart the service holding the files open, or reboot the system, to reclaim the space.
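
The inode check in step 1 can also be done programmatically. A minimal sketch using os.statvfs, which is available on Unix-like systems; the function names are made up, and the percentage is a plain used/total ratio, so it may differ slightly from what df -i prints.

```python
import os

def inode_usage_percent(total, free):
    """Percentage of inodes in use, or None if the filesystem reports no inode count."""
    if total == 0:
        return None  # some filesystems allocate inodes dynamically and report 0
    return 100.0 * (total - free) / total

def check_inodes(path="/"):
    """Read inode totals for the filesystem that contains `path`."""
    stats = os.statvfs(path)
    return inode_usage_percent(stats.f_files, stats.f_ffree)
```

A value at or near 100 explains "No space left on device" errors on a partition that still shows free blocks in df -h.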

Question:-76 What will you verify if a Jenkins shared library causes failures?
1. Verify that Jenkins has the correct permissions and keys to clone the library repository.
2. Verify that the specified branch or version exists (e.g., @Library('my-library@main')).
3. Look for connection timeouts or host verification errors if the Jenkins controller cannot reach the Git server.
4. Check that custom step files are lower-case or camelCase and end strictly with a .groovy extension (e.g., vars/myStep.groovy).
5. If we see a MissingMethodException, confirm that the signature called in the Jenkinsfile exactly matches the method defined in the library.
6. If the pipeline throws a RejectedAccessException or a script security warning, check the Script Approval page ($JENKINS_URL/scriptApproval/).

Question:-75 What are common Grafana issues in production?
Common Grafana issues in production and DevOps environments typically revolve around data source connectivity, sluggish dashboard performance, alert storms, and configuration drift.
1. Panels pulling large, unfiltered datasets from Prometheus (PromQL) or SQL databases overload backend storage.
2. Dashboards with too many active panels or auto-refresh intervals set too low trigger "Query Timeout" errors or freeze the browser.
3. Slow or blocked network connectivity between Grafana's backend and data sources (like Prometheus or AWS CloudWatch).
4. Deleting and recreating Grafana environments without properly provisioning data sources via YAML ruins observability.
5. A single core infrastructure outage (e.g., a shared database) generating hundreds of dependent alerts, overwhelming on-call engineers.
6. Automated alerts and provisioned dashboards failing silently because the Service Account tokens expired or lack the correct permissions.
7. Grafana instances exceeding memory limits (often when generating large PDF/image reports) and being killed by Kubernetes.

Question:-74 You deployed a sidecar logging agent. Suddenly, CPU throttling spikes. Diagnose and roll back.
1. CPU throttling happens when a container tries to use more CPU than its configured CPU limit allows.
2. A sudden increase in application logs forces the sidecar to use more processing power to parse, compress, and ship data.
3. Complex log parsing rules (like heavy regular expressions) consume massive CPU cycles under load.
4. View per-container CPU usage with kubectl top pods -n <namespace> --containers. The throttling counters themselves (nr_throttled and throttled_time) are in the container's cgroup cpu.stat file.
5. Monitor the real-time resource usage of the sidecar container: kubectl top pod <pod-name> -n <namespace> --containers
6. If production stability is at risk, execute a rollback or temporary patch immediately: kubectl rollout undo deployment/<deployment-name> -n <namespace>
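
The throttling counters mentioned in step 4 can be turned into a single number. A minimal sketch that parses the text of a cgroup cpu.stat file and returns the share of scheduling periods in which the container was throttled; the function name and the sample values are made up.

```python
def throttle_ratio(cpu_stat_text):
    """Return nr_throttled / nr_periods from cpu.stat text (0.0 if no periods recorded)."""
    stats = {}
    for line in cpu_stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0
    return stats.get("nr_throttled", 0) / periods

# Illustrative cpu.stat contents, not taken from a real system.
SAMPLE_CPU_STAT = "nr_periods 1000\nnr_throttled 250\nthrottled_time 123456789\n"
```

Comparing this ratio for the sidecar before and after the deployment shows whether the new agent is the container hitting its limit, which is the evidence needed before choosing between a rollback and a higher CPU limit.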

Question:-73 Terraform module working in dev but failing in prod. Why?
Terraform modules fail in production after working in development because of differences in permissions, environment state, or provider versions.
1. Prod IAM roles might block specific actions like creating public IPs or deleting databases.
2. AWS Organizations often applies strict Service Control Policies (SCPs) to Prod accounts that do not exist in Dev.
3. Prod variables might request larger instance sizes or disk types that hit service quotas in that region.
4. Conditional logic (count or for_each) based on environment variables might trigger untested code paths in Prod.
5. If provider versions are not locked, Prod might download a newer, breaking version of a provider.
6. Prod networks often have tighter security groups or firewalls that prevent resources from communicating during provisioning.

Question:-71 You need to temporarily save changes without committing. How?
1. To temporarily save our work without committing, use the git stash command.
2. git stash takes our current changes (both staged and unstaged) and stores them on an internal stack, leaving a clean working directory.
3. To add a descriptive note, use git stash push -m "our message".
4. By default, stash only saves changes to tracked files. Use git stash -u to include untracked files.
5. Use git stash pop to apply the most recent changes and immediately delete them from the stash.
6. Use git stash apply to restore the changes but keep the entry in the stash for later use.
7. Use git stash list to see all our temporarily saved changes.

Question:-70 How do you implement cross-account access in Amazon Web Services using IAM roles?
Implementing cross-account access in AWS involves a "trusting" account (the resource owner) and a "trusted" account (where the users live).
1. In the IAM Console of the trusting account, choose Create role and select AWS account as the trusted entity type.
2. Enter the Account ID of the "trusted" account (where our users are).
3. Attach permission policies that define what the role can do (e.g., AmazonS3ReadOnlyAccess).
4. Name the role and take note of its ARN (e.g., arn:aws:iam::Trusting-Account-ID:role/CrossAccountRole).
5. In the account where our users live, select the specific user or group that needs access.
6. Attach an identity-based policy (inline or managed) that allows the sts:AssumeRole action:
   {
     "Version": "2012-10-17",
     "Statement": {
       "Effect": "Allow",
       "Action": "sts:AssumeRole",
       "Resource": "arn:aws:iam::Trusting-Account-ID:role/CrossAccountRole"
     }
   }
7. Log in to the trusted account, click our username in the top right, select Switch Role, and enter the trusting account ID and role name.
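
The two policy documents involved can be generated side by side to make the relationship explicit. A minimal sketch: the function names and the twelve-digit account IDs used in the usage note are placeholders. The trust policy shown is the kind of document the console creates in steps 1 and 2, using the trusted account's root principal so that the trusted account decides which of its users may assume the role.

```python
import json

def trust_policy(trusted_account_id):
    """Trust policy for the role in the trusting account: who may assume it."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
            "Action": "sts:AssumeRole",
        }],
    }

def assume_role_policy(trusting_account_id, role_name):
    """Identity policy for users in the trusted account: which role they may assume."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": f"arn:aws:iam::{trusting_account_id}:role/{role_name}",
        }],
    }

def to_json(policy):
    """Render a policy dict as JSON text ready to paste into the console or a file."""
    return json.dumps(policy, indent=2)
```

For example, to_json(trust_policy("111111111111")) goes on the role in the trusting account, and to_json(assume_role_policy("222222222222", "CrossAccountRole")) goes on the user or group in the trusted account. Both sides must allow the action for the role switch to succeed.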