Devops and aws interview preparation Hyderabad

Anyone interested in learning Linux, AWS, and DevOps: I will teach you and support you until you get a job, with no need to use a proxy after training. For training, call me at 9154078579. With serious learners I will share everything I know.

4 767 subscribers
Posts Archive
Question:-80 Your container cannot reach the internet. What could be wrong?
When a container cannot reach the internet, the cause is usually misconfigured DNS, firewall or routing problems on the host machine, or missing proxy settings.
1. Check /etc/resolv.conf inside the container. If it lists a loopback address (e.g., nameserver 127.0.0.x) that only makes sense on the host, the container cannot resolve names. The exception is 127.0.0.11, which is Docker's own embedded DNS on user-defined networks. We can fix a bad entry by adding public DNS servers to the Docker configuration, for example with the "dns" key in /etc/docker/daemon.json or the --dns flag of docker run.
2. Check whether IP forwarding is enabled on the host by running cat /proc/sys/net/ipv4/ip_forward. If it returns 0, enable it by running sysctl -w net.ipv4.ip_forward=1.
3. System updates or firewall tools (like ufw on Linux) often reset or override Docker's Network Address Translation (NAT) rules. Restarting the Docker daemon recreates its iptables rules.
4. If the machine is behind a corporate proxy, the container has no internet access unless the proxy is explicitly passed to it. Pass the HTTP_PROXY and HTTPS_PROXY environment variables when running the container.
5. Containers attached to the default bridge network do not get Docker's embedded DNS and can sometimes show communication issues. Create a custom network with docker network create my-custom-net and attach the container to it using --network my-custom-net.
To learn AWS & DevOps practically, ping me: 9154078579
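The first two checks above can be sketched in Python. This is only a minimal illustration: the helper names loopback_nameservers and ip_forwarding_enabled are made up for this example, and the sample text stands in for the real /etc/resolv.conf and /proc/sys/net/ipv4/ip_forward contents.

```python
import ipaddress


def loopback_nameservers(resolv_conf_text):
    """Return the nameserver entries that point at a loopback address."""
    flagged = []
    for line in resolv_conf_text.splitlines():
        parts = line.split()
        # Only "nameserver <ip>" lines matter; comments and options are skipped.
        if len(parts) >= 2 and parts[0] == "nameserver":
            try:
                address = ipaddress.ip_address(parts[1])
            except ValueError:
                continue  # not a plain IP address
            if address.is_loopback:
                flagged.append(parts[1])
    return flagged


def ip_forwarding_enabled(proc_value):
    """Interpret the content of /proc/sys/net/ipv4/ip_forward."""
    return proc_value.strip() == "1"


# Hypothetical sample content, not taken from a real host.
sample = "# generated\nnameserver 127.0.0.53\nnameserver 8.8.8.8\noptions ndots:0\n"
print(loopback_nameservers(sample))
print(ip_forwarding_enabled("0\n"))
```

Note that the sketch also flags 127.0.0.11, which is expected on a user-defined Docker network, so a flagged entry is a hint to look closer, not proof of a fault.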

Question:-79 Docker image build takes too long. How do you optimize it?
1. Most build delays occur because package managers reinstall everything from scratch whenever a minor source file changes. Copying the dependency manifests separately solves this.
Not good:
    COPY . .
    RUN npm install
Good:
    COPY package.json package-lock.json ./
    # This layer stays cached unless the manifests change
    RUN npm ci
    COPY . .
2. Without a .dockerignore file, Docker sends our entire local directory to the daemon as build context, including heavy directories like .git, local node_modules, or large database storage folders.
3. Compilers, test suites, and temporary SDK tooling bloat images and add unnecessary overhead at runtime. We can separate the build environment from the runtime environment using multi-stage builds.
4. Every RUN, COPY, or ADD instruction generates an immutable layer in the image. Chaining commands prevents intermediate bloat.
Not good (the clean-up has no effect on image size because the earlier layers are already locked):
    RUN apt-get update
    RUN apt-get install -y curl
    RUN rm -rf /var/lib/apt/lists/*
Best practice:
    RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*
5. Avoid large general-purpose base tags like ubuntu:latest or node:latest. Use stripped-down variants such as the -alpine tags.
To learn AWS & DevOps from scratch, ping me: 9154078579
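The ordering problem from point 1 can be checked mechanically. The following Python sketch assumes a Node.js project and a hypothetical helper named copies_everything_before_install; it only looks at plain COPY and RUN lines and is not a full Dockerfile parser.

```python
def copies_everything_before_install(dockerfile_text,
                                     install_markers=("npm ci", "npm install")):
    """Return True if a broad `COPY . <dest>` comes before the dependency install."""
    copy_all_seen = False
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        tokens = line.split()
        if not tokens or line.startswith("#"):
            continue  # skip blank lines and comments
        instruction = tokens[0].upper()
        if instruction == "COPY" and len(tokens) >= 3 and tokens[1] == ".":
            copy_all_seen = True
        if instruction == "RUN" and any(m in line for m in install_markers):
            # The first install step decides: was everything copied before it?
            return copy_all_seen
    return False


# Sample Dockerfiles for illustration only.
slow = "FROM node:20-alpine\nWORKDIR /app\nCOPY . .\nRUN npm install\n"
fast = ("FROM node:20-alpine\nWORKDIR /app\n"
        "COPY package.json package-lock.json ./\nRUN npm ci\nCOPY . .\n")
print(copies_everything_before_install(slow))
print(copies_everything_before_install(fast))
```

A check like this could run in CI to catch cache-busting Dockerfiles early. The install markers would need adjusting for pip, Maven, or other package managers.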

Question:-78 What's the difference between metrics, logs, and traces, and when do you start with each?
Metrics tell us what is wrong, traces show where it broke down, and logs explain why it happened.
Metrics: Numeric measurements of system health, like CPU usage, request counts, or error rates, aggregated over time. Use case: setting up alerts and monitoring long-term performance trends.
Traces: Visual maps that follow a single user request as it travels through multiple microservices, networks, and databases. Use case: troubleshooting latency and mapping out complex dependencies.
Logs: Discrete, time-stamped text records of specific events, such as a stack trace or an "Order Processed" notification. Use case: digging into the root cause of an error.
To learn AWS & DevOps from scratch, ping me: 9154078579

Question:-77 We are not able to create new files in a partition. We have enough disk space and permissions. What can be the possible reason?
If the partition has free space and correct permissions but we still cannot create new files, the issue is typically a hidden limit on metadata (inodes), a read-only file system, or files held open by a process.
1. Run df -i to check inode usage. If it shows 100%, we must delete excess small files or clear cached data.
2. Run mount (or lsblk) to verify the partition status. If it is marked as ro (read-only), remount it in read-write mode, for example with mount -o remount,rw on the mount point.
3. Run sudo lsof +L1 to find unlinked (deleted) files that are still open. Restart the service holding those files, or reboot the system, to reclaim the space.
For practical AWS & DevOps learning from scratch, ping me: 9154078579
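The inode and read-only checks can also be done from Python with os.statvfs, which is available on Unix-like systems. This is a rough sketch: the function names are chosen for this example, and the percentage only approximates what df -i prints.

```python
import os


def inode_usage_percent(total_inodes, free_inodes):
    """Return used inodes as a percentage, or None if the filesystem reports no total."""
    if total_inodes == 0:
        return None  # some filesystems do not report a fixed inode count
    used = total_inodes - free_inodes
    return 100.0 * used / total_inodes


def check_path(path):
    """Return (inode usage percent, is_read_only) for the filesystem holding path."""
    stats = os.statvfs(path)
    read_only = bool(stats.f_flag & os.ST_RDONLY)
    return inode_usage_percent(stats.f_files, stats.f_ffree), read_only


if __name__ == "__main__":
    # "/" is just an example path; point it at the affected partition.
    print(check_path("/"))
```

A result close to 100 percent with plenty of free blocks points at inode exhaustion, and a True in the second position points at a read-only mount.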

Question:-76 What will you verify if a Jenkins shared library causes failures?
1. Verify that Jenkins has the correct permissions and keys to clone the library repository.
2. Verify that the specified branch or version exists (e.g., @Library('my-library@main')).
3. Look for connection timeouts or host verification errors in case the Jenkins controller cannot reach the Git server.
4. Check that custom step files are lower-case or camelCase and end strictly with a .groovy extension (e.g., vars/myStep.groovy).
5. If we see MissingMethodException, confirm that the signature called in the Jenkinsfile exactly matches the method defined in the library.
6. If the pipeline throws a RejectedAccessException or a script security warning, check the Script Approval page ($JENKINS_URL/scriptApproval/).
To learn AWS & DevOps from scratch, ping me: 9154078579

Question:-75 What are common Grafana issues in production?
Common Grafana issues in production and DevOps environments typically revolve around data source connectivity, sluggish dashboard performance, alert storms, and configuration drift.
1. Panels that pull large, unfiltered datasets from Prometheus (PromQL) or SQL databases overload the backend storage.
2. Dashboards with too many active panels, or auto-refresh intervals set too low, trigger "Query Timeout" errors or freeze the browser.
3. Network connectivity between Grafana's backend and its data sources (like Prometheus or AWS CloudWatch) is slow or blocked.
4. Deleting and recreating Grafana environments without provisioning data sources via YAML loses dashboards and data source settings.
5. A single core infrastructure outage (e.g., a shared database) generates hundreds of dependent alerts, overwhelming on-call engineers.
6. Automated alerts and provisioned dashboards fail silently because Service Account tokens have expired or lack the correct permissions.
7. Grafana instances exceed their memory limits (often when generating large PDF or image reports) and are killed by Kubernetes.

Question:-74 You deployed a sidecar logging agent. Suddenly, CPU throttling spikes. Diagnose and roll back.
1. CPU throttling happens when a container exceeds its configured CPU limit.
2. A sudden increase in application logs forces the sidecar to use more processing power to parse, compress, and ship data.
3. Complex log parsing rules (like heavy regular expressions) consume a lot of CPU under load.
4. Look at the container's throttling counters, nr_throttled and throttled_time, which the kernel exposes in the container's cgroup cpu.stat file (throttled_usec on cgroup v2).
5. Monitor real-time resource usage of the sidecar container:
    kubectl top pod <pod> -n <namespace> --containers
6. If production stability is at risk, roll back or apply a temporary patch immediately:
    kubectl rollout undo deployment/<deployment> -n <namespace>
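The throttling counters from point 4 can be turned into a single ratio. The sketch below assumes the text of a cgroup cpu.stat file has already been read, for example from inside the sidecar container. The function name throttle_ratio and the sample numbers are made up for illustration.

```python
def throttle_ratio(cpu_stat_text):
    """Return the fraction of CFS periods in which the cgroup was throttled."""
    stats = {}
    for line in cpu_stat_text.splitlines():
        parts = line.split()
        # Each line looks like "<counter> <integer value>".
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0  # no CPU limit configured or no periods elapsed yet
    return stats.get("nr_throttled", 0) / periods


# Hypothetical counters, not taken from a real pod.
sample = "nr_periods 1000\nnr_throttled 250\nthrottled_time 123456789\n"
print(throttle_ratio(sample))
```

Comparing the ratio before and after the sidecar rollout shows whether the new container is the one hitting its CPU limit.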

Question:-73 A Terraform module works in dev but fails in prod. Why?
Terraform modules fail in production after working in development because of differences in permissions, environment state, or provider versions.
1. Prod IAM roles might block specific actions like creating public IPs or deleting databases.
2. AWS Organizations often applies strict Service Control Policies (SCPs) to Prod accounts that do not exist in Dev.
3. Prod variables might request larger instance sizes or disk types that hit service quotas in that region.
4. Conditional logic (count or for_each) based on environment variables might trigger untested code paths in Prod.
5. If provider versions are not locked, Prod might download a newer, breaking version of a provider.
6. Prod networks often have tighter security groups or firewalls that prevent resources from communicating during provisioning.

Question:-71 You need to temporarily save changes without committing. How?
1. To temporarily save our work without committing, use the git stash command.
2. git stash takes our current changes (both staged and unstaged) and stores them on an internal stack, giving us a clean working directory.
3. To add a descriptive note, use git stash push -m "our message".
4. By default, git stash saves only changes to tracked files. Use git stash -u to include untracked files.
5. Use git stash pop to apply the most recent changes and immediately delete them from the stash.
6. Use git stash apply to restore the changes but keep the entry in the stash for later use.
7. Use git stash list to see all our temporarily saved changes.
For practical AWS & DevOps learning, ping me: 9154078579
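The difference between pop and apply is easiest to see on a toy model of the stash stack. The class below is not how Git stores stashes internally. It is a simplified Python illustration with made-up names, and its listing format is shorter than real git stash list output.

```python
class StashStack:
    """Toy model of a stash stack: the newest entry is always stash@{0}."""

    def __init__(self):
        self._entries = []

    def push(self, changes, message="WIP"):
        # New entries go on top, like git stash push -m "message".
        self._entries.insert(0, (message, changes))

    def apply(self):
        # Restore the newest changes but keep the entry.
        return self._entries[0][1]

    def pop(self):
        # Restore the newest changes and drop the entry.
        return self._entries.pop(0)[1]

    def entries(self):
        return [f"stash@{{{i}}}: {msg}" for i, (msg, _) in enumerate(self._entries)]


stack = StashStack()
stack.push("edit to app.py", "first idea")
stack.push("edit to README", "second idea")
print(stack.entries())
print(stack.apply(), len(stack.entries()))
print(stack.pop(), len(stack.entries()))
```

After apply the stack still holds two entries. After pop only the older entry remains and it moves up to stash@{0}.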

Question:-70 How do you implement cross-account access in Amazon Web Services using IAM roles?
Implementing cross-account access in AWS involves a "trusting" account (the resource owner) and a "trusted" account (where the users live).
In the trusting account:
1. In the IAM console, choose Create role and select AWS account as the trusted entity type.
2. Enter the Account ID of the "trusted" account (where our users are).
3. Attach permission policies that define what the role can do (e.g., AmazonS3ReadOnlyAccess).
4. Name the role and take note of its ARN (e.g., arn:aws:iam::Trusting-Account-ID:role/CrossAccountRole).
In the trusted account:
5. Select the specific user or group that needs access.
6. Attach an identity-based policy (inline or managed) that allows the sts:AssumeRole action:
    {
      "Version": "2012-10-17",
      "Statement": {
        "Effect": "Allow",
        "Action": "sts:AssumeRole",
        "Resource": "arn:aws:iam::Trusting-Account-ID:role/CrossAccountRole"
      }
    }
7. Log in to the trusted account, click the username in the top right, select Switch Role, and enter the trusting account ID and role name.
For practical AWS & DevOps learning, ping me: 9154078579
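The two policy documents involved can be generated side by side, which makes it clear which account ID goes where. This Python sketch only builds the JSON and does not call AWS. The function names and the account IDs 111122223333 and 444455556666 are placeholders chosen for this example.

```python
import json


def trust_policy(trusted_account_id):
    """Trust policy attached to the role in the trusting (resource owner) account."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
            "Action": "sts:AssumeRole",
        }],
    }


def caller_policy(trusting_account_id, role_name):
    """Identity-based policy attached to the user or group in the trusted account."""
    return {
        "Version": "2012-10-17",
        "Statement": {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": f"arn:aws:iam::{trusting_account_id}:role/{role_name}",
        },
    }


# Placeholder IDs: users live in 111122223333, the role lives in 444455556666.
print(json.dumps(trust_policy("111122223333"), indent=2))
print(json.dumps(caller_policy("444455556666", "CrossAccountRole"), indent=2))
```

The console wizard in steps 1 and 2 creates the trust policy for us, and step 6 corresponds to the caller policy. Both sides must allow the action before Switch Role works.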

Question:-69 How do sidecars help in containerizing legacy applications?
A legacy application is an outdated software program. By running a supporting container (the sidecar) alongside the main application container (the legacy app) in the same Kubernetes pod, the two share resources like network and storage, so they can communicate via localhost and share files.
1. Legacy applications often only support HTTP. A sidecar proxy (like Nginx) can handle incoming HTTPS requests and pass them to the legacy app over localhost.
2. Sidecars can act as an authentication agent (e.g., OAuth2 Proxy), validating tokens and handling user identity before passing requests to the app.
3. Instead of changing the legacy code to log to a central system, a sidecar (e.g., Fluentd) can tail log files on a shared volume and ship them to modern logging backends (Elasticsearch, Splunk).
4. Sidecars can act as adapters, polling metrics from the legacy application or reading its logs and exporting them in formats compatible with Prometheus.
5. Instead of baking secrets into the image, a sidecar can pull secrets from a secure vault (like HashiCorp Vault) and populate them at runtime.
For practical AWS & DevOps learning, ping me: 9154078579

Question:-68 Why should containers run as non-root? What is image immutability?
1. If a container running as root is compromised, the attacker may gain root-level access to the host machine, leading to a full system compromise.
2. Non-root containers restrict what an attacker can do, such as installing malicious packages or modifying system files, which reduces the overall risk.
3. Even if an application vulnerability is exploited, the limited privileges of a non-root user prevent the attacker from performing administrative tasks or accessing sensitive host resources.
4. Orchestration platforms like OpenShift and Kubernetes strongly recommend or enforce running applications as non-root.
5. Using a custom non-root user with a specific user ID (UID) helps avoid file permission conflicts when mounting volumes between the container and the host.
Image immutability
Image immutability is a software engineering principle under which a container image, once built, is never changed. If any part of the image needs an update, such as a security patch or a code change, the existing image is not modified. Instead, a brand-new image is built to replace the old one.
For practical AWS & DevOps learning, ping me: 9154078579

Question:-67 What was the last production outage you handled? What was the root cause?
I have faced Docker-related production outages.
1. They were mainly due to resource mismanagement, networking misconfigurations, or external infrastructure dependencies.
2. Docker experienced a widespread outage affecting Docker Hub, Scout, and Build Cloud due to a major failure in AWS's US-East-1 region.
3. The failure was triggered by outages in AWS DynamoDB, EC2, and Network Load Balancers, leading to increased failure rates that ultimately took Docker services offline for over 24 hours.
4. Containers frequently hit Out-of-Memory (OOM) limits (exit code 137). This is often due to application memory leaks or to limits set too low for actual production workloads.
5. Unused images, stopped containers, and rapidly growing log files can fill the host's disk, preventing the Docker daemon from writing state and leading to a full system hang.
6. Overlay networks often fail due to MTU (Maximum Transmission Unit) mismatches between the container network and the physical host network, causing packet fragmentation and connection timeouts.
For practical AWS & DevOps learning, ping me: 9154078579
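Exit code 137 from point 4 follows the usual shell convention of 128 plus the signal number, and signal 9 is SIGKILL, which is what the kernel's OOM killer sends. The small Python sketch below decodes such codes. The function name is made up, and the signal names assume a Linux host.

```python
import signal


def describe_exit_code(code):
    """Explain a container exit code using the 128 + signal number convention."""
    if code > 128:
        number = code - 128
        try:
            name = signal.Signals(number).name
        except ValueError:
            name = f"signal {number}"  # number unknown on this platform
        return f"terminated by {name}"
    return f"exited with status {code}"


print(describe_exit_code(137))
print(describe_exit_code(143))
print(describe_exit_code(1))
```

SIGKILL can also come from docker kill or a manual kill -9, so a 137 is a strong hint of an OOM kill but should be confirmed, for example through the OOMKilled flag in docker inspect.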

Question:-66 How can we compress and rotate logs?
Without proper log rotation, log files can consume all available disk space and bring our system to its knees. Logrotate is the standard tool on Linux for automatically rotating, compressing, and removing old log files.
1. Logrotate configurations are typically found in /etc/logrotate.conf or in files within /etc/logrotate.d/.
2. Add the compress keyword to the logrotate configuration block to enable compression for rotated files.
3. Use delaycompress along with compress to postpone compression until the next rotation cycle.
4. Use rotate to define how many rotated logs to keep before deleting them.
    /var/log/myapp.log {
        daily
        rotate 7
        compress
        delaycompress
        missingok
        notifempty
        create 0640 root adm
    }
For practical AWS & DevOps learning, ping me: 9154078579
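To see what the configuration above leaves on disk, the following Python sketch lists the expected file names after a number of rotations. It is a simplified model under stated assumptions: numeric suffixes (no dateext), gzip as the compressor, and a log that is never empty at rotation time. The function name rotated_files is made up for this example.

```python
def rotated_files(log_path, rotate, rotations_done, delaycompress=True):
    """List the files expected on disk, newest first, after some rotations."""
    kept = min(rotate, rotations_done)  # older files beyond `rotate` are removed
    names = [log_path]  # the live log, recreated by the `create` directive
    for index in range(1, kept + 1):
        # With delaycompress, the most recent rotated file stays uncompressed.
        compressed = not (delaycompress and index == 1)
        names.append(f"{log_path}.{index}" + (".gz" if compressed else ""))
    return names


for name in rotated_files("/var/log/myapp.log", rotate=7, rotations_done=10):
    print(name)
```

With rotate 7 this prints the live log, an uncompressed .1 file, and .2.gz through .7.gz. Leaving .1 uncompressed is useful when the application keeps writing to the old file handle for a short time after rotation.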

Question:-65 What is the role of Docker DNS and service discovery?
Docker's embedded DNS and service discovery mechanism allows containers to communicate with each other using human-readable names instead of static, hardcoded IP addresses.
1. Docker runs an embedded DNS server at the internal IP 127.0.0.11 for every container connected to a user-defined network.
2. When a container requests a name (e.g., ping database), the embedded server checks its internal key-value store to find the corresponding container's IP on the same network.
3. If the name is not found internally (like google.com), Docker's DNS forwards the request to the external DNS servers configured on the host machine.
4. By using names rather than IPs, configurations remain consistent across development, testing, and production environments.
5. Service discovery is the process by which containers automatically locate and access other network services.
6. On a user-defined network, every container is automatically registered as a DNS A record under its --name.
7. We can assign multiple containers the same network alias using the --network-alias flag.
8. Services defined in a Docker Compose file can reach each other using their service names (e.g., web can connect to db) without additional setup.
For practical AWS & DevOps learning, ping me: 9154078579
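The lookup order described in points 2 and 3 can be modelled in a few lines of Python. This is a toy model only and not Docker's implementation: the record table, the addresses, and the upstream function are all invented for illustration.

```python
def resolve(name, network_records, upstream_lookup):
    """Answer from the network's own records first, otherwise ask upstream DNS."""
    addresses = network_records.get(name)
    if addresses:
        return addresses
    return upstream_lookup(name)


# Hypothetical records for one user-defined network. "api" is a shared alias.
records = {
    "db": ["172.18.0.2"],
    "api": ["172.18.0.3", "172.18.0.4"],
}


def fake_upstream(name):
    # Stand-in for the host's configured DNS servers.
    return ["203.0.113.10"]


print(resolve("db", records, fake_upstream))
print(resolve("api", records, fake_upstream))
print(resolve("example.com", records, fake_upstream))
```

The alias entry shows why one name can return several addresses when multiple containers share a --network-alias.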

Hi, I am starting a new AWS & DevOps session on Friday, 8-05-2026, at 9 AM IST. The main aim of the training is to help you clear interviews on your own. If you are interested in joining the session, ping me: 9154078579

Question:-64 How would you split a 200MB Terraform state file without breaking dependencies?
1. Splitting a large (200MB) Terraform state file means moving resources incrementally to new state files using terraform state mv.
2. Before anything else, create a backup of the remote state file, for example with terraform state pull > backup.tfstate.
3. Identify logical groupings (e.g., networking, database, services) to reduce the blast radius and team conflicts.
4. Create a new directory for the new state and add .tf files containing the resources we plan to move.
5. Use the terraform state mv command to move resources from the old state to the new state. The -state and -state-out flags work on local state files, so with a remote backend we pull the state first and push the result back afterwards.
    terraform state mv -state=old.tfstate -state-out=new.tfstate \
      module.network.aws_vpc.main module.network.aws_vpc.main
6. Run terraform plan in both directories to make sure no resources are being destroyed or recreated.
For practical AWS & DevOps learning, ping me: 9154078579
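When many resources have to move, the commands from step 5 can be generated instead of typed by hand. The Python sketch below only prints the commands and does not run Terraform. The function name and the second address are hypothetical, and the file names follow the example above.

```python
import shlex


def state_mv_commands(addresses, source_state="old.tfstate", target_state="new.tfstate"):
    """Build one `terraform state mv` command per resource address."""
    commands = []
    for address in addresses:
        commands.append(shlex.join([
            "terraform", "state", "mv",
            f"-state={source_state}",
            f"-state-out={target_state}",
            address,  # address in the old state
            address,  # same address in the new state
        ]))
    return commands


# Example addresses; the second one is invented to show shell quoting.
to_move = ["module.network.aws_vpc.main", "aws_subnet.private[0]"]
for command in state_mv_commands(to_move):
    print(command)
```

shlex.join quotes addresses that contain brackets, so the output can be pasted into a shell. Reviewing the printed list before running anything keeps the move incremental and easy to audit.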

Question:-63 How does Docker Engine interact with the container runtime?
1. Docker Engine primarily uses containerd as its high-level runtime and runc as its low-level executor.
2. A user issues a command (e.g., docker run) via the Docker CLI. The Docker daemon (dockerd) receives this request through the Docker Engine API. It handles high-level concerns like image management and network setup.
3. The daemon sends an execution request to containerd via gRPC (a remote procedure call framework). containerd is responsible for the full container lifecycle, including image pulling and monitoring container health.
4. containerd starts a shim process for each container. This shim acts as an intermediary that keeps the container's standard input/output (STDIO) open and allows the runtime to exit after starting the container.
5. The shim invokes runc, an Open Container Initiative (OCI)-compliant runtime. runc interacts directly with the Linux kernel to create the namespaces and cgroups that isolate the container.
For practical AWS & DevOps learning, ping me: 9154078579

Question:-62 How do you structure Terraform code for a team managing 100+ accounts across multiple environments like dev, staging, and production?
Managing over 100 accounts across multiple environments (Dev, Staging, Production) requires a strictly structured, modular, and automated Terraform approach to avoid configuration drift and accidental destruction.
1. A common practice at this scale is a component-based module structure paired with Terragrunt, which keeps configurations DRY (Don't Repeat Yourself) while keeping environment state strictly separated.
2. A popular approach for large-scale environments is a mono-repo that separates modules (reusable code) from live infrastructure (specific configuration).
3. Use Terragrunt to automatically configure a unique S3 backend key for every component in every region of every account.
Structure: bucket / env / account / region / component / terraform.tfstate
4. Instead of creating a single "application" module, break infrastructure into small, manageable units.
Networking component: VPC, Subnets, Route Tables.
Data component: RDS, DynamoDB, ElastiCache.
App component: EKS, ECS, Lambda.
5. When a PR is created, Atlantis runs terragrunt plan on the specific changed directory, providing visibility and requiring approval before apply.
6. Use Terragrunt to automatically generate the provider "aws" block for each account, assuming the appropriate IAM roles.
For practical AWS & DevOps learning, ping me: 9154078579
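The key layout from point 3 can be expressed as a small function, which also shows why no two components ever share a state file. This Python sketch is only an illustration of the naming scheme. In practice Terragrunt derives the key from the directory path, and the function name and sample values here are made up.

```python
def state_key(env, account, region, component):
    """Build the S3 object key env/account/region/component/terraform.tfstate."""
    parts = [env, account, region, component]
    for part in parts:
        if not part or "/" in part:
            raise ValueError(f"invalid path segment: {part!r}")
    return "/".join(parts + ["terraform.tfstate"])


# Hypothetical inventory: two environments, one account, one region, three components.
keys = {
    state_key(env, "payments", "us-east-1", component)
    for env in ("dev", "prod")
    for component in ("networking", "data", "app")
}
print(sorted(keys))
```

Every combination yields a distinct key, so a plan or apply in one component can only touch that component's state.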

Devops and aws interview preparation Hyderabad - Statistics & analytics of Telegram channel @hyderabadawsdevopsinterviewprepa