r/devops 13d ago

Career / learning Python dev (Django/FastAPI/Docker/K8s) trying to break into DevOps — what should I prioritize, and what are the real problems no one warns you about?

Hey everyone, long-time lurker, first time posting here. Looking for honest advice from people who've actually made this kind of transition.

My current stack:

Python · Django / FastAPI · Docker + Compose · Kubernetes (basics) · Redis / PostgreSQL · Celery / Async · Bash / Linux · RTSP / FFmpeg pipelines / LLMs · YOLO / OpenCV

I've been building backend systems and a full AI-powered camera security system from the ground up — ingestion pipelines, async workers, containerized deployments, the whole thing. So I'm not starting from scratch, but I know my infra/ops knowledge has real gaps.

Now I want to go deeper into the operations side — CI/CD pipelines, infrastructure-as-code, monitoring, cloud, reliability engineering. Basically bridge the gap between "I can Dockerize things" and "I own the entire deployment lifecycle."

What I want to learn next:

  • CI/CD pipelines end-to-end (GitHub Actions, GitLab CI, Jenkins?)
  • Terraform or Pulumi for infrastructure-as-code
  • Proper Kubernetes beyond just "kubectl apply" — RBAC, Helm, Ingress, autoscaling
  • Cloud fundamentals — AWS or GCP (which is better to start with?)
  • Observability stack — Prometheus, Grafana, ELK, alerting
  • GitOps workflows — ArgoCD, FluxCD

Real questions for this community:

  1. What order should I learn these in? I've seen conflicting roadmaps. Some say start with cloud, others say master Linux first, others say just go build something and learn as you go.
  2. What are the actual painful problems nobody tells you about? Not the beginner stuff — I mean the things that trip up even experienced engineers. The stuff that takes months to unlearn or figure out on your own.
  3. Career reality check — I'm coming from a Python/ML background. Will that help me in DevOps roles or will recruiters just not take me seriously because I don't have a traditional sysadmin / infra background?

The real problems I'm already anticipating (want your take on these):

  • Tool sprawl confusion — Terraform vs Pulumi vs CDK vs Ansible vs Chef — no one agrees and every job posting wants something different. How did you pick one and stick with it?
  • Cloud costs — I have zero experience budgeting cloud infra and I know this bites everyone at some point. Any war stories?
  • Debugging distributed failures — logs scattered across 10 services, no clear owner, alerts firing at midnight. How long did it take you to get good at this?
  • Kubernetes complexity cliff — goes from "simple" to genuinely hard very fast, and tutorials always skip the hard parts. What resource actually helped you get past that wall?
  • "DevOps is a culture, not a role" — some companies don't even have a DevOps team, it's just dumped on top of dev work with no extra support or title. How common is this really?
  • Imposter syndrome — coming in as a developer, not a sysadmin, means constantly feeling like you're missing some foundational Linux/networking knowledge everyone else just has. Did this get better?
14 Upvotes

36 comments

35

u/AskOk2424 13d ago

— 

6

u/HeligKo 13d ago

Python experience will serve you well. Just be honest and dive in. Sounds like you have a background that can convert. Of course you will have some weak spots, but we all do when in transition.

1

u/TodayFar9846 13d ago

Thanks man, this actually helps. Weak spots I know I have right now: never touched real cloud infra, no proper IaC experience, and my monitoring is basically just logs and hope lol. Those are next on my list.

7

u/Afraid_Prompt_2379 13d ago

Everything else comes later. First you need to work on Linux and networking; after that, go ahead. Everything you mentioned looks good.

2

u/[deleted] 12d ago

[removed] — view removed comment

2

u/Afraid_Prompt_2379 12d ago

Absolutely correct, thanks for giving him more clarity with your thoughts 👍

6

u/SadServers_com 13d ago
  1. I'd say it depends on how good your fundamentals are. If you have a decent base, build and learn as you go. Learning doesn't have to be linear or follow a strict roadmap; you can start a project and then, in the middle, do a deep dive into some Linux or cloud topic you need or are just interested in learning.

  2. Good question and not sure about the answer; perhaps long-term maintenance, things breaking in unexpected ways (usually when it comes in contact with reality/users)

  3. Absolutely it will help you. In theory (i.e. Google's definition), an SRE, for example, is supposed to be a developer doing ops in a developer way rather than a sysadmin way. In reality most DevOps/SREs come from the sysadmin world, and the ones coming from the SWE world are more valued.

- Tool sprawl confusion: pick one tool from each category to begin with and stick with it; use the most popular one or the one you hate least :-) For example, Terraform & Ansible rather than Pulumi/Puppet. Go deep on one project end-to-end, and then if you want you can go wide and learn other tooling.

- Cloud costs: set up daily budget alerts and tear down everything or expensive stuff frequently (ideally daily, this is what IaC is for after all)

- Debugging distributed failures ("How long did it take you to get good at this?"): nobody is good at this; all companies have a lot of monitoring/alert noise :)

- Kubernetes complexity cliff : I'd delay a bit getting into k8s, only after getting good Docker and general VM/networking skills. The book "Kubernetes in Action" is very good. Then it's practice, ideally with real workloads.

- "DevOps is a culture, not a role": sigh, the last argument I had here was about this. This is my take: https://docs.sadservers.com/blog/what-the-f-is-devops/ . Some companies have "DevOps" roles, some don't (but have Platform/Cloud/Infra/Production/Ops/whatever titles or job descriptions). Some companies have devs doing DevOps; we don't know how many. Anecdotally I'd say not the majority. There's a whole book, "Team Topologies", discussing different ways of organizing this. There's no "ideal" one (the person I argued with said this is "an anti-pattern").

- Imposter syndrome: yes you just learn, also I'd dare to say it's easier for a dev to pick up infra than for a sysadmin to pick up dev.
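
On the cloud costs point above, here is a rough sketch of what a daily budget check could look like. It is not tied to any cloud provider's billing API: the function name `check_budget`, the `BudgetStatus` fields, and the numbers are all made up for illustration, and the projection is a naive straight-line estimate. In practice you would feed it whatever daily cost figures your provider exports.

```python
from dataclasses import dataclass


@dataclass
class BudgetStatus:
    spent: float        # total spend so far this month
    projected: float    # naive straight-line month-end projection
    over_budget: bool   # True if the projection exceeds the budget


def check_budget(daily_costs, days_in_month, monthly_budget):
    """Project month-end spend from the daily costs seen so far."""
    if not daily_costs:
        return BudgetStatus(0.0, 0.0, False)
    spent = sum(daily_costs)
    # Straight-line projection: average daily cost times days in the month.
    projected = spent / len(daily_costs) * days_in_month
    return BudgetStatus(spent, projected, projected > monthly_budget)


if __name__ == "__main__":
    # Hypothetical numbers: 5 days of spend against a 50 USD monthly budget.
    status = check_budget([1.0, 1.5, 2.0, 4.0, 6.5], days_in_month=30, monthly_budget=50.0)
    if status.over_budget:
        print(f"ALERT: projected {status.projected:.2f} exceeds budget")
```

Running something like this once a day catches a forgotten resource after one day of spend instead of at the end of the month, which is the whole point of the daily alert.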

2

u/TodayFar9846 12d ago

Thank you man for this incredibly grounded and practical reality check! I love the reminder that learning doesn't have to be linear, and focusing on a developer-first approach to infra makes total sense. Delaying K8s until Docker/networking are solid, and keeping tool sprawl minimal, is exactly the roadmap I need to avoid burnout. Appreciate you cutting through the hype!

2

u/SadServers_com 12d ago

happy to help :-)

3

u/[deleted] 13d ago

[removed] — view removed comment

1

u/TodayFar9846 13d ago

this is honestly the most honest comment in this post. you're pointing at the stuff that actually matters.

hands on first, everything else later. thanks for this.

2

u/Cultural_Cry535 13d ago

Hey!
Can't give you advice, but I'm a Python developer too and I'm trying to learn k8s by doing a personal project. Maybe we can sync up if you're interested.

2

u/TodayFar9846 13d ago

Definitely. DM me, let's build the server and start grinding...

2

u/mqfr98j4 13d ago

Get comfortable with continuous upgrades, bc every time you upgrade {{ anything }}, it's due for another upgrade, along with all of the other services that interface with it.

No one really warns you about that, but depending on the project, upgrades can become your full-time job

2

u/[deleted] 13d ago

[removed] — view removed comment

1

u/TodayFar9846 13d ago

bro you are going raw on this whole field lol and honestly i needed to hear it.

same title, completely different job depending on the company. that's actually scary.

gonna read JDs way more carefully from now on. thanks for this one fr.

2

u/BobHabib 13d ago edited 13d ago

AWS, K8s, Docker, CI/CD, and even Terraform all run on a foundation of Linux and networking.

So don't be like me back in 2022, when I was deploying a Kubernetes cluster on a local cloud server and there was some issue with SMB, and I suddenly had to figure out wtf SMB is lol

IMO, devops and SRE are 75% ops and 25% dev related.

Also, if you are going into SRE, get really comfortable with logs, metrics, and traces, specifically Prometheus and ELK.
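
For a Python dev, one low-effort way to get a feel for Prometheus metrics is to look at what a scrape actually returns. Below is a minimal stdlib-only sketch that renders a counter in the Prometheus text exposition format. The function name `render_counter` and the sample metric are assumptions for illustration, and the sketch skips label-value escaping; in a real service you would normally use an official client library instead of hand-rolling this.

```python
def render_counter(name, help_text, samples):
    """Render one counter in the Prometheus text exposition format.

    samples maps a tuple of (label_name, label_value) pairs to a value.
    Label values are not escaped here (assumed to be plain strings).
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples.items():
        label_str = ",".join(f'{k}="{v}"' for k, v in labels)
        if label_str:
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            # A series without labels is written without braces.
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical metric: requests broken down by method and status code.
    print(render_counter(
        "http_requests_total",
        "Total HTTP requests.",
        {(("method", "GET"), ("code", "200")): 3},
    ))
```

Each distinct label combination is its own time series, which is worth keeping in mind before putting something like a user ID into a label.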

2

u/TodayFar9846 13d ago

Thanks a lot, this gives me a very clear vision. I really needed this type of answer.

2

u/actionerror DevSecOps/Platform/Site Reliability Engineer 13d ago

Not Jankins

1

u/TodayFar9846 12d ago

Thank you, man! I appreciate it. But could you give a brief explanation of the reasoning behind it, so new learners can properly understand it?

1

u/actionerror DevSecOps/Platform/Site Reliability Engineer 12d ago

There are way better tools out there today that don’t have such security flaws and use bastardized groovy that makes writing the scripts hard. A decade ago or so, sure, that’s pretty much what we had, but to use it now for a greenfield project is crazy. Most likely, you would join an org that’s been using it for a long time.

1

u/TodayFar9846 12d ago

There might be better tools available on the market right now. If you're familiar with any of them, I'd love to get your recommendations. In your opinion, what should I learn? Also, what are you personally using in your day-to-day workflows instead of this?

2

u/masterofrants 10d ago

Just go to eBay and get a used Dell server for around $1,000 with 128 GB of RAM, maybe 1 TB of hard drive space, and a good enough processor. You have all the computing power you need to do anything you want to learn. That's how I am doing it.

Then you can throw in Proxmox on it and it will give you all the support you need for all this tooling and everything else.

I just spent the last 5 hours building a Discord bot and pushing it to Docker using Ansible and now it's on my GitHub as well. Of course the whole thing was vibe coded but it was fun to do a personal project so yeah, think about personal projects you want to do and do them using these tools.

1

u/Important-Hunt-61 13d ago

I am very similar to you. Long time SWE looking at DevOps roles. I have been using ArgoCD + GitHub Actions to deploy an application to my local cluster. From there I was experimenting with the different ArgoCD features, like how I could auto deploy/destroy on PR submissions. I kinda figured that out, so lately I've been messing around with Pulumi to build my infra. I did a Lucidchart diagram of what it would look like.

As far as IaC choice goes, Terraform seems to be the most popular, but having used it before I wanted to get exposure to Pulumi. I'd say just pick Terraform and go with it till you understand it. TBH Terraform or IaC isn't really the hard part; it's knowing what underlying resources you need in AWS to get something going. For example, an ECS deployment needs 6-7 things before you actually have a cluster.

I think you have a good outline of stuff to learn. You really need to know how the cloud provider works, especially VPC, route tables, subnets (private vs public), Internet Gateway, NAT Gateway, VPC Endpoints, VPC peering, VPNs, etc. I've attempted the AWS Solutions Architect classes a few times but never finished. However, they do provide a really good dive into those topics.
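
As a small illustration of the VPC/subnet planning mentioned above, here is a sketch using Python's standard `ipaddress` module to carve a VPC CIDR into non-overlapping subnet blocks. The function name `plan_subnets` and the example CIDR are assumptions for illustration. Note that in AWS the "public" vs "private" label here is only a naming choice: what actually makes a subnet public is its route table pointing at an Internet Gateway, not its address range.

```python
import ipaddress
from itertools import islice


def plan_subnets(vpc_cidr, new_prefix, public_count, private_count):
    """Split a VPC CIDR into equally sized, non-overlapping subnet blocks."""
    vpc = ipaddress.ip_network(vpc_cidr)
    needed = public_count + private_count
    # subnets() yields consecutive blocks of the requested prefix length.
    blocks = list(islice(vpc.subnets(new_prefix=new_prefix), needed))
    if len(blocks) < needed:
        raise ValueError(f"{vpc_cidr} cannot hold {needed} /{new_prefix} subnets")
    return {
        "public": [str(b) for b in blocks[:public_count]],
        "private": [str(b) for b in blocks[public_count:]],
    }


if __name__ == "__main__":
    # Hypothetical layout: a /16 VPC with two public and two private /24s.
    for kind, cidrs in plan_subnets("10.0.0.0/16", 24, 2, 2).items():
        print(kind, cidrs)
```

Doing the address math up front like this makes it easier to see what you then have to express in Terraform or Pulumi: the subnets themselves, plus the route tables and gateways that give each one its behavior.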

Books

  • GitOps Cookbook (I worked through this)
  • ArgoCD Up and Running (bought but haven't touched)
  • Terraform Up and Running (worked through an old version of this. I believe it's since been updated)

1

u/[deleted] 5d ago

[removed] — view removed comment