DevOps Engineer
AI workloads are brutal: petabytes of data, distributed jobs, and real-time GPU orchestration. We’re building an AI-first DevOps infrastructure that makes compute reliable, scalable, and cost-effective. If you love infrastructure automation, cloud-native engineering, and AI performance tuning, you’ll love this role.

What you’ll do
- Design and manage scalable, fault-tolerant AI compute infrastructure
- Automate GPU provisioning, multi-cloud scheduling, and scaling strategies
- Improve observability, logging, and monitoring for real-time AI workloads
- Optimize containerized deployments for Kubernetes, Nomad, or Slurm
- Enhance security, CI/CD, and cloud networking for high-performance distributed training
- Implement security best practices for DevOps pipelines, including secrets management, infrastructure security, and compliance automation
- Reduce infrastructure cost and maximize performance through automation and tuning

What we’re looking for
- Deep knowledge of CI/CD pipelines and infrastructure as code
- Hands-on experience with monitoring and logging tools (Prometheus, Grafana, OpenTelemetry)
- Proficiency in shell scripting, Python, or Go for automation
- Experience with security best practices for cloud environments, including IAM, container security, and incident response

Nice to haves
- Experience managing large-scale clusters (with Kubernetes or other approaches) and cloud infrastructure
- Experience with Terraform, Ansible, Helm, or Pulumi
- Understanding of AI/ML compute environments (GPUs, CUDA, NCCL, Slurm, Horovod)

Our culture
- We move fast. We ship weekly: new features, improvements, and fixes go live fast.
- We test big. Every month, we stress test with large groups of users face to face, get real-world feedback, and iterate rapidly.
- We build together. On site only, in SF or Sydney.
- We iterate relentlessly. Direct user feedback shapes our roadmap: we release, test, refine, and keep moving.
- We travel when needed. Engineers may travel between SF and Sydney to run events and meet with clients.

Location: SF or Sydney (OG startup house vibe, great food, late nights, all the GPUs)

Equipment & Benefits
- Top-spec MacBook plus a separate GPU cluster dev environment for each engineer.
- Weekly cash bonus when you work out 3+ times a week.
- Comprehensive health benefits, including a choice of Kaiser, Aetna OAMC, and HDHP (HSA-eligible) plans for our SF-based team members.
- A 20-year exercise window for options, the highest in the world.

Don’t have all the skills? Apply anyway! We’re looking for people who move fast, learn fast, and ship fast. If that’s you, let’s talk.

Want to get to know us first? Attend one of our upcoming events.