Project Engineer (Part time or Full time) - ADL
Posted: 26/02/2025
Closing Date: 12/03/2025
Job Type: Part time or Full Time
We’re building the operating system for AI compute: seamless workstation-style access as a single entry point into global compute, with ultra-fast data transit connecting everything. If you love high-performance computing, distributed systems, and AI infrastructure, and have experience managing large-scale GPU clusters and storage systems, you’ll fit right in.

What you’ll work on:
- Scalable, distributed AI infrastructure across cloud, on-prem, and colocation environments
- GPU orchestration and fault-tolerant scheduling (Slurm, Kubernetes, Ray, and other orchestration frameworks)
- Supercomputing clusters and high-performance storage solutions for AI workloads
- Ultra-fast data pipelines for petabyte-scale AI training workloads
- Multi-cloud orchestration and on-premise AI data centers, making compute feel like a single, unified system
- DevOps & MLOps automation for streamlined model training and deployment
- Security and reliability for distributed computing across the public internet
- Scaling compute clusters 10-20x, from 128 to 1024+ GPUs, ensuring high uptime, reliability, and utilization
- Optimizing HPC clusters for AI training, including upgrade pathways and cost-efficiency strategies

Your background would include some or all of the following:
- Strong systems engineering skills with experience in distributed computing and storage for AI workloads
- Proficiency in GPU cluster management, including NVIDIA GPUs, Slurm, and Kubernetes
- Deep understanding of distributed training frameworks and multi-cloud architectures (AWS, GCP, Azure, and emerging GPU clouds)
- Experience managing large-scale clusters, including team leadership, hiring, and scaling operations
- Expertise in high-performance storage (Ceph, S3, ZFS, Lustre, and others) for massive AI datasets
- Ability to optimize cluster utilization, uptime, and scheduling for cost-effective operations
- Understanding of colocation strategies, managing AI data centers, and running HPC workloads in mixed environments
- DevOps/MLOps experience, automating training pipelines for large-scale AI models
- Experience working with AI/ML researchers, optimizing infrastructure for deep learning training

This role is perfect for senior engineers who have built and scaled large AI compute clusters and are passionate about pushing the boundaries of distributed computing and AI training infrastructure.

Our culture
- We move fast. We ship weekly: new features, improvements, and fixes go live fast.
- We test big. Every month, we stress test with large groups of users face to face, get real-world feedback, and iterate rapidly.
- We build together. On site only, in SF or Sydney.
- We iterate relentlessly. Direct user feedback shapes our roadmap: we release, test, refine, and keep moving.
- We travel when needed. Engineers may travel between SF and Sydney to run events and meet with clients.

Location: SF or Sydney (OG startup house vibe, great food, late nights, all the GPUs)

Equipment & Benefits:
- Top-spec MacBook plus a separate GPU cluster dev environment for each engineer.
- Weekly cash bonus when you work out 3+ times a week.
- Comprehensive health benefits, including a choice of Kaiser, Aetna OAMC, and HDHP (HSA-eligible) plans for our SF-based team members.
- 20-year exercise window for options (highest in the world).

Don’t have all the skills? Apply anyway! We’re looking for people who move fast, learn fast, and ship fast. If that’s you, let’s talk.

Want to get to know us first? Attend one of our upcoming events.