Software Engineer, Infrastructure
San Francisco · onsite · mid · $180K – $250K
via Greenhouse
About this role
You are a hands-on engineer who builds the software and processes that keep a large fleet of GPU servers healthy and productive. You write systems and tooling for managing thousands of servers, including provisioning, health monitoring, error detection, and recovery. When something breaks that automation can't fix, you drive resolution with partners.
Key responsibilities
Build and maintain a Python fleet tracking system that manages the full lifecycle of servers, including contracting and procurement, target use, pricing, availability, health, RMAs, etc.
Build server management tooling that automates provisioning, health checks, GPU diagnostics, recovery, and alerting…
What we'd score you on
reqspace match rubric
Five dimensions, recruiter-grade. Upload your resume and we'll generate a written explanation of where you fit and where the gaps are.
1. Skills match
For this role: python, terraform, ansible, teams, soc 2…
2. Level fit
This role is mid-level. We check your trajectory against it.
3. Domain experience
Your work in the role's domain matters more than your years total. We weight recent and direct experience.
4. Recency
A skill you used last quarter weighs more than one from five years ago. We grade on recency, not lifetime.
5. Location fit
This role is based in San Francisco. We weight your proximity and willingness to relocate.
Skills in this role
Pulled from the job description. These are the keywords we'll weight when scoring your fit.
python, terraform, ansible, teams, soc 2, iso 27001
More at Fal
- Sales Development Representative (Inbound), San Francisco
- Account Executive, Commercial, San Francisco
- Account Executive, Enterprise, San Francisco
- Business Development Representative (BDR), San Francisco
- Commercial Legal, San Francisco
- Account Manager, Enterprise (Singapore), Singapore
