Abacus.AI is an AI research + SaaS company helping enterprises build, deploy, and operate real-time deep learning systems in production. We work with some of the most data-driven teams globally — and we’re scaling fast.
The Role:
We’re looking for a Cloud Infra Support Engineer to own production reliability, work closely with customers, and act as an on-call operations support engineer during US (PST) night hours.
What You’ll Do:
Own and support production cloud infrastructure (AWS / GCP / Azure)
Act as on-call ops support during PST night hours (rotational, not continuous)
Monitor systems, alerts, and dashboards; respond to incidents and outages
Troubleshoot customer escalations and production issues
Work hands-on with Kubernetes-based platforms and customer onboarding
Build and improve monitoring, alerting, and automation
Represent customer needs and influence platform and infra improvements
What We’re Looking For:
2+ years in Cloud Infra / DevOps / SRE / Ops roles
Strong experience managing production environments
Comfortable working night shifts aligned with US (PST) time
Familiarity with Spark, TensorFlow, GPUs, or MLOps (nice to have)
Strong troubleshooting and customer-facing communication skills
What We Offer:
Competitive salary and equity package
Opportunity to work with cutting-edge AI technology
Collaborative and innovative work environment
Professional development and learning opportunities
Culture:
We believe in giving everyone autonomy and ownership, and we don’t believe in over-management. We have a hands-off work-from-home environment where each individual takes personal responsibility for their work.
If you're interested, feel free to share your updated resume with Harsh Saxena - harshsaxena@abacus.ai
Looking forward to hearing from you!