Read everything carefully. The requirements and screening questions are critical. Applications that do not answer them correctly and satisfactorily will be auto-rejected, which wastes your time.
- Work from Home.
- This is a full-time role. If you plan to hold 2 or more jobs at the same time, or want to do this part-time, that won't work for us. In that case, please do not apply, as your application will be auto-rejected.
- Note: this job requires working late nights, until 4 AM India time, to overlap with US working hours. Do not apply if this timing doesn't work for you.
- Salary depends on experience and on current compensation, which must be verifiable through paychecks.
- Junior candidates with 2 years of experience are suitable.
Private Cloud AI Platform Engineer
About Qubrid AI
Qubrid AI is building a full-stack AI infrastructure platform that enables enterprises to deploy, manage, and scale AI workloads across cloud, on-premises, and hybrid environments. Our platform combines GPU infrastructure, AI model serving, inference APIs, RAG services, model management, and enterprise AI software into a unified solution.
We are seeking a hands-on Private Cloud AI Platform Engineer to help build and enhance our on-premises AI platform. This role is focused on developing enterprise-grade software that allows customers to deploy and operate AI infrastructure within their own data centers, similar to how platforms such as Nutanix, VMware, OpenShift, and other private cloud solutions are delivered and managed.
Role Overview
As a Private Cloud AI Platform Engineer, you will work on the software that powers Qubrid's on-prem AI platform. You will develop features that simplify deployment, management, monitoring, and operation of AI infrastructure, GPU clusters, and AI models in enterprise environments.
The ideal candidate enjoys building products that combine software engineering, cloud-native technologies, infrastructure automation, Linux systems, networking, and AI infrastructure.
This is a hands-on engineering role that requires strong coding skills along with a practical understanding of enterprise infrastructure environments.
Responsibilities
Platform Development
- Develop and enhance Qubrid's on-prem AI platform and management software.
- Build enterprise-grade platform features for AI infrastructure management.
- Design and develop APIs, backend services, and platform integrations.
- Create software that simplifies deployment and management of AI workloads in customer environments.
- Build self-service workflows for infrastructure and model deployment.
Enterprise Platform Features
- Develop user management, role-based access control (RBAC), and multi-tenancy capabilities.
- Implement SSO, LDAP, Active Directory, and SAML integrations.
- Build audit logging, monitoring, alerting, and operational dashboards.
- Develop upgrade, patch management, and lifecycle management capabilities.
- Support enterprise security and compliance requirements.
Infrastructure & Automation
- Work with Kubernetes-based deployments and orchestration systems.
- Automate installation and configuration of AI infrastructure.
- Develop cluster provisioning and management workflows.
- Build software for monitoring GPU, compute, networking, and storage resources.
- Integrate with cloud and hybrid cloud environments.
AI Platform Integration
- Integrate AI inference services into the platform.
- Support model deployment, management, and lifecycle workflows.
- Develop APIs and services for AI applications and model serving.
- Enhance observability and operational management of AI workloads.
Required Qualifications
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 2+ years of software development experience.
- Strong Python development skills.
- Experience building backend systems and APIs.
- Experience with Linux administration and troubleshooting.
- Understanding of networking fundamentals including TCP/IP, DNS, routing, firewalls, VLANs, and load balancing.
- Experience with Docker and containerized applications.
- Familiarity with Kubernetes and cloud-native technologies.
- Strong problem-solving and debugging skills.
Preferred Qualifications
- Experience developing private cloud, virtualization, or enterprise infrastructure platforms.
- Experience with technologies such as VMware, Nutanix, OpenShift, Rancher, KubeVirt, or OpenStack.
- Experience with GPU infrastructure and AI workloads.
- Knowledge of enterprise authentication systems such as LDAP, Active Directory, SAML, or OAuth.
- Experience with infrastructure automation tools.
- Familiarity with AI model deployment and inference platforms.
- Experience with monitoring tools such as Prometheus and Grafana.
Technical Skills
Software Development
- Python
- REST APIs
- Microservices
- PostgreSQL
- Redis
Infrastructure
- Linux
- Kubernetes
- Docker
- Networking
- Storage Systems
- Virtualization Platforms
Enterprise Technologies
- Active Directory
- LDAP
- SAML
- OAuth
- RBAC
- Audit Logging
Monitoring & Operations
- Prometheus
- Grafana
- Logging and Observability
- System Monitoring
- Troubleshooting and Root Cause Analysis
What You'll Build
- Enterprise AI infrastructure management software
- Private AI cloud deployment platform
- Multi-tenant AI environments
- User and access management systems
- Infrastructure monitoring and reporting tools
- Hybrid cloud integration capabilities
- GPU and AI workload management features
- Enterprise-grade operational tooling
Why Join Qubrid AI
- Build the future of private AI infrastructure.
- Work on cutting-edge AI and GPU technologies.
- Gain exposure to cloud, infrastructure, AI, and enterprise software development.
- Help enterprises deploy and operate AI securely within their own environments.
- Work with a fast-growing team focused on democratizing AI infrastructure.