DevOps / Platform Engineer
Engineering | Remote | Full-time
About the Role
Join our engineering team as a DevOps / Platform Engineer responsible for the reliability, scalability, and security of our infrastructure. You’ll own our Oracle Cloud and Kubernetes environments, build CI/CD pipelines, and ensure our backend services, real-time telematics pipelines, and camera/vehicle integrations run smoothly in production.
Responsibilities
- Design, deploy, and maintain infrastructure on Oracle Cloud (compute, networking, storage, managed services)
- Provision, operate, and optimize Kubernetes clusters (OKE or equivalent), including node pools, namespaces, RBAC, and autoscaling
- Implement and maintain CI/CD pipelines for backend services, APIs, and data pipelines (build, test, deploy, rollback)
- Configure and manage networking: VCNs, subnets, load balancers, security lists, DNS, and traffic routing
- Work directly with TCP and UDP protocols to support real-time telematics, camera streams, and device integrations
- Set up and maintain observability stacks (logs, metrics, traces, alerts, dashboards) for services and infrastructure
- Automate infrastructure and operational workflows using Infrastructure as Code (e.g., Terraform, Helm) and scripting (Bash, Python, etc.)
- Improve system reliability and performance through capacity planning, tuning, and resilient deployment strategies (rolling, blue/green, canary)
- Harden security across the platform, including IAM, secrets management, TLS, network policies, and OS hardening
- Collaborate with backend engineers and system engineers on architecture, service design, and performance reviews
- Participate in on-call rotations, incident response, root-cause analysis, and post-incident process improvements
- Monitor and help optimize cloud infrastructure costs while maintaining performance and reliability targets
Required Qualifications
- Bachelor's degree in Computer Science, Software Engineering, Systems Engineering, or a related technical field (or equivalent practical experience)
- 2–5 years of professional experience in DevOps, SRE, Platform Engineering, or similar roles
- Strong hands-on experience with Kubernetes (cluster operations, deployments, services, ingress, Helm, etc.)
- Experience managing cloud infrastructure (preferably Oracle Cloud; AWS/GCP/Azure acceptable with willingness to learn OCI)
- Solid understanding of networking fundamentals, including TCP/UDP, load balancing, routing, and DNS
- Strong Linux systems experience (servers, processes, storage, permissions, resource limits, troubleshooting)
- Experience building and maintaining CI/CD pipelines (GitHub Actions, GitLab CI, or similar)
- Proficiency with at least one scripting language (Bash, Python, or similar) for automation and tooling
- Good grasp of data structures and algorithms as they relate to performance, scalability, and distributed systems debugging
- Experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK/EFK, OpenTelemetry, or similar)
- Familiarity with security best practices and compliance-aware development (e.g., HIPAA, SOC 2, least-privilege access)
- Clear communication and cross-team collaboration abilities
- Proactive, self-driven, and committed to continuous learning
- Able to thrive in fast-paced, startup-like environments
Preferred Qualifications
- Direct experience with Oracle Cloud Infrastructure (OCI) services such as OKE, Load Balancing, Object Storage, IAM, and VCNs
- Experience with real-time data systems for telematics, IoT, or fleet-management platforms
- Familiarity with multi-tenant SaaS architectures and microservice-based systems
- Experience operating message brokers/queues in production (RabbitMQ, Kafka, etc.)
- Experience with secrets management tools (e.g., HashiCorp Vault, Doppler, or similar)
- Experience with infrastructure, project-tracking, and documentation tools (Terraform, Helm, Jira, Confluence, Notion)
- Prior participation in structured on-call rotations and incident management processes