Senior DevOps Engineer
Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with the organization of the first Data Science UA conference, setting the foundation for our growth. Over the past 9 years, we have diligently fostered the largest Data Science Community in Eastern Europe, boasting a network of over 30,000 top AI engineers.
About the client:
Our client specializes in inventory optimization for industrial and manufacturing companies, helping them streamline inventory and optimize working capital. Their browser-based platform, hosted on Azure, is designed to manage and refine inventory processes, enabling companies to buy only what they need and proactively manage stock.
About the role:
We’re looking for a seasoned DevOps expert who will take full ownership of infrastructure and reliability topics – designing, building, and maintaining a scalable and secure environment for a mission-critical healthcare platform.
Requirements:
– 8+ years in production operations / DevOps roles;
– Strong experience with AWS, Kubernetes, Docker, Terraform;
– Advanced Linux system administration and network troubleshooting;
– Experience with CI/CD (preferably GitLab CI/CD);
– Monitoring stack experience (Prometheus, Grafana, ELK, or similar);
– Nice to have: PHP ecosystem familiarity + advanced MySQL administration experience.
Key responsibilities:
– Own and manage DevOps and infrastructure topics end-to-end;
– Monitor, maintain, and improve uptime, reliability, performance, and user experience;
– Design and track SLOs/SLIs suitable for healthcare-grade reliability;
– Manage CI/CD pipelines (GitLab), deployments, and rollbacks;
– Implement and enforce IaC best practices (Terraform, Ansible or similar);
– Build monitoring, alerting, and observability solutions with proactive incident detection;
– Troubleshoot production issues across AWS, Linux, MySQL, networking, and web servers (Apache/Nginx);
– Ensure platform security and GDPR compliance;
– Plan capacity and scalability as the platform grows;
– Lead post-incident reviews and drive continuous reliability improvements;
– Collaborate closely with the CTO and engineering team on architecture decisions;
– Participate in on-call rotations during business hours (no nights/weekends).
The company offers:
– Opportunity to work on a cutting-edge localization platform with AI-driven innovation;
– A collaborative, dynamic team environment with a culture of learning and growth;
– Competitive salary and flexible work arrangements.