Sr. DevOps Engineer (Site Reliability Engineer) - US Company
Excellent progression opportunity & company culture
Great overall salary and work-life balance
About Our Client
A world leader in international education SaaS systems and services, serving more than 3 million students across 10,000 schools in over 130 countries through its 3 integrated systems.
Job Description
- Reliably automate the server provisioning process to reduce the manual workload of our R&D team.
- Build scalable infrastructure to manage high-load, concurrent sessions, supporting ~50 million monthly page views and 500k+ active users.
- Drive the company through "Disaster Recovery Tests", where we manually turn down pieces of infrastructure to test our products' overall resilience to failures.
- Implement the systems and processes that Product Developers use to deploy their software into production.
- Build an auto-remediation system to automatically resolve production incidents before escalating them to on-call Developers.
The Successful Applicant
- Minimum of 5 years of system administration experience for a high-usage, web-based software service, ideally built using open-source software components.
- Knowledge of AWS services and APIs, including EC2, S3, VPC, and IAM.
- Familiarity with alerting and monitoring tools and system management tools for Linux environments (including Datadog, Nginx, New Relic, Cloudflare, MySQL/PostgreSQL, Apache, iptables, and the ELK stack).
What's on Offer
Excellent progression opportunity & company culture.
Great overall salary and work-life balance.