SRE Infrastructure Cloud Engineer
We are seeking a talented and experienced SRE Infrastructure Cloud Engineer to join our team. This role offers the opportunity to transform our infrastructure while working in a dynamic, fast-paced environment, developing and maintaining performant, scalable systems. As a key contributor to the new Platform SRE team, you will have the opportunity to make a significant impact as we build and automate our infrastructure in the cloud.
Responsibilities:
- Apply an automation-first, Infrastructure as Code (IaC) mindset.
- Design, develop, and maintain automation and orchestration tools.
- Build and improve tools for users to understand and analyze the health and operations of large-scale data-intensive systems.
- Monitor the performance of our infrastructure and develop automated solutions to address any issues.
- Provide self-service tools for other teams to troubleshoot and resolve performance issues.
- Collaborate with other teams to design and implement tools that help automate end-to-end processes within the infrastructure.
- Analyze and understand the experience of all customers of the platform.
- Assess technical risks and develop mitigation strategies.
- Write new specs and documentation, and participate in the development of technical procedures and user support guides.
- Collaborate effectively with internal development, SRE, operations, and networking teams to define detailed requirements and communicate clear expectations.
- Track infrastructure delivery and dependencies through to implementation.
- Communicate implementation issues, delays, and mitigation plans.
- Innovate to improve future processes and deployments.
Qualifications:
- Bachelor's degree in Engineering, Computer Science, Information Technology, or related field is strongly preferred.
- Hands-on experience deploying and operating applications using IaaS and PaaS on major cloud providers such as Amazon Web Services (AWS), Microsoft Azure, or Google Cloud.
- Proficient in Python, Bash, and Go. TypeScript and/or Rust are a plus.
- Experience with Infrastructure as Code (IaC) using tools such as Terraform, CloudFormation, and Chef.
- Knowledge of SRE and security best practices, with previous experience implementing them into workflows.
- Proven experience with automation, CI/CD, orchestration, and configuration management.
- Familiarity with logging and observability platforms such as OpenTelemetry, Prometheus, and AppDynamics, New Relic, or Dynatrace.
- Strong understanding of security and compliance frameworks.
- Excellent written and verbal communication skills, with the ability to convey technical concepts to both technical and non-technical audiences.
- Strong problem-solving and troubleshooting abilities.
- Ability to work on multiple concurrent projects in an agile environment.
- Team-oriented individual with strong self-motivation and the ability to work with minimal supervision.
- Passionate about continuous improvement and staying up to date with the latest technologies and trends.
Applications are processed via the employer's online application form.