The revolution won’t be televised. It will be streamed. Want to help us build it?
Roku is changing how the world watches TV
Roku is the #1 TV streaming platform in the US, and we’ve set our sights on powering every television in the world. Roku pioneered streaming to the TV. Our mission is to be the TV streaming platform that connects the entire TV ecosystem. We connect consumers to the content they love, enable content publishers to build and monetize large audiences, and provide advertisers unique capabilities to engage consumers.
From your first day at Roku, you’ll make a valuable — and valued — contribution. We’re a fast-growing public company where no one is a bystander. We offer you the opportunity to delight millions of TV streamers worldwide while gaining meaningful experience across various disciplines.
About The Team
Our team’s mission is to build cutting-edge advertising technology that supports and grows a sustainable advertising business. The team owns server technologies, reporting infrastructure, data, cloud services, and test engineering for the advertising platforms within Roku. We also work closely with internal and external customers to help them achieve their advertising goals.
About The Role
We are looking for a skilled engineer with exceptional DevOps skills to join our team. Responsibilities include automating and scaling Big Data and Analytics tech stacks on cloud infrastructure, building CI/CD pipelines, setting up monitoring and alerting for production infrastructure, and keeping our tech stacks up to date.
What You’ll Be Doing
- Develop best practices around cloud infrastructure provisioning and disaster recovery, and guide developers on their adoption
- Collaborate on system architecture with developers for optimal scaling, resource utilization, fault tolerance, reliability, and availability
- Design, develop, and maintain efficient ways of tracking, predicting, and optimizing infrastructure cost in a multi-cloud setting
- Conduct low-level systems debugging, performance measurement, and optimization on large production clusters and low-latency services
- Create scripts and automation that can react quickly to infrastructure issues and take corrective actions
- Participate in architecture discussions, influence product roadmap, and take ownership and responsibility over new projects
- Collaborate and communicate with a geographically distributed team
We’re Excited If You Have
- Bachelor’s degree or equivalent
- 4+ years of experience in DevOps and/or Reliability Engineering
- Experience working with monitoring and alerting tools (such as Datadog and PagerDuty) and being part of on-call rotations
- Experience with system engineering around edge cases, failure modes, and disaster recovery (AWS background preferred)
- Strong background in Linux/Unix shell scripting (or equivalent programming skills in Python)
Nice To Have
- Experience scaling production systems running Big Data tools such as Spark, Hadoop, Apache Druid, and Looker
- Understanding of automation tools like Ansible, Terraform, and AWS OpsWorks
- Experience with Apache Airflow