Site Reliability Engineer @ Zefr
What we do:
Zefr is the leading global technology company enabling responsible marketing in walled garden social environments. Zefr’s solutions empower brands to manage their content adjacency on scaled platforms such as YouTube, Meta, TikTok, and Snap, in accordance with industry standard frameworks. Through its patented AI technology, Zefr offers brands and agencies more accurate and transparent solutions for social walled gardens. The company is headquartered in Los Angeles, California, with additional locations across the globe.

What you’ll do:
As a Site Reliability Engineer at Zefr, you’ll apply your expertise in cloud infrastructure, CI/CD, observability, and core SRE concepts to deliver high-quality, reliable, and scalable solutions. A significant aspect of this role involves working closely with Zefr’s Engineering and Data Science teams to ensure that the infrastructure required for our services is robust, efficient, and scalable.

We’re looking for someone who combines technical expertise with strong leadership and a passion for continuous improvement and innovation. By ensuring the continuous health and efficiency of our infrastructure, you will directly contribute to Zefr’s commitment to providing a consistently high-quality user experience.
This is a role where we expect to learn from you and for you to learn from us!

- Support and build systems and tools that enable other engineers to generate, deploy, and manage product features.
- Deploy and support a multi-cloud microservice architecture deployed via GitHub Actions, ArgoCD, and Kubernetes.
- Collaborate with other engineers to architect secure, resilient, scalable, and cost-efficient applications, systems, and pipelines in AWS and GCP.
- Foster and advance our DevOps culture and philosophy by encouraging continuous improvement across all engineering teams.
- Proactively maintain the health of production environments, including monitoring application performance and resource utilization.
- Participate in a 24/7 on-call rotation and respond to system performance issues and outages.
- Debug code at the application and infrastructure level.
- Mature our CI/CD workflows and release process.
- Maintain a forward-thinking approach, actively researching and proposing new solutions.
- Propose and…