Site Reliability Engineer

Remote, USA | Full-time | Posted 2025-07-27

Capchase is the #1 platform for vendor financing in tech. We help software and hardware vendors offer flexible installment payments as part of the sales process, improving conversion rates and cashflow. We provide an awesome buyer experience.

Capchase was founded in 2020 and is headquartered in NYC. We’ve provided over $2.5B in funding to thousands of companies and operate in the US and Europe. We are backed by QED (Nubank, Klarna), 01 Advisors (Tipalti, MasterClass), Bling Capital (Airtable, GitLab, Lyft, Square), SciFi (Stripe, Brex), Caffeinated Capital (Opendoor, Airtable), Thomvest, Invesco and many other leading investors.

Some of our achievements:

- Supporting thousands of software companies and software buyers
- 80 Capchasers representing 15+ nationalities
- Active in 8 markets
- Top Decile Growth
- Ranked #1 across B2B BNPL

In December 2024 we reached the top of the Installment Payment, BNPL category on G2, #1 in B2B.

Why work with us?

Help accelerate an industry!

At Capchase, we are transforming how software and tech-enabled hardware equipment get financed, and we move and innovate fast. We’re always looking for the brightest minds to join us. We’re a diverse team of 15+ nationalities with a shared passion for helping innovative companies thrive. Join the climb with us!

As a foundational member of our growing engineering team, you'll play a pivotal role in shaping the culture, processes, and infrastructure that will support our company as we scale 50x. This is an opportunity to work in a fast-paced environment, collaborate with talented engineers, and directly contribute to building a resilient, high-performance product.

You’ll be responsible for ensuring the availability, latency, performance, efficiency, scalability, and reliability of our systems—while helping define the long-term vision and roadmap for Site Reliability Engineering at our company.

What will you do?

Infrastructure & Scalability
- Design and evolve our systems architecture to scale 50x.
- Lead infrastructure and team scalability initiatives.
- Partner with Tech leadership to drive strategy around production-critical systems.
Reliability & Performance
- Own service levels (SLIs, SLOs, and SLAs), helping teams define and uphold them.
- Conduct capacity planning and cost optimization.
- Standardize service levels and observability practices across the organization.
Monitoring, Observability & Alerting
- Define requirements and best practices for monitoring, alerting, and logging.
- Design and implement tools to gain insight into trends, detect anomalies, and compare system behavior.
- Build visualizations to surface system health and performance.
CI/CD & Developer Velocity
- Improve and maintain our CI/CD pipelines and development environments.
- Eliminate toil and increase automation across the engineering lifecycle.
- Accelerate development by enhancing testing, staging, and deployment processes.
Incident Management & Disaster Recovery
- Lead the on-call rotation and incident response efforts.
- Serve as the first responder during production incidents—owning detection, escalation, and resolution.
- Drive postmortems and ensure actionable insights are implemented.
Security & Compliance
- Collaborate on in-house practices or third-party partnerships to improve security posture.
- Ensure that security and compliance support our scalability goals and customer trust.
Team & Culture Building
- Help define the roadmap and future scope of the SRE function.
- Participate in hiring and mentoring to grow a world-class reliability team.
- Foster a culture of ownership, collaboration, and continuous improvement.

What are we looking for?

- Bachelor’s degree in Computer Science, a related field, or equivalent practical experience.
- Proficiency in one or more programming languages such as C++, Elixir, JavaScript, Python, or Go.
- Solid understanding of algorithms and data structures.
- Deep expertise in designing, analyzing, and troubleshooting distributed systems.
- Hands-on experience with Kubernetes, Terraform, and Google Cloud Platform (GCP).
- Strong debugging and code optimization skills, with a passion for automation.
- Systematic problem-solving approach, effective communication skills, and a drive for operational excellence.

This role is ideal for engineers passionate about building high-performing systems and scaling infrastructure—while collaborating cross-functionally to shape the future of engineering reliability.

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, colour, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

