Staff Site Reliability Engineer
Job Description
We’re looking for a Staff Site Reliability Engineer to join Procore’s Project Execution Group. In this role, you’ll lead projects, collaborate with peers, partner with internal customers, and develop solutions that maintain the health of the core platform. The goal is to ensure that the chosen design and architecture are highly available, performant, and reliable: this team directly serves Procore’s internal customers, and its decisions directly affect the external customer experience.
As a Staff Site Reliability Engineer on our Reliability Engineering team, you’ll help champion solutions to systemic issues affecting every team at Procore. Leveraging your software and systems architecture expertise, you’ll conduct consultative engagements with our service authors that improve our software’s reliability. If you have a passion for solving the complex problems unique to running large, highly scalable, resilient systems with modern technologies, we would love for you to join us!
This position reports to an Engineering Manager and will be based in our Austin office. We’re looking for someone to join us immediately.
What you’ll do:
Lead projects within a small team of Reliability Engineers to continually improve the reliability of Procore’s services through engineering and process improvement
Collaborate with your peers to envision, design, and develop solutions in your respective area with a bias toward reusability, toil reduction, and resiliency
Surface opportunities across the broader organization for solving systemic issues
Use a collaborative approach to make technical decisions that align with Procore’s architectural vision
Partner with internal customers, peers, and leadership in planning, prioritization, and roadmap development
Develop teammates by conducting code reviews, providing mentorship, pairing, and training opportunities
Serve as a subject matter expert on tools, processes, and procedures and help guide others to create and maintain a healthy codebase
Facilitate an “open source” mindset and culture both across teams internally and outside of Procore through active participation in and contributions to the greater community
What we’re looking for:
Container orchestration with Kubernetes (K8s), preferably EKS
ArgoCD
Terraform or similar IaC
Observability (o11y)
Public cloud (AWS, GCP, Azure)
Cloud automation tooling (e.g., CloudFormation, Terraform, Ansible)
Linux Systems
Experience with the following:
Continuous Integration Tooling (e.g., CircleCI, Jenkins, Travis CI)
Continuous Deployment Tooling (e.g., ArgoCD, Spinnaker)
Service Mesh / Discovery Tooling (e.g., Consul, Envoy, Istio, Linkerd)
Contributions to open-source projects
Networking (WAF, Cloudflare)