Site Reliability Engineer II
Overview

Are you an individual who loves to work on large-scale projects at one of the most exciting and diverse divisions within Microsoft? Are you looking for big, creative challenges that show immediate results since your customers are the product engineers for Office and M365? Do you want to be at the core of it all, acting as a force multiplier enabling groups of engineers to do their best work? If so, we have the perfect job for you!

The ES365 (Engineering Systems 365) team owns the tools that make up the end-to-end developer experience in Office and M365 (Substrate), from source control and check-in experience to build, validation, and deployment automation, and we’re making big, bold changes – for the better! We’re making it easy to build and ship apps across platforms and endpoints, and we’re moving away from proprietary, internal-only tools onto “one Microsoft” investments, open source, and industry-standard tools. This is an exciting time as we seek to re-invent productivity, leveraging the power of AI universally.

We are looking for a Site Reliability Engineer II (SRE) to join ES365’s Infrastructure teams. The charters of these teams include the following (and more):

- Azure management & governance
- Business continuity
- Infrastructure as Code
- Network engineering
- Provisioning & service deployment
- Security & vulnerability management
- Systems State Management

As one of our new SREs, you’ll get to deliver novel solutions using a modern DevOps approach, leveraging the full stack of technologies Microsoft has to offer to enable our organization to respond more effectively to evolving customer needs and market demands, all while reducing costs, eliminating duplicated work, and driving efficiencies through automation.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals.
Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Qualifications

Required Qualifications

- 4+ years technical experience in software engineering, network engineering, or systems administration
- OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 1+ year(s) technical experience in software engineering, network engineering, or systems administration
- OR Master's Degree in Computer Science, Information Technology, or related field.

Other Requirements

Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.

Preferred Qualifications

- 5+ years technical experience in software engineering, network engineering, or systems administration
- OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration
- Full-stack troubleshooting skills across network, application, hardware, management fabric, and distributed services layers.
- Experience documenting complex systems accurately to convey technical ideas across teams.
- Experience in implementing and managing Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for production services.
- Experience with one or more automation tools or frameworks (e.g., Terraform, ARM, Chef, Bicep) and scripting languages (e.g., Python, Bash, PowerShell) or similar.

Site Reliability Engineering IC3 - The typical base pay range for this role across Canada is CAD $83,600 - CAD $159,600 per year.

Find additional pay information here:

Microsoft will accept applications and process offers for
these roles on an ongoing basis.

Responsibilities

- You participate in onboarding, code/design reviews, and regular meetings with the engineering teams that develop and manage those products.
- You independently develop code or scripts that automate the performance of repetitive and easily scalable operations processes.
- You design, develop, and maintain telemetry pipelines and monitoring tools that detail operations metrics.
- You develop, test, troubleshoot, and implement changes to optimize code and improve products.
- You respond to incidents during regular on-call rotations.
- You will author technical documentation for your tools and services.
- You will participate in post-incident reviews to drive service improvements.
- Embody our culture and values.

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.

- Industry-leading healthcare
- Educational resources
- Discounts on products and services
- Savings and investments
- Maternity and paternity leave
- Generous time away
- Giving programs
- Opportunities to network and connect