Nexxiot is a TradeTech leader with hardware-enabled data solutions and a Vision to Reduce Uncertainty in Cargo. Nexxiot operates the most significant global digital fleet of around 300’000 rail cars and 800’000 intermodal containers in 2023 and follows an ambitious growth plan to quadruple the number of digitized assets by 2027. Nexxiot empowers carriers, railroads, and shippers to monitor the location, status, and condition of their assets and cargo in real time, provides forensic analysis of what has happened in the past, and enables predictive, actionable insights. Sophisticated big data and AI-based analytics deliver business intelligence at scale to drive efficiency and process automation and to achieve sustainability targets.

Headquartered in Zurich, Nexxiot employs people from more than 30 nations and has offices in Germany, Sweden, the UAE, and North America. Nexxiot provides specialized solutions to the Rail, Tank container, and Intermodal/Ocean segments for assets travelling in more than 160 countries. For more information, visit .

As a Site Reliability Engineer (SRE) at Nexxiot, you are part of an interdisciplinary agile SRE team responsible for implementing highly available, scalable, compliant and secure cloud infrastructure according to the requirements and priorities provided by the principal site reliability engineer. Working closely with the rest of the team, your goal is to design, implement and test cloud infrastructure solutions and to operate and maintain the resulting software and cloud infrastructure services according to our Site Reliability Engineering (SRE) practices. As an SRE, you are pragmatic: you choose the right tools for the job and are never afraid of reading up on and learning a new programming language or tool to the degree needed to get the job done, and you keep improving existing toolchains and processes.

Your main areas of accountability
- Enable DevOps teams to deliver secure, compliant and resilient software services with short time to market.
- Collaborate with Product Owners (PO) and Software Architects (SA) from product teams.
- Deploy infrastructure components and services to different environments (development, testing and production) using continuous deployment practices according to the principles of Site Reliability Engineering.
- Infrastructure strategy (evaluation of external services).
- Conduct regular architecture reviews and (re)evaluations, and participate in surveillance and compliance activities.
- Provide consultancy services for macro and security architecture to DevOps teams.
- Identity and Access Management (IAM) for cloud infrastructure.
- Onboarding and offboarding of staff members in cooperation with the internal IT department.
- Act as onboarding buddy for new SRE team members.
- Participate in agile team activities (e.g. stand-ups, planning meetings, demos, retrospectives, …).
- Participate in the team’s on-call rotation during office hours to provide 3rd level support and to ensure system and service availability.
- Actively participate in discussions, peer collaboration and solution reviews.
- Develop and operate the Kubernetes platform.
- Develop and operate the GitLab CI/CD platform.
- Operate persistent database systems and storage layers for primary data stores.
- Define, develop, test and practice disaster recovery procedures.
- Develop, maintain and improve monitoring (metrics, logs, ...) for our infrastructure and platform services.
- Develop and operate VPN services to provide secure and reliable connections to our infrastructure.
- Security Information and Event Management (SIEM) for cloud infrastructure.
- Participate in internal and external security audits and review activities (e.g. pentests).

Desired qualifications
- Experience in writing and operating infrastructure as code with a focus on reliable day-two operations.
- Experience with at least one of the following infrastructure management tools: Pulumi, Terraform, AWS CDK, AWS CloudFormation or Ansible.
- Experience in writing software in at least one of the following programming languages: TypeScript, Rust, Go or Python.
- Linux/Unix shell know-how is a great plus.
- Familiarity with public cloud infrastructure concepts; experience with Amazon Web Services (AWS) is a big plus.
- Acquainted with the Git version management system (GitLab) and CI/CD best practices (GitLab pipelines).
- Experience in writing and operating highly available and scalable containerized software services on top of Kubernetes, AWS ECS, AWS Fargate or Docker is a great plus.
- Experience in operating Kubernetes clusters is a great plus.
- Experience with monitoring systems such as Grafana, Prometheus, Datadog or CloudWatch is a great plus.
- Fluent in spoken and written English. German is a plus, but not mandatory.

What we offer:
- Flexibility: We offer a 4-day (80%) or 5-day (100%) working week.
- Remote: We offer unlimited working from home; however, please bear in mind that some roles require a certain degree of office presence.
- Work Life Balance: We offer 25 days’ paid leave.
- Sustenance: Coffee, tea, water, fruit and nuts, snacks and fridges usually stocked with delicious smoothies, all free of charge, in office locations.

Additional information
We invite you to apply for this position regardless of your age, gender, nationality, religion, orientation, ethnicity or whatever you might think holds you back, as we believe in and celebrate diversity. Only by accepting everyone as they are can we be at our best! Our Talent Acquisition team is looking forward to receiving your CV via our career portal or via the button below. We are very much looking forward to receiving your application via our career portal (in PDF, max. 3 pages) so we can get to know you a bit better.

Our Values
- Contribute Actively
- Be Transparent - do not BS
- Promote Mutual Respect
- Keep Cool and Have Fun
- Fail Forward
- Think and Act as an Entrepreneur

Sounds like a match? We look forward to reviewing your application (CV with max. 3 pages in PDF)! Please only send CVs in English. We will not consider CVs from agencies.

Site Reliability Engineer (SRE)
