Senior Site Reliability Engineer

Posted 5 days 6 hours ago by Elemica, Inc.

Permanent
Not Specified, United Kingdom
Job Description

Candidates, please note: this position is available only in the United Kingdom.

Interested in a career that bridges the gap between Supply Chain and Technology?

Elemica, an award-winning digital supply chain company in the SaaS community, is seeking an experienced Senior Site Reliability Engineer (SRE) to continue our drive toward modernizing the global supply chain. This is an opportunity to join a growing company of talented and committed individuals, unified in the common goal of exceeding our clients' expectations.

Our Values

  • Curiosity - We delight in the discovery of new challenges and feel compelled to solve them
  • Integrity - We are relatable and trustworthy; steadfast in our commitment to our colleagues, clients, and partners
  • Accountability - We show up and deliver measurable, meaningful business value. Consistently.
  • Passion - We have a shared enthusiasm for transforming our clients' supply chain

What's In It For You?

  • Competitive Compensation Package
  • Hybrid Work Locations & Flexible Work Schedule
  • Global EAP Program
  • Company Discounts
  • Generous Employee Referral Program
  • Bike Leasing/Buy-a-Bike/Cycle-to-Work Offerings
  • Benefits-in-kind/Wellness Stipends
  • Rewards & Recognition incl. Years of Service Awards
  • Quarterly Employee Engagement Events

Responsibilities & Objectives

Elemica's SRE, reporting to the Engineering Manager, will build, run, maintain, and scale our fault-tolerant distributed systems deployed globally.

As an SRE, your focus will be on enabling and ensuring efficient, scalable, and reliable production systems for one or more of our solutions. You will help drive and maintain Service Level Objectives (SLOs) in close coordination with application and platform engineering.

We thrive on frequent, open, and transparent communication in a blameless operating environment. We take pride in managing complex systems and frequent deployments through constant collaboration, building trust, shared ownership, and providing cross-functional value across various teams that deliver our solutions.

What You Will Do

  • Maintain, operate, and improve the security, scalability, and reliability of Elemica systems
  • Provide technical mentorship and guidance to SRE team members
  • Cultivate an environment of sustainable engineering and operational practices
  • Establish, maintain, and monitor SLOs in coordination with Application and Platform engineering
  • Bring services to life through collaboration with Application and Platform Engineering
  • Build and maintain tooling to rein in operational overhead as systems grow and become more complex

What You'll Need

  • Interest in technical leadership and mentoring while working as an individual contributor
  • Comfortable collaborating and solving problems in a fast-paced and complex environment
  • Willing to partner openly within the team even when the problem or solution is not well understood
  • Comfortable taking responsibility for delivering large and complex projects
  • Passionate about identifying and eliminating repetitive work through automation
  • Extensive knowledge of AWS EC2, ECS, Networking, SQS, CloudFormation, Lambda, RDS, Aurora, and IAM is preferred
  • Experience participating in on-call rotations for mission-critical systems
  • Experience designing, planning, and implementing operational projects for cloud-native systems
  • 5+ years of experience in:
    • AWS services
    • Application configuration management tools such as Chef or Ansible
    • Infrastructure as Code tools such as Terraform
    • Operational scripting using programming languages such as Bash, JavaScript, Python, Go
    • Production operations for cloud-native distributed systems
    • Evaluating and analyzing production issues
    • Unix operating systems

Education

Bachelor's degree in Computer Science, Information Systems, or a related field, or equivalent practical work experience.

Who We Are

  • A distributed and diverse team
  • Working closely with Application and Platform engineering to bring production context forward and help drive development of fault-tolerant and scalable systems
  • Committed to delivering quality and throughput via modern technology and incremental improvement
  • Battling entropy through collaboration, automation, and repeatable patterns

It is the responsibility of all Elemica employees to ensure the security, availability, processing integrity, confidentiality, and privacy of Elemica systems and data and the data of our customers. Using best practices in these areas, all Elemica employees will observe a 'security first' approach to their daily responsibilities. All employees are accountable for securing their work devices, work areas, and communications in the execution of their daily duties.