Site Reliability Engineer - US
Posted 7 days 17 hours ago by Valarian Technologies Limited
Valarian Technologies is a dual-use technology company building critical tools to safeguard the future in an era of evolving global security challenges. We're rethinking security beyond traditional military domains, addressing asymmetric threats that impact our technological advantage, economic strength, and democratic institutions.
We build Acra - the platform foundation for everything we do as a dual-use technology company. The platform's name, rooted in the Greek word for citadel (or fortress), reflects the design and purpose of our infrastructure-agnostic secure enclaves: protecting critical data. Government and commercial workflows include: increased operational resiliency for mission-critical systems and functions; enabling organizations to adopt emerging technologies more quickly and widely while ensuring the integrity of their intellectual property; information flow during disaster response scenarios; and zero-trust / least-privilege environments for M&A, attorney-client privileged communications, etc. And we've only scratched the surface.
At our core, we're driven by a shared mission and a belief in making a tangible impact on our world. Whether you join our London HQ or the wider global organisation, you'll be a part of collaborative, high-performing teams, creating cutting-edge software, platforms, and infrastructure.
The Role
Join us as a Site Reliability Engineer and help us build the future of data sovereignty! We're seeking an SRE passionate about creating high-performance, scalable, and reliable services for our production infrastructure. You'll have a direct impact, improving existing systems and developing innovative solutions to complex challenges.
Our small, collaborative engineering teams own the full lifecycle of their services, from development to production operations. We champion automation and empower you to choose the best tools for the job. If you thrive in a fast-paced environment where you can make a real difference, we want to hear from you!
What You'll Do:
- Develop and implement a comprehensive observability strategy for self-hosted deployments, including infrastructure and tooling for monitoring, alerting, and troubleshooting. This will involve designing and implementing robust metrics and logging systems.
- Engineer the Acra platform for high availability and fault tolerance. This includes ensuring resilience against Cloud Availability Zone outages and the ability to gracefully handle node failures.
- Guarantee 99.9% uptime for the platform's control plane and deployment management. Design and implement a disaster recovery plan with active/passive deployments and seamless failover capabilities.
- Architect and implement a highly available deployment setup for applications within the Acra platform. This will involve designing and building the infrastructure and processes necessary for continuous operation.
- Create and maintain robust backup and recovery strategies for all Valarian products, ensuring data integrity and minimal downtime in the event of a failure.
- Integrate and manage an incident detection and paging solution to ensure rapid response to critical issues and minimize service disruptions.
- Scale the Acra platform and applications to support large concurrent user bases (25+ users) and sustained daily usage. This will involve performance tuning, capacity planning, and optimization of resource utilization.
- Collaborate closely with the product engineering team to influence the design and implementation of new products and features, ensuring they meet our reliability and scalability standards from the outset.
Preferred Qualifications
- Bachelor's degree (or foreign equivalent) in Computer Science or a related field is desired; relevant practical experience will also be considered
- Proficiency with programming languages like Go, Bash, Python
- 5+ years' experience with Linux system administration
- Experience with virtualization and orchestration technologies like Kubernetes and Docker Swarm
- Experience with system management tools like Terraform, Chef, Puppet or Ansible
- Experience with hybrid cloud environments spanning on-premise and multiple cloud providers is a plus
Salary & Benefits
- Competitive salary and equity grants
- Employer pension contributions:
  - UK roles include enhanced employer pension contributions
  - US roles include 401(k) retirement savings plan - traditional and Roth
- Platinum healthcare benefit:
  - For US roles, we offer comprehensive medical, dental and vision plans at little to no cost to you
  - For UK roles, Valarian will cover the full cost of the Private Medical Insurance (PMI) premium
- Basic Life / AD&D and long-term disability insurance 100% covered by Valarian
- Hybrid work arrangements are managed at team level
- Generous holiday calendar and PTO
- Relocation assistance (depending on role eligibility)
Valarian Technologies Limited is an equal opportunity employer and welcomes applications from individuals regardless of race, colour, religion, sex, sexual orientation, gender identity or expression, national origin, age, disability, genetic information, marital status, veteran status, amnesty, or any other legally protected characteristic.
We are committed to ensuring a fair and inclusive recruitment process and providing employment opportunities to all applicants. Decisions on recruitment, hiring, and employment are based solely on qualifications, skills, and experience relevant to the job requirements.