HPC Operations Engineer
Compensation
Not disclosed
Jump Trading is committed to world-class research. We empower exceptional talents in Mathematics, Physics, and Computer Science to seek scientific boundaries, push through them, and apply cutting-edge research to global financial markets. Our culture is unique. Constant innovation requires fearlessness, creativity, intellectual honesty, and a relentless competitive streak. We believe in winning together and unlocking unique individual talent by incenting collaboration and mutual respect. At Jump, research outcomes drive more than superior risk-adjusted returns. We design, develop, and deploy technologies that change our world, fund start-ups across industries, and partner with leading global research organizations and universities to solve problems.
We are looking for an adaptable, hands-on individual who is passionate about the details and nuances of managing Linux HPC environments at scale and eager to tackle complex, unpredictable operational work as their primary job function.
What You'll Do:
Provide front-line operational support for 24/7 Linux HPC compute, storage, and interconnects. The environment includes RDMA fabrics, parallel filesystems, HPC batch schedulers, FUSE filesystems, internal Jump software, and multi-vendor hardware, and it comes with cybersecurity requirements, a challenging and unpredictable client workload, and high user expectations.
Solve problem reports and questions posed by members of Jump's research community, escalating as needed and managing the entire problem lifecycle.
Respond to alerts in a timely fashion.
Participate in large, coordinated maintenance operations, including during evenings and weekends.
Work on global projects across a wide range of infrastructure.
Write code for triaging, diagnosing, and resolving difficult problems and for automating frequently performed tasks.
Collaborate with team members and across teams to write code and test infrastructure spanning both new and existing codebases in multiple programming languages.
Manage r