Chief HPC Engineer Job at EPAM Systems, Inc., Remote

  • EPAM Systems, Inc.
  • Remote

Job Description

Chief HPC Engineer Description

We are currently seeking an experienced Chief HPC Engineer to manage the daily operations and engineering activities within our HPC environment.

The ideal candidate has a strong engineering background and substantial expertise in setting up and enhancing HPC infrastructure. The role involves working with our L3 HPC infrastructure engineering team to help our scientific research team use an HPC cluster. Priority will be given to candidates residing in India, though the position is open to candidates from any location.


Responsibilities

  • Maintenance and support of the HPC infrastructure
  • Implementation of infrastructure automation through IaC (Infrastructure as Code)
  • Participation in software and hardware upgrades while resolving incidents
  • Management of job scheduling and resource distribution with HPC job schedulers
  • Configuration and installation of Bright Cluster Manager
  • Optimization and maintenance of GPFS/Lustre file systems
  • Supervision of InfiniBand/OmniPath network interconnect configurations

Requirements

  • 10+ years of experience as a general technical expert in HPC
  • Background in engineering or HPC system development
  • Experience in configuring and supporting HPC infrastructure
  • Proficiency in Linux (any RPM-based distribution), including kernel module compilation and debugging with tools and techniques such as strace, core dump analysis, and tcpdump
  • Skills in managing HPC job schedulers including IBM LSF and Slurm
  • Competency in configuring and installing Bright Cluster Manager
  • Familiarity with GPFS and Lustre file systems
  • Understanding of InfiniBand and OmniPath network interconnect technologies

Nice to have

  • Understanding of hardware diagnostics, upgrades, and tuning, including InfiniBand HCAs and disk arrays from Lustre, Vast, and IBM
  • Skills in infrastructure monitoring using Zabbix, Splunk, or Grafana
  • Familiarity with EasyBuild
  • Experience in a GxP environment
  • Capability to use Jira and ServiceNow

Job Tags

Remote job
