Research Computing System Engineer

Location: Notre Dame, Indiana
Job Type: Full-time
Deadline: 08 Feb 2024

Position Information


Job Title: Research Computing System Engineer
Job Description
The Center for Research Computing (CRC) at the University of Notre Dame is an innovative, multidisciplinary research environment that supports collaboration and facilitates discoveries through advanced computation, software engineering, artificial intelligence, and other digital research tools. The Center enhances the University’s innovative applications of cyberinfrastructure, provides support for interdisciplinary research and education, and conducts computational research.
This position focuses on supporting complex research computing environments under the direction of the CRC. To support such environments, the Research Computing System Engineer designs, builds, and maintains large-scale computing and storage infrastructure, which requires in-depth expertise with Linux, cluster-based networking, grid-enabled middleware, common scientific applications, multi-platform network and system management tools, and distributed/parallel file systems.
In particular, this position requires comprehensive knowledge of storage principles and practices. The candidate will have experience or familiarity with traditional multi-user distributed POSIX-based file systems, common storage network protocols (NFS, SMB), established backup solutions, and similar technologies. Proficiency with proprietary protocols or closed vendor solutions is welcome. Knowledge of acceptable storage policies, especially with respect to security and shared usage, is assumed. Because the position must meet export control compliance requirements, US citizenship is required.
This role will also provide opportunities to take on other system engineering responsibilities in both intra- and inter-departmental CRC projects in order to produce a world-class research computing environment.
The CRC is part of the Notre Dame Research (NDR) division. NDR is committed to creating a community that fosters equity of experience and opportunity and ensures that members of all backgrounds feel safe, welcome, and included. We strive to achieve a culture of openness, autonomy, and belonging, making Notre Dame an exceptional place for our team, partners, and collaborators to flourish.
Essential Duties and Responsibilities:
  • Write and maintain documentation on using the research computing environment. Assist in the preparation and delivery of user short courses on using the storage environment. Train and guide junior-level team members.
  • Accountable for the installation, maintenance, upgrading, and troubleshooting of hardware and software on platforms supported in the research computing environment. This currently includes, but is not limited to, hardware from Lenovo, Dell, HP, VAST, NetApp, and Panasas. Also includes confirming optimal operation of the products currently installed.
  • Research current and future trends in both hardware and software. Keep current with trade publications, vendor documentation, and newly released books. Attend research computing conferences and/or trade shows such as Supercomputing.
  • Accountable for the system administration and integration of research computing environments. This includes building and maintaining storage clusters at the hardware, operating system, and end-user levels. Also includes maintaining and monitoring the high-performance parallel file systems in coordination with the CRC and external IT Operations/Engineering groups, and enforcing policies, especially security and usage policies, pertaining to the use of the environment.
  • Accountable for the installation, maintenance, and troubleshooting of grid-enabled middleware. This includes interfacing CRC’s assets with common authentication/authorization frameworks and with grid middleware such as HTCondor, Open Science Grid, and Globus.
Minimum Qualifications
Experience: 5 to 6 years
Degree: Bachelor’s degree plus five years of experience, or Master’s degree with two years of experience, or Ph.D. in a related field.
Skills:
Knowledge of and experience with one or more of the following:
  • common software development languages and tools
  • distributed file systems
  • standalone NFS servers
  • storage arrays
  • NAS devices
  • storage management through GUI and/or command-line tools
  • software debugging
  • API scripting
  • storage optimization in an HPC environment
  • backup strategies
  • data-sharing technologies
  • basic networking concepts
  • willingness and ability to develop skills in other areas
Preferred Qualifications
  • Good understanding of HPC/HTC environments including job schedulers, optimizing storage access patterns for single and multi-node computing, knowledge of complex networked environments and how CRC systems might be used to interact with research computing environments at other universities and national supercomputing centers.
  • Experience with other well-known parallel/distributed file systems such as Ceph, Lustre, or GPFS.
  • Excellent oral and written communication skills.
  • The ability to pick up and learn general concepts and technologies quickly and independently.
Special Instructions to Applicants
Department: Center for Research Computing (29055)
Department Website: https://crc.nd.edu/
Family / Sub-Family: IT / HPC
Career Stream/Level: EIC 2 Professional
Department Hiring Pay Range: $56,368 - $96,636
Pay ID: Semi-Monthly
FLSA Status: S1 - FT Exempt
Job Category: Information Technology
Job Type: Full-time
Schedule (Days of Week & Hours): Monday – Friday (8:00 AM – 5:00 PM)
Schedule (Hours/Week): 40
Schedule (# of months): 12


Posting Detail Information


Job Posting Date (Campus): 01/18/2024
Job Posting Date (Public): 01/18/2024
Job Closing Date: 02/08/2024
Posting Type: Open To All Applicants
Posting Number: S24569
Quick Link for Internal Postings: https://jobs.nd.edu/postings/33187

