Key Responsibilities:
Research Computing Administration
- Manage and optimize ILRI’s High-Performance Computing (HPC) environment, including configuration, user access, job scheduling, performance tuning, and resource allocation.
- Administer and support Linux/UNIX systems that host scientific computing workloads and research applications.
- Monitor performance, capacity, availability, and security of research computing platforms across ILRI’s Kenya and Ethiopia campuses.
- Collaborate with ICT Infrastructure teams on server hardware, virtualisation, and network requirements supporting HPC operations.
Research Repository and Platform Administration
- Administer ILRI’s research repository platforms (e.g., DSpace), ensuring availability, performance, backups, indexing, and secure access.
- Implement software upgrades, patches, and performance enhancements for the repository and related platforms.
- Support secure integration and data exchange between ILRI repository systems and CGIAR platforms using APIs or established protocols.
Systems Integration, Security, and Documentation
- Ensure research computing and repository platforms comply with institutional performance, availability, and security standards.
- Coordinate with ICT Infrastructure on network, firewall, and connectivity requirements for cross-campus and external system access.
- Develop and maintain operational documentation, standard operating procedures (SOPs), and user guides.
Research Support and Capacity Building
- Provide technical support, training, and guidance to researchers and ICT staff on HPC usage, Linux environments, and best practices for scientific computing.
- Promote efficient and responsible use of research computing resources through user education and capacity-building initiatives.
Continuous Improvement and Innovation
- Evaluate and recommend improvements to enhance system performance, scalability, automation, and cost-effectiveness.
- Explore emerging technologies that strengthen research computing, virtualisation, or repository platform capabilities.
- Contribute to ICT operational standards, system monitoring tools, and disaster recovery procedures for research computing environments.
Perform any other related duties as may be required.
Requirements
- Bachelor’s degree in Computer Science, Information Systems, Computer Engineering, or a related field.
- Professional certification in Linux systems administration, virtualisation, or HPC environments (e.g., RHCE, CompTIA Linux+, or equivalent).
- About seven (7) years of relevant experience in Linux/UNIX systems administration, HPC environments, or research computing support.
- Proven experience managing multi-user Linux environments, job schedulers (e.g., Slurm, PBS, Grid Engine), and data-intensive workloads.
- Hands-on experience with repository or web application administration, preferably DSpace, Tomcat, PostgreSQL, MySQL, or equivalent systems.
- Strong scripting skills (Bash, Python, or Perl) for automation and operational support.
- Solid understanding of networking, system security, backup, and recovery strategies.
- Strong analytical, troubleshooting, and communication skills.
- Familiarity with FAIRER or open science frameworks is an added advantage.
Salary: Discuss During Interview
Education: Diploma
Employment Type: Full Time