By: Hiring Kenya
Key Responsibilities
Operations Management
Supervise a team of shift engineers staffing NOC/TOC operations 24 hours a day, 7 days a week, ensuring continuous service monitoring and fault resolution.
Develop, update as needed, and enforce standard operating procedures (SOPs) for incident handling, escalation, and reporting.
Ensure prompt detection, diagnosis, and resolution of all network, power, and site alarms.
Coordinate with Managed Service Provider (MSP) teams for timely fault rectification.
Manage shift rosters and handovers, and ensure adequate coverage during all hours.
Work with the operations team to coordinate, review, and track all site access requests.
Manage relationships with MSPs, including enforcing SLAs with external partners.
Network & System Monitoring
Oversee the monitoring of network, site power systems (grid, generator, solar, hybrid), and environmental parameters.
Ensure all monitoring tools, dashboards, and alerting systems are functional and up to date, and escalate issues to key stakeholders and providers.
Track key performance indicators such as uptime and SLA compliance.
Identify recurring faults and coordinate root cause analysis (RCA) and preventive actions in collaboration with operations and the site MSP.
Incident & Problem Management
Classify and prioritize incidents based on severity and business impact.
Ensure proper escalation to Level 2/3 support or vendor teams as per SLA.
Review and approve incident and outage reports.
Lead or direct major incident bridge calls and coordinate technical response until restoration.
Drive post-incident reviews and continuous improvement initiatives.
Implement proactive monitoring strategies to identify potential issues before they impact services.
Utilize automation tools for rapid incident detection and preliminary diagnosis.
Coordinate major incidents using structured communication and action protocols.
Ensure compliance with disaster recovery and business continuity procedures.
Team Leadership
Supervise and mentor NOC/TOC engineers and technicians.
Conduct performance appraisals and identify skill development needs.
Foster a culture of accountability, collaboration, and technical excellence.
Organize regular training on tools, troubleshooting, and communication.
Reporting & Analytics
Produce daily, weekly, and monthly operational reports (uptime, outages, MTTR, etc.).
Qualifications & Experience