NPU MI (Management Interface) Software Engineer
FuriosaAI
Posted: December 12, 2025
Job Description
Responsibilities
• Design and develop the NPU Management Interface (MI) firmware/software enabling communication between Host/BMC and NPU devices
• Implement and maintain MCTP, PLDM, and custom MI command handling for out-of-band NPU management, monitoring, and control
• Develop device-management features over SMBus/I²C, I3C, PCIe VDM, or custom sideband channels
• Integrate MI functionality into the NPU firmware, including:
  • Health and error reporting
  • Thermal and power telemetry
  • Runtime status, utilization metrics, and debug information
• Ensure compliance with industry specifications by performing spec-driven design, implementation, and validation
• Support bring-up, interoperability testing, rack-scale platform integration, and system-level validation
• Develop test strategies and validation tools based on MCTP and PLDM specifications
• Perform protocol compliance testing, regression testing, and interoperability verification
Requirements
• Strong proficiency in embedded C or C++
• Experience with firmware development for NPU/accelerator, GPU, or SoC
• Understanding of management protocols, including MCTP (over I²C/SMBus, I3C, or PCIe VDM) and PLDM
• Experience with low-level interfaces: SMBus/I²C, I3C, SPI, PCIe
• Ability to interpret complex protocol specifications and convert them into robust implementations
• Familiarity with device telemetry, sensor frameworks, watchdog/reset flows, and health monitoring
• Experience with system-level debugging using logic/protocol analyzers and low-level debug tools
• Knowledge of embedded systems, bare-metal or RTOS environments, and firmware lifecycle flows
Preferred Qualifications
• Experience with BMC firmware stacks and management interfaces such as OpenBMC, Redfish, and IPMI, and with PLDM device-model implementations
• Background in spec creation, requirement definition, or standards compliance validation
• Experience defining FRU data, power/thermal management policies, and diagnostics frameworks
• Familiarity with secure provisioning, firmware update mechanisms, and lifecycle state management
• Experience with large-scale datacenter or HPC system integration (rack-level management, telemetry aggregation)
• Contributions to accelerator firmware, MCTP/PLDM implementations, or open-source system firmware projects