Senior Linux Infrastructure and Automation Engineer
NVIDIA
Posted: April 12, 2026
Quick Summary
We are seeking a highly skilled Senior Linux Systems Administrator to join our team and contribute to the development and maintenance of our internal cloud provisioning platform, driver verification environments, and automated build and test systems.
Job Description
NVIDIA is looking for an outstanding Linux Systems Administrator to join a leading network verification and infrastructure automation team. The team develops and maintains a wide range of infrastructure solutions, including an internal cloud provisioning platform on both VMs and bare-metal servers, driver verification environments, automated build and test systems, and more. You will have the opportunity to impact engineering teams by ensuring our NVIDIA/Mellanox networking hardware is provisioned, tested, and validated at scale. Your responsibilities will include using cutting-edge automation technologies and AI-based solutions to develop infrastructure capabilities spanning our internal cloud grid, provisioning pipelines, kernel and driver verification environments, and high-performance networking setups.
We are looking for a motivated teammate who isn't afraid to learn new technologies, tackle complicated debugging problems, work closely with internal R&D teams, and develop modern tools that continually improve our server fleet and automation infrastructure.
What you'll be doing:
• Join our infrastructure team and develop best-in-class automation solutions for bare-metal server provisioning and network driver verification.
• Build and maintain Ansible playbooks and roles for full server lifecycle management — from OS installation and kernel configuration to OFED driver setup and production readiness.
• Develop Python solutions for hardware introspection, REST API integration, inventory management, and resource allocation across the server fleet.
• Develop virtualization and system capabilities (KVM, QEMU, libvirt, Vagrant, Docker, and Kubernetes) across a variety of operating systems and hardware architectures (x86_64, aarch64, ppc64le).
• Build and maintain Jenkins CI/CD pipelines (Groovy/Jenkinsfile) that orchestrate the full provisioning workflow from BIOS configuration through Ansible provisioning to automated validation.
• Be a part of an experienced team with a great atmosphere.
• Collaborate with multiple cross-domain teams — verification engineers, hardware teams, and cloud engineers — to provide the best infrastructure solutions to our customers.
What we need to see:
• B.Sc. (or equivalent experience) in Computer Engineering, Computer Science, or a related technical field.
• 5+ years of experience in the field of Linux systems administration, infrastructure automation, or DevOps.
• Background in designing, implementing, and debugging automation software. Strong debugging and analytical skills.
• Experience in Python — scripting, REST API clients, subprocess management, and pip package management.
• Solid understanding of Linux — systemd, package management (dnf/yum, apt, zypper), kernel parameters, GRUB, sysctl tuning, NFS, and service management.
• Agility and multitasking.
• Strong collaboration and communication skills with peers and internal customers.
Ways to stand out from the crowd:
• Experience with Ansible (playbooks, roles, tags, idempotency) and infrastructure-as-code principles as well as background with Kubernetes, Vagrant (vagrant-libvirt), Docker, and KVM/QEMU/libvirt virtualization stacks.
• Familiarity with NVIDIA/Mellanox hardware — ConnectX NIC series, BlueField DPUs, MFT (Mellanox Firmware Tools), and RSHIM driver configuration.
• Hands-on experience with hardware management APIs such as Redfish (Dell iDRAC / HP iLO) and IPMI for automated BIOS and BMC configuration.
• Experience with performance tuning (hugepages, DPDK, NUMA, CPU pinning) for virtualization and high-performance networking workloads.