MisuJob - AI Job Search Platform

Jobs

Browse 250+ jobs updated daily

Latest Job Openings

Köln permanent
Python, Software Engineering, Design Patterns, Algorithms, Testing, Benchmarking, Validation, Compilers

Who we are: Roofline is building a deployment platform to run any model on disruptive hardware at the edge. We are looking for talented and ambitious engineers that are passionate about technology to ...

April 2, 2026
Archived Not Specified, Belgium Freelance
Senior Project Manager, Data Centre, ML Infrastructure, Data Centre Infrastructure, Liquid Cooling, Project Delivery, ML Project Lead, Live Environment, Fibre, Electrical Upgrades, Fibre Management

Lead the deployment of machine learning infrastructure across multiple locations, including liquid cooling, fibre, and electrical upgrades, in a live environment....

April 1, 2026
San Francisco, CA, United States permanent
ML Infrastructure, Distributed Training, LLM, Multimodal, CNN Architectures, Multi-provider AI Reliability, Latency Optimization, Cost Efficiency, Evaluation Infrastructure, Technical Direction

About Decagon Decagon is the leading conversational AI platform empowering every brand to deliver concierge customer experiences. Our technology enables industry-defining enterprises like Avis Budge...

March 26, 2026
Vancouver, British Columbia, Canada (Vancouver) Remote permanent
Machine Learning, Infrastructure, Docker, AWS, GCP, Monitoring, Model Deployment, Model Lifecycle, Grafana

Later is the world’s most intelligent influencer marketing company, built to give brands the confidence to create unforgettable campaigns. By combining real creator relationships, trusted intelligence...

March 9, 2026
Freiburg or Berlin permanent
Cloud, GPU, Distributed Training, Cost Optimization, Slurm, ML Infrastructure

Who we are Foundation models have transformed text and images, but structured data - the largest and most consequential data modality in the world - has remained untouched. Tables power every clinica...

March 22, 2026
Boston, Massachusetts, United States (Boston, Pittsburgh, Remote U.S. Only) Remote permanent
Kubernetes, Python, Go, Distributed Systems, Machine Learning, Data Processing, Model Training, High-Throughput Systems, Software Engineering, Cloud Platforms

Mission Summary: Our team builds the foundational infrastructure that empowers Machine Learning Engineers to develop the next generation of self-driving technology. We design and operate the high-per...

March 17, 2026
Pittsburgh, Pennsylvania, United States (Boston, Pittsburgh, Remote U.S. Only) Remote permanent
Software Engineering, Kubernetes, Python, Go, Cloud, Distributed Systems, Machine Learning, High-Throughput Systems, End-to-End Development, Ownership

Mission Summary: Our team builds the foundational infrastructure that empowers Machine Learning Engineers to develop the next generation of self-driving technology. We design and operate the high-per...

March 17, 2026
Remote, Ontario Remote permanent
Software Engineering, Go, Python, Data Structures, Algorithms, Software Design, Relational Databases, NoSQL Databases

Thumbtack helps millions of people confidently care for their homes. Thumbtack is the one app you need to take care of and improve your home — from personalized guidance to AI tools and a best-in-cla...

March 11, 2026
Remote, California, United States Remote permanent
Site Reliability Engineering, Kubernetes, AWS, Terraform, Infrastructure-as-Code, Slurm, GPU Workloads, Proprietary Platforms

Company Overview Deepgram is the leading platform underpinning the emerging trillion-dollar Voice AI economy, providing real-time APIs for speech-to-text (STT), text-to-speech (TTS), and building pro...

March 9, 2026
Archived San Francisco, CA, United States permanent
Staff Software Engineer, ML Infrastructure, Distributed Training, MLPerf Inference, Training Matrix, Intelligent Routing, Quantization, Batch Process Management, Infrastructure Domain, Speculative Decoding

About Decagon Decagon is the leading conversational AI platform empowering every brand to deliver concierge customer experiences. Our technology enables industry-defining enterprises like Avis Budge...

February 24, 2026
Archived Portugal Remote permanent
Research Focused System Engineer, Software Engineering, Data Processing, Machine Learning, Graphs, Cloud Native Services, Supervised Machine Learning, Fault Tolerance, Scientific Publications, Patents

Feedzai is the world’s first RiskOps platform for financial risk management, and the market leader in safeguarding global commerce with today’s most advanced cloud-based risk management platform, powe...

February 19, 2026
Archived Paris, Île-de-France, France Remote permanent
Linux, Distributed Systems, GPU Clusters, Container Orchestration, Monitoring, Logging, Alerting, Terraform, Kubernetes

About Pathway Pathway is shaking the foundations of artificial intelligence by introducing the world’s first post-transformer model that adapts and thinks just like humans. Pathway’s breakthrough ar...

December 19, 2025
Archived Palo Alto, CA Remote permanent
Computer Science Fundamentals, C++, Python, Go, Distributed Systems, ML Infrastructure, High Availability, Scalable Infrastructure, Scalable ML Pipelines, Team Collaboration

About AppLovin AppLovin makes technologies that help businesses of every size connect to their ideal customers. The company provides end-to-end software and AI solutions for businesses to reach, mone...

February 18, 2026
Archived Boston, MA (Boston) Remote permanent
AWS, Kubernetes, AWS infrastructure management, Microservices, Production Solutions, Software Architecture, Engineering Best Practices, Technical Guidance, Containerized Deployments

About SimpliSafe SimpliSafe is a leading innovator in the home security industry, dedicated to making every home a safe home. With a mission to provide accessible and comprehensive security solutions...

February 17, 2026
Archived Remote, California, United States Remote permanent
Site Reliability Engineering, Kubernetes, Terraform, AWS, Infrastructure-as-Code, Slurm

Company Overview Deepgram is the leading platform underpinning the emerging trillion-dollar Voice AI economy, providing real-time APIs for speech-to-text (STT), text-to-speech (TTS), and building pro...

February 17, 2026

Senior ML Infrastructure Engineer

Ellison Institute of Technology

Archived Oxford, England, United Kingdom Hybrid permanent
Cloud, GPU, Docker, Kubernetes, Terraform, High-performance Computing, Storage Systems, Observability, Security

At the Ellison Institute of Technology (EIT), we’re on a mission to translate scientific discovery into real world impact. We bring together visionary scientists, technologists, policy makers, and ent...

December 12, 2025
Archived Athens, Attica, Greece Hybrid permanent
Machine Learning, Data Engineering, Big Data, Spark, Python, PySpark, SQL, Linux, Docker, Jenkins, Airflow

Optasia is a fully enabled B2B2X financial technology platform covering scoring, financial decisioning, disbursement and collection. We are committed to enabling financial inclusion for all. We are ch...

September 1, 2025
Archived Athens, Attica, Greece Hybrid permanent
Machine Learning, Spark, PySpark, Scala, Jenkins, Airflow, Microservices, Data Analysis, Feature Engineering

Optasia is a fully enabled B2B2X financial technology platform covering scoring, financial decisioning, disbursement and collection. We are committed to enabling financial inclusion for all. We are ch...

September 1, 2025
Archived Mountain View, California, United States permanent
Software Engineering, Systems Fundamentals, Distributed Systems, Data Pipelines, Platform Training, GPU Optimization, Inference Optimization, Tooling Suite, Performance Analysis, Ownership Mindset

Join Us in Building the Future of Home Robotics At Sunday, we're developing personal robots to reclaim the hours lost to repetitive tasks. We're focused on an ambitious goal to make generalized robot...

February 11, 2026
Archived Munich, Germany (Munich (DE-MUC-ARP)) permanent
Python, C++, Go, Machine Learning, Kubernetes, Cloud services, Robotics, State-of-the-art models

Intrinsic is Alphabet’s bet aiming to reimagine the potential of industrial robotics. Our team believes that advances in AI, perception and simulation will redefine what’s possible for industrial robo...

February 6, 2026
Archived Palo Alto, CA permanent
Network Development, ML Infrastructure, High-Speed Interconnects, Design, Validation, Productization, Vendor Due Diligence, Onboarding, Bringing Up, Characterization, Rigorous Testing, LPO

About xAI xAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineeri...

February 5, 2026
Archived New York, NY Remote permanent
AWS, Python, Kafka, Flink, AWS Glue, dbt, Athena, DynamoDB, BigQuery, Machine Learning, Data Engineering, ML Infrastructure

The mission of The New York Times is to seek the truth and help people understand the world. That means independent journalism is at the heart of all we do as a company. It’s why we have a world-renow...

January 16, 2026
Archived United States (HQ) permanent
Full-Stack Development, React, FastAPI, REST APIs, gRPC, RAG, LangChain, Milvus, PGVector, Ollama, UI/UX Design, WebGL/Three.js

At Zone 5 Technologies, we're redefining what's possible in unmanned aircraft systems. Our team of engineers and innovators is developing cutting-edge autonomous solutions that push the boundaries of ...

January 29, 2026
Archived Santa Clara HQ permanent
Linux, Kubernetes, Ceph, Python, Bash, Infrastructure-as-Code, GitOps, RDMA, InfiniBand, GPU

About The Role We're looking for a Senior Site Reliability Engineer to help us run one of the most exciting GPU clusters around—our Toronto datacenter packed with NVIDIA H100 and A100 GPUs, over 20PB...

January 26, 2026
Archived San Francisco, California, United States Remote permanent
Cloud, Kubernetes, Docker, Python, Ray, Spark, S3, GCS, CI/CD, DevOps

Who Are We Industrial labor is incredibly dangerous work - almost 3 million people in the US per year are injured in the workplace for entirely preventable and at times, fatal or debilitating causes....

August 18, 2025
Archived Toronto permanent
Network Engineering, InfiniBand, Ethernet, High-speed Networking, RDMA, GPU Communication, Network Security, Firewalls, ACLs, VLANs, HPC Topologies, InfiniBand Fabrics

About The Role We're seeking an experienced Network Engineer to design, build, and optimize the high-performance networking infrastructure powering our AI/ML operations in Toronto. You'll work at the...

January 26, 2026
Archived Bay Area, California, United States Remote permanent
Software Engineering, Infrastructure, Distributed Systems, Stream Processing, Scalable Architecture, Low-Latency Systems, Fault Tolerance, Systems Design, Performance Tuning, Reliability

About LMArena LMArena is the open platform for evaluating how AI models perform in the real world. Created by researchers from UC Berkeley’s SkyLab, our mission is to measure and advance the frontier...

December 18, 2025
Archived United States Remote permanent
Production ML Infrastructure, Model Serving, Orchestration, Cost Optimization, Observability, Data Quality, ML Operations, Cross-functional Collaboration, Innovation & Operational Excellence, Mentorship

About Playlab Playlab is a tech non-profit dedicated to helping educators and students become critical consumers and creators of AI. We believe that an open-source, community-driven approach is key ...

November 24, 2025
Archived San Mateo, CA permanent
ML Infrastructure, Distributed Training, Inference, Pytorch, Pytorch Lightning, Pytorch Geometric, Ray, GPU Performance Engineering, MLOps, Cross-Cluster Deployment

About the Team We’re a tight-knit team of proven drug hunters, deep learning researchers, and software engineers united by a common mission — drive AI innovation in biochemistry, discovering and deve...

November 24, 2025
Archived San Francisco, California, United States Remote permanent
Cloud, Docker, Kubernetes, Terraform, PostgreSQL, Distributed Systems

Company Overview Echo Neurotechnologies is an exciting new startup in the Brain-Computer Interface (BCI) space, driving innovation through advanced hardware engineering and AI solutions. Our mission ...

January 29, 2026
Archived San Mateo, California, United States permanent
GPU Performance, Serving Stack, Parallelism, Quantization/PEFT, Systems, Observability, Autoscaling, A/B Testing, CUDA, TensorRT

Introducing Moonlake, AI for creating real-time interactive content Mission: Improve Throughput, Latency, & Cost - deploying our models 2–10× faster & cheaper without quality regressions. Scope of W...

December 12, 2025
Archived San Mateo, CA internship
Distributed Computing, Data Systems, API Development, System Optimization, System Performance, Testing & Debugging, Programming Languages (Python, Java, Go), Distributed Systems, Communication Skills

About the Job We are looking for a few interns to join us either part-time through the year or Full-time for the summer. The ideal candidate should have an interest and some experience in Distributed ...

November 27, 2024
Archived Remote, California, United States Remote permanent
Kubernetes, AWS, Terraform, Slurm, AI/ML Infrastructure, Job Scheduling, Infrastructure-as-Code, Scalability, Self-service Environment, GPU Orchestration

Company Overview Deepgram is the leading platform underpinning the emerging trillion-dollar Voice AI economy, providing real-time APIs for speech-to-text (STT), text-to-speech (TTS), and building pro...

December 23, 2025
Archived San Francisco, CA, United States permanent
Python, Postgres, FastAPI, SQLAlchemy, Pydantic, Machine Learning, Kubernetes, Production Systems, Code Quality, Machine Learning Models

Who We Are At TwelveLabs, we are pioneering the development of frontier multimodal foundation models that can see, hear and understand the world as humans do. Our models have redefined the standards ...

August 26, 2025
Archived Los Angeles, California, United States Remote permanent
DevOps, Machine Learning, ML Infrastructure, Cloud Platforms, Kubernetes, Docker, Terraform, CI/CD, Automation, Security

At Serve Robotics, we’re reimagining how things move in cities. Our personable sidewalk robot is our vision for the future. It’s designed to take deliveries away from congested streets, make deliverie...

December 19, 2025
Archived Los Angeles, California, United States Remote permanent
Machine Learning, Data Engineering, Data Processing Pipelines, Data Curation, Data Annotation, Search Capabilities, Natural Language Querying, Orchestration and Scheduling, Data Schemas, Annotation Platforms

At Serve Robotics, we’re reimagining how things move in cities. Our personable sidewalk robot is our vision for the future. It’s designed to take deliveries away from congested streets, make deliverie...

October 29, 2025
Archived Los Angeles, California, United States Remote permanent
Full Stack Development, NoSQL, SQL, Cloud Platforms, React, CI/CD, Python, UI/UX, Data Engineering

At Serve Robotics, we’re reimagining how things move in cities. Our personable sidewalk robot is our vision for the future. It’s designed to take deliveries away from congested streets, make deliverie...

January 13, 2026
Archived Stockholm Remote permanent
Machine Learning, Diffusion Models, Audio Processing, Signal Processing, ML-Based Audio Processing, Model Training Pipelines, Performance Optimization, Model Integration, Research Translation, Code Quality

We are seeking a Senior Research Engineer to join our Artist-First AI Music lab. Our team pioneers and advances state-of-the-art generative technologies for music that create breakthrough experiences ...

January 23, 2026
Archived Santa Clara HQ permanent
Linux Systems Administration, Kubernetes, Ceph Storage, Python Scripting, Bash Scripting, Infrastructure-as-Code, GitOps, RDMA, InfiniBand, GPUDirect

About The Role We're looking for a Senior Site Reliability Engineer to help us run one of the most exciting GPU clusters around—our Toronto datacenter packed with NVIDIA H100 and A100 GPUs, over 20PB...

January 12, 2026
Archived Toronto permanent
Linux, Kubernetes, Ceph, Python, Bash, Infrastructure-as-code, Terraform, GitOps, RDMA, InfiniBand

About The Role We're looking for a Senior Site Reliability Engineer to help us run one of the most exciting GPU clusters around—our Toronto datacenter packed with NVIDIA H100 and A100 GPUs, over 20PB...

January 12, 2026
Archived Santa Clara HQ permanent
Network Engineering, InfiniBand, Ethernet, HPC, RDMA, GPU-to-GPU communication, Network Security, Firewalls, ACLs, VLANs

About The Role We're seeking an experienced Network Engineer to design, build, and optimize the high-performance networking infrastructure powering our AI/ML operations in Toronto. You'll work at the...

January 12, 2026
Archived Toronto permanent
Network Engineering, InfiniBand, Ethernet, High-Speed Networking, RDMA, GPU Communication, Network Security, Firewalls, ACLs, HPC Networking, Ceph Storage, Network Automation

About The Role We're seeking an experienced Network Engineer to design, build, and optimize the high-performance networking infrastructure powering our AI/ML operations in Toronto. You'll work at the...

January 12, 2026
Archived Seattle, Washington permanent
Leadership, Backend Development, AI/ML Infrastructure, Data Engineering, Data Pipelines, DevOps, Financial Systems, SQL, Go, Cloud-Native Architecture, Customer Experience

About us Today’s financial system is built to favor those with money. Grid’s mission is to level that playing field by building financial products that help users better manage their financial future....

July 7, 2025
Archived San Francisco, CA Hybrid permanent
Production ML Infrastructure, Python, AWS, Kubernetes, Feature Stores, Model Registries, MLFlow, CI/CD Pipelines, Reproducibility, Automated Rollback

About Gridware Gridware is a San Francisco-based technology company dedicated to protecting and enhancing the electrical grid. We pioneered a groundbreaking new class of grid management called active ...

December 11, 2025
Archived Mountain View, USA (Mountain View) Remote permanent
Machine Learning, Infrastructure, Platform Development, Services Development, Tooling Development, ML Engineering, Developer Experience, Developer Efficiency, Workflow Optimization, Ads Tech Infrastructure

Please complete the attached Internal Transfer Request Form and submit. Please make sure to apply with your Coupang e-mail address. Company Introduction We exist to wow our customers. We know we’re...

January 23, 2026
Archived London, UK (London (UK-LON-40BR)) Remote permanent
Machine Learning, AI Infrastructure, Large-scale Model Development, Distributed Systems, ML Accelerators, Machine learning model architecture, Data Engineering, System Design, Cross-functional Collaboration, Mentoring

Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. Since its start as the Google Self-Driving Car Project in 2009, Waymo has focused on building ...

January 14, 2026
Archived Mountain View, California, United States, San Francisco, California, United States (Mountain View (US-MTV-EMF680), San Francisco (US-SFO-MKT555)) Remote permanent
Data Analysis, Machine Learning, SQL, C++, Data Tooling

Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. Since its start as the Google Self-Driving Car Project in 2009, Waymo has focused on building ...

January 14, 2026
Archived Mountain View, CA, USA; San Francisco, CA, USA (Mountain View (US-MTV-EMF680), San Francisco (US-SFO-MKT555)) Remote permanent
Machine Learning Infrastructure, Large Foundation Models, ML Accelerators, Distributed Systems, Data Engineering, Model Development, Simulation, Multi-agent Systems, System Architecture, Performance Optimization

Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. Since its start as the Google Self-Driving Car Project in 2009, Waymo has focused on building ...

January 14, 2026
Archived Mountain View, CA, USA; San Francisco, CA, USA; New York City, NY, USA (Mountain View (US-MTV-EMF680), New York City (US-NYC-CHEL), San Francisco (US-SFO-MKT555)) Remote permanent
Python, Machine Learning, C++, Docker, Kubernetes, CI/CD, TensorFlow, JAX, MLOps

Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. Since its start as the Google Self-Driving Car Project in 2009, Waymo has focused on building ...

January 14, 2026