Job description


About Inria

Inria is the French national research institute dedicated to digital science and technology. It employs 2,600 people. Its 215 agile project teams, generally run jointly with academic partners, involve more than 3,900 scientists in tackling the challenges of digital technology, often at the interface with other disciplines. The institute draws on a wide range of talent in more than forty different professions. Its 900 research and innovation support staff help scientific and entrepreneurial projects with worldwide impact to emerge and grow. Inria works with many companies and has supported the creation of more than 200 start-ups. The institute thus strives to meet the challenges of the digital transformation of science, society and the economy.
PhD Position F/M Physics-Grounded World Models for Scalable, Efficient, and Robust Robot Learning
The job description below is in English
Contract type: Fixed-term contract (CDD)

Required level of education: Master's degree (Bac+5) or equivalent

Position: PhD student

Context and advantages of the position

The work will be conducted in the WILLOW team at the Inria Paris research center. Renowned for its work in computer vision and robotics, the WILLOW team has consistently produced high-quality research, resulting in publications in major journals and conferences.

As part of the team, you will have access to a well-established laboratory featuring multiple robotic platforms and a computing cluster.

Additionally, you can expect frequent visits and talks by esteemed researchers from top research laboratories around the world. Opportunities abound for collaboration with leading researchers both in Europe and globally.

Furthermore, you will join an international and welcoming team environment, where we regularly organize various events ranging from casual after-work gatherings to multi-day lab retreats.

Assignment

Physics-Grounded World Models for Scalable, Efficient and Robust Robot Learning

Thesis Information

Thesis director:, Researcher (CRCN), Inria, École Normale Supérieure - Willow team
Thesis co-director:, Researcher (CRCN), Inria, École Normale Supérieure - Willow team
Doctoral school: SMPC - ED 386 (Sciences Mathématiques de Paris Centre)
Location: Inria/Willow, Inria, and École Normale Supérieure

Context, motivations and scientific objectives

In recent years, Vision-Language-Action (VLA) models have demonstrated remarkable progress in enabling robot systems to follow complex instructions and generalize across tasks. However, these models currently face an extreme data-intensity bottleneck: they require vast amounts of real-world robot data - data that is prohibitively expensive to collect via human teleoperation.

While simulation offers a scalable alternative for robot data generation and reinforcement learning (RL), a fundamental reality gap remains. Classical physics simulators provide essential physical grounding but struggle to capture the full complexity of real-world interactions, particularly in contact-rich settings. Meanwhile, data-driven foundation models - such as recent video generation approaches - learn rich statistical variability but often produce physically inconsistent predictions.

This thesis aims to bridge this gap by developing physically grounded, multimodal world models that combine classical analytical methods with modern deep learning. The goal is to design efficient and physically consistent computational models for autonomous, complex robotic behaviors.

The project addresses three main scientific objectives:

- Objective 1: Hybrid World Models with Physical Awareness. Integrate model-based priors with data-driven architectures to create hybrid world models combining multimodal sensor streams (vision, tactile, force). The aim is to reduce data requirements, improve physical fidelity, and enhance interpretability and generalization.
- Objective 2: Learning Large-Scale Behavioral Priors. Use the hybrid models as synthetic data engines to train Large Behavior Models (LBMs), allowing imitation learning from simulated, physically consistent trajectories.
- Objective 3: Sample-Efficient RL in Hybrid Worlds. Develop reinforcement learning and planning algorithms that leverage the differentiable structure of the hybrid world models, solving complex tasks with higher sample efficiency than model-free approaches.

Expected Contributions and Scientific Approach

The research plan is structured around three main axes:

Axis 1: Developing Hybrid Architectures for Physics-Grounded World Modeling

Design physically grounded hybrid world models that combine differentiable physics engines with high-capacity generative architectures. The project will build upon , a differentiable physics simulator, and explore integrating 3D foundation and video generation models for high-fidelity visual prediction.

Axis 2: Scalable Robot Data Synthesis for LBM Pretraining

Develop methods to learn large-scale behavioral priors via imitation and self-supervised learning. Utilize hybrid models to generate physically consistent synthetic data, enabling pretraining of Large Behavior Models (LBMs) with strong task generalization.

Axis 3: Model-based Reinforcement Learning and Planning

Leverage the differentiability of hybrid models for gradient-based RL and planning. Integrate trajectory optimization, latent planning, and policy learning for a unified, sample-efficient framework capable of real-world transfer.

Software Development, Experimental Validation, and Benchmarking

All theoretical developments will be validated experimentally on robotic platforms (Unitree G1 humanoid, GO2 quadruped, Shadow Dexterous Hand) within the at Inria Paris. Algorithms and simulation engines will be open-sourced and integrated into the Willow software stack: , , and .

This PhD will benefit from access to the Robotics Lab and national computational resources (Jean Zay cluster). In line with open science practices, the project commits to considering the publication of code and datasets under a permissive open-source license, promoting reproducibility, knowledge sharing, and collaboration.

Supervision and Research Ecosystem

The at Inria is recognized for its expertise in computer vision, robotics, optimization, and reinforcement learning. The two supervisors' complementary expertise - Shizhe Chen in multimodal foundation models and Justin Carpentier in differentiable physics and optimization - will provide an ideal environment for advanced research.

The project also benefits from the AI clusters and collaborations with MIAI (Pierre-Brice Wieber, Michael Arbel) and ANITI (Ludovic Righetti), as well as potential international partnerships with Marc Toussaint (TU Berlin), Russ Tedrake (MIT), and Heng Yang (Harvard).

Application Instructions

This PhD will occur within the framework.

Non-discrimination, openness, and transparency

Candidates will be evaluated through an Open, Transparent, and Merit-based (OTM) recruitment process. All PR[AI]RIE-PSAI partners are committed to promoting equality, diversity, and inclusion within their communities. We encourage applications from individuals with diverse backgrounds and ensure that selections are made through an open and transparent process.

Evaluation criteria

- Academic excellence in relevant fields (computer science, applied mathematics, machine learning, or computer vision)
- Relevance of the candidate's background to the PhD topic
- Motivation and clarity of research goals
- Research potential (publications, projects, technical achievements)
- Ability to work independently and collaboratively
- Proficiency in programming and analytical skills
- International experience or mobility (considered an asset)

Application procedure

Applications should be submitted via the webpage where this offer is published.

- Application contact:
Justin Carpentier (Head of Willow) -
Shizhe Chen (Researcher) -
- Application deadline:
The deadline is May 15, and results will be communicated in two phases between May 30 and mid-June.
- Required documents:
  - The candidate's CV
  - A one-page motivation letter describing the applicant's ambitions and the relevance of their profile
  - A copy of the latest degree(s)

Main activities

Skills

The candidate must have an excellent track record and the following qualifications:

- Strong background in robotics and AI

- Excellent programming skills in Python/C++

- Strong proficiency in both written and spoken English

- Ability to work independently as well as collaboratively

Benefits

- Subsidized meals
- Partial reimbursement of public transport costs
- Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
- Possibility of teleworking and flexible organization of working hours
- Professional equipment available (videoconferencing, loan of computer equipment, etc.)
- Social, cultural and sports events and activities
- Access to vocational training
- Social security coverage
