A curated list for vision-and-language navigation, accompanying the ACL 2022 paper "Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions"
Updated May 2, 2024
A curated list of research papers in Vision-Language Navigation (VLN)
Code of the CVPR 2021 Oral paper: A Recurrent Vision-and-Language BERT for Navigation
Code and Data of the CVPR 2022 paper: Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation
PyTorch code for the ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"
Repository for Vision-and-Language Navigation via Causal Learning (Accepted by CVPR 2024)
Code of the NeurIPS 2021 paper: Language and Visual Entity Relationship Graph for Agent Navigation
Code and data of the Fine-Grained R2R Dataset proposed in the EMNLP 2021 paper Sub-Instruction Aware Vision-and-Language Navigation
Fast-Slow Test-time Adaptation for Online Vision-and-Language Navigation
Planning as In-Painting: A Diffusion-Based Embodied Task Planning Framework for Environments under Uncertainty
Code for the ORAR agent for Vision-and-Language Navigation on Touchdown and map2seq
[AAAI-25] FLAME: Learning to Navigate with Multimodal LLM in Urban Environments (arXiv:2408.11051)
Companion Repo for the Vision Language Modelling YouTube series - https://bit.ly/3PsbsC2 - by Prithivi Da. Open to PRs and collaborations
[ECCV 2022] Official pytorch implementation of the paper "FedVLN: Privacy-preserving Federated Vision-and-Language Navigation"
Code for 'Chasing Ghosts: Instruction Following as Bayesian State Tracking' published at NeurIPS 2019
LACMA: Language-Aligning Contrastive Learning with Meta-Actions for Embodied Instruction Following
Official repository of "Mind the Error! Detection and Localization of Instruction Errors in Vision-and-Language Navigation". We present the first dataset, R2R-IE-CE, to benchmark instruction errors in VLN, and then propose a method, IEDL.
A list of research papers on knowledge-enhanced multimodal learning
Hierarchical Spatial Proximity Reasoning for Vision-and-Language Navigation