Leveraging Bird Eye View Video and Multimodal Large Language Models for Real-Time Intersection Control and Reasoning

Managing traffic flow through urban intersections is challenging. Conflicts among a mix of vehicle types, compounded by blind spots, make intersections particularly prone to crashes. This paper presents a new framework based on a fine-tuned Multimodal Large Language Model (MLLM), GPT-4o, that can...

Bibliographic Details
Main Authors:	Sari Masri, Huthaifa I. Ashqar, Mohammed Elhenawy
Format:	Article
Language:	English
Published:	MDPI AG 2025-05-01
Series:	Safety
Subjects:	conflict detection; fine-tuning; Multimodal Large Language Models (MLLMs); prompt design; unsignalized intersections; urban traffic management
Online Access:	https://www.mdpi.com/2313-576X/11/2/40

Similar Items