The emergence of vision language models (VLMs) offers a promising new approach. VLMs integrate computer vision (CV) and natural language processing (NLP), enabling AVs to interpret multimodal data by ...