A Multi-Modal Attentive Framework That Can Interpret Text (MMAT)

Deep learning algorithms have demonstrated exceptional performance on various computer vision and natural language processing tasks. However, for machines to learn information signals, they must understand and have enough reasoning power to respond to general questions based on the linguistic featur...

Bibliographic Details
Main Authors:	Vijay Kumari, Sarthak Gupta, Yashvardhan Sharma, Lavika Goel
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Visual question answering system (VQA); text visual question answering system (Text-VQA); optical character recognition (OCR); attention mechanism; natural language processing (NLP)
Online Access:	https://ieeexplore.ieee.org/document/11072709/
