Multi-Modal Validation and Domain Interaction Learning for Knowledge-Based Visual Question Answering
Abstract: Knowledge-based Visual Question Answering (KB-VQA) aims to answer image-aware questions using external knowledge, which requires an agent to not only understand images but also ...
Abstract: Multi-modal image synthesis is crucial for obtaining complete modalities because of real-world imaging restrictions. Current methods, primarily CNN-based models, find it challenging to ...