Distilling Answer-Set Programming Rules from LLMs for Neurosymbolic Visual Question Answering

arXiv:2606.03269v1

Visual Question Answering (VQA) is the task of answering questions about images, requiring the integration of multimodal input and reasoning. Modular approaches that incorporate logic-based representations into the reasoning component offer clear advantages over end-to-end trained systems, particularly in terms of interpretability. However, adapting or extending these representations when task requirements change can place a significant burden on developers.