AI RESEARCH
Prism: A Plug-in Reproducible Infrastructure for Scalable Multimodal Continual Instruction Tuning
arXiv cs.LG
arXiv:2605.26110v1 Announce Type: new Multimodal Large Language Models (MLLMs) achieve versatility by reformulating diverse tasks into a unified instruction-following framework via instruction tuning. However, real-world deployment requires continuous adaptation to emerging tasks, motivating Multimodal Continual Instruction Tuning (MCIT). Despite its growing importance, current MCIT research is hindered by severe engineering bottlenecks.