AI RESEARCH
MEMENTO: Leveraging Web as a Learning Signal for Low-Data Domains
arXiv CS.AI
arXiv:2605.29795v1 Announce Type: new Real-world tasks often lack large labeled datasets, motivating extensive work on learning in low-data regimes. However, existing approaches, such as few-shot prompting, instruction tuning, and synthetic data generation, continue to treat labeled or pseudo-labeled data as the primary learning signal. In contrast, human practitioners acquire expertise through repeated, self-directed interaction with the open web, progressively refining both domain knowledge and search strategies.