PageLLM: A Multi-Grained Reward Framework for Whole-Page Optimization with Large Language Models

arXiv:2506.09084v2 Announce Type: replace-cross Whole-page optimization (WPO) decides how search and recommendation results are surfaced to users, and large language models (LLMs) open a new route to it by treating page generation as sequence generation. Adapting LLMs to web-scale WPO, however, remains bottlenecked by the need for costly human annotations and by the mismatched granularity between page-level coherence and item-level placement.