AI RESEARCH

Sample Complexity of Policy Gradient for Log-Growth Control

arXiv CS.LG

ArXi:2605.26640v1 Announce Type: cross We study the sample complexity of policy gradient for log-growth control -- the problem of learning, from observed state transitions, a feedback gain that optimally stabilizes a scalar linear system driven through a multiplicative-noise actuation channel. The objective $J(K) = \mathbb{E}[\log|1+BK|]$ is the top Lyapuno exponent of the closed loop. This problem carries a structural difficulty we call the cusp obstruction: the optimal gain $K^*$ always places the noise singularity $b_{\rm sing}(K) = -1/K$ in the interior of the.