I-SAFE: Wasserstein Coherence Metrics for Structural Auditing of Scientific AI Models

arXiv:2605.21731v1 Announce Type: new

Deep learning models are increasingly used in scientific prediction tasks where strong benchmark performance is often interpreted as evidence of scientifically meaningful behavior. This interpretation is fragile, as models may exploit shortcut features, dataset-specific regularities, or distributional biases that are predictive on held-out data but not aligned with domain-relevant structure. To address this limitation, we