AI RESEARCH
l9gpu - open-source GPU observability with workload-level attribution [P]
r/MachineLearning
•
GPU monitoring tools like DCGM give you hardware-level metrics but no workload context. When a node is saturated, you can't tell which experiment, team, or job is responsible without digging through logs. We built l9gpu to close that gap.