Step 1: Analyze the image as direction, not caption
The backend is instructed to recover the creative direction behind the image rather than write a plain description of it. That is why the structured output includes a subject anchor, a scene envelope, style signals, shot logic, emotion, and remix handles.
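To make the shape of that structured output concrete, here is a minimal sketch in Python. The six field names follow the parts listed above, but the class name `ImageDirection`, the helper `parse_direction`, the field types, and the assumption that the backend returns JSON are all hypothetical, chosen only for illustration; the real schema may differ.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class ImageDirection:
    """Hypothetical container for the six parts of the analysis.

    The types are assumptions: single-valued parts are strings,
    multi-valued parts are lists of strings.
    """
    subject_anchor: str        # what the image is fundamentally about
    scene_envelope: str        # setting and surroundings of the subject
    style_signals: List[str]   # visual style cues
    shot_logic: str            # framing and camera reasoning
    emotion: str               # mood the image conveys
    remix_handles: List[str]   # aspects that can be varied in a remix


def parse_direction(raw_json: str) -> ImageDirection:
    """Parse a JSON string into ImageDirection, rejecting incomplete output."""
    data = json.loads(raw_json)
    expected = list(ImageDirection.__dataclass_fields__)
    missing = sorted(set(expected) - set(data))
    if missing:
        raise ValueError("analysis is missing fields: " + ", ".join(missing))
    # Ignore any extra keys so only the six known parts are kept.
    return ImageDirection(**{name: data[name] for name in expected})


if __name__ == "__main__":
    # Illustrative values only, not real backend output.
    sample = json.dumps({
        "subject_anchor": "lone cyclist",
        "scene_envelope": "wet city street at dusk",
        "style_signals": ["film grain", "neon reflections"],
        "shot_logic": "low angle, shallow depth of field",
        "emotion": "quiet determination",
        "remix_handles": ["time of day", "weather"],
    })
    print(asdict(parse_direction(sample)))
```

Rejecting a response that lacks any of the six parts is a design choice in this sketch: a plain caption would fail validation, which keeps later steps from silently working with a description instead of direction.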