We've rolled out the Codex platform internally as an alpha, and we're excited to see teammates use Codex seamlessly and share skills across the company.
Codex models + Codex Harness have been the most consistent option for long-running tasks in my workflow. That consistency is why many top engineers I follow keep backing Codex.
I've tested Claude and Gemini for long-running workflows too, but I still see reliability gaps when tasks run for a long time. I've felt this way since the 4.0 generation, and it hasn't really improved for my use cases.
If/when AI becomes the default operator for long-running tasks across industries, I believe Codex will stand out as the most dependable choice.