Ready to replace all your employees with AI agents that can work all day and never complain?
You will want to fire the agents instead, because they can’t do the job.
A new study put AI agents from all the major players to the test.
The agents were given freelance jobs from Upwork.
They completed less than 3% of the work and earned a paltry $1,800 out of a possible $144,000.
The best-performing agent, from the Chinese AI company Manus, completed just 2.7% of the work assigned. An agent from Google, Gemini Pro, finished just 0.8% of the work.
Agents from Anthropic, OpenAI and Grok all performed just as poorly, posting scores from 1.7% to 2.1% of the work completed at a level that would be acceptable for a freelancer to be paid.
The model makers have been selling the idea of replacing humans with their super-smart AI since the day ChatGPT was unleashed almost three years ago.
But without memory storage or continual learning from experience, the promise is just that: a disappointing promise.
Even your intern can do better.
And companies that are laying off humans and using AI as an explanation probably aren’t being honest with their investors.
This study also supports the actions of Klarna, the Swedish fintech company that got rid of thousands of employees, only to hire them back a few months later.
And the experts who say the technology works, and that you aren’t using it correctly if you don’t see results, are also not being honest. This is the second recent study to show that the current technology does not perform in the real world as promised.
All of the funds being spent on data centers should be focused on finding a technology that actually works.
What we see in real-world applications is that humans need to be involved in any workflow that uses AI. The work must be checked, edited and corrected.
AI is not a set-it-and-forget-it technology, at least not yet. Maybe some of the money flowing to data centers can go toward finding a technology that is able to deliver on the promise of independent work. The current product needs a real human touch.
