“My hammer is not well suited to cut vegetables” 🤷
There is so much to say about AI. Can we move on from “it can’t count letters and do math”?
But the problem is more “my do-it-all tool randomly fails at arbitrary tasks in an unpredictable fashion,” which makes it hard to trust as a tool in any circumstances.
It would be like complaining that a water balloon isn’t useful because it isn’t accurate. LLMs are good at approximating language; numbers are too specific and have more objective answers.
I get that it’s usually just a dunk on AI, but it is also still a valid demonstration that AI has pretty severe and unpredictable gaps in functionality, in addition to failing to properly indicate confidence (or lack thereof).
People who understand that it’s a glorified autocomplete will know how to disregard or prompt around some of these gaps, but this remains a litmus test because it succinctly shows you cannot trust an LLM response even in many “easy” cases.