Why I am not impressed by A.I.

joel1974@lemmy.world · 1 day ago

Allero@lemmy.today · edit-2 15 hours ago

Here’s my guess, aside from highlighted token issues:

We all know LLMs train on human-generated data. And when we ask “how many R’s” or “how many L’s” are in a given word, we usually don’t mean a count of every occurrence. We normally mean something like “how many consecutive letters are there, so I can spell it right”.

Yes, the word “strawberry” has 3 R’s. But what most people are interested in is whether it is “strawberry” or “strawbery”, and their “how many R’s” refers to this exactly, not the entire word.
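To make the two readings of the question concrete, here is a small Python sketch. The function names `total_count` and `longest_run` are made up for illustration; the first counts every occurrence of a letter, the second only measures the longest consecutive run, which is closer to the "one R or two" spelling question.

```python
from itertools import groupby


def total_count(word: str, letter: str) -> int:
    """Count every occurrence of `letter` in `word` (case-insensitive)."""
    return word.lower().count(letter.lower())


def longest_run(word: str, letter: str) -> int:
    """Length of the longest consecutive run of `letter` in `word`."""
    target = letter.lower()
    # groupby collapses adjacent identical characters into one group
    runs = [len(list(group)) for ch, group in groupby(word.lower()) if ch == target]
    return max(runs, default=0)


print(total_count("strawberry", "r"))  # 3
print(longest_run("strawberry", "r"))  # 2
print(longest_run("strawbery", "r"))   # 1
```

Under the literal reading the answer is 3; under the spelling reading it is 2, which may be the answer that shows up more often in human-written text.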

jj4211@lemmy.world · 1 hour ago

It doesn’t even see the word ‘strawberry’. The input has been tokenized, so the model no longer sees the ‘text’ that was typed.

It’s more like it sees a question like: “How many R’s in 草莓?”

And it spits out an answer not based on analysis of the input, but a model of what people might have said.
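A rough sketch of the tokenization point, in Python. Everything here is a toy: `TOY_VOCAB`, its IDs, and `toy_tokenize` are invented for illustration, and real tokenizers split words differently and use much larger vocabularies. The idea is only that the model receives a list of integers, and no individual letter appears anywhere in that list.

```python
# Hypothetical vocabulary: the pieces and IDs are made up for this example.
TOY_VOCAB = {"str": 101, "aw": 102, "berry": 103}


def toy_tokenize(text: str, vocab: dict[str, int] = TOY_VOCAB) -> list[int]:
    """Greedy longest-match split of `text` into vocabulary IDs."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining piece first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i:]!r}")
    return ids


print(toy_tokenize("strawberry"))  # [101, 102, 103]
```

Counting the R's from `[101, 102, 103]` alone is not possible without already knowing how each ID is spelled, which is roughly the position the model is in.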

Opisek@lemmy.world · 15 hours ago

But to be fair, as people we would not ask “how many Rs does strawberry have”, but “with how many Rs do you spell strawberry” or “do you spell strawberry with 1 R or 2 Rs”.