• RatBin@lemmy.world
    link
    fedilink
    English
    arrow-up
    1
    ·
    9 months ago

    Obviously nobody fully knows where so much training data come from. They used Web scraping tool like there’s no tomorrow before, with that amount if informations you can’t tell where all the training material come from. Which doesn’t mean that the tool is unreliable, but that we don’t truly why it’s that good, unless you can somehow access all the layers of the digital brains operating these machines; that isn’t doable in closed source model so we can only speculate. This is what is called a black box and we use this because we trust the output enough to do it. Knowing in details the process behind each query would thus be taxing. Anyway…I’m starting to see more and more ai generated content, YouTube is slowly but surely losing significance and importance as I don’t search informations there any longer, ai being one of the reasons for this.