• borth@sh.itjust.works
    English
    arrow-up
    18
    arrow-down
    1
    ·
    2 days ago

    I don’t understand how these companies think they look so smart by picking new niche (scraped) data to train AI on in a bid to make it “smart”…

    Has any other living being become “smart” by only ingesting information directly from the Internet? You can train other animals to perform many tasks, and you can probably say they are smart when they perform them as expected. I doubt any of the training methods involve taping headphones, a screen and sometimes a microphone to their faces forever (I kinda don’t wanna know if this is false 😶).

    The best example we have is ourselves, and even though we use the Internet, babies are not taught how to walk and talk by only interacting with the Internet.

    I feel like I might be saying too much, but I think the best AI we’re gonna get is to unplug it from the Internet, and then fucking raise it for 20 years like a normal, super fast-thinking child prodigy. Then just make copies of that and train further by having it go to school for the things needed.

    • BilSabab@lemmy.world
      English
      arrow-up
      3
      ·
      2 days ago

      Straightforward data scraping from the web usually ends up producing a whole lot of dark data.

    • NιƙƙιDιɱҽʂ@lemmy.world
      English
      arrow-up
      5
      arrow-down
      1
      ·
      2 days ago

      That’s a very naive simplification of the AI training process. You start with that, then pay people pennies in a developing nation to produce hand-crafted training data, resulting in it using stupid words like “delve” and “whimsical” entirely too much.

      Merely training on internet content with no RLHF results in probable gibberish like that of GPT-2.

    • sureshot0@discuss.online
      English
      arrow-up
      2
      ·
      2 days ago

      Has any other living being become “smart” by only ingesting information directly from the Internet?

      LMFAO