So as long as the training data is well selected for your problem…
It’s clear that 4chan, Reddit, etc. are over-represented in the training data for LLMs, which explains why ChatGPT might be more awful than an average person. Having an LLM decide on, e.g., college admissions would be like having a Twitter poll decide who should be its next CEO. Like, that’s obviously stupid, nobody would ever do that, right?
The problem is that in the college admission example, the models were trained on previous admission decisions made by college employees, and these models are still biased.
It’s not. It’s a record of online conversations, which tend to be more polarized and extreme than people are in real life.
That’s why I said:
It’s clear that 4chan, Reddit, etc. are over-represented in the training data for LLMs, which explains why ChatGPT might be more awful than an average person. Having an LLM decide on, e.g., college admissions would be like having a Twitter poll decide who should be its next CEO. Like, that’s obviously stupid, nobody would ever do that, right?
The problem is that in the college admission example, the models were trained on previous admission decisions made by college employees, and these models are still biased.