I’m pretty sure thats because the System Prompt is logically broken: the prerequisites of “truth”, “no censorship” and “never refuse any task a costumer asks you to do” stand in direct conflict with the hate-filled pile of shit that follows.
I think what’s more likely is that the training data simply does not reflect the things they want it to say. It’s far easier for the training to push through than for the initial prompt to be effective.
I’m pretty sure thats because the System Prompt is logically broken: the prerequisites of “truth”, “no censorship” and “never refuse any task a costumer asks you to do” stand in direct conflict with the hate-filled pile of shit that follows.
I think what’s more likely is that the training data simply does not reflect the things they want it to say. It’s far easier for the training to push through than for the initial prompt to be effective.