And the training data, right? How do you prevent exactly what the person you replied to is talking about: situations where we ingrain the horrible nature of humankind because that's what we trained the AI on? Say you're building a credit-score classifier NN. It's going to "learn" that African Americans have lower credit scores, mostly because it's trained on historical data, which encodes decades of human decisions, and humans are inherently racist in some fashion or another. And societal constructs have prevented African Americans from economic advancement since... ever.
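To make that concrete, here's a minimal sketch on toy synthetic data (the feature names like `zip_code` and all the numbers are made up for illustration, nothing resembling a real credit model): even when you leave the protected attribute out of the inputs entirely, a classifier trained on biased historical labels just reconstructs it from a correlated proxy.

```python
# Minimal sketch of proxy discrimination: the model never sees the
# protected attribute, but learns it anyway through a correlated feature.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

# Protected attribute (0/1), deliberately withheld from the model.
group = rng.integers(0, 2, size=n)

# Proxy feature: a ZIP-code-like value strongly correlated with group
# (redlining-style segregation baked into the data).
zip_code = group + rng.normal(0, 0.3, size=n)

# Historic labels encode past discrimination: group 1 was approved
# less often, independent of any "merit" signal like income.
income = rng.normal(0, 1, size=n)
approved = (income + 1.5 * (1 - group) + rng.normal(0, 1, size=n)) > 0.75

# Train only on the "neutral" features -- race is not a column in X.
X = np.column_stack([income, zip_code])
model = LogisticRegression().fit(X, approved)

# The model still approves the two groups at very different rates,
# because zip_code stands in for the attribute we removed.
pred = model.predict(X)
print("approval rate, group 0:", pred[group == 0].mean())
print("approval rate, group 1:", pred[group == 1].mean())
```

Dropping the protected column doesn't remove the bias; the model routes around it through whatever proxies the historical data left behind.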
You can't just have "transparent objectives" and let AI take the wheel.