Do you think there is any plausible risk of automated adversarial approaches being used to evade AI content moderation (I'm imagining stuff like Gmail's spam detection)? I imagine there could be a significant market incentive to defeat these systems.
Mr-Frog (16 karma)