bugoid

http://www.reddit.com/user/bugoid

Highest Rated Comments

bugoid | 2 karma | 2023-05-16 21:02:18 UTC

Do you know which LLMs (e.g., ChatGPT, Bard, Llama) use C4 as their training data?

Do you have any insight into how some of these AI teams might be filtering out the more problematic C4 data prior to training?

Have you been able to confirm the degree to which problematic C4 data is actually represented in the models (e.g., by prompting the models to summarize that data)?


bugoid | 1 karma | 2023-05-16 21:32:00 UTC

Thank you! I can't wait to see your next report!
