Rethinking Data Use in Large Language Models