Jerene Yang

    Women in AI


    Synthetic Data Generation Lead, OpenAI

    Jerene Yang leads synthetic data generation at OpenAI, working on one of the most critical challenges in modern AI development. Synthetic data, meaning artificially generated training data that mimics real-world data, has become essential for training large language models while addressing privacy concerns, data scarcity, and bias.

    Yang's work focuses on developing methods to create high-quality synthetic datasets that improve model performance while reducing reliance on potentially sensitive or copyrighted real-world data. This research area is increasingly important as AI companies face growing scrutiny over their training data practices.

    Her contributions are helping to define the future of AI training methodology, making it possible to build more capable and responsible AI systems.
