Synthetic Data in Machine Learning: What, Why, How?

Vincent Granville
1 min read · Jul 25, 2022

In this episode, Nicolai Baldin (CEO) and Simon Swan (Machine Learning Lead) of Synthesized welcome Vincent Granville, founder of MLTechniques.com and co-founder of Data Science Central, to discuss synthetic data generation, machine learning on synthetic data, the key challenges of working with synthetic data, and the use of generative models to address fairness and bias.

Contents

0:00 — Introductions

3:24 — How did you become interested in synthetic data?

5:36 — How does the corporate world interact with synthetic data?

8:31 — Problems that synthetic data can help solve

18:55 — Synthetic datasets used by corporations

27:55 — What is driving the interest in synthetic data?

31:21 — How would you define what synthetic data actually is?

38:43 — Creating and sharing high-quality synthetic data

41:58 — What criteria should be used to measure synthetic data?

46:02 — Challenges in scaling from standalone tables to databases

49:38 — Data coverage concept and its applications

51:30 — Using synthetic data to help solve biases

57:13 — Fire round

1:00:53 — Conclusions

View podcast here.


Written by Vincent Granville

Founder, MLtechniques.com. Machine learning scientist. Co-founder of Data Science Central (acquired by TechTarget).