Synthetic Data
Introduction
In our mission to shorten the path from data to insight, we aim to improve access to data. However, access is often hindered by legal barriers and internal policies. One way to overcome this challenge is to use synthetic data.
What is Synthetic Data?
Synthetic data is data generated by a computer algorithm to mimic real data. Its advantage is that it can retain the valuable properties of real data without breaching privacy, because the records do not describe real people. The challenge, however, is that the higher the quality of the synthetic data, the higher the chance of “working back” to real, identifiable individuals and so breaching their right to privacy. The Data Innovations Office has developed synthetic data generation algorithms that overcome this challenge. The synthetic data we generate is not only high quality but also private.
Why is Synthetic Data valuable?
Synthetic data can be used in a variety of ways. It can support teaching and training, where learners apply their knowledge to real-world problems using realistic data. Because machine learning and AI models need diverse training data, synthetic data can also augment real data to improve its diversity. Finally, synthetic data has the potential to “jump-start” research and innovation by facilitating low-risk experimentation and exploration. We envision a time when synthetic data will work together with real data to drive innovation.
Join us!
The AKU Data Office is always on the lookout for opportunities to apply this technology and is open to collaboration.
Email us at cdio@aku.edu.