The Topiary of the Digital World
Issue #24: Synthetic Data - Harnessing Order and Embracing the Wild for Creative Breakthroughs
Introduction
Synthetic data is a fascinating technological advancement, offering a cleaner, more controlled representation of human behaviors and problems. At its core, synthetic data is artificially generated data that mirrors the structure and properties of real-world data. It is meticulously crafted using algorithms and models that analyze vast amounts of information from the internet. These algorithms create personas that can mimic the complexities of human behavior, often including elements of chaos and unpredictability. However, much like the art of topiary, where wild bushes are sculpted into precise, decorative shapes, synthetic data tends to smooth out the wildness, the outliers, and the chaos inherent in real human experiences. This process results in a clean and polished representation that, while useful, misses the raw edges where true innovation often lies.
Welcome to Sandringham, 20,000 acres of land and one of the royal residences of Charles III. It is a large body of natural land indeed, but it is not real wilderness. You will find nature, but with few rough bushes. Where’s the fun in nature that has been fully sterilized by a landscape architect?
Synthetic Data in Marketing and Problem Solving
In the world of marketing and problem-solving, synthetic data has undeniable value. For average work, where the goal is to present market research and solutions in a neat, digestible format, synthetic data performs admirably. It offers consistency and reliability, providing a solid foundation for strategies and decisions. This topiary-like approach is sufficient for many purposes, as it avoids the tangents and unpredictability that can complicate analyses.
However, to truly produce groundbreaking work, the wildness and chaos of real human behavior are essential. Innovation often thrives at the edges, in the messy spaces where different environments and perceptions intersect. Synthetic data, with its sanitized and averaged personas, cannot fully capture these nuances. It is at the fringes of these clean personas—where the unexpected and the unstructured exist—that magic happens.
Marketing, at its core, aims to uncover new unknowns and competitive advantages. When everyone relies on synthetic data, derived from the same datasets and methodologies, the resulting strategies tend to converge. They miss the unique insights that come from engaging with real, messy human experiences. True competitive advantage is often found in the chaos, in understanding and leveraging the outliers, the anomalies that synthetic data tends to smooth over.
The Value of the Wild and Chaotic
Synthetic data can closely approximate reality, sometimes with around 95% accuracy, as noted by Mark Ritson in Marketing Week, but it excels mainly at representing averages. This high accuracy in the averages glosses over the richness of the outliers. These outliers, the wild and unpredictable elements of human behavior, are where groundbreaking solutions and innovations often emerge. Synthetic data, for all its benefits, cannot replicate the serendipity and creativity that arise from engaging with the full spectrum of human experiences. The chaotic competitive advantage lies in understanding the rabbit holes that other humans fail to explore and that the clean safehouse of statistical modeling never reaches. That advantage will likely only become more valuable in the future.
I enjoyed this report from The Alan Turing Institute, which supports some of the thoughts I’ve been exploring: https://royalsociety.org/-/media/policy/projects/privacy-enhancing-technologies/synthetic_data_survey-24.pdf
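For readers who like to see the point about averages and outliers in numbers, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the "real" sample is invented, the function names are hypothetical, and the generator is deliberately naive (it fits a single bell curve to the data and samples from it). Real synthetic data tools are far more sophisticated, so treat this as a caricature of the smoothing effect rather than a description of how any particular product works.

```python
import random
import statistics


def make_real_sample(n=5000, seed=0):
    """Invented 'real' data: mostly ordinary values, plus a few wild outliers."""
    rng = random.Random(seed)
    data = [rng.gauss(50, 10) for _ in range(n)]
    # Overwrite every 50th point (2% of the sample) with an extreme value.
    for i in range(0, n, 50):
        data[i] = rng.uniform(150, 300)
    return data


def make_synthetic_sample(real, seed=1):
    """Naive generator (an assumption): fit one normal curve, then sample from it."""
    rng = random.Random(seed)
    mu = statistics.fmean(real)
    sigma = statistics.stdev(real)
    return [rng.gauss(mu, sigma) for _ in real]


def count_outliers(data, threshold=150):
    """Count the 'wild' points at or above the threshold."""
    return sum(1 for x in data if x >= threshold)


real = make_real_sample()
synthetic = make_synthetic_sample(real)

print("real mean:         ", round(statistics.fmean(real), 1))
print("synthetic mean:    ", round(statistics.fmean(synthetic), 1))
print("real outliers:     ", count_outliers(real))
print("synthetic outliers:", count_outliers(synthetic))
```

In this toy setup the two averages land close together, while the cluster of extreme values in the invented real sample all but disappears from the synthetic one. That is the topiary effect in miniature: the overall shape is preserved and the rough bushes are trimmed away.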
The Danger of Synthetic Data as a Safety Crutch
There is a real danger that synthetic data could become a crutch, especially in industries like advertising where creativity and distinctiveness are paramount. Advertising agencies and their clients might lean into the cleanliness and sterility of synthetic data because it offers a sense of safety and understandability. It can also manage ‘big data’ with ease. This reliance on synthetic data could stifle creativity and lead to homogeneity in marketing strategies. AI and synthetic data should not become a safety net that distances us from the raw and real aspects of human behavior.
Keeping Some Human Data Wild
According to an analysis reported in the Proceedings of the National Academy of Sciences in 2021, only 19% of Earth's land remains wild. This stark statistic serves as a powerful analogy for the current state of human data. Just as we are becoming increasingly distanced from the wildness of nature, there is a risk that we might similarly distance ourselves from the raw and untamed aspects of human experiences by over-relying on synthetic data. The value in chaos and unfiltered data is immense. Not everything needs to be cleaned up to the point of sterility. There is still profound value in the raw, wild data that captures the true essence of human behavior.
In the advertising industry and beyond, it is crucial to maintain a connection to this wildness. As synthetic data becomes more prevalent, we must resist the urge to overly sanitize and simplify human experiences. Instead, we should strive to keep close to the unrefined and chaotic nature of real human behavior. This approach will ensure that our insights and solutions are rich, authentic, and truly innovative.
In Bali, indigenous people grow crops in the midst of a diverse tropical forest. There’s probably something for us to learn here: the land is manipulated, but within a wild, diverse tropical rainforest. Manipulated but not sterilized.
Conclusion
Synthetic data has its place and value, particularly in providing a stable foundation for average marketing work and problem-solving, but it falls short when it comes to unlocking true innovation. The real breakthroughs occur in the wild, messy, and chaotic spaces between the clean personas that synthetic data creates. To harness the full potential of human creativity and problem-solving, we need to embrace the chaos, the outliers, and the edges of human experience.
Therefore, our approach should not be to abandon synthetic data but to use it as a tool within a broader strategy that also values and seeks out the wildness and unpredictability of real human behavior. By doing so, we can create a more holistic and dynamic understanding of the world, leading to more innovative and effective solutions. Let's celebrate the untamed, the chaotic, and the messy, for it is in these spaces that we find the true magic of human potential.
By maintaining a balance between synthetic data and real-world chaos, we can avoid the danger of becoming overly reliant on the safety of clean, sterile data. Instead, we can continue to explore the edges and intersections where creativity and innovation thrive, ensuring that our work remains vibrant, distinctive, and impactful. Just as we must strive to preserve the wildness of nature, we must also ensure that the richness of human experience is not lost to the sanitizing effect of synthetic data.





