Bruno Sánchez-Andrade Nuño

June 8, 2023

The Gutenberg Moment in AI and its shadow, the End of Digital Presumption of Veracity.

[Header image: a renaissance-style painting of the Gutenberg press operated by robots]


In the field of genetic ancestry, services like 23andMe enable us to trace our roots back 100,000 years. Paradoxically, recent ancestry, the generations closest to us, is harder to trace. Why? Partly because of the advent of increasingly common, fast, and cheap travel: horses, ships, trains, cars, planes... Travel mixes populations in far deeper, more complex, and more dynamic ways. The technological revolution made genetic ancestry tracing possible, but it also undermined the assumption that location maps neatly onto DNA. Similarly, a comparable shift is taking place today in the digital world. As I explain below, we can no longer assume, as we once could, that any digital record is a factual representation of the present or the past. It is the end of the presumption of digital veracity.

The Gutenberg Moment

Johannes Gutenberg's invention of the movable type printing press in Mainz, Germany, around 1440 was undeniably one of the most transformative events in human history. Prior to Gutenberg's innovation, producing a single book was a labor-intensive, manual process that could take a scribe over a year to complete. As a result, books were expensive and rare commodities, accessible only to the clergy, royalty, and the wealthy; a scarcity that kept the populace illiterate and powerless while that small class held power.

However, with Gutenberg's press, this all changed dramatically. The press could produce up to hundreds of pages per day, a stark contrast to the mere handful a scribe could manage. By 1480 there were 110 printers active across Europe, and they went on to produce some 20 million books; the following century, 200 million.

Making books was the output, but it drove dramatic changes far and wide. The printing press triggered a surge in the skill needed to use this new tool: literacy. Within two generations, Western Europe went from an estimated 10% literacy (1% among women) to as much as 90% in some cities. The cultural Renaissance and the scientific revolution were sparks of the printing press. News, ideas, calendars, laws, religions, and languages, new and old, became standardized content that traveled far and wide.

In essence, the invention of the movable type printing press was not just a technical innovation. It was a societal catalyst that democratized knowledge, reshaped power structures, and paved the way for the intellectual, cultural, and scientific advancements that define our modern world.

The Gutenberg Moment of AI

Today, generative AI — an AI capable of creating, understanding and modifying digital content — stands to redefine how digital content is created and consumed. It can generate text, images, music, and even code, matching or surpassing human quality, and certainly surpassing humans in speed, cost, and scale.

Many have predicted truly exciting, positive changes reshaping societal structures. The content was technically available before, but now anyone can request extremely tailored guidance in their own language, adapted to what they already know and filling in the missing bits. We can make anything personalized, or turn it into a game, a song, a video, or a poem. On demand, in seconds: news articles, educational materials, health advice, music, art, software. It can translate for us in real time, advise us, teach us, support us, ...

Moreover, I believe all of the above stems from paying attention to the "generative" part of this new wave of AI. We are still not leaning enough into what I believe is the linchpin that unlocks many more innovations and impact: a technical construct called "embeddings", which captures a mathematical representation of inter-related meaning. [I wrote about them here].
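To make the embeddings idea concrete, here is a minimal sketch of how text is mapped to vectors whose geometry encodes meaning: related sentences land close together, unrelated ones far apart. It assumes the open-source sentence-transformers library and the all-MiniLM-L6-v2 model, both my own illustrative choices rather than anything referenced in this post.

```python
# Minimal sketch: embeddings turn text into vectors whose geometry encodes meaning.
# Assumes: pip install sentence-transformers (model choice is illustrative only).
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "The printing press democratized access to books.",
    "Gutenberg's invention made knowledge cheap to reproduce.",
    "My cat refuses to eat dry food.",
]

# Each sentence becomes a fixed-length vector (384 dimensions for this model).
embeddings = model.encode(sentences)

# Cosine similarity between every pair of sentences: related meanings score high.
scores = util.cos_sim(embeddings, embeddings)
print(scores)  # the two printing-press sentences are far more similar to each
               # other than either is to the sentence about the cat
```

The same trick, comparing vectors instead of keywords, is what powers semantic search, recommendation, and retrieval on top of generative models.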

However, like the Gutenberg moment, the rise of generative AI brings its own challenges. It's unclear how dependable, accurate, or fair we can [or should] make AI. It can also supplant us with or without our consent. Truly convincing deepfakes, propaganda, populism or misinformation are as easy to create and hide as they are effective. This is why I believe it's the end of an era:

The end of digital presumption of veracity.

Generative AI is reshaping our digital landscape to such an extent that determining the authenticity of digital content is increasingly complex. Today, even tracing the origin or the modifications of a digital item is challenging. Tools to create or alter content, openly or covertly, are increasingly easy, cheap, and available to anyone. The process is also remarkably opaque: we even like to design and present AI-generated content as if it were made by humans, for example in emails, essays, or blog posts (this one?). Since generative AI is designed to appear human, not to be truthful or to reveal itself as AI-made, we are quickly thrown into trouble. As an example, Google's top image result for the famous painter "Johannes Vermeer" is not an image created by him, but an unattributed AI-generated version.

Paradoxically, it will also limit AI itself, since many types of AI depend on learning to mimic large amounts of human content. From now on, any content might actually be AI-generated, and therefore of far less value for learning to mimic humans. It would be like trying to learn real Spanish by reading bad machine translations.

The growing power of generative AI underlines the need for a tamper-proof system of veracity. It's not just about debunking fake news; it's about preserving history and ensuring that our decision-making, opinion-forming, and understanding of the world are based on reliable content. Addressing this is a collective effort. We need not only new technologies but also fresh norms, laws, and regulations that govern generative AI usage while upholding truth, transparency, and accountability.

As we embrace this new Gutenberg moment, we also face a fundamental challenge: it is genuinely hard to establish the authenticity of digital content and to prevent its misuse, yet our decisions, our opinions, our understanding of the world, and our digital history itself all depend on reliable content. I am candidly not sure we can solve this. It is a generative adversarial problem in which the machines are simply too good, winning round after round, until their own success poisons the training data they need to thrive.

Disclaimer of AI use:
* This blog post was drafted manually as bullet points, its arguments discussed with ChatGPT (GPT-4), and then edited back manually.
* Top image created with Bing Image creator with the prompt: "A renaissance realism style painting of the Gutenberg print used by robots."

About Bruno Sánchez-Andrade Nuño

Scientist. Impact Architect. Intellectually promiscuous. Stoic optimist… all that you need when working on tech innovation for climate change, socioeconomic development and biodiversity. By training PhD Astrophysics and rocket scientist. By way of #PlanetaryComputer 
Saepe cadendo. Dad to Sela, @emmyagsmith husband