Marshall McLuhan, the great 20th-century Canadian philosopher of media, was more than prescient; he was prophetic. His declaration that "the medium is the message" did more than define 20th-century media theory; it has become the skeleton key to understanding our AI-transformed future. In McLuhan's world, form overshadowed content. In ours, AI is obliterating the boundaries between forms altogether.
Welcome to the age of Liquid Content, where information flows seamlessly between mediums, transforming its shape while preserving its essence. What was once fixed (text, audio, video) now exists in a state of perpetual potential, ready to materialize in whatever form serves the moment. To understand where Liquid Content will take us, we need only look at how it has already transformed how we comprehend information.
On TikTok, a video’s format – short, vertical, often overlaid with text – is key to its addictive appeal. A striking example is the prevalence of AI-enabled dynamic transcript overlays (auto-captions and text snippets that appear in sync with speech). By presenting spoken words as on-screen text, TikTok videos manage to grip viewers even with the sound off – something traditional digital media formats failed to do. The result? Higher engagement and retention. In fact, surveys have found that 80% of viewers are more likely to watch an entire video when captions are on, and 37% say captions actually encourage them to turn the sound on out of increased interest.
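Mechanically, those overlays are simple to picture. Below is a minimal sketch of how word-level timestamps from a speech-to-text model might be grouped into short, screen-sized caption segments that appear in sync with speech. The input format and thresholds are invented for illustration; real ASR APIs each have their own output shapes.

```python
# Group word-level ASR timestamps into short caption segments.
# A new segment starts when the current one is full or when a pause
# longer than `max_gap` seconds suggests a natural break.

from dataclasses import dataclass

@dataclass
class Caption:
    start: float  # seconds
    end: float
    text: str

def words_to_captions(words, max_words=4, max_gap=0.6):
    """Turn (word, start, end) tuples into on-screen caption segments."""
    captions, current = [], []
    for word, start, end in words:
        if current and (len(current) >= max_words
                        or start - current[-1][2] > max_gap):
            captions.append(Caption(current[0][1], current[-1][2],
                                    " ".join(w for w, _, _ in current)))
            current = []
        current.append((word, start, end))
    if current:
        captions.append(Caption(current[0][1], current[-1][2],
                                " ".join(w for w, _, _ in current)))
    return captions

# Toy transcript: (word, start_time, end_time)
words = [("the", 0.0, 0.2), ("medium", 0.2, 0.6), ("is", 0.6, 0.7),
         ("the", 0.7, 0.8), ("message", 0.8, 1.3), ("and", 2.1, 2.2),
         ("the", 2.2, 2.3), ("message", 2.3, 2.8), ("is", 2.8, 2.9),
         ("liquid", 2.9, 3.4)]
for c in words_to_captions(words):
    print(f"[{c.start:.1f}-{c.end:.1f}s] {c.text}")
```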

TikTok didn't merely add captions to videos; it fundamentally rewired how we consume information. Those dynamic transcript overlays transform passive viewing into active reading-watching, which changes when and how TikTok is consumed. When sound was required to process the content, engaging meant finding silence or putting in headphones without being called out. Now students watch TikTok under their desks during lectures, and bored employees do the same during meetings. Removing the need for audio to carry the totality of the content created a new, purely visual medium, one that can be engaged with in far more settings than hybrid forms allow. It's far easier to watch a TikTok in public without drawing attention than it is to watch a YouTube video.
Another medium shift is happening in the world of spoken literature. Traditionally, Amazon’s Audible platform has been the go-to for audiobooks, where the dedicated medium (an audiobook-only app with a credit-based model) encourages listeners to approach books in a certain way – often choosing one heavyweight title per month, typically by known authors, to maximize their subscription credit. Amazon’s thesis with Audible is the inverse of McLuhan’s: the belief that content is king. Amazon sees books, whether physical, digital, or audio, as fundamentally the same thing, and so the discovery experience is consistent across these mediums. There was likely even a belief during the Audible acquisition that audiobooks would encourage more physical book consumption: the person reading a thriller they just couldn’t put down would switch to the audiobook when they got in the car to pick up their kids. For better or for worse, this just isn’t how people approach content.
Contrast Amazon’s focus on content (in this case books) across mediums with Spotify’s focus on medium across content. Spotify’s addition of audiobooks to a music streaming app might look like merely adding content, but it is a change in medium context with significant effects on consumption. Spotify’s medium is different: it’s a general audio platform for music and podcasts, now offering audiobooks in a time-based model. This shift in medium and pricing model has started to change user behavior in notable ways.
According to Spotify’s own observations, their audiobooks are reaching a different audience and surfacing different titles than Audible. Spotify CEO Daniel Ek noted that many listeners on Spotify explore lesser-known works and younger or emerging authors – content they might skip on Audible – precisely because the medium’s structure (listening hours up to a cap across the library, integrated in a familiar app) encourages sampling and serendipity. On Audible, the credit system nudges users toward “safe bets” (bestsellers, famous authors) since you essentially purchase one audiobook at a time. But on Spotify, the audiobook lives side by side with songs and podcasts, and you can dip in without an extra fee per book, lowering the barrier to trying niche topics. The medium (an audio app that people use casually throughout the day) becomes the message: it frames audiobooks less as a dedicated reading experience and more as another form of on-demand audio entertainment.
The difference highlights McLuhan’s point: the platform is not neutral. Put audiobooks into the music streaming medium and you get different patterns of use than in the audiobook-specific medium. Listeners on Spotify consume books in more fragmented, exploratory ways – perhaps listening in shorter bursts between songs – whereas Audible users might carve out long blocks of focused listening. The same stories take on a new life when the medium changes. How we consume is shaped by where we consume. Context reshapes content.
These transformations in TikTok and Spotify exemplify McLuhan's thesis in action: the medium fundamentally reshaping how content is consumed and understood. Yet these examples represent only the initial stages of a more profound revolution. What we're witnessing in these platforms is medium adaptation, where content shifts between established forms. What lies ahead is something far more radical: the complete dissolution of medium boundaries altogether. The next evolutionary leap is not content moving between fixed channels but content existing in a state of pure potential, ready to materialize in whatever form best serves the user and context.
Liquid Content
If TikTok and Spotify show medium shifts in action, AI-powered multimodality represents something far more radical: the complete liberation of content from form. This goes well beyond converting text to speech; we're entering an era where content exists as pure information potential, ready to materialize in whatever medium best serves the moment.
This is more than flexible content; it's Liquid Content: a paradigm where information flows into the vessel most appropriate for context, user, and purpose.
At the heart of this trend are multimodal embeddings – AI models that learn a joint representation of different data types. Researchers at Meta, for instance, developed ImageBind, a model that binds together six modalities (images, text, audio, depth, thermal, and motion) into one shared embedding space. Such a model can understand correspondences between, say, a sound and an image – enabling transfer from any one medium to another. A good mental model of what embeddings enable is search on steroids. Every platform, from TikTok to Spotify, gives users what they want by enabling effective search and recommendations over the content it has. But AI enables search not just over existing content, but over all possible content. Spotify could generate audiobooks or podcasts via text-to-voice from a library far larger than the content produced specifically for its platform, in direct response to a particular user’s needs. In effect, AI like this provides an “anything-to-anything” machine: feed in a piece of content in one form, and get it out in another. The medium barrier breaks down.
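To make the embedding idea concrete, here is a toy sketch of retrieval in a shared embedding space. The encoder below cheats: it embeds a per-asset concept label directly, which fakes the property a trained ImageBind-style model actually learns, namely that a barking sound, a video of a dog, and the word "dog" land near one another. The catalog, asset names, and scoring are all invented for illustration; nothing here reproduces a real API.

```python
# Toy cross-modal retrieval over a shared embedding space.
# embed_concept() is a stub: a deterministic unit vector per concept,
# standing in for trained per-modality encoders that map corresponding
# content in different modalities to nearby points.

import zlib
import numpy as np

DIM = 64

def embed_concept(concept: str) -> np.ndarray:
    """Stub encoder: deterministic unit vector derived from a string."""
    rng = np.random.default_rng(zlib.crc32(concept.encode()))
    v = rng.normal(size=DIM)
    return v / np.linalg.norm(v)

# Hypothetical asset catalog: (asset id, modality, underlying concept).
catalog = [
    ("clip_017.wav", "audio", "dog barking"),
    ("img_204.jpg",  "image", "city skyline at night"),
    ("doc_882.txt",  "text",  "recipe for sourdough bread"),
    ("clip_031.mp4", "video", "dog barking"),
]
index = {asset: embed_concept(concept) for asset, _, concept in catalog}

def search(query: str, k: int = 2):
    """Rank every asset, regardless of modality, by cosine similarity."""
    q = embed_concept(query)
    scored = [(asset, float(q @ vec)) for asset, vec in index.items()]
    return sorted(scored, key=lambda x: -x[1])[:k]

# A text query surfaces the audio and video assets about the same concept.
for asset, score in search("dog barking"):
    print(f"{asset}: similarity {score:.3f}")
```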
McLuhan said new media alter our “sense ratios” – how we balance sight, sound, touch, etc. AI multimodality is poised to scramble those sense ratios completely. When an idea or experience can be materialized as text, spoken word, image, or even haptic feedback interchangeably, the medium truly becomes a flexible extension of ourselves, not a fixed channel. The message finds the most convenient medium for the context or user preference. This means a single piece of content (say, a report or a story) can have many lives – as a written article, an audio narration, a video with visuals, an interactive simulation – all generated from the same source. The implications are profound for accessibility and personalization: people can consume information in whatever form suits them best. Time-poor executive on the go? Have that dense report read out to you as a podcast. Visual learner? See the data as an infographic or animation. The AI acts as a universal translator of mediums.
When the medium can transform at will, the message becomes pure potential, awaiting the perfect form for maximum impact.
The Strategic Power of Medium-Matching
The brilliance of AI-powered Liquid Content lies not in the transformation itself but in medium-matching: fitting content to the ritual and context that maximize its value.
Take Google's Audio Overviews, which converts input text into a podcast in which two hosts discuss the content back and forth. An ideal use case is redesigning executive engagement. A significant share of white-collar employees have prepared a detailed presentation that was meant to be read but that nobody does more than glance over. Executives who enter the boardroom expected to have pre-digested the information instead end up improvising off the cuff. Audio Overviews does more than convert slides to audio; it transforms the executive information-consumption ritual. Converting a 100-slide deck into an 8-minute podcast doesn't merely save time; it changes where and how executives consume information. Listen to the deck on your morning commute instead of carving out 15 minutes at the office that never materialize.
The ritual shift is profound: from desk-bound, focused reading to mobile, parallel processing during a commute. Same information, radically different engagement pattern. This isn't convenience; it's unlocking trapped value through medium transformation. The deck sitting unread in an inbox becomes actionable intelligence absorbed during otherwise lost time.
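Google has not published how Audio Overviews works internally, but the general shape of such a deck-to-podcast pipeline is easy to sketch. In the illustration below, the deck format is invented, and the dialogue-drafting step is a trivial stub standing in for an LLM; a real pipeline would end by handing each line to a per-speaker text-to-speech voice and concatenating the audio.

```python
# Sketch of a deck-to-podcast pipeline: extract slide text, draft a
# two-host dialogue (stubbed here in place of an LLM), render a script.

def extract_slide_text(deck: list[dict]) -> str:
    """Flatten a deck (a list of {"title", "bullets"} dicts) to text."""
    parts = []
    for slide in deck:
        parts.append(slide["title"])
        parts.extend(slide.get("bullets", []))
    return "\n".join(parts)

def draft_dialogue(source_text: str) -> list[tuple[str, str]]:
    """Stub for the LLM step that rewrites material as a conversation.
    Here we just alternate hosts over the lines to show the data flow."""
    hosts = ["Host A", "Host B"]
    lines = [ln for ln in source_text.splitlines() if ln.strip()]
    return [(hosts[i % 2], ln) for i, ln in enumerate(lines)]

def render_script(dialogue: list[tuple[str, str]]) -> str:
    """In a real pipeline, per-speaker TTS synthesis would replace this."""
    return "\n".join(f"{speaker}: {line}" for speaker, line in dialogue)

deck = [
    {"title": "Q3 Results", "bullets": ["Revenue up 12%", "Churn at 3%"]},
    {"title": "Risks", "bullets": ["Supply chain exposure in APAC"]},
]
print(render_script(draft_dialogue(extract_slide_text(deck))))
```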
The strategic power of medium-matching extends into professional workflows. A compelling example comes from the sports entertainment industry, where Video Alchemy, developed through a collaboration between MLSE, AWS, and TwelveLabs, transformed how sports marketing teams relate to archived content.
Traditional video production followed a linear, compartmentalized process: producers brainstormed ideas, developed initial scripts, conducted searches on media asset management systems, and then prepared edits in video editing software. This segmented workflow created natural barriers between ideation and execution, limiting both speed and creative exploration. By leveraging multimodal embeddings that made video content discoverable through natural language descriptions, Video Alchemy enabled new forms of creative discovery.
The technology compressed the ideation-to-creation workflow by enabling producers to interact conversationally with an AI system that understood the video content library at a deep level. This transformed what had been distinct phases (creative ideation separate from technical implementation) into a unified, fluid process. Declarative editing through natural-language commands eliminated the cognitive gap between imagining a concept and visualizing its execution.
By condensing ideation and verification into a continuous conversation, Video Alchemy created a new collaborative dynamic between human creativity and machine capability. Producers could explore possibilities more freely, receive immediate visual feedback, and iterate rapidly—establishing an entirely new creative rhythm that had been impossible in the traditional segmented workflow. The medium transformation created a new ritual of creation, demonstrating how AI can reshape not just what we consume, but how we create.
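A compressed version of that workflow can be sketched in a few lines. The clip library, descriptions, and keyword-overlap scoring below are invented stand-ins; the actual system ranks footage with multimodal embeddings, and TwelveLabs' real API is not reproduced here. What matters is the shape: a natural-language beat goes in, a timestamped edit list comes out.

```python
# Sketch of conversational clip retrieval: describe beats in natural
# language, retrieve timestamped segments, assemble an edit list.
# search_clips() uses keyword overlap as a stand-in for embedding search.

from dataclasses import dataclass

@dataclass
class Clip:
    file: str
    start: float   # seconds into the source file
    end: float
    description: str

LIBRARY = [
    Clip("game_0412.mp4", 1821.0, 1828.5,
         "buzzer beater three pointer crowd eruption"),
    Clip("game_0412.mp4", 2410.0, 2415.0,
         "coach reaction on the bench"),
    Clip("game_0397.mp4", 905.0, 912.0,
         "fast break dunk in the fourth quarter"),
]

def search_clips(query: str, k: int = 1):
    """Keyword-overlap stand-in for embedding-based semantic search."""
    terms = set(query.lower().split())
    ranked = sorted(LIBRARY,
                    key=lambda c: -len(terms & set(c.description.split())))
    return ranked[:k]

def assemble_edit(beats):
    """Turn a list of natural-language beats into a simple edit list."""
    return [(clip.file, clip.start, clip.end)
            for beat in beats for clip in search_clips(beat)]

beats = ["fast break dunk", "coach reaction", "buzzer beater crowd"]
for file, start, end in assemble_edit(beats):
    print(f"{file} [{start:.1f}s - {end:.1f}s]")
```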
Beyond organizational workflows, perhaps the most profound medium transformation occurs in the realm of artistic expression. The translation of Shane Guffogg's paintings into music through AI represents a fundamental breakthrough in how we experience creative works. Just as TikTok altered the ritual of content consumption and Video Alchemy transformed the ritual of content creation, this cross-modal art translation reshapes the ritual of aesthetic engagement itself.
Traditional artistic mediums have established specific engagement patterns over centuries. A painting creates a ritual of contemplation—static, spatial, visual, encouraging viewers to stand before it, absorbing its composition and meaning through sight alone. Music, conversely, creates a ritual of temporal, auditory immersion, where meaning unfolds sequentially through time rather than space. These distinct ritual experiences have historically remained separate, with different art forms demanding different modes of attention and appreciation.
When AI bridges these sensory domains, it creates an entirely new ritual of multisensory artistic communion. The painting no longer exists solely as a visual artifact but expands into an integrated experience that engages multiple cognitive pathways simultaneously. This transformation parallels how Audio Overviews changed executive information processing and how Video Alchemy altered creative workflows but extends further by merging previously incompatible sensory experiences into a unified whole.
This evolution represents the culmination of the medium transformation journey—from altered consumption patterns to redesigned workflows to fundamentally new forms of sensory engagement. The progression moves from communication to professional processes to our most fundamental modes of perceiving and experiencing the world through art. In each case, medium adaptation enables content to transcend its original constraints, speaking directly to human cognition in its most receptive form rather than being limited by traditional modality boundaries.
Content as Pure Potential
In a world of Liquid Content, content itself becomes pure potential: a quantum state of meaning that can materialize in whatever medium best serves the moment. This is a philosophical transformation in our understanding of media.
The TikTok engagement model could evolve beyond its current form, sensing when a viewer needs deeper engagement versus quick consumption. The same content might transform from a captioned short-form video into an interactive experience when the system detects sustained interest or condenses further when attention is fleeting—all while preserving the core message.
Similarly, Spotify’s signal that audiobooks, as an auditory medium, have more in common with music than with physical books tells us how content gains new lives as it is transformed. The same executive report can be distributed as a podcast to one person, as a video to another, and as a written document to a third. By tapping into the medium that fits the habits and rituals of a given person, Liquid Content enables maximum engagement. The content itself would sense the ritual taking place for the person – focused desk work versus commuting – and flow into the appropriate vessel. This isn’t one piece of content following the same person into different mediums, but different people discovering the same content in their respective mediums.
There are even more transformative possibilities in educational contexts. Imagine individualized curricula that detect when a student struggles with a concept and shift modality to whatever best suits that student and moment – from text to simulation, visualization, or guided interactive experience. The content would dynamically reshape itself based on learning patterns and comprehension levels, creating personalized ritual pathways toward understanding.
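What might that ritual sensing look like in practice? Purely as a speculative sketch: the router below maps context signals to a rendering choice. Every signal name and rule here is hypothetical, invented to show the shape of the idea rather than any shipping system.

```python
# Speculative sketch of a context-aware medium router: the same content
# flows into a different vessel depending on the consumption ritual.

from dataclasses import dataclass

@dataclass
class Context:
    activity: str          # e.g. "commuting", "desk_work", "in_meeting"
    screen_available: bool
    audio_available: bool

def choose_medium(ctx: Context) -> str:
    """Map a consumption context to the medium that fits its ritual."""
    if ctx.activity == "commuting" and ctx.audio_available:
        return "podcast"           # hands and eyes busy; ears free
    if ctx.activity == "in_meeting" and ctx.screen_available:
        return "captioned_video"   # TikTok-style: consumable on mute
    if ctx.activity == "desk_work":
        return "written_report"    # focused reading ritual
    return "summary_text"          # safe fallback

for ctx in [Context("commuting", False, True),
            Context("in_meeting", True, False),
            Context("desk_work", True, True)]:
    print(ctx.activity, "->", choose_medium(ctx))
```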
This is the ultimate vision of McLuhan's thesis in the AI age: content freed from medium constraints, automatically flowing into the vessel that creates the optimal ritual for the moment. The medium is no longer merely the message; it is adaptive, responsive, and context-aware.
This transformation demands a new philosophical framework. If content can exist in any form, what becomes of authorial intent? If the same information can create different rituals depending on its medium, how do we ensure the right medium for the right purpose?
The answer lies in understanding that medium selection is not about convenience but about ritual design. Each medium creates its own rituals of engagement, its own patterns of thought, its own cognitive frameworks. The strategic application of AI multimodality rests less on transformation capability than on matching content to the rituals that maximize its impact.
This is where McLuhan's insights become more crucial than ever. As we gain the power to transform mediums at will, we must become more conscious of how each medium shapes perception and behavior. The question is no longer "can we convert this?" but "should we convert this, and to what?"
The future belongs to those who master this new alchemy, who understand that the true power of AI-enabled multimodality lies not in the transformation itself but in matching content to the medium that creates the right ritual for the right moment. The goal is not making content accessible in different forms; it is making content transformative by placing it in the medium that maximizes its impact.
In this new world, content becomes truly liquid—flowing into whatever vessel best serves its essence. The medium is still the message, but now we control both.