Your AI export comes with a receipt.
If you've generated an image with Midjourney, a track with Suno, a video clip with Runway, a voice clone with ElevenLabs, an image with ChatGPT, or anything with Photoshop's Generative Fill — there is a block inside the file you downloaded that documents exactly how it was made, by which tool, on which date, with which model. None of this is in the visible content. All of it gets read in milliseconds at upload time by the platforms you publish to.
This is a tour of where that metadata lives, what each tool puts in there, and which platforms are reading it. If you want the action item up front: it's at the bottom, and it takes 30 seconds.
Your AI Export Comes With a Receipt
A media file is a wrapper around content. The wrapper is structured — JPEG headers, PNG chunks, MP4 boxes, MP3 frames — and inside that structure are reserved areas where metadata lives. EXIF for images, ID3 for MP3 audio, UDTA boxes for MP4 video and M4A audio, VORBIS_COMMENT for FLAC. These were originally designed for benign purposes: camera makes and models, song titles, copyright notices, GPS coordinates so your photos app could put your vacation on a map.
Then C2PA happened. The Coalition for Content Provenance and Authenticity is a standard backed by Adobe, Microsoft, OpenAI, Google, the BBC, Sony, Leica, and Arm — and what it specifies is a way to embed cryptographically signed manifests inside that same metadata structure. A C2PA manifest can declare which AI model produced the content, what edits were applied, who the creator is, where it was captured, and a chain of custody.
In 2026, every commercial AI tool you've heard of is writing C2PA. Some of them have been writing it for over two years and most users still don't know it's there.
The Three Layers: EXIF, XMP, and C2PA (And Why C2PA Is the New Problem)
There are three distinct metadata systems you might find in an AI-generated file. They serve different purposes and live in different parts of the structure.
EXIF — The original. Camera-focused. Stores GPS coordinates, exposure settings, camera make and model, capture timestamp, lens info. AI image tools sometimes write a synthetic EXIF block with the AI tool's name in the Software field — for example, Midjourney sets Software=Midjourney v6.1. EXIF is dense, well-structured, and read by basically every photo platform on earth.
XMP — Adobe's extensible XML-based metadata format. More flexible than EXIF, often used to carry richer attribution, edit history, and creator credits. Photoshop, Lightroom, and Premiere all heavily write XMP. Generative Fill and other Adobe AI features extend XMP with assertions about what was AI-generated.
C2PA — The new one. Stored in a JUMBF box (JPEG Universal Metadata Box Format) inside the file's metadata region. C2PA manifests are cryptographically signed, meaning they can't be selectively edited without invalidating the signature. They contain assertions like this asset was generated by Suno using model v4.0 on 2026-04-23 or this image had a generative fill applied to coordinates X,Y by Photoshop 26.1. Removing C2PA isn't a matter of editing one field — you have to strip the whole manifest cleanly. (What C2PA actually is, in detail.)
A single AI-generated file often contains all three layers. EXIF for legacy compatibility. XMP for the application-specific story. C2PA for the cryptographic proof. Strip one and you've still got the others talking.
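A quick way to see which layers a given file carries is to scan its raw bytes for each layer's signature. The marker strings below (the EXIF APP1 header, the XMP namespace URI, and the JUMBF box type that holds C2PA) are real signatures, but a byte scan is only a first look; a proper check walks the container structure. A minimal Python sketch:

```python
# Rough sketch: report which metadata layers appear in a file by scanning
# for each layer's signature bytes. A real parser walks the container
# structure; this is just a quick first look.
SIGNATURES = {
    "EXIF": b"Exif\x00\x00",                  # EXIF header inside APP1
    "XMP":  b"http://ns.adobe.com/xap/1.0/",  # XMP packet namespace URI
    "C2PA": b"jumb",                          # JUMBF superbox type (carries C2PA)
}

def detect_layers(data: bytes) -> list[str]:
    """Return the metadata layers whose signatures occur in the raw bytes."""
    return [name for name, sig in SIGNATURES.items() if sig in data]

# Synthetic blob containing EXIF and JUMBF markers but no XMP.
blob = b"\xff\xd8\xff\xe1\x00\x10Exif\x00\x00...more bytes...jumbc2pa"
print(detect_layers(blob))  # ['EXIF', 'C2PA']
```

The same scan works on image, audio, and video files alike, since it ignores the container entirely.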

What Midjourney Embeds (The Exact Fields)
A Midjourney v6.1 PNG export contains, at minimum:
- EXIF Software: Midjourney v6.1
- EXIF ImageDescription: the prompt used to generate the image, sometimes with --ar, --style, and other parameters appended
- XMP xmp:CreatorTool: Midjourney
- XMP dc:creator: in some versions, the Discord username of the requester
- C2PA assertion: a signed manifest declaring AI generation, with the model version and a timestamp
The prompt being in the EXIF is the part most creators don't realize. If you publish a Midjourney render directly without stripping, the literal text of your prompt is readable by anyone who pulls the file into a metadata viewer. That's not a privacy concern in the GDPR sense — but it's an attribution problem if the prompt is part of your creative process you don't want shared, and it's a competitive concern if your prompt is the result of expensive iteration.
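You can verify this yourself by walking the PNG chunks directly; the format is just length, type, data, CRC, repeated until the IEND chunk. The sketch below builds a tiny synthetic PNG carrying hypothetical Midjourney-style tEXt entries (real exports may use iTXt or eXIf chunks instead; the field values here are made up) and reads them back:

```python
import struct
import zlib

PNG_SIG = b"\x89PNG\r\n\x1a\n"

def png_text_chunks(data: bytes) -> dict[str, str]:
    """Walk PNG chunks and collect tEXt keyword/value pairs."""
    assert data.startswith(PNG_SIG), "not a PNG"
    out, pos = {}, len(PNG_SIG)
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            key, _, val = body.partition(b"\x00")
            out[key.decode("latin-1")] = val.decode("latin-1")
        pos += 8 + length + 4      # skip data plus 4-byte CRC
        if ctype == b"IEND":
            break
    return out

def chunk(ctype: bytes, body: bytes) -> bytes:
    """Build one well-formed PNG chunk: length, type, data, CRC."""
    return (struct.pack(">I", len(body)) + ctype + body
            + struct.pack(">I", zlib.crc32(ctype + body)))

# Synthetic PNG with hypothetical Midjourney-style fields.
png = (PNG_SIG
       + chunk(b"tEXt", b"Software\x00Midjourney v6.1")
       + chunk(b"tEXt", b"Description\x00a red fox, --ar 16:9")
       + chunk(b"IEND", b""))
print(png_text_chunks(png))
# {'Software': 'Midjourney v6.1', 'Description': 'a red fox, --ar 16:9'}
```

Point the parser at one of your own exports instead of the synthetic bytes and the prompt, if present, falls out in plaintext.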

What Suno and Udio Put in an Audio File
Suno exports an MP3 or M4A. Inside, depending on which export option you used:
- ID3 TENC (Encoder): the Suno model version
- ID3 COMM (Comment): in some exports, the prompt or song specification
- iTunes-style atoms in the M4A udta (User Data) box: encoder string, software, sometimes the Suno track ID
- C2PA assertion: model name, generation timestamp, signed by Suno's certificate
- Album artwork: Suno-default artwork unless you replace it, and that artwork carries its own metadata
Udio's structure is similar. Both write a udta box in M4A exports — that's where C2PA assertions live in audio container formats, alongside or instead of standard ID3. The fact that audio C2PA exists at all is recent enough that most metadata-removal tutorials online haven't caught up to it.
ElevenLabs voice exports follow the same pattern — model name in the encoder field, C2PA assertion in the container, sometimes a watermark in the audio itself (the audio watermark is a different system, covered in the Spotify post).
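ID3v2 is just as walkable as PNG: a 10-byte tag header, then a sequence of frames. The sketch below builds and parses a simplified ID3v2.3 tag with hypothetical Suno-style values; real COMM frames also carry a language code and a description field, which are omitted here for brevity:

```python
import struct

def synchsafe(n: int) -> bytes:
    """Encode an int as a 4-byte ID3 synchsafe size (7 bits per byte)."""
    return bytes((n >> s) & 0x7F for s in (21, 14, 7, 0))

def id3_text_frames(data: bytes) -> dict[str, str]:
    """Parse an ID3v2.3 tag and return its text frames (TENC, COMM, ...)."""
    assert data[:3] == b"ID3", "no ID3v2 tag"
    tag_size = sum((b & 0x7F) << s for b, s in zip(data[6:10], (21, 14, 7, 0)))
    out, pos, end = {}, 10, 10 + tag_size
    while pos + 10 <= end:
        fid, size, _flags = struct.unpack(">4sIH", data[pos:pos + 10])
        if fid == b"\x00\x00\x00\x00":
            break                              # hit the padding region
        body = data[pos + 10:pos + 10 + size]
        out[fid.decode()] = body[1:].decode("latin-1")  # skip encoding byte
        pos += 10 + size
    return out

def frame(fid: bytes, text: str) -> bytes:
    """Build a simplified ID3v2.3 text frame (encoding byte 0 = latin-1)."""
    body = b"\x00" + text.encode("latin-1")
    return fid + struct.pack(">IH", len(body), 0) + body

# Synthetic tag with hypothetical Suno-style encoder and comment values.
frames = frame(b"TENC", "Suno v4.0") + frame(b"COMM", "dream pop, female vocal")
tag = b"ID3\x03\x00\x00" + synchsafe(len(frames)) + frames
print(id3_text_frames(tag))
```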
What Runway and Pika Leave in Video
Runway, Pika, Sora, and similar AI video tools export MP4 or MOV files. The metadata sits in the udta box at the file level (or per-track inside the moov atom). What's in there:
- ISO BMFF udta box: encoder name (Runway, Pika), software version, generation timestamp
- moov.udta per video track: secondary metadata, sometimes including the prompt
- C2PA assertion: signed manifest with model name and date — increasingly hardware-anchored as AI video tools partner with content authentication providers
- GPS-style coordinates: in some workflows, a synthetic location based on the user's account billing region
Video metadata is structurally trickier than audio or image because MP4 and MOV both allow nested metadata at multiple levels — file-level, track-level, sample-level. A naive "strip metadata from MP4" tool that only touches the file-level udta box can miss assertions written deeper in the structure. Removing C2PA from a video reliably means walking the whole box tree.
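Walking the box tree is straightforward in code: every ISO BMFF box starts with a 4-byte big-endian size and a 4-byte type, and container boxes hold child boxes. This sketch (box payloads and the Runway/Pika strings are synthetic; it also skips real-world edge cases like 64-bit sizes) shows why a file-level scan misses nested udta boxes:

```python
import struct

# Box types that contain child boxes and should be recursed into.
CONTAINERS = {b"moov", b"trak", b"udta", b"mdia", b"minf", b"stbl"}

def walk_boxes(data: bytes, path: str = "") -> list[str]:
    """Recursively list box paths in an ISO BMFF (MP4/MOV) byte stream."""
    found, pos = [], 0
    while pos + 8 <= len(data):
        size, btype = struct.unpack(">I4s", data[pos:pos + 8])
        name = path + btype.decode("latin-1")
        found.append(name)
        if btype in CONTAINERS:        # descend into container boxes
            found += walk_boxes(data[pos + 8:pos + size], name + ".")
        pos += size
    return found

def box(btype: bytes, payload: bytes) -> bytes:
    """Build one box; the size field covers the 8-byte header too."""
    return struct.pack(">I", 8 + len(payload)) + btype + payload

# Synthetic file: a file-level udta plus a second udta buried in moov.trak.
mp4 = (box(b"udta", b"Runway")                          # the one naive tools find
       + box(b"moov", box(b"trak", box(b"udta", b"Pika"))))
print(walk_boxes(mp4))
# ['udta', 'moov', 'moov.trak', 'moov.trak.udta']
```

A stripper that only handles the first udta leaves moov.trak.udta, and anything signed inside it, untouched.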

How to Read Your Own File's Metadata (Hex Editor or Metadata Viewer)
Three ways, in order of difficulty.
Easiest: drag the file to a metadata viewer. metadata2go.com reads images, video, and audio and renders the metadata as plain text. It's the same tool the Metadata Cleaner support page recommends for verification.
Medium: install ExifTool — open-source, free, command-line. exiftool yourfile.png dumps everything ExifTool can find, including most C2PA assertions. The output is verbose; pipe it through less or grep for specific fields.
Hardest, but instructive: open the file in a hex editor (Hex Fiend on macOS, HxD on Windows). Scroll through the first few KB. EXIF blocks look like ASCII strings — you'll spot Midjourney, Suno, Adobe Photoshop in plaintext if they're there. C2PA manifests are inside JUMBF boxes; you'll see the magic string jumb in the bytes near the metadata region. This is overkill for anyone who isn't curious about the underlying file structure, but it makes the abstraction concrete.
If you've never looked at the metadata in your own AI exports, do it once. Most creators who do are surprised at the amount of identifying information sitting in plain sight.
Which Platforms Read Which Tags
| Platform | Reads C2PA | Reads EXIF software | Reads XMP edit history | Applies AI label automatically |
|---|---|---|---|---|
| Spotify | Yes (audio) | Yes (encoder field) | n/a | Yes |
| TikTok | Yes (video, image) | Yes | Sometimes | Yes |
|  | Yes (video, image) | Yes | Sometimes | Increasingly |
| YouTube | Yes (video) | Yes | Sometimes | Yes for Shorts |
|  | Yes (image, video) | Yes | Yes | Yes |
| Apple Music | Yes (audio) | Yes | n/a | Yes |
| Tidal | Yes (audio) | Yes | n/a | Yes |
| Getty / Shutterstock | Yes — explicitly checks for C2PA | Yes | Yes | Rejection rather than label |
The pattern: every major platform reads C2PA where it's present, every major platform reads the encoder field, and every major platform that has an AI labeling policy at all uses these signals to apply the label without asking the creator.
The platforms that don't currently apply AI labels still read the metadata. They retain it. They use it for internal classification. When their labeling policy changes, the metadata you uploaded last year is still there.
How to Strip All Three Layers in One Pass
Concrete workflow:
- Open metadatacleaner.app in your browser.
- Drop the file (image, video, or audio).
- Click Clean. The app processes the file entirely on your device — EXIF, XMP, C2PA, ID3, UDTA, VORBIS_COMMENT, RIFF, all of it. Nothing is uploaded.
- Click Download. The cleaned file has the same audio, the same pixels, the same video frames, just no metadata wrapper.
For images, the tool re-encodes through the canvas API, which means a fresh file with zero embedded metadata of any kind. For audio and video, it does in-place byte manipulation on the metadata containers — the actual sound and pixels are byte-identical to the original.
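For JPEG specifically, the structural part of a one-pass strip is simple: metadata lives in APPn segments between the start-of-image marker and the scan data, so dropping those segments removes EXIF, XMP, and JUMBF in a single walk. A simplified sketch (a production tool would keep the APP0/JFIF header and handle edge cases like padding bytes; the sample bytes are synthetic):

```python
def strip_app_segments(jpeg: bytes) -> bytes:
    """Drop APPn metadata segments (where EXIF/XMP/JUMBF live) from a JPEG,
    copying everything from the start-of-scan marker onward verbatim."""
    assert jpeg[:2] == b"\xff\xd8", "not a JPEG"
    out, pos = bytearray(b"\xff\xd8"), 2
    while pos + 4 <= len(jpeg):
        marker = jpeg[pos:pos + 2]
        length = int.from_bytes(jpeg[pos + 2:pos + 4], "big")
        if marker == b"\xff\xda":                # SOS: image data follows
            out += jpeg[pos:]
            break
        if not (0xE0 <= marker[1] <= 0xEF):      # keep non-APPn segments
            out += jpeg[pos:pos + 2 + length]
        pos += 2 + length                        # length field counts itself
    return bytes(out)

def seg(marker: bytes, body: bytes) -> bytes:
    """Build one JPEG segment: marker, 2-byte length, payload."""
    return marker + (len(body) + 2).to_bytes(2, "big") + body

# Synthetic JPEG: SOI, an APP1 "Exif" segment, a quantization table, SOS.
jpeg = (b"\xff\xd8"
        + seg(b"\xff\xe1", b"Exif\x00\x00fake-exif")
        + seg(b"\xff\xdb", b"\x00" * 4)
        + seg(b"\xff\xda", b"\x00scan-data"))
clean = strip_app_segments(jpeg)
print(b"Exif" in clean)  # False
```

The same segment walk is why canvas re-encoding works so cleanly for images: the pixels are decoded and rewritten, and none of the APPn segments ever make it into the new file.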
FAQ
Is metadata the same thing as a watermark?
No. Metadata is structured data attached to the file, separate from the content. A watermark is a modification to the content itself — visible (a logo) or invisible (audio frequencies, pixel-level patterns). Stripping metadata removes the first kind. Watermarks survive metadata stripping.
If I screenshot an AI image, does the metadata go with it?
No. A screenshot creates a new file from your screen's pixels — the metadata of the original isn't copied. But the screenshot itself carries metadata about your screenshot tool and operating system. Worth knowing if you're using a screenshot to defeat C2PA: you've also created a new metadata trail. (What's actually in a screenshot.)
Does compressing an image strip metadata?
Sometimes. JPEG re-compression in a browser typically drops EXIF and most other metadata. JPEG re-compression with ImageMagick or Photoshop preserves it by default — those tools were built to retain metadata. So "yes" if you go through a stripping pipeline, "no" if you go through a professional editing tool.
What about file conversion? Does converting MP3 to WAV remove the C2PA?
Often yes — the WAV format has a different metadata structure than MP3, and most converters don't carry assertions across formats. But this is unreliable and tool-dependent. Some converters preserve metadata where they can. Don't count on a convert-to-strip workflow.
Why do AI tools embed this metadata if it's hurting their users' reach?
Because the AI tools have their own incentive to comply with C2PA — pressure from regulators, pressure from platforms, pressure from the broader provenance-and-authenticity push that's been building since 2023. The tools embed the metadata. The creator is the one paying the cost of platform throttling.
Can platforms detect AI without metadata?
Yes, increasingly. Spectral classifiers for audio, pixel-pattern classifiers for images, motion-pattern classifiers for video. They're imperfect but improving. Stripping the metadata removes the easy answer; it doesn't remove the question entirely. (How TikTok specifically handles this for video.)
Strip what's strippable. Be honest with yourself about what isn't. Drop your file into metadatacleaner.app — it processes locally, your file never leaves your device.
Images: hero — Photo from Pexels · audio interface — Photo from Pexels · AI chat dark — Photo from Pexels · audio editing — Photo from Pexels