The fingerprint hidden in plain sight

EchoMark

May 19, 2026

How EchoMark uses steganography techniques to invisibly watermark text

Text-based content, including emails and documents, sits at the center of most insider leak incidents, yet it is uniquely difficult to protect. Other security tools, including data loss prevention (DLP), information rights management (IRM), and security information and event management (SIEM), work at the level of files, devices, or network channels. They protect the container (the file, device, or channel holding the content) but not the words themselves. The moment someone with legitimate access reads a document, the content itself is free. It can be copied into a new message, retyped from memory, dictated to someone over a call, or pasted into an AI tool and reformulated. Each of those methods bypasses the controls that organizations typically rely upon.

EchoMark's text watermarking addresses this by using steganographic methods to place a forensic identifier inside the text itself, rather than in the file or the channel carrying it. It works through two distinct techniques: imperceptible character-level adjustments unique to each copy, and a second approach that addresses the scenario where a document never surfaces at all.

Invisible adjustments, unique to every copy

When marking text, EchoMark creates individualized copies by making imperceptible adjustments to character spacing, line positioning, and Unicode character selection. These adjustments are invisible to readers, but together they encode a unique identifier for each recipient. A CEO’s company-wide announcement to 40,000 employees produces 40,000 distinct copies, each uniquely tied to the person who received it. Margins stay the same. Word choices are untouched. Two copies distributed to different people are indistinguishable on screen and in print. What differs is the precise visual rendering of the characters, which EchoMark's AI-powered computer vision reliably detects. We continue to refine these techniques, making differences finer and increasingly undetectable without advanced computer vision.
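To make the idea of Unicode character selection concrete, here is a minimal sketch of a generic text steganography technique. It is not EchoMark's implementation. It assumes a 16-bit recipient identifier and hides each bit in the choice between an ordinary space and a similar-looking Unicode space. The character pair, the bit width, and the function names embed_id and extract_id are all illustrative assumptions, and how close the two spaces look depends on the font.

```python
# Hypothetical sketch of one generic steganographic technique.
# The carrier characters and the 16-bit width are assumptions for illustration.
PLAIN_SPACE = "\u0020"  # ordinary space, encodes bit 0
MARK_SPACE = "\u2005"   # FOUR-PER-EM SPACE, encodes bit 1
ID_BITS = 16            # enough for 65,536 distinct recipients


def embed_id(text: str, recipient_id: int) -> str:
    """Return a copy of text whose first ID_BITS spaces encode recipient_id."""
    if not 0 <= recipient_id < 2 ** ID_BITS:
        raise ValueError("recipient_id does not fit in ID_BITS bits")
    if text.count(PLAIN_SPACE) < ID_BITS:
        raise ValueError("text has too few spaces to carry the identifier")
    out = []
    bit_index = 0
    for ch in text:
        if ch == PLAIN_SPACE and bit_index < ID_BITS:
            # Write the identifier most significant bit first.
            bit = (recipient_id >> (ID_BITS - 1 - bit_index)) & 1
            out.append(MARK_SPACE if bit else PLAIN_SPACE)
            bit_index += 1
        else:
            out.append(ch)
    return "".join(out)


def extract_id(marked: str) -> int:
    """Read the identifier back from the first ID_BITS carrier spaces."""
    value = 0
    seen = 0
    for ch in marked:
        if ch in (PLAIN_SPACE, MARK_SPACE):
            value = (value << 1) | int(ch == MARK_SPACE)
            seen += 1
            if seen == ID_BITS:
                return value
    raise ValueError("text has too few carrier spaces")
```

In this sketch the words and the character count stay the same in every copy, and only the space code points differ. A production system would need redundancy across the whole document so that a short excerpt still carries the identifier.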

Importantly, these text watermarks are not a visible stamp or file metadata. This distinction matters for survivability. Sophisticated users can remove or bypass visible stamps. Metadata can be stripped from a file when it's resaved or reformatted. Character-level spacing, however, travels with the text itself. A photo taken on a personal device, a screenshot of a document, a printout scanned and uploaded elsewhere, or even an email that is copied and pasted as plain text will preserve the visual fingerprint, because the mark is in the rendered characters, not in an attribute that can be easily removed. The video below demonstrates these micro-adjustments, which make every copy unique but remain undetected when an individual is reading their copy of the information.

When a source is quoted but no document surfaces

Character-level marking solves the case where a portion of a document appears publicly, for example a leaked email or PDF report that is published on the internet. But some leaks never include an original copy of the leaked document. A reporter writes that "a source familiar with the matter" confirmed details of an internal strategy. In this case, there is nothing visual to upload and analyze.

EchoMark's AI rephrasing protection addresses this scenario. Rather than marking individual characters, AI rephrasing varies the wording of phrases within the email, so that each recipient's copy carries its own pattern of word choices. Readers, who only ever see their own copy, cannot tell which phrases were varied, but the pattern is detectable by AI.

Consider a sentence like: "The board has not approved the Q3 budget yet." AI rephrasing might create alternative versions of this sentence such as: "The board has not yet approved the Q3 budget" or "The board hasn’t approved the Q3 budget yet." The words differ, but the underlying message (the subject, action, and qualifier) is semantically the same. A single sentence can have several versions, and because the choices multiply across sentences, rephrasing applied to an entire email can yield millions, billions, or even trillions of unique permutations that can be sent to different users. When a source reconstructs information from memory or feeds a document through an AI tool to reword it, the resulting text typically preserves enough of the original pattern for the fingerprint to hold.
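The arithmetic behind those numbers is simple multiplication: an email with 20 rephrasable sentences and three versions of each allows 3^20 = 3,486,784,401 distinct copies. The sketch below shows one way a recipient number could be mapped to a unique combination of versions. It is an illustration, not EchoMark's method. The second sentence in the table and the names VARIANTS, total_copies, choices_for, and render are hypothetical.

```python
from math import prod

# Hypothetical variant table: one list of interchangeable phrasings per sentence.
# The first entry uses the example above; the second is invented for illustration.
VARIANTS = [
    [
        "The board has not approved the Q3 budget yet.",
        "The board has not yet approved the Q3 budget.",
        "The board hasn't approved the Q3 budget yet.",
    ],
    [
        "We expect a decision next week.",
        "A decision is expected next week.",
    ],
]


def total_copies(variants: list[list[str]]) -> int:
    """Number of distinct copies: the product of the per-sentence variant counts."""
    return prod(len(options) for options in variants)


def choices_for(recipient_index: int, variants: list[list[str]]) -> list[int]:
    """Map a recipient index to one variant per sentence (mixed-radix digits)."""
    if not 0 <= recipient_index < total_copies(variants):
        raise ValueError("not enough permutations for this recipient index")
    picks = []
    for options in variants:
        recipient_index, digit = divmod(recipient_index, len(options))
        picks.append(digit)
    return picks


def render(recipient_index: int, variants: list[list[str]]) -> str:
    """Assemble the copy of the text that belongs to one recipient."""
    picks = choices_for(recipient_index, variants)
    return " ".join(options[pick] for options, pick in zip(variants, picks))
```

With this table, total_copies(VARIANTS) is 6, and each index from 0 to 5 renders a different text. Adding sentences multiplies the count, which is how the total grows so quickly.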

Starting a leak investigation

When a watermarked document surfaces where it should not, the investigation can begin immediately. You upload the leaked artifact to EchoMark: a screenshot, a forwarded copy, a photo of a printout, or a fragment of quoted text. EchoMark’s AI analyzes the artifact and compares it against every uniquely marked copy that was distributed. Within minutes, the system returns an investigation report that identifies which recipient’s copy matches the leaked artifact, along with a confidence score and details about the original version and marked copies.
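For the quoted-text case, the comparison step can be pictured with a very small sketch. This is not EchoMark's analysis: it assumes the distributed copies are available as plain strings keyed by recipient, and it uses the similarity ratio from Python's standard difflib module as a rough stand-in for a confidence score. The function name rank_recipients and the sample recipients are hypothetical.

```python
from difflib import SequenceMatcher


def rank_recipients(leaked: str, copies: dict[str, str]) -> list[tuple[str, float]]:
    """Rank recipients by how closely their marked copy resembles the leaked text.

    The score is difflib's similarity ratio (0.0 to 1.0), used here only as an
    illustrative stand-in for a real confidence score.
    """
    scores = []
    for recipient, copy_text in copies.items():
        matcher = SequenceMatcher(None, leaked.lower(), copy_text.lower(), autojunk=False)
        scores.append((recipient, matcher.ratio()))
    # Highest similarity first.
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical distribution list: each recipient received a different phrasing.
copies = {
    "alice": "The board has not approved the Q3 budget yet.",
    "bob": "The board has not yet approved the Q3 budget.",
    "carol": "The board hasn't approved the Q3 budget yet.",
}
ranked = rank_recipients("The board has not yet approved the Q3 budget.", copies)
best_recipient, best_score = ranked[0]
```

Here the leaked sentence is identical to the copy sent to "bob", so that recipient ranks first. A single sentence rarely separates thousands of recipients, which is why the pattern has to span many phrases across the message.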

Subscribe to this blog to follow the series, or schedule a demo to see EchoMark’s text watermarking in your own environment.