Are We Reading the Same Brief?
Hidden text in legal documents has been a problem for years, from embarrassing metadata to failed redactions and accidental disclosures. I recently read about a “white text” tactic in academia though where hidden words appeared to game automated review and rankings. That made me wonder whether courts might face a similar issue in the near future. As judges and clerks begin to use GenAI tools to summarize filings and check compliance with court rules, will some litigants try to tuck instructions into the invisible layer that only machines read in order to tilt what those tools produce? I am not an AI security expert, so I welcome those who are to weigh in. This is a thought exercise. I hope it proves unnecessary.
Most judges trust what they can see on the page. A brief appears to meet the limit. A black bar appears to cover protected information. Then a search returns terms that do not appear in the visible text. Is that a glitch, an OCR artifact, or a hint that invisible content is shaping how software reads the filing? The practical question is whether our systems sometimes ingest one document while the judge reads another.
By “white text,” I mean real characters inside a PDF that a person does not see. As I understand it, this can happen when the font color matches the background, when text is marked as hidden or set in tiny type, or when words are nudged just off the page so they will not show in a normal view. In many scanned filings there is also an OCR-created text layer that makes the document searchable. That layer sits behind the page image. A human looks at the picture. The computer reads the text.
From what I’ve read, many search or analytics tools can read any text objects in the file, including characters that are not visible on the page. That helps explain past surprises like failed redactions or leftover boilerplate language. Products and settings differ, so I do not claim this is universal, but the general pattern seems consistent. People focus on what the page shows. Software often reads whatever text the file contains. A PDF can hold both the visual page and a separate text layer at the same time, and if text is present anywhere, many tools may pick it up even when a person cannot see it.
GenAI shifts the stakes. When a GenAI assistant summarizes a brief, it reads the full text layer rather than only what appears to the eye. And if large language models are built to follow instructions they find in text, unless the tool is constrained, it may not distinguish between directions in a standing order and directions buried in a filing by a bad actor. If that is right, invisible instructions could bias a summary, skew a compliance check, or nudge a triage system. The method is simple. The consequences are not. If one party can deliver guidance to the court’s tool that the judge never sees, the adversarial system is in trouble.
You can picture the outcomes. A first-pass summary leans hard into one argument and ignores another because hidden instructions told the tool to do so. An automated review blesses a defective filing because an invisible sentence told the court’s case management system to confirm compliance. A prioritization routine moves a matter to the front because the hidden layer says treat as urgent.
I hope that I am wrong and that this article is way off base. It may be harder than I think to influence an AI tool this way. Current court tools may already neutralize this risk and vendors may strip hidden layers by default. And if that is true, the worry fades.
The point is to surface the question before the first hard case. Courts run on trust that the record a judge reads is the record the system processes. If hidden text can break that link, we should know now and decide what to do about it together.
