An unknown writer creates a disposable Reddit account. A new username, a new email, no personal details. They carefully avoid mentioning their location, workplace, or anything that could reveal their identity. Connections are only made via public Wi-Fi networks. They believe they are anonymous. But there’s a problem they haven’t thought about: their writing style is as distinctive as a fingerprint, and it follows them everywhere they go online.
Every post, email, comment, or message leaves behind a pattern. Writers develop habits they don’t even notice: choices in punctuation, word preference, sentence flow, and phrasing. Maybe the writer uses semicolons more than most. Maybe their sentences tend to be long, or short and clipped. Perhaps they favor certain expressions or consistently arrange their thoughts in a specific way. These patterns are consistent, measurable, and often traceable across platforms, even when the writer believes they’re hidden.
This is stylometry, the science of analyzing writing style. For investigators, it has become one of the most effective ways to connect anonymous or pseudonymous writing to real people.
Every anonymous post leaves a trail. Not in IP logs or metadata—but in the words themselves. To the trained eye, a writer’s style can expose the person behind the keyboard, no matter how many aliases they use.
What Is Stylometry?
Stylometry is the quantitative study of writing style. The name combines “style,” which traces back to the Latin “stilus” (a writing instrument), with the Greek “metron” (measure), and that’s exactly what it does: it measures writing. But stylometry doesn’t measure what you say. It measures how you say it.
Scholars have studied writing styles for centuries, aiming to identify the true authors of disputed historical texts or anonymous pamphlets. One well-known early example is The Federalist Papers, written in 1787 and 1788 to support the ratification of the U.S. Constitution. These essays were published anonymously under the pseudonym “Publius,” but it is now known that Alexander Hamilton, James Madison, and John Jay were the authors. However, for a handful of the essays, it was uncertain whether Hamilton or Madison was the author. In the 1960s, statisticians applied stylometric analysis to this question by analyzing patterns in the authors’ word choices, and concluded that Madison most likely wrote the disputed essays.
What’s changed since then is computing power. Analyzing writing style used to involve manually counting words and calculating frequencies by hand, which limited the amount of text you could analyze and how detailed your analysis could be. Now, computers can process millions of words in seconds, examining thousands of different features at once. This shift has transformed stylometry from an academic curiosity into a practical investigative tool. What once took months of painstaking work can now be done in hours, and the amount of detail we can extract from text has greatly increased.
How Writing Creates Identifiable Patterns
Here’s the key insight behind stylometry: most of your writing style is unconscious. When you write, you’re focused on what to say, not how to say it. The “how” is automatic, shaped by habits you’ve built over the years. These habits are so ingrained that they are difficult to notice or change.
Let’s consider something simple: sentence length. Some people naturally write long, flowing sentences with multiple clauses, commas, and dependent phrases, creating complex thoughts. Others prefer short, direct sentences that convey their message quickly. Most of us fall somewhere in between, but each person has a characteristic pattern. If you average the sentence lengths of everything you’ve written over the past year, you’ll find a consistent number. It may shift slightly depending on the context—more formal emails versus casual texts—but within similar situations, your sentence length pattern remains stable.
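To make this concrete, here is a minimal sketch of how sentence length could be measured. The function names are hypothetical, and the sentence splitter is deliberately naive (it breaks after a period, exclamation point, or question mark followed by whitespace), so abbreviations like “Dr.” would be miscounted. A real analysis would use a proper tokenizer.

```python
import re
from statistics import mean, pstdev


def sentence_lengths(text: str) -> list[int]:
    """Return the word count of each sentence in `text`."""
    # Naive split: break after ., ! or ? when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(s.split()) for s in sentences]


def sentence_length_profile(text: str) -> dict[str, float]:
    """Summarize sentence length as a mean and a spread."""
    lengths = sentence_lengths(text)
    if not lengths:
        return {"mean": 0.0, "stdev": 0.0}
    return {"mean": mean(lengths), "stdev": pstdev(lengths)}


sample = "I went home. It was late, and the streets were empty. Why?"
print(sentence_lengths(sample))         # [3, 8, 1]
print(sentence_length_profile(sample))
```

The spread is worth keeping alongside the average: two writers can share the same mean sentence length while one mixes very short and very long sentences and the other does not.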
This idea extends to hundreds of other features. Word length, for example, follows similar patterns. Some individuals have large vocabularies and tend to use longer, more complex words. Others stick to simpler, more common language. Neither is better; they are simply different and measurable. You can determine someone’s average word length, vocabulary diversity, and how often they use rare or common words, all contributing to their unique profile.
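A rough sketch of these lexical measurements follows. The function name and the three features it returns are illustrative choices, not a standard; the regular expression only recognizes plain ASCII letters and straight apostrophes, which is an assumption that will not hold for every text.

```python
import re
from collections import Counter


def lexical_profile(text: str) -> dict[str, float]:
    """Compute a few simple vocabulary features for `text`."""
    # Lowercase words, allowing one internal apostrophe (e.g. "don't").
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if not words:
        return {"avg_word_length": 0.0, "type_token_ratio": 0.0, "hapax_ratio": 0.0}
    counts = Counter(words)
    # Words that occur exactly once are a crude proxy for rare vocabulary.
    hapax = sum(1 for c in counts.values() if c == 1)
    return {
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "type_token_ratio": len(counts) / len(words),  # distinct words / total words
        "hapax_ratio": hapax / len(words),
    }


print(lexical_profile("the cat saw the dog"))
```

One caveat: the type-token ratio falls as a text gets longer, so it is only meaningful when the samples being compared are of similar length.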
Function words are small grammatical words that link sentences, such as “the,” “and,” “of,” “to,” “in,” “for,” “but,” “or,” and “with.” While they seem insignificant on their own, the way you use these words is highly distinctive and stable over time. For example, consider the phrases “the reason for this,” “the reason that this,” and “why this.” Although they mean similar things, they each use different function words, and you likely have a preferred choice that you use automatically. You make many subtle choices like this with these small words in every paragraph. Do you prefer “different from” or “different than”? Do you use contractions like “don’t” and “can’t,” or spell out “do not” and “cannot”? When starting a sentence, how often do you choose “however,” “but,” or “though”?
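As an illustration, function word usage can be turned into numbers by counting how often each word appears per 1,000 words. This is a minimal sketch: the word list is just the nine examples named above (real analyses typically use a much longer list), and the function name is hypothetical.

```python
import re
from collections import Counter

# Illustrative list only; taken from the examples in the text.
FUNCTION_WORDS = ["the", "and", "of", "to", "in", "for", "but", "or", "with"]


def function_word_rates(text: str, vocabulary=FUNCTION_WORDS) -> dict[str, float]:
    """Return occurrences per 1,000 words for each function word."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return {w: 1000 * counts[w] / total for w in vocabulary}


rates = function_word_rates("The dog and the cat went to the park")
print(rates["the"], rates["and"], rates["of"])
```

Normalizing to a rate rather than a raw count is what lets a short comment and a long report be compared on the same scale.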
These patterns extend to punctuation. Some people love using commas and do so liberally to divide their sentences into manageable parts. Others use them sparingly, preferring to let their sentences flow without interruption. Some people frequently use semicolons; others avoid them entirely. Exclamation points, question marks, parentheses, and em dashes—each of these has a usage pattern that varies from person to person.
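Punctuation habits can be counted the same way. The sketch below reports each mark per 100 words; the set of marks and the function name are assumptions chosen for illustration, and words are counted by simple whitespace splitting.

```python
from collections import Counter

# Marks to track; "\u2014" is the em dash. The selection is illustrative.
PUNCTUATION = [",", ";", ":", "!", "?", "(", "\u2014"]


def punctuation_rates(text: str, marks=PUNCTUATION) -> dict[str, float]:
    """Return occurrences per 100 words for each punctuation mark."""
    counts = Counter(ch for ch in text if ch in marks)
    words = len(text.split()) or 1  # avoid dividing by zero on empty input
    return {m: 100 * counts[m] / words for m in marks}


rates = punctuation_rates("Well, yes; no, maybe!")
print(rates[","], rates[";"], rates["?"])
```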
Your syntactic preferences are also important. Syntax refers to the structure of your sentences, or how you arrange words and phrases to convey meaning. Do you prefer active voice (“The investigator examined the evidence”) or passive voice (“The evidence was examined by the investigator”)? Do you often use subordinate clauses, or do you lean toward straightforward subject-verb-object sentences? Do you lead with important information at the start of sentences, or do you build up to it?
All these elements (lexical choices, function words, punctuation habits, and syntactic patterns) combine to form a distinctive style that’s uniquely yours. The key point is that this style remains largely consistent. It appears whether you’re composing a formal report or a casual social media post. It endures across different topics and contexts. It’s present whether you write under your real name or a pseudonym.
This consistency is why stylometry is effective for attribution. When investigators analyze anonymous texts such as threatening messages, leaked documents, or harassing emails, they can identify these stylistic patterns and compare them to known samples from suspects. If the patterns align closely enough, it’s evidence that they have the same author. The anonymous writer believed they were concealed, but their writing style revealed their identity.
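One way to picture the comparison step is as a distance between style profiles. The sketch below is loosely modeled on Burrows's Delta, a well-known stylometric measure: each function word rate is standardized across the candidate authors, and the distance is the average absolute difference of those standardized scores. It is a simplified illustration under stated assumptions, not a forensic method. The word list, function names, and the tiny sample texts are all hypothetical, and real use needs far longer texts and many more features.

```python
import re
from statistics import mean, pstdev

# Illustrative feature set; real studies use the most frequent words of a large corpus.
FUNCTION_WORDS = ["the", "and", "of", "to", "in", "for", "but", "or", "with", "that"]


def rates(text: str) -> list[float]:
    """Relative frequency of each function word in `text`."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    total = len(words) or 1
    return [words.count(w) / total for w in FUNCTION_WORDS]


def delta_distances(questioned: str, known: dict[str, str]) -> dict[str, float]:
    """Distance from the questioned text to each candidate (lower = more similar)."""
    profiles = {name: rates(text) for name, text in known.items()}
    columns = list(zip(*profiles.values()))           # one column per function word
    mus = [mean(col) for col in columns]
    sigmas = [pstdev(col) or 1.0 for col in columns]  # 1.0 avoids division by zero

    def zscores(vec: list[float]) -> list[float]:
        return [(v - m) / s for v, m, s in zip(vec, mus, sigmas)]

    zq = zscores(rates(questioned))
    return {
        name: mean([abs(a - b) for a, b in zip(zq, zscores(p))])
        for name, p in profiles.items()
    }


known = {
    "suspect_a": "the cat and the dog went to the park with the ball",
    "suspect_b": "but for that or this, of course in time",
}
scores = delta_distances("the cat and the dog went to the park with the ball", known)
print(min(scores, key=scores.get))  # suspect_a
```

A low distance only says that two texts are stylistically close relative to the other candidates. It ranks suspects; it does not prove authorship, and the true author may not be among the candidates at all.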
“The subconscious ways in which authors use seemingly insignificant words is an extremely effective marker for authorship attribution.” - Patrick Juola, Duquesne University
Real-World Applications in Investigation
So, where is this actually relevant? When do investigators utilize stylometry to link anonymous writings to specific individuals?
The most straightforward use is in criminal investigations. Anonymous threats are a frequent issue, whether they’re violent threats directed at schools, workplaces, or public figures, or threats related to extortion, stalking, or harassment. In these cases, the sender usually tries to conceal their identity, perhaps via burner email accounts, public computers, or anonymizing services. However, they still have to craft the message, and that writing can be examined.
If investigators have a suspect, they can analyze known writing samples, such as social media posts, emails from their usual account, or documents they’ve authored, and compare these to the threatening messages. The key question isn’t if the suspect might have written something threatening, but whether their linguistic patterns match. Did the same individual produce both texts?
Cybercrime investigations are increasingly employing stylometry. Communications from ransomware gangs can be analyzed, as can posts on underground forums. Criminals often use multiple pseudonyms to hide their activities, but stylometric analysis can uncover links if the same person operates different accounts.
One particularly important use is identifying the operators behind sock puppet accounts. A sock puppet is a fake online identity created to deceive others, often used to sway public opinion, harass someone while pretending to be multiple people, or give the false impression of support for a certain viewpoint. Someone might create several different accounts on Reddit or X, each with its own personality and posting history, but stylometry can reveal if they are all written by the same person. The writing patterns don’t deceive, even if the identities are convincing.
In corporate investigations, stylometry helps identify who leaked confidential information. When a sensitive document ends up with journalists or competitors, and that document could have come from any of dozens of employees with access, stylometric analysis can help narrow down the suspects. By comparing the writing style in the leaked material to known samples from employees’ emails, reports, and internal communications, investigators can build a case for who was most likely responsible.
Law enforcement has employed stylometric analysis in high-profile cases, with the Unabomber investigation serving as a notable example. In 1995, Ted Kaczynski’s manifesto, published in newspapers, was scrutinized by linguists who compared it to other writings. When Kaczynski’s brother recognized similarities between the manifesto and letters Ted had sent to his family, investigators established a connection. Although this predates sophisticated computational stylometry, it demonstrated that writing style can identify an author even in anonymous texts.
More recently, stylometry helped identify the author of a book published under a pseudonym. In 2013, “The Cuckoo’s Calling” appeared under the name Robert Galbraith and initially attracted little attention. Rumors suggested it was written by a well-known author. Stylometric analysis compared the book’s style to J.K. Rowling’s works and found a match. Rowling later confirmed she wrote the book under a pseudonym. While not a criminal case, this example highlights how powerful modern stylometric techniques have become.
The Cross-Platform Challenge
The internet has simplified anonymous communication, presenting both opportunities and challenges for investigators. Anyone can create accounts on platforms like Reddit, X, Facebook, Instagram, forums, messaging apps, and many others. Each account can have a unique username, biographical info, and persona. For example, someone may appear professional on LinkedIn but act as a troll on Reddit or a conspiracy theorist on X. Without a method to link these accounts, each identity remains isolated.
Here, stylometry becomes especially useful. Even if someone carefully manages different personas, preventing overlap in details, interests, or opinions, their writing style usually stays consistent. The way they form sentences doesn’t change just because they use different usernames. Their unconscious preferences for certain words, punctuation habits, and sentence lengths tend to persist across platforms.
This opens up investigative possibilities. If you’re trying to identify someone who posts threatening content anonymously, you might not know their real name, but you could link their anonymous account to other accounts where they’ve been less careful about hiding their identity. For example, suppose you can connect ThrowAway123 on Reddit to JohnSmith47 on X. If JohnSmith47’s profile reveals identifying details, that’s a breakthrough.
The same approach applies to uncovering sock puppet networks. Someone managing multiple fake accounts to sway discussions or engage in harassment has to craft content for each account. They may attempt to give each persona a unique voice, but consistently maintaining distinct writing styles across many posts is highly challenging. These patterns tend to surface, and stylometric analysis can cluster these accounts together, indicating they are likely operated by the same individual.
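The clustering idea can be sketched in a few lines. In this illustration, each account's posts are reduced to a vector of function word frequencies, every pair of accounts is compared with cosine similarity, and accounts whose similarity exceeds a threshold are merged into the same group. The feature list, the 0.9 threshold, the account names, and the function names are all assumptions made for the example; a real analysis would use richer features, much more text per account, and a threshold validated on known data.

```python
import math
import re
from itertools import combinations

FUNCTION_WORDS = ["the", "and", "of", "to", "in", "for", "but", "or", "with"]


def profile(text: str) -> list[float]:
    """Relative frequency of each function word in `text`."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    total = len(words) or 1
    return [words.count(w) / total for w in FUNCTION_WORDS]


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_accounts(posts: dict[str, str], threshold: float = 0.9) -> list[set[str]]:
    """Group accounts whose style profiles are at least `threshold` similar."""
    profiles = {name: profile(text) for name, text in posts.items()}
    parent = {name: name for name in posts}  # union-find forest

    def find(name: str) -> str:
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    for a, b in combinations(posts, 2):
        if cosine(profiles[a], profiles[b]) >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups: dict[str, set[str]] = {}
    for name in posts:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


posts = {
    "acct1": "the cat and the dog and the bird",
    "acct2": "the fox and the hen and the owl",
    "acct3": "but for you or for me, to be with it",
}
print(cluster_accounts(posts))
```

Here acct1 and acct2 land in one group and acct3 in another. Because groups are merged transitively, one borderline match can chain unrelated accounts together, so any cluster is a lead to examine, not a conclusion.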
Platform-specific differences complicate matters. X’s character limit influences how people write compared to longer Reddit posts or blogs. Auto-correction on mobile devices can obscure natural spelling patterns. Different communities also have varying norms, and writing styles in professional forums differ from those in casual chat rooms. While these contextual factors must be considered in analysis, they don’t eliminate the underlying stylometric signature.
Why This Matters for Investigators
If you’re involved in investigations, whether in law enforcement, corporate security, intelligence, or the private sector, knowing stylometry provides you with a valuable tool. While methods like IP tracing and metadata analysis are useful, they have their limits, as individuals often use VPNs, public computers, or anonymizing tools. They also tend to be cautious about sharing personal details, yet writing inevitably leaves identifiable traces.
Stylometry is effective because it examines unconscious writing habits that are hard for authors to hide. You can change your username, IP address, location, or identity, but genuinely altering your writing style in a consistent way across all your writings is incredibly challenging. Even when individuals attempt to mask their style, traces of their natural patterns often remain visible.
Though useful, stylometry is not flawless and has limitations we will discuss later. Having enough sample text is essential, as more material leads to more accurate pattern detection. Context matters, since comparing a formal report to informal messages isn’t ideal. Writing styles can evolve over time, and, although it is difficult, a skilled person can still intentionally obscure their style.
Oh, and AI. How does AI affect all of this?
The main point is that each piece of human-created writing forms a unique pattern. This pattern is quantifiable, recognizable, and enduring. In a world of anonymous accounts and digital pseudonyms, the skill to uncover identity through writing style analysis is essential for investigators. While the words may differ, the way they are arranged remains consistent. This consistency is the foundation of stylometry.
Next Friday - Part 2
feedback: matt(at)threatswithoutborders.com

