Hot or Not? What makes a document sexy—and why a machine can’t figure it out for you.
When conducting document review for case preparation, the ultimate goal is to find the smoking gun documents that help you make your case. You know them when you see them. But how do you find them when they’re not in plain sight, and how do you maximize your chances of finding all of them?
By definition, a “hot document” is unusual. If it’s an email or text message among or about key players—and that’s what a hot document frequently will be—it is likely to contain veiled, novel, or otherwise unpredictable language. It might be very short, and it is likely to require inference based on a broader context not provided within the document itself. What’s more, “hotness” is not constant across different types of matters or even among matters within a single industry. The same document could be a smoking gun in one case and an irrelevant snoozer in another.
There’s not an app for that.
A pure machine approach can’t identify what’s intriguing or surprising for a particular case. If someone tries to sell you a generic “hot document filter,” back away slowly with your hand firmly on your wallet. Machines can be very good at finding information you already know exists (try entering “capital of Liechtenstein” or “Lady Gaga’s real name” in a search engine), but the problem with hot docs is that you don’t start out knowing what’s there. If you already knew that, you’d be done. Machines are sadly lacking when it comes to, say, figuring out whether certain people at a company accepted cash bribes, or knowingly invested in ineligible securities, or blithely ignored red flags about safety issues, or engaged in any other flavor of shenanigans.
The case team (or even better, a team of strategic search experts)—people who understand the facts at issue and the relationships among the players—will recognize critical information that can either support or undermine their story, or contradict the story they know their opponents hope to tell. They can speculate about what sorts of things people might have said, and what kinds of activities might have occurred, but they can’t know a priori exactly what people said or exactly what the evidence of those activities looks like. And they can’t just read 2 million documents to find those things out. The good news is that they don’t have to. There are ways to systematically funnel down your full data set into a body of documents that is manageable enough to allow principled, strategic human searches for the information you need most.
Location, location, location.
If you want to find hot documents, you need to go where they live. Your search platform can be a great tool to help you objectively narrow down your search space to the documents most likely to contain the information you’re hoping to find. If you’re looking for the kind of information that people might blurt out to a confidante or co-conspirator, don’t bother looking in spreadsheets or formal documents, but instead stick to emails and text messages. Is there a time period within which you know certain problematic activities must have occurred? Start looking within that range and then branch out later if you need to as you learn more. Focus on emails containing key players’ names. Focus on emails that don’t have large distribution lists (people don’t generally brag to the world about their misdeeds). Use any objective criteria you can think of that will shrink your sandbox without likely knocking out the good stuff.
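To make the funnel concrete, here is a minimal sketch of those objective filters in Python. It is not any particular review platform's API: the `Doc` record, its field names, the `narrow` function, and the `max_recipients` cutoff of 3 are all hypothetical, and "key players" are matched against sender and recipient addresses only, which is one simplifying assumption among several you would tune for a real matter.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    """Hypothetical document record; real platforms expose their own fields."""
    doc_id: str
    doc_type: str        # e.g. "email", "text", "spreadsheet"
    sent: date
    sender: str
    recipients: tuple
    body: str = ""


def narrow(docs, key_players, start, end, max_recipients=3):
    """Keep informal messages in the date window that involve a key player
    and went to a small audience. All thresholds are illustrative."""
    informal = {"email", "text"}
    players = {p.lower() for p in key_players}
    kept = []
    for d in docs:
        if d.doc_type not in informal:
            continue                      # skip spreadsheets, formal documents
        if not (start <= d.sent <= end):
            continue                      # outside the period of interest
        if len(d.recipients) > max_recipients:
            continue                      # large distribution list
        people = {d.sender.lower(), *(r.lower() for r in d.recipients)}
        if not (people & players):
            continue                      # no key player on the message
        kept.append(d)
    return kept
```

Each test removes documents on a criterion that can be checked without reading the text, so the set shrinks before any human judgment is spent. If the results look too thin, widen the date range or raise the recipient cutoff and run it again.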
The hallmarks of hotness
While there’s no such thing as a generic filter that can identify specific hot documents for a specific matter, there are certain generic hallmarks that are both searchable and frequently correlated with hotness. Profanity and other emotionally heightened language is one example. Once you’ve established a reduced document set as your search area, you can—guided by what you know about the case and the personalities of the participants—start searching on the types of words and phrases that you wouldn’t say in front of your mother. You can also look for suggestions of worry, fear, stress, or anger (“worried about,” “trouble,” “problems,” “pissed off”); verbs of obligation (“should,” “supposed to”); and suggestive phrases like “I’m not a lawyer, but” or “just between us.” Don’t forget to factor in common typos and abbreviations/internet-speak (SRSLY? Yes.).
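As a rough illustration, the hallmark searches above can be expressed as grouped regular expressions using Python's standard `re` module. The group names, the `flag` helper, and the "suppose to" typo variant are assumptions made for this sketch; the term lists come from the examples above and are nowhere near exhaustive. A real list would be built from what you know about the case and the people involved.

```python
import re

# Illustrative term groups; extend with case-specific language and typos.
HALLMARKS = {
    "worry": [r"worried about", r"trouble", r"problems?", r"pissed off"],
    "obligation": [r"should", r"suppos(?:ed|e) to"],   # "suppose to" is a common typo
    "suggestive": [r"i'?m not a lawyer,? but", r"just between us"],
    "internet_speak": [r"srsly"],
}


def compile_hallmarks(groups):
    """Build one case-insensitive, whole-phrase pattern per group."""
    return {
        name: re.compile(r"\b(?:" + "|".join(patterns) + r")\b", re.IGNORECASE)
        for name, patterns in groups.items()
    }


def flag(text, compiled):
    """Return the hallmark groups found in text, with the matched phrases."""
    text = text.replace("\u2019", "'")    # treat curly apostrophes like straight ones
    hits = {}
    for name, pattern in compiled.items():
        found = sorted({m.group(0).lower() for m in pattern.finditer(text)})
        if found:
            hits[name] = found
    return hits
```

A hit here is only a reason to read the message, not a verdict on it. Plenty of innocent emails say "should," which is why these searches pay off only after the document set has already been narrowed.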
As you look, keep an eye out for additional hooks that you hadn’t thought of or known about going in. Do some people tend to use their personal email accounts when discussing relevant topics? Look more closely at emails sent to or from that address. Do certain executive assistants tend to chatter with each other about their bosses’ foibles? Focus on emails between the assistants when no one else is CC’d. Let your machine do what it’s good at—“show me all emails sent by Joe Smith to Josephine Doe between April 2008 and January 2009”—“OK, now show me the subset of those emails containing any of the following words…”—and then use the app between your ears to determine whether the results are important.
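The two-step request in that example might look something like this in Python. The email records are plain dictionaries with hypothetical keys (`from`, `to`, `date`, `body`), and the function names are invented for the sketch; a review platform would have its own query syntax for the same idea.

```python
import re
from datetime import date


def sent_between(emails, sender, recipient, start, end):
    """Step one: everything sender sent to recipient inside the date window."""
    return [
        e for e in emails
        if e["from"] == sender
        and recipient in e["to"]
        and start <= e["date"] <= end
    ]


def containing_any(emails, words):
    """Step two: the subset whose body contains any of the given words."""
    wanted = {w.lower() for w in words}
    return [
        e for e in emails
        if wanted & set(re.findall(r"[a-z']+", e["body"].lower()))
    ]


def private_pair(emails, sender, recipient, start, end, words):
    """Chain the two steps; the human decides what the results mean."""
    return containing_any(sent_between(emails, sender, recipient, start, end), words)
```

For the example above, the window would be `date(2008, 4, 1)` through `date(2009, 1, 31)`. Keeping the two steps separate makes it easy to swap in a different word list as reading the first batch suggests new hooks.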
In summary, search tools can be an indispensable asset in your (or your expert search consultant’s) hands when trying to locate the smoking guns. But while machines can do a lot of work for you, they can’t do your work for you. It’s the marriage of the machine’s capabilities with the know-how and intuition only a human can bring that finds you the documents that will make your case.