TREC 2011 Results - Why you should care
TREC 2011 has wrapped up and published its final report comparing search results for a defined discovery effort. A few key takeaways are worth thinking about.
1. Finding relevant information with any accuracy remains a very challenging task.
The systems evaluated were subject to a recall-precision tradeoff: you can have high recall (finding most of the relevant documents) or you can have high precision (finding only relevant documents). In this year’s test, no one method achieved both high recall and high precision at the same time. In fact, no entry was reported to have achieved at least 60% recall while maintaining at least 50% precision. Put another way, that’s a lot of over-capture of non-relevant documents to find a little more than half of all the relevant documents.
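To make the two measures concrete, here is a minimal sketch of the arithmetic. The function name and the document counts are hypothetical, chosen only so that the result lands exactly on the 60% recall / 50% precision mark discussed above; they are not figures from the TREC 2011 report.

```python
def precision_recall(retrieved_relevant, retrieved_total, relevant_total):
    """Return (precision, recall) for a single review or search run.

    retrieved_relevant: relevant documents the system returned
    retrieved_total:    all documents the system returned
    relevant_total:     all relevant documents in the collection
    """
    if retrieved_total <= 0 or relevant_total <= 0:
        raise ValueError("totals must be positive")
    precision = retrieved_relevant / retrieved_total
    recall = retrieved_relevant / relevant_total
    return precision, recall


# Hypothetical collection: 1,000 relevant documents exist.
# The system returns 1,200 documents, 600 of which are relevant.
precision, recall = precision_recall(600, 1200, 1000)
over_capture = 1200 - 600   # non-relevant documents returned
missed = 1000 - 600         # relevant documents never found

print(f"precision={precision:.0%} recall={recall:.0%}")
print(f"non-relevant returned={over_capture}, relevant missed={missed}")
```

In this made-up run, half of everything returned is noise and 400 relevant documents are still missing, which is the kind of result the paragraph above describes as out of reach for the entries evaluated.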
>> Why you should care: If you choose to use technology-assisted review software or another document review methodology subject to this recall/precision tradeoff, you must essentially choose, if you can, where to risk a deficiency: in missing relevant documents or in over-capture. With systems subject to this tradeoff, if you favor higher precision, you are effectively trading away more of the relevant documents. Favor comprehensive capture of relevant documents, and you are trading away precision. Unfortunately, TREC 2011 shows that for the systems tested, you can’t really have your cake and eat it too. So if you can’t have simultaneously high precision and recall, and you have to favor one or the other, in a responsive review context it’s safest to favor comprehensiveness (recall). In that case, systems like those evaluated in TREC 2011 may be best suited for applications where reasonably high recall with lower precision is acceptable. For example, they may be helpful for pre-review culling, but will likely need to be complemented by downstream responsiveness review to weed out non-relevant documents prior to production.
2. Quality assurance (estimating levels of recall) remains a significant challenge.
Participants’ estimates of recall were almost always significantly lower or higher than the recall actually achieved, suggesting that quality assurance (measurement of accuracy) remains a key challenge for document review.
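One common way to put error bars on a recall estimate is to draw a random sample, have reviewers identify the relevant documents in it, and check what fraction of those the system retrieved. The sketch below is an illustration only: it uses a standard Wilson score interval, it is not the estimation method used by TREC or by any participant, and the function name and sample counts are hypothetical.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials and trials > 0")
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / trials + z**2 / (4 * trials**2)
    ) / denom
    return center - half_width, center + half_width


# Hypothetical QA sample: reviewers found 100 relevant documents in a
# random sample of the collection; the system had retrieved 60 of them.
found_by_system, relevant_in_sample = 60, 100
point_estimate = found_by_system / relevant_in_sample
low, high = wilson_interval(found_by_system, relevant_in_sample)

print(f"estimated recall: {point_estimate:.0%}")
print(f"approximate 95% interval: {low:.1%} to {high:.1%}")
```

With only 100 relevant documents in the sample, the interval spans roughly twenty percentage points, so a point estimate on its own says little about whether a recall target was actually met. The sketch also assumes the reviewers' relevance calls on the sample are correct, which is itself part of the quality assurance problem.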
>> Why you should care: If you can’t accurately estimate recall and precision during the process (even if the ranking itself is quite accurate), you can’t make an informed decision about when reasonable accuracy has been achieved and, hence, when to stop the review. Earlier this year Judge Peck famously stated that he “may be less interested in the science behind the ‘black box’ of the vendor’s software than in whether it produced responsive documents with reasonably high recall and high precision.” In doing so, Judge Peck gave the clearest guidance yet on how to substantiate your process: you need to know your precision and recall. Absent such reliable measurements, the defensibility of the process comes into question. In practice, TREC 2011 suggests to courts and litigants that close attention needs to be paid to the validity of the design and execution of the quality assurance protocol, that is, of the determination of whether or not “reasonably high recall and high precision” were achieved.
Learn more about the TREC Legal Track and H5’s past participation in and support of this important initiative.