I trust my readers to take me to task and keep these blogs accurate with immediate feedback. In response to yesterday’s blog on acceptable confidence levels, I received several complimentary emails and one that bluntly called the entire survey question and post “wrong-headed.” My expert friend and critic was right on the money when he pointed out that a confidence level or interval is a sampling measurement rather than a measurement of how accurate or stable your PC-TAR relevance model has become through the sampling/training process.

I don’t want to dive back into precision, recall, F1 and other statistical jargon. PC-TAR promoters have spent two to four years trying to educate attorneys and litigation support practitioners on the wonders of machine learning, with limited success. The smart marketers have dumbed down the ‘when to stop’ message and mistakenly promoted simplistic concepts like ‘98% confident’ because they resonate with consumers. I fell prey to this dumb-speak and promise to do better in future surveys and blogs.

The question sought to understand whether practitioners had independently arrived at a standard measurement of completeness when using PC-TAR for typical civil-litigation responsiveness retrieval. These are the kinds of questions our consulting clients ask all the time: “How are others setting the defaults on Relativity Assisted Review training?” In fact, sampling confidence level is one of the first decisions that eDiscovery counsel have to make. Essentially, how big do our sample training sets need to be for an acceptable level of confidence in the resulting recall, precision and stability? That question is easy to understand, much easier than F1 stability, overturn rates and the other measurements that actually support the final ‘when to stop’ decision. No one likes the quintessential consultant/counsel answer to every question, “It depends.” But that is what every interview respondent said when pressed.
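To make the sample-size question concrete: for a simple random sample, the number of documents you need for a given confidence level and margin of error follows from the standard normal-approximation formula n = z²·p(1−p)/e², with the worst-case proportion p = 0.5. A minimal sketch in Python (the function name and defaults are mine for illustration, not settings from any TAR product):

```python
from math import ceil
from statistics import NormalDist

def sample_size(confidence: float, margin_of_error: float, p: float = 0.5) -> int:
    """Documents to sample for a given confidence level and margin of error.

    Uses the normal-approximation formula n = z^2 * p * (1 - p) / e^2,
    with p = 0.5 as the worst case (largest required sample).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-tailed z-score
    return ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)

print(sample_size(0.95, 0.05))  # 385 documents for 95% confidence, +/- 5%
print(sample_size(0.98, 0.05))  # 542 documents for 98% confidence, +/- 5%
```

Note that these numbers only bound the sampling error on an estimated proportion; as my critic pointed out, they say nothing about how accurate or stable the relevance model itself has become.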
There is no standard measurement of completeness for PC-TAR yet. Having one would speed adoption, but it would be wrong in too many instances. Thank you all for the feedback, good and bad. It furthered the discussion and clarified an important point. Keep it up!