Sorting Out the Real Cost and Value of E-Discovery Technology

By Jeremy Pickens


There has been a bit of talk lately in the e-discovery echo chamber about fixed-price models for processing, hosting, review, and productions. The purported goal of this discussion was to create a stir and drum up business. Yet conspicuously absent from this entire discussion was talk of total cost, aka value. I am a research scientist at Catalyst, so I typically do not get involved in discussions like this. However, as there still seems to be a great deal of confusion over value, I felt the need to help sort all this out.

First, a bit of my background. I have spent the last 18 years of my professional life developing and applying algorithms to the task of finding relevant information. Currently, I am the senior applied research scientist at Catalyst. I obtained my Ph.D. in computer science with a focus on information retrieval (search engines) from the Center for Intelligent Information Retrieval (CIIR) at UMass Amherst in 2004. I did a postdoc at King’s College, University of London, and then spent five years at the Fuji Xerox research lab in Palo Alto (FXPAL) before joining Catalyst in 2010.

My focus has been not just on finding relevant information, but on finding as much relevant information as possible with as little user effort as possible. I have continued this approach while at Catalyst by applying my craft to the problem of document review.

So when I see the industry discussing cost, I am acutely aware that total cost is not just how much you pay for your software or even your hardware to run that software (assuming that you’re not using a hosted/cloud product). Your total cost includes, to no insignificant degree, how much you are paying for review. And reducing that cost – not just for the legal domain but for any place where information is being sought using the assistance of computer algorithms – has been the focus of my scientific inquiries for almost two decades now.

It’s not like we don’t already know this. In 2012 RAND published a study, “Where the Money Goes: Understanding Litigant Expenditures for Producing Electronic Discovery,” that estimated the cost of review at around 73 percent of a litigant’s total e-discovery spend. So what does that mean for the other 27 percent? Let’s do a fun little thought experiment.

Let’s suppose you’ve got a matter for which your total spend is going to be $100. As per the RAND study, you’re probably spending about $27 on software and $73 on review. (To keep the discussion simple, I am using “software” as a catchall for pretty much everything but review – the technology and services related to collection, processing, hosting, and production.) Let’s pretend for a moment that, in some magical world, your software is completely free. Instead of spending $100 on e-discovery, your total cost would drop to $73: ($0 software) + ($73 review) = $73.

On the other hand, let’s pretend for a moment that, in another magical world, the software is not free but has the technological capability of defensibly slicing out half the documents from your review. We’ll call this magical software “technology-assisted review.” In that case, your total cost would be $63.50. Why? ($27 software) + ($73 / 2 = $36.50 review) = $63.50. Actually, slicing your review in half is probably a conservative estimate, especially on low-richness collections. It’s not unheard of for TAR to be able to slice out 75 percent of your review, maybe even 90 percent or more. I’ll leave the math as an exercise for the reader, but in these two cases total cost would drop to $45.25 and $34.30, respectively.
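For readers who would rather check the arithmetic than take it on faith, the thought experiment can be sketched in a few lines of Python. This is only an illustration under the simplifying assumptions used here: a $100 matter split $27/$73 between software and review, and review cost that scales linearly with the number of documents reviewed. The function name `total_cost` and the parameter `fraction_cut` are hypothetical names chosen for this sketch, not part of any product.

```python
def total_cost(software, review, fraction_cut=0.0):
    """Total spend when a fraction of the review is defensibly cut.

    Assumes review cost is proportional to the documents reviewed,
    which is a simplification made for this thought experiment.
    """
    return software + review * (1.0 - fraction_cut)


SOFTWARE = 27.0  # everything except review, per the $100 example
REVIEW = 73.0    # review share, per the RAND estimate

scenarios = [
    ("baseline", SOFTWARE, 0.0),
    ("free software, full review", 0.0, 0.0),
    ("paid software, 50% of review cut", SOFTWARE, 0.50),
    ("paid software, 75% of review cut", SOFTWARE, 0.75),
    ("paid software, 90% of review cut", SOFTWARE, 0.90),
]

for label, software, cut in scenarios:
    print(f"{label}: ${total_cost(software, REVIEW, cut):.2f}")
```

Changing `SOFTWARE`, `REVIEW`, or the cut fractions shows how quickly the review term dominates the total once it is the larger share of the spend.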

In this thought experiment I have described two magical worlds: One in which your software is completely free, the other in which technology orders documents by relevance, so that you can find what you need without having to examine the entire corpus. And of these two worlds, which one is actually magical, and which one is real? As far as I can tell, no one is giving away their software for free, but the software to vastly reduce your review costs in an intelligently algorithmic manner does exist. But even if the magical “free” world were real, which world would save you the most? The world in which your total cost is $73 (free software) or the world in which your total cost is $34.30 (effective software)?

The astute reader will of course want to start arguing about nuances. And rightly so – it is of utmost importance to be aware of all factors that might affect your bottom line. But that’s exactly the point: Are you aware of all the factors that affect your bottom line? And are you aware of the relative value of each factor (i.e., that review alone is almost three times as expensive as everything else combined)? Or are you only focused on part of the story? Are you so excited that someone is giving away razors that you forget to check the price of the blades?

Your total cost includes all factors, and any sales pitch that does not focus on that total cost is distracting you from the overall value (or lack thereof) of what you might be getting.

