Legal Data Culling: What Does It Mean to Cull Documents?

Posted by Staff Writer | Fri, Sep 02, 2022

In the legal world, culling documents means removing certain documents from a larger set. This can be done for a variety of reasons, such as reducing the size of a data set or removing irrelevant data and documents from a review process.

As our use of technology creates more data than ever before, law firms are tasked with managing and culling information. How much data is necessary? What file types should be included?

Legal teams don't all have endless resources to dedicate to manual review and document culling. While it takes an average reader more than 55 hours to read a million words, technology can offer a much quicker legal data culling method.

That's why the use of predictive coding and AI software has revolutionized the process of electronic discovery and managing electronically stored information (ESI).

What is legal data culling?

Legal data culling is the process of removing certain documents from a larger set. This can be done for a variety of reasons, such as reducing the size of a data set or removing irrelevant documents from a review.

Key terms

To better understand the legal jargon of data culling, here are some helpful key terms:


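DeNisting is the process of removing known system and application files from a review because they hold no evidentiary value. These files are typically identified by comparing file hashes against the reference list maintained by the National Institute of Standards and Technology (NIST), which gives the process its name.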


Deduplication (or "De-dupe") is the process of identifying and removing or suppressing duplicate documents from a review. This can be done much more efficiently with the help of document review software.


In eDiscovery, data custodians are "persons having administrative control of a document or electronic file." For example, the custodian of an email message could be the owner of the digital mailbox that contains the relevant message, but it could also be the person who sent the message and has it in their outbox.

Identifying these key custodians is an important first step in any eDiscovery, litigation, or regulatory matter.

No De-dupe

No de-dupe is a term used to describe a data set that contains no duplicate documents.

Search terms

Search terms are the words or phrases that you use to search for documents in a review. You can set search terms for specific data sets or a specific date range to narrow the scope.
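As a rough illustration of how search terms and a date range narrow a data set, here is a minimal Python sketch. The `Document` fields, the sample documents, and the simple substring matching are all assumptions made for this example; commercial review tools typically offer richer options such as Boolean and proximity searches.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Document:
    # Hypothetical fields chosen for illustration only.
    doc_id: str
    sent: date
    text: str


def matches_search(doc, terms, start, end):
    """Return True if the document falls inside the date range
    and contains at least one search term (case-insensitive)."""
    if not (start <= doc.sent <= end):
        return False
    body = doc.text.lower()
    return any(term.lower() in body for term in terms)


docs = [
    Document("A1", date(2021, 3, 5), "Please review the merger agreement draft."),
    Document("A2", date(2019, 7, 1), "Merger talks are off for now."),
    Document("A3", date(2021, 4, 9), "Lunch on Friday?"),
]

# Keep only 2021 documents that mention either term.
hits = [
    d.doc_id
    for d in docs
    if matches_search(d, ["merger", "acquisition"], date(2021, 1, 1), date(2021, 12, 31))
]
print(hits)
```

In this sample, A2 mentions a search term but falls outside the date range, and A3 is in range but mentions neither term, so only A1 survives the cull.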

Two-filter method

A method used in large document review projects to maximize efficiency. In the first phase, deNisting and de-duping are employed as technical filters. By the second phase, the ESI has already been purged of unwanted custodians, date ranges, spam, and other obviously irrelevant files and file types.

This second filter is the one that utilizes predictive coding and reliance on artificial intelligence.
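The technical culling that happens before the second filter can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `CollectedFile`, `first_filter`, and `known_system_hashes` are hypothetical names, SHA-256 is used simply because it is available in the standard library, and the predictive coding step itself is not shown.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass
class CollectedFile:
    # Hypothetical record for one collected file.
    name: str
    custodian: str
    modified: date
    content: bytes


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def first_filter(files, known_system_hashes, custodians, start, end):
    """Apply deNisting, de-duping, custodian, and date-range culling."""
    seen = set()
    kept = []
    for f in files:
        digest = sha256(f.content)
        if digest in known_system_hashes:  # deNisting: known system file
            continue
        if digest in seen:  # de-duping: exact copy of a file already kept
            continue
        if f.custodian not in custodians:  # unwanted custodian
            continue
        if not (start <= f.modified <= end):  # outside the date range
            continue
        seen.add(digest)
        kept.append(f)
    return kept


files = [
    CollectedFile("notepad.exe", "alice", date(2021, 2, 1), b"system binary"),
    CollectedFile("contract.docx", "alice", date(2021, 2, 3), b"contract v1"),
    CollectedFile("contract copy.docx", "alice", date(2021, 2, 4), b"contract v1"),
    CollectedFile("memo.txt", "bob", date(2021, 2, 5), b"memo"),
    CollectedFile("old.txt", "alice", date(2015, 1, 1), b"old notes"),
]

# Stand-in for a reference list of known system file hashes.
known_system_hashes = {sha256(b"system binary")}

kept = first_filter(files, known_system_hashes, {"alice"},
                    date(2021, 1, 1), date(2021, 12, 31))
names = [f.name for f in kept]
print(names)
```

Of the five sample files, only contract.docx remains. The much smaller set that comes out of a pass like this is what the second, AI-assisted filter would then work on.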

Email threading

In eDiscovery, email threading involves identifying the relationships within emails by parsing through threads, people, and attachments. This is done to view the emails as a chain of unique, individual messages while still maintaining the greater context of the overall conversation.
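One simple way to picture threading is to group messages by following each reply back to the message that started the conversation. The Python sketch below assumes each email carries a message ID and an optional ID of the message it replies to; the `Email` class and `build_threads` function are hypothetical, and real tools also weigh participants, attachments, and message content.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Email:
    # Hypothetical fields, loosely modeled on standard email headers.
    message_id: str
    in_reply_to: Optional[str]
    sender: str
    subject: str


def build_threads(emails):
    """Group message IDs under the ID of the thread's first message."""
    by_id = {e.message_id: e for e in emails}

    def root_of(e):
        visited = set()  # guards against reply loops in malformed data
        while e.in_reply_to in by_id and e.message_id not in visited:
            visited.add(e.message_id)
            e = by_id[e.in_reply_to]
        return e.message_id

    threads = defaultdict(list)
    for e in emails:
        threads[root_of(e)].append(e.message_id)
    return dict(threads)


emails = [
    Email("m1", None, "alice", "Q3 pricing"),
    Email("m2", "m1", "bob", "RE: Q3 pricing"),
    Email("m3", "m2", "alice", "RE: Q3 pricing"),
    Email("m4", None, "carol", "Lunch"),
]

threads = build_threads(emails)
print(threads)
```

Here the three pricing messages end up in one thread rooted at m1, while the unrelated lunch message forms its own thread, so a reviewer can read each conversation in context.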

Maintaining critical data and key documents

In legal reviews, it is important to maintain critical data and key documents. Understanding the key legal issues to identify and process the most critical data should be the top priority of the attorneys managing the cull. Securing access to this information is also vital.

For example, certain files may fall under privilege and should be logged and removed from data sets that would be turned over to opposing or outside counsel.

Legal teams can help clients better identify and secure relevant information for their cases and keep costs lower with the use of smart technologies like eDiscovery software.

The importance of efficient and effective document review

With larger volumes of writings accessible in today’s information age, "Big Data" presents as much an opportunity as a challenge for lawyers. On the one hand, electronic writings are everywhere and can be hard to destroy.

That means organizations may unknowingly be saving data they're not legally required to keep, but that could be damaging if discovered. Thus, understanding information governance has become a key principle of business management.

However, the benefit of all this information is that, with so much ESI in cyberspace, the truth is almost always out there. The challenge is simply for lawyers to find it, identify it, and determine how to best use it to their client's advantage.

Thus, the legal field has become more reliant on technology to help manage this increased workload resulting from "Big Data." Technology-assisted culling can help reduce the amount of data that needs to be reviewed, making the process more efficient and effective.

Improving document review efficiency

With so much electronically stored data out there, finding relevant documents can seem like searching for a needle in a haystack. That's why more attorneys are turning to software to aid in legal search and document review.

In fact, in most cases, newer cloud-based eDiscovery software empowers legal departments to bring routine document review in-house. This can offer huge cost savings and security benefits to clients. Instead of relying on an expensive third-party vendor, firms can utilize internal resources to better control discovery processes.

How document review software can streamline data culling

Document review software can help streamline the process of legal data culling. Whether a firm conducts review in-house or leverages outside counsel for higher complexity or higher risk matters, the overall collected data volume will need to be culled down to a usable size. 

When a review team can quickly and efficiently narrow down large data sets to the most relevant information, they can better control project scope, time, and costs. Thus, review software can streamline data collection, organization, and review for better results and lower costs.

The advantages of in-house data processing and culling

In-house data processing and culling have many advantages. First, they allow you to better control the quality of your data. Second, they give you the ability to customize your data processing workflow.

Finally, they allow you to keep your data confidential. Each of these alone would be beneficial to a review team. Maximizing resources benefits all parties.

Digital WarRoom's culling tools

Digital WarRoom offers a variety of eDiscovery tools to help you streamline your legal data culling process. While there are myriad approaches and options for culling and searching, Digital WarRoom has industry-leading service and easy-to-use software that is transparent, simple, and affordable. Schedule a demo today to learn more!