eDiscovery Key Terms

Posted by Jeremy Greer | Tue, May 28, 2019

This eDiscovery key terms list is the first step in understanding our tool. The terms are sequenced to follow the order in which each process arises in your workflow.


Creating and Managing a Matter

Matter / Database – In eDiscovery, every matter is associated with a database containing all documents relating to that matter. In this way, the words matter and database are interchangeable.
Archive – A copy of your data meant to be stored for possible access in the future. File organization is crucial in understanding and revisiting your inactive data after a long period of time.
Backup – An exact copy of your data. Digital WarRoom automatically keeps backups of all your files in case of data loss.
Views – Represent the stages of your eDiscovery workflow in Digital WarRoom (DWR). DWR defines them as Processing, Policy, Review, Draft Productions, Productions and Reports. You can toggle between views to access tools designed for the different stages of eDiscovery.



Field / Metadata / Column – Data that has several parts can be divided into fields, for example: create date, last modified date, author. In a database, each column holds one field. The information in these fields is known as metadata. Metadata describes all recorded information about the files.
Process / Catalog documents – The act of adding documents to Digital WarRoom. The tool extracts all metadata from each document, which can be used later for filtering.
Indexing (Automatic) – The act of finding words in documents so you can look them up quickly.
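To make the idea of indexing concrete, here is a minimal sketch of a keyword index in Python. It is an illustration only, not how Digital WarRoom builds its index: the `build_index` function, the document IDs and the sample text are all hypothetical.

```python
import re
from collections import defaultdict

def build_index(documents):
    """Map each lowercase word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(doc_id)
    return index

# Hypothetical documents, used only for illustration.
documents = {
    "DOC-1": "Please review the attached contract.",
    "DOC-2": "The contract was signed on Friday.",
    "DOC-3": "Lunch on Friday?",
}

index = build_index(documents)
print(sorted(index["contract"]))  # ['DOC-1', 'DOC-2']
print(sorted(index["friday"]))    # ['DOC-2', 'DOC-3']
```

Because the words are found once, up front, a later search is a quick lookup rather than a scan of every document.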
Collections vs Imports:

  • A collection can contain one or more imports (think "waves of data"). For instance, you might wish to create a collection for the data you've acquired today. But that data might come in several different forms, e.g. a hard drive, a CD, and an email container from a public server. You may wish to process those three pieces of data separately but still have them appear in today's collection. To do that, you'd create your collection first, then add each import into it one at a time. The collections tree will show a single collection, but when you select it, you'll see each import listed along with some statistics about the import.
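The relationship between a collection and its imports can be sketched as a simple data structure. This is an assumed model for illustration, not Digital WarRoom's internal design: the `Collection` and `Import` classes are hypothetical, and the document counts are made up.

```python
from dataclasses import dataclass, field

@dataclass
class Import:
    source: str          # where this wave of data came from
    document_count: int  # one example of a per-import statistic

@dataclass
class Collection:
    name: str
    imports: list = field(default_factory=list)

    def add_import(self, imp):
        """Add one import (one wave of data) to this collection."""
        self.imports.append(imp)

    def total_documents(self):
        return sum(imp.document_count for imp in self.imports)

# Create the collection first, then add each import one at a time.
todays = Collection("Data acquired today")
todays.add_import(Import("hard drive", 1200))
todays.add_import(Import("CD", 300))
todays.add_import(Import("email container", 4500))

for imp in todays.imports:
    print(f"{imp.source}: {imp.document_count} documents")
print(todays.total_documents())  # 6000
```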

Job – A processing task specific to a computer service. There are 5 common eDiscovery computer services:

  • Adding documents and productions (Catalog)
  • Generating keyword indexes (Indexing)
  • Generating OCR, extracted text and language analysis (OCR)
  • Converting Native documents to Image (Conversion)
  • Endorsing production images (Endorsement)

Exceptions – A variety of errors that prevent documents from processing correctly.
The following are the major reasons a submission fails:

  • Corrupted files
  • Password-protected files
  • Encrypted files
  • Container files that cannot be opened
  • Unrecognized file formats


Pith – Identifies near duplicates of emails, such as the sent and received copies of the same message. The reviewable content of both is identical (they differ only in aspects such as headers and formatting), so deduping by pith will save you time.
Forensic fingerprint – Identifies exact duplicates. Use this method if, for example, the differences between the sent and received copies of the same email may be critical enough that they need to be reviewed separately.
Dedupe – Remove duplicate documents. By default, DWR dedupes by pith, meaning an email sent by Bill and received by Jeremy in a different time zone will return only one document.
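The difference between the two deduplication methods can be sketched in Python. This article does not describe how DWR actually computes a pith or a forensic fingerprint, so the sketch assumes the simplest possible versions: the pith is a hash of the normalized message body only, and the fingerprint is a hash of the full raw content. The function names and the sample emails are hypothetical.

```python
import hashlib

def forensic_fingerprint(raw: bytes) -> str:
    """Hash every byte, so any difference yields a different value (assumed model)."""
    return hashlib.sha256(raw).hexdigest()

def pith(body: str) -> str:
    """Hash only the reviewable body, ignoring case and whitespace (assumed model)."""
    normalized = " ".join(body.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def dedupe(emails, key):
    """Keep the first email seen for each distinct key."""
    seen = set()
    unique = []
    for email in emails:
        k = key(email)
        if k not in seen:
            seen.add(k)
            unique.append(email)
    return unique

# The same message as sent and as received in a different time zone.
sent = {"headers": "Date: Tue, 28 May 2019 09:00:00 -0700", "body": "See you at noon."}
received = {"headers": "Date: Tue, 28 May 2019 12:00:00 -0400", "body": "See you at  noon.\n"}
emails = [sent, received]

by_pith = dedupe(emails, lambda e: pith(e["body"]))
by_fingerprint = dedupe(
    emails,
    lambda e: forensic_fingerprint((e["headers"] + "\n" + e["body"]).encode("utf-8")),
)

print(len(by_pith))         # 1
print(len(by_fingerprint))  # 2
```

Deduping by pith collapses the two copies into one document to review, while deduping by fingerprint keeps both because their headers and formatting differ.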
Custodian – The original owner or user of a collection of documents
Noise word/stop word – The most common words in a language are filtered out of indexing to make your searches more efficient. These words are excluded because they will not help you narrow your search to find your intended documents. Examples: the, a…
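As a rough illustration, noise-word filtering can be sketched as a step that runs before words reach the index. The `STOP_WORDS` set below contains only the two examples given above, real lists are longer, and the `indexable_words` helper is hypothetical.

```python
# Only the two examples from this article; a real noise-word list is longer.
STOP_WORDS = {"the", "a"}

def indexable_words(text):
    """Return the lowercase words of a text with noise words removed."""
    cleaned = (word.strip(".,;:!?\"'") for word in text.lower().split())
    return [word for word in cleaned if word and word not in STOP_WORDS]

print(indexable_words("The witness signed a contract."))
# ['witness', 'signed', 'contract']
```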



Marks (tags) – Will the document be produced? Each document can have 1 mark. Examples: Privileged, Produce, Not responsive
Issue codes – What legal or factual issue does the document relate to? Each document can have unlimited issue codes. Example: Duty, Breach, Damages
POD (Protective Order Designation) – Explains the level of confidentiality of the document
Propagate – When your attorney work product (marks, issue codes, PODs) is automatically copied across duplicate documents
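Propagation can be sketched as copying one document's coding to every other document in the same duplicate group. This is an assumed model, not DWR's implementation: the `propagate` function, the duplicate keys and the coding values (including the "Confidential" POD) are hypothetical.

```python
import copy
from collections import defaultdict

def propagate(documents, work_product):
    """Copy each coded document's work product to its uncoded duplicates.

    documents maps a document ID to its duplicate key (for example, its pith).
    work_product maps a document ID to the coding a reviewer applied.
    """
    groups = defaultdict(list)
    for doc_id, dup_key in documents.items():
        groups[dup_key].append(doc_id)

    result = dict(work_product)
    for doc_id, coding in work_product.items():
        for other in groups[documents[doc_id]]:
            # setdefault leaves documents that already have coding untouched.
            result.setdefault(other, copy.deepcopy(coding))
    return result

# DOC-1 and DOC-2 are duplicates of each other; DOC-3 is not.
documents = {"DOC-1": "pith-A", "DOC-2": "pith-A", "DOC-3": "pith-B"}
work_product = {
    "DOC-1": {"mark": "Privileged", "issue_codes": ["Breach"], "pod": "Confidential"},
}

coded = propagate(documents, work_product)
print(coded["DOC-2"]["mark"])  # Privileged
print("DOC-3" in coded)        # False
```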
Filter tree – Contains all attributes of your documents, which you can select to filter and narrow your corpus to a particular subset of documents
Attribute – An item in the filter tree by which you can filter documents. Examples: binders, custodians, collections, extensions
Families – Documents that are grouped together. Example: an email with an attachment
Threads – A thread of emails containing all emails connected to the original email
Redaction – Censoring a document by blacking out sensitive text or otherwise making it unreadable
Reindexing – Indexing your documents again due to changes in the indexable content
In most cases, you won't need to use this function since documents are automatically indexed upon processing. However, there are a few situations where reindexing makes sense:

  • After generating OCR text
  • After replacing native documents or OCR text for specific files
  • After software updates


Draft Production

Native file – A document that has maintained the same file type as when it was originally collected
Conversion / Imaging / Printing – Converting your native documents to images for the purpose of redacting, endorsing and Bates stamping
Endorse – Stamp your document with some information prior to production, most commonly a Bates number or POD.


Production and Export

Sequencing – All documents in a production are assigned a sequence number. The sequence number determines the order in which documents will be processed and assigned Bates numbers.
When is a good time to re-sequence my production?

  • If new documents have been added to the production
  • If a document has been removed from the production. Removal can leave a Bates gap, so to ensure that the Bates range does not contain any gaps, you will need to re-sequence and re-endorse the production
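The Bates gap described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: it assigns one Bates number per document (productions are often numbered per page), and the `assign_bates` function, the "ABC" prefix and the document IDs are hypothetical.

```python
def assign_bates(doc_ids, prefix="ABC", start=1, width=6):
    """Assign one Bates number per document, following the sequence order."""
    return {
        doc_id: f"{prefix}{start + i:0{width}d}"
        for i, doc_id in enumerate(doc_ids)
    }

production = ["DOC-1", "DOC-2", "DOC-3"]
bates = assign_bates(production)

# Removing DOC-2 without re-sequencing leaves a gap at ABC000002.
production.remove("DOC-2")
stale = {doc_id: bates[doc_id] for doc_id in production}
print(stale)  # {'DOC-1': 'ABC000001', 'DOC-3': 'ABC000003'}

# Re-sequencing and re-endorsing closes the gap.
resequenced = assign_bates(production)
print(resequenced)  # {'DOC-1': 'ABC000001', 'DOC-3': 'ABC000002'}
```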

Placeholder / Slipsheet – A placeholder for a document that has been withheld due to privilege, clawback, etc., inserted to avoid a Bates gap. The term can also describe a cover sheet that contains metadata and other information about a document.




Topics: Best Practices