Rule 26 Blog
Hi ho, Hi ho, Off to Mine for Metadata We Go…
Written by Charles T. Tsuji   
Tuesday, 03 November 2009 00:00

Not long ago, metadata was the new electronic discovery buzzword.

It was discussed at length at CLEs, and the legal industry quickly learned that metadata could be another source of potentially relevant information that must be properly preserved.

Upon its introduction, this “data about data” concept was sometimes difficult to grasp. Why do electronic files contain additional information that’s easily altered or updated? Why would an electronic document contain information that’s not readily viewable by the author? Why can’t electronic documents be treated like paper documents?  

Despite this confusion, metadata’s potential impact on a particular case was easily understood.

Most attorneys have heard stories of a party un-clicking the track changes option in a Microsoft Word document to reveal the original text. They also know all too well the serious impact that can have on a case. This surveying of a document’s metadata, or mining for metadata, has been the topic of various bar association ethics opinions. However, these opinions are split on whether metadata mining violates an attorney’s ethical obligations.

The interesting thing about metadata is that it rarely provides a case-turning smoking gun. Metadata doesn’t tell you every person who accessed or viewed a document. It doesn’t tell you who read a document or how it got to its current location. In most cases, the most relevant information is still in the text of the document itself.

Despite these nuances, metadata continues to play a large role in today’s litigation practice, especially during the production of documents in discovery.

Originally, the production of electronic documents typically included TIFFs, an image load file and a delimited text file with only the production Begdoc and Enddoc values (the beginning and ending document numbers). However, parties are beginning to include the document’s metadata in the delimited text file or are simply requesting the production of the corresponding native documents.

Providing metadata in a delimited text file generally does not have negative consequences because the parties can agree on which metadata fields are produced and in what format, e.g., .CSV or .DAT.
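To make the idea of an agreed-upon field list concrete, here is a minimal Python sketch of how a delimited text file might be generated so that only the negotiated fields are produced. Apart from Begdoc and Enddoc, the field names, the sample record and the `write_load_file` helper are all hypothetical; an actual production would follow whatever the parties and their review platforms require.

```python
import csv
import io

# Hypothetical list of fields the parties agreed to produce.
# Only Begdoc and Enddoc come from the discussion above; the rest are illustrative.
AGREED_FIELDS = ["Begdoc", "Enddoc", "Author", "DateCreated",
                 "DateLastModified", "FileName"]


def write_load_file(records, fields=AGREED_FIELDS, delimiter=",", quotechar='"'):
    """Return delimited text containing only the agreed-upon fields."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fields,
        delimiter=delimiter,
        quotechar=quotechar,
        quoting=csv.QUOTE_ALL,
        extrasaction="ignore",  # drop any field that is not on the agreed list
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# A made-up record; LastPrinted was extracted but is not on the agreed list.
records = [
    {
        "Begdoc": "ABC000001",
        "Enddoc": "ABC000003",
        "Author": "J. Smith",
        "DateCreated": "2009-10-01",
        "DateLastModified": "2009-10-15",
        "FileName": "draft_agreement.doc",
        "LastPrinted": "2009-10-16",
    }
]

csv_text = write_load_file(records)
print(csv_text)
```

Because the writer ignores anything outside the agreed list, a field such as LastPrinted never reaches the other side unless the parties add it. The `delimiter` and `quotechar` parameters are there because a .DAT file usually differs from a .CSV mainly in the separator characters; which characters to use depends on the receiving platform and is not assumed here.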

However, the production of native files is risky. As noted above, one main concern of a native production is that the opposing party could be mining for metadata. In large productions, review teams typically don’t scan each document for hidden data, and reviewers may focus only on the text of each document, not on the dozens of different metadata fields.
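As a rough illustration of what scanning for hidden data could look like, the sketch below counts tracked insertions and deletions in a Word file saved in the newer .docx format, which is a ZIP archive of XML parts. It is an assumption-laden example, not a complete screening tool: it looks only at the main document body (not headers, footers or comments), it does not handle the older binary .doc format, and the `count_tracked_changes` and `_build_sample_docx` names are hypothetical. The sample file built here is a stripped-down stand-in that Word itself would not open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside .docx files
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_tracked_changes(docx_file):
    """Count tracked insertions and deletions in the main body of a .docx.

    docx_file may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_file) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    # Tracked insertions are stored as w:ins elements, deletions as w:del
    insertions = sum(1 for _ in root.iter(f"{{{W_NS}}}ins"))
    deletions = sum(1 for _ in root.iter(f"{{{W_NS}}}del"))
    return {"insertions": insertions, "deletions": deletions}


def _build_sample_docx():
    """Build a minimal in-memory archive with one tracked deletion and one insertion."""
    body = (
        f'<w:document xmlns:w="{W_NS}"><w:body><w:p>'
        '<w:r><w:t>The price is </w:t></w:r>'
        '<w:del w:id="1" w:author="A"><w:r><w:delText>$100</w:delText></w:r></w:del>'
        '<w:ins w:id="2" w:author="A"><w:r><w:t>$80</w:t></w:r></w:ins>'
        '</w:p></w:body></w:document>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", body)
    buffer.seek(0)
    return buffer


result = count_tracked_changes(_build_sample_docx())
print(result)
```

A producing party could run a check along these lines across a native production and route any document with a non-zero count to a reviewer, rather than relying on reviewers to spot hidden edits by eye.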

In addition, the production of native files may cause problems if a particular document needs to be redacted. It’s easy to redact a TIFF or a hard copy, but nearly impossible to produce a completely redacted native file.

Given the evolving e-discovery landscape, it’s likely that metadata and native file requests will become more and more common. Attorneys need to be aware of the implications of producing metadata.

Attorneys who face a native-file production request should consider the following:

• Explain to your client the implications of producing documents in native file format (e.g., the potential for metadata mining).

• Determine why the opposing party is asking for native documents, and see if a delimited text file containing the document’s metadata would suffice.

• If your client agrees to a native file production, prepare a confidentiality agreement, protective order or claw-back agreement to protect against the inadvertent production of evidence.


Charles T. Tsuji is an attorney and electronically stored information consultant at Blank Law + Technology PS. He provides advice to clients to ensure that sources of electronic documents are properly identified, preserved and collected. Mr. Tsuji has more than five years of e-discovery experience and is a graduate of the Seattle University School of Law.