Non-Latin Text Poses E-Discovery Problem
Wednesday, 11 November 2009 12:08

By Eric P. Blank & Jonathan Yeh

On Oct. 30, ICANN announced its long-expected decision to allow the use of non-Latin characters in Internet addressing. ICANN, the Internet Corporation for Assigned Names and Numbers, manages rules pertaining to Internet Protocol addresses and domain names. Until now, domain names have been limited to Latin characters, with some additional restrictions thrown in. For example, the Internet uses extensions such as .gov, .com, .ca and so forth.
These top-level domain names (TLDs) are used for Web site and email addressing. ICANN’s decision is a response to the growing use of the Internet in countries where the primary language does not use Latin characters (China, for example).

In ICANN’s official statement, ICANN chairman Peter Thrush said that “the coming introduction of non-Latin characters represents the biggest technical change to the Internet since it was created four decades ago. Right now, Internet address endings are limited to Latin characters – A to Z. But the Fast Track Process is the first step in bringing the 100,000 characters of the languages of the world online for domain names.”

Initially, the Internationalized Domain Names (IDNs) will be limited to country-code TLDs (e.g., the China country-code TLD “.cn” might become “.中國” or “.中国”). The longer-term intent is to extend the use of IDNs to all areas of the Internet.

On the one hand, IDN use is positive insofar as it makes the Internet truly global. On the other hand, the addition of non-Latin character sets could be problematic for the e-discovery industry if not immediately addressed by both the legal and technical communities.

Why? If the launch of IDNs follows historical Internet trends, we can expect spammers and scammers to exploit the inevitable initial weaknesses in the Internet-wide engineering changes to launch a series of new attacks, some of which will lead to litigation. We can also expect that free mail, social and texting sites will appear with the new IDNs, and that some of their users will believe that the new IDNs are somehow immune from laws relating to trademark, copyright, theft of trade secrets and defamation.

It is no protection that IDNs will initially be limited to country domains. Experience teaches that many countries give up control of their country domains to third parties with questionable business ethics, practices and models.

The technical e-discovery response is simple to describe, yet complex to implement.
Forensic software, search utilities, hosting platforms and other software must be upgraded to address the thousands of new character sets and to take into account the many ways in which their use might affect search methods.

It won’t work to simply translate the new IDNs. In many cases, it might be a mistake to do so. Literally, the Chinese characters for China might translate as “middle kingdom.” If the goal is to find all email from “johnsmith@starbucks.中国,” will “johnsmith@starbucks.middle kingdom” need to be searched as well as “johnsmith@starbucks.china”? Do contextual searches of translated documents need to be expanded to include concepts relating to “middle kingdom” and China?

And this will only get harder as IDN use expands. Take, for example, the Chinese characters that might be used for “Starbucks.cn”: 星巴克.中國. Translated literally, the first character means “star,” but the last two characters are a rough phonetic representation of the sound of the word “buck.” Unless programmed to recognize this, translation software might render those characters as gibberish involving conquering. Will searches need to account for finding translated documents containing or relating to the concept “Star conquer dot middle kingdom”?

Sound difficult? It is. And attorneys who assume that all e-discovery software and search engines are created equal are in for yet another rude awakening.

Here’s the good news: Technology is up to the task. I have no doubt that innovative software engineers will find ways to address all of the myriad ways in which the introduction of non-Latin domains will complicate e-discovery.

The bad news? The new IDNs will feature in litigation soon after their introduction, but e-discovery software in the hands of many attorneys will be outdated for years.
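
The search question raised above for “johnsmith@starbucks.中国” has one purely mechanical layer that can be sketched in code. An IDN is typically stored in two spellings: the Unicode form a user sees and an ASCII-compatible “xn--” (Punycode) form used on the wire, so a search that covers only one spelling may miss the other. The following is a minimal Python sketch, not a description of any e-discovery product. The function names are hypothetical, the list of TLD variants is assumed to be supplied by a person who knows the language, and Python’s built-in `idna` codec follows the older IDNA 2003 rules. It does no translation at all.

```python
def idn_search_forms(address: str) -> set[str]:
    """Return the Unicode and ASCII (Punycode) spellings of one email address."""
    local, _, domain = address.rpartition("@")
    # The built-in "idna" codec converts each non-ASCII label to its "xn--" form.
    ascii_domain = domain.encode("idna").decode("ascii")
    # Decoding again yields the normalized Unicode spelling of the same domain.
    unicode_domain = ascii_domain.encode("ascii").decode("idna")
    return {f"{local}@{unicode_domain}", f"{local}@{ascii_domain}"}


def expand_search_terms(local: str, host: str, tld_variants: list[str]) -> set[str]:
    """Build every spelling to search for, given hand-picked TLD variants."""
    terms: set[str] = set()
    for tld in tld_variants:
        terms |= idn_search_forms(f"{local}@{host}.{tld}")
    return terms


if __name__ == "__main__":
    # Hypothetical example: the Latin ccTLD plus simplified and traditional forms.
    variants = ["cn", "中国", "中國"]
    for term in sorted(expand_search_terms("johnsmith", "starbucks", variants)):
        print(term)
```

The sketch only enumerates encodings of addresses that are already known. Whether translated or phonetic renderings also belong in the search is the harder question, and it is one of scope that counsel must settle rather than software.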
Rather than waiting for technology to catch up or for e-discovery budgets to make room for upgrades, attorneys can proactively help to address IDN concerns through better communication and cooperation with respect to search terms.

Compare an analogous situation today: The problem of searching for “all of Mr. Smith’s email.” Mr. Smith may have many email accounts. Some are work-assigned, some personal and some shared with family. These accounts may not contain Mr. Smith’s name. Mr. Smith may receive emails addressed to both smith@sample.com and info@sample.com. He may have a personal account like smith@abc.com as well as hugehoopsfan@abc.com.

In the e-discovery industry, it’s vital in these kinds of situations that all parties understand what is meant, in scope and depth, by “search Smith’s email” and “we searched Smith’s email.” Effective e-discovery can only be achieved by properly accounting for all the various data sources.

As the Internet moves toward true globalization, parties will have an even greater need to understand precisely what is meant, in scope and depth, by “search Smith’s email.” Will searches include domain names that U.S.-based desktop computers can’t even type without special software? Will non-Latin text be translated and then searched, or searched using non-Latin keywords?

Attorneys can avoid costly mistakes by keeping up to date on the translation and language capabilities of both their own and opposing parties’ electronic discovery and document management systems, as well as carefully communicating with opposing counsel on search scope and depth.

Eric P. Blank is the founder and managing attorney of Blank Law + Technology PS. His practice focuses on electronic discovery counseling, e-security response planning and implementation, investigations and computer forensics. Mr. Blank has conducted more than 300 investigations into computer and software-related torts and employee misconduct since 2001 and has frequently been a court-appointed special master or neutral in e-discovery matters.

Jonathan Yeh is an attorney and principal at Blank Law + Technology PS. Mr. Yeh’s practice includes general commercial transactions and litigation, computer forensics, electronic evidence, electronic data and technology risk management and intellectual property. Mr. Yeh received his J.D. degree, cum laude, from the Seattle University School of Law and his undergraduate degree from the University of Georgia.
