Best Practices: NearDup - Lexbe – e-Discovery Fast · 2014-07-17 · additional cost in Lexbe...

Best Practices: NearDup Gene Albert Principal, Lexbe LC Using Near Duplicate ID to Detect Key Docs, Protect Privilege & Speed Reviews July 17, 2014

Page 1

Best Practices: NearDup

Gene Albert, Principal, Lexbe LC

Using Near Duplicate ID to Detect Key Docs, Protect Privilege & Speed Reviews

July 17, 2014

Page 2

eDiscovery Webinar Series

○ Takes Place Monthly

○ Covers a Variety of Relevant eDiscovery Topics

○ Presentations Available for Download by Registrants.

Best Practices: ‘NearDup’ Identification | eDiscovery Webinar Series | July 17, 2014

Info

Page 3

eDiscovery Webinar Series

Lexbe is an Austin, TX based eDiscovery software and services provider.

○ Lexbe eDiscovery Platform: A hosted eDiscovery processing and review tool. Users can load a variety of file types, process them for review, OCR them for search, conduct document reviews and productions, prepare for depositions and analyze transcripts, conduct case analytics, prepare for dispositive motions, and provide litigation support during trial.

○ Lexbe eDiscovery Services: Lexbe performs large-volume document culling, processing from native to PDF or TIFF, load file creation, high-volume OCR of image files, Rule 26 and project management consulting, and related eDiscovery services.

About Lexbe

Lexbe Sales [email protected]

(800) 401-7809 x22

Page 4

If you have any questions or technical issues, please e-mail them to:

[email protected]

Questions will be forwarded to Gene and answered during the webinar or via e-mail if we run out of time.

eDiscovery Webinar Series

Questions & Technical Issues

Page 5

○ Principal of Lexbe LC, a provider of cloud-based litigation review and document management software & eDiscovery services.

○ Prior business experience in software, medical services and internet-based businesses. Prior legal experience as in-house counsel and in private practice.

○ Frequent speaker and author on eDiscovery and legal technology issues.

○ Education: MBA, University of Texas (2005); JD, Southern Methodist University (1983); BA, University of Texas (1979)

○ Contact Gene [email protected]

eDiscovery Webinar Series

Gene Albert Bio

Page 6

Near Duplicate Detection

○ What is Near Duplicate Identification?

○ When is ‘NearDup’ Needed?

○ Inadvertent Privilege Release Example

○ Using ‘NearDup’ to:
    ■ Group Similar Documents
    ■ Find More Key Documents
    ■ Enable Email Threading
    ■ Prevent the Inadvertent Release of Privileged Information

○ NearDup Groupings+ service options from Lexbe

Agenda

Page 7

What Is It?

○ NearDup technology automatically recognizes similar documents within an e-discovery document collection

○ Algorithm analyzes, evaluates and compares the actual text content of the documents to each other
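
One common way to compare the text of two documents is to break each into overlapping word sequences ("shingles") and measure how many they share. The sketch below is an illustration only: the slides do not say which algorithm Lexbe uses, and the function names `shingles` and `jaccard` and the shingle size of 3 are assumptions made for this example.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b, k=3):
    """Jaccard similarity of two texts: shared shingles / all distinct shingles."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two empty texts are treated as identical
    return len(sa & sb) / len(sa | sb)


draft = "The quick brown fox jumps"
edited = "the quick brown fox leaps"
print(jaccard(draft, edited))  # two of the four distinct shingles are shared
```

A score near 1.0 suggests two versions of the same document, while a score near 0.0 suggests unrelated text.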

Near Duplicate Detection

[Figure labels: Unstructured Documents; NearDup Groupings]

Page 8

What Does It Do?

Near Duplicate Detection

NearDup technology groups similar documents even when they are not exactly the same. Examples include:

○ Separately scanned documents.

○ Multiple versions of a Word document that are slightly different due to minor edits, reformatting, etc.

○ An original document and one with handwritten notes on it.

○ Emails and responses that continue a conversational ‘chain’ or ‘thread’.

Page 9

Data Types and Volume Keep Growing

[Chart: Digital Information Created, Captured, Replicated Worldwide, in zettabytes (1 zettabyte = 1 trillion gigabytes), 2005 to 2015. Source: IDC Digital Universe Study (2012).]

2.8 zettabytes of information were created and replicated during 2012, a 56% increase from 2011 (IDC).

Data types shown: VoIP, Email, iPhones, Peer-to-Peer, Online Storage, Digital Cameras, Facebook, LinkedIn, DropBox, Backup Devices, Elastic Storage, SaaS, Google Streets, Personal Blogs, Skype, World Satellite Images, Personal Scanners, Customer Service Recordings, Public Webcams, Google Goggles, Netbooks, Cloud Instance Servers, PaaS

Need for Near Duplicate Detection

Page 10

Main Applications of NearDup

There are 4 main applications of NearDup analysis:

1) Grouping similar documents:
○ Bunch highly similar documents together for more efficient coding and review

2) Finding hidden ‘key’ or ‘hot’ docs:
○ Retrieve and mark unseen documents that have content highly related to existing ‘hot’ or ‘key’ documents

3) Preventing the inadvertent release of privileged information:
○ Be automatically alerted to files containing similar content to documents that have already been coded as privileged

4) Enabling email threading:
○ Maintain relationships between email conversations

Do I Need Near Duplicate Detection?

Page 11

Applying Near Duplicate Detection

Large Groupings Accelerate Review

Feature Description

Report identifies NearDup groups in a case based on extracted or OCRed text

Benefits

○ Accelerate document review by batch coding (using multidoc edit) larger groups

○ Increase coding consistency of batched documents

○ Reduce privilege errors
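
Batch coding depends on first collecting documents into groups. The following is a minimal sketch of that grouping step, assuming a simple word-overlap score and a hypothetical similarity threshold. Neither the scoring method nor the names `similarity` and `group_near_dups` come from the Lexbe platform.

```python
def similarity(a, b):
    """Word-set Jaccard similarity between two texts (an assumed, simple measure)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def group_near_dups(docs, threshold=0.8):
    """Group document ids whose texts are at least `threshold` similar.

    docs: dict mapping document id -> extracted text.
    Returns a sorted list of groups (each a sorted list of ids) with 2+ members.
    """
    ids = sorted(docs)
    parent = {d: d for d in ids}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Pairwise comparison is quadratic; fine for a sketch, not for large cases.
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if similarity(docs[a], docs[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for d in ids:
        groups.setdefault(find(d), []).append(d)
    return sorted(g for g in groups.values() if len(g) > 1)


docs = {
    "A": "lease agreement between landlord and tenant dated may",
    "B": "lease agreement between landlord and tenant dated june",
    "C": "lunch menu for friday",
}
print(group_near_dups(docs, threshold=0.7))  # A and B land in one group
```

Each returned group could then be coded in a single pass, which is where the consistency benefit comes from.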

Page 12

Applying Near Duplicate Detection

Find Similar Versions of Key Documents

Example

Similar versions of a Key Document are shown in the Document Viewer

Benefits

○ Follow the trail from one key document to others

○ Find key documents that would otherwise be missed

Page 13

Prevent Inadvertent Privilege Release

[Workflow graphic. Stages shown: Setup & Planning, Collection, Culling & Analysis, Processing, Depos & Motions, Review & Production]

Beware of Inadvertent Privilege Release

○ Larger cases have put a strain on accurate privilege review.

○ Finding 9 versions of a privileged document doesn’t help if you release version 10.

○ Nothing is more costly than compromising or losing a case because of privilege disclosure.

○ Claw-back agreements are a good idea, but no panacea. “You can’t unring a bell.”

Applying Near Duplicate Analysis

Page 14

Prevent Inadvertent Privilege Release

Applying Near Duplicate Analysis

Example Case: Thorncreek Apartments III, LLC v. Village of Park Forest (N.D. Ill. 2011)

○ At issue were six documents that Defendants produced to Plaintiffs and over which they later claimed attorney-client privilege.

○ Court determined that the Defendants were negligent in failing to check the production database created by a third-party e-discovery vendor before it became available to opposing counsel.

○ Court found waiver, relying in part on the long delay after production before Defendants attempted to claw back the documents and on their failure to timely prepare a privilege log.

○ Even if the court had allowed clawback, the sensitive information would already have been disseminated.

Page 15

Prevent Inadvertent Privilege Release

[Workflow graphic. Stages shown: Setup & Planning, Collection, Culling & Analysis, Processing, Depos & Motions, Review & Production]

Minimizing Risk of Privilege Release

○ Understand the Privilege Review process undertaken in detail.

○ Build dictionary of privileged sources and issues early in doc review.

○ Check for: untrained or sloppy review; unsearchable documents; incomplete search indices; poor redaction procedures; search not done in metadata and full-text; privilege text retained in natives, text files, load files, text-based PDFs.

○ Use specialized computerized privilege checks for container (email family) consistency, exact-dup and near-dup identification.

Applying Near Duplicate Analysis

Page 16

Prevent Inadvertent Privilege Release

Example

○ Privileged documents found 9 out of 10 times, but one missed

Benefit

○ Find privileged documents with similar text that could otherwise easily be missed

Applying Near Duplicate Detection

Page 17

Applying Near Duplicate Detection

Catch Privilege Inconsistencies

Feature Description

Report identifies inconsistent privilege and work product codings

Benefits

○ Reduce privilege errors

○ Avoid sole reliance on human coding consistency

○ Establish safeguards to help maintain privilege
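
The idea behind such a consistency report can be sketched as follows: within each near-duplicate group, flag any case where some members are coded privileged and others are not. This is an illustrative sketch only. The function name `privilege_inconsistencies` and the data shapes (lists of document ids and a dict of true/false codings) are assumptions, not the platform's actual report format.

```python
def privilege_inconsistencies(groups, coding):
    """Flag near-duplicate groups with mixed privilege coding.

    groups: iterable of lists of document ids (one list per near-dup group).
    coding: dict mapping document id -> True if coded privileged.
            Ids missing from `coding` are treated as not privileged.
    Returns one entry per mixed group, listing the documents to re-review.
    """
    report = []
    for group in groups:
        privileged = [d for d in group if coding.get(d, False)]
        not_privileged = [d for d in group if not coding.get(d, False)]
        if privileged and not_privileged:
            report.append({
                "group": list(group),
                "privileged": privileged,
                "review": not_privileged,  # similar text, but not coded privileged
            })
    return report


groups = [["D1", "D2", "D3"], ["D4", "D5"]]
coding = {"D1": True, "D2": True, "D3": False, "D4": False, "D5": False}
for entry in privilege_inconsistencies(groups, coding):
    print(entry["review"])  # documents to check before production
```

Running a check like this before production targets the "version 10" problem: a document resembling nine privileged ones that was itself never tagged.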

Page 18

Applying Near Duplicate Detection

Email Threading

Feature Description

Group email messages that have similar text representing a conversation thread

Benefits

○ View email chains with similar text in date & time order

○ Avoid confusion from emails that are only tangentially related (<50% text overlap)

○ Consistently code email chains for responsiveness, privilege, attorney-eyes only, etc.
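
A rough sketch of text-based threading: collect the messages whose text overlaps a chosen message by at least 50%, then sort them by sent time. The 50% cutoff comes from the slide. The overlap measure used here (share of the shorter message's words found in the other message) and the names `overlap` and `thread` are assumptions for illustration.

```python
from datetime import datetime


def overlap(a, b):
    """Fraction of the shorter text's distinct words that also appear in the other."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / min(len(wa), len(wb))


def thread(emails, seed_id, min_overlap=0.5):
    """Return ids of emails overlapping the seed message, in date & time order.

    emails: dict mapping message id -> (sent datetime, body text).
    """
    seed_text = emails[seed_id][1]
    members = [
        eid for eid, (_, text) in emails.items()
        if overlap(seed_text, text) >= min_overlap
    ]
    return sorted(members, key=lambda eid: emails[eid][0])


emails = {
    "m1": (datetime(2014, 7, 1, 9, 0), "please review the draft contract"),
    "m2": (datetime(2014, 7, 1, 10, 0),
           "looks fine to me please review the draft contract"),
    "m3": (datetime(2014, 7, 2, 8, 0), "lunch on friday"),
}
print(thread(emails, "m2"))  # the reply and the message it quotes, oldest first
```

Measuring against the shorter message suits replies that quote the original in full; a production system would more likely combine text similarity with email header metadata.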

Page 19

Included with Lexbe eDiscovery Platform

Applying Near Duplicate Analysis

○ Near Duplicate Identification is included at no additional cost in Lexbe eDiscovery Platform.

○ You can automatically apply ‘NearDup’ to documents you self-upload into the platform to group similar documents and review for privilege coding consistency.

Page 20

Applying ‘NearDup’ in The Cloud

Lexbe eDiscovery Platform

Cloud-based litigation document management software

FEATURES

● Self-administration
● Native (Office, etc.) processing
● Automatic OCR
● Early case analysis
● Dual-index search
● Exact & near-dup ID
● Doc Review & issue tagging
● Blended productions
● Transcript management
● Timelining, depo prep
● Dispositive motions
● Trial document management

Page 21

Included in Processing Services

Applying Near Duplicate Analysis

We apply NearDup Groupings+ to the following processing services at no additional charge:

○ Native Processing+ (TIFF): Convert Outlook, Microsoft Office, and other native file types for review in in-house TIFF-based systems

○ Native Processing+ (PDF): Convert Outlook, Microsoft Office, and other native file types into searchable PDFs for review

○ Native Extraction+: Prepare case data for native or near-native review

Page 22

Security & Data Ownership

What to look for in litigation cloud service offerings:

○ Encryption: Data should be encrypted (256-bit or above) both in place and in transit.

○ Data Center Certifications: Data centers should be certified and follow industry standards.

○ Clear Ownership Rights: Service agreements should clearly acknowledge client data ownership.

○ Redundant Back-Ups; Recovery: Service provider should have robust and redundant backup & recovery protocols.

Applying ‘NearDup’ in The Cloud

Page 23

Summary

Use ‘NearDup’ to Improve Doc Reviews

○ Faster Review: Group incoming documents by similarity for faster, more efficient coding.

○ Find Hot Docs: Find hidden ‘hot’ documents with similar content to files you’ve already marked as being particularly important to a case.

○ Prevent Privilege Release: Identify documents containing privileged information that haven’t been consistently tagged before producing them to opposing counsel.

○ Better Email Review: Easily and coherently review email conversation threads drawn from different custodians.

Page 24

Thank You

Contact Info

Gene Albert: Lexbe Principal

[email protected] (512) 686-3382

Stu Van Dusen: Marketing Manager

[email protected] (512) 843-7672

Lexbe Sales: [email protected] (800) 401-7809 x22

Webinar Questions: [email protected]
