Lexbe eDiscovery Webinar- Best Practices: NearDup
Best Practices: NearDup
Gene Albert, Principal, Lexbe LC
Using Near Duplicate ID to Detect Key Docs, Protect Privilege & Speed Reviews
July 17, 2014
eDiscovery Webinar Series
○ Takes Place Monthly
○ Covers a Variety of Relevant eDiscovery Topics
○ Presentations Available for Download by Registrants.
Best Practices: ‘NearDup’ Identification | eDiscovery Webinar Series | July 17, 2014
Info
eDiscovery Webinar Series
Lexbe is an Austin, TX based eDiscovery software and services provider.
○ Lexbe eDiscovery Platform: A hosted eDiscovery processing and review tool. Users can load a variety of file types, process them for review, OCR them for search, conduct document reviews and productions, prepare for depositions and analyze transcripts, conduct case analytics, prepare for dispositive motions, and provide litigation support during trial.
○ Lexbe eDiscovery Services: Lexbe performs large-volume document culling, processing from native to PDF or TIFF, load file creation, high-volume OCR of image files, Rule 26 and project management consulting, and related eDiscovery services.
About Lexbe
Lexbe Sales [email protected]
(800) 401-7809 x22
If you have any questions or technical issues, please e-mail them to:
Questions will be forwarded to Gene and answered during the webinar or via e-mail if we run out of time.
eDiscovery Webinar Series: Questions & Technical Issues
○ Principal of Lexbe LC, a provider of cloud-based litigation review and document management software & eDiscovery services.
○ Prior business experience in software, medical services and internet-based businesses. Prior legal experience as in-house counsel and in private practice.
○ Frequent speaker and author on eDiscovery and legal technology issues.
○ Education: MBA, University of Texas (2005); JD, Southern Methodist University (1983); BA, University of Texas (1979)
○ Contact Gene [email protected]
eDiscovery Webinar Series: Gene Albert Bio
Near Duplicate Detection
○ What is Near Duplicate Identification?
○ When is ‘NearDup’ Needed?
○ Inadvertent Privilege Release Example
○ Using ‘NearDup’ to:
■ Group Similar Documents
■ Find More Key Documents
■ Enable Email Threading
■ Prevent the Inadvertent Release of Privileged Information
○ NearDup Groupings+ service options from Lexbe
Agenda
What Is It?
○ NearDup technology automatically recognizes similar documents within an e-discovery document collection
○ Algorithm analyzes, evaluates and compares the actual text content of the documents to each other
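The slides do not say which algorithm Lexbe uses. As a minimal sketch of one common way to compare the text content of two documents, the Python below splits each text into overlapping word "shingles" and scores the overlap with Jaccard similarity. The function names, the shingle size of 3 and the sample texts are illustrative assumptions, not Lexbe's implementation.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in text (lowercased, punctuation ignored)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        # Very short texts become a single shingle (or nothing if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items divided by all items."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity(text_a, text_b, k=3):
    """Score two texts between 0.0 (no shared shingles) and 1.0 (identical shingles)."""
    return jaccard(shingles(text_a, k), shingles(text_b, k))


# Hypothetical sample texts: an original, a lightly edited version, an unrelated note.
original = "The parties agree to settle all claims arising from the lease."
edited = "The parties agree to settle all claims arising from the sublease."
unrelated = "Lunch menu for Friday: soup, salad and sandwiches."

print(similarity(original, edited))     # high score: one word changed
print(similarity(original, unrelated))  # no shared three-word sequences
```

A one-word edit changes only the shingles that contain that word, so the edited version still scores high, while the unrelated note shares no three-word sequence with the original.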
Near Duplicate Detection
[Diagram: Unstructured Documents sorted into NearDup Groupings]
What Does It Do?
Near Duplicate Detection
NearDup technology will group similar documents even when they are not exactly the same. Examples include:
○ Separately scanned documents.
○ Multiple versions of a Word document that are slightly different due to minor edits, reformatting, etc.
○ An original document and one with handwritten notes on it.
○ Emails and responses that continue a conversational ‘chain’ or ‘thread’.
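How similar documents might end up in one group can be sketched as follows: score every pair of documents and merge pairs that meet a threshold. This is an illustration only. The word-set Jaccard score, the 0.6 threshold, the `group_near_dups` name and the sample documents are assumptions, and comparing every pair would not scale to a large collection without an indexing technique.

```python
import re
from itertools import combinations


def word_set(text):
    """Lowercased set of words in the text, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Shared words divided by all words; 0.0 when both sets are empty."""
    return len(a & b) / len(a | b) if (a or b) else 0.0


def group_near_dups(docs, threshold=0.6):
    """Group document ids whose word-set similarity meets the threshold.

    docs maps a document id to its extracted or OCRed text. Groups are built
    transitively with a small union-find, so A~B and B~C puts A, B, C together.
    """
    parent = {doc_id: doc_id for doc_id in docs}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    sets = {doc_id: word_set(text) for doc_id, text in docs.items()}
    for a, b in combinations(docs, 2):
        if jaccard(sets[a], sets[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for doc_id in docs:
        groups.setdefault(find(doc_id), []).append(doc_id)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical collection: two versions of a draft and one unrelated file.
docs = {
    "DOC-1": "Draft agreement between Acme and Beta for supply of widgets",
    "DOC-2": "Draft agreement between Acme and Beta for supply of widgets v2",
    "DOC-3": "Quarterly parking garage maintenance schedule",
}
print(group_near_dups(docs))
```

The two draft versions fall into one group and the unrelated schedule stays in a group of its own.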
Data Types and Volume Keep Growing
[Chart: Digital Information Created, Captured, Replicated Worldwide, in zettabytes*, 2005 to 2015. Source: IDC Digital Universe Study (2012)]
* 1 Zettabyte = 1 Trillion Gigabytes
2.8 zettabytes of information were created and replicated during 2012, a 56% increase from 2011 (IDC)
Data sources shown: VoIP, Email, iPhones, Peer-to-Peer, Online Storage, Digital Cameras, Facebook, LinkedIn, DropBox, Backup Devices, Elastic Storage, SaaS, Google Streets, Personal Blogs, Skype, World Satellite Images, Personal Scanners, Customer Service Recordings, Public Webcams, Google Goggles, Netbooks, Cloud Instance Servers, PaaS
Need for Near Duplicate Detection
Main Applications of NearDup
There are 4 main applications of NearDup analysis:
1) Grouping similar documents:
○ Bunch highly similar documents together for more efficient coding and review
2) Finding hidden ‘key’ or ‘hot’ docs:
○ Retrieve and mark unseen documents that have content highly related to existing ‘hot’ or ‘key’ documents
3) Preventing the inadvertent release of privileged information:
○ Be automatically alerted to files containing similar content to documents that have already been coded as privileged
4) Enabling email threading:
○ Maintain relationships between email conversations
Do I Need Near Duplicate Detection?
Applying Near Duplicate Detection: Large Groupings Accelerate Review
Feature Description
Report identifies NearDup groups in a case based on extracted or OCRed text
Benefits
⃝ Accelerate document review by batch coding (using multidoc edit) larger groups
⃝ Increase coding consistency of batched documents
⃝ Reduce privilege errors
Applying Near Duplicate Detection: Find Similar Versions of Key Documents
Example
Similar versions of a Key Document are shown in the Document Viewer
Benefits
⃝ Follow the trail from one key document to others.
⃝ Find key documents that would otherwise be missed
Prevent Inadvertent Privilege Release
[Workflow diagram stages: Setup & Planning, Collection, Culling & Analysis, Processing, Depos & Motions, Review & Production]
Beware of Inadvertent Privilege Release
○ Larger cases have put a strain on accurate privilege review.
○ Finding 9 versions of a privileged document doesn’t help if you release version 10.
○ Nothing is more costly than compromising or losing a case because of privilege disclosure.
○ Claw-back agreements are a good idea, but no panacea. “You can’t unring a bell.”
Applying Near Duplicate Analysis
Prevent Inadvertent Privilege Release: Applying Near Duplicate Analysis
Example Case: Thorncreek Apartments III, LLC v. Village of Park Forest (N.D. Ill. 2011)
○ At issue were six documents that Defendants produced to Plaintiffs and later claimed were protected by attorney-client privilege.
○ Court determined that the Defendants were negligent by failing to check the production database created by a third-party e-discovery vendor before it became available to opposing counsel.
○ Court found waiver, relying in part on the long delay after production before Defendants attempted to claw back the documents and on their failure to prepare a timely privilege log.
○ Even if the court had allowed clawback, the sensitive information would already have been disseminated.
Prevent Inadvertent Privilege Release
Minimizing Risk of Privilege Release
○ Understand the Privilege Review process undertaken in detail.
○ Build a dictionary of privileged sources and issues early in doc review.
○ Check for: untrained or sloppy review; unsearchable documents; incomplete search indices; poor redaction procedures; search not done in metadata and full-text; privilege text retained in natives, text files, load files, text-based PDFs.
○ Use specialized computerized privilege checks for container (email family) consistency, exact-dup and near-dup identification.
Applying Near Duplicate Analysis
Prevent Inadvertent Privilege Release
Example
⃝ Privileged documents found 9 out of 10 times, but one missed
Benefit
⃝ Find privileged documents with text similarity that can be easily missed otherwise
Applying Near Duplicate Detection
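The "version 10" problem can be illustrated with a short sketch: compare each document queued for production against the documents already coded as privileged, and flag any that look like another version. The names (`flag_similar_to_privileged`, the document ids), the 0.7 threshold and the word-set Jaccard score are assumptions chosen for illustration, not a description of how the Lexbe platform does this.

```python
import re


def word_set(text):
    """Lowercased set of words in the text, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Shared words divided by all words; 0.0 when both sets are empty."""
    return len(a & b) / len(a | b) if (a or b) else 0.0


def flag_similar_to_privileged(privileged, candidates, threshold=0.7):
    """Return (candidate_id, privileged_id, score) for candidates that
    resemble a document already coded as privileged."""
    priv_sets = {pid: word_set(text) for pid, text in privileged.items()}
    if not priv_sets:
        return []
    flagged = []
    for cid, text in candidates.items():
        cset = word_set(text)
        # Find the privileged document this candidate resembles most.
        best_id, best = max(
            ((pid, jaccard(cset, pset)) for pid, pset in priv_sets.items()),
            key=lambda pair: pair[1],
        )
        if best >= threshold:
            flagged.append((cid, best_id, round(best, 2)))
    return flagged


# Hypothetical data: one privileged memo and two documents queued for production.
privileged = {
    "PRIV-1": "Counsel advises that the termination clause exposes the company to liability",
}
candidates = {
    "PROD-7": "Counsel advises that the termination clause exposes the company to significant liability",
    "PROD-8": "Please find attached the invoice for March",
}
flagged = flag_similar_to_privileged(privileged, candidates)
print(flagged)
```

A flagged document is only a prompt for a second look by a reviewer; the similarity score does not decide privilege.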
Applying Near Duplicate Detection: Catch Privilege Inconsistencies
Feature Description
Report identifies inconsistently coded privilege and work product codings
Benefits
⃝ Reduce privilege errors
⃝ Avoid sole reliance on human coding consistency
⃝ Establish safeguards to help maintain privilege
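A consistency report of this kind can be sketched in a few lines, assuming near-dup groups and reviewer codings are already available as simple mappings. The data shapes, coding labels and function name below are hypothetical and are not taken from the Lexbe platform.

```python
def privilege_inconsistencies(groups, coding):
    """Return near-dup groups whose members do not all share the same coding.

    groups: mapping of group id -> list of document ids
    coding: mapping of document id -> coding label
            (ids missing from coding are treated as "Uncoded")
    """
    report = {}
    for group_id, doc_ids in groups.items():
        labels = {doc_id: coding.get(doc_id, "Uncoded") for doc_id in doc_ids}
        if len(set(labels.values())) > 1:
            report[group_id] = labels
    return report


# Hypothetical review state: G1 has one version coded differently from the rest.
groups = {"G1": ["DOC-1", "DOC-2", "DOC-3"], "G2": ["DOC-4", "DOC-5"]}
coding = {
    "DOC-1": "Privileged",
    "DOC-2": "Privileged",
    "DOC-3": "Not Privileged",
    "DOC-4": "Work Product",
    "DOC-5": "Work Product",
}
report = privilege_inconsistencies(groups, coding)
for group_id, labels in report.items():
    print(group_id, labels)
```

Only G1 is reported, because DOC-3 carries a different coding from its near duplicates. G2 is consistently coded and is left out.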
Applying Near Duplicate Detection: Email Threading
Feature Description
Group email messages that have similar text representing a conversation thread
Benefits
⃝ View email chains with similar text in date & time order
⃝ Avoid confusion from emails that are only tangentially related (<50% text overlap)
⃝ Consistently code email chains for responsiveness, privilege, attorney-eyes only, etc.
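The 50% cutoff and the date ordering can be illustrated with a small sketch. The slides do not say how text overlap is measured, so the measure here (the share of the shorter message's words found in the other message, which suits replies that quote the original) is an assumption, as are the function names and the sample emails.

```python
import re
from datetime import datetime


def word_set(text):
    """Lowercased set of words in the text, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(a, b):
    """Share of the smaller message's words that also appear in the other one."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def thread_for(seed_id, emails, min_overlap=0.5):
    """Return ids of emails overlapping the seed by at least min_overlap, oldest first."""
    seed_words = word_set(emails[seed_id]["body"])
    members = [
        eid for eid, msg in emails.items()
        if overlap(seed_words, word_set(msg["body"])) >= min_overlap
    ]
    return sorted(members, key=lambda eid: emails[eid]["sent"])


# Hypothetical mailbox: E2 quotes E1; E3 only happens to mention Friday.
emails = {
    "E1": {"sent": datetime(2014, 3, 3, 9, 0),
           "body": "Can we move the deposition to Friday?"},
    "E2": {"sent": datetime(2014, 3, 3, 10, 30),
           "body": "Friday works for me. > Can we move the deposition to Friday?"},
    "E3": {"sent": datetime(2014, 3, 4, 8, 15),
           "body": "Reminder: the office is closed Friday for maintenance."},
}
print(thread_for("E1", emails))
```

E3 shares only two words with E1, which is below the 50% cutoff, so it stays out of the thread even though it mentions the same day.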
Applying Near Duplicate Analysis: Included with Lexbe eDiscovery Platform
○ Near Duplicate Identification is included at no additional cost in Lexbe eDiscovery Platform.
○ You can automatically apply ‘NearDup’ to documents you self-upload into the platform to group similar documents and review for privilege coding consistency.
Applying ‘NearDup’ in The Cloud: Lexbe eDiscovery Platform
● Self-administration
● Native (Office, etc.) processing
● Automatic OCR
● Early case analysis
● Dual-index search
● Exact & near-dup ID
● Doc Review & issue tagging
● Blended productions
● Transcript management
● Timelining, depo prep
● Dispositive motions
● Trial document management
Cloud-based litigation document management software
FEATURES
Applying Near Duplicate Analysis: Included in Processing Services
We apply NearDup Groupings+ to the following processing services at no additional charge:
○ Native Processing+ (TIFF): Convert Outlook, Microsoft Office, and other native file types for review in in-house TIFF-based systems
○ Native Processing+ (PDF): Convert Outlook, Microsoft Office, and other native file types into searchable PDFs for review
○ Native Extraction+: Prepare case data for native or near-native review
Security & Data Ownership
What to look for in litigation cloud service offerings:
○ Encryption: Data encrypted (256-bit or above) in place and in transit.
○ Data Center Certifications: Data centers should be certified, follow industry best standards, etc.
○ Clear Ownership Rights: Service agreements should clearly acknowledge client data ownership.
○ Redundant Back-Ups; Recovery: Service provider should have robust and redundant backup & recovery protocols.
Applying ‘NearDup’ in The Cloud
Summary
Use ‘NearDup’ to Improve Doc Reviews
○ Faster Review: Group incoming documents by similarity for faster, more efficient coding.
○ Find Hot Docs: Find hidden ‘hot’ documents with similar content to files you’ve already marked as being particularly important to a case.
○ Prevent Privilege Release: Before producing to opposing counsel, identify documents containing privileged information that haven’t been consistently tagged.
○ Better Email Review: Review email conversation threads drawn from different custodians easily and coherently.
Thank You
Contact Info
Gene Albert: Lexbe Principal
[email protected] (512) 686-3382
Stu Van Dusen: Marketing Manager
[email protected] (512) 843-7672
Lexbe Sales: [email protected] (800) 401-7809 x22
Webinar Questions: [email protected]