Pcori2013 (23)


Platform for Patient Centric Collaborative Research

Dadong Wan, Sophia Cao, Karthik Gomadam, Accenture Technology Labs, San Jose, CA.

{ dadong.wan, sophia.cao, karthik.gomadam }@accenture.com

1 Abstract

The Affordable Care Act (ACA) is perhaps the most significant “face-lift” in the U.S. healthcare system since the introduction of Medicare and Medicaid. Key focus areas of the ACA include evidence-based care and pay for performance. Patient engagement is at the heart of both of these focus areas. However, finding relevant patients to engage with medical providers is an important challenge. In this paper, we describe a solution to this problem that leverages patient data available in online health communities and matches the patients in these communities to relevant projects. Our solution can be applied to data from any patient community, and patients can engage with researchers from within the communities they are already a part of. We believe that this approach will help researchers find highly relevant patients and will enable patient-centric, dynamic, and responsive research.

2 Introduction

The Affordable Care Act (ACA) is perhaps the most significant “face-lift” in the U.S. healthcare system since the introduction of Medicare and Medicaid. Key focus areas of the ACA include evidence-based care and pay for performance. Patient engagement is at the heart of both of these focus areas. For example, researchers who want to study the effectiveness of levetiracetam, lamotrigine, or oxcarbazepine on pediatric epilepsy patients should engage with the patients and their caregivers. Measuring this engagement would allow them to validate both their care plan process for helping patients manage their conditions and their treatment plans. Having these validations will help providers analyze and optimize their performance in the pay-for-performance age. However, finding relevant patients to engage with medical providers is a non-trivial problem. In the above example, providers will need to recruit patients who are children, have epilepsy, and are prescribed levetiracetam, lamotrigine, or oxcarbazepine.

In this paper, we propose a solution to address this problem using the data from online health communities. Our past experience developing applications to match patients and clinical trial investigators showed us that patients will not flock to recruitment platforms and that any meaningful solution should "fish where the fish are". We realized that patient communities such as PatientsLikeMe and Medhelp have millions of patients who are sharing information about their medical conditions, medications, and their experience in managing their conditions. We developed a solution that takes advantage of this patient data, allowing researchers to find patients from these communities. We apply semantic and text mining algorithms to analyze patient conversations in these communities and build rich patient profiles that capture their medical conditions, medications, and demographic information. We build similar profiles for research projects (listed at PCORI.org). We then match and rank the project and the patient profiles to find the most relevant patients.

One challenge in matching projects with patients based on patient conversations is that different participants (researchers, patients, caregivers) describe the same thing in different ways. For example, a researcher will use "diabetes mellitus" while a patient might say "type 2". Using semantic Web technologies (UMLS ontologies, the OpenCalais entity extractor, semantic type matching) allows us to overcome this problem.
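The vocabulary gap described above can be illustrated with a minimal sketch. The `SYNONYMS` table and the `normalize_terms` helper are hypothetical and hand-written for illustration; an actual system would resolve phrases against UMLS concepts rather than a small dictionary.

```python
# Hypothetical synonym table: surface phrase -> shared concept label.
# This hand-written dict only stands in for an ontology lookup (e.g. UMLS).
SYNONYMS = {
    "diabetes mellitus": "diabetes mellitus",
    "type 2 diabetes": "diabetes mellitus",
    "type 2": "diabetes mellitus",
    "seizures": "epilepsy",
    "epilepsy": "epilepsy",
}


def normalize_terms(text):
    """Return the set of concept labels whose phrases occur in the text."""
    lowered = text.lower()
    return {concept for phrase, concept in SYNONYMS.items() if phrase in lowered}


# A researcher and a patient use different words for the same concept.
researcher = normalize_terms("Outcomes of diabetes mellitus management programs")
patient = normalize_terms("I was diagnosed with type 2 last year")
shared = researcher & patient
```

Once both descriptions are mapped to the same concept labels, ordinary set operations can compare them even though the original wording differs.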

We have built a prototype (available at: http://bit.ly/pccr_acn) that demonstrates the effectiveness of our approach in finding patients. Due to privacy concerns, we were not able to integrate with existing online communities. Instead, we developed a sample online community, MeMed (available at: http://bit.ly/me_med_), and created posts similar to those found in existing communities. Our prototype allows users to add PCORI projects and find matching patients in MeMed.

3 Overview of the PCCR Platform

In this section we briefly describe the PCCR platform. We begin by describing the main models in the system. These are illustrated in Figure 1.

1. Investigators: This model captures information about the investigators who are seeking participants for their projects. We model the institution and the areas of interest for an investigator. The areas of interest of an investigator are automatically created by analyzing their projects.

2. Projects: Each investigator can have multiple projects. Each project has a title, description, goals, a project type that captures the nature of the project, the medical conditions and medications of interest described in the project, and the expected outcomes. Our matching algorithm matches participants across these different dimensions and calculates a match score. The patients are ranked based on this match score.

3. Patients: We extract patient profiles based on their conversations and participation in existing online health communities. We identify and use their demographic, socio-economic, and medical information in creating their profile.
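The three models listed above could be represented roughly as follows. This is a sketch under stated assumptions, not the platform's actual schema: the class and field names are illustrative, and deriving an investigator's areas of interest from the conditions and medications of their projects is one assumed reading of "automatically created by analyzing their projects".

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Project:
    title: str
    description: str
    goals: str = ""
    project_type: str = ""  # e.g. "therapeutic"
    conditions: Set[str] = field(default_factory=set)
    medications: Set[str] = field(default_factory=set)
    expected_outcomes: List[str] = field(default_factory=list)


@dataclass
class Investigator:
    name: str
    institution: str
    projects: List[Project] = field(default_factory=list)

    @property
    def areas_of_interest(self) -> Set[str]:
        """Assumed derivation: union of conditions and medications over projects."""
        areas: Set[str] = set()
        for project in self.projects:
            areas |= project.conditions | project.medications
        return areas


@dataclass
class Patient:
    patient_id: str
    age: Optional[int] = None
    gender: Optional[str] = None
    location: Optional[str] = None
    economic_status: Optional[str] = None
    conditions: Set[str] = field(default_factory=set)
    medications: Set[str] = field(default_factory=set)


# Illustrative data only.
epilepsy_study = Project(
    title="Pediatric epilepsy study",
    description="Effectiveness of levetiracetam in children with epilepsy",
    conditions={"epilepsy"},
    medications={"levetiracetam"},
)
pi = Investigator(name="Dr. Example", institution="Example University",
                  projects=[epilepsy_study])
```

Keeping conditions and medications as sets makes the later containment-based matching straightforward.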


[Figure not reproduced. Labels recoverable from the diagram: Investigators (name, organization, areas of interest, project history: Project 1, Project 2, Project k); Project definition (statement, type, medical conditions, demographics, goals, outcome); Type (preventive, diagnostic, therapeutic, palliative, health delivery); Demographics (age, gender, economic, region, race); Medical conditions (medical condition, condition stage, medication); Outcome (trial, tests, studies, surveys); Patients (name, age, gender, location, economic status, race, areas of interest, medical conditions / stage, activity). Legend: PR = patient response, UC = user comment.]

Figure 1: PCCR Matching Platform - Data Definitions

[Figure not reproduced. Labels recoverable from the diagram: Project Title & Description; Patient Communities; Big Data & Multidimensional Semantic Analysis (shown twice); Rich Project Profile; Rich Patient Profile; Multidimensional Semantic Match Engine; Matched Participants Across Online Patient Communities.]

Figure 2: PCCR Matching Platform - Data Flow

At the heart of the PCCR platform is our matching engine, which uses the researcher and patient profiles to identify relevant patients for a project. Figure 2 illustrates the data flow of the matching engine. Its two main components are the researcher profile generator and the patient profile generator.

The researcher profile generator takes as input the textual description of a research project. For the purposes of this challenge, we use the descriptions of funded PCORI projects. This description is passed through a semantic analyzer. The semantic analyzer is built using concepts in RxNorm and SNOMED and identifies medical terminologies and concepts in the description, along with their semantic types. In addition to the semantic analyzer, the description is also sent to the OpenCalais Web API for entity identification. A final list of entities and types is created by combining the output of the semantic analyzer and OpenCalais. The demographic analyzer module extracts demographic information (such as the age group of the target population, gender, and location information). We use textual cues to identify expected outcomes. Figure 3 illustrates the entities identified from the description of a PCORI project on epilepsy.

Figure 3: Example output of semantic analysis
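Combining the outputs of the semantic analyzer and OpenCalais, as described above, might look like the following sketch. The `merge_entities` helper and the sample extractor outputs (including the type labels) are made up for illustration; no real OpenCalais call is made and the actual merge rules used by the platform are not specified in the text.

```python
from collections import defaultdict


def merge_entities(*sources):
    """Combine (entity, semantic_type) pairs from several extractors.

    Entities are keyed case-insensitively, and the types reported by
    each extractor for the same entity are unioned.
    """
    merged = defaultdict(set)
    for source in sources:
        for entity, sem_type in source:
            merged[entity.strip().lower()].add(sem_type)
    return dict(merged)


# Made-up outputs standing in for the two extractors.
analyzer_out = [
    ("Epilepsy", "Disease or Syndrome"),
    ("levetiracetam", "Pharmacologic Substance"),
]
calais_out = [
    ("epilepsy", "MedicalCondition"),
    ("Children", "Demographic"),
]
entities = merge_entities(analyzer_out, calais_out)
```

Keeping every type reported for an entity, rather than picking one, preserves information that later semantic type matching can use.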

The patient profile generator also uses the semantic analyzer and the demographic analyzer. However, given the volume of patient data and the cost of semantic analysis, we needed to adopt a more scalable approach. We use a Map-Reduce based solution consisting of a series of map and reduce jobs. The first map job takes user profiles as input and uses the semantic analyzer to identify entities and types. In parallel, another map job extracts entities and types using OpenCalais. The respective reduce jobs combine all the identified entities for a patient. We merge these lists to create a semantic signature of the patient, consisting of a collection of entities and their types. The demographic and socio-economic information is identified in a similar way. Combining all of the above information yields a rich patient profile. We store the profile as a structured object in MongoDB.
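The map and reduce steps described above can be sketched in plain Python. This is a single-process illustration of the pattern, not the platform's actual jobs: `VOCAB` is a hypothetical lookup standing in for the semantic analyzer, and the post format is assumed.

```python
from collections import defaultdict

# Hypothetical vocabulary standing in for the semantic analyzer.
VOCAB = {
    "epilepsy": "condition",
    "seizure": "condition",
    "levetiracetam": "medication",
    "lamotrigine": "medication",
}


def map_post(post):
    """Map step: emit (patient_id, {(entity, type), ...}) for one post."""
    words = post["text"].lower().replace(",", " ").replace(".", " ").split()
    found = {(word, VOCAB[word]) for word in words if word in VOCAB}
    return post["patient_id"], found


def reduce_by_patient(mapped):
    """Reduce step: union the entity sets emitted for each patient."""
    signatures = defaultdict(set)
    for patient_id, found in mapped:
        signatures[patient_id] |= found
    return dict(signatures)


# Illustrative posts only.
posts = [
    {"patient_id": "p1", "text": "My son has epilepsy."},
    {"patient_id": "p1", "text": "He started levetiracetam last month."},
    {"patient_id": "p2", "text": "Lamotrigine helped with my seizure frequency."},
]
signatures = reduce_by_patient(map(map_post, posts))
```

Because each post is mapped independently and the reduce step is a set union, the work can be partitioned across machines by post and regrouped by patient.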

The matching algorithm takes as input a rich project profile. For each of the facets (medical condition, medication, and demographics), the matching algorithm first finds the relevant patients using set containment operators. We also use MongoDB's geo querying to filter users by location, if the project description mentions such a restriction. Further, we apply a semantic similarity measure (based on Ted Pedersen's UMLS Similarity project, available at http://umls-similarity.sourceforge.net/) to compute the semantic similarity of a patient profile to that of a project. All of these are then combined to create a match score that is used in selecting and ranking patients.
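A simplified version of the filter-then-score flow described above is sketched below. Several parts are assumptions: Jaccard overlap is swapped in for the UMLS-based semantic similarity, the facet weights are arbitrary, the age-range check stands in for demographic matching, and the location filter is omitted. The function and field names are hypothetical.

```python
def jaccard(a, b):
    """Stand-in similarity: |a & b| / |a | b| (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_score(project, patient, weights=(0.5, 0.3, 0.2)):
    """Return a score in [0, 1], or None if the patient is filtered out."""
    # Set containment filter: the patient must have every project condition.
    if not project["conditions"] <= patient["conditions"]:
        return None
    low, high = project["age_range"]
    age_ok = 1.0 if low <= patient["age"] <= high else 0.0
    w_cond, w_med, w_demo = weights  # arbitrary illustrative weights
    return (w_cond * jaccard(project["conditions"], patient["conditions"])
            + w_med * jaccard(project["medications"], patient["medications"])
            + w_demo * age_ok)


def rank_patients(project, patients):
    """Score every patient, drop filtered ones, and sort best first."""
    scored = [(match_score(project, p), p["id"]) for p in patients]
    return sorted(((s, pid) for s, pid in scored if s is not None), reverse=True)


# Illustrative data only.
project = {
    "conditions": {"epilepsy"},
    "medications": {"levetiracetam", "lamotrigine", "oxcarbazepine"},
    "age_range": (0, 17),
}
patients = [
    {"id": "p1", "conditions": {"epilepsy"}, "medications": {"levetiracetam"}, "age": 9},
    {"id": "p2", "conditions": {"epilepsy", "asthma"}, "medications": {"lamotrigine"}, "age": 34},
    {"id": "p3", "conditions": {"diabetes mellitus"}, "medications": set(), "age": 12},
]
ranking = rank_patients(project, patients)
```

In this example `p1` ranks above `p2`, and `p3` is removed by the containment filter before scoring.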

4 Related Work

The techniques we have used in this paper build upon prior research in the areas of the semantic Web, hierarchical object matching, and entity extraction. In the context of patient matching for healthcare, the TrialX system [4] is very relevant to our work. We also use our prior work in the area of faceted matching and searching of unstructured documents [3] for facet extraction. We model our similarity measurement technique on that approach. We also applied the principles of hierarchical object matching discussed by Ganesan et al. in [2] and Doan et al. in [1]. We also use the OpenCalais Web service [5] to semantically enrich patient conversations and project descriptions and to extract relevant entities.

5 Conclusions

In this paper, we describe our solution to the PCORI Healthcare 2.0 challenge. Our solution leverages existing patient data available in online health communities and creates a rich semantic profile of the patients. We have also developed techniques for creating multi-dimensional project profiles from their textual descriptions, and a semantic matching algorithm that finds matching patients for research projects. The PCCR platform we have developed works for any patient community. Due to privacy concerns, we have not used any online community data in our development or demonstration. Instead, we use data from a patient community that we prototyped and seeded with posts. We evaluated our system and found that our approach has over 90% accuracy in finding patients who have the same or similar medical conditions. The match rate when using demographics goes down to about 80%. We are currently improving our demographic profiling and extraction technique. Our approach builds on the ways patients share and interact on the Web today, and we believe that it can help researchers find very relevant patients, leading to more meaningful and productive engagements and outcomes.

References

[1] Anhai Doan, Pedro Domingos, and Alon Halevy. Learning to match the schemas of data sources: A multistrategy approach. Machine Learning, 50(3):279–301, 2003.

[2] Prasanna Ganesan, Hector Garcia-Molina, and Jennifer Widom. Exploiting hierarchical domain structure to compute similarity. ACM Transactions on Information Systems (TOIS), 21(1):64–93, 2003.

[3] Karthik Gomadam, Ajith Ranabahu, Meenakshi Nagarajan, Amit P Sheth, and Kunal Verma. A faceted classification based approach to search and rank web APIs. In Web Services, 2008. ICWS’08. IEEE International Conference on, pages 177–184. IEEE, 2008.

[4] Chintan Patel, Sharib Khan, and Karthik Gomadam. TrialX: Using semantic technologies to match patients to relevant clinical trials based on their personal health records. Proc. of the International Semantic Web Conference (ISWC), 2009.

[5] T Reuters. OpenCalais, 2009.
