Imran A. Khan – Online Presence, IEEE Papers, Research Papers, Data Analysis Writing Samples & Posters

7/21/2019 Imran a. Khan – Online Presence, IEEE Papers, Research Papers, Data Analysis Writing Samples & Posters http://slidepdf.com/reader/full/imran-a-khan-online-presence-ieee-papers-research-papers-data-analysis 1/7

IEEE Papers:

Personalized Electronic Health Record System for Monitoring Patients with Chronic Disease

2013 IEEE Systems and Information Engineering Design Symposium (SIEDS 2013) -- April 2013

The Personalized Electronic Health Record System for Monitoring Patients with Chronic Disease (PEHRS-MPCD) is designed to track and monitor the symptoms of patients with chronic disease and to provide healthcare professionals with data on patients' changes in lifestyle, medication, diet, and symptoms. The current method of assessing the symptoms of patients with chronic disease uses an episodic approach that includes phone calls to the patient, paper surveys of health status, and on-site examinations. A preventative approach that actively involves the patient, monitors multiple conditions, and provides real-time information about a patient's health condition has proven effective in chronic disease care. PEHRS-MPCD is designed to monitor patients continually. The goal of the application is to continuously collect patients' information and provide it to the patients themselves and to healthcare professionals in order to improve the efficiency of diagnosis and the timeliness of intervention, which would yield better quality of care and quality of life for the patient. PEHRS-MPCD will provide objective feedback about patient lifestyle changes and choices and channel this information to healthcare providers. PEHRS-MPCD would (a) allow relevant data to be entered by the patient, (b) make relevant data available to the patient's care provider, both in real time and at the doctor's visit, (c) generate reports and graphs for the data, and (d) provide secure storage of the data. PEHRS-MPCD is a work in progress, as many of its functionalities and its user interface design are still being revised. This paper describes the purpose, need, and design of the application.

Available at: http://tinyurl.com/IEEE-Monitoring-Chronic

Smartphone Application for Transmission of ECG Images in PreHospital STEMI Treatment

2012 IEEE Systems and Information Engineering Design Symposium (SIEDS 2012) -- April 2012

An S-T segment elevation myocardial infarction (STEMI) is a severe heart attack that kills heart muscle every minute it is left untreated. Therefore, early diagnosis and treatment are crucial for patient survival. Currently, Charlottesville ambulances that service the University of Virginia hospital are equipped with proprietary systems that send electrocardiogram (ECG) images while the ambulance is en route to the hospital. From an ECG, a doctor can diagnose a STEMI prior to patient arrival and prepare for surgery, thereby reducing the time delay. However, these ambulance systems are costly and provide no feedback regarding the success of an ECG transmission. This paper describes the development of an inexpensive iPhone application that transmits ECG images over the AT&T data network to the hospital prior to a patient's arrival. The application is designed to dovetail with the existing STEMI-care protocol used by Charlottesville-area Emergency Medical Technicians (EMTs), and it provides a novel red/green light indicator predicting successful receipt of the image within two minutes. The goal of the application is to improve process efficiency and information flow, allowing the patient to receive early, appropriate care and the best chance for a successful recovery. Test results, including usability tests, show that the application fulfills all key requirements. A prototype of the application will be evaluated by Charlottesville-area Emergency Medical Technicians prior to implementation in emergency response protocols and long-term deployment elsewhere.

Received the best paper award at the IEEE Symposium in the “System Design and Integration Track”

Available at: http://tinyurl.com/IEEE-Medical-App

Poster at: www.tinyurl.com/STEMI-Poster

Research Reports:

30-Day Readmission Trends and Variables of UVA Hospital Dementia Patients

One area of major concern for hospitals is high readmission rates. According to the Centers for Medicare and Medicaid Services (CMS), 20% of Medicare patients who are discharged from hospitals are readmitted within 30 days. Currently, patient readmissions cost the U.S. government over $17 billion per year, and this cost is projected to increase [1, 11]. Although factors such as how a patient is diagnosed, the severity of the illness, and the patient's behavior may affect the 30-day readmission rate, the Medicare Payment Advisory Commission (MedPAC) claims that 75% of all readmissions within 30 days can potentially be prevented if hospitals properly plan patient treatments [2]. Following MedPAC's findings and recommendations, the U.S. government wants to emphasize the readmission problem within the new Affordable Care Act by severely penalizing any hospital with an excess readmission rate within a 30-day period [3].

In this research, de-identified electronic health record (EHR) data on 24,954 patients with dementia is used to predict which patients are at high risk for 30-day readmission. Furthermore, random forest models are tested with different attribute selections in order to identify the attributes necessary for a significant predictive model of 30-day readmission of dementia patients. Our model correctly identifies 98% of the patients who are at high risk for 30-day readmission, and it identifies and investigates the importance of the variables necessary to build a significant model.

Available at: http://tinyurl.com/Dementia-Patients

Building a Quantitative Case for the Medical and Economic Potential of Symptom Tracking Tools

Since 2007, the Centers for Medicare and Medicaid Services (CMS) has devised outcome measures that focus on high-quality patient care. The hospital readmission rate over a 30-day period is one such measure: it allows medical professionals and patients to critically appraise health care providers and provides a framework for hospitals to meet quality control standards. Hospital readmission rates, especially for elderly patients, are a significant concern in U.S. healthcare, since 1 in 5 patients on Medicare & Medicaid is readmitted to a hospital within 30 days of treatment. This currently costs the U.S. government over $17 billion per year and is projected to increase. In October 2012, the U.S. government started imposing penalties, which could reach as high as $40 million, on hospitals with high rates of readmission over a 30-day period. The objective of this paper is to analyze the efficacy of symptom tracking tools and to build a quantitative case for their medical and economic potential using the 30-day readmission rate metric.

Available at: http://tinyurl.com/Medical-Economic-Case

Data Analysis Writing Samples:

Cardiac Rhythm Classification

The task is to design and evaluate models for cardiac rhythm classification and to recommend the approach with the smallest test-case and cross-validation error. In this research, I use different machine learning approaches such as tree learning, rule learning, instance-based learners, and ensemble methods. The goal is to distinguish atrial fibrillation from normal sinus rhythm and from normal sinus rhythm with ectopy. After experiments with different prediction classifiers, this research shows that Random Forest, with 500 trees and 4 attributes (HRV, LDs, COSEn, and DFA), gives the highest prediction accuracy, 93.43%.

Full 19-page report available at: http://tinyurl.com/Cardiac-Rhythm
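
The cross-validation comparison described in this abstract can be illustrated with a minimal sketch. It is not the report's method: it swaps in a 1-nearest-neighbour classifier (one kind of instance-based learner) in place of the Random Forest, and the two synthetic features and the class labels are hypothetical stand-ins for the real rhythm attributes. It only shows how a k-fold cross-validation error could be computed.

```python
import random


def nearest_neighbor_predict(train, point):
    """Return the label of the training example closest to `point`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    _, label = min(train, key=lambda ex: sq_dist(ex[0], point))
    return label


def cross_validation_error(data, k=5, seed=0):
    """Estimate the misclassification rate with k-fold cross-validation."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]  # k roughly equal folds
    errors = 0
    for i, test_fold in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        errors += sum(
            nearest_neighbor_predict(train, x) != y for x, y in test_fold
        )
    return errors / len(rows)


def make_synthetic_data(n_per_class=50, seed=1):
    """Hypothetical data: two well-separated classes in two dimensions."""
    rng = random.Random(seed)
    data = []
    for label, center in (("AF", 0.0), ("NSR", 10.0)):
        for _ in range(n_per_class):
            point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
            data.append((point, label))
    return data


if __name__ == "__main__":
    print(cross_validation_error(make_synthetic_data(), k=5))
```

Comparing several classifiers then amounts to calling `cross_validation_error` once per candidate on the same folds and keeping the one with the lowest error.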

Design Improvements for the University of Virginia Transplant Center

This study considers the number of kidney and liver transplants at UVA and evaluates these organ transplants against those at the MCV and Duke centers, both overall and across ethnic groups, especially minorities. UVA has the smallest trend in the number of kidney transplants, overall and in the non-white group, compared to the two other centers over the period 1988–2012. The t-test shows that there is a difference between the number of transplants at UVA and at the other centers at the 5% level. The 95% bootstrap confidence interval of the mean difference also indicates that I can reject the null hypothesis that the mean difference is zero. A time series linear model is constructed to predict the mean difference between two centers in 2013. The results from bootstrap and Monte Carlo simulation reveal that the 95% prediction confidence interval does not contain zero, meaning that there is a difference between the predicted numbers of kidney transplants. The negative confidence interval indicates that the predicted number of kidney transplants at UVA in 2013, overall and for non-whites, is less than the predicted number at MCV and Duke. This suggests that UVA should do better at recruiting people overall and at recruiting people from other ethnicities. For liver transplants, it is hard to conclude that building the new Roanoke center in 2005 has increased the number of liver transplants at UVA. The linear model and the Poisson model of the number of liver transplants show contradictory results. Based on the linear model with time series, the p-value of the Roanoke variable is 0.014, which is less than 0.05. With this model, I can reject the null hypothesis at the 5% level and conclude that building the Roanoke center has increased the number of liver transplants. Meanwhile, based on the Poisson model with time series, the Roanoke variable does not affect the number of liver transplants and is not significant at the 5% level. This suggests that UVA should do more research on liver transplants, and it may be interesting to collect data at the UVA Charlottesville and UVA Roanoke centers.

Full 34-page report available at: http://tinyurl.com/UVA-Transplant
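
As an illustration of the bootstrap confidence interval mentioned in this abstract, here is a minimal sketch of a percentile bootstrap for a mean difference. The yearly differences are made-up numbers, not the transplant data, and the percentile method is an assumption, since the abstract does not say which bootstrap variant the report uses.

```python
import random
import statistics


def bootstrap_mean_ci(differences, n_resamples=10_000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `differences`."""
    rng = random.Random(seed)
    n = len(differences)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(differences, k=n))
        for _ in range(n_resamples)
    )
    lower_index = int((1 - level) / 2 * n_resamples)
    upper_index = int((1 + level) / 2 * n_resamples) - 1
    return means[lower_index], means[upper_index]


# Hypothetical yearly differences (one center minus another); not real data.
yearly_differences = [-12, -8, -15, -10, -9, -14, -11, -7, -13, -10]
low, high = bootstrap_mean_ci(yearly_differences)
print(low, high)
```

If the resulting interval excludes zero, as it must here because every made-up difference is negative, the null hypothesis of a zero mean difference would be rejected at the corresponding level.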

Spam Filtering

In this study, I use a logistic regression model to build a static filter design, i.e., to classify e-mails as spam or ham. I find that there are 3 important variables to consider in filtering out spam: the frequency of certain words/characters in the message, the longest run-length of capital letters, and the total run-length of capital letters. The variables are highly significant in the logistic regression model at the 5% level. It is important to transform the predictors to a log scale, as this increases the accuracy of the model. The final model selected for spam filtering has the highest accuracy, with smaller total errors (13.4%) and false positives (7.9%). It also fits better based on the BIC criterion. For spam filtering, I also build a time series filter design to predict the daily amount of spam e-mails. I find that there is a relationship between the amount of spam e-mails received and the time of arrival. Time is highly significant in the linear regression model at the 5% level. For spam data, the residuals can be modeled by an ARMA model with 2 autoregressive (AR) terms and 1 moving average (MA) term, and this model gives the best forecast. Meanwhile, for ham data, the residuals can be modeled by ARIMA(1,1,1), and this model gives the best forecast, with an MSE of 2.0. The ARIMA(1,1,1) model also has the lowest AIC and BIC values. Both models show adequacy in the Ljung-Box Q-statistic plot, since all the points are insignificant. The static and time series filter designs can be integrated to produce an overall filter design by using Bayes' rule. This means that for any e-mail that comes into my classifier, the probability that it is spam is determined by the probability that the e-mail is spam based on the static filter and the probability that it is spam based on the time series filter.

Full 36-page report available at: http://tinyurl.com/Spam-Filtering
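
One way to picture the Bayes' rule integration of the two filters is the sketch below. It assumes that the two filters are conditionally independent given the true class and that both are calibrated against the same prior spam rate. These are assumptions made for illustration, not details stated in the abstract, and the function name and default prior are hypothetical.

```python
def combine_spam_probabilities(p_static, p_time, prior=0.5):
    """Combine two spam probabilities with Bayes' rule.

    Assumes the two filters are conditionally independent given the true
    class and that each was calibrated against the same prior spam rate.
    """
    for p in (p_static, p_time, prior):
        if not 0.0 < p < 1.0:
            raise ValueError("probabilities must be strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    # Each filter contributes a likelihood ratio: its posterior odds
    # divided by the prior odds it started from.
    lr_static = (p_static / (1.0 - p_static)) / prior_odds
    lr_time = (p_time / (1.0 - p_time)) / prior_odds
    posterior_odds = prior_odds * lr_static * lr_time
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    # Hypothetical filter outputs for a single e-mail.
    print(combine_spam_probabilities(p_static=0.8, p_time=0.7))
```

When one filter is uninformative (its output equals the prior), the combined probability reduces to the other filter's output, which is a useful sanity check on the combination.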

Analysis of Train Accidents in the U.S. During 2001 – 2012

Many factors can affect the severity of rail accidents. Based on my findings, season is an important factor associated with more deaths: more fatalities occur during the summer season. The rate of change of fatalities during the summer season is estimated to be about 0.22, with a 95% confidence interval between 0.04 and 0.4. It is therefore important for the FRA to add extra safety measures when trains are running during the summer season. Type of accident and cause of accident significantly affect the cost of damage at the 5% level. A train accident at an RR grade crossing is more likely to cause cost damage. Putting greater safety measures at RR grade crossings can reduce the severity of cost damage. Also, the FRA should train its people well on safety in order to minimize human error.

Full 23-page report available at: http://tinyurl.com/Train-Accidents-Report

Air Traffic Control, Reliability Analysis and Cargo Operations

This report focuses on some of the challenges in air transportation and the aircraft industry, provides a detailed analysis of these challenges, and proposes solutions and recommendations.

Section 1 discusses air traffic control, focusing on the landing queue system. Because safety measures require maintaining a certain separation distance in the queue, the aircraft industry faces challenges in avoiding flight delays and managing air traffic better. A discrete event simulation (DES) model is used, and it is found that the average queue length is between 2.48 and 3.13. This leaves room for improvement, since shorter queue lengths are desirable. A more detailed analysis found that about 17.4% to 23% of the time the queue is clogged, which is defined as more than 5 planes in the queue. This is far from ideal because a clogged queue means flight delays and bad customer ratings. Moreover, the total number of planes in the system for a given system requirement is approximately 14, which is again far from ideal. The average number of planes in recircles is also high. Thus, there is a significant challenge in reducing queue length, clogged-queue time, the number of planes in the system, the number of planes in recircles, and a variety of additional issues. Following a static analysis, the report discusses the impact of a 10% decrease in the mean and spread of landing times, which may result from relaxing stringent safety rules. It is found that all the statistics improve to a great extent from such a small change. Hence, it is advisable to research further whether this 10% decrease can be incorporated without compromising safety standards. Another static analysis was conducted to assess the impact of a 10% decrease in recircle distance. This analysis does not show a significant improvement in the overall air traffic control system, and hence this change can be reduced in priority. The third static analysis was conducted to determine the impact of a 10% decrease in plane separation. This results in a dramatic improvement in the amount of time the queue is clogged. Hence, this change should definitely be considered by management with a high priority. We believe that the first and third changes are highly feasible and should be implemented following final safety tests. These recommendations can help to ensure effective air traffic control.
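
To make the queue statistics in Section 1 concrete, the following is a minimal discrete event simulation sketch of a single-runway landing queue. It is not the report's model: the exponential arrival and landing times, the parameter values, and the omission of recircling are all simplifying assumptions. It only shows how a time-average queue length and a clogged-queue fraction (more than 5 planes waiting) could be measured.

```python
import random


def simulate_landing_queue(arrival_rate, mean_landing_time, horizon,
                           clog_threshold=5, seed=0):
    """Simulate a single-runway landing queue as a discrete event system.

    Returns (time-average queue length, fraction of time the queue is clogged).
    Exponential arrival and landing times are an illustrative assumption.
    """
    rng = random.Random(seed)
    now = 0.0
    waiting = 0            # planes waiting to land (excludes the one landing)
    runway_busy = False
    next_arrival = rng.expovariate(arrival_rate)
    next_departure = float("inf")
    queue_area = 0.0       # integral of queue length over time
    clogged_time = 0.0

    while True:
        next_event = min(next_arrival, next_departure, horizon)
        elapsed = next_event - now
        queue_area += waiting * elapsed
        if waiting > clog_threshold:
            clogged_time += elapsed
        now = next_event
        if now >= horizon:
            break
        if next_arrival <= next_departure:
            # Arrival: land at once if the runway is free, else join the queue.
            if runway_busy:
                waiting += 1
            else:
                runway_busy = True
                next_departure = now + rng.expovariate(1.0 / mean_landing_time)
            next_arrival = now + rng.expovariate(arrival_rate)
        else:
            # Landing finished: start the next waiting plane, if any.
            if waiting > 0:
                waiting -= 1
                next_departure = now + rng.expovariate(1.0 / mean_landing_time)
            else:
                runway_busy = False
                next_departure = float("inf")

    return queue_area / horizon, clogged_time / horizon


if __name__ == "__main__":
    # Hypothetical parameters: 0.8 arrivals per minute, 1-minute mean landing.
    print(simulate_landing_queue(0.8, 1.0, horizon=10_000))
```

A what-if study like the 10% reductions described above would rerun the same function with `mean_landing_time` scaled by 0.9 and compare the two sets of statistics.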

Section 2 focuses on reliability analysis in the same aircraft industry setting. Emotionally driven customers give a lot of importance to safety. It is critical to understand the risks and how those risks interact with one another and affect the system as a whole. The overall likelihood of an accident is extremely small; however, with increasing air traffic, this probability grows and becomes even more important to the industry. Giving pilots a new dynamic control system, which will limit their response time in the event of an inflight separation violation, has the potential to reduce this overall risk. Thus, more errors can be absorbed by the system. Section 1 discussed that the inflight separation distance can be reduced for effective air traffic control. However, if the inflight separation distance is reduced, then the new dynamic control system has a drawback, because it actually results in more accidents. Pilots think that there is more room for error with this system, when analysis shows that this is not the case. The benefits of the new dynamic control system are more than offset by the negatives of altering the inflight separation distance. It is recommended that these two options be considered separately, not in combination, in order to maintain high safety standards from the reliability point of view.
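
The remark in Section 2 that a very small accident likelihood grows with air traffic can be sketched with a simple calculation. The sketch assumes that flights are independent and share the same per-flight accident probability, which is an idealization introduced here and not a claim about the report's reliability model. The numbers are hypothetical.

```python
import math


def probability_of_any_accident(p_per_flight, n_flights):
    """P(at least one accident) among n independent flights.

    Computes 1 - (1 - p)^n with log1p/expm1 so that very small
    per-flight probabilities do not lose precision.
    """
    if not 0.0 <= p_per_flight < 1.0:
        raise ValueError("p_per_flight must be in [0, 1)")
    if n_flights < 0:
        raise ValueError("n_flights must be non-negative")
    return -math.expm1(n_flights * math.log1p(-p_per_flight))


if __name__ == "__main__":
    # Hypothetical per-flight probability; doubling traffic raises the risk.
    for flights in (1_000_000, 2_000_000):
        print(flights, probability_of_any_accident(1e-7, flights))
```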

Section 3 moves away from air traffic control and focuses on the cargo operations that take place in an airport network. We built an optimization model to ensure smooth and cost-effective management of cargo operations. There are often carrier capacities at each given airport, and there is a cost associated with transporting cargo from one airport to another. An optimization model that minimizes cost for the aircraft company is always desirable because it would mean more profit and insights into improving management. The analysis shows the complexity of such a problem, which can be seen from the fact that Excel fails to give a feasible solution. With regard to the current system, the conclusion is that the current carrier capacity is insufficient to achieve a global optimum in a week. This is because of sudden peak influxes of cargo at the airports, which can be handled for only a short period of time. Not having enough carriers increases the cost by 17%. In addition to purchasing more carriers, the recommendation is that the weekly demand distribution be smoothed by keeping some extra cargo on
