Training Manual

  • Introduction to MATLAB for Data Analysis and Visualization

  • Table of Contents

    1. Introduction

    2. Overview of the MATLAB Desktop

    3. Data Import

    4. Visualization

    5. Analysis

    6. Script Creation

    7. Report Generation

    Introduction

    This workshop provides a basic introduction to data analysis and visualization in MATLAB. Using a simple example, you will learn how to perform routine analysis tasks such as data import, manipulation, plotting and fitting in MATLAB.

    The data in this example is from a drug interaction study that examined the effect of combination therapy on a physiological response metric. The study measured the pain threshold at various dosing concentrations of an opioid and a sedative, to understand whether the two drugs interact in an antagonistic or synergistic manner. Opioids and sedatives are the two components used in anesthesia. In the plot of this data, the X-axis shows the dosing concentration of the opioid, i.e. the pain-relieving component; the Y-axis shows the dosing concentration of the sedative, i.e. the sleep-inducing component; and the Z-axis shows a pain response metric, i.e. the maximum pain a subject could tolerate at the given combination of opioid and sedative concentrations. Observe that at higher drug concentrations, the tolerable pain metric is higher.

    Our objective is to identify whether the interaction between these two drugs is synergistic or antagonistic. We will do that by fitting the following model [1] to this 3-dimensional dataset and estimating the value of the interaction parameter α (alpha): α < 1 indicates an antagonistic interaction and α > 1 indicates a synergistic interaction.

    [Eq. 1: interaction model; the equation is not legible in the source]

    where CO and CS are the dosing concentrations of the Opioid and the Sedative, respectively; IC50A and IC50B are the concentrations of the Opioid and the Sedative, respectively, that produce 50% of the maximal effect when each drug is administered individually; α (alpha) is the interaction parameter; and n is a measure of the slope of the response surface.

    References:

    [1] Kern SE, Xie G, White JL, Egan TD. A response surface analysis of propofol-remifentanil pharmacodynamic interaction in volunteers. Anesthesiology 2004; 100:1373-81.

    Overview of the MATLAB Desktop

    To open MATLAB, double-click on the MATLAB icon on the desktop. [This launches the MATLAB desktop environment]

    The following components appear at the top of the MATLAB desktop:

    Toolstrip contains the Home, Plots and Apps tabs; each tab groups functionality associated with common tasks. Additional tabs, such as Editor, Publish and Variable, appear in the toolstrip as needed to support your workflow.

    Quick access toolbar displays frequently used options such as cut, copy, paste and help. You can customize the options available in the quick access toolbar to suit your typical workflow.

    Current folder toolbar enables you to specify the current working directory.

    Search Documentation box allows you to search the documentation.

    Watch this video for more information on the MATLAB Desktop: http://www.mathworks.com/videos/new-matlab-desktop-70403.html

    The rest of the MATLAB desktop is composed of a collection of independent windows. The layout of these windows within the desktop is customizable. Each window can be undocked, docked back into the desktop, resized, moved, minimized, maximized, restored or closed. You can find these controls under the button on the title bar of the window.

    In its default layout, the MATLAB desktop displays the following windows:

    Current Folder enables you to access the contents of the current working directory. Use the options in the current folder toolbar to change the current working folder.

    Command Window lets you execute MATLAB commands at the command line, indicated by the prompt (>>).

    Workspace allows you to explore data that was created in MATLAB or imported from files.

    Command History records every command that was executed from the Command Window. You can view or rerun previously executed commands from the Command History.

    Watch this video for more information on working in the MATLAB development environment: http://www.mathworks.com/videos/getting-started-with-matlab-68985.html

    To reset the MATLAB Desktop to the default layout, choose Layout > Default in the Home tab.

    ** You should have received an email from your instructor with a link to the example files. Please go to the FTP site and download the MATLAB_Training.zip file to the Desktop on your machine.

    Data Import

    In this example, the first step is importing the dose-response data into MATLAB. The data is stored in the file Data.csv in the MATLAB Training folder. To access this file, change the MATLAB current working directory to the MATLAB Training folder. Note: If you do not have these files, please refer to the note marked ** above.

    To change the current folder:

    Click on the icon in the current folder toolbar. [Opens the folder navigation window] Navigate to your local Desktop and click Select Folder.

    In the Current Folder window, double-click on the MATLAB Training.zip file to unzip its contents.

    Double-click on the MATLAB Training folder in the Current Folder window to change the current folder to the MATLAB Training folder.

    [The Current Folder window displays two files, combinedEffect.m and Data.csv, and a folder called Solution]

    To import data from Data.csv:

    Click on Import Data in the Variable section on the Home tab. Alternatively, double-click on the Data.csv file in the Current Folder or right-click on the file and select Import Data

    Navigate to the MATLAB Training folder, select Data.csv and click Open [Opens the Import Tool]

    The Import Tool recognizes that Data.csv is a comma-separated file, and automatically specifies comma as the delimiter in the Delimiters section. The Import tab also lets you specify the range, the row containing the variable names, and the data type (column vector, matrix, cell array, etc.) of the imported variables.

    In this example, we will use the auto-selected options for range, variable names row and type of imported data. Click the Import Selection icon. [The data is imported as 3 separate variables: Opioid, Sedative and Response. These variables appear in the MATLAB Workspace]

    The Import Tool enabled us to interactively import the data into MATLAB. Next, we will generate a function to automate this task; this will enable us to apply a similar import operation to multiple files without going through the Import Tool.

    To generate a function:

    Click on Import Selection and then select Generate Function. [Import Tool generates a new function file, importfile.m, and opens it in the Editor]

    To save the function, click on the Save icon in the Editor tab. Save the file as importfile.m in the MATLAB Training folder.

    Close the Editor window.

    You can now use this function to directly import data from other (similarly formatted) files. At the command prompt, type:

    >> [Opioid, Sedative, Response] = importfile('Data.csv');

    [The imported variables will overwrite any workspace variables that have the same name]
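
    For comparison, a similar import can be sketched without the generated function, using readtable. This is only a sketch: it assumes the file has a header row naming the columns Opioid, Sedative and Response, which the text implies but does not show. So that the snippet runs on its own, it first writes a small stand-in file (hypothetical name Data_demo.csv, made-up values) to a temporary folder.

```matlab
% Write a tiny stand-in CSV (made-up values, NOT the workshop data)
csvFile = fullfile(tempdir, 'Data_demo.csv');   % hypothetical file name
fid = fopen(csvFile, 'w');
fprintf(fid, 'Opioid,Sedative,Response\n');
fprintf(fid, '%g,%g,%g\n', [0 0 0.1; 1 0.5 0.4; 2 1 0.9]');
fclose(fid);

% Read the file into a table, then pull each column out as a vector
T = readtable(csvFile);
Opioid   = T.Opioid;
Sedative = T.Sedative;
Response = T.Response;
```

    For the workshop data you would pass 'Data.csv' instead of the stand-in file, provided its header row matches the assumption above.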

    Watch this video for more information on interacting with the Import Tool: http://www.mathworks.com/videos/importing-data-from-text-files-interactively-71076.html

    Visualization

    Data analysis often begins with visualization. Plots enable you to quickly visualize trends, spot outliers or anomalies, and identify missing information. MATLAB offers a variety of plot types that can be interactively accessed from the MATLAB desktop.

    In this example, we want to understand how Response varies with the concentration of the Opioid and Sedative. We will start exploring this relationship by creating a 3D scatter plot of the 3 variables Opioid, Sedative and Response.

    To create this plot:

    Click on the Plots tab in the Toolstrip

    In the Workspace window, select Opioid, Sedative and Response. (Hold down the Ctrl key to select multiple variables.) [The selected workspace variables are shown in the Selection section on the Plots tab; the sequence of variables in the Selection section reflects the order in which they were selected in the Workspace]

    Select the scatter3 plot from the Plots gallery. [This creates a 3D scatter plot with Opioid, Sedative and Response on the x, y and z axes, respectively. Notice that MATLAB automatically inserts the command required to generate this plot at the command prompt >> in the Command Window]
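
    The inserted command is a call to scatter3. A minimal sketch of the equivalent typed command follows; the three vectors are made-up stand-ins so that the snippet runs without the workshop data.

```matlab
% Stand-in column vectors (made-up values, not the workshop data)
Opioid   = [0; 1; 2; 3];
Sedative = [0; 0.5; 1; 1.5];
Response = [0.1; 0.4; 0.7; 0.9];

% 3D scatter plot: Opioid on x, Sedative on y, Response on z
h = scatter3(Opioid, Sedative, Response);
```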

    The plots gallery in the Plots tab displays a list of all possible plots (in MATLAB and available toolboxes) that can be created with the selected variables. To see additional plotting options, switch to the All plots option at the bottom of the expanded plots catalog.

    The scatter plot is created with default appearance properties. Next, we will customize the plot using Plot Tools, an interactive plot editing environment. Plot Tools lets you interactively modify the properties of the figure, as well as the properties of its subcomponents, such as the lines, markers, text and axes. In this example, we will customize the plot to include axis labels and a title. We will also set the figure background color to white and the marker face color to blue.

    To edit the properties of a figure using Plot Tools:

    Click on the Show Plot Tools icon in the figure window to open Plot Tools. [Opens the figure in the Plot Tools environment] Maximize the Plot Tools window.

    Click on Axes (no title) in the Plot Browser panel. Alternatively, click on the axes in the figure window. [The Property Editor at the bottom displays the properties of the selected axes]

    Configure the following axes properties in the Property Editor:

    o Title: Drug Interaction Plot

    o X-axis label: Opioid

    o Y-axis label: Sedative

    o Z-axis label: Response

    Click on the marker in the Plot Browser. [The Property Editor at the bottom displays the properties of the markers] Change the face color to blue in the Property Editor.

    Click on the figure background. [The Property Editor at the bottom displays the properties of the figure window] Change the background color to white.

    Click on the Hide Plot Tools icon in the figure window to exit Plot Tools.
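
    The same changes can also be made from the command line. The sketch below uses stand-in data and standard graphics functions (figure, scatter3, title, xlabel, ylabel, zlabel); it approximates what the Plot Tools steps do and is not the code that Generate Code produces.

```matlab
% Stand-in data (made-up values, not the workshop data)
x = [0; 1; 2; 3];
y = [0; 0.5; 1; 1.5];
z = [0.1; 0.4; 0.7; 0.9];

fig = figure('Color', 'w');                      % white figure background
h = scatter3(x, y, z, 'MarkerFaceColor', 'b');   % blue marker face
title('Drug Interaction Plot');
xlabel('Opioid');
ylabel('Sedative');
zlabel('Response');
```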

    Plot Tools enabled us to interactively customize the appearance of the drug-interaction plot. Next, we will generate a MATLAB function that will let us apply the same set of modifications to a different dataset. Once again, we are automating part of the workflow.

    To generate a MATLAB function:

    Select File > Generate Code in the figure window. [Creates a new file, createfigure.m, and opens it in the Editor]

    Click on the Save icon in the Editor tab to save the generated file as createfigure.m in the current directory [The Current Folder displays this newly added file]

    Close the Editor window

    You can now use this function to recreate the figure. At the command prompt (>>), type:

    >> createfigure(Opioid, Sedative, Response, [], 'b')

    In the above command, [] indicates the default marker size and 'b' specifies the edge color of the marker (blue).

    Analysis

    The data analysis in this example is composed of two parts:

    Data pre-processing: Extract the data that has non-negative Response values

    Data fitting: Fit the interaction model (Eq. 1) to the pre-processed data

    Pre-processing

    The 3D scatter plot of the data highlights some irregularities in the data; specifically, we see that some of the Response values are negative (see circled points in the figure below). We will exclude these points before proceeding to the fitting step.

    In this example, we will use logical indexing to identify and extract non-negative Response values from the data. First, we will create a logical array (idx) that identifies the positions of the non-negative elements in the Response variable. Next, we will use this logical array (idx) to extract only the non-negative elements from the Opioid, Sedative and Response variables.

    >> idx = Response >= 0;

    >> newOpioid = Opioid(idx);

    >> newSedative = Sedative(idx);

    >> newResponse = Response(idx);

    [In the Workspace, notice that there are only 387 elements in the newOpioid, newSedative and newResponse variables. 10 values were removed from each of the original variables]
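
    To see what these commands do, here is a small worked example of logical indexing on made-up vectors (not the workshop data). The variable numRemoved is added only for illustration.

```matlab
% Made-up stand-in vectors; two of the Response values are negative
Opioid   = [1; 2; 3; 4; 5];
Sedative = [10; 20; 30; 40; 50];
Response = [0.2; -0.1; 0.5; 0; -0.3];

idx = Response >= 0;            % logical array: [1; 0; 1; 1; 0]
newOpioid   = Opioid(idx);      % keeps rows 1, 3 and 4: [1; 3; 4]
newSedative = Sedative(idx);    % [10; 30; 40]
newResponse = Response(idx);    % [0.2; 0.5; 0]
numRemoved  = sum(~idx);        % number of rows dropped: 2
```

    Because the same logical array is applied to all three variables, the rows stay aligned after the negative values are removed.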

    Recreate the plot with the pre-processed data:

    >> createfigure(newOpioid, newSedative, newResponse, [] , 'b')

    [Notice that the negative Response values are no longer present in the figure]

    Data Fitting

    Using logical indexing, we have extracted the data subset associated with non-negative Response values. Next, we will fit the drug interaction model to the pre-processed data. We will use the Curve Fitting tool in the Curve Fitting Toolbox to perform the fitting.

    The Curve Fitting tool is an interactive tool for fitting curves to 2D data and surfaces to 3D data. It is available in the Apps gallery if the Curve Fitting Toolbox is installed.

    To fit the interaction model (Eq. 1) to the dose-response data:

    Open the Curve Fitting tool; go to the Apps tab in the toolstrip, and select Curve Fitting in the Apps gallery. Alternatively, type cftool at the command prompt. [Opens the Curve Fitting tool]

    Select newOpioid, newSedative and newResponse as the X, Y and Z data, respectively [A surface is automatically fit to the data using the default fitting options]

    Change the fitting method to Custom Equation, and type the following expression in the equation edit box: combinedEffect(x, y, IC50A, IC50B, alpha, n)

    Set the lower bounds on the parameter estimates; click on Fit Options and set the lower bounds on all four parameters to 0.

    Click Fit

    [The figure in the Curve Fitting tool updates to display the fitted surface. The results box on the left displays the fitting results, such as the estimated parameter values and the goodness of fit metrics. The table of fits at the bottom displays a summary of fit]
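
    The interactive steps above correspond roughly to calls to fittype and fit in the Curve Fitting Toolbox. The sketch below assumes the toolbox is installed. Because combinedEffect.m is not reproduced here, it fits a simple stand-in surface model (z = a*x + b*y, hypothetical and chosen only for illustration) to made-up data, with lower bounds of 0 on both parameters.

```matlab
% Made-up data generated from z = 2*x + 3*y (stand-in, not the workshop data)
[xg, yg] = meshgrid(0:4, 0:4);
x = xg(:);
y = yg(:);
z = 2*x + 3*y;

% Custom model as an anonymous function: coefficients first, then x and y
ft = fittype(@(a, b, x, y) a*x + b*y, ...
    'independent', {'x', 'y'}, 'dependent', 'z');

% Fit with lower bounds of 0 on both coefficients and an explicit start point
[fitresult, gof] = fit([x, y], z, ft, ...
    'Lower', [0 0], 'StartPoint', [1 1]);

disp(coeffvalues(fitresult));   % estimated values of a and b
```

    With the workshop files on the path, the model would instead be the expression typed in the Custom Equation box, combinedEffect(x, y, IC50A, IC50B, alpha, n), with a lower bound of 0 for each of its four parameters.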

    The estimated value of alpha is 8.84, which is greater than 1; this indicates that the interaction between the selected opioid and sedative is synergistic.

    As we did with the Import Tool and Plot Tools, we will generate a MATLAB function that will enable us to apply the same fitting method and options to a different dataset.

    To generate a MATLAB function:

    Select File > Generate Code under the menu options. [Creates a new function file, createFit.m, and opens it in the Editor]

    Click on the Save icon in the Editor tab to save the generated file as createFit.m in the MATLAB Training folder

    To call the function from the command prompt >>, type:

    >> createFit(newOpioid, newSedative, newResponse)

    Script Creation

    The analysis performed so far was all done interactively, either by using interactive tools such as the Import Tool and Curve Fitting tool, or by calling functions at the MATLAB command prompt. Next, we will collect all the tasks performed until this point, and save them in a MATLAB script. Simply speaking, a MATLAB script is a sequence of MATLAB statements. You can evaluate the script by either typing the name of the file at the command prompt, or by clicking on the Run button in the Editor. Evaluating the script allows us to execute the entire analysis in a single step, instead of interactively executing each statement at the command prompt.

    In this example, we will create the script by selecting the relevant MATLAB statements from the Command History.

    To create a script from the selected commands in the Command History:

    In the Command History window, select the commands that you want to include in the script. (Hold down Ctrl or Shift key to select multiple lines)

    Right-click on the selection and select Create Script. [Creates a new MATLAB script file containing the selected commands, and opens the file in the Editor]

    To save and evaluate the script:

    Click on the Save icon in the Editor tab, and save the file as myScript.m in the MATLAB Training folder

    Click on the Run button in the Editor tab to execute the entire analysis in a single step. Alternatively, type the filename at the command prompt (>> myScript)

    The generated script does not contain a description of the steps or their purpose in the analysis. It is generally considered good practice to comment your code because it helps others better understand your work. Next, we will add comments to the script to describe the logic and flow of the analysis.

    Use a % to add comments to a MATLAB script. By default, comments appear as green text when the script is open in the MATLAB Editor.

    In addition to including comments in the script, you can further organize your script by dividing it into smaller sections. Use %% to divide the script into code sections. (Include a space after the %%.)

    Besides serving as an organization tool, the code section feature in MATLAB also helps with prototyping your analysis. Dividing your script into smaller sections gives you the option to focus on and execute only a subset of the analysis, without evaluating the rest.

    Use the Run and Advance or Run Section buttons in the Editor tab to evaluate the script one code section at a time.
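
    As an illustration, a commented script divided into code sections might look like the sketch below. The section titles are hypothetical, and the data are made-up so that the script runs without the workshop files; the real myScript.m would contain the commands you selected from the Command History.

```matlab
%% Create data
% Made-up stand-in vectors (the real script would call importfile here)
Opioid   = [1; 2; 3; 4];
Sedative = [10; 20; 30; 40];
Response = [0.2; -0.1; 0.5; 0.8];

%% Pre-process
% Keep only the rows with a non-negative Response
idx = Response >= 0;
newOpioid   = Opioid(idx);
newSedative = Sedative(idx);
newResponse = Response(idx);

%% Visualize
% 3D scatter plot of the pre-processed data
scatter3(newOpioid, newSedative, newResponse);
```

    With the cursor in the Pre-process section, Run Section would evaluate only those four statements, using whatever values are already in the Workspace.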

    Report Generation

    The final step in our example is creating a report that documents the analysis, along with any generated results and figures. In the example, we will use the Publish feature to generate the report.

    To create a report from myScript:

    Open the myScript.m file

    Click on the Publish tab in the toolstrip. Note: The Publish tab is only visible when the Editor is open.

    Click on the Publish button on the Publish tab. [Generates an HTML report and opens it in the MATLAB web browser]

    The generated document captures the code, the results, and any figures that were generated in the analysis. HTML is the default format of reports generated by the Publish feature. You can also publish to other common formats such as PDF, Microsoft Word (.doc/.docx), Microsoft PowerPoint (.ppt) and LaTeX.
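
    Publishing can also be started from the command line with the publish function. The sketch below is self-contained: it writes a small throwaway script (hypothetical name demoPublish.m) to a temporary folder and publishes it to HTML. For the workshop you would pass myScript.m instead.

```matlab
% Create a throwaway script in a temporary folder (hypothetical file name)
workDir = tempname;
mkdir(workDir);
scriptFile = fullfile(workDir, 'demoPublish.m');
fid = fopen(scriptFile, 'w');
fprintf(fid, '%%%% Demo section\n%% A one-line computation\nx = 1 + 1;\n');
fclose(fid);

% publish runs the script, so put its folder on the path while publishing
addpath(workDir);
outFile = publish(scriptFile, 'format', 'html', ...
    'outputDir', fullfile(workDir, 'html'));
rmpath(workDir);
```

    The returned outFile is the path of the generated report. Changing the 'format' value to 'pdf', 'doc', 'ppt' or 'latex' selects one of the other output formats.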