River Invertebrate Classification Tool (RICT) Machine Learning Build Guide - All Indices GB March 2020 V1

Contents:

1. Introduction

2. Purpose of this document

3. Experiment 1: GB All indices prediction

Version History

1 First Version 08/03/2020 Nick Irvine

1. Introduction

River Invertebrate Classification Tool (RICT) is a web application that implements the RIVPACS IV predictive model. The tool is maintained by the UK’s environment agencies: the Scottish Environment Protection Agency (SEPA), the Environment Agency (EA), Natural Resources Wales (NRW) and the Northern Ireland Environment Agency (NIEA).

2. Purpose of this document

This document outlines how a user can build RICT in Microsoft Machine Learning Studio. It is a standalone guide with no dependencies: it is the only document required to build the GB all indices prediction experiment in Machine Learning Studio.

This document should be used as a companion to the RICT technical specification and user guide.

This document outlines how the GB all indices prediction experiment can be created. Additional documents have been produced to outline how the other RICT experiments can be built.

3. Experiment 1: GB All indices prediction

The boxes of the GB all indices experiment should be linked as outlined below. Each box is then described in turn.

1. Dataset input

The user should remove this box when they start an experiment and replace it with the input file they wish to use.

For the standard experiment published to the gallery the test data set is added here.

To add this data set, press NEW at the bottom left of the screen and upload the CSV input file.

Select the uploaded file and drag it onto the experiment.

This step is explained in section 8 of the user guide.
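
Before uploading, it can help to confirm locally that the CSV input file parses cleanly. The following is a minimal sketch in Python and is not part of RICT. The function name `check_csv` and the sample column names are hypothetical, and the check only confirms that a header row exists and that every data row has the same number of fields as the header. It does not validate the columns that RICT itself expects.

```python
import csv
import io


def check_csv(text):
    """Return (header, row_count); raise ValueError if the CSV is malformed."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if not header:
        raise ValueError("file is empty or has no header row")
    row_count = 0
    for row in reader:
        if not row:
            continue  # ignore completely blank lines
        if len(row) != len(header):
            raise ValueError(
                f"line {reader.line_num}: expected {len(header)} fields, "
                f"got {len(row)}"
            )
        row_count += 1
    return header, row_count


# Hypothetical sample content; a real RICT input file has its own columns.
sample = "col_a,col_b\n1,2\n3,4\n"
header, rows = check_csv(sample)
print(header, rows)
```

For a file on disk, the same function can be called on the file's text, for example `check_csv(open("input.csv", newline="").read())`, where `input.csv` is a placeholder path.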

2. Enter Data Manually

Create this box using the options on the left: Data Input and Output > Enter Data Manually.

When added, left click on the box; the options in the right panel should be:

Has header should be ticked.

Data Format: CSV

Data box should read:

Row 1: header

Row 2: parameter

Row 3: <leave blank>

3. Prediction Support Files

These are the support files needed to run the R code prediction script block.

They are supplied as a single zip file containing all the support files.

This zip file is embedded below.

Further details about the function of each support file within the zip are outlined in the technical specification.

The zip file is added in the same way as the dataset in box 1 above: it is uploaded and then dragged onto the experiment.
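
Before uploading the zip file, it may be worth checking locally that the archive opens and contains the files you expect. The sketch below is illustrative only and is not part of RICT. The function name `list_support_files` and the member names in the sample archive are hypothetical; the real support files are those described in the technical specification.

```python
import io
import zipfile


def list_support_files(zip_bytes):
    """Return the sorted names of the files stored in a zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        bad = archive.testzip()  # name of the first corrupt member, or None
        if bad is not None:
            raise ValueError(f"corrupt member in archive: {bad}")
        return sorted(
            info.filename for info in archive.infolist() if not info.is_dir()
        )


# Build a small in-memory archive so the sketch runs without external files.
# The member names are placeholders, not the real RICT support files.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("helper_functions.R", "# placeholder\n")
    archive.writestr("lookup_table.csv", "x,y\n1,2\n")

print(list_support_files(buffer.getvalue()))
```

For a zip file on disk, pass its bytes instead, for example `list_support_files(open("support_files.zip", "rb").read())`, where the path is a placeholder.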

4. Execute R Code Script

This is the first block of code, which executes the prediction R code script.

This block is created by selecting R Language Modules > Execute R Script and dragging it onto the workspace.

Once the block is created, click the box and, in the R Script box on the right-hand panel, delete all the content and paste in the code from the file below.

The Random Seed box should be left empty.

R Version should be set to Microsoft R Open 3.4.4.

Boxes 1, 2 and 3 should then be linked to this box as per the diagram at the start of this section.

5. Select Columns in Dataset

This box enables the user to select a different output if desired.

This can be found on the left hand side by selecting Data Transformation > Manipulation and then Select Columns in Dataset.

Once the box is created, select the columns to include in the output. Click on the box and then click Launch column selector. Then select Begin With > No Columns.

For this experiment you need to select all columns.

This box should be linked to the left-side output of box 4.

6. Convert to CSV

This box allows the user to download the prediction output of the experiment.

It is created by selecting Data Format Conversions > Convert to CSV.

This block should then be connected to box 5.
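
The links described in this section can be summarised as a small dependency graph. The sketch below is illustrative only: it is written in Python, is not part of RICT or Machine Learning Studio, and the names `LINKS` and `run_order` are hypothetical. It records which boxes feed each box and derives one order in which the boxes can run.

```python
from graphlib import TopologicalSorter

# Each box maps to the set of boxes whose output it takes as input,
# following the links described in this section.
LINKS = {
    "4. Execute R Code Script": {
        "1. Dataset input",
        "2. Enter Data Manually",
        "3. Prediction Support Files",
    },
    "5. Select Columns in Dataset": {"4. Execute R Code Script"},
    "6. Convert to CSV": {"5. Select Columns in Dataset"},
}


def run_order(links):
    """Return one valid order in which the boxes can run."""
    return list(TopologicalSorter(links).static_order())


order = run_order(LINKS)
for box in order:
    print(box)
```

Boxes 1, 2 and 3 have no inputs of their own, so they may appear in any order relative to each other; boxes 4, 5 and 6 always follow in sequence. `graphlib` requires Python 3.9 or later.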