Introduction to Orange

Post on 05-Jun-2018




Introduction to Orange

• Orange is a data mining toolkit, so you don’t need to be an expert in any of those subjects
• We will use Orange to:
  • load, manipulate, and save large datasets
  • visualize the relationships between variables
  • discover and quantify patterns in data
  • create rules to predict outcomes based on observed data

Orange: Graphical Programming

Using the Orange interface

• To add a widget, drag it onto the canvas from the widget panel, or just click on it in the widget panel
• To add a signal, click on the signal attachment point on a widget and drag from it to the signal attachment point on another widget
• Input signals come in from the left, output signals go out to the right

Using the Orange interface

• Some widgets have multiple possible input and output ports
• Orange tries to guess which one you mean
• If it guesses wrong, double-click on the signal to select which inputs and outputs you are using
• You can also temporarily disconnect or delete signals by right-clicking on them

File Widget
• Loads data from a file
• Many different file types are supported
• Recommended: tab-delimited text

• iris.tab is an example dataset that comes with Orange, and contains 150 iris flowers from three species
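For readers who want to see what "tab-delimited text" means outside the canvas, here is a minimal Python sketch. It is not what the File widget does internally. It assumes a plain file with a single header row, and the three rows shown are a tiny illustrative sample, not the full iris.tab.

```python
import csv
import io

# Tiny illustrative sample in tab-delimited form.
# Assumption: one header row, then one flower per line.
SAMPLE = (
    "sepal length\tsepal width\tpetal length\tpetal width\tiris\n"
    "5.1\t3.5\t1.4\t0.2\tIris-setosa\n"
    "7.0\t3.2\t4.7\t1.4\tIris-versicolor\n"
    "6.3\t3.3\t6.0\t2.5\tIris-virginica\n"
)


def load_tab(text):
    """Parse tab-delimited text into a list of dicts, one per row."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


rows = load_tab(SAMPLE)
print(len(rows), "rows; columns:", list(rows[0].keys()))
```

Note that every value comes back as a string here; Orange itself decides whether a column is continuous or discrete when it loads the file.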

Data Table Widget
• Lists rows in a dataset; sort by clicking on the column heading
• Each value has a bar showing how big it is
• First column is assumed to be a category (in this case, species)

CC BY-SA 3.0

For each of the 150 flowers in the dataset, there is a value for:
• Petal Length
• Petal Width
• Sepal Length
• Sepal Width

Select Rows Widget
• Filters data according to simple rules
• For example: exclude all irises with short petals
• Select an attribute and a condition and press “Add” to add it to the filter

Data Selection Results
• The “petal length” column now only contains values longer than 3 cm
• The blue category, iris-setosa, is now completely absent.
• Apparently all iris-setosa flowers have petals shorter than 3 cm.
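The filter rule above can be sketched in a few lines of plain Python. This only illustrates the idea of a row filter, under the assumption that the condition is "petal length greater than 3 cm"; the helper name `select_rows` and the three rows are made up for the example.

```python
# Illustrative rows (petal length in cm, species); not the full dataset.
flowers = [
    {"petal length": 1.4, "iris": "Iris-setosa"},
    {"petal length": 4.7, "iris": "Iris-versicolor"},
    {"petal length": 6.0, "iris": "Iris-virginica"},
]


def select_rows(rows, attribute, minimum):
    """Keep only rows whose attribute is strictly greater than minimum."""
    return [row for row in rows if row[attribute] > minimum]


long_petals = select_rows(flowers, "petal length", 3.0)

# Collect the species that survive the filter.
remaining_species = sorted({row["iris"] for row in long_petals})
print(remaining_species)
```

In this small sample, as on the slide, the setosa row does not pass the filter.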

Select Columns Widget (1)
• Choose which columns go in the dataset
• “Attributes” are data values to be included in output
• “Class” is the category of the row
• “Meta Attributes” are descriptive attributes that are excluded from the analysis (such as a row ID)
• “Available Attributes” are attributes available to be loaded, but ignored

Select Columns Widget (2)
• Drag or move variables between categories with the “>” and “<” buttons
• Each variable is marked “C” for continuous (numerical values) or “D” for discrete (categorical values)
• You may need to click “Apply” before any changes you make take effect

Select Columns in action
• Suppose we were only interested in sepals, not petals.
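As a rough script-level picture of what keeping only the sepal columns means, the sketch below keeps the chosen attributes plus the class and drops the rest. The helper name `select_columns` is hypothetical and the single row is illustrative.

```python
# Illustrative row with the four iris measurements plus the class column.
flowers = [
    {"sepal length": 5.1, "sepal width": 3.5,
     "petal length": 1.4, "petal width": 0.2, "iris": "Iris-setosa"},
]


def select_columns(rows, attributes, class_name):
    """Keep the chosen attributes and the class; drop everything else."""
    keep = list(attributes) + [class_name]
    return [{name: row[name] for name in keep} for row in rows]


sepals_only = select_columns(flowers, ["sepal length", "sepal width"], "iris")
print(list(sepals_only[0].keys()))
```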

Feature Constructor Widget
• Defines new attributes (i.e. columns) based on the values of existing attributes
• Type a formula and click “Add” to add a new feature
• Select fields using “(all attributes)” and “(all functions)”
• Widget outputs the same dataset with new attributes added
• This particular calculation is assuming petals are triangular

Feature Construction Results
• New attribute is added after existing attributes but before class
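The slides do not spell out the formula in text, so the sketch below assumes the triangular-petal calculation is the usual triangle area, 0.5 x length x width. The attribute name "petal area" and the helper `add_feature` are hypothetical. The new column is placed after the existing attributes and before the class, as described above.

```python
# Illustrative row; "iris" plays the role of the class column.
flower = {"sepal length": 5.1, "sepal width": 3.5,
          "petal length": 1.4, "petal width": 0.2, "iris": "Iris-setosa"}


def add_feature(row, name, value, class_name):
    """Return a copy of row with a new attribute inserted before the class."""
    out = {key: val for key, val in row.items() if key != class_name}
    out[name] = value
    out[class_name] = row[class_name]
    return out


# Assumed formula: area of a triangle with base = width and height = length.
area = 0.5 * flower["petal length"] * flower["petal width"]
extended = add_feature(flower, "petal area", area, "iris")
print(list(extended.keys()))
```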

Save Widget
• Save a modified file
• Saves whatever is going to its input
• If you made changes elsewhere in the scheme, they will not be saved
• Be careful not to accidentally overwrite your input file
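The overwrite warning can be turned into a small safeguard when saving from a script. The following is a sketch only: `save_tab` is a hypothetical helper, the file names are placeholders inside a temporary directory, and the output is plain tab-delimited text with one header row.

```python
import csv
import os
import tempfile


def save_tab(rows, out_path, input_path):
    """Write rows as tab-delimited text, refusing to overwrite the input file."""
    if os.path.abspath(out_path) == os.path.abspath(input_path):
        raise ValueError("refusing to overwrite the input file")
    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()),
                                delimiter="\t")
        writer.writeheader()
        writer.writerows(rows)


rows = [{"petal length": 1.4, "iris": "Iris-setosa"}]  # illustrative row

workdir = tempfile.mkdtemp()
input_path = os.path.join(workdir, "iris.tab")             # placeholder input name
output_path = os.path.join(workdir, "iris-modified.tab")   # placeholder output name

save_tab(rows, output_path, input_path)
print("saved to", output_path)
```

Passing the input path as the output path raises `ValueError` instead of silently replacing the original data.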

Exercise: First Scheme
• Load and inspect the imports-85.tab data file (on course website), which contains information about various imported cars
• Add a “volume” attribute (i.e. length x width x height)
• Remove the original length, width, and height attributes
• Save the dataset using a different filename
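The exercise is meant to be done on the Orange canvas, but the data transformation it asks for can be written down as a short Python sketch. The column names "make", "length", "width", and "height" are assumptions about the file, the helper `with_volume` is hypothetical, and the single row holds illustrative values that are not read from imports-85.tab.

```python
# One illustrative car; values are placeholders for the example.
car = {"make": "alfa-romero", "length": 168.8, "width": 64.1, "height": 48.8}

DIMENSIONS = ("length", "width", "height")


def with_volume(row):
    """Add a volume attribute and drop the three original dimensions."""
    out = {key: val for key, val in row.items() if key not in DIMENSIONS}
    out["volume"] = row["length"] * row["width"] * row["height"]
    return out


result = with_volume(car)
print(sorted(result.keys()))
```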

Solution

Solution,continued

Remember to click “Apply” after you make changes!