Apache NiFi 1.0 in Nutshell

Apache NiFi 1.0 in Nutshell Koji Kawamura – Software Engineer Arti Wadhwani – Technical Support Engineer 2016 October 27

Transcript of Apache NiFi 1.0 in Nutshell

Page 1: Apache NiFi 1.0 in Nutshell

Apache NiFi 1.0 in Nutshell

Koji Kawamura – Software Engineer

Arti Wadhwani – Technical Support Engineer

2016 October 27

Page 2: Apache NiFi 1.0 in Nutshell

© Hortonworks Inc. 2011–2016. All Rights Reserved

Agenda

What’s NiFi

NiFi 1.0 Enhancements

NiFi on the edge

Common issues

What’s Next?

Page 3: Apache NiFi 1.0 in Nutshell

Agenda

What’s NiFi

NiFi 1.0 Enhancements

NiFi on the edge

Common issues

What’s Next?

Page 4: Apache NiFi 1.0 in Nutshell

A Brief History

2006: NiagaraFiles (NiFi) was first conceived at the National Security Agency (NSA).

November 2014: NiFi is donated to the Apache Software Foundation (ASF) through NSA’s Technology Transfer Program and enters ASF’s incubator.

July 2015: NiFi reaches ASF top-level project status.

Page 5: Apache NiFi 1.0 in Nutshell

What’s Apache NiFi?

“NiFi is like digging irrigation ditches as the water flows, rather than building out a sprinkler system in advance.”

https://mail-archives.apache.org/mod_mbox/nifi-users/201604.mbox/%[email protected]%3E

Page 6: Apache NiFi 1.0 in Nutshell

NiFi is a tool for Data Flow Management

Page 7: Apache NiFi 1.0 in Nutshell

Simplistic View of Data Flows: Easy, Definitive

[Diagram: Acquire Data, Process and Analyze Data, and Store Data connected by a single dataflow]

Page 8: Apache NiFi 1.0 in Nutshell

Realistic View of Dataflows: Complex, Convoluted

[Diagram: several Acquire Data sources and several Store Data destinations connected to Process and Analyze Data by many dataflows]

Page 9: Apache NiFi 1.0 in Nutshell

NiFi 1.0 has 170+ Processors, 30% Increase from NiFi 0.7

Hash, Extract, Merge, Duplicate, Scan, GeoEnrich, Replace, Convert, Split, Translate, Route Content, Route Context, Route Text, Control Rate, Distribute Load, GenerateTableFetch, JoltTransformJSON, Prioritized Delivery, Encrypt, Tail, Evaluate, Execute, Fetch

HL7, FTP, UDP, XML, SFTP, HTTP, Syslog, Email, HTML, Image, AMQP, MQTT

All Apache project logos are trademarks of the ASF and the respective projects.

Page 10: Apache NiFi 1.0 in Nutshell

Deeper Ecosystem Integration – New Processors

Processor | Description
Publish/ConsumeKafka | Two NARs, with Kafka 0.9 and 0.10 client libraries, respectively
JoltTransformJSON | Manipulate JSON data on the fly, with a preview functionality
GenerateTableFetch | Incremental fetch + parallel fetch against source table partitions
PutHiveQL | Ingest to Hive tables
SelectHiveQL | Select from Hive tables
PutHiveStreaming | Ingest streaming data to Hive, leveraging the Hive Streaming API
ConvertAvroToORC | Format conversion, Avro to ORC
Publish/ConsumeMQTT | MQTT is a popular protocol in the IoT world
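The "incremental fetch + parallel fetch" idea in the GenerateTableFetch row can be illustrated with a small sketch. This is not the processor's implementation; the function name, its parameters, and the LIMIT/OFFSET SQL dialect are all assumptions chosen for illustration. It emits one paged query per partition, restricted to rows newer than the last key already seen, so that each query could be executed by a different worker.

```python
from typing import Iterator, Optional


def generate_fetch_queries(table: str, key_column: str, row_count: int,
                           partition_size: int,
                           last_max_key: Optional[int] = None) -> Iterator[str]:
    """Yield one paged SELECT per partition (hypothetical helper)."""
    # Incremental part: skip rows at or below the last key already fetched.
    where = f" WHERE {key_column} > {last_max_key}" if last_max_key is not None else ""
    # Parallel part: each page is an independent query.
    for offset in range(0, row_count, partition_size):
        yield (f"SELECT * FROM {table}{where} ORDER BY {key_column} "
               f"LIMIT {partition_size} OFFSET {offset}")


# 250 new rows split into pages of 100 rows each.
queries = list(generate_fetch_queries("events", "id", row_count=250,
                                      partition_size=100, last_max_key=1000))
```

Each generated statement is self-contained, which is what allows the pages to be fetched concurrently.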

Page 11: Apache NiFi 1.0 in Nutshell

Data Movement Management

SOURCES, REGIONAL INFRASTRUCTURE, CORE INFRASTRUCTURE

Constrained, High-Latency, Localized Context

Hybrid – Cloud/On-Premise, Low-Latency, Global Context

Page 12: Apache NiFi 1.0 in Nutshell

Hortonworks DataFlow (HDF)

SOURCES, REGIONAL INFRASTRUCTURE, CORE INFRASTRUCTURE

§ Constrained
§ High-latency
§ Localized context

§ Hybrid – cloud/on-premises
§ Low-latency
§ Global context

Page 13: Apache NiFi 1.0 in Nutshell

Detailed Break Down of Requirements

Flow Management

→ Req 1: Acquire data from various wearable devices’ cloud instances

→ Req 2: Move data from customer cloud instances to the on-premise instance

→ Req 3: Perform intelligent routing & filtering of data. The routing and filtering rules will often be changed at run-time.

→ Req 4: Deliver the data to various downstream systems. New downstream apps will always appear, and the data should be fed to them when they come online.

→ Req 5: Parse the device data to a standardized format that downstream systems can understand

→ Req 6: Enrich the data with contextual information including patient/customer info (age, gender, etc.)

→ Req 7: Recognize the pattern when the resting heart rate exceeds a certain threshold (the insight), and then create an alert/notification.

→ Req 8: Run an outlier detection model on the streaming heart rate that comes in. If the score is above a certain threshold, alert on the heart rate.

Stream Processing & Analytics
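Req 7 above amounts to a threshold filter over a stream of readings. The following is a minimal sketch of that idea only; the `Reading` type, the field names, the alert message format, and the 100 bpm threshold are assumptions made for the example and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Reading:
    patient_id: str           # hypothetical identifier
    resting_heart_rate: int   # beats per minute


def alerts_for(readings: Iterable[Reading], threshold_bpm: int) -> List[str]:
    """Return one alert message per reading strictly above the threshold."""
    return [f"ALERT patient={r.patient_id} resting_hr={r.resting_heart_rate}"
            for r in readings if r.resting_heart_rate > threshold_bpm]


sample = [Reading("p1", 62), Reading("p2", 104), Reading("p3", 100)]
alerts = alerts_for(sample, threshold_bpm=100)
```

A reading exactly at the threshold does not alert here because the comparison is strict; whether the boundary should alert is a design choice the requirement leaves open.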

Page 14: Apache NiFi 1.0 in Nutshell

Agenda

What’s NiFi

NiFi 1.0 Enhancements

NiFi on the edge

Common issues

What’s Next?

Page 15: Apache NiFi 1.0 in Nutshell

NiFi 1.0: Modernized UI

Page 16: Apache NiFi 1.0 in Nutshell

Modernized UI – Complete Interface Redesign

Page 17: Apache NiFi 1.0 in Nutshell

Connect Components to design your dataflow

From left to right:

Component | What for?
Processor | Purpose-built processing unit, e.g. GetXXX, PutXXX
Input Port | Receiving data endpoint between Process Groups (local/remote)
Output Port | Exposing data endpoint between Process Groups (local/remote)
Process Group | Must have, to design a well-structured dataflow
Remote Process Group | Enables data transfer between NiFi deployments via Site-to-Site
Funnel | Bundles multiple relationships into one
Template | Share part of a dataflow
Label | Useful to visually group processors and add descriptions

Page 18: Apache NiFi 1.0 in Nutshell

Data Provenance

Page 19: Apache NiFi 1.0 in Nutshell

NiFi 1.0: Multitenant Authorization

Page 20: Apache NiFi 1.0 in Nutshell

NiFi 0.x - Authorization Model

→ Previously had role-based authorization
  – Dataflow Manager (DFM)
  – Monitor
  – Provenance
  – Admin
  – Proxy
  – NiFi

→ Limitation: all-or-nothing model
  – DFM can change everything, Monitor can change nothing
  – Can’t give a user the ability to modify/view only certain components
  – Would require standing up multiple NiFi instances

Page 21: Apache NiFi 1.0 in Nutshell

NiFi 1.0 - Authorization Model

→ NiFi 1.0 introduces a new delegated authorization model

→ Authorize each request based on user identity, action, and resource
  – Example for user1 modifying properties on processor1:
    • User Identity: user1
    • Action: WRITE
    • Resource: processor1 (uuid)

→ If the authorizer says the resource is not found, the parent is checked… if the parent isn’t found, the parent’s parent is checked, and so on…
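The lookup described above (identity, action, resource, then fall back to the parent when no policy is found) can be sketched as follows. This is an illustration of the idea rather than NiFi's authorizer code; the `PARENTS` and `POLICIES` tables, the resource names, and the deny-by-default result at the top of the hierarchy are assumptions.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical hierarchy: child resource -> parent resource.
PARENTS: Dict[str, str] = {"processor1": "group1", "group1": "root"}

# Hypothetical policies: (resource, action) -> identities that are allowed.
POLICIES: Dict[Tuple[str, str], Set[str]] = {
    ("root", "READ"): {"user1", "user2"},
    ("group1", "WRITE"): {"user1"},
}


def is_authorized(identity: str, action: str, resource: str) -> bool:
    """Check the closest resource that defines a policy for this action."""
    current: Optional[str] = resource
    while current is not None:
        allowed = POLICIES.get((current, action))
        if allowed is not None:
            # The first policy found decides; ancestors are not consulted.
            return identity in allowed
        current = PARENTS.get(current)  # no policy here, so check the parent
    return False  # assumption: nothing defined anywhere means deny


user1_can_write = is_authorized("user1", "WRITE", "processor1")
```

In this sketch `processor1` has no policy of its own, so the WRITE decision for it comes from `group1`, and the READ decision comes from `root`.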

Page 22: Apache NiFi 1.0 in Nutshell

NiFi 1.0 – NiFi Managed Authorizer vs. External Authorizer

→ Managed Authorizer
  – File-based persistence
    • Could be extended to other persistence mechanisms
  – NiFi UI to manage policies
  – NiFi controls authorization logic

→ External Authorizer
  – Ranger integration
  – Ranger UI to manage policies
  – Ranger controls authorization logic

Page 23: Apache NiFi 1.0 in Nutshell

NiFi 1.0 – Managing Users

→ Clicking the new user icon allows the admin to create Users and Groups
  – Individual Users can be grouped
  – Groups can be assigned members

→ Clicking the edit user icon allows the admin to update a specific User/Group

Page 24: Apache NiFi 1.0 in Nutshell

NiFi 1.0 – UI Overview

Users icon in Global Menu: used to access Users/Groups

Lock icon in Global Menu: used to access Global policies

Page 25: Apache NiFi 1.0 in Nutshell

NiFi 1.0 – UI Overview

Lock icon in palette: used to access policies for the currently selected component

Selection Context

Page 26: Apache NiFi 1.0 in Nutshell

NiFi 1.0 – Overriding Component Policies

→ Components inherit policies from the closest ancestor Process Group with policies defined

→ View and Modify policies are handled independently

→ Click Override to define a new policy, then add Users and Groups

→ New Users and Groups override the inherited policies (whitelisting)

Page 27: Apache NiFi 1.0 in Nutshell

NiFi 1.0 - Multi-Tenancy Example

→ Create a Group for Team 1 and a Group for Team 2

→ Give Team 1 view & modify for Process Group 1

→ Give Team 2 view & modify for Process Group 2

→ A user from Team 1 would see:

Can’t see the name of the group and can’t right-click to configure the group, but can enter the group

Page 28: Apache NiFi 1.0 in Nutshell

NiFi 1.0 – Revisions

→ Revision per component

→ Supports concurrent editing of different components without need for refreshing
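One common way to realize a per-component revision is optimistic locking: each component carries its own counter, and an update is accepted only if the client sends the counter value it last saw. The sketch below illustrates that general technique; the class and function names are hypothetical and it is not taken from NiFi's code.

```python
from dataclasses import dataclass, field
from typing import Dict


class StaleRevisionError(Exception):
    """Raised when a client edits a component using an outdated revision."""


@dataclass
class Component:
    properties: Dict[str, str] = field(default_factory=dict)
    revision: int = 0  # tracked per component, not per flow


def update(component: Component, client_revision: int,
           changes: Dict[str, str]) -> int:
    """Apply changes only if the client saw the latest revision."""
    if client_revision != component.revision:
        raise StaleRevisionError(
            f"expected revision {component.revision}, got {client_revision}")
    component.properties.update(changes)
    component.revision += 1
    return component.revision


proc_a, proc_b = Component(), Component()
update(proc_a, 0, {"Batch Size": "10"})    # one user edits component A
update(proc_b, 0, {"Directory": "/data"})  # another edits B; no conflict
try:
    update(proc_a, 0, {"Batch Size": "20"})  # stale view of A is rejected
    stale_rejected = False
except StaleRevisionError:
    stale_rejected = True
```

Because the counters are independent, editing one component never invalidates another user's view of a different component, which is the behavior the slide describes.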

Page 29: Apache NiFi 1.0 in Nutshell

NiFi 1.0: Zero Master Clustering

Page 30: Apache NiFi 1.0 in Nutshell

NiFi 0.x: NCM (NiFi Cluster Manager)

[Diagram: an NCM in front of Node 1 and Node 2, one of which is the Primary node; an External Data Source delivers chunks to the nodes]

External Data Source: distribution mechanism depends on data source

Web UI: interact with NCM

Other NiFi: Site-to-Site gets the topology from the NCM, then transfers data p2p

Page 31: Apache NiFi 1.0 in Nutshell

NiFi 1.0: ZMC (Zero Master Clustering)

[Diagram: Node 1, Node 2, and Node 3 with ZooKeeper; one node is the Primary and one is the Coordinator; an External Data Source delivers chunks to the nodes]

External Data Source: distribution mechanism depends on data source

Web UI: interact with any node

Other NiFi: Site-to-Site gets the topology from one of the nodes, then transfers data p2p

ZooKeeper elects the Cluster Coordinator and the Primary node

Any node can fail
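The effect of electing roles instead of relying on a fixed manager can be shown with a toy model. This sketch does not use ZooKeeper and is not NiFi's election code; it assumes, in the spirit of the usual ZooKeeper leader-election recipe, that the longest-standing live candidate holds a role and that the next candidate takes over when it leaves. The `Election` class and node names are hypothetical.

```python
from typing import List, Optional


class Election:
    """One election per role; the longest-standing live candidate leads."""

    def __init__(self) -> None:
        self._candidates: List[str] = []  # kept in join order

    def join(self, node: str) -> None:
        if node not in self._candidates:
            self._candidates.append(node)

    def leave(self, node: str) -> None:
        # Called when a node fails or disconnects.
        if node in self._candidates:
            self._candidates.remove(node)

    def leader(self) -> Optional[str]:
        return self._candidates[0] if self._candidates else None


# The two roles are elected independently of each other.
coordinator, primary = Election(), Election()
for node in ("node1", "node2", "node3"):
    coordinator.join(node)
for node in ("node2", "node3", "node1"):
    primary.join(node)

coordinator.leave("node1")  # node1 fails; node2 takes over as coordinator
primary.leave("node1")      # the primary role is unaffected
```

No node is special in this model: whichever node fails, the remaining candidates still produce a leader for each role.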

Page 32: Apache NiFi 1.0 in Nutshell

NiFi 1.0: And More!

Page 33: Apache NiFi 1.0 in Nutshell

Foundational Work for SDLC

→ Deterministic template export
  – Deterministic ordering, template XML file
  – Version control of the template
  – Collaborative SDLC effort

→ Variable registry
  – Phase one implementation
  – In-memory variable registry
  – The same key referenced in a template, mapped to different environment-specific values
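The last point, one key in a template resolving to different values per environment, can be sketched with a plain dictionary acting as the in-memory registry. The `${key}` reference syntax follows NiFi's Expression Language style, but the `resolve` function, the key names, and the example values are assumptions for illustration, not NiFi's implementation.

```python
import re
from typing import Dict

# Matches references such as ${db.host}.
REFERENCE = re.compile(r"\$\{([A-Za-z0-9_.]+)\}")


def resolve(text: str, registry: Dict[str, str]) -> str:
    """Replace every ${key} in text with its value from the registry."""
    def substitute(match: "re.Match[str]") -> str:
        key = match.group(1)
        if key not in registry:
            raise KeyError(f"undefined variable: {key}")
        return registry[key]
    return REFERENCE.sub(substitute, text)


# The same template property, unchanged across environments.
template_property = "jdbc:mysql://${db.host}:3306/${db.name}"

# Hypothetical per-environment registries.
dev = {"db.host": "localhost", "db.name": "flows_dev"}
prod = {"db.host": "db.internal.example", "db.name": "flows"}

dev_url = resolve(template_property, dev)
prod_url = resolve(template_property, prod)
```

The template itself stays identical under version control; only the registry differs between deployments.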

Page 34: Apache NiFi 1.0 in Nutshell

Enter the TLS Toolkit

⬢ Command-line tool to automate certificate generation and configuration

⬢ Self-contained certificate authority (CA) for certificate signing

⬢ Keystore & truststore generation

⬢ Client certificate generation

⬢ Automatically updates nifi.properties

⬢ Underpins Ambari TLS integration

Page 35: Apache NiFi 1.0 in Nutshell

Site-to-Site Refactoring – S2S HTTP(S) Protocol through Proxy Server

[Diagram: a NiFi JVM (REST API, Framework, Extension API with Proc, CS, and ReportTask, S2S API) exchanging data with another JVM that uses the S2S Client Libraries]

Socket protocol: TCP

HDF 2.0: HTTP(S) protocol, via an HTTP proxy

Page 36: Apache NiFi 1.0 in Nutshell

Agenda

What’s NiFi

NiFi 1.0 Enhancements

NiFi on the edge

Common issues

What’s Next?

Page 37: Apache NiFi 1.0 in Nutshell

Edge Intelligence with Apache MiNiFi

Key Features

→ Guaranteed delivery
→ Data buffering
  ‒ Back pressure
  ‒ Pressure release
→ Prioritized queuing
→ Flow-specific QoS
  ‒ Latency vs. throughput
  ‒ Loss tolerance
→ Data provenance
→ Recovery / recording a rolling log of fine-grained history
→ Designed for extension

Different from Apache NiFi

→ Design and Deploy
→ Warm re-deploys
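Two of the key features listed for MiNiFi, back pressure and prioritized queuing, can be illustrated together with a bounded priority queue. This is a generic sketch of the concepts, not MiNiFi code; the class name, the threshold parameter, and the convention that a lower number means higher priority are assumptions.

```python
import heapq
import itertools
from typing import List, Tuple


class BoundedPriorityQueue:
    """Buffers items by priority and refuses new ones when full."""

    def __init__(self, back_pressure_threshold: int) -> None:
        self._threshold = back_pressure_threshold
        self._heap: List[Tuple[int, int, str]] = []
        self._counter = itertools.count()  # keeps FIFO order within a priority

    def offer(self, priority: int, item: str) -> bool:
        """Return False (back pressure) when the buffer is at its threshold."""
        if len(self._heap) >= self._threshold:
            return False
        heapq.heappush(self._heap, (priority, next(self._counter), item))
        return True

    def poll(self) -> str:
        """Remove and return the highest-priority (lowest number) item."""
        return heapq.heappop(self._heap)[2]


queue = BoundedPriorityQueue(back_pressure_threshold=2)
queue.offer(5, "bulk telemetry")
queue.offer(1, "alarm")
accepted = queue.offer(3, "status")  # buffer full, so the source must wait
first_out = queue.poll()             # the alarm leaves before the bulk data
```

Refusing the third item instead of dropping buffered data is what lets the upstream source slow down; once an item is polled, capacity is available again.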

Page 38: Apache NiFi 1.0 in Nutshell

NiFi vs. MiNiFi Java Processor, Smaller Footprint ~40MB

MiNiFi: NiFi Framework, Components

NiFi: NiFi Framework, User Interface, Components

Page 39: Apache NiFi 1.0 in Nutshell

Agenda

What’s NiFi

NiFi 1.0 Enhancements

NiFi on the edge

Common issues

What’s Next?

Page 40: Apache NiFi 1.0 in Nutshell

Common issues

→ NiFi Repo configuration issues
→ NiFi SSL configuration or certificate issues
→ ExecuteStreamCommand Processor getting stuck
→ OutOfMemory issues with NCM or processors

Page 41: Apache NiFi 1.0 in Nutshell

Best Practices

→ Debug logging in case of Processor issues
→ Core properties and JVM tuning:

https://community.hortonworks.com/articles/7882/hdfnifi-best-practices-for-setting-up-a-high-perfo.html

Page 42: Apache NiFi 1.0 in Nutshell

Understanding health via NiFi UI

Status Bar

Processor Details

Page 43: Apache NiFi 1.0 in Nutshell

NiFi Summary Page

Page 44: Apache NiFi 1.0 in Nutshell

System Information

Page 45: Apache NiFi 1.0 in Nutshell

Agenda

What’s NiFi

NiFi 1.0 Enhancements

NiFi on the edge

Common issues

What’s Next?

Page 46: Apache NiFi 1.0 in Nutshell

What’s Next

NiFi

→ Framework extension
  – Distributed data durability (HA data)
  – Configuration management flows (SDLC)
→ Enhanced User Experience
  – Template/Extension Registry
  – Variable Registry
→ Deeper ecosystem integration

MiNiFi

→ Central Command and Control
→ Native Agent (GA)

https://cwiki.apache.org/confluence/display/NIFI/Product+requirements

Search for “NiFi product requirements”!

Page 47: Apache NiFi 1.0 in Nutshell

Thank You