Copyright © 2013 Splunk Inc.
Nimish Doshi Principal Systems Engineer, Splunk #splunkconf
How and When to Use Dynamic Lookups
Legal Notices
During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.
Splunk, Splunk>, Splunk Storm, Listen to Your Data, SPL and The Engine for Machine Data are trademarks and registered trademarks of Splunk Inc. in the United States and other countries. All other brand names, product names, or trademarks belong to their respective owners.
©2013 Splunk Inc. All rights reserved.
About Me: Nimish Doshi
• Principal Systems Engineer at Splunk
  – Cover the USA East Coast
• Have been at Splunk since 2008
• Splunk Blogger
• Active App Template and Add-on developer at apps.splunk.com
Agenda
• Lookups in General
• Static Lookups
• Dynamic Lookups
  – Retrieve fields from a web site
  – Retrieve fields from a database
  – Retrieve fields from a persistent cache
Lookups in General
Enrichment
Enrich your events with fields from external sources.
Capturing All This Data Occurs at INDEX Time
• Windows: Registry, event logs, file system, Sysinternals
• Linux/Unix: configurations, syslog, file system, ps, iostat, top
• Virtualization: hypervisor, guest OS, guest apps
• Applications: web logs, Log4J, JMS, JMX, .NET events, code, scripts
• Databases: configurations, audit/query logs, tables, schemas
• Networking: configurations, syslog, SNMP, NetFlow
• Other: logfiles, configs, messages, traps, alerts, metrics, scripts, tickets, changes
Splunk Architecture, BIG DATA Platform
[Architecture diagram. Interfaces: Splunk Web Interface, Splunk CLI Interface, other interfaces and SDKs, REST API. Splunk > Engine: Search, Index, Distributed Search, Scheduling/Alerting, Reporting, Knowledge, Users & Access Controls, Deployment Server. Inputs: Monitor Files, Detect File Changes, Listen to Network Ports, Run Scripts (WMI, Registry, OPSEC LEA, DBI, JMS, VMware API, other APIs). Also shown: Data Routing, Cloning and Load Balancing.]
Lookups
Ultimate Knowledge Base
Lookups enrich your ability to act upon your data
Image: http://sangrea.net/free-cartoons/comp_real-life-search-engine.jpg, Royalty Free Cartoons
Before Lookups
After Lookups
Top Status Description
Integrate External Data
Extend search with lookups to external data sources: LDAP, AD, watch lists, CRM/ERP, CMDB
Correlate IP addresses with locations, accounts with regions
Interesting Things to Lookup
• User's Mailing Address (AD)
• Error Code Descriptions
• Product Names
• Stock Symbol (from CUSIP)
• External Host Address
• Database Query
• Web Service Call for Status
• Geo Location
Other Reasons for Lookup
• Bypass a static developer or vendor that does not enrich logs
• Imaginative correlations
  – e.g. Web site URL with like or dislike count stored in an external source
• Make your data more interesting
  – Better to see textual descriptions than arcane codes
Static Lookups
Static vs. Dynamic Lookup
Static: External data comes from a CSV file.
Dynamic: External data comes from the output of an external script. The output resembles a CSV file.
Static Lookups Review
• Pick what input fields will be used to get output fields
• Create or locate a CSV file that has all fields in proper order
• Tell Splunk via the Manager about your CSV file and your lookup
  – You can also define lookups manually via props.conf and transforms.conf
  – If you use automatic lookups, they will run every time the source, sourcetype, or associated host stanza is used in a search
  – Non-automatic lookups run only when the lookup command is invoked in the search
Example Static Lookup Conf files
props.conf
[access_combined]
lookup_http = http_status status OUTPUT status_description, status_type

transforms.conf
[http_status]
filename = http_status.csv
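The transforms.conf stanza above points at http_status.csv, whose header row must name the input field (status) and the output fields (status_description, status_type). As a rough sketch, such a file could be generated with Python's csv module. The three rows and their status_type labels are illustrative assumptions, not the contents of any file shipped with Splunk, and build_lookup_csv is a hypothetical helper name.

```python
import csv
import io

# Column order matches the lookup definition: input field first, then outputs.
FIELDS = ["status", "status_description", "status_type"]

# Illustrative rows only (assumed for this sketch).
ROWS = [
    {"status": "200", "status_description": "OK", "status_type": "Successful"},
    {"status": "404", "status_description": "Not Found", "status_type": "Client Error"},
    {"status": "500", "status_description": "Internal Server Error", "status_type": "Server Error"},
]

def build_lookup_csv(rows):
    """Return CSV text with a header row followed by one line per status code."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Saving the returned text as http_status.csv gives the lookup a file whose columns line up with the field names used in props.conf.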
Example Automatic Static Lookup
Permissions
local.meta
[lookups/http_status.csv]
access = read : [ * ], write : [ * ]
export = system

[transforms/http_status]
access = read : [ * ], write : [ * ]
export = system
Lookup Topics Not Covered in This Session
• Field extractions are performed before lookups
• Lookups run on the indexer
  – To ensure lookups do not run on remote peers, use local=true in the lookup command
• You can also use outputlookup to populate a CSV file
• You can also use time-based lookups to find fields that match your event's timestamp with an interval in the lookup CSV
Dynamic Lookups
Dynamic Lookups
• Write the script to simulate access to external source
• Test script with one set of inputs
• Create the Splunk version of the lookup script
• Register the script with Splunk via Manager or conf files
• Test the script explicitly before using automatic lookups
Lookups vs. Custom Command
• Use dynamic lookups when returning fields given input fields
  – Standard for users who already know how to use lookup
• Use a custom command when doing more than just lookup
  – Not all use cases involve just returning fields
    – Decrypt event data
    – Translate event data from one format to another with new fields (e.g. FIX Orders)
Write/Test External Field Gathering Script
[Diagram: your Python script sends input fields to external data in the cloud, which returns output fields.]
Example Script to Test External Lookup
import socket

# Given a host, find the ip
def mylookup(host):
    try:
        # gethostbyname_ex returns (hostname, aliaslist, ipaddrlist)
        hostname, aliaslist, ipaddrlist = socket.gethostbyname_ex(host)
        return ipaddrlist
    except socket.error:
        return []
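Before anything is registered with Splunk, the lookup function can be exercised on its own with one input, as the "Test script with one set of inputs" step suggests. A minimal Python 3 sketch follows. It keeps only the address list from socket.gethostbyname_ex, and the host name localhost is just an assumed test input.

```python
import socket

def mylookup(host):
    """Given a host name, return its list of IPv4 addresses ([] if unresolved)."""
    try:
        # gethostbyname_ex returns (hostname, aliaslist, ipaddrlist).
        _name, _aliases, ipaddrlist = socket.gethostbyname_ex(host)
        return ipaddrlist
    except (OSError, UnicodeError):
        # Resolution failures (socket.gaierror, socket.herror) are OSError
        # subclasses; malformed names can raise UnicodeError.
        return []

if __name__ == "__main__":
    # One set of inputs, run by hand from the command line.
    print(mylookup("localhost"))
```

Returning an empty list on failure keeps the caller's logic simple: a falsy result means "no match".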
Test External Field Gathering Script with Splunk
[Diagram: Splunk, your Python script, and external data in the cloud; output fields are returned.]
Script for Splunk Simulates Reading Input CSV
hostname, ip
a.b.c.com
Zorrosty.com
seemanny.com
Output of Script Returns Logically Complete CSV
hostname, ip
a.b.c.com, 1.2.3.4
Zorrosty.com, 192.168.1.10
seemanny.com, 10.10.2.10
transforms.conf for Dynamic Lookup
[NameofLookup]
external_cmd = <name>.py field1 … fieldN
external_type = python
fields_list = field1, …, fieldN
Example Dynamic Lookup Conf files
transforms.conf

# Note this is an explicit lookup
[whoisLookup]
external_cmd = whois_lookup.py ip whois
external_type = python
fields_list = ip, whois
Dynamic Lookup Python Flow
def lookup(input):
    Perform external lookup based on input. Return result.

main():
    Check standard input for CSV headers.
    Write headers to standard output.
    For each line in standard input (input fields):
        Gather input fields into a dictionary (key-value structure)
        ret = lookup(input fields)
        If ret:
            Send to standard output input values and return values from lookup
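The flow above can be sketched as a compact Python 3 script. This is an illustration under stated assumptions, not the add-on's actual code: lookup here is a hypothetical stand-in backed by a small dictionary, and run is a helper name chosen for this sketch. It uses csv.DictReader and csv.DictWriter so that each input line becomes the key-value structure the flow describes.

```python
import csv
import sys

def lookup(hostname):
    """Hypothetical stand-in for the external call; a real script would
    query DNS, a web service, or a database here."""
    table = {"a.b.c.com": "1.2.3.4", "Zorrosty.com": "192.168.1.10"}
    return table.get(hostname, "")

def run(infile, outfile, in_field, out_field):
    """Read CSV from infile, fill in out_field, write CSV to outfile."""
    reader = csv.DictReader(infile, restval="")
    header = reader.fieldnames or []
    if in_field not in header or out_field not in header:
        raise ValueError("input and output fields must exist in the CSV header")
    writer = csv.DictWriter(outfile, fieldnames=header,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Only call the external source when the output field is still empty.
        if row[in_field] and not row[out_field]:
            row[out_field] = lookup(row[in_field])
        # Emit the row only when the output field ended up with a value.
        if row[out_field]:
            writer.writerow(row)

if __name__ == "__main__" and len(sys.argv) == 3:
    # Field names arrive as arguments, as in: external_cmd = <name>.py field1 fieldN
    run(sys.stdin, sys.stdout, sys.argv[1], sys.argv[2])
```

Run by hand, the script reads a header line plus rows with an empty second column on standard input and writes back only the rows it could complete, which mirrors the input and output CSV shown on the earlier slides.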
Whois Lookup
import csv
import sys

def main():
    if len(sys.argv) != 3:
        print("Usage: python whois_lookup.py [ip field] [whois field]")
        sys.exit(0)
    ipf = sys.argv[1]
    whoisf = sys.argv[2]
    r = csv.reader(sys.stdin)
    w = None
    header = []
    first = True
    …
Whois Lookup (cont.) to Read CSV Header
    # First read the "CSV header" and output the field names, then continue
    for line in r:
        if first:
            header = line
            if whoisf not in header or ipf not in header:
                print("IP and whois fields must exist in CSV data")
                sys.exit(0)
            csv.writer(sys.stdout).writerow(header)
            w = csv.DictWriter(sys.stdout, header)
            first = False
            continue
        …
Whois Lookup (cont.) to Populate Input Fields
        # Read the result and populate the values for the input fields (ip address in our case)
        result = {}
        i = 0
        while i < len(header):
            if i < len(line):
                result[header[i]] = line[i]
            else:
                result[header[i]] = ''
            i += 1
Whois Lookup (cont.) to Populate Output Fields
        # If both fields are already populated, pass the row through unchanged
        if len(result[ipf]) and len(result[whoisf]):
            w.writerow(result)
        # Else call external website to get whois field from the ip address as the key
        elif len(result[ipf]):
            result[whoisf] = lookup(result[ipf])
            if len(result[whoisf]):
                w.writerow(result)
Whois Lookup Function
import urllib

LOCATION_URL = "http://some.url.com?query="

# Given an ip, return the whois response
def lookup(ip):
    try:
        # Python 2 API; in Python 3 the equivalent is urllib.request.urlopen
        whois_ret = urllib.urlopen(LOCATION_URL + ip)
        lines = whois_ret.readlines()
        return lines
    except Exception:
        return ''
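The slide's function returns a list of lines, which does not fit neatly into a single CSV cell. A hedged Python 3 variant is sketched here: it returns the response body as one string, adds a timeout, and takes the opener as a parameter so it can be tried without network access. The opener and timeout parameters are assumptions of this sketch, and LOCATION_URL is still the placeholder from the slide.

```python
import urllib.parse
import urllib.request

LOCATION_URL = "http://some.url.com?query="  # placeholder URL from the slide

def lookup(ip, opener=urllib.request.urlopen, timeout=5):
    """Return the service's response body as one string, or '' on failure."""
    try:
        url = LOCATION_URL + urllib.parse.quote(ip)
        with opener(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        # urllib.error.URLError and socket timeouts are OSError subclasses.
        return ""
```

Passing a fake opener that returns an in-memory byte stream is enough to check the function's behaviour before pointing it at a real service.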
Database Lookups in General
• Use DB Connect from apps.splunk.com if possible
  – Splunk supported
  – No code to be written to do lookups for popular RDBMS
• Use your own DB lookup when
  – Your database is not supported by DB Connect
  – You want to perform the lookup with custom code to meet a requirement
Database Lookup vs. Database Sent to Index
• Depends…
• Use a lookup when
  – Using needle in the haystack searches with a few users
  – Using form searches returning few results
• Index the database table or view when
  – Having lots of users and ad hoc reporting is needed
  – It is OK to have "stale" data (N minutes old) for a dynamic database
Database Lookup
• Acquire proper modules to connect to the database
• Connect and authenticate to database
  – Use a connection pool, if possible
• Have lookup function query the database
  – Return a list ( [ ] ) of results
Example Database Lookup Using MySQL
# See http://splunk-base.splunk.com/apps/36664/splunk-mysql-connector
# for a general example using MySQL
import MySQLdb

# First connect to DB outside of the for loop
conn = MySQLdb.connect(host="localhost",
                       user="name of user",
                       passwd="password",
                       db="Name of DB")
cursor = conn.cursor()
Example Database Lookup (cont.) Using MySQL
import MySQLdb
…
# Given a city, find its country
def lookup(city, cur):
    try:
        # Let the driver quote the value instead of concatenating it into the SQL
        cur.execute("SELECT country FROM city_country WHERE city = %s", (city,))
        row = cur.fetchone()
        return row[0]
    except Exception:
        return []
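For readers without a MySQL server at hand, the same lookup can be tried with Python's built-in sqlite3 module. This is only a sketch: sqlite3 stands in for MySQLdb, the table contents are assumed sample data, and the function returns a list in both cases, in line with the "Return a list ( [ ] ) of results" guidance. The query passes the city as a bound parameter rather than concatenating it into the SQL string.

```python
import sqlite3

def lookup(city, cur):
    """Given a city, return [country], or [] when no row matches."""
    # The ? placeholder lets the driver quote the value safely
    # (MySQLdb uses %s as its placeholder instead).
    cur.execute("SELECT country FROM city_country WHERE city = ?", (city,))
    row = cur.fetchone()
    return [row[0]] if row else []

# In-memory table with assumed sample data, standing in for the real database.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE city_country (city TEXT, country TEXT)")
cur.executemany("INSERT INTO city_country VALUES (?, ?)",
                [("Paris", "France"), ("Osaka", "Japan")])
```

Binding the value matters for a lookup script because the input field comes straight from event data, which may contain quotes or other characters that would break a concatenated query.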
Example MongoDB Lookup
import pymongo
from pymongo import MongoClient  # replaces the older pymongo.Connection class

# Given a star name, find its magnitude
def lookup(collection, key):
    try:
        star = collection.find_one({'name': key})
        return star['magnitude']
    except Exception:
        return None
...
connection = MongoClient()
db = connection.test_database
star_collection = db.stars
...
Web Services Lookup
• Acquire proper modules to connect to web service
• Connect and authenticate to web service, if necessary
• Have lookup function call your web service method
  – Return a string of results or an empty string, if no match occurs
Example Web Services Lookup Using Suds
from suds.client import Client
…
# Given a name, height, and weight, return percent body fat
def lookup(client, name, height, weight):
    try:
        result = client.service.getPercentBodyFat(name, height, weight)
        if result != None and result != '':
            return result
        else:
            return ""
    except Exception:
        return ""
…
client = Client(<URL to some Web Service WSDL>)
Lookup Using Key Value Persistent Cache
• Download and install Redis
• Download and install Redis Python module
• Import Redis module in Python and populate key value DB
• Import Redis module in lookup function given to Splunk to look up a value given a key
Redis Lookup
import sys
##### CHANGE PATH TO your distribution FIRST ############
sys.path.append("/Library/Python/2.6/site-packages/redis-2.4.5-py2.6.egg")
import redis
…
def main():
    …
    # Connect to redis CHANGE for your DISTRIBUTION
    pool = redis.ConnectionPool(host='localhost', port=6379, db=0)
    redp = redis.Redis(connection_pool=pool)
Redis Lookup (cont.)
# Note that this returns key value pairs. Redis can also return keys mapped to multiple values
def lookup(redp, mykey):
    try:
        return redp.get(mykey)
    except Exception:
        return ""
Combine Persistent Cache with External Lookup
• For data that is "relatively static"
  – First see if the data is in the persistent cache
  – If not, look it up in the external source such as a database or web service
  – If results come back, add results to the persistent cache and return results
• For data that changes often, you will need to create your own cache retention policies
Combining Redis with Whois Lookup
def lookup(redp, ip):
    try:
        ret = redp.get(ip)
        if ret != None and ret != '':
            return ret
        else:
            whois_ret = urllib.urlopen(LOCATION_URL + ip)
            lines = whois_ret.readlines()
            # Cache only non-empty responses
            if lines:
                redp.set(ip, lines)
            return lines  # …
    except Exception:
        return ""
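For data that changes often, the slides note that a cache retention policy is needed. One simple policy is a time-to-live per entry. The Python 3 sketch below illustrates the idea with a tiny in-memory class standing in for Redis. TTLCache, cached_lookup, and the injectable clock are all hypothetical names for this illustration, and external_lookup is whatever function reaches the real source.

```python
import time

class TTLCache:
    """Tiny in-memory stand-in for Redis: get/set with a per-entry lifetime."""

    def __init__(self, ttl_seconds, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}  # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expires = entry
        if self.clock() >= expires:
            del self.store[key]  # stale: forget it and report a miss
            return None
        return value

    def set(self, key, value):
        self.store[key] = (value, self.clock() + self.ttl)

def cached_lookup(cache, key, external_lookup):
    """Try the cache, fall back to the external source, store non-empty results."""
    value = cache.get(key)
    if value:
        return value
    value = external_lookup(key)
    if value:
        cache.set(key, value)
    return value or ""
```

With a real Redis server the expiry bookkeeping would not need to live in the script, since Redis can expire keys itself (for example through its EXPIRE command); the sketch only shows the control flow.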
Where to Get Add-ons Discussed Here
Add-On | Download Location | Release
Whois | http://splunk-base.splunk.com/apps/22381/whois-add-on | 4.x
DBLookup | http://splunk-base.splunk.com/apps/22394/example-lookup-using-a-database | 4.x
Redis Lookup | http://splunk-base.splunk.com/apps/27106/redis-lookup | 4.x
Geo IP Lookup (not in these slides) | http://splunk-base.splunk.com/apps/22282/geo-location-lookup-script-powered-by-maxmind | 4.x
Splunkbase.
Conclusion: Lookups are a powerful way to enhance your search experience beyond indexing
So, What?
Enrich BIG DATA with external sources
THANK YOU