Splunk Overview Training
Duration: 3 days
Skill Level: Introductory and beyond
Hands-On Format: This hands-on class has an approximately 50/50 lab-to-lecture ratio, combining engaging lectures, demos, group activities, and discussions with machine-based practical student labs and project work.
Course Overview
Are you in charge of creating Splunk knowledge objects for your organization? Then you will benefit from this course, which walks you through the various knowledge objects and how to create them. Working with Splunk is a comprehensive hands-on course that teaches students how to search, navigate, tag, build alerts, create simple reports and dashboards in Splunk, and use Splunk's Pivot interface. Working in a hands-on learning environment, students will learn how to use Splunk analytics to search large volumes of data efficiently. Students will learn how to run basic searches, save and share search results, create tags and event types, create reports, create different charts, perform calculations and format search data, and enrich data with lookups. Examples center around financial institutions.
What You’ll Learn: Course Objectives
After completing this Splunk course, you will be able to:
Get insight into the Splunk Search app
Save and share search results
Understand the use of fields in searching
Learn search fundamentals using Splunk
Explore the available visualizations
Create reports and different chart types
Perform data analysis, calculation, and formatting
Understand and execute various techniques for enriching data with lookups
Recommended Audience & Pre-Requisites
This is a technical class for technical people, geared toward users, administrators, architects, developers, and support engineers who are new to Splunk. This course is ideal for anyone in your organization who needs to examine and use IT data. Ideal attendees include:
Beginners in Splunk who want to enhance their knowledge of the software
System administrators and software developers
Professionals who are eager to learn to search and analyze machine-generated data using fast, agile software
Course Topics & Agenda

Course Modules 1-4
Day 1 - Morning

Module 1 - Basic Understanding of Architecture (Overview)
What are the components?
Discussion on Forwarders- UF/HF
Common ports for the setup
License Master/Slave relationship
Understanding of Deployment Server and Indexer

Module 2 - Introduction to Splunk's User Interface
Understand the uses of Splunk
Define Splunk Apps
Learn basic navigation in Splunk
Hands on Lab covering: Basic Navigation
End of Module Hands-on Quiz

Module 3 - Searching
Run basic searches
Set the time range of a search
Hands on Lab covering: Run basic searches, Set the time range of a search
Identify the contents of search results
Refine searches
Hands on Lab covering: Identify the contents of search results, Refine searches
Use the timeline
Work with events
Hands on Lab covering: Use the timeline, Work with events
Control a search job
Save search results
Hands on Lab covering: Control a search job, Save search results
End of Module Hands-on Quiz

Module 4 - Using Fields in Searches
Understand fields
Use fields in searches
Use the fields sidebar
Hands on Lab covering: Understand fields, Use fields in searches, Use the fields sidebar
End of Module Hands-on Quiz

Course Modules 5-7
Day 1 - Afternoon

Module 5 - Creating Reports and Visualizations
Save a search as a report
Edit reports
Create reports that include visualizations such as charts and tables
Hands on Lab covering: Save a search as a report, Edit reports, Create reports that include visualizations such as charts and tables
Add reports to a dashboard
Create an instant pivot from a search
Hands on Lab covering: Add reports to a dashboard, Create an instant pivot from a search
End of Module Hands-on Quiz

Module 6 - Working with Dashboards
Create a dashboard
Add a report to a dashboard
Hands on Lab covering: Create a dashboard, Add a report to a dashboard
Add a pivot report to a dashboard
Edit a dashboard
Hands on Lab covering: Add a pivot report to a dashboard, Edit a dashboard
End of Module Hands-on Quiz

Module 7 - Search Fundamentals
Review basic search commands and general search practices
Examine the anatomy of a search
Use the following commands to perform searches: fields, table, rename, rex, multikv
Hands on Lab covering: Review basic search commands and general search practices, Examine the anatomy of a search, Use the fields, table, rename, rex, and multikv commands
End of Module Hands-on Quiz

Course Modules 8-10
Day 2 - Morning (Deep Dive Topics)

Module 8 - Reporting Commands, Part 1
Use the following commands and their functions: top, rare
Hands on Lab covering: top, rare
Use the stats and addcoltotals commands
Hands on Lab covering: stats, addcoltotals
End of Module Hands-on Quiz

Module 9 - Reporting Commands, Part 2
Explore the available visualizations
Create a basic chart
Split values into multiple series
Hands on Lab covering: Explore the available visualizations, Create a basic chart, Split values into multiple series
Omit null and other values from charts
Create a time chart
Chart multiple values on the same timeline
Hands on Lab covering: Omit null and other values from charts, Create a time chart, Chart multiple values on the same timeline
Format charts
Explain when to use each type of reporting command
Hands on Lab covering: Format charts, Explain when to use each type of reporting command
End of Module Hands-on Quiz

Module 10 - Analyzing, Calculating, and Formatting Results
Use the eval command
Perform calculations
Convert values
Hands on Lab covering: Use the eval command, Perform calculations, Convert values
Round values
Format values
Hands on Lab covering: Round values, Format values
Use conditional statements
Further filter calculated results
Hands on Lab covering: Use conditional statements, Further filter calculated results
End of Module Hands-on Quiz

Course Modules 11-12
Day 2 - Afternoon (Deep Dive Topics)

Module 11 - Creating Field Aliases and Calculated Fields
Define naming conventions
Create and use field aliases
Create and use calculated fields
Hands on Lab covering: Define naming conventions, Create and use field aliases, Create and use calculated fields
End of Module Hands-on Quiz

Module 12 - Creating Field Extractions
Perform field extractions using the Field Extractor
Hands on Lab covering: Perform field extractions using the Field Extractor
End of Module Hands-on Quiz
Course Modules 13-15
Day 3 - Morning

Module 13 - Creating Tags and Event Types
Create and use tags
Describe event types and their uses
Create an event type
Hands on Lab covering: Create and use tags, Describe event types and their uses, Create an event type
End of Module Hands-on Quiz

Module 14 - Creating Workflow Actions
Describe the function of a workflow action
Create a GET workflow action
Hands on Lab covering: Describe the function of a workflow action, Create a GET workflow action
Create a POST workflow action
Create a Search workflow action
Hands on Lab covering: Create a POST workflow action, Create a Search workflow action
End of Module Hands-on Quiz

Module 15 - Creating and Managing Alerts
Describe alerts
Create alerts
View fired alerts
Hands on Lab covering: Describe alerts, Create alerts, View fired alerts
End of Module Hands-on Quiz

Course Modules 16-17
Day 3 - Afternoon

Module 16 - Creating and Using Macros
Describe macros
Manage macros
Create and use a basic macro
Hands on Lab covering: Describe macros, Manage macros, Create and use a basic macro
Define arguments and variables for a macro
Add and use arguments with a macro
Hands on Lab covering: Define arguments and variables for a macro, Add and use arguments with a macro
End of Module Hands-on Quiz

Module 17 - Using Pivot
Describe Pivot
Understand the relationship between data models and Pivot
Select a data model object
Hands on Lab covering: Describe Pivot, Understand the relationship between data models and Pivot, Select a data model object
Create a pivot report
Save a pivot report as a dashboard
Hands on Lab covering: Create a pivot report, Save a pivot report as a dashboard
End of Module Hands-on Quiz

Post Course Final Quiz
At the end of class, each attendee will take a post-course quiz that gauges the student's retention of the skills and topics covered throughout the course. The quiz will be distributed either on paper or online at the end of class and graded promptly.
Module 1 - Basic Understanding of Architecture (Overview)
What are the components?
Discussion on Forwarders- UF/HF
Common ports for the setup
License Master/Slave relationship
Understanding of Deployment Server and Indexer
Section 1-What are the components?

Splunk Enterprise performs three key functions as it moves data through the data pipeline. First, it consumes data from files, the network, or elsewhere. Then it indexes the data. (Actually, it first parses and then indexes the data, but for purposes of this discussion, we consider parsing to be part of the indexing process.) Finally, it runs interactive or scheduled searches on the indexed data.
You can split this functionality across multiple specialized instances of Splunk Enterprise, ranging in number from just a few to thousands, depending on the quantity of data you're dealing with and other variables in your environment. You might, for example, create a deployment with many instances that only consume data, several other instances that index the data, and one or more instances that handle search requests. These specialized instances are known collectively as components. There are several types of components. For a typical mid-size deployment, for example, you can deploy lightweight versions of Splunk Enterprise, called forwarders, on the machines where the data originates. The forwarders consume data locally and then forward the data across the network to another Splunk Enterprise component, called the indexer. The indexer does the heavy lifting; it indexes the data and runs searches. It should reside on a machine by itself. The forwarders, on the other hand, can easily co-exist on the machines generating the data, because the data-consuming function has minimal impact on machine performance.
This diagram shows several forwarders sending data to a single indexer:
As you scale up, you can add more forwarders and indexers. For a larger deployment, you might have hundreds of forwarders sending data to a number of indexers. You can use load balancing on the forwarders, so that they distribute their data across some or all of the indexers. Not only does load balancing help with scaling, but it also provides a fail-over capability if one of the indexers goes down. The forwarders automatically switch to sending their data to any indexers that remain alive.
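The load balancing described above is configured in outputs.conf on each forwarder. A minimal sketch, assuming hypothetical indexer host names and the commonly used (but site-specific) receiving port 9997:

```ini
# outputs.conf on a forwarder (host names and port are illustrative)
[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
# Listing several receivers enables automatic load balancing; if one
# indexer goes down, the forwarder switches to the ones still alive.
server = indexer1.example.com:9997, indexer2.example.com:9997
```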
In this diagram, each forwarder load-balances its data across two indexers:
These are the fundamental components and features of a Splunk Enterprise distributed environment:
Indexers
Forwarders
Search heads
Deployment server
Indexer
A Splunk Enterprise instance that indexes data, transforming raw data into events and placing the results into an index. It also searches the indexed data in response to search requests. The indexer also frequently performs the other fundamental Splunk Enterprise functions: data input and search management. In larger deployments, forwarders handle data input and forward the data to the indexer for indexing. Similarly, although indexers always perform searches across their own data, in larger deployments, a specialized Splunk Enterprise instance, called a search head, handles search management and coordinates searches across multiple indexers.
Forwarder
A Splunk Enterprise instance that forwards data to another Splunk Enterprise instance, such as an indexer or another forwarder, or to a third-party system. There are three types of forwarders:
A universal forwarder is a dedicated, streamlined version of Splunk Enterprise that contains only the essential components needed to send data.
A heavy forwarder is a full Splunk Enterprise instance, with some features disabled to achieve a smaller footprint.
A light forwarder is a full Splunk Enterprise instance, with most features disabled to achieve a small footprint. The universal forwarder supersedes the light forwarder for nearly all purposes. The light forwarder has been deprecated as of Splunk Enterprise version 6.0.0.
The universal forwarder is the best tool for forwarding data to indexers. Its main limitation is that it forwards only unparsed data. To send event-based data to indexers, you must use a heavy forwarder.

Search heads
In a distributed search environment, a Splunk Enterprise instance that handles search management functions, directing search requests to a set of search peers and then merging the results back to the user. A Splunk Enterprise instance can function as both a search head and a search peer. A search head that performs only searching, and not any indexing, is referred to as a dedicated search head. Search head clusters are groups of search heads that coordinate their activities.
Deployment server
A Splunk Enterprise instance that acts as a centralized configuration manager, grouping together and collectively managing any number of Splunk Enterprise instances. Instances that are remotely configured by deployment servers are called deployment clients. The deployment server downloads updated content, such as configuration files and apps, to deployment clients. Units of such content are known as deployment apps.
Section 2-Discussion on Forwarders- UF/HF

The universal forwarder
The universal forwarder is Splunk's lightweight forwarder. You use it to gather data from a variety of inputs and forward the data to a Splunk Enterprise server for indexing and searching. You can also forward data to another forwarder, as an intermediate step before sending the data onwards to an indexer. The universal forwarder's sole purpose is to forward data. Unlike a full Splunk Enterprise instance, you cannot use the universal forwarder to index or search data. To achieve higher performance and a lighter footprint, it has several limitations:
The universal forwarder has no searching, indexing, or alerting capability. The universal forwarder does not parse data.
Heavy and light forwarders
While the universal forwarder is generally the preferred way to forward data, you might have reason (legacy-based or otherwise) to use heavy forwarders as well. Unlike the universal forwarder, which is an entirely separate, streamlined executable, both heavy and light forwarders are actually full Splunk Enterprise instances with certain features disabled. A heavy forwarder (sometimes referred to as a "regular forwarder") has a smaller footprint than a Splunk Enterprise indexer but retains most of the capability, except that it lacks the ability to perform distributed searches. Much of its default functionality, such as Splunk Web, can be disabled, if necessary, to reduce the size of its footprint. A heavy forwarder parses data before forwarding it and can route data based on criteria such as source or type of event.
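As an illustration of the heavy forwarder's routing capability, events can be routed by content using props.conf and transforms.conf. This is a sketch only; the sourcetype, stanza names, regex, host name, and port are hypothetical:

```ini
# props.conf -- apply a routing transform to a hypothetical sourcetype
[syslog]
TRANSFORMS-routing = route_errors

# transforms.conf -- send events matching ERROR to a separate group
[route_errors]
REGEX = ERROR
DEST_KEY = _TCP_ROUTING
FORMAT = error_indexers

# outputs.conf -- define the target group (host and port illustrative)
[tcpout:error_indexers]
server = indexer1.example.com:9997
```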
This table summarizes the similarities and differences between the universal forwarder and the heavy forwarder:

Feature or capability                         | Universal forwarder                                  | Heavy forwarder
Type of Splunk Enterprise instance            | Dedicated executable                                 | Full Splunk Enterprise, with some features disabled
Footprint (memory, CPU load)                  | Smallest                                             | Medium-to-large (depending on enabled features)
Bundles Python?                               | No                                                   | Yes
Handles data inputs?                          | All types (scripted inputs might require Python)     | All types
Forwards to Splunk Enterprise?                | Yes                                                  | Yes
Forwards to third-party systems?              | Yes                                                  | Yes
Serves as intermediate forwarder?             | Yes                                                  | Yes
Indexer acknowledgment (guaranteed delivery)? | Optional                                             | Optional (version 4.2+)
Load balancing?                               | Yes                                                  | Yes
Data cloning?                                 | Yes                                                  | Yes
Per-event filtering?                          | No                                                   | Yes
Event routing?                                | No                                                   | Yes
Event parsing?                                | No                                                   | Yes
Local indexing?                               | No                                                   | Optional (set the indexAndForward attribute in outputs.conf)
Searching/alerting?                           | No                                                   | Optional
Splunk Web?                                   | No                                                   | Optional
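Local indexing on a heavy forwarder is controlled in outputs.conf. A sketch of the relevant stanza, assuming the forwarder should keep a local copy of the data in addition to forwarding it:

```ini
# outputs.conf on a heavy forwarder -- index locally while forwarding
[indexAndForward]
index = true
```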
Section 3-Common ports for the setup
Splunk configures two ports at installation time:
The HTTP/HTTPS port. This port provides the socket for Splunk Web. It defaults to 8000.
The management port. This port is used to communicate with the splunkd daemon. Splunk Web talks to splunkd on this port, as does the command line interface and any distributed connections from other servers. This port defaults to 8089.
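Both ports can be changed in web.conf; the values below are the shipped defaults described above:

```ini
# web.conf
[settings]
httpport = 8000                  # Splunk Web (HTTP/HTTPS) port
mgmtHostPort = 127.0.0.1:8089    # splunkd management port
```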
Let's log in to our lab environment
Please go to: http://www.uxcreate.com/guacamole
User name: admin
Password: admin
Your instructor will give you your machine number. Please remember your machine number throughout the training session.
Then please go to Start > All Programs > Splunk Enterprise > Splunk Enterprise
The Splunk web interface should come up. The login details:
username: admin
password: admin
Section 4-License Master/Slave relationship
Splunk Enterprise takes in data from sources you designate and processes it so that you can analyze it. We call this process indexing. Splunk Enterprise licenses specify how much data you can index per calendar day (from midnight to midnight by the clock on the license master). Any host in your Splunk Enterprise infrastructure that performs indexing must be licensed to do so. You can either run a standalone indexer with a license installed locally, or you can configure one of your Splunk Enterprise instances as a license master and set up a license pool from which other indexers, configured as license slaves, can draw. When a license master instance is configured, and license slaves are added to it, the license slaves communicate their usage to the license master every minute. If the license master is unreachable for any reason, the license slave starts a 72 hour timer. If the license slave cannot reach the license master for 72 hours, search is blocked on the license slave (although indexing continues). Users cannot search data in the indexes on the license slave until that slave can reach the license master again.
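The 72-hour grace period can be modeled as a simple state check. This is an illustrative sketch, not Splunk code; the function and its names are hypothetical:

```python
from datetime import datetime, timedelta

GRACE_PERIOD = timedelta(hours=72)

def slave_status(last_master_contact: datetime, now: datetime) -> dict:
    """Model a license slave that has lost contact with its master.

    Indexing always continues; search is blocked only once the slave
    has been unable to reach the license master for 72 hours.
    """
    unreachable_for = now - last_master_contact
    return {
        "indexing": True,  # indexing is never blocked by this timer
        "search_allowed": unreachable_for < GRACE_PERIOD,
    }

# One hour out of contact: search still works.
ok = slave_status(datetime(2024, 1, 1, 0, 0), datetime(2024, 1, 1, 1, 0))

# Four days out of contact: search is blocked, indexing continues.
blocked = slave_status(datetime(2024, 1, 1, 0, 0), datetime(2024, 1, 5, 0, 0))
```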
Section 5-Understanding of Deployment Server and Indexer
The indexer is the Splunk Enterprise component that creates and manages indexes. The primary functions of an indexer are:
Indexing incoming data. Searching the indexed data.
In single-machine deployments consisting of just one Splunk Enterprise instance, the indexer also handles the data input and search management functions. For larger-scale needs, indexing is split out from the data input function and sometimes from the search management function as well. In these larger, distributed deployments, the indexer might reside on its own machine and handle only indexing, along with searching of its indexed data. In those cases, other Splunk Enterprise components take over the non-indexing roles. For instance, you might have a set of Windows and Linux machines generating events, which need to go to a central indexer for consolidation. Usually the best way to do this is to install a lightweight instance of Splunk Enterprise, known as a forwarder, on each of the event-generating machines. These forwarders handle data input and send the data across the network to the indexer residing on its own machine. Similarly, in cases where you have a large amount of indexed data and numerous concurrent users searching on it, it can make sense to split off the search management function from indexing. In this type of scenario, known as distributed search, one or more search heads distribute search requests across multiple indexers. The indexers still perform the actual searching of their own indexes, but the search heads manage the overall search process across all the indexers and present the consolidated search results to the user.
Here's an example of a scaled-out deployment:
A deployment server uses server classes to determine what content to deploy to groups of deployment clients. The forwarder management interface offers an easy way to create, edit, and manage server classes.
Module 2 - Introduction to Splunk's User Interface
Understand the uses of Splunk
Define Splunk Apps
Learn basic navigation in Splunk
Hands on Lab covering: Basic Navigation
End of Module Hands-on Quiz
Section 1-Understand the uses of Splunk

Splunk Enterprise makes it simple to collect, analyze, and act upon the untapped value of the big data generated by your technology infrastructure, security systems, and business applications, giving you the insights to drive operational performance and business results. By monitoring and analyzing everything from customer clickstreams and transactions to security events and network activity, Splunk Enterprise helps you gain valuable Operational Intelligence from your machine-generated data. And with a full range of powerful search, visualization, and pre-packaged content for use cases, any user can quickly discover and share insights. Just point your raw data at Splunk Enterprise and start analyzing your world.
Collects and indexes log and machine data from any source
Powerful search, analysis, and visualization capabilities empower users of all types
Apps provide solutions for security, IT ops, business analysis, and more
Enables visibility across on-premises, cloud, and hybrid environments
Delivers the scale, security, and availability to suit any organization
Available as software or as a SaaS (Software as a Service) solution
Section 2-Define Splunk Apps
A Splunk App is a prebuilt collection of dashboards, panels and UI elements powered by saved searches and packaged for a specific technology or use case to make Splunk immediately useful and relevant to different roles. As an alternative to using Splunk for searching and exploring, you can use Splunk Apps to gain the specific insights you need from your machine data. You can also apply user/role based permissions and access controls to Splunk Apps, thus providing a level of control when you are deploying and sharing Apps across your organization. Apps can be opened from the Splunk Enterprise Home Page, from the App menu, or from the Apps section of Settings.
Section 3-Learn basic navigation in Splunk

About Splunk Home
Splunk Home is your interactive portal to the data and apps accessible from this Splunk instance. The main parts of Home include the Splunk Enterprise navigation bar, the Apps menu, the Explore Splunk Enterprise panel, and a custom default dashboard (not shown here).
Apps
The Apps panel lists the apps that are installed on your Splunk instance that you have permission to view. Select the app from the list to open it. For an out-of-the-box Splunk Enterprise installation, you see one app in the workspace: Search & Reporting. When you have more than one app, you can drag and drop the apps within the workspace to rearrange them.
You can do two actions on this panel: Click the gear icon to view and manage the apps that are installed in your Splunk instance.
Click the plus icon to browse for more apps to install.
Explore Splunk Enterprise
The options in the Explore Splunk Enterprise panel help you to get started using Splunk Enterprise. Click on the icons to open the Add Data view, browse for new apps, open the Splunk Enterprise Documentation, or open Splunk Answers.

About the Splunk bar
Use the Splunk bar to navigate your Splunk instance. It appears on every page in Splunk Enterprise. You can use it to switch between apps, manage and edit your Splunk configuration, view system-level messages, and monitor the progress of search jobs.
The following screenshot shows the Splunk bar in Splunk Home.
The Splunk bar in another view, such as the Search & Reporting app's Search view, also includes an App menu next to the Splunk logo.
Return to Splunk Home
Click the Splunk logo on the navigation bar to return to Splunk Home from any other view in Splunk Web.

Settings menu
The Settings menu lists the configuration pages for Knowledge objects, Distributed environment settings, System and licensing, Data, and Authentication settings. If you do not see some of these options, you do not have the permissions to view or edit them.
User menu
The User menu here is called "Administrator" because that is the default user name for a new installation. You can change this display name by selecting Edit account and changing the Full name. You can also edit the time zone settings, select a default app for this account, and change the account's password. The User menu is also where you log out of this Splunk installation.
Messages menu
All system-level error messages are listed here. When there is a new message to review, a notification displays as a count next to the Messages menu. Click the X to remove the message.
Activity menu
The Activity menu lists shortcuts to the Jobs, Triggered alerts, and System Activity views.
Click Jobs to open the search jobs manager window, where you can view and manage currently running searches.
Click Triggered Alerts to view scheduled alerts that are triggered. This course does not discuss saving and scheduling alerts. See "About alerts" in the Alerting Manual.
Click System Activity to see dashboards about user activity and status of the system.

Help
Click Help to see links to Video Tutorials, Splunk Answers, the Splunk Support Portal, and online Documentation.
Find
Use Find to search for objects within your Splunk Enterprise instance. Find performs case-insensitive matches on the ID, labels, and descriptions in saved objects. For example, if you type in "error", it returns the saved objects that contain the term "error".
These saved objects include Reports, Dashboards, Alerts, and Data models. The results appear in the list separated by the categories where they exist.
You can also run a search for error in the Search & Reporting app by clicking Open error in search.
Hands on Lab covering: Basic Navigation
Take your time exploring the Splunk Web interface.
End of Module Hands-on Quiz
Please refer to your virtual machine for the test.
Module 3 - Searching
Run basic searches
Set the time range of a search
Hands on Lab covering: Run basic searches, Set the time range of a search
Identify the contents of search results
Refine searches
Hands on Lab covering: Identify the contents of search results, Refine searches
Use the timeline
Work with events
Hands on Lab covering: Use the timeline, Work with events
Control a search job
Save search results
Hands on Lab covering: Control a search job, Save search results
End of Module Hands-on Quiz
Run basic searches

Types of searches
Before delving into the language and syntax of search, you should ask what you are trying to accomplish. Generally, after getting data into Splunk, you want to:
Investigate to learn more about the data you just indexed or to find the root cause of an issue.
Summarize your search results into a report, whether in tabular or another visualization format.
Because of this, you might hear us refer to two types of searches: raw event searches and transforming (report-generating) searches.

Raw event searches
Raw event searches are searches that just retrieve events from an index or indexes and are typically done when you want to analyze a problem. Some examples of these searches include: checking error codes, correlating events, investigating security issues, and analyzing failures. These searches do not usually include search commands (except search, itself), and the results are typically a list of raw events.

Transforming searches
Transforming searches are searches that perform some type of statistical calculation against a set of results. These are searches where you first retrieve events from an index and then pass them into one or more search commands. These searches always require fields and at least one of a set of statistical commands. Some examples include: getting a daily count of error events, counting the number of times a specific user has logged in, or calculating the 95th percentile of field values.
Information density
Whether you're retrieving raw events or building a report, you should also consider whether you are running a search for sparse or dense information:

Sparse searches look for a single event or an event that occurs infrequently within a large set of data. You've probably heard these referred to as "needle in a haystack" or "rare term" searches. Examples include searching for a specific and unique IP address or error code.
Dense searches scan through and report on many events. Examples include counting the number of errors that occurred or finding all events from a specific host.
Search and knowledge
As you search, you may begin to recognize patterns and identify more information that could be useful as searchable fields. You can configure Splunk to recognize these new fields as you index new data or you can create new fields as you search. Whatever you learn, you can use, add, and edit this knowledge about fields, events, and transactions to your event data. This capturing of knowledge helps you to construct more efficient searches and build more detailed reports.
The anatomy of a search
To better understand how search commands act on your data, it helps to visualize all your indexed data as a table. Each search command redefines the shape of your table. For example, let's take a look at the following search.

sourcetype=syslog ERROR | top user | fields - percent
The Disk represents all of your indexed data: a table of a certain size, with columns representing fields and rows representing events. The first intermediate results table shows fewer rows, representing the subset of events retrieved from the index that matched the search terms "sourcetype=syslog ERROR". The second intermediate results table shows fewer columns, representing the results of the top command, "top user", which summarizes the events into a list of the top 10 users and displays the user, count, and percentage. Then, "fields - percent" removes the column that shows the percentage, so you are left with a smaller final results table.
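The table-by-table narrowing can be sketched in ordinary code. The events below are made-up sample data; the three steps mirror the retrieval stage, the top command, and the fields command:

```python
from collections import Counter

# A toy event set standing in for indexed data (hypothetical values).
events = [
    {"sourcetype": "syslog", "_raw": "ERROR disk full", "user": "alice"},
    {"sourcetype": "syslog", "_raw": "ERROR disk full", "user": "alice"},
    {"sourcetype": "syslog", "_raw": "ERROR timeout", "user": "bob"},
    {"sourcetype": "access", "_raw": "GET / 200", "user": "carol"},
]

# First table: retrieval -- keep only events matching
# sourcetype=syslog ERROR.
retrieved = [e for e in events
             if e["sourcetype"] == "syslog" and "ERROR" in e["_raw"]]

# Second table: "top user" -- count, rank, and add a percent column.
counts = Counter(e["user"] for e in retrieved)
top_user = [{"user": u, "count": c, "percent": 100 * c / len(retrieved)}
            for u, c in counts.most_common(10)]

# Final table: "fields - percent" drops the percent column.
final = [{k: v for k, v in row.items() if k != "percent"}
         for row in top_user]
```

The point is the shape of the data: each pipeline stage takes the previous table and emits a smaller one.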
Quotes and escaping characters
Generally, you need quotes around phrases and field values that include white spaces, commas, pipes, quotes, and/or brackets. Quotes must be balanced: an opening quote must be followed by an unescaped closing quote. For example:
A search such as error | stats count will find the number of events containing the string error. A search such as ... | search "error | stats count" would return the raw events containing error, a pipe, stats, and count, in that order.
Additionally, you want to use quotes around keywords and phrases if you don't want to search for their default meaning, such as Boolean operators and field/value pairs. For example:
A search for the keyword AND without meaning the Boolean operator: error "AND" A search for this field/value phrase: error "startswith=foo"
The backslash character (\) is used to escape quotes, pipes, and itself. Backslash escape sequences are still expanded inside quotes. For example:
The sequence \| as part of a search will send a pipe character to the command, instead of having the pipe split between commands. The sequence \" will send a literal quote to the command, for example for searching for a literal quotation mark or inserting a literal quotation mark into a field using rex. The \\ sequence will be available as a literal backslash in the command.
If Splunk does not recognize a backslash sequence, it will not alter it.
For example, \s in a search string will be available as \s to the command, because \s is not a known escape sequence. However, \\s in the search string will be available as \s to the command, because \\ is a known escape sequence that is converted to \.
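These rules can be sketched as a lookup over two-character sequences. This is an illustration of the behavior described above, not Splunk's actual parser:

```python
# Known escape sequences collapse; unknown ones pass through untouched.
KNOWN_ESCAPES = {'\\|': '|', '\\"': '"', '\\\\': '\\'}

def unescape(search_fragment: str) -> str:
    """Expand known backslash sequences; leave unknown ones alone."""
    out = []
    i = 0
    while i < len(search_fragment):
        pair = search_fragment[i:i + 2]
        if pair in KNOWN_ESCAPES:
            out.append(KNOWN_ESCAPES[pair])
            i += 2
        else:
            out.append(search_fragment[i])
            i += 1
    return "".join(out)

examples = {
    r'\|': '|',     # known escape: a literal pipe reaches the command
    r'\\s': r'\s',  # \\ collapses to \, then s follows
    r'\s': r'\s',   # unknown sequence is left as-is
}
```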
You cannot search for an asterisk (*) by escaping it with a backslash. Splunk treats the asterisk character as a major breaker. Because of this, it will never be in the index. If you want to search for the asterisk character, you will need to run a post-filtering regex search on your data: index=_internal | regex ".*\*.*"
Examples Example 1: myfield is created with the value of 6. ... | eval myfield="6"
Example 2: myfield is created with the value of ". ... | eval myfield="\""
Example 3: myfield is created with the value of \. ... | eval myfield="\\"
Example 4: This would produce an error because of unbalanced quotes. ... | eval myfield="\"
Set the time range of a search
Time is crucial for determining what went wrong. You often know when something happened, if not exactly what happened. Looking at events that happened around the same time can help correlate results and find the root cause. Searches run with an overly broad time range waste system resources and produce more results than you can handle.

Select time ranges to apply to your search

Use the time range picker to set time boundaries on your searches. You can restrict a search with preset time ranges, create custom time ranges, specify time ranges based on date or date and time, or work with advanced features in the time range picker. These options are described in the following sections. Note: If you are located in a different timezone, time-based searches use the timestamp of the event from the instance that indexed the data.
Select from a list of Preset time ranges
Define custom Relative time ranges Use custom Relative time range options to specify a time range for your search that is relative to Now. You can select from the list of time range units, "Seconds ago", "Minutes ago", and so on.
The labels for Earliest and Latest update to match your selection.
The preview boxes below the fields update to the time range as you set it.
Define custom Real-time time ranges The custom Real-time option enables you to specify the start time for your real-time time range window.
Define custom Date ranges Use the custom Date Range option to specify calendar dates in your search. You can choose among options to return events: Between a beginning and end date, Before a date, and Since a date.
For these fields, you can type the date into the text box or select the date from a calendar:
Define custom Date & Time ranges Use the custom Date & Time Range option to specify calendar dates and times for the beginning and ending of your search.
You can type the date into the text box or select the date from a calendar.
Use Advanced time range options Use the Advanced option to specify the earliest and latest search times. You can write the times in Unix (epoch) time or relative time notation. The epoch time value you enter is converted to local time. This timestamp is displayed under the text field so that you can verify your entry.
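The epoch-to-local-time conversion mentioned above is ordinary Unix time handling. A quick Python sketch, illustrative only and independent of Splunk, shows what an epoch value you might type into the Advanced fields represents:

```python
from datetime import datetime, timezone

# An epoch (Unix) time value, like one you might enter in the Advanced fields
epoch = 1609459200  # 2021-01-01 00:00:00 UTC

# The unambiguous UTC form of the timestamp
utc_time = datetime.fromtimestamp(epoch, tz=timezone.utc)

# The same instant rendered in this machine's local timezone,
# which is how Splunk displays it under the text field
local_time = datetime.fromtimestamp(epoch)

print(utc_time.isoformat())   # 2021-01-01T00:00:00+00:00
print(local_time.isoformat()) # varies with your machine's timezone
```

The displayed local timestamp under the text field is exactly this kind of conversion, so you can verify your entry before running the search.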
Hands on Lab Part 1 - Basic Concepts
There are a few concepts in the Splunk world that will be helpful for you to understand. I’ll cover them in a few sentences, so try to pay attention. If you want more details, see the “Concepts” section near the end of this document.
Processing at the time the data is processed: Splunk reads data from a source, such as a file or port, on a host (e.g. "my machine"), classifies that source into a sourcetype (e.g., "syslog", "access_combined", "apache_error", ...), then extracts timestamps, breaks up the source into individual events (e.g., log events, alerts, …), which can be a single-line or multiple lines, and writes each event into an index on disk, for later retrieval with a search.
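The index-time stages above can be made concrete with a rough Python sketch. The log format, host name, and sourcetype here are hypothetical, chosen only to illustrate the pipeline: break a source into events, extract a timestamp from each, and attach metadata.

```python
from datetime import datetime

RAW = """2023-04-01 12:00:01 ERROR payment failed user=david
2023-04-01 12:00:05 INFO login ok user=alice"""

def index_source(raw, host='my-machine', sourcetype='app_log'):
    """Toy model of index-time processing: split the source into
    events (one per line here), extract each event's timestamp,
    and record host/sourcetype metadata alongside the raw text."""
    events = []
    for line in raw.splitlines():
        # The first 19 characters hold the timestamp in this toy format
        ts = datetime.strptime(line[:19], '%Y-%m-%d %H:%M:%S')
        events.append({'_time': ts, 'host': host,
                       'sourcetype': sourcetype, '_raw': line})
    return events

idx = index_source(RAW)
print(len(idx))              # 2 events
print(idx[0]['_time'].hour)  # 12
```

Real indexing also handles multi-line events and many timestamp formats; the point is only that each stored event carries a timestamp, host, source metadata, and its raw text for later retrieval by a search.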
Processing at the time the data is searched: When a search starts, matching indexed events are retrieved from disk, fields (e.g., code=404, user=david, ...) are extracted from the event's text, and the event is classified by matching it against eventtype definitions (e.g., 'error', 'login', ...). The events returned from a search can then be powerfully transformed using Splunk's search language to generate reports that live on dashboards.
Part 2 - Adding Data
Splunk can eat data from just about any source, including files, directories, ports, and scripts, keeping track of changes to them as they happen. We're going to start simple and just tell Splunk to index a particular file and not monitor it for updates:
1. Go to the Splunk Web interface (e.g., http://localhost:8000) and log in, if you haven't already.
2. Click Settings in the upper right-hand corner of Splunk Web.
3. Under Settings, click Add Data.
4. Click Upload Data to upload a file.
5. Click Select File.
6. Browse to "websample.log" on your Desktop, which we previously saved.
7. Accept all the default values and just click Submit.
8. Click Start Searching.
Assuming all goes well, websample.log is now indexed, and all the events are timestamped and searchable.
Part 3 - Basic Searching
Splunk comes with several Apps, but the only relevant one now is the 'Search' app, which is the interface for generic searching. (More apps can be downloaded, and advanced users can build their own.) After logging into Splunk, select the Search app and let's get started with searching. We'll start out simple and work our way up. To begin your Splunk search, type in terms you might expect to find in your data. For example, if you want to find events that might be HTTP 404 errors (i.e., webpage not found), type in the keywords:
http 404
You'll get back all the events that have both HTTP and 404 in their text.
Notice that search terms are implicitly AND'd together. The search was the same as "http AND 404". Let's make the search narrower:

http 404 "like gecko"
Using quotes tells Splunk to search for the literal phrase "like gecko", which returns more specific results than searching for "like" and "gecko" separately, because the words must be adjacent as a phrase. Splunk supports the Boolean operators AND, OR, and NOT (they must be capitalized), as well as parentheses to enforce grouping. To get all HTTP error events (i.e., events without a 200 status code), excluding 403 and 404, use this:
http NOT (200 OR 403 OR 404)
Again, the AND operator is implied; the previous search is the same as
http AND NOT (200 OR 403 OR 404)
Splunk supports the asterisk (*) wildcard for searching. For example, to retrieve events that have the 40x and 50x classes of HTTP status codes, you could try:
http (40* OR 50*)
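The implicit AND between keywords and the * wildcard can be modeled roughly in Python. This is a toy matcher over raw event text for illustration only; Splunk's real matching is index-based, not a linear scan:

```python
from fnmatch import fnmatch

def matches(event, *terms):
    """True if every term matches some whitespace-separated token
    in the event text. Terms may contain * wildcards; matching is
    case-insensitive, mimicking the implicit AND between keywords."""
    tokens = event.lower().split()
    return all(any(fnmatch(tok, term.lower()) for tok in tokens)
               for term in terms)

event = 'GET /cart HTTP 404 like gecko'
print(matches(event, 'http', '404'))  # both keywords present
print(matches(event, 'http', '50*'))  # no 50x token in this event
print(matches(event, 'http', '40*'))  # 404 matches the 40* wildcard
```

Note how adding a term can only narrow the result set: every term must match, which is exactly the implicit-AND behavior described above.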
When you index data, Splunk automatically adds fields (i.e., attributes) to each of your events. It does this based on some text patterns commonly found in IT data, and intermediate users can add their own extraction rules for pulling out additional fields. To narrow results with a search, just add attribute=value to your search:
sourcetype=access_combined status=404
This search shows a much more precise version of our first search (i.e., "http 404") because it will only return events that come from access_combined sources (i.e., webserver events) and that have a status code of 404, which is different from merely having a 404 somewhere in the text. The "404" has to be found where a status code is expected in the event, not just anywhere. In addition to =, you can also use != (not equals), and <, >, >=, and <= for numeric fields.
Part 4 - Now it's your turn, on your own:
1. Upload the file LoanStats3a.csv located on your desktop.
2. Search for entries that contain the word "divorced".
3. Search for entries that are divorced and renting.
Part 5 - Search App

Now click on Search on the Main toolbar.
You will get the following screen:
Click on the Data Summary button, you will get:
Click on the Sources tab, you will get:
Now you can choose websample.log, you will get:
Part 6 - Let's upload another sample file:
1. Please upload sampledata.zip, which is located on the Desktop.
2. Notice there is no preview.
3. Please take the defaults and start searching.
4. On the Sourcetypes panel, click access_combined_wcookie.
You are a member of the Customer Support team for the online Flower & Gift shop. This is your first day on the job. You want to learn some more about the shop. Some questions you want answered are:
• What does the store sell?
• How much does each item cost?
• How many people visited the site?
• How many bought something today?
• What is the most popular item that is purchased each day?
It's your first day of work with the Customer Support team for the online Flower & Gift shop. You're just starting to dig into the Web access logs for the shop when you receive a call from a customer who complains about trouble buying a gift for his girlfriend--he keeps hitting a server error when he tries to complete a purchase. He gives you his IP address, 10.2.1.44.
1. Type the customer's IP address into the search bar: sourcetype="access_combined_wcookie" 10.2.1.44
As you type into the search bar, Splunk's search assistant opens.
Search assistant shows you typeahead, or contextual matches and completions for each keyword as you type it into the search bar. These contextual matches are based on what's in your data. The entries under matching terms update as you continue to type because the possible completions for your term change as well.
Part 7 - Time Ranges

Try different time ranges, such as Previous week, using the time range picker in the search toolbar.
Identify the contents of search results and refine searches

Splunk supports the Boolean operators AND, OR, and NOT. When you include Boolean expressions in your search, the operators have to be capitalized. You can also mouse over results to refine searches.
Hands on Lab
1. Choose the Data Source LoanStats3a.csv. Remember: click Search on the Toolbar and then click the Data Summary button.
2. Search for the word: Status
3. Click on the word Paid and add it to the search.
4. Click on the word RENT and exclude it from the search.

BONUS LAB:
1. Without the use of fields, find the status of Not Paid and Not Mortgage.
Use the timeline
The timeline is a visual representation of the number of events returned by a search over a selected time range. The timeline is a type of histogram, where the range is broken up into smaller time intervals (such as seconds, minutes, hours, or days), and the count of events for each interval appears in column form. When you use the timeline to display the results of real-time searches, the timeline represents the sliding time range window covered by the real-time search.
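The timeline's bucketing is simply a time histogram. A small Python sketch of the idea, with made-up timestamps and illustrative only, groups events into one-hour columns the way "1 hour per column" does:

```python
from collections import Counter
from datetime import datetime

# Hypothetical event timestamps returned by a search
times = [datetime(2023, 4, 1, 10, 5), datetime(2023, 4, 1, 10, 40),
         datetime(2023, 4, 1, 11, 15), datetime(2023, 4, 1, 10, 59)]

# Bucket events into one-hour columns by truncating each timestamp
columns = Counter(t.replace(minute=0, second=0, microsecond=0) for t in times)

# A crude text rendering of the timeline's column heights
for hour, count in sorted(columns.items()):
    print(hour.strftime('%H:00'), '#' * count)
# 10:00 ###
# 11:00 #
```

Zooming in corresponds to re-bucketing with a smaller interval (minutes or seconds) over a narrower range, which is why drilling down on a bar changes the per-column scale.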
Mouse over a bar to see the count of events. Click on a bar to drill down to that time range. Drilling down in this way does not run a new search; it just filters the results from the previous search. You can use the timeline to highlight patterns or clusters of events, or to investigate peaks (spikes in activity) and lows (possible server downtime) in event activity.

Change the timeline format

The timeline is located in the Events tab above the events listing. It shows the count of events over the time range that the search was run. Here, the timeline shows web access events over the Previous business week.
Format options are located in the Format Timeline menu:
You can hide the timeline (Hidden) and display a Compact or Full view of it. You can also toggle the timeline scale between linear (Linear Scale) and logarithmic (Log Scale).
When Full is selected, the timeline is taller and displays the count on the y-axis and time on the x-axis. Zoom in and zoom out to investigate events Zoom and selection options are located above the timeline. At first, only the Zoom Out option is available.
The timeline legend is on the top right corner of the timeline. This indicates the scale of the timeline. For example, 1 minute per column indicates that each column represents a count of events during that minute. Zooming in and out changes the time scale. For example, if you click Zoom Out the legend will indicate that each column now represents an hour instead of a minute. When you mouse over and select bars in the timeline, the Zoom to Selection or Deselect options become available.
Mouse over and click on the tallest bar or drag your mouse over a cluster of bars in the timeline. The events list updates to display only the events that occurred in that selected time range. The time range picker also updates to the selected time range. You can cancel this selection by clicking Deselect. When you Zoom to Selection, you filter the results of your previous search for your selected time period. The timeline and events list update to show the results of the new search.
You cannot Deselect after you have zoomed in to a selected time range, but you can Zoom Out again.
Work with events An event is a single piece of data in Splunk software, similar to a record in a log file or other data input. When data is indexed, it is divided into individual events. Each event is given a timestamp, host, source, and source type. Often, a single event corresponds to a single line in your inputs, but some inputs (for example, XML logs) have multiline events, and some inputs have multiple events on a single line. When you run a successful search, you get back events.
Hands on Lab
Back at the Flower & Gift shop, let's continue with the customer (10.2.1.44) you were assisting. He reported an error while purchasing a gift for his girlfriend. You confirmed his error, and now you want to find the cause of it. Continue with the last search, which showed you the customer's failed purchase attempts. 1. Type purchase into the search bar and run the search: sourcetype="access_combined_wcookie" 10.2.1.44 purchase
When you search for keywords, your search is not case-sensitive, and Splunk retrieves the events that contain those keywords anywhere in the raw text of the event's data.

Use Boolean operators

If you're familiar with Apache server logs, in this case the access_combined format, you'll notice that most of these events have an HTTP status of 200, or Successful. These events are not interesting for you right now, because the customer is reporting a problem.
Splunk supports the Boolean operators: AND, OR, and NOT. When you include Boolean expressions in your search, the operators have to be capitalized. 2. Use the Boolean NOT operator to quickly remove all of these Successful page requests. Type in:
sourcetype="access_combined_wcookie" 10.2.1.44 purchase NOT 200
The AND operator is always implied between search terms. So the search in Step 2 is the same as: sourcetype="access_combined_wcookie" AND 10.2.1.44 AND purchase AND NOT 200
You notice that the customer is getting HTTP server (503) and client (404) errors. But he specifically mentioned a server error, so let's quickly remove events that are irrelevant. Another way to add Boolean clauses quickly and interactively to your search is to use your search results: Splunk lets you highlight and select any segment from your search results.
Timeline Usage

In the last topic, you focused on the search results listed in the events viewer area of this dashboard. Now, let's take a look at the timeline. Continue with the last search, which showed you the customer's failed purchase attempts. 1. Search for: sourcetype="access_combined_wcookie" 10.2.1.44 purchase NOT 200 NOT 404
The location of each bar on the timeline corresponds to an instance when the events that match your search occurred. If there are no bars at a time period, no events were found then. 2. Mouse over one of the bars. A tooltip pops up and displays the number of events that Splunk found during the time span of that bar (1 bar = 1 hour).
The taller the bar, the more events occurred at that time. Seeing spikes in the number of events, or no events at all, is often a good indication that something has happened. 3. Click one of the bars, for example the tallest bar. This updates your search results to show you only the events in that time span. Splunk does not run the search when you click on the bar. Instead, it gives you a preview of the results zoomed in to that time range. You can still select other bars at this point.
One hour is still a wide time period to search, so let's narrow the search down more. 4. Double-click on the same bar. Splunk runs the search again and retrieves only events during that one hour span you selected.
You should see the same search results in the Event viewer, but, notice that the search overrides the time range picker and it now shows "Custom time". (You'll see more of the time range picker later.) Also, each bar now represents one minute of time (1 bar = 1 min). 5. Double-click another bar. Once again, this updates your search to now retrieve events during that one minute span of time. Each bar represents the number of events for one second of time.
Now, you want to expand your search to see everything else, if anything, that happened during this second. 6. Without changing the time range, replace your previous search in the search bar with: *
Splunk supports using the asterisk (*) wildcard to search for "all" or to retrieve events based on parts of a keyword. Up to now, you've just searched for Web access logs. This search tells Splunk that you want to see everything that occurred in this time range:
Control search job progress After you launch a search, you can access and manage information about the search's job without leaving the Search page. Once your search is running, paused, or finalized, click Job and choose from the available options there.
You can:
• Edit the job settings. Select this to open the Job Settings dialog, where you can change the job read permissions, extend the job lifetime, and get a URL for the job that you can use to share the job with others or put a link to the job in your browser's bookmark bar.
• Send the job to the background. Select this if the search job is slow to complete and you would like to run the job in the background while you work on other Splunk activities (including running a new search job).
• Inspect the job. Opens a separate window that displays information and metrics for the search job using the Search Job Inspector. You can select this action while the search is running or after it completes.
• Delete the job. Use this to delete a job that is currently running, is paused, or has finalized. After you have deleted the job, you can still save the search as a report.
Change the search mode The Search mode controls the search experience. You can set it to speed up searches by cutting down on the event data it returns (Fast mode), or you can set it to return as much event information as possible (Verbose mode). In Smart mode (the default setting) it automatically toggles search behavior based on the type of search you're running.
Save the results The Save as menu lists options for saving the results of a search as a Report, Dashboard Panel, Alert, and Event type.
• Report: If you would like to make the search available for later use, you can save it as a report. You can run the report again on an ad hoc basis by finding the report on the Reports listing page and clicking its name.
• Dashboard Panel: Click this if you'd like to generate a dashboard panel based on your search and add it to a new or existing dashboard.
• Alert: Click to define an alert based on your search. Alerts run saved searches in the background (either on a schedule or in real time). When the search returns results that meet a condition you have set in the alert definition, the alert is triggered.
• Event Type: Event types let you classify events that have common characteristics. If the search doesn't include a pipe operator or a subsearch, you can use this to save it as an event type.

Other search actions
Between the job progress controls and the search mode selector are three buttons that enable you to Share, Export, and Print the results of a search.
Click Share to share the job. When you select this, the job's lifetime is extended to 7 days and read permissions are set to Everyone. Click Export to export the results. You can select to output to CSV, raw events, XML, or JSON and specify the number of results to export. Click Print to send the results to a printer that has been configured.
Hands on Lab
1. Using your file LoanStats3a.csv, save your last search as an event type.
2. Go to Settings, and click on Event types to view your saved event type.
End of Module Hands-on Quiz

Please refer to your virtual machine for the test.
Module 4 - Using Fields in Searches
Understand fields Use fields in searches Use the fields sidebar Hands on Lab covering: Understand Fields, Use fields in searches, Use the fields sidebar End of Module Hands-on Quiz
Understand fields
Fields exist in machine data in many forms. Often, a field is a value (with a fixed, delimited position on the line) or a name and value pair, with a single value for each field name. A field can be multivalued; that is, it can appear more than once in an event and have a different value for each appearance. Some examples of fields are clientip for IP addresses accessing your Web server, _time for the timestamp of an event, and host for the domain name of a server. One of the more common examples of multivalue fields is email address fields. While the From field will contain only a single email address, the To and Cc fields have one or more email addresses associated with them. In Splunk Enterprise, fields are searchable name and value pairings that distinguish one event from another, because not all events will have the same fields and field values. Fields let you write more tailored searches to retrieve the specific events that you want.
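The idea of name/value pairs, including multivalue fields like the To/Cc example above, can be sketched in Python. The event format and field names here are hypothetical, purely to illustrate how a name seen more than once becomes a multivalue field:

```python
import re

# A made-up raw event containing name/value pairs; 'To' appears twice
RAW = 'From: a@shop.com To: b@x.com To: c@y.com code=404 user=david'

def extract_fields(raw):
    """Collect name/value pairs from the raw text. A name that
    occurs more than once yields a multivalue field (a list with
    several entries), like the To/Cc example in the text."""
    fields = {}
    for name, value in re.findall(r'(\w+)[=:]\s*(\S+)', raw):
        fields.setdefault(name, []).append(value)
    return fields

f = extract_fields(RAW)
print(f['code'])  # single value:   ['404']
print(f['To'])    # multivalue:     ['b@x.com', 'c@y.com']
```

Splunk's own field extraction is far richer (delimited positions, configurable extractions, search-time rules), but the resulting searchable name/value structure is the same shape.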
Use fields in searches
Use the following syntax to search for a field: fieldname="fieldvalue". Field names are case sensitive, but field values are not. 1. Go to the Search dashboard and type the following into the search bar: sourcetype="access_*"

This indicates that you want to retrieve only events from your web access logs and nothing else. sourcetype is a field name and access_* is a wildcarded field value used to match any Apache web access event. Apache web access logs are formatted as access_common, access_combined, or access_combined_wcookie. 2. In the Events tab, scroll through the list of events. If you are familiar with the access_combined format of Apache logs, you recognize some of the information in each event, such as:
• IP addresses for the users accessing the website.
• URIs and URLs for the pages requested and referring pages.
• HTTP status codes for each page request.
• GET or POST page request methods.
Use the fields sidebar
To the left of the events list is the Fields sidebar. As Splunk Enterprise retrieves the events that match your search, the Fields sidebar updates with Selected fields and Interesting fields. These are the fields that Splunk Enterprise extracted from your data. Selected Fields are the fields that appear in your search results. The default fields host, source, and sourcetype are selected.
You can hide and show the fields sidebar by clicking Hide Fields and Show Fields, respectively. 3. Click All Fields. The Select Fields dialog box opens, where you can edit the fields to show in the events list. You see the default fields that Splunk defined. Some of these fields are based on each event's timestamp (everything beginning with date_*), punctuation (punct), and location (index).
Other field names apply to the web access logs. For example, there are clientip, method, and status. These are not default fields; they are extracted at search time. Clicking a field name, such as action, opens the field summary for that field. In this set of search results, Splunk Enterprise found five values for action, and the action field appears in 49.9% of your search results.
Hands on Lab
1. Go back to the Search dashboard and search for web access activity. Select Other > Yesterday from the time range picker: sourcetype="access_*"
You were actually using fields all along! Each time you searched for sourcetype=access_*, you told Splunk to only retrieve events from your web access logs and nothing else. To search for a particular field, specify the field name and value: fieldname="fieldvalue". sourcetype is a field name and access_combined_wcookie is a field value. Here, the wildcarded value is used to match all field values beginning with access_ (which would include access_common, access_combined, and access_combined_wcookie).
Note: Field names are case sensitive, but field values are not! 2. Scroll through the search results. If you're familiar with the access_combined format of Apache logs, you will recognize some of the information in each event, such as:
• IP addresses for the users accessing the website.
• URIs and URLs for the page request and referring page.
• HTTP status codes for each page request.
• Page request methods.
As Splunk retrieves these events, the Fields sidebar updates with selected fields and interesting fields. These are the fields that Splunk extracted from your data. Notice that default fields host, source, and sourcetype are selected fields and are displayed in your search results:
3. Scroll through interesting fields to see what else Splunk extracted. You should recognize the field names that apply to the Web access logs. For example, there's clientip, method, and status. These are not default fields; they have (most likely) been extracted at search time. 4. Click the Edit link in the fields sidebar. The Fields dialog opens and displays all the fields that Splunk extracted.
• Available Fields are the fields that Splunk identified from the events in your current search (some of these fields were listed under interesting fields).
• Selected Fields are the fields you picked (from the available fields) to show in your search results (by default, host, source, and sourcetype are selected).
5. Scroll through the list of Available Fields. You're already familiar with the fields that Splunk extracted from the Web access logs based on your search. You should also see other default fields that Splunk defined--some of these fields are based on each event's timestamp (everything beginning with date_*), punctuation (punct), and location (index). But, you should also notice other extracted fields that are related to the online store. For example, there are action, category_id, and product_id. From conversations with your coworker, you may know that these fields are:
Field name     Description
action         what a user does at the online shop.
category_id    the type of product a user is viewing or buying.
product_id     the catalog number of the product the user is viewing or buying.
6. From the Available fields list, select action, category_id, and product_id.
7. Click Save. When you return to the Search view, the fields you selected will be included in your search results if they exist in that particular event. Different events will have different fields.
The fields sidebar doesn't just show you what fields Splunk has captured from your data. It also displays how many values exist for each of these fields. For the fields you just selected, there are 2 for action, 5 for category_id, and 9 for product_id. This doesn't mean that these are all the values that exist for each of the fields--these are just the values that Splunk knows about from the results of your search. What are some of these values? 8. Under selected fields, click action for the action field. This opens the field summary for the action field.
This window tells you that, in this set of search results, Splunk found two values for action: purchase and update. It also tells you that the action field appears in 71% of your search results. This means that nearly three-quarters of the Web access events are related to the purchase of an item or an update (of the item quantity in the cart, perhaps). 9. Close this window and look at the other two fields you selected, category_id (what types of products the shop sells) and product_id (specific catalog names for products). Now you know a little bit more about the information in your data relating to the online Flower and Gift shop. The online shop sells a selection of flowers, gifts, plants, candy, and balloons. Let's use these fields, category_id and product_id, to see what people are buying.

Use fields to run more targeted searches

These next two examples compare the results when searching with and without fields. Example 1: Return to the search you ran to check for errors in your data. Select Other > Yesterday from the time range picker: error OR failed OR severe OR (sourcetype=access_* (404 OR 500 OR 503))
Run this search again, but this time, use fields in your search. The HTTP error codes are values of the status field. Now your search looks like this: error OR failed OR severe OR (sourcetype=access_* (status=404 OR status=500 OR status=503))
Notice the difference in the count of events between the two searches--because it's a more targeted search, the second search returns fewer events. When you run simple searches based on arbitrary keywords, Splunk matches the raw text of your data. When you add fields to your search, Splunk looks for events that have those specific field/value pairs.
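The keyword-versus-field distinction can be shown with a tiny Python sketch over made-up events. Each toy event carries its raw text plus an extracted status field; note how a raw-text match for "404" also catches a product page whose URL merely contains the digits:

```python
# Toy events: raw text plus an extracted 'status' field (hypothetical data)
events = [
    {'_raw': 'GET /item/404-mug HTTP 200', 'status': '200'},
    {'_raw': 'GET /checkout HTTP 404',     'status': '404'},
    {'_raw': 'GET /page HTTP 500',         'status': '500'},
]

# Keyword search: '404' anywhere in the raw text matches
keyword_hits = [e for e in events if '404' in e['_raw']]

# Field search: only events whose status field equals 404 match
field_hits = [e for e in events if e['status'] == '404']

print(len(keyword_hits))  # 2 -- includes the '404-mug' product page
print(len(field_hits))    # 1 -- only the real HTTP 404 error
```

This is why the field-based search above returns fewer, more precise results than the raw keyword version.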
Example 2 Before you learned about the fields in your data, you might have run this search to see how many times flowers were purchased from the online shop: sourcetype=access_* purchase flower*
As you typed in "flower", search assistant shows you both "flower" and "flowers" in the typeahead. Since you don't know which one you want, you use the wildcard to match both.
If you scroll through the (many) search results, you'll see that some of the events have action=update and category_id that have a value other than flowers. These are not events that you wanted! Run this search instead. Select Other > Yesterday from the time range picker: sourcetype=access_* action=purchase category_id=flower*
For the second search, even though you still used the wildcarded word "flower*", there is only one value of category_id that it matches (FLOWERS).
Notice the difference in the number of events that Splunk retrieved for each search; the second search returns significantly fewer events. Searches with fields are more targeted and retrieve more exact matches against your data.
Now on your own:
1. Bring up the Loan data file.
2. Using fields, find entries whose annual salary is less than 20,000 and whose state is CA. Use addr_state for state.
3. Refine the search with the field emp_title equal to Walmart.
End of Module Quiz

Please refer to your virtual machine for the test.
Module 5- Creating Reports and Visualizations
Save a search as a report Edit reports Create reports that include visualizations such as charts and tables Hands on Lab covering: Save a search as a report, Edit Reports, Create reports that include visualizations such as charts and tables. Add reports to a dashboard Create an instant pivot from a search Hands on Lab covering: Add reports to a dashboard, Create an instant pivot from a search. End of Module Hands on Quiz
Save a search as a report
To save your search as a report, click on the Report link. This opens the Save As Report dialog:
From here, you need to do the following:
1. Enter a Title (or name) for your report.
2. Enter an optional Description to remind users what your report does.
3. Indicate if you'd like to include the Splunk Time Range Picker as part of your report.
Once you click Save, Splunk prompts you to either review Additional Settings for your newly created report (Permissions, Schedule, Acceleration, and Embed), Add (the report) to Dashboard, View the report, or Continue Editing the search:
The additional settings that can be made to the report are given as follows:
• Permissions: Allows you to set how the saved report is displayed: by owner, by app, or for all apps. In addition, you can make the report read-only or writeable (can be edited).
• Schedule: Allows you to schedule the report (for Splunk to run/refresh it based upon your schedule). For example, an interval like every week, on Monday at 6 AM, and for a particular time range.
• Acceleration: Not all saved reports qualify for acceleration, and not all users (not even admins) have the ability to accelerate reports. Generally speaking, Splunk Enterprise will build a report acceleration summary for the report if it determines that the report would benefit from summarization (acceleration).
• Embed: Report embedding lets you bring the results of your reports to large numbers of report stakeholders. With report embedding, you can embed scheduled reports in external (non-Splunk) websites, dashboards, and portals. Embedded reports can display results in the form of event views, tables, charts, maps, single values, or any other visualization type. They use the same formatting as the originating report. When you embed a saved report, you do this by copying a Splunk-generated URL into an HTML-based web page.
Edit reports
You can easily edit an existing report. You can edit a report's definition (its search string, pivot setup, or result formatting). You can also edit its description, permissions, schedule, and acceleration settings.
To edit a report's definition
If you want to edit a report's definition, there are two ways to start, depending on whether you're on the Reports listing page or looking at the report itself.
If you're on the Reports listing page, locate the report you want to edit, go to the Actions column, and click Open in Search or Open in Pivot (you'll see one or the other depending on which tool you used to create the report).
If you've entered the report to review its results, click Edit and select Open in Search or Open in Pivot (you'll see one or the other depending on which tool you used to create the report).
Edit the definition of a report opened in Search After you open a report in search, you can change the search string, time range, or report formatting. After you rerun the report, a Save button will be enabled towards the upper right of the report. Click this to save the report. You also have the option of saving your edited search as a new report.
Create reports that include visualizations such as charts and tables
A visualization is a representation of data returned from a search. Most visualizations are graphical representations; however, a visualization can also be non-graphical. In dashboards, a panel contains one or more visualizations. Visualizations available for simple XML dashboards include:
chart
event listing
map
table
single value
A chart visualization has several types:
area
bar
bubble
column
filler gauge
line
marker gauge
pie
radial gauge
scatter
Hands on Lab covering 1. Click Search on the Toolbar, then click the Data Summary button:
2. Choose the Sourcetypes tab, and click on access_combined_wcookie:
3. Under Interesting Fields, select category_id. Then, under Reports, click Top values:
4. It should yield a report:
5. Now click on Statistics, notice the table of values:
6. Go back to the Visualization tab and, under Format, investigate all the different options
7. Under the Bar Chart drop-down, investigate all the different chart types as well
Bonus Lab: Using the LoanStats3a.csv file, create a report from the data that shows the top values across all the states
Add reports to a dashboard
Once you have created your reports, you can easily add them to a dashboard by clicking the Add to Dashboard button
Create an instant pivot from a search
From any search, simply select the Statistics tab and click on the Pivot icon. Let's take a walkthrough:
1. Make sure the interesting fields you want are set as selected fields
2. Click the Statistics tab after you have the search you want:
3. Then click the Pivot Icon
4. Then you can choose the fields you have selected to pivot on, and click OK:
5. Then you can choose a field like annual_inc with a default of Sum to be part of your Pivot column values:
6. And then pick a field like addr_state for the rows
7. Finally pick a bar chart on the left side
Hands on Lab
1. Create a report out of the LoanStats3a.csv source that looks into the annual income < 70000 and the addr_state of CA, FL, NY
2. Create an instant pivot out of the search from #1 above.
End of Module Hands on Quiz
Please refer to your virtual machine for test
Module 6 - Working with Dashboards
Create a dashboard Add a report to a dashboard Hands on Lab covering: Create a dashboard, Add a report to a dashboard Add a pivot report to a dashboard Edit a dashboard Hands on Lab covering: Add a pivot report to a dashboard, Edit a dashboard. End of Module Hands on Quiz
Create a dashboard You can create a dashboard from the search OR you can click on the Dashboard option on the Toolbar
OR
Add a report to a dashboard Click on Add to Dashboard from your report
Hands on Lab: Let's use the flower shop transactions to create a dashboard and add a report to it
Before you learned about the fields in your data, you might have run this search to see how many times flowers were purchased from the online shop:
sourcetype=access_* purchase flower* | top limit=20 category_id
1. Save the report of this search as Flowers Category
2. Click on the View button to view the report
3. Click Add to Dashboard to add the report to a dashboard
4. Name the dashboard Flowers Dashboard
Bonus Lab: Take the report out of the LoanStats3a.csv source that looks into the annual income < 70000 and the addr_state of CA, FL, NY from the last module and create a dashboard
Add a pivot report to a dashboard
From your pivot, you can save it as a dashboard panel
Edit a dashboard From your dashboard, you can edit your dashboard from the menu
And then you could, for example, edit Panels
Hands on Lab:
1. Create an instant pivot, like the one from the previous module, out of the LoanStats3a.csv source that looks into the annual income < 70000 and the addr_state of CA, FL, NY
2. Then add that pivot report to the dashboard
3. Create another report that looks at ALL the annual incomes in the states of CA, FL, NY
4. Add that report to the dashboard created in exercise #1
5. Edit the dashboard panels and add titles to your panels.
Bonus Lab: 1. Create another instant pivot or report and add to the existing dashboard
End of Module Hands on Quiz
Please refer to your virtual machine for test
Module 7 - Search Fundamentals Review basic search commands and general search practices Examine the anatomy of a search Use the following commands to perform searches: Fields Table Rename Rex Multikv
Review basic search commands and general search practices
To successfully use Splunk, it is vital that you write effective searches. Using the index efficiently will make your initial discoveries faster, and the reports you create will run faster for you and for others. In this chapter, we will cover the following topics:
How to write effective searches
How to search using fields
Understanding time
Saving and sharing searches
Using search terms effectively
The key to creating an effective search is to take advantage of the index. The Splunk index is effectively a huge word index, sliced by time. The single most important factor for the performance of your searches is how many events are pulled from the disk. The following few key points should be committed to memory:
Search terms are case insensitive: Searches for error, Error, ERROR, and ErRoR are all the same thing.
Search terms are additive: Given the search mary error, only events that contain both words will be found. There are Boolean and grouping operators to change this behavior; we will discuss these later in this chapter under Boolean and grouping operators.
Only the time frame specified is queried: This may seem obvious, but it's very different from a database, which would always have a single index across all events in a table. Since each index is sliced into new buckets over time, only the buckets that contain events for the time frame in question need to be queried.
Search terms are words, not parts of words: A search for foo will not match foobar.
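To make the first two points concrete, here is a quick sketch (the sourcetype and search terms are illustrative, not from a specific dataset). Because search terms are case insensitive and an implied AND joins them, these two searches return exactly the same events:
sourcetype=access_* error mary
sourcetype=access_* ERROR Mary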
With just these concepts, you can write fairly effective searches. Let's dig a little deeper, though:
A word is anything surrounded by whitespace or punctuation: For instance, given the log line 2012-02-07T01:03:31.104-0600 INFO AuthClass Hello world. [user=Bobby, ip=1.2.3.3], the "words" indexed are 2012, 02, 07T01, 03, 31, 104, 0600, INFO, AuthClass, Hello, world, user, Bobby, ip, 1, 2, 3, and 3. This may seem strange, and possibly a bit wasteful, but this is what Splunk's index is really, really good at—dealing with huge numbers of words across a huge number of events.
Splunk is not grep with an interface: One of the most common questions is whether Splunk uses regular expressions for your searches. Technically, the answer is no. Splunk does use regex internally to extract fields, including the auto-generated fields, but most of what you would do with regular expressions is available in other ways. Using the index as it is designed is the best way to build fast searches. Regular expressions can then be used to further filter results or extract fields.
Numbers are not numbers until after they have been parsed at search time: This means that searching for foo>5 will not use the index, as the value of foo is not known until it has been parsed out of the event at search time. There are different ways to deal with this behavior, depending on the question you're trying to answer.
Field names are case sensitive: When searching for host=myhost, host must be lowercase. Likewise, any extracted or configured fields have case-sensitive field names, but the values are case insensitive:
Host=myhost will not work
host=myhost will work
host=MyHost will work
Fields do not have to be defined before indexing data: An indexed field is a field that is added to the metadata of an event at index time. There are legitimate reasons to define indexed fields, but in the vast majority of cases it is unnecessary and is actually wasteful.
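As a sketch of the point about numbers (the field name foo and the sourcetype are hypothetical), a numeric comparison can be applied as a filter after events are retrieved from the index:
sourcetype=mylogs | where foo > 5
The sourcetype term narrows the events using the index; the where clause then parses foo out of each event and applies the numeric test at search time.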
Examine the anatomy of a search
Boolean and grouping operators
There are a few operators that you can use to refine your searches (note that these operators must be in uppercase to not be considered search terms):
AND is implied between terms. For instance, error mary (two words separated by a space) is the same as error AND mary.
OR allows you to specify multiple values. For instance, error OR mary means find any event that contains either word.
NOT applies to the next term or group. For example, error NOT mary would find events that contain error but do not contain mary.
The quote marks ("") identify a phrase. For example, "Out of this world" will find this exact sequence of words. Out of this world would find any event that contains all of these words, but not necessarily in that order.
Parentheses ( ( ) ) are used for grouping terms. Parentheses can help avoid confusion in logic. For instance, these two statements are equivalent:
bob error OR warn NOT debug
(bob AND (error OR warn)) AND NOT debug
The equal sign (=) is reserved for specifying fields. Searching for an equal sign can be accomplished by wrapping it in quotes. You can also escape characters to search for them. \= is the same as "=".
Brackets ( [ ] ) are used to perform a subsearch.
You can use these operators in fairly complicated ways if you want to be very specific, or even to find multiple sets of events in a single query. The following are a few examples:
error mary NOT jacky
error NOT (mary warn) NOT (jacky error)
index=myapplicationindex ( sourcetype=sourcetype1 AND ( (bob NOT error) OR (mary AND warn) ) ) OR ( sourcetype=sourcetype2 (jacky info) )
This can also be written with some whitespace for clarity:
index=myapplicationindex
  ( sourcetype=sourcetype1
    AND
    ( (bob NOT error)
      OR
      (mary AND warn) )
  )
  OR
  ( sourcetype=sourcetype2 (jacky info) )
Clicking to modify your search
Though you can probably figure it out by just clicking around, it is worth discussing the behavior of the GUI when moving your mouse around and clicking.
Clicking on any word or field value will give you the option to Add to search or Exclude from search (the existing search) or (create a) New search:
Clicking on a word or a field value that is already in the query will give you the option to remove it (from the existing query) or, as above, (create a) new (search):
Event segmentation
In previous versions of Splunk, event segmentation was configurable through a setting in the Options dialog. In version 6.2, the Options dialog is not present – although segmentation (discussed later in this chapter, under Field widgets) is still an important concept, it is not accessible through the web interface/Options dialog in this version.
Field widgets
Clicking on values in the Select Fields dialog (the field picker), or in the field value widgets underneath an event, will again give us an option to append (add to) or exclude (remove from) our search or, as before, to start a new search. For instance, if source="C:\Test Data\TM1ProcessError_20140623213757_temp.log" appears under your event, clicking on that value and selecting Add to search will append source="C:\\Test Data\\TM1ProcessError_20140623213757_temp.log" to your search:
To use the field picker, you can click on the link All Fields (see the following image):
Expand the results window by clicking on > in the far-left column. Clicking on a result will append that item to the current search:
If a field value looks like key=value in the text of an event, you will want to use one of the field widgets instead of clicking on the raw text of the event. Depending on your event segmentation setting, clicking on the word will either add the value or key=value. The former will not take advantage of the field definition; instead, it will simply search for the word. The latter will work for events that contain the exact quoted text, but not for other events that actually contain the same field value extracted in a different way.
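For example (the field name and value here are hypothetical), the difference looks like this in the search bar:
bob          (a bare word; matches the term anywhere in the event text, ignoring the field definition)
user=bob     (a field expression; matches events where the extracted field user has the value bob, however it was extracted)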
Time
Clicking on the time next to an event will open the _time dialog (shown in the following image), allowing you to change the search to select Events Before or After a particular time period, and will also have the following choices:
Before this time
After this time
At this time
In addition, you can select Nearby Events within plus, minus, or plus or minus, a number of seconds (the default), milliseconds, minutes, hours, days, or weeks:
One search trick is to click on the time of an event, select At this time, and then use the Zoom out (above the timeline) until the appropriate time frame is reached.
Fields command
Description
Keeps (+) or removes (-) fields from search results based on the field list criteria. If + is specified, only the fields that match one of the fields in the list are kept. If - is specified, only the fields that match one of the fields in the list are removed. If neither is specified, defaults to +.
Important: The leading underscore is reserved for all internal Splunk Enterprise field names, such as _raw and _time. By default, internal fields _raw and _time are included in output. The fields command does not remove internal fields unless explicitly specified with:
... | fields - _*
or more explicitly, with: ... | fields - _raw,_time
Note: Be cautious removing the _time field. Statistical commands, such as timechart and chart, cannot display date or time information without the _time field.
Syntax
fields [+|-] <wc-field-list>
Required arguments
wc-field-list
Syntax: <field>, <field>, ...
Description: Comma-delimited list of fields to keep (+) or remove (-). You can use wild card characters in the field names.
Examples
Example 1: Remove the "host" and "ip" fields.
... | fields - host, ip
Example 2: Keep only the host and ip fields. Remove all of the internal fields. The internal fields begin with an underscore character, for example _time. ... | fields host, ip | fields - _*
Example 3: Keep only the fields 'source', 'sourcetype', 'host', and all fields beginning with 'error'. ... | fields source, sourcetype, host, error*
Table command
Description
The table command is similar to the fields command in that it lets you specify the fields you want to keep in your results. Use the table command when you want to retain data in tabular format.
The table command can be used to build a scatter plot to show trends in the relationships between discrete values of your data. Otherwise, you should not use it for charts (such as chart or timechart) because the UI requires the internal fields (which are the fields beginning with an underscore, _*) to render the charts, and the table command strips these fields out of the results by default. Instead, you should use the fields command because it always retains all the internal fields.
Syntax
table <wc-field-list>
Arguments
wc-field-list
Syntax: <wc-field> ...
Description: A list of field names. You can use wild card characters in the field names.
Usage
The table command returns a table formed by only the fields specified in the arguments. Columns are displayed in the same order that fields are specified. Column headers are the field names. Rows are the field values. Each row represents an event.
Command type: The table command is a non-streaming command. If you are looking for a streaming command similar to the table command, use the fields command.
Field renaming: The table command doesn't let you rename fields, only specify the fields that you want to show in your tabulated results. If you're going to rename a field, do it before piping the results to table.
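As a simple sketch of the table command (using default fields that exist in every event), the following keeps only three columns, in this order, in the results table:
... | table host, source, sourcetype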
Rename command
Description
Use the rename command to rename a specified field or multiple fields. This command is useful for giving fields more meaningful names, such as "Product ID" instead of "pid". If you want to rename multiple fields, you can use wildcards. Use quotes to rename a field to a phrase:
... | rename SESSIONID AS "The session ID"
Use wildcards to rename multiple fields: ... | rename *ip AS *IPaddress
If both the source and destination fields are wildcard expressions with the same number of wildcards, the renaming will carry over the wildcarded portions to the destination expression. See Example 2, below. Note: You cannot rename one field with multiple names. For example if you had a field A, you cannot do "A as B, A as C" in one string. ... | stats first(host) AS site, first(host) AS report
Note: You cannot use this command to merge multiple fields into one field because null, or non-present, fields are brought along with the values. For example, if you had events with either product_id or pid fields, ... | rename pid AS product_id would not merge the pid values into the product_id field. It overwrites product_id with Null values where pid does not exist for the event.
Syntax
rename <wc-field> AS <wc-field>...
Required arguments
wc-field
Syntax: <string>
Description: The name of a field and the name to replace it. You can use wild card characters in the field names. Names with spaces must be enclosed in quotation marks.
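To illustrate the wildcard carry-over described above (the field names srcip and dstip are hypothetical):
... | rename *ip AS *_address
Given events with the fields srcip and dstip, this produces src_address and dst_address; the wildcarded portions (src and dst) carry over into the destination names.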
Rex command
Description
Use this command to either extract fields using regular expression named groups, or replace or substitute characters in a field using sed expressions.
The rex command matches the value of the specified field against the unanchored regular expression and extracts the named groups into fields of the corresponding names. If a field is not specified, the regular expression is applied to the _raw field. Note: Running rex against the _raw field might have a performance impact.
When mode=sed, the given sed expression used to replace or substitute characters is applied to the value of the chosen field. If a field is not specified, the sed expression is applied to _raw. This sed-syntax is also used to mask sensitive data at index-time.
Use the rex command for search-time field extraction or string replacement and character substitution.
Syntax
rex [field=<field>] ( <regex-expression> [max_match=<int>] [offset_field=<string>] ) | (mode=sed <sed-expression>)
Required arguments
regex-expression
Syntax: "<string>"
Description: The PCRE regular expression that defines the information to match and extract from the specified field. Quotation marks are required.
mode
Syntax: mode=sed
Description: Specify to indicate that you are using a sed (UNIX stream editor) expression.
sed-expression
Syntax: "<string>"
Description: When mode=sed, specify whether to replace strings (s) or substitute characters (y) in the matching regular expression. No other sed commands are implemented. Quotation marks are required. Sed mode supports the following flags: global (g) and Nth occurrence (N), where N is a number that is the character location in the string.
Optional arguments
field
Syntax: field=<field>
Description: The field that you want to extract information from.
Default: _raw
max_match
Syntax: max_match=<int>
Description: Controls the number of times the regex is matched. If greater than 1, the resulting fields are multivalued fields.
Default: 1, use 0 to mean unlimited.
offset_field
Syntax: offset_field=<string>
Description: If provided, a field is created with the name specified by <string>. This value of the field has the endpoints of the match in terms of zero-offset characters into the matched field. For example, if the rex expression is "(?<tenchars>.{10})", this matches the first ten characters of the field, and the offset_field contents is "tenchars=0-9".
Default: unset
Sed expression
When using the rex command in sed mode, you have two options: replace (s) or character substitution (y).
The syntax for using sed to replace (s) text in your data is: "s/<regex>/<replacement>/<flags>"
<regex> is a PCRE regular expression, which can include capturing groups.
<replacement> is a string to replace the regex match. Use \n for backreferences, where "n" is a single digit.
<flags> can be either: g to replace all matches, or a number to replace a specified match.
The syntax for using sed to substitute characters is: "y/<string1>/<string2>/"
This substitutes the characters that match <string1> with the characters in <string2>.
Usage
Splunk Enterprise uses perl-compatible regular expressions (PCRE). When you use regular expressions in searches, you need to be aware of how characters such as pipe ( | ) and backslash ( \ ) are handled.
Examples
Example 1: Extract "from" and "to" fields using regular expressions. If a raw event contains "From: Susan To: Bob", then from=Susan and to=Bob.
... | rex field=_raw "From: (?<from>.*) To: (?<to>.*)"
Example 2: Extract "user", "app" and "SavedSearchName" from a field called "savedsearch_id" in scheduler.log events. If savedsearch_id=bob;search;my_saved_search then user=bob, app=search and SavedSearchName=my_saved_search
... | rex field=savedsearch_id "(?<user>\w+);(?<app>\w+);(?<SavedSearchName>\w+)"
Example 3: Use sed syntax to match the regex to a series of numbers and replace them with an anonymized string. ... | rex field=ccnumber mode=sed "s/(\d{4}-){3}/XXXX-XXXX-XXXX-/g"
Example 4: Display IP address and ports of potential attackers.
sourcetype=linux_secure port "failed password" | rex "\s+(?<ports>port \d+)" | top src_ip ports showperc=0
This search uses rex to extract the ports field and its values. Then it displays a table of the top source IP addresses (src_ip) and ports returned by the search for potential attackers.
Multikv command
Description
Extracts field-values from table-formatted events, such as the results of top, netstat, ps, and so on. The multikv command creates a new event for each table row and assigns field names from the title row of the table.
An example of the type of data multikv is designed to handle:
Name      Age  Occupation
Josh      42   SoftwareEngineer
Francine  35   CEO
Samantha  22   ProjectManager
The key properties here are:
Each line of text represents a conceptual record.
The columns are aligned.
The first line of text provides the names for the data in the columns.
multikv can transform this table from one event into three events with the relevant fields. It works more easily with fixed-alignment tables, though it can sometimes handle merely ordered fields. The general strategy is to identify a header, offsets, and field counts, and then determine which components of subsequent lines should be included into those field names. Multiple tables in a single event can be handled (if multitable=true), but may require ensuring that the secondary tables have capitalized or ALLCAPS names in a header row. Auto-detection of header rows favors rows that are text, and are ALLCAPS or Capitalized.
Syntax
multikv [conf=<stanza_name>] [<multikv-option>...]
Optional arguments
conf
Syntax: conf=<stanza_name>
Description: If you have a field extraction defined in multikv.conf, use this argument to reference the stanza in your search.
multikv-option
Syntax: copyattrs=<bool> | fields <field-list> | filter <term-list> | forceheader=<int> | multitable=<bool> | noheader=<bool> | rmorig=<bool>
Description: Options for extracting fields from tabular events.
Descriptions for multikv options
copyattrs
Syntax: copyattrs=<bool>
Description: When true, multikv copies all fields from the original event to the events generated from that event. When false, no fields are copied from the original event. This means that the events will have no _time field and the UI will not know how to display them.
Default: true
fields
Syntax: fields <field-list>
Description: Limit the fields set by the multikv extraction to this list. Ignores any fields in the table which are not on this list.
filter
Syntax: filter <term-list>
Description: If specified, multikv skips over table rows that do not contain at least one of the strings in the filter list. Quoted expressions are permitted, such as "multiple words" or "trailing_space ".
forceheader
Syntax: forceheader=<int>
Description: Forces the use of the given line number (1 based) as the table's header. Does not include empty lines in the count.
Default: The multikv command attempts to determine the header line automatically.
multitable
Syntax: multitable=<bool>
Description: Controls whether or not there can be multiple tables in a single _raw in the original events.
Default: true
noheader
Syntax: noheader=<bool>
Description: Handle a table without header row identification. The size of the table will be inferred from the first row, and fields will be named Column_1, Column_2, ... noheader=true implies multitable=false.
Default: false
rmorig
Syntax: rmorig=<bool>
Description: When true, the original events will not be included in the output results. When false, the original events are retained in the output results, with each original emitted after the batch of generated results from that original.
Default: true
Examples
Example 1: Extract the "COMMAND" field when it occurs in rows that contain "splunkd".
... | multikv fields COMMAND filter splunkd
Example 2: Extract the "pid" and "command" fields. ... | multikv fields pid command
Hands-on Lab
1. Use the source LoanStats3a.csv and only take a look at some fields out of the data
2. Use the source LoanStats3a.csv and the table command on the same fields in #1
3. Use the source LoanStats3a.csv and use the rename command to rename fields in #1
4. Use the source LoanStats3a.csv and use the rex command for:
a. source="LoanStats3a.csv" annual_inc=60000 | rex "Does not meet the credit policy.(?<all_util>.*)"
b. and then click on the all_util field to demonstrate the rex results
End of Module Quiz
Please refer to your virtual machine for test
Module 8 - Reporting Commands, Part 1 Use the following commands and their functions: Top Rare Hands on Lab covering: Top, Rare Stats Addcoltotals Hands on Lab covering: Stats, Addcoltotals End of Module Hands on Quiz
Top command
Description
Displays the most common values of a field.
Finds the most frequent tuple of values of all fields in the field list, along with a count and percentage. If the optional by-clause is included, the command finds the most frequent values for each distinct tuple of values of the group-by fields.
Syntax
top [<N>] [<top-options>...] <field-list> [<by-clause>]
Required arguments
field-list
Syntax: <field>, <field>, ...
Description: Comma-delimited list of field names.
Optional arguments
N
Syntax: <int>
Description: The number of results to return.
top-options
Syntax: countfield=<string> | limit=<int> | otherstr=<string> | percentfield=<string> | showcount=<bool> | showperc=<bool> | useother=<bool>
Description: Options for the top command. See Top options.
by-clause
Syntax: BY <field-list>
Description: The name of one or more fields to group by.
Top options
countfield
Syntax: countfield=<string>
Description: The name of a new field that the value of count is written to.
Default: "count"
limit
Syntax: limit=<int>
Description: Specifies how many tuples to return, "0" returns all values.
Default: "10"
otherstr
Syntax: otherstr=<string>
Description: If useother is true, specify the value that is written into the row representing all other values.
Default: "OTHER"
percentfield
Syntax: percentfield=<string>
Description: Name of a new field to write the value of percentage.
Default: "percent"
showcount
Syntax: showcount=<bool>
Description: Specify whether to create a field called "count" (see "countfield" option) with the count of that tuple.
Default: true
showperc
Syntax: showperc=<bool>
Description: Specify whether to create a field called "percent" (see "percentfield" option) with the relative prevalence of that tuple.
Default: true
useother
Syntax: useother=<bool>
Description: Specify whether or not to add a row that represents all values not included due to the limit cutoff.
Default: false
Examples
Example 1: Return the 20 most common values of the "referer" field.
sourcetype=access_* | top limit=20 referer
Example 2: Return top "action" values for each "referer_domain".
sourcetype=access_* | top action by referer_domain
Because a limit is not specified, this returns all the combinations of values for "action" and "referer_domain", as well as the counts and percentages.
Example 3: Return the top product purchased for each category. Do not show the percent field. Rename the count field to "total". sourcetype=access_* status=200 action=purchase | top 1 productName by categoryId showperc=f countfield=total
Rare command
Description
Displays the least common values of a field.
Finds the least frequent tuple of values of all fields in the field list. If the <by-clause> is specified, this command returns rare tuples of values for each distinct tuple of values of the group-by fields.
This command operates identically to the top command, except that the rare command finds the least frequent instead of the most frequent.
Syntax
rare [<top-options>...] <field-list> [<by-clause>]
Required arguments
field-list
Syntax: <field>, ...
Description: Comma-delimited list of field names.
Optional arguments
top-options
Syntax: countfield=<string> | limit=<int> | percentfield=<string> | showcount=<bool> | showperc=<bool>
Description: Options that specify the type and number of values to display. These are the same as used by the top command.
by-clause
Syntax: BY <field-list>
Description: The name of one or more fields to group by.
Top options
countfield
Syntax: countfield=<string>
Description: The name of a new field to write the value of count into.
Default: "count"
limit
Syntax: limit=<int>
Description: Specifies how many tuples to return. If you specify limit=0, all values up to maxresultrows are returned. See the Limits section. Specifying a value larger than maxresultrows produces an error.
Default: 10
percentfield
Syntax: percentfield=<string>
Description: Name of a new field to write the value of percentage.
Default: "percent"
showcount
Syntax: showcount=<bool>
Description: Specify whether to create a field called "count" (see "countfield" option) with the count of that tuple.
Default: true
showperc
Syntax: showperc=<bool>
Description: Specify whether to create a field called "percent" (see "percentfield" option) with the relative prevalence of that tuple.
Default: true
Limits
There is a limit on the number of results which rare returns. By default this limit is 10, but other values can be selected with the limit option, up to a further constraint expressed in limits.conf, in the [rare] stanza, maxresultrows. This ceiling is 50,000 by default, and effectively keeps a ceiling on the memory that rare will use.
Examples
Example 1: Return the least common values of the "url" field.
... | rare url
Example 2: Find the least common "user" value for a "host". ... | rare user by host
Hands on Lab covering: Top, Rare
1. Run:
source="C:\\LoanStats3a.csv" | top limit=20 addr_state
Now, show the rare addr_state values.
2. Run another search on your own demonstrating your use of the top and rare functions.
Stats command Description Calculates aggregate statistics over the results set, such as average, count, and sum. This is similar to SQL aggregation. If stats is used without a by clause only one row is returned, which is the aggregation over the entire incoming result set. If you use a by clause one row is returned for each distinct value specified in the by clause. Syntax Simple: stats (stats-function(field) [AS field])... [BY field-list] Complete: stats [partitions=] [allnum=] [delim=] ( ... | ... ) [] Required arguments stats-agg-term Syntax: ( | ) [AS ] Description: A statistical aggregation function. The function can be applied to an eval expression, or to a field or set of fields. Use the AS clause to place the result into a new field with a name that you specify. You can use wild card characters in field names. For more information on eval expressions, see Types of eval expressions in the Search Manual. sparkline-agg-term Syntax: [AS ] Description: A sparkline aggregation function. Use the AS clause to place the result into a new field with a name that you specify. You can use wild card characters in the field name. Optional arguments allnum syntax: allnum= Description: If true, computes numerical statistics on each field if and only if all of the values of that field 156
are numerical. Default: false delim Syntax: delim=<string> Description: Specifies how the values in the list() or values() aggregation are delimited. Default: a single space by-clause Syntax: BY <field-list> Description: The name of one or more fields to group by. You cannot use a wildcard character to specify multiple fields with similar names. You must specify each field separately. partitions Syntax: partitions=<num> Description: If specified, partitions the input data based on the split-by fields for multithreaded reduce. Default: 1 Stats function options stats-function Syntax: avg() | c() | count() | dc() | distinct_count() | earliest() | estdc() | estdc_error() | exactperc<num>() | first() | last() | latest() | list() | max() | median() | min() | mode() | p<num>() | perc<num>() | range() | stdev() | stdevp() | sum() | sumsq() | upperperc<num>() | values() | var() | varp() Description: Functions used with the stats command. Each time you invoke the stats command, you can use more than one function. However, you can only use one BY clause.
Usage The stats command does not support wildcard characters in field values in BY clauses. For example, you cannot specify | stats count BY source*.
Basic Examples 1. Return the average transfer rate for each host sourcetype=access* | stats avg(kbps) by host
2. Search the access logs, and return the total number of hits from the top 100 values of "referer_domain". The "top" command returns a count and percent value for each "referer_domain". sourcetype=access_combined | top limit=100 referer_domain | stats sum(count) AS total
3. Calculate the average time for each hour for similar fields using wildcard characters Return the average, for each hour, of any unique field that ends with the string "lay". For example, delay, xdelay, relay, etc. ... | stats avg(*lay) BY date_hour
4. Remove duplicates in the result set and return the total count for the unique results Remove duplicates of results with the same "host" value and return the total count of the remaining results. ... | stats dc(host)
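The grouping behavior in these examples can be sketched with a rough Python analogy (plain dicts stand in for events; this is illustrative, not Splunk code):

```python
from collections import defaultdict

# Hypothetical events standing in for search results.
events = [
    {"host": "web1", "kbps": 10.0},
    {"host": "web1", "kbps": 20.0},
    {"host": "web2", "kbps": 30.0},
]

# Rough analogy of: ... | stats avg(kbps) BY host
# One output row per distinct value of the BY field.
groups = defaultdict(list)
for event in events:
    groups[event["host"]].append(event["kbps"])

rows = [{"host": h, "avg(kbps)": sum(v) / len(v)} for h, v in sorted(groups.items())]

# Without a BY clause, stats returns a single aggregate row.
overall = {"avg(kbps)": sum(e["kbps"] for e in events) / len(events)}
```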
Addcoltotals command Description The addcoltotals command appends a new result to the end of the search result set. The result contains the sum of each numeric field, or you can specify which fields to summarize. Results are displayed on the Statistics tab. If the labelfield argument is specified, a column is added to the statistical results table with the name specified. Syntax addcoltotals [labelfield=<field>] [label=<string>] [<wc-field-list>] Optional arguments <wc-field-list> Syntax: <field> ... Description: A space-delimited list of valid field names. The addcoltotals command calculates the sum only for the fields in the list you specify. You can use the asterisk ( * ) as a wildcard in the field names. Default: Calculates the sum for all of the fields. labelfield Syntax: labelfield=<field> Description: Specify a field name to add to the result set. Default: none label Syntax: label=<string> Description: Used with the labelfield argument to add a label in the summary event. If the labelfield argument is absent, the label argument has no effect. Default: Total
Examples Example 1: Compute the sums of all the fields, and put the sums in a summary event called "change_name". ... | addcoltotals labelfield=change_name label=ALL
Example 2: Add a column total for two specific fields in a table. sourcetype=access_* | table userId bytes avgTime duration | addcoltotals bytes duration
Example 3: Filter fields for two name-patterns, and get totals for one of them. ... | fields user*, *size | addcoltotals *size
Example 4: Augment a chart with a total of the values present. index=_internal source=*metrics.log group=pipeline | stats avg(cpu_seconds) by processor | addcoltotals labelfield=processor
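The behavior shown in these examples can be sketched in Python (a hypothetical helper mimicking addcoltotals on a list of result rows; not Splunk's implementation):

```python
# Rough Python analogy of addcoltotals: append a summary row whose numeric
# columns hold the column sums; labelfield/label add an identifying column.
def addcoltotals(rows, labelfield=None, label="Total"):
    totals = {}
    for row in rows:
        for key, val in row.items():
            if isinstance(val, (int, float)):
                totals[key] = totals.get(key, 0) + val
    if labelfield is not None:
        totals[labelfield] = label
    return rows + [totals]

# Example table standing in for search results.
table = [{"bytes": 100, "duration": 2}, {"bytes": 300, "duration": 4}]
result = addcoltotals(table, labelfield="change_name", label="ALL")
```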
Hands on Lab
1. Run a search query that uses the top and stats functions with the Loan file to get the count:
source="C:\\LoanStats3a.csv" | top limit=20 addr_state | stats count
2. Run:
source="C:\\LoanStats3a.csv" addr_state=CA | stats count
3. Try running:
sourcetype=access_* | table userId bytes avgTime duration | addcoltotals bytes duration
4. Come up with your own example for the Loans file
End of Module Hands on Quiz Please refer to your virtual machine for test details
Module 9 - Reporting Commands, Part 2
Explore the available visualizations
Create a basic chart
Split values into multiple series
Hands on Lab covering: Explore the available visualizations, Create a basic chart, Split values into multiple series
Omit null and other values from charts
Create a time chart
Chart multiple values on the same timeline
Hands on Lab covering: Omit null and other values from charts, Create a time chart, Chart multiple values on the same timeline
Format charts
Explain when to use each type of reporting command
Hands on Lab covering: Format charts, Explain when to use each type of reporting command
End of Module hands on Quiz
Explore the available visualizations Accessing Splunk's visualization definition features Splunk provides user interface tools to create and modify visualizations. You can access these tools from various places in Splunk Web.
Search
Dashboards
Dashboard visual editor
Pivot
Reports
Visualizations from Splunk Search You can modify how Splunk displays search results in the Search page. After running a search, select the Visualization tab, then select the type of visualization to display and specify formatting options for the selected visualization. The search must be a reporting search that returns results that can be formatted as a visualization.
Dashboard panel visualizations When you base a new dashboard panel on search results, you can choose the visualization that best represents the data returned by the search. You can then use the Visualization Editor to fine-tune the way the panel visualization displays. To create a dashboard panel from search results, after you run the search click Save As > Dashboard Panel.
Events visualizations Events visualizations are essentially raw lists of events. You can get events visualizations from any search that does not include a transform operation, such as a search that uses reporting commands like stats, chart, timechart, top, or rare.
Tables You can pick table visualizations from just about any search, but the most interesting tables are generated by searches that include transform operations, such as a search that uses reporting commands like stats, chart, timechart, top, or rare. Charts Splunk provides a variety of chart visualizations, such as column, line, area, scatter, and pie charts. These visualizations require transforming searches (searches that use reporting commands) whose results involve one or more series. Maps Splunk provides a map visualization that lets you plot geographic coordinates as interactive markers on a world map. Searches for map visualizations should use the geostats search command to plot markers on a map. The geostats command is similar to the stats command, but provides options for zoom levels and cells for mapping. Events generated from the geostats command include latitude and longitude coordinates for markers.
Create a basic chart
Charts Splunk provides a variety of chart visualizations, such as column, line, area, scatter, and pie charts. These visualizations require transforming searches (searches that use reporting commands) whose results involve one or more series.
A series is a sequence of related data points that can be plotted on a chart. For example, each line plotted on a line chart represents an individual series. You can design transforming searches that produce a single series, or you can set them up so the results provide data for multiple series. It may help to think of the tables that can be generated by transforming searches. Every column in the table after the first one represents a different series. A "single series" search would produce a table with only two columns, while a "multiple series" search would produce a table with three or more columns.
All of the chart visualizations can handle single-series searches, though you'll find that bar, column, line, and pie chart visualizations are usually best for such searches. In fact, pie charts can only display data from single series searches. On the other hand, if your search produces multiple series, you'll want to go with a bar, column, line, area, or scatter chart visualization.
Column and bar charts Use a column chart or bar chart to compare the frequency of values of fields in your data. In a column chart, the x-axis values are typically field values (or time, especially if your search uses the timechart reporting command) and the y-axis can be any other field value, count of values, or statistical calculation of a field value. Bar charts are exactly the same, except that the x-axis and y-axis values are reversed.
The following bar chart presents the results of this search, which uses internal Splunk metrics. It finds the total sum of CPU_seconds by processor in the last 15 minutes, and then arranges the processors with the top ten sums in descending order: index=_internal "group=pipeline" | stats sum(cpu_seconds) as totalCPUSeconds by processor | sort 10 totalCPUSeconds desc
Note that in this example, we've also demonstrated how you can roll over a single bar or column to get detail information about it. When you define the properties of your bar and column charts, you can:
set the chart titles, as well as the titles of the x-axis and y-axis.
set the minimum y-axis value (for example, if all the y-axis values of your search are above 100 it may improve clarity to have the chart start at 100).
set the unit scale to Log (logarithmic) to improve clarity of charts where you have a mix of very small and very large y-axis values.
determine whether charts are stacked, 100% stacked, and unstacked. Bar and column charts are always unstacked by default. See the following subsection for details on stacking bar and column charts.
If you are formatting bar or column charts in dashboards with the Visualization Editor you can additionally:
set the major unit for the y-axis (for example, you can arrange to have tick marks appear in units of 10, or 20, or 45...whatever works best).
determine the position of the chart legend and the manner in which the legend labels are truncated.
turn their drilldown functionality on or off.
Stacked column and bar charts When your base search involves more than one data series, you can use stacked column charts and stacked bar charts to compare the frequency of field values in your data. In an unstacked column chart, the columns for different series are placed alongside each other. This may be fine if your chart is relatively simple--total counts of sales by month for two or three items in a store over the course of a year, for example--but when the series count increases it can make for a cluttered, confusing chart. In a column chart set to a Stack mode of Stacked, all of the series columns for a single datapoint (such as a specific month in the chart described in the preceding paragraph) are stacked to become segments of a single column (one column per month, to reference that example again). The total value of the column is the sum of the segments. Note: You use a stacked column or bar chart to highlight the relative weight (importance) of the different types of data that make up a specific dataset.
The following chart illustrates the customer views of pages in the website of MyFlowerShop, a hypothetical web-based flower store, broken out by product category over a 7 day period:
Here's the search that built that stacked chart:
sourcetype=access_* method=GET | timechart count by categoryId | fields _time BOUQUETS FLOWERS GIFTS SURPRISE TEDDY
Note the usage of the fields command; it ensures that the chart only displays counts of events with a product category ID; events without one (categorized as null by Splunk) are excluded. The third Stack mode option, Stacked 100%, enables you to compare data distributions within a column or bar by making it fit to 100% of the length or width of the chart and then presenting its segments in terms of their proportion of the total "100%" of the column or bar. Stacked 100% can help you to better see data distributions between segments in a column or bar chart that contains a mix of very small and very large stacks when Stack mode is just set to Stacked. Line and area charts Line and area charts are commonly used to show data trends over time, though the x-axis can be set to any field value. If your chart includes more than one series, each series will be represented by a differently colored line or area. This chart is based on a simple search that reports on internal Splunk metrics: index=_internal | timechart count by sourcetype
The shaded areas in area charts can help to emphasize quantities. The following area chart is derived from this search, which also makes use of internal Splunk metrics: index=_internal source=*metrics.log group=search_concurrency "system total" NOT user=* | timechart max(active_hist_searches) as "Historical Searches" max(active_realtime_searches) as "Real-time Searches"
When you define the properties of your line and area charts, you can:
set the chart titles, as well as the titles of the x-axis and y-axis.
determine what Splunk does with missing (null) y-axis values. You can have the system leave gaps for null datapoints, connect to zero datapoints, or just connect to the next positive datapoint. If you choose to leave gaps, Splunk will display markers for datapoints that are disconnected because they are not adjacent to other positive datapoints.
set the minimum y-axis value (for example, if all the y-axis values of your search are above 100 it may improve clarity to have the chart start at 100).
set the unit scale to Log (logarithmic) to improve clarity of charts where you have a mix of very small and very large y-axis values.
determine whether charts are stacked, 100% stacked, and unstacked. Charts are unstacked by default. See the following subsection for details on stacking line and area charts.
If you are formatting line or area charts in dashboards with the Visualization Editor you can additionally:
set the major unit for the y-axis (for example, you can arrange to have tick marks appear in units of 10, or 20, or 45...whatever works best).
determine the position of the chart legend and the manner in which the legend labels are truncated.
turn their drilldown functionality on or off.
Stacked line and area charts Stacked line and area charts operate along the same principles as stacked column and bar charts (see above). Stacked line and area charts can help readers when several series are involved; they make it easier to see how each data series relates to the entire set of data as a whole. The following chart is another example of a chart that presents information from internal Splunk metrics. The search used to create it is: index=_internal per_sourcetype_thruput | timechart sum(kb) by series useother=f
Pie chart Use a pie chart to show the relationship of parts of your data to the entire set of data as a whole. The size of a slice in a pie graph is determined by the size of a value of part of your data as a percentage of the total of all values. The following pie chart presents the views by referrer domain for a hypothetical online store for the previous day. Note that you can get metrics for individual pie chart wedges by mousing over them.
When you define the properties of pie charts you can set the chart title. If you are formatting pie charts in dashboards with the Visualization Editor you can additionally:
determine the position of the chart legend.
turn pie chart drilldown functionality on or off.
Scatter chart Use a scatter chart (or "scatter plot") to show trends in the relationships between discrete values of your data. Generally, a scatter plot shows discrete values that do not occur at regular intervals or belong to a series. This is different from a line graph, which usually plots a regular series of points. Here's an example of a search that can be used to generate a scatter chart. It looks at USGS earthquake data (in this case a CSV file that presents all magnitude 2.5+ quakes recorded over a given 7-day period, worldwide), pulls out just the Californian quakes, plots out the quakes by magnitude and quake depth, and then color-codes them by region. As you can see, the majority of quakes recorded during this period were fairly shallow--10 or fewer kilometers in depth--with the exception of one quake that was around 27 kilometers deep. None of the quakes exceeded a magnitude of 4.0.
To generate the chart for this example, we've used the table command, followed by three fields. The first field is what appears in the legend (Region). The second field is the x-axis value (Magnitude), which leaves the third field (Depth) to be the y-axis value. Note that when you use table the latter two fields must be numeric in nature. source=usgs Region=*California | table Region Magnitude Depth | sort Region
You can download a current CSV file from the USGS Earthquake Feeds and add it as an input to Splunk, but the field names and format will be slightly different from the example shown here. When you define the properties of your scatter charts, you can:
set the chart titles, as well as the titles of the x-axis and y-axis.
set the minimum y-axis value (for example, if all the y-axis values of your search are above 100 it may improve clarity to have the chart start at 100).
set the unit scale to Log (logarithmic) to improve clarity of charts where you have a mix of very small and very large y-axis values.
If you are formatting scatter charts in dashboards with the Visualization Editor you can additionally:
set the major unit for the y-axis (for example, you can arrange to have tick marks appear in units of 10, or 20, or 45...whatever works best).
determine the position of the chart legend and the manner in which the legend labels are truncated.
turn their drilldown functionality on or off.
Split values into multiple series Run for example: sourcetype=access_* | timechart count(eval(method="GET")) AS GET, count(eval(method="POST")) AS POST
Then click the visualization tab to see the result of this having two series. Make sure to select Line Chart
Hands on Lab:
1. Upload the data file ImplementingSplunkDataGenerator.tgz located on the desktop, then run:
source="ImplementingSplunkDataGenerator.tgz:*" host="WIN-SQM8ERRKEIJ" | chart count over date_month by date_wday
If you look back at the results from stats, the data is presented as one row per combination. Instead of a row per combination, chart generates the intersection of the two fields. You can specify multiple functions, but you may only specify one field each for over and by. Switching the fields (by rearranging our search statement a bit) turns the data the other way.
By simply clicking on the Visualization tab (to the right of the Statistics tab), we can see these results in a chart:
This is an Area chart, with particular format options set. Within the chart area, you can click on Area to change the chart type (Line, Area, Column, Bar, and so on) or Format to change the format options (Stack, Null Values, Multi-series Mode, and Drilldown).
Bonus Lab:
Create a chart from the Loan file csv on your desktop
Omit null and other values from charts Sometimes Splunk has extra null fields floating around.
If your records have a unique Id field, then the following snippet removes null fields: | stats values(*) as * by Id
The reason is that "stats values won't show fields that don't have at least one non-null value".
If your records don't have a unique Id field, then you should create one first using streamstats: | streamstats count as Id | stats values(*) as * by Id
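Conceptually, the stats values(*) as * by Id trick merges each Id's rows together and drops fields that never had a value. A rough Python analogy (simplified: it keeps one representative value per field, whereas values() collects every distinct non-null value):

```python
from collections import defaultdict

# Hypothetical result rows; None stands in for Splunk's null fields.
rows = [
    {"Id": 1, "user": "amy", "bytes": None},
    {"Id": 1, "user": None, "bytes": 512},
    {"Id": 2, "user": "bob", "bytes": None},
]

# For each Id, keep only fields that have at least one non-null value.
merged = defaultdict(dict)
for row in rows:
    for key, val in row.items():
        if key != "Id" and val is not None:
            merged[row["Id"]][key] = val

cleaned = [{"Id": i, **fields} for i, fields in sorted(merged.items())]
```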
Create a time chart The timechart command The timechart command generates a table of summary statistics which can then be formatted as a chart visualization where your data is plotted against an x-axis that is always a time field. Use timechart to display statistical trends over time, with the option of splitting the data with another field as a separate series in the chart. Timechart visualizations are usually line, area, or column charts. Examples Example 1: This report uses internal Splunk log data to visualize the average indexing thruput (indexing kbps) of Splunk processes over time, broken out by processor: index=_internal "group=thruput" | timechart avg(instantaneous_eps) by processor
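Under the hood, timechart buckets events into fixed time spans on the x-axis before aggregating within each bucket. A rough Python analogy of counting events per 60-second span (illustrative only, with made-up epoch times):

```python
from collections import Counter

# Hypothetical events with epoch-second timestamps.
events = [{"_time": t} for t in (3, 42, 65, 70, 130)]
span = 60  # analogous to timechart's span=1m

# Bucket each event by the start of its time span, then count per bucket.
buckets = Counter((e["_time"] // span) * span for e in events)
timechart = sorted(buckets.items())
```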
Chart multiple values on the same timeline Refer to lesson above for multiple series: Run for example: sourcetype=access_* | timechart count(eval(method="GET")) AS GET, count(eval(method="POST")) AS POST
Then click the visualization tab to see the result of this having two series. Make sure to select Line Chart
Hands on Lab Run sourcetype=access_* | timechart count(eval(method="GET")) AS GET, count(eval(method="POST")) AS POST
Create another example of a timechart with the Loan csv file
Format charts
Let's go ahead and take a look at the (chart) Format options. These options are grouped as:
General: Under General, you have the option to set the Stack Mode (which indicates how Splunk will display your chart columns for different series: alongside each other or as a single column), determine how to handle Null Values (you can leave gaps for null data points, connect to zero data points, or just connect to the next positive data point), set the Multi-series mode (Yes or No), and turn Drilldown on or off.
X-Axis: Mostly visual; you can set a custom title, allow truncation of label captions, and set the rotation of the text for your chart labels.
Y-Axis: Here you can set not just a custom title, but also the scale (linear or log), the interval, and the min and max values.
Chart Overlay: Here you can set the following options:
Overlay: Select a field to show as an overlay.
View as Axis: Select On to map the overlay to a second Y-axis.
Title: Specify a title for the overlay.
Scale: Select Inherit, Linear, or Log. Inherit uses the scale for the base chart. Log provides a logarithmic scale, useful for minimizing the display of large peak values.
Interval: Enter the units between tick marks in the axis.
Min Value: The minimum value to display. Values less than the Min Value do not appear on the chart.
Max Value: The maximum value to display. Values greater than the Max Value do not appear on the chart.
Legend: Finally, under Legend, you can set Position (where to place the legend in the visualization, or whether to include it at all) and Truncation (how to represent names that are too long to display).
Keep in mind that, depending on your search results and the visualization options that you select, you may or may not get a usable result. Some experimentation with the various options is recommended.
Explain when to use each type of reporting command
A reporting command primer This subsection covers the major categories of reporting commands and provides examples of how they can be used in a search. The primary reporting commands are:
chart: used to create charts that can display any series of data that you want to plot. You can decide what field is tracked on the x-axis of the chart.
timechart: used to create "trend over time" reports, which means that _time is always the x-axis.
top: generates charts that display the most common values of a field.
rare: creates charts that display the least common values of a field.
stats, eventstats, and streamstats: generate reports that display summary statistics.
associate, correlate, and diff: create reports that enable you to see associations, correlations, and differences between fields in your data.
Note: As you'll see in the following examples, you always place your reporting commands after your search commands, linking them with a pipe operator ("|"). chart, timechart, stats, eventstats, and streamstats are all designed to work in conjunction with statistical functions. The list of available statistical functions includes:
count, distinct count
mean, median, mode
min, max, range, percentiles
standard deviation, variance
sum
first occurrence, last occurrence
Hands on Lab Please format your chart from the last lab exercise
End of Module hands on Quiz
Module 10 - Analyzing, Calculating, and Formatting Results
Using the eval command
Perform calculations
Convert values
Hands on Lab covering: Using the eval command, Perform calculations, Convert values
Round values
Format values
Hands on Lab covering: Round values, Format values
Use conditional statements
Further filter calculated results
Hands on Lab covering: Use conditional statements, Further filter calculated results
End of Module Hands on Quiz
Using the eval command and performing calculations Use the eval command and functions The eval command enables you to devise arbitrary expressions that use automatically extracted fields to create a new field that takes the value that is the result of the expression's evaluation. The eval command is immensely versatile and useful. While some eval expressions are relatively simple, others can be quite complex.
Types of eval expressions An eval expression is a combination of literals, fields, operators, and functions that represent the value of your destination field. The expression can involve a mathematical operation, a string concatenation, a comparison expression, a boolean expression, or a call to one of the eval functions. Eval expressions require that the field's values are valid for the type of operation. For example, with the exception of addition, arithmetic operations may not produce valid results if the values are not numerical. For addition, eval can concatenate the two operands if they are both strings. When concatenating values with '.', eval treats both values as strings, regardless of their actual type.
Example 1: Use eval to define a field that is the sum of the areas of two circles, A and B. ... | eval sum_of_areas = pi() * pow(radius_a, 2) + pi() * pow(radius_b, 2)
The area of a circle is πr^2, where r is the radius. For circles A and B, the radii are radius_a and radius_b, respectively. This eval expression uses the pi and pow functions to calculate the area of each circle, adds the areas together, and saves the result in a field named sum_of_areas.
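For comparison, the same calculation in plain Python, with assumed example radii:

```python
import math

# The same sum-of-areas calculation as the eval expression above,
# with hypothetical radii chosen for the example.
radius_a, radius_b = 3.0, 4.0
sum_of_areas = math.pi * radius_a ** 2 + math.pi * radius_b ** 2  # pi*9 + pi*16
```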
Example 2: Use eval to define a location field using the city and state fields. For example, if the city=Philadelphia and state=PA, location="Philadelphia, PA". ... | eval location=city.", ".state
This eval expression is a simple string concatenation.
Convert values The convert command converts field values into numerical values. Unless you use the AS clause, the original values are replaced by the new values. Example 1 This example uses sendmail email server logs and refers to the logs with sourcetype=sendmail. The sendmail logs have two duration fields, delay and xdelay. The delay is the total amount of time a message took to deliver or bounce. The delay is expressed as "D+HH:MM:SS", which indicates the time it took in hours (HH), minutes (MM), and seconds (SS) to handle delivery or rejection of the message. If the delay exceeds 24 hours, the time expression is prefixed with the number of days and a plus character (D+). The xdelay is the total amount of time the message took to be transmitted during final delivery, and its time is expressed as "HH:MM:SS". Change the sendmail duration format of delay and xdelay to seconds. sourcetype=sendmail | convert dur2sec(delay) dur2sec(xdelay)
This search pipes all the sendmail events into the convert command and uses the dur2sec() function to convert the duration times of the fields, delay and xdelay, into seconds.
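To see what dur2sec computes, here is a hypothetical Python sketch of the conversion (not Splunk's implementation): it parses "D+HH:MM:SS" or "HH:MM:SS" into a total number of seconds.

```python
# Sketch of a dur2sec-style conversion: "D+HH:MM:SS" or "HH:MM:SS" -> seconds.
def dur2sec(duration):
    days = 0
    if "+" in duration:
        day_part, duration = duration.split("+", 1)
        days = int(day_part)
    hours, minutes, seconds = (int(part) for part in duration.split(":"))
    return days * 86400 + hours * 3600 + minutes * 60 + seconds
```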
Here is how your search results look after you use the fields sidebar to add the fields to your events:
You can compare the converted field values to the original field values in the events list. Example 2 This example uses syslog data. Convert a UNIX epoch time to a more readable time formatted to show hours, minutes, and seconds. sourcetype=syslog | convert timeformat="%H:%M:%S" ctime(_time) AS c_time | table _time, c_time
The ctime() function converts the _time value of syslog (sourcetype=syslog) events to the format specified by the timeformat argument. The timeformat="%H:%M:%S" argument tells the search to format the _time value as HH:MM:SS. Here, the table command is used to show the original _time value and the converted time, which is renamed c_time:
The ctime() function changes the timestamp to a non-numerical value. This is useful for display in a report or for readability in your events list.
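The same idea can be shown with Python's strftime-style formatting (an analogy, not Splunk code; UTC is used here for reproducibility, whereas Splunk applies its own timezone handling):

```python
import time

# Format an epoch timestamp as a readable HH:MM:SS string,
# analogous to convert timeformat="%H:%M:%S" ctime(_time).
epoch = 1234567890  # example epoch value
c_time = time.strftime("%H:%M:%S", time.gmtime(epoch))
```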
Example 3 This example uses syslog data. Convert a time in MM:SS.SSS (minutes, seconds, and subseconds) to a number in seconds. sourcetype=syslog | convert mstime(_time) AS ms_time | table _time, ms_time
The mstime() function converts the _time value of syslog (sourcetype=syslog) events from minutes and seconds to just seconds.
Here, the table command is used to show the original _time value and the converted time, which is renamed ms_time:
The mstime() function changes the timestamp to a numerical value. This is useful if you want to use it for more calculations.
More examples Example 1: Convert values of the "duration" field into a number value by removing string values in the field value. For example, if "duration="212 sec"", the resulting value is "duration="212"". ... | convert rmunit(duration)
Example 2: Change the sendmail syslog duration format (D+HH:MM:SS) to seconds. For example, if "delay="00:10:15"", the resulting value is "delay="615"".
... | convert dur2sec(delay)
Example 3: Change all memory values in the "virt" field to Kilobytes. ... | convert memk(virt)
Example 4: Convert every field value to a number value except for values in the field "foo". Use the "none" argument to specify fields to ignore. ... | convert auto(*) none(foo)
Hands on Lab
1. Run:
source="ImplementingSplunkDataGenerator.tgz:*" host="WIN-SQM8ERRKEIJ" error | stats count by logger user | eventstats sum(count) as totalcount | eval percent=count/totalcount*100 | sort -count
and explain the options.
2. Take the Loan csv file and develop some eval functions
Round and format values: functions and usage
All functions that accept strings can accept literal strings or any field. All functions that accept numbers can accept literal numbers or any numeric field.
Comparison and Conditional functions
case(X,"Y",...): This function takes pairs of arguments X and Y. The X arguments are Boolean expressions that will be evaluated from first to last. When the first X expression is encountered that evaluates to TRUE, the corresponding Y argument will be returned. The function defaults to NULL if none are true.
Example: This example returns descriptions for the corresponding http status code: ... | eval description=case(error == 404, "Not found", error == 500, "Internal Server Error", error == 200, "OK")
cidrmatch("X",Y): This function returns true when IP address Y belongs to a particular subnet X. The function uses two string arguments: the first is the CIDR subnet; the second is the IP address to match.
Example: This example uses cidrmatch to set a field, isLocal, to "local" if the field ip matches the subnet, or "not local" if it does not: ... | eval isLocal=if(cidrmatch("123.132.32.0/25",ip), "local", "not local")
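The cidrmatch logic can be reproduced in Python with the standard ipaddress module (an analogy, not Splunk code; the subnet and IP values below are examples):

```python
import ipaddress

# Rough Python analogy of cidrmatch: test whether an IP belongs to a subnet.
def cidrmatch(subnet, ip):
    return ipaddress.ip_address(ip) in ipaddress.ip_network(subnet)

# Analogous to: eval isLocal=if(cidrmatch("123.132.32.0/25",ip), "local", "not local")
is_local = "local" if cidrmatch("123.132.32.0/25", "123.132.32.77") else "not local"
```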
This example uses cidrmatch as a filter: ... | where cidrmatch("123.132.32.0/25", ip)
coalesce(X,...): This function takes an arbitrary number of arguments and returns the first value that is not null.
Example: Let's say you have a set of events where the IP address is extracted to either clientip or ipaddress. This example defines a new field called ip that takes the value of either clientip or ipaddress, depending on which is not NULL (exists in that event): ... | eval ip=coalesce(clientip,ipaddress)
if(X,Y,Z)
This function takes three arguments. The first argument X must be a Boolean expression. If X evaluates to TRUE, the result is the second argument Y. If X evaluates to FALSE, the result is the third argument Z.
Example: looks at the values of error and returns err=OK if error=200, otherwise returns err=Error:
... | eval err=if(error == 200, "OK", "Error")
like(TEXT, PATTERN)
This function takes two arguments: a string to match, TEXT, and a match expression string, PATTERN. It returns TRUE if and only if TEXT matches the SQLite pattern in PATTERN. The pattern language supports exact text match, as well as % characters for wildcards and _ characters for a single character match.
Example: returns "yes a foo" if the field value starts with foo:
... | eval is_a_foo=if(like(field, "foo%"), "yes a foo", "not a foo")
or
... | where like(field, "foo%")

match(SUBJECT, "REGEX")
This function compares the regex string REGEX to the value of SUBJECT and returns a Boolean value. It returns true if the REGEX can find a match against any substring of SUBJECT.
This example returns true IF AND ONLY IF field matches the basic pattern of an IP address. Note that the example uses ^ and $ to perform a full match. ... | eval n=if(match(field, "^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$"), 1, 0)
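The difference between like() (SQLite-style patterns) and match() (regular expressions) can be sketched in Python. The like() helper below is a hypothetical analogue that translates % to ".*" and _ to ".":

```python
import re

def like(text: str, pattern: str) -> bool:
    """SQLite-style match: % is any run of characters, _ is one character."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), text) is not None

# The anchored IP pattern from the match() example above
IP_REGEX = r"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$"

print(like("foobar", "foo%"))                   # True: starts with foo
print(bool(re.search(IP_REGEX, "10.1.10.14")))  # True: basic IP shape
print(bool(re.search(IP_REGEX, "not an ip")))   # False
```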
null()
This function takes no arguments and returns NULL. The evaluation engine uses NULL to represent "no value"; setting a field to NULL clears its value.
nullif(X,Y)
This function is used to compare fields. It takes two arguments, X and Y, and returns NULL if X = Y. Otherwise it returns X.
Example: ... | eval n=nullif(fieldA,fieldB)

searchmatch(X)
This function takes one argument X, which is a search string. It returns true if and only if the event matches the search string.
Example: ... | eval n=searchmatch("foo AND bar")
validate(X,Y,...)
This function takes pairs of arguments: Boolean expressions X and strings Y. It returns the string Y corresponding to the first expression X that evaluates to FALSE, and defaults to NULL if all are TRUE.
Example: runs a simple check for valid ports:
... | eval n=validate(isint(port), "ERROR: Port is not an integer", port >= 1 AND port <= 65535, "ERROR: Port is out of range")

Conversion functions

tonumber(NUMSTR,BASE), tonumber(NUMSTR)
This function converts the input string NUMSTR to a number. BASE is optional and defines the base of the number to convert; it can be 2 to 36 and defaults to 10. If tonumber cannot parse a field value to a number, for example if the value contains a leading or trailing space, the function returns NULL. Use the trim function to remove leading or trailing spaces. If tonumber cannot parse a literal string to a number, it returns an error.
Example: returns "164":
... | eval n=tonumber("0A4",16)

tostring(X,Y)
This function converts the input value to a string. If the input value is a number, it reformats it as a string. If the input value is a Boolean value, it returns the corresponding string value, "True" or "False". The function requires at least one argument X; if X is a number, the second argument Y is optional and can be "hex", "commas", or "duration":
tostring(X,"hex") converts X to hexadecimal.
tostring(X,"commas") formats X with commas and, if the number includes decimals, rounds to the nearest two decimal places.
tostring(X,"duration") converts seconds X to the readable time format HH:MM:SS.
Example: returns "True 0xF 12,345.68":
... | eval n=tostring(1==1) + " " + tostring(15, "hex") + " " + tostring(12345.6789, "commas")
Example: returns foo=615 and foo2=00:10:15:
... | eval foo=615 | eval foo2 = tostring(foo, "duration")
Example: formats the column totalSales to display values with a currency symbol and commas. You must use a period between the currency value and the tostring function:
... | fieldformat totalSales="$".tostring(totalSales,"commas")
Note: When used with the eval command, the values might not sort as expected because the values are converted to ASCII. Use the fieldformat command with the tostring function to format the displayed values. The underlying values are not changed with the fieldformat command.
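The tonumber and tostring("duration") behaviors can be illustrated with a small Python sketch. The helper names mirror the SPL functions but are otherwise hypothetical:

```python
def tonumber(numstr: str, base: int = 10):
    """Parse numstr in the given base; return None (NULL) if it cannot parse."""
    try:
        return int(numstr, base)
    except ValueError:
        return None

def to_duration(seconds: int) -> str:
    """Render seconds as HH:MM:SS, like tostring(X, "duration")."""
    h, rem = divmod(seconds, 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"

print(tonumber("0A4", 16))   # 164, matching the tonumber example above
print(to_duration(615))      # 00:10:15, matching the "duration" example
print(f"{12345.6789:,.2f}")  # 12,345.68, matching the "commas" example
```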
Cryptographic functions
md5(X)
This function computes and returns the MD5 hash of a string value X.
... | eval n=md5(field)
sha1(X)
This function computes and returns the secure hash of a string value X based on the FIPS compliant SHA-1 hash function.
... | eval n=sha1(field)
sha256(X)
This function computes and returns the secure hash of a string value X based on the FIPS compliant SHA-256 hash function.
... | eval n=sha256(field)
sha512(X)
This function computes and returns the secure hash of a string value X based on the FIPS compliant SHA-512 hash function.
... | eval n=sha512(field)
Date and Time functions
now()
This function takes no arguments and returns the time that the search was started. The time is represented in Unix time or in seconds since Epoch time.
relative_time(X,Y)
This function takes an epochtime time, X, as the first argument and a relative time specifier, Y, as the second argument and returns the epochtime value of Y applied to X.
... | eval n=relative_time(now(), "-1d@d")
strftime(X,Y)
This function takes an epochtime value, X, as the first argument and renders it as a string using the format specified by Y.
This example returns the hour and minute from the _time field: ... | eval n=strftime(_time, "%H:%M")
strptime(X,Y)
This function takes a time represented by a string, X, and parses it into a timestamp using the format specified by Y.
Example: if timeStr is in the form "11:59", this returns it as a timestamp:
... | eval n=strptime(timeStr, "%H:%M")
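Python's datetime module uses the same %-directives as strftime and strptime, so the round trip is easy to verify:

```python
from datetime import datetime

# strptime: parse a string into a timestamp, as in the example above
ts = datetime.strptime("11:59", "%H:%M")
print(ts.strftime("%H:%M"))   # strftime renders it back: "11:59"

# Render the hour and minute of an arbitrary timestamp, like strftime(_time, "%H:%M")
t = datetime(2014, 6, 3, 20, 49, 53)
print(t.strftime("%H:%M"))    # "20:49"
```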
time()
This function returns the wall-clock time with microsecond resolution. The value of time() will be different for each event, based on when that event was processed by the eval command.

Informational functions

isbool(X)
This function takes one argument X and returns TRUE if X is Boolean.
... | eval n=if(isbool(field),"yes","no")
or ... | where isbool(field)
isint(X)
This function takes one argument X and returns TRUE if X is an integer.
... | eval n=if(isint(field), "int", "not int")
or ... | where isint(field)

isnotnull(X)
This function takes one argument X and returns TRUE if X is not NULL. This is a useful check for whether or not a field (X) contains a value.
... | eval n=if(isnotnull(field),"yes","no")
or ... | where isnotnull(field)
isnull(X)
This function takes one argument X and returns TRUE if X is NULL.
... | eval n=if(isnull(field),"yes","no")
or ... | where isnull(field)
isnum(X)
This function takes one argument X and returns TRUE if X is a number.
... | eval n=if(isnum(field),"yes","no")
or ... | where isnum(field)

isstr(X)
This function takes one argument X and returns TRUE if X is a string.
... | eval n=if(isstr(field),"yes","no")
or ... | where isstr(field)
typeof(X)
This function takes one argument and returns a string representation of its type.
This example returns "NumberStringBoolInvalid": ... | eval n=typeof(12) + typeof("string") + typeof(1==2) + typeof(badfield)
Mathematical functions

abs(X)
This function takes a number X and returns its absolute value.
This example returns the absnum, whose values are the absolute values of the numeric field number: ... | eval absnum=abs(number)
ceil(X), ceiling(X)
This function rounds a number X up to the next highest integer.
Example: returns n=2:
... | eval n=ceil(1.9)

exact(X)
This function renders the result of a numeric eval calculation with a larger amount of precision in the formatted output.
Example: ... | eval n=exact(3.14 * num)

exp(X)
This function takes a number X and returns the exponential function e^X.
Example: returns y=e^3:
... | eval y=exp(3)

floor(X)
This function rounds a number X down to the nearest whole integer.
Example: returns 1:
... | eval n=floor(1.9)

ln(X)
This function takes a number X and returns its natural log.
Example: returns the natural log of the values of bytes:
... | eval lnBytes=ln(bytes)
log(X,Y), log(X)
This function takes either one or two numeric arguments and returns the log of the first argument X using the second argument Y as the base. If the second argument Y is omitted, this function evaluates the log of number X with base 10.
Example: ... | eval num=log(number,2)
pi()
This function takes no arguments and returns the constant pi to 11 digits of precision.
... | eval area_circle=pi()*pow(radius,2)
pow(X,Y)
This function takes two numeric arguments X and Y and returns XY.
... | eval area_circle=pi()*pow(radius,2)
round(X,Y)
This function takes one or two numeric arguments X and Y, returning X rounded to the number of decimal places specified by Y. The default is to round to an integer.
Example: returns n=4:
... | eval n=round(3.5)
Example: returns n=2.56:
... | eval n=round(2.555, 2)

sigfig(X)
This function takes one numeric argument X and rounds that number to the appropriate number of significant figures.
Example: 1.00*1111 = 1111, but
... | eval n=sigfig(1.00*1111)
returns n=1110.

sqrt(X)
This function takes one numeric argument X and returns its square root.
Example: returns 3:
... | eval n=sqrt(9)
Multivalue functions

commands(X)
This function takes a search string, or a field that contains a search string, X, and returns a multivalued field containing a list of the commands used in X. (This is generally not recommended for use except for analysis of audit.log events.)
Example:
... | eval x=commands("search foo | stats count | sort count")
returns a multivalued field x that contains 'search', 'stats', and 'sort'.
mvappend(X,...)
This function takes an arbitrary number of arguments and returns a multivalue result of all the values. The arguments can be strings, multivalue fields or single value fields.
... | eval fullName=mvappend(initial_values, "middle value", last_values)
mvcount(MVFIELD)
This function takes a field MVFIELD. It returns the number of values if the field is multivalued, 1 if it is a single-value field, and NULL otherwise.
Example: ... | eval n=mvcount(multifield)

mvdedup(X)
This function takes a multivalue field X and returns a multivalue field with its duplicate values removed.
Example: ... | eval s=mvdedup(mvfield)

mvfilter(X)
This function filters a multivalue field based on an arbitrary Boolean expression X. The Boolean expression X can reference ONLY ONE field at a time.
Note: This function will return NULL values of the field as well. If you don't want the NULL values, use the expression mvfilter(x!=NULL).
Example: returns all of the values in field email that end in .net or .org:
... | eval n=mvfilter(match(email, "\.net$") OR match(email, "\.org$"))
mvfind(MVFIELD,"REGEX")
This function tries to find a value in multivalue field MVFIELD that matches the regular expression REGEX. If a match exists, the index of the first matching value is returned (beginning with zero). If no values match, NULL is returned.
Example: ... | eval n=mvfind(mymvfield, "err\d+")

mvindex(MVFIELD,STARTINDEX,ENDINDEX), mvindex(MVFIELD,STARTINDEX)
This function takes two or three arguments: field MVFIELD and numbers STARTINDEX and ENDINDEX. It returns a subset of the multivalue field using the indexes provided. ENDINDEX is inclusive and optional. Both STARTINDEX and ENDINDEX can be negative, where -1 is the last element. If ENDINDEX is not specified, the function returns only the value at STARTINDEX. If the indexes are out of range or invalid, the result is NULL.
Example: since indexes start at zero, this returns the third value in "multifield", if it exists:
... | eval n=mvindex(multifield, 2)
mvjoin(MVFIELD,STR)
This function takes two arguments: multivalue field MVFIELD and string delimiter STR. It concatenates the individual values of MVFIELD with copies of STR in between as separators.
Example: joins together the individual values of "foo" using a semicolon as the delimiter:
... | eval n=mvjoin(foo, ";")

mvrange(X,Y,Z)
This function creates a multivalue field for a range of numbers. It can take up to three arguments: a starting number X, an ending number Y (exclusive), and an optional step increment Z. If the increment is a timespan such as '7'd, the starting and ending numbers are treated as epoch times.
Example: returns a multivalue field with the values 1, 3, 5, 7, 9:
... | eval mv=mvrange(1,11,2)

mvsort(X)
This function takes a multivalue field X and returns a multivalue field with the values sorted lexicographically.
Example: ... | eval s=mvsort(mvfield)
mvzip(X,Y,"Z")
This function takes two multivalue fields, X and Y, and combines them by stitching together the first value of X with the first value of field Y, then the second with the second, and so on. The third argument, Z, is optional and is used to specify a delimiting character to join the two values. The default delimiter is a comma. This is similar to Python's zip command.
... | eval nserver=mvzip(hosts,ports)
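Since mvzip is explicitly likened to Python's zip, the multivalue functions above map cleanly onto Python list operations. A quick sketch with made-up host/port values:

```python
# Multivalue fields behave like Python lists; hypothetical sample values:
hosts = ["web-01", "web-02"]
ports = ["80", "443"]

# mvzip(hosts, ports): pairwise join with the default "," delimiter
nserver = [",".join(pair) for pair in zip(hosts, ports)]
print(nserver)                   # ['web-01,80', 'web-02,443']
print(";".join(hosts))           # mvjoin(hosts, ";")
print(hosts[2:3] or None)        # mvindex out of range -> NULL (None here)
print(list(range(1, 11, 2)))     # mvrange(1,11,2) -> [1, 3, 5, 7, 9]
```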
Statistical functions
In addition to these functions, a comprehensive set of statistical functions is available to use with the stats, chart, and related commands.

max(X,...)
This function takes an arbitrary number of numeric or string arguments and returns the maximum; strings are greater than numbers.
Example: returns either "foo" or field, depending on the value of field:
... | eval n=max(1, 3, 6, 7, "foo", field)

min(X,...)
This function takes an arbitrary number of numeric or string arguments and returns the minimum; strings are greater than numbers.
Example: returns either 1 or field, depending on the value of field:
... | eval n=min(1, 3, 6, 7, "foo", field)

random()
This function takes no arguments and returns a pseudo-random integer ranging from zero to 2^31-1, that is, 0…2147483647.
Text functions
len(X)
This function returns the character length of a string X.
Example: ... | eval n=len(field)

lower(X)
This function takes one string argument and returns the lowercase version. The upper() function also exists for returning the uppercase version.
Example: returns the value provided by the field username in lowercase:
... | eval username=lower(username)

ltrim(X,Y), ltrim(X)
This function takes one or two arguments X and Y and returns X with the characters in Y trimmed from the left side. If Y is not specified, spaces and tabs are removed.
Example: returns x="abcZZ":
... | eval x=ltrim(" ZZZZabcZZ ", " Z")

replace(X,Y,Z)
This function returns a string formed by substituting string Z for every occurrence of regex string Y in string X. The third argument Z can also reference groups that are matched in the regex.
Example: returns date with the month and day numbers switched, so if the input was 1/14/2015 the return value would be 14/1/2015:
... | eval n=replace(date, "^(\d{1,2})/(\d{1,2})/", "\2/\1/")

rtrim(X,Y), rtrim(X)
This function takes one or two arguments X and Y and returns X with the characters in Y trimmed from the right side. If Y is not specified, spaces and tabs are removed.
Example: returns n="ZZZZabc":
... | eval n=rtrim(" ZZZZabcZZ ", " Z")
spath(X,Y)
This function takes two arguments: an input source field X and an spath expression Y, which is the XML or JSON formatted location path to the value that you want to extract from X. If Y is a literal string, it needs quotes: spath(X,"Y"). If Y is a field name (with values that are the location paths), it doesn't need quotes. This may result in a multivalued field. Read more about the spath search command.
Example: returns the values of locDesc elements:
... | eval locDesc=spath(_raw, "vendorProductSet.product.desc.locDesc")
Example: returns the hashtags from a twitter event:
index=twitter | eval output=spath(_raw, "entities.hashtags")

split(X,"Y")
This function takes two arguments: field X and delimiting character Y. It splits the value(s) of X on the delimiter Y and returns X as a multivalue field.
Example: ... | eval n=split(foo, ";")

substr(X,Y,Z)
This function takes either two or three arguments, where X is a string and Y and Z are numeric. It returns a substring of X, starting at the index specified by Y, with the number of characters specified by Z. If Z is not given, it returns the rest of the string. The indexes follow SQLite semantics; they start at 1. Negative indexes can be used to indicate a start from the end of the string.
Example: concatenates "str" and "ing" together, returning "string":
... | eval n=substr("string", 1, 3) + substr("string", -3)
trim(X,Y), trim(X)
This function takes one or two arguments X and Y and returns X with the characters in Y trimmed from both sides. If Y is not specified, spaces and tabs are removed.
Example: returns "abc":
... | eval n=trim(" ZZZZabcZZ ", " Z")

upper(X)
This function takes one string argument and returns the uppercase version. The lower() function also exists for returning the lowercase version.
Example: returns the value provided by the field username in uppercase:
... | eval n=upper(username)

urldecode(X)
This function takes one URL string argument X and returns the unescaped or decoded URL string.
Example: returns "http://www.splunk.com/download?r=header":
... | eval n=urldecode("http%3A%2F%2Fwww.splunk.com%2Fdownload%3Fr%3Dheader")
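The trim family and urldecode have direct Python equivalents (str.strip and urllib.parse.unquote), which makes the examples above easy to verify:

```python
from urllib.parse import unquote

s = " ZZZZabcZZ "
print(s.lstrip(" Z"))   # ltrim: spaces and Z removed from the left -> "abcZZ "
print(s.rstrip(" Z"))   # rtrim: removed from the right -> " ZZZZabc"
print(s.strip(" Z"))    # trim: removed from both sides -> "abc"
print(unquote("http%3A%2F%2Fwww.splunk.com%2Fdownload%3Fr%3Dheader"))
```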
Trigonometry and Hyperbolic functions
acos(X)
This function computes the arc cosine of X, in the interval [0,pi] radians.
Example: ... | eval n=acos(0)
Example: ... | eval degrees=acos(0)*180/pi()

acosh(X)
This function computes the arc hyperbolic cosine of X, in radians.
Example: ... | eval n=acosh(2)

asin(X)
This function computes the arc sine of X, in the interval [-pi/2,+pi/2] radians.
Example: ... | eval n=asin(1)
Example: ... | eval degrees=asin(1)*180/pi()

asinh(X)
This function computes the arc hyperbolic sine of X, in radians.
Example: ... | eval n=asinh(1)

atan(X)
This function computes the arc tangent of X, in the interval [-pi/2,+pi/2] radians.
Example: ... | eval n=atan(0.50)

atan2(Y, X)
This function computes the arc tangent of Y, X in the interval [-pi,+pi] radians. Y is a value that represents the proportion of the y-coordinate. X is the value that represents the proportion of the x-coordinate. To compute the value, the function takes into account the sign of both arguments to determine the quadrant.
Example: ... | eval n=atan2(0.50, 0.75)

atanh(X)
This function computes the arc hyperbolic tangent of X, in radians.
Example: ... | eval n=atanh(0.500)

cos(X)
This function computes the cosine of an angle of X radians.
Example: ... | eval n=cos(-1)
Example: ... | eval n=cos(pi())
cosh(X)
This function computes the hyperbolic cosine of X radians.
... | eval n=cosh(1)
hypot(X,Y)
This function computes the hypotenuse of a right-angled triangle whose legs are X and Y. The function returns the square root of the sum of the squares of X and Y, as described in the Pythagorean theorem.
Example: ... | eval n=hypot(3,4)

sin(X)
This function computes the sine.
Example: ... | eval n=sin(1)
Example: ... | eval n=sin(90 * pi()/180)

sinh(X)
This function computes the hyperbolic sine.
Example: ... | eval n=sinh(1)
tan(X)
This function computes the tangent.
... | eval n=tan(1)
tanh(X)
This function computes the hyperbolic tangent.
... | eval n=tanh(1)
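Python's math module mirrors these trigonometric eval functions one-for-one, which is handy for sanity-checking a search expression before running it:

```python
import math

print(math.hypot(3, 4))                       # 5.0 -- Pythagorean theorem
print(round(math.degrees(math.acos(0)), 6))   # 90.0 -- same as acos(0)*180/pi()
print(round(math.sin(math.radians(90)), 10))  # 1.0 -- sin(90 * pi/180)
print(math.atan2(0.50, 0.75))                 # quadrant-aware arc tangent
```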
Hands-on Lab
Take a look at the Loan CSV file. Use that file with some of the functions in the tables in this manual. Try round and some of the other functions that are very popular.
End of Module Quiz Please refer to virtual machine for quiz
Module 11 - Creating Field Aliases and Calculated Fields
Define naming conventions Create and use field aliases Create and use calculated fields Hands on Lab covering: Define naming conventions, Create and use field aliases, Create and use calculated fields. End of Module Hands on Quiz
Define naming conventions Example - Set up a naming convention for reports You work in the systems engineering group of your company, and as the knowledge manager for your Splunk Enterprise implementation, it's up to you to come up with a naming convention for the reports produced by your team. In the end you develop a naming convention that pulls together:
Group: Corresponds to the working group(s) of the user saving the search.
Search type: Indicates the type of search (alert, report, summary-index-populating).
Platform: Corresponds to the platform subjected to the search.
Category: Corresponds to the concern areas for the prevailing platforms.
Time interval: The interval over which the search runs (or on which the search runs, if it is a scheduled search).
Description: A meaningful description of the context and intent of the search, limited to one or two words if possible. Ensures the search name is unique.
Group: SEG, NEG, OPS, NOC
Search type: Alert, Report, Summary
Platform: Windows, iSeries, Network
Category: Disk, Exchange, SQL, Event log, CPU, Jobs, Subsystems, Services, Security
Time interval: (as appropriate)
Description: (as appropriate)
Possible reports using this naming convention:
SEG_Alert_Windows_Eventlog_15m_Failures SEG_Report_iSeries_Jobs_12hr_Failed_Batch NOC_Summary_Network_Security_24hr_Top_src_ip
Create and use field aliases
You can create multiple aliases for a field. The original field is not removed. This process enables you to search for the original field using any of its aliases.
Important: Field aliasing is performed after key/value extraction but before field lookups. Therefore, you can specify a lookup table based on a field alias. This can be helpful if one or more fields in the lookup table are identical to fields in your data but are named differently.
You can define aliases for fields that are extracted at index time as well as those that are extracted at search time.
You add your field aliases to props.conf, which you edit in $SPLUNK_HOME/etc/system/local/, or your own custom app directory in $SPLUNK_HOME/etc/apps/. (We recommend the latter directory if you want to make it easy to transfer your data customizations to other index servers.)
Note: Splunk Enterprise's field aliasing functionality does not currently support multivalue fields.
To alias fields:
1. Add the following line to a stanza in props.conf:
FIELDALIAS-<class> = <orig_field_name> AS <new_field_name>
<orig_field_name> is the original name of the field. <new_field_name> is the alias to assign to the field. You can include multiple field alias renames in one stanza.
2. Restart Splunk Enterprise for your changes to take effect.

Example of field alias additions for a lookup
Say you're creating a lookup for an external static table CSV file where the field you've extracted at search time as "ip" is referred to as "ipaddress". In the props.conf file where you've defined the extraction, you would add a line that defines "ipaddress" as an alias for "ip", as follows:
[accesslog]
EXTRACT-extract_ip = (?<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})
FIELDALIAS-extract_ip = ip AS ipaddress
When you set up the lookup in props.conf, you can just use ipaddress where you'd otherwise have used ip: [dns] lookup_ip = dnsLookup ipaddress OUTPUT host
Create and use calculated fields
The eval command is immensely versatile and useful. But while some eval expressions are relatively simple, others can be quite complex. If you need to use a particularly long and complex eval expression on a regular basis, retyping the expression accurately in search after search is tedious. This is where calculated fields come to the rescue. Calculated fields enable you to define fields with eval expressions in props.conf. Then, when you write a search, you can omit the eval expression entirely and reference the field like you would any other extracted field. When you run the search, the fields are extracted at search time and added to the events that include the fields in the eval expressions. For example, take this search, which examines earthquake data and classifies quakes by their depth by creating a new Description field:
source=eqs7day-M1.csv | eval Description=case(Depth<=70, "Shallow", Depth>70 AND Depth<=300, "Mid", Depth>300 AND Depth<=700, "Deep") | table Datetime, Region, Depth, Description
Using calculated fields, you could define the eval expression for the Description field in props.conf and write the search as: source=eqs7day-M1.csv | table Datetime, Region, Depth, Description
You can now search on Description as if it is any other extracted field. Splunk Enterprise will find the calculated field key in props.conf and evaluate it for every event that contains a Depth field. You can also run searches like this: source=eqs7day-M1.csv Description=Deep
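For reference, a calculated field is declared in props.conf with an EVAL-<fieldname> key. A minimal sketch, assuming a stanza name of your choosing (the stanza name below is hypothetical):

```ini
# Hypothetical stanza; calculated fields use EVAL-<fieldname> = <eval expression>
[eqs7day_quakes]
EVAL-Description = case(Depth<=70, "Shallow", Depth>70 AND Depth<=300, "Mid", Depth>300 AND Depth<=700, "Deep")
```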
Note: In the next section we show you how the Description calculated field would be set up in props.conf.
Hands-on Lab
To create a calculated field, go to: Settings -> Fields -> Add new (under the Calculated Fields section)
Sourcetype: csv
Name of field: a_test
Eval function: annual_inc * 2
Save it. When you bring up the csv sourcetype in search, you will see that the field a_test doubles the amount of annual_inc. Now you can try other calculated fields, if you like.
End of Module Hands on Quiz Please refer to virtual machine for test
Module 12 - Creating Field Extractions
Perform field extractions using Field Extractor Hands on Lab covering: Perform field extractions using Field Extractor End of Module Hands on Quiz
Perform field extractions using Field Extractor
As Splunk Enterprise processes events, it extracts fields from them. This process is called field extraction.

Splunk Enterprise automatically extracts some fields
Splunk Enterprise extracts some fields from your events without assistance. It automatically extracts host, source, and sourcetype values, timestamps, and several other default fields when it indexes incoming events. It also extracts fields that appear in your event data as key=value pairs. This process of recognizing and extracting k/v pairs is called field discovery. You can disable field discovery to improve search performance.
When fields appear in events without their keys, Splunk Enterprise uses pattern-matching rules called regular expressions to extract those fields as complete k/v pairs. With a properly configured regular expression, Splunk Enterprise can extract a field such as user_id=johnz even when the key does not appear in the event. Splunk Enterprise comes with several field extraction configurations that use regular expressions to identify and extract fields from event data.

To get all of the fields in your data, create custom field extractions
To use the full power of Splunk Enterprise search, create additional field extractions. Custom field extractions allow you to capture and track information that is important to your needs but is not automatically discovered and extracted by Splunk Enterprise. Any field extraction configuration you provide must include a regular expression that tells Splunk Enterprise how to find the field that you want to extract.
All field extractions, including custom field extractions, are tied to a specific source, sourcetype, or host value. For example, if you create an ip field extraction, you might tie the extraction configuration for ip to sourcetype=access_combined.
Custom field extractions should take place at search time, but in certain rare circumstances you can arrange for some custom field extractions to take place at index time.

Before you create custom field extractions, get to know your data
Before you begin to create field extractions, ensure that you are familiar with the formats and patterns of the event data associated with the source, sourcetype, or host that you are working with. One way is to investigate the predominant event patterns in your data with the Patterns tab.
Here are two events from the same source type, an Apache web access log:
131.253.24.135 - - [03/Jun/2014:20:49:53 -0700] "GET /wp-content/themes/aurora/style.css HTTP/1.1" 200 7464 "http://www.splunk.com/download" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0; Trident/5.0)"
10.1.10.14 - - [03/Jun/2014:20:49:33 -0700] "GET / HTTP/1.1" 200 75017 "-" "Mozilla/5.0 (compatible; Nmap Scripting Engine; http://nmap.org/book/nse.html)"
While these events contain different strings and characters, they are formatted in a consistent manner. They both present values for fields such as clientIP, status, bytes, method, and so on, in a reliable order. Reliable means that the method value is always followed by the URI value, the URI value is always followed by the status value, the status value is always followed by the bytes value, and so on. When your events have consistent and reliable formats, you can create a field extraction that accurately captures multiple field values from them.
For contrast, look at this set of Cisco ASA firewall log events:
1. Jul 15 20:10:27 10.11.36.31 %ASA-6-113003: AAA group policy for user AmorAubrey is being set to Acme_techoutbound
2. Jul 15 20:12:42 10.11.36.11 %ASA-7-710006: IGMP request discarded from 10.11.36.36 to outside:87.194.216.51
3. Jul 15 20:13:52 10.11.36.28 %ASA-6-302014: Teardown TCP connection 517934 for Outside:128.241.220.82/1561 to Inside:10.123.124.28/8443 duration 0:05:02 bytes 297 Tunnel has been torn down (AMOSORTILEGIO)
4. Apr 19 11:24:32 PROD-MFS-002 %ASA-4-106103: access-list fmVPN-1300 denied udp for user 'sdewilde7' outside/12.130.60.4(137) -> inside1/10.157.200.154(137) hit-cnt 1 first hit [0x286364c7, 0x0]
While these events contain field values that are always space-delimited, they do not share a reliable format like the preceding two events. In order, these events represent:
1. A group policy change
2. An IGMP request
3. A TCP connection teardown
4. A firewall access denial for a request from a specific IP
Because these events differ so widely, it is difficult to create a single field extraction that can apply to each of these event patterns and extract relevant field values. In situations like this, where a specific host, source type, or source contains multiple event patterns, you may want to define field extractions that match each pattern, rather than designing a single extraction that applies to all of the patterns. Inspect the events to identify text that is common and reliable for each pattern.

Using required text in field extractions
In the last four events, the strings of numbers that follow %ASA-#- have specific meanings. You can find their definitions in the Cisco documentation. When you have unique event identifiers like these in your data, specify them as required text in your field extraction. Required text strings limit the events that can match the regular expression in your field extraction. Specifying required text is optional, but it offers multiple benefits. Because required text reduces the set of events that the extraction scans, it improves field extraction efficiency and decreases the number of false-positive field extractions.
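The required-text idea can be sketched in Python against the first ASA event above. The field names and the regular expression here are illustrative assumptions, not the exact extraction Splunk would generate:

```python
import re

event = ("Jul 15 20:10:27 10.11.36.31 %ASA-6-113003: AAA group policy "
         "for user AmorAubrey is being set to Acme_techoutbound")

# Required text: only events carrying this message ID are scanned by the regex
REQUIRED_TEXT = "%ASA-6-113003"
pattern = re.compile(r"policy for user (?P<user>\w+) is being set to (?P<policy>\w+)")

hit = None
if REQUIRED_TEXT in event:      # cheap substring filter before the regex runs
    hit = pattern.search(event)
if hit:
    print(hit.groupdict())      # extracted fields as key/value pairs
```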
The field extractor utility enables you to highlight text in a sample event and specify that it is required text. Methods of custom field extraction in Splunk Enterprise As a knowledge manager you oversee the set of custom field extractions created by users of your Splunk Enterprise implementation, and you might define specialized groups of custom field extractions yourself. The ways that you can do this include:
The field extractor utility, which generates regular expressions for your field extractions. Adding field extractions through pages in Settings. You must provide a regular expression. Manual addition of field extraction configurations at the .conf file level. Provides the most flexibility for field extraction.
The field extraction methods that are available to Splunk Enterprise users are described in the following sections. All of these methods enable you to create search-time field extractions. To create an index-time field extraction, choose the third option: Configure field extractions directly in configuration files. Let the field extractor build extractions for you The field extractor utility leads you step-by-step through the field extraction design process. It provides two methods of field extraction: regular expressions and delimiter-based field extraction. The regular expression method is useful for extracting fields from unstructured event data, where events may follow a variety of different event patterns. It is also helpful if you are unfamiliar with regular expression syntax and usage, because it generates regular expressions and lets you validate them. The delimiter-based field extraction method is suited to structured event data. Structured event data comes from sources like SQL databases and CSV files, and produces events where all fields are separated by a common delimiter, such as commas, spaces, or pipe characters. Regular expressions usually are not necessary for structured data events from a common source. With the regular expression method of the field extractor you can:
Set up a field extraction by selecting a sample event and highlighting fields to extract from that event.
Create individual extractions that capture multiple fields. Improve extraction accuracy by detecting and removing false positive matches. Validate extraction results by using search filters to ensure that specific values are being extracted. Specify that fields only be extracted from events that contain a specific string of required text. Review stats tables of the field values discovered by your extraction. Manually configure the regular expression for the field extraction yourself.
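To make the regular expression method concrete, the following is a minimal Python sketch of what a generated extraction does: named capture groups in a regular expression become field names. The sample event, field names, and pattern here are hypothetical illustrations, not output copied from the field extractor.

```python
import re

# Hypothetical web-access event; the pattern below mimics the kind of
# regular expression the field extractor might generate for it.
event = '10.2.1.44 - admin [28/Feb/2023:10:15:01] "GET /account HTTP/1.1" 200'

# Named capture groups become the extracted field names.
pattern = re.compile(
    r'^(?P<clientip>\d{1,3}(?:\.\d{1,3}){3}) - (?P<user>\w+) '
    r'\[(?P<timestamp>[^\]]+)\] "(?P<method>\w+) (?P<uri>\S+)[^"]*" '
    r'(?P<status>\d{3})'
)

match = pattern.match(event)
fields = match.groupdict()
print(fields["clientip"])  # 10.2.1.44
print(fields["status"])    # 200
```

Splunk uses the same named-group convention (for example, `(?<fieldname>...)`) in the regular expressions it generates and stores.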
With the delimiter method of the field extractor you can:
Identify a delimiter to extract all of the fields in an event. Rename specific fields as appropriate. Validate extraction results.
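The delimiter method amounts to splitting each event on a separator and assigning names to the resulting positions. The following Python sketch illustrates the idea with hypothetical pipe-delimited events and made-up field names; it is an analogy, not Splunk's own implementation.

```python
import csv

# Hypothetical pipe-delimited events, as might come from a structured source.
raw_events = [
    "2023-02-28 10:15:01|checkout|alice|342.50",
    "2023-02-28 10:15:07|refund|bob|19.99",
]

# Assigning names to delimited positions is what the delimiter method does;
# these field names are illustrative.
field_names = ["timestamp", "action", "user", "amount"]

reader = csv.reader(raw_events, delimiter="|")
events = [dict(zip(field_names, row)) for row in reader]
print(events[0]["action"])  # checkout
```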
The field extractor can only build search-time field extractions that are associated with specific sources or source types in your data (no hosts).

Define field extractions with the Field Extractions and Field Transformations pages
You can use the Field Extractions and Field Transformations pages in Settings to define and maintain complex extracted fields in Splunk Web. This method of field extraction creation lets you create a wider range of field extractions than you can generate with the field extractor utility. It requires that you have the following knowledge.
Understand how to design regular expressions. Have a basic understanding of how field extractions are configured in props.conf and transforms.conf.
If you create a custom field extraction that extracts its fields from _raw and does not require a field transform, use the field extractor utility. The field extractor can generate regular expressions, and it can give you feedback about the accuracy of your field extractions as you define them.
Use the Field Extractions page to create basic field extractions, or use it in conjunction with the Field Transformations page to define field extraction configurations that can do the following things.
Reuse the same regular expression across multiple sources, source types, or hosts. Apply multiple regular expressions to the same source, source type, or host. Use a regular expression to extract fields from the values of another field.
The Field Extractions and Field Transformations pages define only search time field extractions.
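As a sketch of how such an extraction looks at the configuration-file level, a search-time extraction pair in transforms.conf and props.conf might resemble the following. The stanza names, source type, field name, and regular expression here are hypothetical illustrations.

```
# transforms.conf
[extract_status_field]
REGEX = status=(?<http_status>\d{3})

# props.conf
[access_combined_custom]
REPORT-status = extract_status_field
```

The REPORT- setting in props.conf ties the named transform to a source type, which is what lets one transform be reused across multiple sources, source types, or hosts.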
Hands on Lab Please refer to Lab on desktop
End of Module Hands on Quiz Please refer to quiz on Virtual Machine
Module 13 - Creating Tags and Event Types Create and use tags Describe event types and their uses Create an event type Hands on Lab covering: Create and use tags, Describe event types and their uses, Create an event type. End of Module Hands on Quiz
Create and use tags
Settings > Tags > List by tag name, then click Add new
Describe event types and their uses
Event types are a categorization system to help you make sense of your data. Event types let you sift through huge amounts of data, find similar patterns, and create alerts and reports.
Events versus event types An event is a single record of activity within a log file. An event typically includes a timestamp and provides information about what occurred on the system being monitored or logged. An event type is a user-defined field that simplifies search by letting you categorize events. Event types let you classify events that have common characteristics. When your search results come back, they're checked against known event types. An event type is applied to an event at search time if that event matches the event type definition in eventtypes.conf. Tag or save event types after indexing your data.
Event type classification There are several ways to create your own event types. Define event types via Splunk Web or through configuration files, or you can save any search as an event type.
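Whichever method you use, the definition ends up as a stanza in eventtypes.conf. As a hypothetical illustration (the stanza name and search string are invented for this example):

```
# eventtypes.conf (illustrative)
[web_error]
search = sourcetype=access_combined status>=400
```

At search time, any event matching that search would receive the eventtype value web_error.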
Create an event type
Complete a search, then click Save As > Event Type.
Hands on Lab Please refer to Lab on desktop
End of Module Hands on Quiz Please refer to quiz on virtual machine
Module 14 - Creating Workflow Actions
Describe the function of a workflow action Create a GET workflow action Hands on Lab covering: Describe the function of a workflow action, Create a GET workflow action Create a POST workflow action Create a Search workflow action Hands on Lab covering: Create a POST workflow action, Create a SEARCH workflow action End of Module Hands on Quiz
Describe the function of a workflow action
Workflow actions have a wide variety of applications. For example, you can define workflow actions that enable you to:
Perform an external WHOIS lookup based on an IP address found in an event. Use the field values in an HTTP error event to create a new entry in an external issue management system. Launch secondary searches that use one or more field values from selected events. Perform an external search (using Google or a similar web search application) on the value of a specific field found in an event.
In addition, you can define workflow actions that:
Are targeted to events that contain a specific field or set of fields, or which belong to a particular event type. Appear either in field menus or event menus in search results. You can also set them up to only appear in the menus of specific fields, or in all field menus in a qualifying event. When selected, open either in the current window or in a new one.
Define workflow actions using Splunk Web You can set up all of the workflow actions described in the bulleted list at the top of this chapter and many more using Splunk Web. To begin, navigate to Settings > Fields > Workflow actions. On the Workflow actions page you can review and update existing workflow actions by clicking on their names. Or you can click Add new to create a new workflow action. Both methods take you to the workflow action detail page, where you define individual workflow actions. If you're creating a new workflow action, you need to give it a Name and identify its Destination app. There are three kinds of workflow actions that you can set up:
GET workflow actions, which create typical HTML links to do things like perform Google searches on specific values or run domain name queries against external WHOIS databases. POST workflow actions, which generate an HTTP POST request to a specified URI. This action type enables you to do things like create entries in external issue management systems using a set of relevant field values. Search workflow actions, which launch secondary searches that use specific field values from an event, such as a search that looks for the occurrence of specific combinations of ipaddress and http_status field values in your index over a specific time range.
Create a GET workflow action
GET link workflow actions drop one or more values into an HTML link. Clicking that link performs an HTTP GET request in a browser, allowing you to pass information to an external web resource, such as a search engine or IP lookup service. To define a GET workflow action: 1. Navigate to Settings > Fields > Workflow Actions. 2. Click New to open up a new workflow action form. 3. Define a Label for the action. The Label field enables you to define the text that is displayed in either the field or event workflow menu. Labels can be static or include the value of relevant fields. 4. Determine whether the workflow action applies to specific fields or event types in your data. Use Apply only to the following fields to identify one or more fields. When you identify fields, the workflow action only appears for events that have those fields, either in their event menu or field menus. If you leave it blank or enter an asterisk the action appears in menus for all fields. Use Apply only to the following event types to identify one or more event types. If you identify an event type, the workflow action only appears in the event menus for events that belong to the event type. 5. For Show action in determine whether you want the action to appear in the Event menu, the Fields menus, or Both. 6. Set Action type to link.
7. In URI provide a URI for the location of the external resource that you want to send your field values to. Similar to the Label setting, when you declare the value of a field, you use the name of the field enclosed by dollar signs. Variables passed in GET actions via URIs are automatically URL encoded during transmission. This means you can include values that have spaces between words or punctuation characters. 8. Under Open link in, determine whether the workflow action displays in the current window or if it opens the link in a new window. 9. Set the Link method to get. 10. Click Save to save your workflow action definition.
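The $field$ substitution and automatic URL encoding described in step 7 can be sketched in Python as follows. The URI template and field values are hypothetical, and this is an analogy for what happens at click time, not Splunk's own code.

```python
from urllib.parse import quote

# Hypothetical event field values and a GET workflow action URI template.
event_fields = {"topic": "error code 404", "host": "web01"}
uri_template = "https://www.google.com/search?q=$topic$"

def expand_uri(template, fields):
    # Replace each $field$ token with the URL-encoded field value,
    # mirroring the automatic encoding described above.
    result = template
    for name, value in fields.items():
        result = result.replace(f"${name}$", quote(value))
    return result

print(expand_uri(uri_template, event_fields))
# https://www.google.com/search?q=error%20code%20404
```

Note how the spaces in the field value survive transmission because they are percent-encoded.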
Hands-on Lab
Please refer to Lab on desktop
Create a POST workflow action
Set up a POST workflow action
You set up POST workflow actions in a manner similar to that of GET link actions. However, POST requests are typically defined by a form element in HTML along with some inputs that are converted into POST arguments. This means that you have to identify POST arguments to send to the identified URI. 1. Navigate to Settings > Fields > Workflow Actions. 2. Click New to open up a new workflow action form. 3. Define a Label for the action. The Label field enables you to define the text that is displayed in either the field or event workflow menu. Labels can be static or include the value of relevant fields. 4. Determine whether the workflow action applies to specific fields or event types in your data. Use Apply only to the following fields to identify one or more fields. When you identify fields, the workflow action only appears for events that have those fields, either in their event menu or field menus. If you leave it blank or enter an asterisk the action appears in menus for all fields. Use Apply only to the following event types to identify one or more event types. If you identify an event type, the workflow action only appears in the event menus for events that belong to the event type. 5. For Show action in determine whether you want the action to appear in the Event menu, the Fields menus, or Both. 6. Set Action type to Link.
7. Under URI provide the URI for a web resource that responds to POST requests. 8. Under Open link in, determine whether the workflow action displays in the current window or if it opens the link in a new window. 9. Set Link method to Post. 10. Under Post arguments define arguments that should be sent to the web resource at the identified URI. These arguments are key and value combinations. On both the key and value sides of the argument, you can use field names enclosed in dollar signs to identify the field value from your events that should be sent over to the resource. You can define multiple key/value arguments in one POST workflow action. Enter the key in the first field, and the value in the second field. Click Add another field to create an additional POST argument. 11. Click Save to save your workflow action definition. Splunk software automatically HTTP-form encodes variables that it passes in POST link actions via URIs. This means you can include values that have spaces between words or punctuation characters.
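The HTTP-form encoding of POST arguments mentioned above can be sketched in Python. The argument keys and values below are hypothetical, standing in for an issue-tracker entry after $field$ tokens have been filled in from an event.

```python
from urllib.parse import urlencode

# Hypothetical POST arguments for an issue-tracker workflow action,
# after $field$ tokens have been filled in from an event.
post_args = {
    "title": "HTTP 500 on /checkout",
    "severity": "high",
}

# Form-encode the key/value pairs, mirroring the automatic
# HTTP-form encoding described above.
body = urlencode(post_args)
print(body)  # title=HTTP+500+on+%2Fcheckout&severity=high
```

Spaces and punctuation in field values are safe because they are encoded before the request body is sent.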
Create a Search workflow action
To set up workflow actions that launch dynamically populated secondary searches, you start by setting Action type to search on the Workflow actions detail page. This reveals a set of Search configuration fields that you use to define the specifics of the secondary search. In Search string enter a search string that includes one or more placeholders for field values, bounded by dollar signs. For example, if you're setting up a workflow action that searches on client IP values that turn up in events, you might simply enter clientip=$clientip$ in that field. Identify the app that the search runs in. If you want it to run in a view other than the current one, select that view. And as with all workflow actions, you can determine whether it opens in the current window or a new one. Be sure to set a time range for the search (or identify whether it should use the same time range as the search that created the field listing) by entering relative time modifiers in the Earliest time and Latest time fields. If these fields are left blank the search runs over all time by default.  Finally, as with other workflow action types, you can restrict the search workflow action to events containing specific sets of fields and/or which belong to particular event types.
Hands-on Lab Please refer to Lab on desktop
End of Module Quiz Please refer to questions on virtual machine
Module 15 - Creating and Managing Alerts
Describe alerts Create alerts View fired alerts Hands on Lab covering: Describe alerts, Create alerts, View fired alerts End of Module Hands on Quiz
Describe alerts
An alert is an action that a saved search triggers based on the results of the search. When creating an alert, you specify a condition that triggers the alert. Typically the action is an email based on the results of the search. But you can also choose to run a script or to list the alert as a triggered alert in Settings. When you create an alert you are creating a saved search with trigger conditions for the alert. To avoid sending out alerts too frequently, specify a throttle condition for an alert. The following list describes the types of alerts:
Per result alert. Based on a real-time search. The trigger condition is whenever the search returns a result.
Scheduled alert. Runs a search according to a schedule that you specify when creating the alert. You specify results of the search that trigger the alert.
Rolling-window alert. Based on a real-time search. The trigger condition is a combination of specified results of the search within a specified time window.
Create alerts
A scheduled alert runs periodically at a scheduled time, responding to a condition that triggers the alert. This example uses a search to track when there are too many errors in a Splunk Enterprise instance during the last 24 hours. When the number of errors exceeds 5, the alert sends an email with information about the conditions that triggered the alert. The alert sends an email every day at 10:00 AM when the number of errors exceeds the threshold. 1. From the Search Page, create the following search: index=_internal " error " NOT debug source=*splunkd.log* earliest=-24h latest=now
2. Click Save As > Alert. 3. Specify the following values for the fields in the Save As Alert dialog box: Title: Errors in the last 24 hours Alert type: Scheduled Time Range: Run every day Schedule: At 10:00 Trigger condition: Number of Results Trigger if number of results: is Greater than 5.
4. Click Next. 5. Click Send Email. 6. Set the following email settings, using tokens in the Subject and Message fields: To: email recipient Priority: Normal Subject: Too many errors alert: $name$ Message: There were $job.resultCount$ errors reported on $trigger_date$. Include: Link to Alert and Link to Results Accept defaults for all other options.
7. Click Save. After you create the alert you can view and edit the alert in the Alerts Page. When the alert triggers, it sends an email with the subject and message defined above.
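Behind the scenes, the Save As Alert dialog stores the alert as a stanza in savedsearches.conf. A rough hand-written equivalent of the alert above might look like the following; the recipient address is a placeholder, and the attribute list is abbreviated rather than copied from a live instance.

```
# savedsearches.conf (illustrative sketch, abbreviated)
[Errors in the last 24 hours]
search = index=_internal " error " NOT debug source=*splunkd.log* earliest=-24h latest=now
enableSched = 1
cron_schedule = 0 10 * * *
counttype = number of events
relation = greater than
quantity = 5
action.email = 1
action.email.to = recipient@example.com
```

The cron expression 0 10 * * * corresponds to the "every day at 10:00" schedule chosen in the dialog.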
View fired alerts Go to the Alerts page in the top toolbar
Hands-on Lab Please refer to Lab on desktop
End of Module Quiz
Please refer to questions on virtual machine
Module 16 - Creating and Using Macros
Describe macros Manage macros Create and use a basic macro Hands on Lab covering: Describe macros, Manage macros, Create and use a basic macro. Define arguments and variables for a macro Add and use arguments with a macro Hands on Lab covering: Define arguments and variables for a macro, Add and use arguments with a macro. End of Module Hands on Quiz
Describe Macros
Search macros are chunks of a search that you can reuse in multiple places, including saved and ad hoc searches. Search macros can be any part of a search, such as an eval statement or search term, and do not need to be a complete command. You can also specify whether or not the macro takes any arguments.
Manage and create macros
In Settings > Advanced Search > Search macros, click "New" to create a new search macro.

Define the search macro and its arguments
Your search macro can be any chunk of your search string or search command pipeline that you want to re-use as part of another search. Destination app is the name of the app you want to restrict your search macro to; by default, your search macros are restricted to the Search app. Name is the name of your search macro, such as mymacro. If your search macro takes an argument, you need to indicate this by appending the number of arguments to the name; for example, if mymacro required two arguments, it should be named mymacro(2). You can create multiple search macros that have the same name but require different numbers of arguments: foo, foo(1), foo(2), etc. Definition is the string that your search macro expands to when referenced in another search. If the search macro requires the user to input arguments, they are tokenized and indicated by wrapping dollar signs around the arguments; for example, $arg1$. The argument values are then specified when the search macro is invoked.
If Eval Generated Definition? is checked, then the 'Definition' is expected to be an eval expression that returns a string that represents the expansion of this macro. If a macro definition includes a leading pipe character ("|"), you may not use it as the first term in searches from the UI. Example: "| metadata type=sources". The UI does not do the macro expansion and cannot correctly identify the initial pipe to differentiate it from a regular search term. The UI constructs the search as if the macro name were a search term, which after expansion would cause the metadata command to be incorrectly formed and therefore invalid.
Arguments are a comma-delimited string of argument names. Argument names may only contain the characters: alphanumeric 'a-z, A-Z, 0-9'; underscore '_'; and dash '-'. This list should not contain any repeated elements.
If a macro argument includes quotes, you need to escape the quotes when you call the macro in your search. For example, if you wanted to pass a quoted string as your macro's argument, you would use: `my-macro("He said \"hello!\"")`.
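A macro defined through Settings is stored as a stanza in macros.conf. As a hypothetical illustration of a two-argument macro (the macro name, argument names, and definition are invented for this example):

```
# macros.conf (illustrative)
[convert_bytes(2)]
args = field, unit
definition = eval $field$_converted = $field$ / if("$unit$"=="MB", 1048576, 1024)
```

Invoked in a search as `convert_bytes(bytes, MB)`, the $field$ and $unit$ tokens would be replaced with the supplied argument values before the search runs.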
Validate your argument values You can verify that the argument values used to invoke the search macro are acceptable. How to invoke search macros are discussed in the following section, "Apply macros to saved and ad hoc searches".
Validation Expression is a string that is an 'eval' expression that evaluates to a boolean or a string. If the validation expression is a boolean expression, validation succeeds when it returns true. If it returns false or is null, validation fails, and the Validation Error Message is returned.
If the validation expression is not a boolean expression, it is expected to return a string or NULL. If it returns NULL, validation is considered a success. Otherwise, the string returned is rendered as the error string.

Apply macros to saved and ad hoc searches
To include a search macro in your saved or ad hoc searches, use the left quote (also known as a grave accent) character; on most English-language keyboards, this character is located on the same key as the tilde (~). You can also reference a search macro within other search macros using this same syntax. Note: Do NOT use the straight quote character that appears on the same key as the double quote (").
Hands-on Lab
Please refer to Lab on desktop
End of Module Quiz
Please refer to virtual machine for quiz
Module 17 - Using Pivot Describe Pivot Understand the relationship between data models and pivot Select a data model object Hands on Lab covering: Describe Pivot, Understand the relationship between data models and pivot, Select a data model object. Create a pivot report Save pivot report as a dashboard Hands on Lab covering: Create a pivot report, Save pivot report as a dashboard. End of Module Hands on Quiz.
Describe Pivot
The Pivot tool lets you report on a specific data set without the Splunk Enterprise Search Processing Language (SPL™). First, identify a dataset that you want to report on, and then use a drag-and-drop interface to design and generate pivots that present different aspects of that data in the form of tables, charts, and other visualizations.

How does Pivot work?
It uses data models to define the broad category of event data that you're working with, and then uses hierarchically arranged collections of data model objects to further subdivide the original dataset and define the attributes that you want Pivot to return results on. Data models and their objects are designed by the knowledge managers in your organization. They do a lot of hard work for you to enable you to quickly focus on a specific subset of event data.
Understand the relationship between data models and pivot
Data models drive the Pivot tool. They enable users of Pivot to create compelling reports and dashboards without designing the searches that generate them. Data models can have other uses, especially for Splunk Enterprise app developers. Splunk Enterprise knowledge managers design and maintain data models. These knowledge managers understand the format and semantics of their indexed data and are familiar with the Splunk Enterprise search language. In building a typical data model, knowledge managers use knowledge object types such as lookups, transactions, search-time field extractions, and calculated fields.
What is a data model?
A data model is a hierarchically structured search-time mapping of semantic knowledge about one or more datasets. It encodes the domain knowledge necessary to build a variety of specialized searches of those datasets. These specialized searches are used by Splunk Enterprise to generate reports for Pivot users. When a Pivot user designs a pivot report, she selects the data model that represents the category of event data that she wants to work with, such as Web Intelligence or Email Logs. Then she selects an object within that data model that represents the specific dataset on which she wants to report. Data models are composed of objects, which can be arranged in hierarchical structures of parent and child objects. Each child object represents a subset of the dataset covered by its parent object. If you are familiar with relational database design, think of data models as analogs to database schemas. When you plug them into the Pivot Editor, they let you generate statistical tables, charts, and visualizations based on column and row configurations that you select. To create an effective data model, you must understand your data sources and your data semantics. This information can affect your data model architecture: the manner in which the objects that make up the data model are organized.
Data Model Object
Data models are composed of one or more objects. Here are some basic facts about data model objects:
An object is a specification for a dataset. Each data model object corresponds in some manner to a set of data in an index. You can apply data models to different indexes and get different datasets. Objects break down into four types. These types are: Event objects, search objects, transaction objects, and child objects. Objects are hierarchical. Objects in data models can be arranged hierarchically in parent/child relationships. The top-level event, search, and transaction objects in data models are collectively referred to as "root objects." Child objects have inheritance. Data model objects are defined by characteristics that mostly break down into constraints and attributes. Child objects inherit constraints and attributes from their parent objects and have additional constraints and attributes of their own.
Hands-on Lab
Please refer to Lab on desktop
End of Module Quiz
Please refer to virtual machine for Quiz
End of Course Quiz Please refer to virtual machine for Quiz