
Mastering Log Analysis with Splunk: A Hands-On Guide for the BOTS Exercise

Learn how to approach the Splunk Boss of the SOC (BOTS) exercise with practical search techniques, log interpretation tips, and real-world incident response strategies tailored for CS6261 students.

Introduction to Splunk and the BOTS Exercise

Log analysis is a critical skill for cybersecurity professionals, especially those involved in incident response. The Splunk Boss of the SOC (BOTS) exercise provides a realistic environment to practice searching, correlating, and interpreting logs. In this tutorial, we'll walk through effective strategies to tackle the BOTS questions, using the botsv3 index and common Splunk commands. Whether you're a student at Georgia Tech or a self-learner, this guide will help you build confidence in log analysis.

Setting Up Your Splunk Environment

Before diving into searches, ensure you have access to the Splunk instance via the Georgia Tech VPN. Once logged in, navigate to the Search app and set the time range to All time. This is crucial because the BOTS logs span multiple days. Start with a simple search to explore available data:

index=botsv3 | stats count by sourcetype

This command lists all log types (sourcetypes) in the index. Familiarize yourself with common ones like WinEventLog, syslog, iis, and suricata. Understanding sourcetypes helps you narrow down searches later.

Building Foundational Search Skills

Effective Splunk searching relies on key commands: search, stats, table, top, rare, and eval. For example, to find the most frequent source IPs in firewall logs:

index=botsv3 sourcetype=fortigate* | top src_ip

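To restrict a search to a time window, add the earliest and latest modifiers to the base search. Relative values such as earliest=-7d are measured from the current time, so against a historical dataset like BOTS they will usually match nothing; use absolute timestamps in the %m/%d/%Y:%H:%M:%S format instead. Splunk parses each event's timestamp into _time automatically, and you can convert other time strings with eval strptime() or the convert command.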
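
Typing absolute timestamps by hand is error-prone, so it can help to generate them. The following is a minimal Python sketch, not part of Splunk itself: the helper name time_window is hypothetical, and the dates are placeholders to be replaced with the window stated in the question.

```python
from datetime import datetime

# Timestamp layout accepted by the earliest/latest modifiers.
SPLUNK_TIME_FORMAT = "%m/%d/%Y:%H:%M:%S"


def time_window(start: datetime, end: datetime) -> str:
    """Return 'earliest=... latest=...' modifiers for an absolute window."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    return (
        f"earliest={start.strftime(SPLUNK_TIME_FORMAT)} "
        f"latest={end.strftime(SPLUNK_TIME_FORMAT)}"
    )


# Placeholder window; substitute the dates given in the question.
window = time_window(datetime(2026, 5, 20), datetime(2026, 5, 23, 23, 59, 59))
print(f"index=botsv3 sourcetype=fortigate* {window} | top src_ip")
```

The printed line can be pasted into the search bar as-is; the time picker is overridden by modifiers given in the search string.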

Understanding Log Formats and Fields

Each sourcetype has a unique structure. For Windows Event Logs, fields like EventCode, User, and ComputerName are key. For network logs (e.g., Suricata), look for alert.signature, src_ip, dest_ip, and proto. The fields command only keeps or removes fields from the results; to list every field available for a sourcetype, along with counts and sample values, use fieldsummary:

index=botsv3 sourcetype=suricata | fieldsummary

This reveals fields like alert.category and http.hostname. Knowing these fields allows you to construct precise queries.

Practical Search Examples for BOTS Questions

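Many BOTS questions ask you to identify specific events, such as a successful brute-force attempt or a malware download. Let's simulate a common scenario: finding a user who logged in from an unusual location. Search for authentication events with a specific EventCode (e.g., 4624 for a successful logon) and exclude source IPs in the internal ranges. Putting the exclusions in the base search, rather than in a later search command, lets Splunk discard events earlier. If the environment also uses 172.16.0.0/12, exclude that range too, and check the field list first, because the source address is not always extracted as src_ip:

index=botsv3 sourcetype=WinEventLog EventCode=4624 NOT src_ip=10.* NOT src_ip=192.168.* | stats count by User, src_ip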
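
To make the filtering logic concrete, here is a small Python sketch that applies the same idea to a handful of events locally: drop private source addresses, then count by user and source IP. The sample events and function names are hypothetical, and treating all three RFC 1918 ranges as "internal" is an assumption about the lab network.

```python
import ipaddress
from collections import Counter

# RFC 1918 private ranges treated as internal (an assumption about the network).
INTERNAL_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_external(ip: str) -> bool:
    """True when the address falls outside every internal range."""
    addr = ipaddress.ip_address(ip)
    return not any(addr in net for net in INTERNAL_NETWORKS)


def count_external_logons(events):
    """Local analogue of: <external filter> | stats count by User, src_ip."""
    return Counter(
        (e["User"], e["src_ip"]) for e in events if is_external(e["src_ip"])
    )


# Hypothetical logon events, for illustration only.
sample_events = [
    {"User": "alice", "src_ip": "10.0.1.15"},
    {"User": "bob", "src_ip": "203.0.113.7"},
    {"User": "bob", "src_ip": "203.0.113.7"},
    {"User": "carol", "src_ip": "192.168.1.20"},
]

counts = count_external_logons(sample_events)
for (user, src_ip), n in counts.items():
    print(user, src_ip, n)
```

Only the two logons from the non-private address survive the filter, which is the kind of short result list worth investigating further.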

Another typical task is to identify a file hash associated with malware. Use suricata alerts to find HTTP requests to malicious domains, then extract the file name or hash:

index=botsv3 sourcetype=suricata alert.category=Malware | table http.hostname, http.url, src_ip, dest_ip

Then pivot to the iis logs to see if any internal hosts downloaded that file:

index=botsv3 sourcetype=iis cs_uri_stem=*malicious.exe*

Advanced Techniques: Correlation and Subsearch

When questions require correlating data from multiple sources, use subsearch or append. For example, to find all source IPs that triggered a suricata alert and then appeared in a Windows logon failure event:

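index=botsv3 sourcetype=WinEventLog EventCode=4625 [search index=botsv3 sourcetype=suricata alert.category=Malware | dedup src_ip | fields src_ip] | stats count by src_ip, User

The subsearch in square brackets runs first and returns the suricata source IPs, which Splunk expands into src_ip=... terms joined by OR. The outer search then keeps only the logon failures from those IPs, so no IN operator is needed around the brackets. The dedup inside the subsearch keeps the returned list short, which matters because subsearches are capped by default (10,000 results and 60 seconds of run time).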
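
It helps to picture what the outer search actually receives from a subsearch: not a list of values, but a block of field=value terms joined by OR. The Python sketch below imitates that expansion locally. The function name expand_subsearch is hypothetical, the IPs are documentation addresses, and the exact spacing of Splunk's own output may differ.

```python
def expand_subsearch(field: str, values: list[str]) -> str:
    """Render values the way a subsearch result is substituted into a search."""
    # Drop duplicates while preserving order, like dedup in the subsearch.
    unique = list(dict.fromkeys(values))
    if not unique:
        raise ValueError("subsearch returned no values")
    terms = [f'( {field}="{value}" )' for value in unique]
    return "( " + " OR ".join(terms) + " )"


# Hypothetical IPs returned by the suricata search.
malicious_ips = ["203.0.113.7", "198.51.100.23", "203.0.113.7"]
expanded = expand_subsearch("src_ip", malicious_ips)
print(f"index=botsv3 sourcetype=WinEventLog EventCode=4625 {expanded}")
```

Because the substituted text is already a complete boolean expression, the bracketed subsearch can sit directly in the base search as a filter.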

Using Lookups and External Data

Some BOTS questions require referencing external threat intelligence, like known malicious IPs or domains. While the built-in lookup files are limited, you can create a lookup table in Splunk. For example, to flag IPs from a known blocklist, upload a CSV with columns ip and threat, create a lookup definition named threatlist that points to it, then use:

index=botsv3 sourcetype=fortigate* | lookup threatlist ip AS src_ip OUTPUT threat | where threat="malicious"

The quotes matter: in a where expression, an unquoted malicious is read as a field name rather than a string value.

This enriches your search results with threat context, helping you answer questions like 'Which internal IP communicated with a known C2 server?'
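
The lookup file itself is just a CSV whose header row names the columns. As a rough sketch, the Python below builds such a file with the ip and threat columns; the blocklist entries are made-up documentation addresses and build_threatlist_csv is a hypothetical helper, not a Splunk API.

```python
import csv
import io


def build_threatlist_csv(entries):
    """Return CSV text with an 'ip,threat' header and one row per entry."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["ip", "threat"])  # column names referenced by the lookup
    for ip, threat in entries:
        writer.writerow([ip, threat])
    return buffer.getvalue()


# Hypothetical blocklist entries, for illustration only.
blocklist = [
    ("203.0.113.7", "malicious"),
    ("198.51.100.23", "malicious"),
]

csv_text = build_threatlist_csv(blocklist)
print(csv_text, end="")
```

Saving this text to a .csv file and uploading it as a lookup table file gives the lookup command something to match src_ip against.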

Optimizing Search Performance

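To avoid timeouts during the exercise, keep searches narrow: specify the index and sourcetype, filter as early as possible in the base search, and limit the time range to the specific dates from the question rather than 'All time' once you know the incident window. Summary indexing, report acceleration, and data models can speed up repetitive queries, but they need setup and permissions you may not have on a shared lab instance. For simple counts over indexed fields such as sourcetype, tstats is a faster alternative, e.g. | tstats count where index=botsv3 by sourcetype.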

Common Pitfalls and How to Avoid Them

  • Ignoring the time picker: Always verify the time range covers the incident period. Many questions specify a date range; to be precise, use absolute modifiers in the form earliest=05/20/2026:00:00:00 latest=05/23/2026:23:59:59 (placeholder dates; substitute the window given in the question).
  • Misspelling field names: Field names are case-sensitive. Use | fieldsummary to check exact names.
  • Overlooking sourcetype: Searching the whole index without a sourcetype can return massive results and slow searches down. Use a broad search only to discover which sourcetypes exist, then add sourcetype to every query.
  • Forgetting to deduplicate: Some logs contain duplicate events. Use dedup _time src_ip to keep one event per timestamp and source IP before counting, or stats dc(src_ip) to count distinct values directly.
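
To see why deduplication changes a count, the following Python sketch imitates the behaviour of dedup on a field list: the first event seen for each combination of values is kept and later repeats are dropped. The events are hypothetical, and skipping events that lack one of the fields is an assumption meant to mirror the command's default behaviour.

```python
def dedup(events, fields):
    """Keep the first event for each unique combination of the given fields."""
    seen = set()
    kept = []
    for event in events:
        # Assumption: events missing any of the fields are dropped.
        if any(field not in event for field in fields):
            continue
        key = tuple(event[field] for field in fields)
        if key in seen:
            continue
        seen.add(key)
        kept.append(event)
    return kept


# Hypothetical events; the second one is an exact repeat of the first.
events = [
    {"_time": "2026-05-20T10:00:00", "src_ip": "203.0.113.7"},
    {"_time": "2026-05-20T10:00:00", "src_ip": "203.0.113.7"},
    {"_time": "2026-05-20T10:05:00", "src_ip": "203.0.113.7"},
]

unique_events = dedup(events, ["_time", "src_ip"])
print(len(events), "raw events,", len(unique_events), "after dedup")
```

A raw count would report three events here, while the deduplicated count reports two, which is the difference that can cost a point on a question asking "how many times".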

Connecting Log Analysis to Real-World Incident Response

Log analysis isn't just for class; it is used daily in Security Operations Centers (SOCs). For example, SOC teams covering large public events watch their logs for phishing campaigns aimed at attendees, and similar techniques are applied to spotting anomalies in machine learning pipelines. Understanding how to pivot from a suspicious IP to a user account to a file hash is exactly what incident responders do. The BOTS exercise simulates these real-world scenarios, so treat each question as a mini-investigation.

Final Tips for BOTS Success

Collaborate with classmates on Ed Discussion—explain your search logic without giving away exact answers. Use the official Splunk documentation and the 'How to Search' panel inside Splunk. Remember, the goal is to learn, not just to get points. If you get stuck, break the question into smaller parts: identify the log type, the field, the value, and then build the search incrementally.

"Log analysis is like solving a puzzle—each piece of data adds context until the full picture emerges."

Good luck with your CS6261 project! By mastering these techniques, you'll be well-prepared for real-world incident response challenges.