
Elos-RW/splunk-lab

 
 


Splunk Lab

This project lets you stand up a Splunk instance in Docker on a quick and dirty basis.

Quick Start!

Paste either of these on the command line:

bash <(curl -s https://raw.githubusercontent.com/dmuth/splunk-lab/master/go.sh)

bash <(curl -Ls https://bit.ly/splunklab)

...and the script will print out which directory it will ingest logs from, your password, etc. Follow the on-screen instructions for setting environment variables and you'll be up and running in no time! Whatever logs you had sitting in your logs/ directory will be searchable in Splunk with the search index=main.

If you want to see neat things you can do in Splunk Lab, check out the Cookbook section.

Useful links after starting

  • https://localhost:8000/ - Default port to log into the local instance. Username is admin, password is what was set when starting Splunk Lab.
  • Splunk Dashboard Examples - Wanna see what you can do with Splunk? Here are some example dashboards.

Features

  • App dashboards can be stored in the local filesystem (they don't disappear when the container exits)
  • Ingested data can be stored in the local filesystem
  • Multiple REST and RSS endpoints "built in" to provide sources of data ingestion
  • Integration with REST API Modular Input
  • Splunk Machine Learning Toolkit included
  • /etc/hosts can be appended to with local ip/hostname entries
  • Ships with Eventgen to populate your index with fake webserver events for testing.

Screenshots

These are screenshots with actual data from production apps which I built on top of Splunk Lab:

Splunk Lab Cookbook

What can you do with Splunk Lab? Here are a few examples of ways you can use Splunk Lab:

Ingest some logs for viewing, searching, and analysis

  • Drop your logs into the logs/ directory.
  • bash <(curl -Ls https://bit.ly/splunklab)
  • Go to https://localhost:8000/
  • Ingested data will be written to data/, which will persist between runs.

Ingest some logs for viewing, searching, and analysis but DON'T keep ingested data between runs

  • SPLUNK_DATA=no bash <(curl -Ls https://bit.ly/splunklab)
  • Note that data/ will not be written to and launching a new container will cause logs/ to be indexed again.
    • This will increase the ingestion rate on Docker for macOS, as there are some performance issues with the filesystem driver in Docker on macOS.

Play around with synthetic webserver data

  • SPLUNK_EVENTGEN=1 bash <(curl -Ls https://bit.ly/splunklab)
  • Fake webserver logs will be written every 10 seconds and can be viewed with the query index=main sourcetype=nginx. The logs are based on actual HTTP requests which have come into the webserver hosting my blog.

Adding Hostnames into /etc/hosts

  • Edit a local hosts file
  • ETC_HOSTS=./hosts bash <(curl -Ls https://bit.ly/splunklab)
  • This can be used in conjunction with something like Splunk Network Monitor to ping hosts that don't have DNS names, such as your home's webcam. :-)
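As a rough sketch of the steps above, a local hosts file could be created like this. The addresses and hostnames are made-up placeholders, and the file is assumed to use the standard /etc/hosts layout of an address followed by a name.

```shell
# Write a local hosts file in standard /etc/hosts format.
# The addresses and names below are hypothetical placeholders.
cat > ./hosts <<'EOF'
192.168.1.50  webcam
192.168.1.51  printer
EOF

# Then start Splunk Lab with ETC_HOSTS pointing at the file:
#   ETC_HOSTS=./hosts bash <(curl -Ls https://bit.ly/splunklab)
```

The launch command is left as a comment so the snippet only touches the local file.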

Get the Docker command line for any of the above

  • Run any of the above with PRINT_DOCKER_CMD=1 set, and the Docker command line that's used will be written to stdout.
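For anyone wiring a similar switch into their own wrapper script, here is a minimal sketch of the idea. The run_or_print helper is hypothetical and is not taken from go.sh. In this sketch the command is printed instead of being run, which may differ from what Splunk Lab actually does.

```shell
# Hypothetical helper: print the command when PRINT_DOCKER_CMD is set,
# otherwise execute it. This is an illustration, not go.sh's code.
run_or_print() {
    if [ -n "${PRINT_DOCKER_CMD:-}" ]; then
        # Write the command line to stdout instead of running it.
        echo "$*"
    else
        "$@"
    fi
}

# Example: prints the command line rather than starting a container.
PRINT_DOCKER_CMD=1 run_or_print docker run -p 8000:8000 dmuth1/splunk-lab
```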

Run Splunk Lab in Development Mode with a bash Shell

This would normally be done with the script ./bin/devel.sh when running from the repo, but if you're running Splunk Lab just with the Docker image, here's how to do it:

docker run -p 8000:8000 -e SPLUNK_PASSWORD=password1 -v "$(pwd)/data:/data" -v "$(pwd)/logs:/logs" --name splunk-lab --rm -it -v "$(pwd):/mnt" -e SPLUNK_DEVEL=1 dmuth1/splunk-lab bash

This is useful mainly if you want to poke around in Splunk Lab while it's running. Note that if a container is already running, you could always just run docker exec -it splunk-lab bash instead of doing all of the above. :-)

Splunk Apps Included

The following Splunk apps are included in this Docker image:

All apps are covered under their own license. Please check the Apps page for more info.

Splunk has its own license. Please abide by it.

Free Sources of Data

I put together this curated list of free sources of data which can be pulled into Splunk via one of the included apps:

Apps Built With Splunk Lab

Since building Splunk Lab, I have used it as the basis for building other projects:

Here's all of the above, presented as a graph:

Building Your Own Apps Based on Splunk Lab

A sample app, along with instructions on how to use it, is in the sample-app directory. Feel free to expand on that app for your own apps.

A Word About Security

HTTPS is turned on by default. Passwords such as password and 12345 are not permitted.

Please, use a strong password if you are deploying this on a public-facing machine.
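If you script your own launches, a pre-flight check along these lines can catch weak passwords before the container starts. This is only a sketch: is_weak_password is a hypothetical helper, the two rejected values are the examples named above, and the 8-character minimum is an assumption added for illustration, not a documented Splunk Lab rule.

```shell
# Hypothetical pre-flight check. Returns 0 (true) if the password looks weak.
is_weak_password() {
    case "$1" in
        password|12345) return 0 ;;   # the rejected examples named in the text
    esac
    # Assumed extra rule for illustration only: under 8 characters is weak.
    [ "${#1}" -lt 8 ]
}

# Usage sketch:
#   if is_weak_password "$SPLUNK_PASSWORD"; then
#       echo "Please choose a stronger password." >&2
#   fi
```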

FAQ

Can I get a valid SSL cert on localhost?

Yes, you can!

First, install mkcert and then run mkcert -install && mkcert localhost 127.0.0.1 ::1 to generate a local CA and a cert/key combo for localhost.

Then, when you run Splunk Lab, set the environment variables SSL_KEY and SSL_CERT and those files will be pulled into Splunk Lab.

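Example: SSL_KEY=./localhost.key SSL_CERT=./localhost.pem ./go.sh (mkcert names its output files after the hosts it was given, e.g. localhost+2-key.pem and localhost+2.pem for the command above, so adjust the paths to match your files.)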

Does this work on Macs?

Sure does! I built this on a Mac. :-)

Development

I wrote a series of helper scripts in bin/ to make the process easier:

  • ./bin/build.sh - Build the containers.
    • Note that this downloads packages from an AWS S3 bucket that I created. This bucket is set to "Requester Pays", so you'll need to make sure the aws CLI is set up.
  • ./bin/download.sh - Download tarballs of various apps and split some of them into chunks.
  • ./bin/upload-file-to-s3.sh - Upload a specific file to S3. Used for rolling out new versions of apps.
  • ./bin/push.sh - Tag and push the container.
  • ./bin/devel.sh - Build and tag the container, then start it with an interactive bash shell.
    • This is a wrapper for the above-mentioned go.sh script. Any environment variables that work there will work here.
    • To force rebuilding a container during development, touch the associated Dockerfile in docker/. E.g. touch docker/1-splunk-lab to rebuild the contents of that container.
  • ./bin/create-1-million-events.py - Create 1 million events in the file 1-million-events.txt in the current directory.
    • If not in logs/ but reachable from the Docker container, the file can then be oneshotted into Splunk with the following command: /opt/splunk/bin/splunk add oneshot ./1-million-events.txt -index main -sourcetype oneshot-0001
  • ./bin/kill.sh - Kill a running splunk-lab container.
  • ./bin/attach.sh - Attach to a running splunk-lab container.
  • ./bin/clean.sh - Remove logs/ and/or data/ directories.
  • ./bin/tarsplit - Local copy of my package from https://github.com/dmuth/tarsplit

Building Container Internals

  • Here's the layout of the cache/ directory
    • cache/ - Where tarballs for Splunk and its apps hang out. These are downloaded when bin/download.sh is run for the first time.
    • cache/deploy/ - When creating a specific Docker image, files are copied here so the Dockerfile can ingest them. (Or rather hardlinked to the files in the parent directory.)
    • cache/build/ - 0-byte files are written here when a specific container is built, and on future builds, the age of that file is checked against the Dockerfile. If the Dockerfile is newer, then the container is (re-)built. Otherwise, it is skipped. On my 2020 iMac, this shortens a run of bin/devel.sh in which no containers need to be built from 12 seconds to 0.2 seconds.
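The timestamp check described for cache/build/ can be sketched in a few lines of shell. This is an illustration of the technique, not the code in bin/build.sh: needs_rebuild is a hypothetical helper, and the stamp file name in the usage comment is an assumption.

```shell
# Hypothetical helper mirroring the cache/build/ idea: rebuild only when the
# Dockerfile is newer than its 0-byte stamp file, or when no stamp exists yet.
needs_rebuild() {
    dockerfile="$1"
    stamp="$2"
    [ ! -e "$stamp" ] || [ "$dockerfile" -nt "$stamp" ]
}

# Usage sketch (the stamp file name is assumed; the build step is a placeholder):
#   if needs_rebuild docker/1-splunk-lab cache/build/1-splunk-lab; then
#       # ...build the image here, then record that it was built:
#       touch cache/build/1-splunk-lab
#   fi
```

Because touch updates a file's modification time, running touch on a Dockerfile makes it newer than its stamp, which is why that is enough to force a rebuild.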

A word on default/ and local/ directories

I had to struggle with this for a while, so I'm mostly documenting it here.

When in devel mode, /opt/splunk/etc/apps/splunk-lab/ is mounted to ./splunk-lab-app/ via go.sh, and the entrypoint script inside of the container symlinks local/ to default/. This way, any changes that are made to dashboards will be propagated outside of the container and can be checked in to Git.

When in production mode (e.g. running ./go.sh directly), no symlink is created. Instead, local/ is mounted from whatever $SPLUNK_APP is pointing to, so that any changes made by the user will show up on their host, with Splunk Lab's default/ directory being untouched.
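The devel-mode symlink step might look roughly like the sketch below. It is not the actual entrypoint script: link_local_to_default is a hypothetical name, and using SPLUNK_DEVEL as the switch is an assumption based on the docker run example earlier in this README.

```shell
# Hypothetical sketch of the devel-mode step: make local/ a symlink to
# default/ inside the app directory, so edits saved by Splunk land in default/.
link_local_to_default() {
    app_dir="$1"
    if [ -n "${SPLUNK_DEVEL:-}" ]; then
        # -f replaces an existing link; -n stops ln from following an existing
        # local -> default symlink and creating the new link inside default/.
        ln -sfn default "$app_dir/local"
    fi
}

# Usage sketch, with the app path named above:
#   SPLUNK_DEVEL=1 link_local_to_default /opt/splunk/etc/apps/splunk-lab
```

When SPLUNK_DEVEL is unset or empty the function does nothing, which matches the production case where local/ is a separate mount.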

Additional Reading

Notes/Bugs

  • The Docker containers are dmuth1/splunk-lab and dmuth1/splunk-lab-ml. The latter has all of the Machine Learning apps built in to the image. Feel free to extend those for your own projects.
  • If I run ./bin/create-test-logfiles.sh 10000 and then start Splunk Lab on a Mac, all of the files will be indexed without any major issues, but then the CPU will spin, and not from Splunk.
    • The root cause is that the filesystem code for Docker volume mappings in Docker on macOS is VERY inefficient in terms of both CPU and memory usage, especially when there are 10,000 files involved. The overhead is just crazy. When reading events from a directory mounted through Docker, I see about 100 events/sec. When the directory is local to the container, I see about 1,000 events/sec, for a 10x difference.
  • The HTTPS cert is self-signed with Splunk's own CA. If you're tired of seeing a Certificate Error every time you try connecting to Splunk, you can follow the instructions at https://stackoverflow.com/a/31900210/196073 to allow self-signed certificates for localhost in Google Chrome.
    • Please understand the implications before you do this.

Credits

  • Splunk N' Box - Splunk N' Box is used to create entire Splunk clusters in Docker. It was the first actual use of Splunk I saw in Docker, and gave me the idea that hey, maybe I could run a stand-alone Splunk instance in Docker for ad-hoc data analysis!
  • Splunk, for having such a fantastic product which is also a great example of Operational Excellence!
  • Eventgen is a super cool way of generating simulated data that can be used to build dashboards for testing and training purposes.
  • This text to ASCII art generator, for the logo I used in the script.
  • The logo was made over at https://www.freelogodesign.org/

Copyrights

  • Splunk is copyright by Splunk, Inc. Please stay within the confines of the 500 MB/day free license when using Splunk Lab, unless you brought your own license along.
  • The various apps are copyright by the creators of those apps.

Contact

My email is doug.muth@gmail.com. I am also @dmuth on Twitter and Facebook!

About

Create a lab instance of Splunk for ad hoc data analytics. Includes Splunk's Machine Learning app!
