This Python script provides a comprehensive workflow for restoring Splunk buckets from S3 and integrating them with SmartStore, including:
- Generating a bucket structure.
- Rebuilding buckets locally.
- Uploading buckets to SmartStore.
- Checking the status of uploaded buckets.
- Evicting buckets from local storage once uploaded.

- Generate Bucket Structure: Scans the source S3 bucket for indexes and buckets, while also checking local status (`Hosts.data`) and the secondary S3 bucket for `receipt.json`.
- Rebuild Buckets: Allows the user to rebuild specific buckets locally.
- Upload Buckets: Automates uploading rebuilt buckets to Splunk SmartStore.
- Check Bucket Status: Monitors uploaded buckets and updates their status.
- Evict Buckets: Evicts buckets from local storage after they are successfully uploaded.

Python Requirements:
- Python 3.x
- Required libraries: `boto3`, `requests`, `urllib3`
- Install dependencies: `pip install boto3 requests urllib3`

Splunk Configuration:
- Ensure Splunk is running locally and accessible at `https://localhost:8089`.
- Update the script with your Splunk credentials (`AUTH`) and URL (`SPLUNK_URL`); see the sketch below.
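
For reference, these are plain constants near the top of the script. A minimal sketch, assuming `AUTH` is a (username, password) pair suitable for HTTP basic auth (both values are placeholders):

```python
# Connection settings used by the script's Splunk REST calls.
# Both values are placeholders -- substitute your own.
SPLUNK_URL = "https://localhost:8089"  # Splunk management port
AUTH = ("admin", "changeme")           # (username, password) for basic auth
```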

AWS Configuration:
- Configure AWS credentials with access to the required S3 buckets (`DDSS_BUCKET_NAME` and `S2_BUCKET_NAME`): `aws configure` (see the sketch below).
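
Once `aws configure` has stored credentials, boto3 picks them up automatically. A minimal sketch of listing objects in the source bucket (the bucket name and prefix are examples):

```python
import boto3

# boto3 reads credentials from the environment or ~/.aws/credentials,
# so no keys need to be hard-coded in the script.
s3 = boto3.client("s3")
response = s3.list_objects_v2(Bucket="my-ddss-bucket", Prefix="requests_apm_prod/")
for obj in response.get("Contents", []):
    print(obj["Key"])
```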

Local File System:
- Buckets are assumed to be located at `/opt/splunk/var/lib/splunk`.

Run the script: `python dda_restore_workflow.py`

Generate Bucket Structure:
- The script scans the source S3 bucket (`DDSS_BUCKET_NAME`) and generates `bucket_structure.json` with the following statuses (see the classification sketch after this list):
  - `todo`: No local file or receipt found in S3.
  - `pendingupload`: Bucket exists locally but no receipt found in S3.
  - `pendingevict`: Bucket exists locally and a receipt was found in S3.
  - `done`: Receipt found in S3; the bucket no longer needs local processing.
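
A minimal sketch of that classification, assuming it reduces to two boolean checks (the arguments are hypothetical stand-ins for the script's `Hosts.data` and `receipt.json` lookups):

```python
def classify_bucket(exists_locally: bool, receipt_in_s3: bool) -> str:
    """Map the two checks onto the four documented statuses."""
    if receipt_in_s3:
        # Receipt present: evict if a local copy remains, otherwise done.
        return "pendingevict" if exists_locally else "done"
    # No receipt yet: upload if the bucket exists locally, otherwise todo.
    return "pendingupload" if exists_locally else "todo"
```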

Process Buckets:
- Enter the index name and number of buckets to rebuild locally (a rebuild sketch follows this list):
  - Index name: `requests_apm_prod` or any other valid index.
  - Number of buckets: Enter `0` to skip rebuilding and proceed to check pending uploads or evictions.
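
Rebuilding a bucket can be done with Splunk's documented `splunk rebuild` CLI command; whether the script uses exactly this call is an assumption, and the path below is an example under the default `/opt/splunk` install:

```python
import subprocess

# Rebuild one bucket via Splunk's CLI: splunk rebuild <bucket_dir>.
# The directory is an example based on LOCAL_BASE_PATH.
bucket_dir = ("/opt/splunk/var/lib/splunk/requests_apm_prod/db/"
              "db_1651670906_1651669557_0_CEB1E2B6-34F2-40EB-A2EF-10C5531556F9")
subprocess.run(["/opt/splunk/bin/splunk", "rebuild", bucket_dir], check=True)
```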

Restart Splunk:
- The script automatically restarts Splunk after processing buckets (see the sketch below).
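
The restart can be as simple as invoking the Splunk CLI, assuming the default install path:

```python
import subprocess

# Restart Splunk so the rebuilt buckets are picked up.
subprocess.run(["/opt/splunk/bin/splunk", "restart"], check=True)
```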

Upload Buckets:
- Buckets with the `pendingupload` status are uploaded to Splunk SmartStore.

Check Buckets:
- Monitors uploaded buckets and updates their status to `pendingevict` once a receipt is found (see the receipt-check sketch below).
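
A minimal sketch of the receipt check, assuming it is a plain existence test against the secondary S3 bucket (the key follows the receipt path format shown later in this document):

```python
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

def receipt_exists(bucket: str, key: str) -> bool:
    """Return True if receipt.json is present in the secondary S3 bucket."""
    try:
        s3.head_object(Bucket=bucket, Key=key)
        return True
    except ClientError:
        return False
```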

Evict Buckets:
- Buckets with `pendingevict` status are evicted from local storage, updating their status to `done` (see the eviction sketch below).
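
Eviction typically goes through Splunk's cache manager REST API. The `_evict` endpoint and its `path`/`mb_needed` parameters below are an assumption based on common SmartStore usage, so verify them against the actual script:

```python
import requests

SPLUNK_URL = "https://localhost:8089"
AUTH = ("admin", "changeme")  # placeholder credentials

# Assumed cache-manager eviction call -- verify endpoint and parameters.
bucket_path = ("/opt/splunk/var/lib/splunk/requests_apm_prod/db/"
               "db_1651670906_1651669557_0_CEB1E2B6-34F2-40EB-A2EF-10C5531556F9")
resp = requests.post(
    f"{SPLUNK_URL}/services/admin/cacheman/_evict",
    auth=AUTH,
    verify=False,  # local Splunk commonly runs with a self-signed cert
    data={"path": bucket_path, "mb_needed": 1},
)
resp.raise_for_status()
```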

Enter Index Name:
- Provide the name of the index to process (e.g., `requests_apm_prod`).
- Enter `0` to skip bucket rebuilding and proceed with pending uploads or evictions.

Enter Number of Buckets to Process:
- Specify how many buckets to process for rebuilding. Enter `0` to skip this step.

Source S3 bucket (`DDSS_BUCKET_NAME`) layout:

```
<index_name>/
    db_1651670906_1651669557_0_CEB1E2B6-34F2-40EB-A2EF-10C5531556F9/
```

Receipt path in the secondary S3 bucket (`S2_BUCKET_NAME`):

```
<index_name>/db/<sha1[0:2]>/<sha1[2:4]>/<bucketNum>~<serverGUID>/receipt.json
```

Local bucket path checked for `Hosts.data`:

```
/opt/splunk/var/lib/splunk/<index_name>/db/<bucket_name>/Hosts.data
```

A sketch deriving the receipt key from a bucket name follows.
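
To show how the receipt path relates to a DDSS bucket directory name, here is a hedged sketch. The assumption that the sha1 is computed over the `<bucketNum>~<serverGUID>` identifier is unverified; check it against the script and your deployment:

```python
import hashlib

def receipt_key(index_name: str, bucket_name: str) -> str:
    """Derive the receipt.json key for a DDSS bucket directory name.

    Assumes the sha1 in the SmartStore path is taken over the
    "<bucketNum>~<serverGUID>" identifier -- verify before use.
    """
    # e.g. db_1651670906_1651669557_0_CEB1E2B6-... -> "0~CEB1E2B6-..."
    _, _, _, bucket_num, guid = bucket_name.split("_")
    bucket_id = f"{bucket_num}~{guid}"
    sha1 = hashlib.sha1(bucket_id.encode()).hexdigest()
    return f"{index_name}/db/{sha1[0:2]}/{sha1[2:4]}/{bucket_id}/receipt.json"
```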

Example `bucket_structure.json`:

```json
{
  "requests_apm_prod": [
    {
      "bucket": "db_1651670906_1651669557_0_CEB1E2B6-34F2-40EB-A2EF-10C5531556F9",
      "status": "todo"
    }
  ]
}
```

Initial Status:
- Scanned from `DDSS_BUCKET_NAME` and categorized based on local and secondary S3 checks: `todo`, `pendingupload`, `pendingevict`, `done`.

After Rebuilding:
- Updated to `pendingupload`.

After Uploading:
- Monitored and updated to `pendingevict`.

After Evicting:
- Updated to `done` (the full lifecycle is summarized in the transition map below).
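
The lifecycle is a simple linear progression, which can be summarized as a transition map:

```python
# Status transitions as each workflow stage completes.
NEXT_STATUS = {
    "todo": "pendingupload",          # after rebuilding locally
    "pendingupload": "pendingevict",  # after upload is confirmed by a receipt
    "pendingevict": "done",           # after eviction from local storage
}
```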

- `generate_bucket_structure()`: Scans `DDSS_BUCKET_NAME` and generates the initial `bucket_structure.json`.
- `process_buckets()`: Processes buckets with `todo` status and updates them to `pendingupload`.
- `upload_buckets()`: Uploads buckets with `pendingupload` status to Splunk SmartStore.
- `check_buckets()`: Monitors uploaded buckets and updates their status to `pendingevict`.
- `evict_buckets()`: Evicts buckets with `pendingevict` status and updates them to `done`.

A hypothetical driver tying these together is sketched below.
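
Taken together, the functions form a linear pipeline. A hypothetical top-level driver (the actual script is interactive and prompts between steps):

```python
def main() -> None:
    # One pass through the documented workflow stages.
    generate_bucket_structure()  # scan S3 and write bucket_structure.json
    process_buckets()            # rebuild 'todo' buckets -> 'pendingupload'
    upload_buckets()             # push 'pendingupload' buckets to SmartStore
    check_buckets()              # confirm receipts -> 'pendingevict'
    evict_buckets()              # evict 'pendingevict' buckets -> 'done'

if __name__ == "__main__":
    main()
```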

AWS Permissions:
- Ensure the IAM user has `s3:ListBucket` and `s3:GetObject` permissions for `DDSS_BUCKET_NAME` and `S2_BUCKET_NAME` (see the example policy below).
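
A minimal IAM policy granting those permissions might look like this; the bucket names are placeholders for your actual `DDSS_BUCKET_NAME` and `S2_BUCKET_NAME` values:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetObject"],
      "Resource": [
        "arn:aws:s3:::my-ddss-bucket",
        "arn:aws:s3:::my-ddss-bucket/*",
        "arn:aws:s3:::my-s2-bucket",
        "arn:aws:s3:::my-s2-bucket/*"
      ]
    }
  ]
}
```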

Splunk API Errors:
- Verify Splunk credentials and ensure the server is accessible at `SPLUNK_URL` (a connectivity check is sketched below).
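
A quick way to verify connectivity and credentials is to query Splunk's `server/info` endpoint (placeholder credentials shown):

```python
import requests
import urllib3

urllib3.disable_warnings()  # silence self-signed-cert warnings from verify=False

SPLUNK_URL = "https://localhost:8089"
AUTH = ("admin", "changeme")  # placeholder credentials

# A 200 response confirms the URL, credentials, and management port.
resp = requests.get(f"{SPLUNK_URL}/services/server/info", auth=AUTH, verify=False)
resp.raise_for_status()
print("Splunk reachable:", resp.status_code)
```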

Local Path Errors:
- Ensure the local Splunk directory exists and matches the configured `LOCAL_BASE_PATH`.

Run the script: `python ddss-restore.py`
- Enter `requests_apm_prod` as the index and `5` for the number of buckets to process.
- Enter `0` as the number of buckets to skip rebuilding and proceed with pending uploads or evictions.
This script is provided under the MIT License. Use at your own risk.