Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
124 changes: 123 additions & 1 deletion eloq_data_store_service/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -131,7 +131,7 @@ To start the DataStore Service:
./dss_server --config=/path/to/s3_config.ini
```

Where s3_config.ini contains:
Where s3_config.ini contains (legacy format):
```ini
[store]
rocksdb_cloud_bucket_name = my-eloqdata-bucket
Expand All @@ -142,6 +142,26 @@ To start the DataStore Service:
aws_secret_key = YOUR_SECRET_KEY
```

**New URL-based configuration (Recommended):**
```ini
[store]
rocksdb_cloud_s3_url = s3://my-eloqdata-bucket/rocksdb_cloud
rocksdb_cloud_region = us-west-2
rocksdb_cloud_sst_file_cache_size = 40GB
aws_access_key_id = YOUR_ACCESS_KEY
aws_secret_key = YOUR_SECRET_KEY
```

**For MinIO or S3-compatible storage:**
```ini
[store]
rocksdb_cloud_s3_url = http://localhost:9000/my-bucket/rocksdb_data
rocksdb_cloud_region = us-east-1
rocksdb_cloud_sst_file_cache_size = 40GB
aws_access_key_id = minioadmin
aws_secret_key = minioadmin
```

8. **Run with logging to stderr**:
```bash
./dss_server --alsologtostderr --ip=192.168.1.100 --port=9200
Expand Down Expand Up @@ -176,6 +196,108 @@ For a production deployment with multiple nodes, you would typically:
./dss_server --eloq_dss_peer_node=192.168.1.100:9100 --ip=192.168.1.101 --port=9100 --data_path=/data/eloqdata/node2 --config=/etc/eloqdata/dss_prod.ini
```

## RocksDB Cloud Configuration Migration Guide

### Overview

Starting from this version, we've introduced a simplified URL-based configuration for RocksDB Cloud (S3/GCS) to reduce configuration complexity. The new `rocksdb_cloud_s3_url` option consolidates multiple configuration parameters into a single URL.

### Why Migrate?

The legacy configuration required multiple separate parameters:
- `rocksdb_cloud_bucket_name`
- `rocksdb_cloud_bucket_prefix`
- `rocksdb_cloud_object_path`
- `rocksdb_cloud_s3_endpoint_url`

The new URL-based configuration simplifies this to a single parameter that's easier to understand and manage.

### Migration Examples

#### Example 1: Standard AWS S3 Configuration

**Legacy Configuration:**
```ini
[store]
rocksdb_cloud_bucket_name = my-production-bucket
rocksdb_cloud_bucket_prefix = eloqkv-
rocksdb_cloud_object_path = rocksdb_data
```

**New URL-based Configuration:**
```ini
[store]
rocksdb_cloud_s3_url = s3://my-production-bucket/rocksdb_data
```

**Note:** The `bucket_prefix` is not supported in URL-based configuration. If you were using `eloqkv-my-production-bucket`, you should include the prefix in the bucket name: `s3://eloqkv-my-production-bucket/rocksdb_data`

#### Example 2: MinIO or S3-compatible Storage

**Legacy Configuration:**
```ini
[store]
rocksdb_cloud_bucket_name = test-bucket
rocksdb_cloud_object_path = eloqdata
rocksdb_cloud_s3_endpoint_url = http://localhost:9000
```

**New URL-based Configuration:**
```ini
[store]
rocksdb_cloud_s3_url = http://localhost:9000/test-bucket/eloqdata
```

#### Example 3: HTTPS S3-compatible Storage

**Legacy Configuration:**
```ini
[store]
rocksdb_cloud_bucket_name = my-bucket
rocksdb_cloud_object_path = data/rocksdb
rocksdb_cloud_s3_endpoint_url = https://s3.custom-provider.com
```

**New URL-based Configuration:**
```ini
[store]
rocksdb_cloud_s3_url = https://s3.custom-provider.com/my-bucket/data/rocksdb
```

### TxLog (Log Service) Configuration

The same migration applies to transaction log service configuration, with `txlog_` prefix:

**Legacy:**
```ini
[local]
txlog_rocksdb_cloud_bucket_name = txlog-bucket
txlog_rocksdb_cloud_bucket_prefix = txlog-
txlog_rocksdb_cloud_object_path = logs
txlog_rocksdb_cloud_endpoint_url =
```

**New:**
```ini
[local]
txlog_rocksdb_cloud_s3_url = s3://txlog-bucket/logs
```

### Important Notes

1. **Precedence:** If both the new URL-based configuration and legacy configuration are provided, the URL-based configuration takes precedence and overrides the legacy settings.

2. **Bucket Prefix Deprecation:** The `bucket_prefix` parameter is not supported in URL-based configuration. If you need a prefix, include it in the bucket name within the URL.

3. **Backward Compatibility:** The legacy configuration options are still supported and will continue to work. However, we recommend migrating to the URL-based configuration for better maintainability.

4. **Protocol Support:** The following protocols are supported:
- `s3://` - AWS S3 (default endpoint)
- `http://` - Custom S3-compatible endpoint (HTTP)
- `https://` - Custom S3-compatible endpoint (HTTPS)

5. **Validation:** Invalid URLs will cause the application to fail at startup with a descriptive error message.

## Storage Backend Configuration

The DataStore Service is compiled with support for specific backend storage technologies. The build defines determine which backend is used. Additional backend-specific configuration can be set in the configuration file.
Expand Down
34 changes: 30 additions & 4 deletions eloq_data_store_service/RocksDB_Configuration_Flags.md
Original file line number Diff line number Diff line change
Expand Up @@ -87,13 +87,39 @@ These flags are only applicable when RocksDB Cloud is enabled with either S3 or

### Cloud Storage Configuration

**New URL-based Configuration (Recommended):**

| Flag Name | Required | Default Value | Format | Description |
|-----------|----------|--------------|--------|-------------|
| `rocksdb_cloud_s3_url` | No | `""` | S3 URL | Complete S3 URL. Takes precedence over legacy config if both provided. |

**URL Format Examples:**
- AWS S3: `s3://my-bucket/my-object-path`
- MinIO (HTTP): `http://localhost:9000/my-bucket/my-object-path`
- S3-compatible (HTTPS): `https://s3.amazonaws.com/my-bucket/my-object-path`

**URL Format Specification:**
- S3: `s3://{bucket_name}/{object_path}`
- HTTP/HTTPS: `http(s)://{host}:{port}/{bucket_name}/{object_path}`

**Supported Protocols:** `s3://`, `http://`, `https://`

**Legacy Configuration (Deprecated, use URL-based config instead):**

| Flag Name | Required | Default Value | Format | Description |
|-----------|----------|--------------|--------|-------------|
| `rocksdb_cloud_bucket_name` | No | `"rocksdb-cloud-test"` | String | Cloud storage bucket name (deprecated) |
| `rocksdb_cloud_bucket_prefix` | No | `"eloqkv-"` | String | Prefix for objects in the bucket (deprecated, not supported in URL config) |
| `rocksdb_cloud_object_path` | No | `"rocksdb_cloud"` | String | Path within the bucket (deprecated) |
| `rocksdb_cloud_s3_endpoint_url` | No | `""` | String | S3-compatible object store endpoint URL (deprecated) |

**Note:** If both `rocksdb_cloud_s3_url` and legacy options are provided, the URL-based configuration takes precedence and overrides the legacy settings.

**Other Cloud Configuration:**

| Flag Name | Required | Default Value | Format | Description |
|-----------|----------|--------------|--------|-------------|
| `rocksdb_cloud_bucket_name` | No | `"rocksdb-cloud-test"` | String | Cloud storage bucket name |
| `rocksdb_cloud_bucket_prefix` | No | `"eloqkv-"` | String | Prefix for objects in the bucket |
| `rocksdb_cloud_object_path` | No | `"rocksdb_cloud"` | String | Path within the bucket |
| `rocksdb_cloud_region` | No | `"ap-northeast-1"` | String | Cloud region |
| `rocksdb_cloud_s3_endpoint_url` | No | `""` | String | S3-compatible object store endpoint URL (for development) |

### Cloud Cache Configuration

Expand Down
61 changes: 49 additions & 12 deletions eloq_data_store_service/rocksdb_cloud_data_store.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -239,13 +239,13 @@ rocksdb::S3ClientFactory RocksDBCloudDataStore::BuildS3ClientFactory(
if (secured_url)
{
config.scheme = Aws::Http::Scheme::HTTPS;
// Disable SSL verification for test env if necessary
// config.verifySSL = false;
}
else
{
config.scheme = Aws::Http::Scheme::HTTP;
}
// Disable SSL verification for HTTPS
config.verifySSL = false;

// Create and return the S3 client
if (credentialsProvider)
Expand Down Expand Up @@ -296,14 +296,51 @@ bool RocksDBCloudDataStore::StartDB()
}
#endif

cfs_options_.src_bucket.SetBucketName(cloud_config_.bucket_name_,
cloud_config_.bucket_prefix_);
// Determine effective bucket configuration
// S3 URL takes precedence over legacy configuration
std::string effective_bucket_name = cloud_config_.bucket_name_;
std::string effective_bucket_prefix = cloud_config_.bucket_prefix_;
std::string effective_object_path = cloud_config_.object_path_;
std::string effective_endpoint_url = cloud_config_.s3_endpoint_url_;

if (!cloud_config_.s3_url_.empty())
{
// Parse S3 URL and use it (overrides legacy config)
S3UrlComponents url_components = ParseS3Url(cloud_config_.s3_url_);
if (!url_components.is_valid)
{
LOG(FATAL) << "Invalid rocksdb_cloud_s3_url: "
<< url_components.error_message
<< ". URL format: s3://{bucket}/{path} or "
"http(s)://{host}:{port}/{bucket}/{path}. "
<< "Examples: s3://my-bucket/my-path, "
<< "http://localhost:9000/my-bucket/my-path";
}

effective_bucket_name = url_components.bucket_name;
effective_bucket_prefix = ""; // No prefix in URL-based config
effective_object_path = url_components.object_path;
effective_endpoint_url = url_components.endpoint_url;

LOG(INFO) << "Using S3 URL configuration (overrides legacy config if "
"present): "
<< cloud_config_.s3_url_
<< " (bucket: " << effective_bucket_name
<< ", object_path: " << effective_object_path
<< ", endpoint: "
<< (effective_endpoint_url.empty() ? "default"
: effective_endpoint_url)
<< ")";
}

cfs_options_.src_bucket.SetBucketName(effective_bucket_name);
cfs_options_.src_bucket.SetBucketPrefix(effective_bucket_prefix);
cfs_options_.src_bucket.SetRegion(cloud_config_.region_);
cfs_options_.src_bucket.SetObjectPath(cloud_config_.object_path_);
cfs_options_.dest_bucket.SetBucketName(cloud_config_.bucket_name_,
cloud_config_.bucket_prefix_);
cfs_options_.src_bucket.SetObjectPath(effective_object_path);
cfs_options_.dest_bucket.SetBucketName(effective_bucket_name);
cfs_options_.dest_bucket.SetBucketPrefix(effective_bucket_prefix);
cfs_options_.dest_bucket.SetRegion(cloud_config_.region_);
cfs_options_.dest_bucket.SetObjectPath(cloud_config_.object_path_);
cfs_options_.dest_bucket.SetObjectPath(effective_object_path);
// Add sst_file_cache for accerlating random access on sst files
// use 2^5 = 32 shards for the cache, each shard has sst_file_cache_size_/32
// bytes capacity
Expand Down Expand Up @@ -338,10 +375,10 @@ bool RocksDBCloudDataStore::StartDB()
<< cfs_options_.purger_periodicity_millis << " ms"
<< ", run_purger: " << cfs_options_.run_purger;

if (!cloud_config_.s3_endpoint_url_.empty())
if (!effective_endpoint_url.empty())
{
cfs_options_.s3_client_factory =
BuildS3ClientFactory(cloud_config_.s3_endpoint_url_);
BuildS3ClientFactory(effective_endpoint_url);
// Intermittent and unpredictable IOError happend from time to
// time when using aws transfer manager with minio. Disable aws
// transfer manager if endpoint is set (minio).
Expand Down Expand Up @@ -447,8 +484,8 @@ bool RocksDBCloudDataStore::OpenCloudDB(
// boost write performance by enabling unordered write
options.unordered_write = true;
// skip Consistency check, which compares the actual file size with the size
// recorded in the metadata, which can fail when skip_cloud_files_in_getchildren is
// set to true
// recorded in the metadata, which can fail when
// skip_cloud_files_in_getchildren is set to true
options.paranoid_checks = false;

// print db statistics every 60 seconds
Expand Down
51 changes: 43 additions & 8 deletions eloq_data_store_service/rocksdb_cloud_dump.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -39,16 +39,23 @@
#include <unordered_map>
#include <vector>

#include "rocksdb_config.h" // For S3UrlComponents and ParseS3Url

// Define command line flags
DEFINE_string(aws_access_key_id, "", "AWS access key ID");
DEFINE_string(aws_secret_key, "", "AWS secret access key");
DEFINE_string(bucket_name, "", "S3 bucket name");
DEFINE_string(bucket_prefix, "", "S3 bucket prefix");
DEFINE_string(s3_url,
"",
"S3 URL. Format: s3://{bucket}/{path} or "
"http(s)://{host}:{port}/{bucket}/{path}. "
"Takes precedence over legacy config if provided");
DEFINE_string(bucket_name, "", "S3 bucket name (legacy, use s3_url instead)");
DEFINE_string(bucket_prefix, "", "S3 bucket prefix (legacy, not supported with s3_url)");
DEFINE_string(object_path,
"rocksdb_cloud",
"S3 object path for RocksDB Cloud storage");
"S3 object path for RocksDB Cloud storage (legacy, use s3_url instead)");
DEFINE_string(region, "us-east-1", "AWS region");
DEFINE_string(s3_endpoint, "", "Custom S3 endpoint URL (optional)");
DEFINE_string(s3_endpoint, "", "Custom S3 endpoint URL (optional, legacy, use s3_url instead)");
DEFINE_string(db_path, "./db", "Local DB path");
DEFINE_bool(list_cf, false, "List all column families");
DEFINE_bool(opendb, false, "Open the DB only");
Expand Down Expand Up @@ -114,11 +121,39 @@ CmdLineParams parse_arguments()

params.aws_access_key_id = FLAGS_aws_access_key_id;
params.aws_secret_key = FLAGS_aws_secret_key;
params.bucket_name = FLAGS_bucket_name;
params.bucket_prefix = FLAGS_bucket_prefix;
params.object_path = FLAGS_object_path;

// Check if s3_url was provided (takes precedence over legacy config)
if (!FLAGS_s3_url.empty())
{
EloqDS::S3UrlComponents url_components = EloqDS::ParseS3Url(FLAGS_s3_url);
if (!url_components.is_valid)
{
throw std::runtime_error("Invalid s3_url: " + url_components.error_message +
". URL format: s3://{bucket}/{path} or "
"http(s)://{host}:{port}/{bucket}/{path}");
}

params.bucket_name = url_components.bucket_name;
params.bucket_prefix = ""; // No prefix in URL-based config
params.object_path = url_components.object_path;
params.s3_endpoint_url = url_components.endpoint_url;

LOG(INFO) << "Using S3 URL configuration: " << FLAGS_s3_url
<< " (bucket: " << params.bucket_name
<< ", object_path: " << params.object_path
<< ", endpoint: " << (params.s3_endpoint_url.empty() ? "default" : params.s3_endpoint_url)
<< ")";
}
else
{
// Use legacy configuration
params.bucket_name = FLAGS_bucket_name;
params.bucket_prefix = FLAGS_bucket_prefix;
params.object_path = FLAGS_object_path;
params.s3_endpoint_url = FLAGS_s3_endpoint;
}

params.region = FLAGS_region;
params.s3_endpoint_url = FLAGS_s3_endpoint;
params.db_path = FLAGS_db_path;
params.list_cf = FLAGS_list_cf;
params.opendb = FLAGS_opendb;
Expand Down
Loading