Giganto: Raw-Event Storage System for AICE

Giganto is a high-performance raw-event storage system, specifically designed for AICE. It is optimized to receive and store raw events through QUIC channels and provides a flexible GraphQL API for querying the stored events. Giganto empowers AICE with the ability to efficiently handle large-scale data processing and real-time analytics.

Features

  • Scalable Storage: Giganto provides a scalable and distributed storage system, optimized for handling raw events generated by AICE sensors.
  • GraphQL API: Giganto offers a powerful and flexible GraphQL API, enabling developers to query stored events with ease.
  • QUIC Channels: Giganto utilizes the QUIC protocol to enable fast, low-latency communication and data transfer.
  • High Performance: Giganto is designed to efficiently handle high volumes of data, ensuring optimal performance for AICE.

Build

Build Requirements

  • libpcap
  • Clang & LLVM: required to build RocksDB

Usage

You can run Giganto by invoking the following command:

giganto -c <CONFIG_PATH> --cert <CERT_PATH> --key <KEY_PATH> --ca-certs \
<CA_CERT_PATH>[,<CA_CERT_PATH>,...] [--log-path <LOG_PATH>]

Arguments

| Name | Description | Required |
| --- | --- | --- |
| <CONFIG_PATH> | Path to the TOML configuration file. | Yes |
| <CERT_PATH> | Path to the certificate file. | Yes |
| <KEY_PATH> | Path to the private key file. | Yes |
| <CA_CERT_PATH> | Path to the CA certificates file. | Yes |
| <LOG_PATH> | Path to the log file where logs will be stored. | No |

Notes on Arguments

  • The --ca-certs argument accepts multiple values, separated by commas. You can also repeat the argument to specify multiple CA certificates.
  • Logging behavior based on the --log-path argument is as follows:
    • If <LOG_PATH> is not provided, logs are written to stdout using the tracing library.
    • If <LOG_PATH> is provided and writable, logs are written to the specified file using the tracing library.
    • If <LOG_PATH> is provided but not writable, Giganto will terminate.
    • Any logs generated before the tracing functionality is initialized will be written directly to stdout or stderr using println, eprintln, or similar.
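
The logging rules above can be sketched with the standard library alone. This is a minimal illustration, not Giganto's implementation: the function name `open_log_sink` and its signature are assumptions, and the real program routes logs through the tracing library rather than a bare writer.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Opens the log destination for an optional `--log-path` value.
///
/// Hypothetical helper: the name and signature are assumptions made for
/// this sketch and are not taken from Giganto's code.
fn open_log_sink(log_path: Option<&Path>) -> io::Result<Box<dyn Write>> {
    match log_path {
        // No path given: logs go to stdout.
        None => Ok(Box::new(io::stdout())),
        // Path given: the file must be writable. If it cannot be opened,
        // the error is returned so that the caller can terminate.
        Some(path) => {
            let file = OpenOptions::new().create(true).append(true).open(path)?;
            Ok(Box::new(file))
        }
    }
}
```

A caller would typically match on the result and exit with a message on `Err`, which corresponds to the "provided but not writable" case in the list.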

Example

  • Run Giganto with a local configuration file and multiple CA certificates:
giganto -c path/to/config.toml --cert /path/to/cert.pem --key /path/to/key.pem \
--ca-certs /path/to/ca_cert1.pem,/path/to/ca_cert2.pem
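
Because --ca-certs accepts both comma-separated values and repeated occurrences, the collected values have to be flattened into one list of paths. The sketch below shows one way to do that. The function name `collect_ca_cert_paths` is hypothetical, and trimming whitespace and skipping empty entries are assumptions of this sketch, not documented Giganto behavior.

```rust
/// Flattens the values passed to `--ca-certs` into a single list of paths.
///
/// Each element of `values` stands for one occurrence of the flag, and an
/// occurrence may itself contain several comma-separated paths.
/// Hypothetical helper, not taken from Giganto's code.
fn collect_ca_cert_paths(values: &[&str]) -> Vec<String> {
    values
        .iter()
        .flat_map(|value| value.split(','))
        // Assumption: surrounding whitespace is ignored.
        .map(str::trim)
        // Assumption: empty entries (e.g. from a trailing comma) are skipped.
        .filter(|path| !path.is_empty())
        .map(String::from)
        .collect()
}
```

With this helper, a single comma-separated value and two separate occurrences of the flag produce the same list.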

Configuration

In the config file, you can specify the following options:

| Field | Description | Required | Default |
| --- | --- | --- | --- |
| ingest_srv_addr | Address to listen for ingest QUIC | No | [::]:38370 |
| publish_srv_addr | Address to listen for publish QUIC | No | [::]:38371 |
| graphql_srv_addr | Giganto's GraphQL address | No | [::]:8443 |
| data_dir | Path to directory to store data | Yes | - |
| retention | Retention period for data | No | 100d |
| export_dir | Path to Giganto's export file | Yes | - |
| max_open_files | Max open files for database | No | 8000 |
| max_mb_of_level_base | Max MB for RocksDB Level 1 | No | 512 |
| num_of_thread | Number of background threads for DB | No | 8 |
| max_subcompactions | Number of sub-compactions allowed | No | 2 |
| ack_transmission | Ack count for ingestion data | No | 1024 |
| compression | Enable RocksDB compression | No | false |
| addr_to_peers | Address to listen for peer QUIC | No | 254.254.254.254:38383 |
| peers | List of peer addresses and hostnames | No | - |
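
To make the defaults easier to scan, the sketch below collects the optional fields that have a plain default into a typed value. It is illustrative only: the struct name `OptionalSettings` and the chosen integer widths are assumptions, the peer-related fields are left out, and Giganto's actual configuration types may look different.

```rust
use std::net::{IpAddr, Ipv6Addr, SocketAddr};

/// Hypothetical typed view of the optional fields with plain defaults.
/// Field names follow the configuration keys; the types are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct OptionalSettings {
    ingest_srv_addr: SocketAddr,
    publish_srv_addr: SocketAddr,
    graphql_srv_addr: SocketAddr,
    retention: String,
    max_open_files: u64,
    max_mb_of_level_base: u64,
    num_of_thread: u64,
    max_subcompactions: u64,
    ack_transmission: u64,
    compression: bool,
}

impl Default for OptionalSettings {
    fn default() -> Self {
        // `[::]` is the IPv6 unspecified address.
        let any = |port: u16| SocketAddr::new(IpAddr::V6(Ipv6Addr::UNSPECIFIED), port);
        Self {
            ingest_srv_addr: any(38370),
            publish_srv_addr: any(38371),
            graphql_srv_addr: any(8443),
            retention: String::from("100d"),
            max_open_files: 8000,
            max_mb_of_level_base: 512,
            num_of_thread: 8,
            max_subcompactions: 2,
            ack_transmission: 1024,
            compression: false,
        }
    }
}
```

Fields present in a configuration file would override the corresponding default. data_dir and export_dir have no default and must always be supplied.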

The following is an example configuration file:

ingest_srv_addr = "0.0.0.0:38370"
publish_srv_addr = "0.0.0.0:38371"
graphql_srv_addr = "127.0.0.1:8443"
data_dir = "tests/data"
retention = "100d"
export_dir = "/opt/clumit/var/giganto/export"
max_open_files = 8000
max_mb_of_level_base = 512
num_of_thread = 8
max_subcompactions = 2
ack_transmission = 1024
addr_to_peers = "10.10.11.1:38383"
peers = [ { addr = "10.10.12.1:38383", hostname = "ai" } ]

For max_mb_of_level_base, the last level (level 6) has 100,000 times the capacity of level 1 and holds about 90% of the total capacity. A value of about db_total_mb / 111111 is therefore appropriate. For example, 90 MB or less is appropriate for a 10 TB database, and 900 MB or less for a 100 TB database.

These values assume that all levels up to level 6 are in use, so the actual values may differ if you expect the data to grow further. If the calculated value is less than 512 MB, keeping the default value of 512 MB is recommended.
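
The arithmetic behind this rule of thumb can be checked with a short sketch. The divisor 111111 is the sum 1 + 10 + 100 + 1,000 + 10,000 + 100,000, which is the total capacity of six levels relative to level 1 when each level is ten times larger than the previous one. The function names are hypothetical, and 1 TB is taken as 1,000,000 MB to match the figures in the text.

```rust
/// Total capacity of levels 1 through 6 relative to level 1, assuming each
/// level is ten times larger than the previous one.
const LEVEL_CAPACITY_SUM: u64 = 1 + 10 + 100 + 1_000 + 10_000 + 100_000;

/// Default value of `max_mb_of_level_base`, in MB.
const DEFAULT_MAX_MB_OF_LEVEL_BASE: u64 = 512;

/// Level-1 size in MB estimated from the expected total database size.
/// Hypothetical helper, not part of Giganto.
fn estimated_level_base_mb(db_total_mb: u64) -> u64 {
    db_total_mb / LEVEL_CAPACITY_SUM
}

/// Applies the recommendation to keep the default when the estimate is
/// smaller than 512 MB. Hypothetical helper, not part of Giganto.
fn suggested_max_mb_of_level_base(db_total_mb: u64) -> u64 {
    estimated_level_base_mb(db_total_mb).max(DEFAULT_MAX_MB_OF_LEVEL_BASE)
}
```

For a 10 TB database (10,000,000 MB) the estimate is 90 MB, so the suggested setting stays at the default of 512. For a 100 TB database the estimate is 900 MB, which exceeds the default.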

If the configuration file has no addr_to_peers option, Giganto runs in standalone mode. If the option is present, Giganto runs in cluster mode for P2P.
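
The mode selection can be pictured as a match on an optional value. This is a sketch under assumptions: the `RunMode` enum and the `run_mode` function are hypothetical names, and the configuration value is modeled as an optional string that is parsed into a socket address.

```rust
use std::net::{AddrParseError, SocketAddr};

/// Hypothetical representation of the two modes described above.
#[derive(Debug, PartialEq)]
enum RunMode {
    /// No `addr_to_peers` in the configuration file.
    Standalone,
    /// `addr_to_peers` is present; peers connect to this address.
    Cluster { listen_addr: SocketAddr },
}

/// Chooses the mode from the optional `addr_to_peers` value.
/// Hypothetical helper, not taken from Giganto's code.
fn run_mode(addr_to_peers: Option<&str>) -> Result<RunMode, AddrParseError> {
    match addr_to_peers {
        None => Ok(RunMode::Standalone),
        Some(addr) => Ok(RunMode::Cluster {
            listen_addr: addr.parse()?,
        }),
    }
}
```

With the example configuration above, `run_mode(Some("10.10.11.1:38383"))` yields the cluster variant, while a file without the option yields `RunMode::Standalone`.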

Database Compression Metadata

Giganto stores the compression setting in a metadata file named COMPRESSION within the database directory (specified by data_dir). This file tracks whether RocksDB compression is enabled or disabled for the database.

On startup, Giganto validates that the current compression configuration matches the setting stored in the COMPRESSION metadata file. If there is a mismatch, Giganto will report an error and refuse to start. This prevents data corruption or access issues that could result from changing the compression scheme of an existing database.

Important: Once a database is created with a specific compression setting, changing the compression configuration option is not supported. If you need to change the compression setting, you must recreate the database.
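
The startup check described in this section might look roughly like the following. This is a sketch, not Giganto's code: the function name is hypothetical, the on-disk format of the COMPRESSION file (the text `true` or `false`) is an assumption, and so is the choice to create the file when it does not exist yet.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Name of the metadata file inside `data_dir`.
const METADATA_FILE: &str = "COMPRESSION";

/// Verifies that the configured compression setting matches the stored one.
///
/// Hypothetical helper. ASSUMPTION: the file holds the text "true" or
/// "false"; the real format is not documented here.
fn check_compression_metadata(data_dir: &Path, configured: bool) -> io::Result<()> {
    let path = data_dir.join(METADATA_FILE);
    match fs::read_to_string(&path) {
        Ok(stored) => {
            if stored.trim() == configured.to_string() {
                Ok(())
            } else {
                // Mismatch: report an error so that startup is refused.
                Err(io::Error::new(
                    io::ErrorKind::InvalidInput,
                    format!(
                        "compression mismatch: stored {}, configured {}",
                        stored.trim(),
                        configured
                    ),
                ))
            }
        }
        // ASSUMPTION: a missing file means a new database, so the current
        // setting is recorded for later runs.
        Err(err) if err.kind() == io::ErrorKind::NotFound => {
            fs::write(&path, configured.to_string())
        }
        Err(err) => Err(err),
    }
}
```

Returning an error instead of silently overwriting the file keeps the check in line with the rule that an existing database must be recreated to change its compression setting.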

Test

Run Giganto with the prepared configuration file, using the certificate and key from the tests folder:

cargo run -- -c tests/config.toml --cert tests/certs/node1/cert.pem \
--key tests/certs/node1/key.pem --ca-certs tests/certs/ca_cert.pem

Development

Generating GraphQL Schema

To generate the GraphQL schema file, use the gen_schema binary:

cargo run --bin gen_schema --no-default-features

This command reads the GraphQL API definitions and exports the schema to src/graphql/client/schema/schema.graphql.

License

Copyright 2022-2025 ClumL Inc.

Licensed under Apache License, Version 2.0 (the "License"); you may not use this crate except in compliance with the License.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See LICENSE for the specific language governing permissions and limitations under the License.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be licensed as above, without any additional terms or conditions.
