# Smart Shelf — Big Data Project

End-to-end (simulated) system to monitor stock and batches for a supermarket, generate real-time events (foot traffic, shelf interactions, POS transactions), and apply alerting and replenishment logic via Kafka + Spark Structured Streaming, with persistence on Delta Lake and PostgreSQL.

This repository is meant to be run primarily via `docker compose`.

Authors of the project:
- Camilla Bonomo
- Chiara Belli
- Enrico Maria Guarnuto

## Table of Contents
- Repository Structure
- How to Run (Quickstart)
- Project Overview
- Architecture
- Tech Stack
- Data & Topics
- Simulated Time
- Kafka (Producers & Consumers)
- Spark Apps
- Kafka Connect
- PostgreSQL Schemas
- Streamlit Dashboard
- ML / Demand Forecasting
- Performance
- Lessons learned
- Limitations and future work
- References and acknowledgments
## Repository Structure

```
.
├── docker-compose.yml           # Full stack (Kafka/Redis/Postgres/Kafka Connect/Spark apps/ML dashboard)
├── conf/                        # Spark config (spark-defaults.conf)
├── data/                        # Synthetic datasets + generators + CSVs for Postgres bootstrap
│   ├── create_db.py
│   ├── create_simulation_data.py
│   ├── create_discounts.py
│   ├── store_inventory_final.parquet
│   ├── warehouse_inventory_final.parquet
│   ├── store_batches.parquet
│   ├── warehouse_batches.parquet
│   ├── all_discounts.parquet
│   └── sim_out/                 # Generated simulation outputs
├── delta/                       # Local Delta Lake (raw/cleansed/curated/ops/analytics + staging + _checkpoints)
├── dashboard/                   # Streamlit dashboard (map, alerts, state, ML)
│   ├── app.py
│   ├── store_layout.yaml
│   └── assets/
├── images/                      # Logo + diagrams + dashboard screenshots
├── kafka-components/            # Kafka components (topics, connect, producers)
│   ├── kafka-init/
│   ├── kafka-connect/
│   ├── kafka-producer-foot_traffic/
│   ├── kafka-producer-shelf/
│   ├── kafka-producer-pos/
│   ├── redis-kafka-bridge/
│   ├── shelf-daily-features/
│   ├── daily-discount-manager/
│   ├── removal-scheduler/
│   └── wh-supplier-manager/
├── ml-model/                    # ML services (training + inference)
│   ├── training-service/
│   └── inference-service/
├── models/                      # Model artifacts + feature columns (mounted to /models)
├── postgresql/                  # DDL + init SQL (ref/config/state/ops/analytics + views)
│   ├── 01_make_db.sql
│   ├── 02_load_ref_from_csv.sql
│   ├── 03_load_config.sql
│   ├── 04_load_features_panel.sql
│   └── 05_views.sql
├── simulated_time/              # Shared simulated clock (Redis)
└── spark-apps/                  # Spark Structured Streaming jobs
    ├── deltalake/
    ├── foot-traffic-raw-sink/
    ├── shelf-aggregator/
    ├── batch-state-updater/
    ├── shelf-alert-engine/
    ├── shelf-restock-manager/
    ├── shelf-refill-bridge/
    ├── wh-aggregator/
    ├── wh-batch-state-updater/
    ├── wh-alert-engine/
    └── alerts-sink/
```
## How to Run (Quickstart)

Prerequisites:
- Docker + Docker Compose
- Internet connection during build (some images download dependencies/JARs)
- Python 3.x (required for pre-launch data generation)
- Minimum resources (full stack, all services):
  - CPU: 8 vCPU
  - RAM: 16 GB

Below that (e.g., 4 vCPU / 8 GB) the stack may still boot, but Spark jobs will contend heavily with Kafka/Postgres; expect frequent backpressure, long startup times, or OOMs during backfill.
Clone the repository:

```
git clone https://github.com/EnricoMGuarnuto/Big_Data_Project.git
cd Big_Data_Project
```

Once the repository is cloned, run the three scripts that create the simulation data:

```
python data/create_db.py
python data/create_discounts.py
python data/create_simulation_data.py
```

Start the stack:

```
docker compose up -d --build
```

Alternative (limited CPU/RAM): start without the Kafka producers (simulators), then enable them once the stack is stable.
Run the stack without the Kafka producers (simulators):

```
docker compose up -d --build $(docker compose config --services | grep -v '^kafka-producer')
```

Wait ~5–8 minutes, then enable the Kafka producers (simulators):

```
docker compose up -d --build $(docker compose config --services | grep '^kafka-producer')
```

Verify services are up:
```
docker compose ps
```

Main services exposed:
- Postgres: `localhost:5432`
- Kafka Connect: `localhost:8083`
- Dashboard (optional): `http://localhost:8501`

Stop the stack:

```
docker compose down
```

For prerequisites, optional services, and troubleshooting see How to Run.
## Project Overview

This project simulates a Smart Shelf retail system end to end: from raw store activity, to operational decisions, to a live dashboard and a forecasting model.
In plain terms, the goal is to answer:
- What is happening in the store right now?
- What needs attention soon (alerts)?
- What should we restock, and when?
- Can we forecast demand early enough to plan supplier deliveries?
- Simulate store activity. We generate realistic streams of events such as customer traffic, shelf interactions (pickup/putback/weight change), POS sales, and warehouse movements.
- Build the current state. Streaming jobs continuously turn those raw events into the latest “truth” (current shelf levels, warehouse levels, and batch/expiry state).
- Trigger operational decisions. On top of the state, the system produces alerts and replenishment plans (store restock and supplier plans).
- Forecast ahead of cutoff times. An ML pipeline runs on daily features and predicts how many batches should be ordered before supplier cutoffs.
- Make everything visible. A dashboard surfaces the live map, alerts, states, plans, and ML outputs so the system can be inspected during a run.
- Aim: show a complete, working blueprint of a modern retail data pipeline that supports real-time monitoring and decision-making.
- Ambition: go beyond “just streaming” by combining simulation, stateful processing, operational planning, and ML-driven forecasting in one coherent system.
Note on workload and data realism: this project is computationally heavy because it simulates the full pipeline. In a production scenario, much of this data would come from real retailer APIs or existing datasets. Since those were not available, we generated synthetic event and inventory/batch data to drive and validate the system under realistic assumptions. The synthetic catalog and structures were modeled after real operational data provided by a Conad supermarket (Scandiano (RE), Italy), giving the simulation a representative mix of product categories and turnover patterns.
## Architecture

Current implemented ingestion path for simulated store events: data generators → Redis Streams → `redis-kafka-bridge` → Kafka → Spark.
## Tech Stack

| Technology | Role/Purpose |
|---|---|
| Python | Main language (dataset generation, Kafka producers, services/apps) |
| Docker / Docker Compose | Containerization and orchestration of the whole stack |
| Apache Kafka | Message broker for real-time streaming across components |
| Kafka Connect (JDBC) | PostgreSQL ↔ Kafka synchronization (metadata sources + state/ops sinks) |
| Redis Streams | “Before Kafka” buffer for producers (resilience/backpressure) |
| Apache Spark 3.5 (Structured Streaming) | Streaming processing, aggregations, alerting, and flow orchestration |
| Delta Lake | Filesystem-based data lake (./delta) for raw/cleansed/ops/analytics layers + checkpoints |
| PostgreSQL 16 | Database for snapshots (ref), config/policy (config), state (state), ops (ops) and analytics |
- Python — Main programming language. Used everywhere: data generation, producers/consumers, Spark job logic, ML services, and the Streamlit dashboard. It keeps the project consistent and fast to iterate on.
- Docker / Docker Compose — Containerization and orchestration. This project depends on many services (Kafka, Postgres, Spark, Redis, etc.). Compose makes the environment reproducible and lets you start/stop the full stack with one command.
- Apache Kafka — Distributed event streaming platform. Kafka is the central “event bus.” Producers write events once, and multiple consumers can process them independently. We also use compacted topics to represent the latest state/metadata.
- Apache ZooKeeper — Kafka coordination (single-broker setup). The Kafka image used here still relies on ZooKeeper for broker coordination, so we run it as part of the stack.
- Kafka Connect + JDBC Connector — Data integration layer. This is the “bridge” between Postgres and Kafka. It sources reference/config tables into Kafka and sinks compacted state back into Postgres without custom ingestion code.
- Redis Streams + Redis-Kafka bridge — Buffer and backpressure layer. Simulated producers write only to Redis Streams. A dedicated `redis-kafka-bridge` service reads Redis via consumer groups and forwards to Kafka, acknowledging Redis messages only after Kafka publish success.
- kafka-python — Kafka client for Python. Used by the simulated producers and some utility jobs to publish/consume Kafka messages from Python.
- confluent-kafka — Kafka client for high-throughput services. Used by services like the supplier/discount managers where a faster native client is helpful.
- redis-py — Redis client for Python. Used to read/write Redis Streams and simulated-time keys from Python components.
- Apache Spark (Structured Streaming) — Stateful stream processing. Spark is the main stream processor. It builds shelf/warehouse state, generates alerts, and computes restock/supplier plans using stateful streaming + checkpoints.
- Delta Lake (`delta-spark` + `deltalake`) — ACID tables on files. Delta Lake gives us reliable tables on top of local files (`./delta`). We use it for raw/cleansed/ops/analytics layers plus streaming sinks and checkpoints.
- Apache Arrow / Parquet (`pyarrow`) — Columnar storage and I/O. Parquet is the main file format for bootstrap data, and `pyarrow` keeps Parquet/Arrow I/O fast and compatible across Spark, Delta, and Python.
- PostgreSQL 16 — Relational database. Postgres is the serving database. It stores reference snapshots, configuration/policies, state mirrors, ops tables, and analytics that are easy to query.
- psycopg2 + SQLAlchemy — Postgres connectivity. These are the main Python DB tools: `psycopg2` for direct access and SQLAlchemy for safer, cleaner query/connection handling.
- Streamlit — Interactive dashboard. Streamlit lets us build a useful dashboard quickly, without a separate frontend stack.
- Plotly — Interactive charts. Plotly provides interactive charts inside Streamlit for exploring trends and state.
- scikit-learn — Baseline ML + preprocessing. Used for preprocessing utilities and simple, dependable baseline ML pieces.
- XGBoost — Gradient-boosted trees. The main forecasting model; it performs very well on tabular features like the ones in this project.
- joblib — Model persistence. Used to save/load trained models reliably across training and inference.
- pandas + NumPy — Data wrangling. Used throughout for feature engineering, dataset manipulation, and general “glue” code.
- PyYAML — Configuration parsing. Used to keep configuration readable instead of hardcoding parameters.
- Pillow — Image handling. Used in the dashboard for lightweight image loading/manipulation.
- streamlit-autorefresh / streamlit-plotly-events — Dashboard UX helpers. Small helpers that improve dashboard interactivity (auto-refresh, click/selection events).
## Data & Topics

The main (generated) files are in `data/`:
- `data/store_inventory_final.parquet`
- `data/warehouse_inventory_final.parquet`
- `data/store_batches.parquet`
- `data/warehouse_batches.parquet`
- `data/all_discounts.parquet`
- `data/create_discounts.py`
- `data/create_simulation_data.py`

For DB bootstrap, CSVs are also created in `data/db_csv/` (used by scripts in `postgresql/`).

To regenerate everything:

```
python3 data/create_db.py
```

Since, for privacy reasons, it was not possible to access real supermarket databases, we manually generated the inventory datasets, one year of historical sales data, and the sensor simulators. Specifically:
- Store inventory table: 1,142 rows, max stock avg 48.13, current stock avg 38.85, visibility range 0.05–0.20 (`data/store_inventory_final.parquet`).
- Warehouse inventory table: 1,142 rows, warehouse capacity avg 306.23 units (`data/warehouse_inventory_final.parquet`).
- Batch-level tables: 1,830 store batches and 2,250 warehouse batches, including expiry dates (`data/store_batches.parquet`, `data/warehouse_batches.parquet`).
- Weekly discounts table: 9,360 rows with `discount_pct` range 0.0501–0.35 (`data/all_discounts.parquet`).
- Total simulated sales: 2,808,135 units across 2025 (`data/sim_out/synthetic_1y_panel.csv`).
Event sources simulated:
- Shelf sensors: pick-up, put-back, weight change events.
- Foot traffic sensors: customer entry, exit, and session duration.
- POS transactions: sales events linked to customers and products.
- Warehouse events: inbound and outbound stock movements.
Although synthetic, the data generation process preserves realistic temporal patterns and inter-dependencies between events. The simulation logic and datasets broadly follow the information provided to us through a conversation with the Conad of Scandiano regarding the operation and organization of the store.
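As a toy illustration of that kind of temporal pattern, a weekday-seasonality demand draw might look like the sketch below. The multipliers and function name are illustrative assumptions, not the actual values used in `create_simulation_data.py`:

```python
import random

# Illustrative weekday multipliers (Mon..Sun): weekends busier than weekdays.
WEEKDAY_FACTOR = [0.9, 0.85, 0.9, 0.95, 1.1, 1.35, 1.2]

def daily_demand(base: float, day_index: int, rng: random.Random) -> int:
    """Draw one day's demand for a product: weekly seasonality plus Gaussian noise."""
    mean = base * WEEKDAY_FACTOR[day_index % 7]
    # Clamp at zero: demand cannot be negative.
    return max(0, round(rng.gauss(mean, 0.15 * mean)))
```

Correlating such draws across products and with foot traffic is what makes the simulated streams coherent rather than independent noise.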
Kafka topics are created by `kafka-components/kafka-init/init.py`.
- Append-only (event streams): `shelf_events`, `pos_transactions`, `foot_traffic`, `wh_events`, `alerts`
- Compacted (state/metadata): `shelf_state`, `wh_state`, `shelf_batch_state`, `wh_batch_state`, `daily_discounts`, `shelf_restock_plan`, `wh_supplier_plan`, `shelf_policies`, `wh_policies`, `batch_catalog`, `shelf_profiles`
Delta Lake layout:
- `delta/raw/`: `shelf_events`, `pos_transactions`, `foot_traffic`, `wh_events`
- `delta/cleansed/`: `shelf_state`, `wh_state`, `shelf_batch_state`, `wh_batch_state`
- `delta/curated/`: `product_total_state`, `features_store`, `predictions`
- `delta/ops/`: `alerts`, `shelf_restock_plan`, `wh_supplier_plan`, `wh_supplier_orders`, `wh_supplier_receipts`
- `delta/analytics/`: `daily_discounts`
- `delta/staging/`: `shelf_refill_pending`
- `delta/_checkpoints/`: Spark Structured Streaming checkpoints
## Simulated Time

The project includes a shared simulated clock that writes to Redis:
- keys: `sim:now` (timestamp) and `sim:today` (date).
- used by producers and Spark jobs to keep a consistent timeline during demos/tests.

Configuration (main env vars):
- `USE_SIMULATED_TIME` (default `0`; use `1` to enable the clock)
- `TIME_MULTIPLIER` (default `1.0`)
- `STEP_SECONDS` (default `1`)
- `SIM_DAYS` (default `365`, from `SIM_START` defined in `simulated_time/config.py`)
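A minimal sketch of how a component could consult this clock. The `sim:now` key and the `USE_SIMULATED_TIME` env var come from the configuration above; the assumption that the value is an ISO-8601 string is ours, and the injected client stands in for a redis-py connection:

```python
import os
from datetime import datetime, timezone

def now_utc(redis_client) -> datetime:
    """Return the shared simulated time if enabled, else real UTC time.

    `redis_client` is any object with a Redis-like .get(); in the stack this
    would be a redis-py client pointed at the shared Redis instance.
    """
    if os.environ.get("USE_SIMULATED_TIME", "0") == "1":
        raw = redis_client.get("sim:now")  # key written by the clock service
        if raw is not None:
            # Assumption: the clock stores an ISO-8601 timestamp string.
            value = raw.decode() if isinstance(raw, bytes) else raw
            return datetime.fromisoformat(value)
    return datetime.now(timezone.utc)
```

Reading the clock through one helper like this is what keeps producers and Spark jobs on the same timeline during a run.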
## Kafka (Producers & Consumers)

Kafka is the central event bus. Different components publish events, while others consume them to build state, trigger actions, or persist results.

Producers:
- `kafka-components/kafka-init/`: creates the topics used by the pipeline (append-only events and compacted state/metadata).
- `kafka-components/kafka-producer-foot_traffic/`: simulates customer entries/exits and buffers `foot_traffic` events on Redis Streams.
- `kafka-components/kafka-producer-shelf/`: simulates shelf sensor activity (pickup/putback/weight change) and buffers `shelf_events` on Redis Streams.
- `kafka-components/kafka-producer-pos/`: simulates carts/checkouts and buffers `pos_transactions` on Redis Streams (it also listens to shelf/foot-traffic events to keep the simulation coherent).
- `kafka-components/redis-kafka-bridge/`: forwards buffered Redis events to Kafka topics (`foot_traffic`, `foot_traffic_realistic`, `shelf_events`, `pos_transactions`).
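The bridge's ack-after-publish pattern (acknowledge in Redis only once Kafka has the message, so a crash re-delivers instead of losing events) can be sketched as below. Function and variable names are hypothetical; the real implementation lives in `kafka-components/redis-kafka-bridge/`:

```python
def forward_batch(redis_client, producer, stream: str, topic: str,
                  group: str, consumer: str, count: int = 100) -> int:
    """Read pending entries from a Redis Stream via a consumer group,
    publish them to Kafka, and ACK in Redis only after a successful send."""
    entries = redis_client.xreadgroup(group, consumer, {stream: ">"}, count=count)
    forwarded = 0
    for _stream, messages in entries:
        for msg_id, fields in messages:
            producer.send(topic, value=fields)  # kafka-python-style send
            producer.flush()        # per-message flush for clarity; a real bridge batches
            redis_client.xack(stream, group, msg_id)  # safe to ACK only now
            forwarded += 1
    return forwarded
```

With this ordering the pipeline gets at-least-once delivery: unacknowledged entries stay in the consumer group's pending list and are re-read after a restart.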
Consumers:
- Spark jobs in `spark-apps/` are the main Kafka consumers. They turn raw events into state, alerts, and plans.
- `kafka-components/wh-supplier-manager/`: consumes supplier plans around cutoff times, issues supplier orders/receipts, and publishes the resulting `wh_events`.
- `kafka-components/daily-discount-manager/`: publishes daily discount updates (and mirrors them to Delta).
- `kafka-components/removal-scheduler/`: publishes removal events for expirations/cleanup scenarios.
- `kafka-components/shelf-daily-features/`: runs a daily batch that reads operational tables and writes `analytics.shelf_daily_features` for ML.
Kafka Connect is described separately below because it is an integration layer between Kafka and Postgres rather than a simulation or processing component.
## Spark Apps

Spark jobs live in `spark-apps/` and run as dedicated containers (base image `apache/spark:3.5.1-python3`).
In this project, Spark is the “state engine.” It continuously consumes Kafka events and maintains the latest operational state, then derives alerts and plans on top of that state.
Initialization:
- `spark-apps/deltalake/`: initializes empty Delta tables for raw/cleansed/ops/analytics layers.
Raw sinks:
- `spark-apps/foot-traffic-raw-sink/`: Kafka `foot_traffic` → Delta raw.
State builders:
- `spark-apps/shelf-aggregator/`: Kafka `shelf_events` (+ `shelf_profiles`) → Delta raw + Delta `shelf_state` + compacted `shelf_state`.
- `spark-apps/batch-state-updater/`: Kafka `pos_transactions` → Delta raw + Delta `shelf_batch_state` + compacted `shelf_batch_state`.
- `spark-apps/wh-aggregator/`: Kafka `wh_events` → Delta `wh_state` + compacted `wh_state`.
- `spark-apps/wh-batch-state-updater/`: Kafka `wh_events` → Delta `wh_batch_state` (+ mirrored store batches) + compacted topics.
Alerting and planning:
- `spark-apps/shelf-alert-engine/`: `shelf_state` (+ policies + batch mirror) → `alerts` + `shelf_restock_plan` (Kafka + Delta).
- `spark-apps/wh-alert-engine/`: `wh_state` (+ policies + batch mirror) → `alerts` + `wh_supplier_plan` (Kafka + Delta).
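As a toy illustration of the kind of threshold rule these engines apply (the actual policies live in the Postgres `config` schema, and the names below are hypothetical):

```python
def shelf_alert(current_qty: float, capacity: float, threshold: float):
    """Toy low-stock check: alert when the fill ratio drops below the
    per-shelf policy threshold."""
    fill_ratio = current_qty / capacity if capacity else 0.0
    if fill_ratio < threshold:
        return {"type": "low_stock", "fill_ratio": round(fill_ratio, 3)}
    return None  # shelf is healthy, no alert emitted
```

In the real engines the same comparison runs per shelf over the streaming state, joined with policy and batch-expiry data.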
Operational feedback loops:
- `spark-apps/shelf-restock-manager/`: `shelf_restock_plan` → `wh_events` (FIFO picking).
- `spark-apps/shelf-refill-bridge/`: `wh_events` (`wh_out`) → `shelf_events` (putback/weight_change), with a delay to simulate travel time.
Serving and persistence:
- `spark-apps/alerts-sink/`: `alerts` → Delta (+ optional Postgres).
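The FIFO picking used by `shelf-restock-manager` above can be illustrated with a small sketch (the tuple layout and function name are hypothetical):

```python
def fifo_pick(batches, qty_needed):
    """Pick from the oldest-expiry batches first (FIFO).

    `batches` is a list of (batch_id, expiry_date, qty) tuples; returns a
    list of (batch_id, qty_taken) picks until the need is covered."""
    picks = []
    # ISO date strings sort chronologically, so a plain sort gives FIFO order.
    for batch_id, expiry, qty in sorted(batches, key=lambda b: b[1]):
        if qty_needed <= 0:
            break
        take = min(qty, qty_needed)
        picks.append((batch_id, take))
        qty_needed -= take
    return picks
```

Picking oldest batches first is what keeps the simulated expiry state consistent with how a real store rotates stock.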
Note: for Structured Streaming consumers you can set `MAX_OFFSETS_PER_TRIGGER` to avoid huge micro-batches on the first run with `STARTING_OFFSETS=earliest`.
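Those env vars map onto standard Kafka source options of the Spark reader; a sketch of how a job might build them (the `KAFKA_BOOTSTRAP` variable and default address are assumptions for the compose network):

```python
import os

def kafka_reader_options(topic: str) -> dict:
    """Build spark.readStream Kafka options from the env vars mentioned above."""
    opts = {
        "kafka.bootstrap.servers": os.environ.get("KAFKA_BOOTSTRAP", "kafka:9092"),
        "subscribe": topic,
        "startingOffsets": os.environ.get("STARTING_OFFSETS", "latest"),
    }
    max_per_trigger = os.environ.get("MAX_OFFSETS_PER_TRIGGER")
    if max_per_trigger:
        # Caps each micro-batch; useful on a first run with earliest offsets.
        opts["maxOffsetsPerTrigger"] = max_per_trigger
    return opts

# Usage inside a job (requires a SparkSession with the Kafka connector):
# df = spark.readStream.format("kafka").options(**kafka_reader_options("shelf_events")).load()
```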
## Kafka Connect

Connectors are registered by `connect-init` (`kafka-components/kafka-connect/connect-init/init.py`); the container then exits.

- Postgres → Kafka (source, compacted metadata): `batch_catalog`, `shelf_profiles`
- Kafka → Postgres (sink):
  - upsert: `shelf_state`, `wh_state`, `shelf_batch_state`, `wh_batch_state`, `daily_discounts`, `shelf_restock_plan`, `wh_supplier_plan`
  - insert (append-only): `wh_events`

Note: to avoid duplicates, by default the alerts sink is not created (it is already handled by `alerts-sink` with `WRITE_TO_PG=1`), and the `pos_transactions` sink is not created (nested events are not directly mappable to `ops.pos_transactions`/`ops.pos_transaction_items` with the JDBC sink).
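For illustration, the shape of one of those upsert sink configs can be built like this. The property names are standard Confluent JDBC sink connector options; the connection URL, naming convention, and helper function are assumptions (the real configs are registered by `connect-init`):

```python
def jdbc_upsert_sink(topic: str, pk_fields: str) -> dict:
    """Sketch of a JDBC sink connector config in upsert mode.

    Upsert keyed on the record key is what lets a compacted Kafka topic act
    as a continuously refreshed mirror table in Postgres."""
    return {
        "name": f"pg-sink-{topic}",  # hypothetical naming convention
        "config": {
            "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
            "topics": topic,
            # Placeholder URL/credentials for the compose network.
            "connection.url": "jdbc:postgresql://postgres:5432/postgres",
            "insert.mode": "upsert",
            "pk.mode": "record_key",
            "pk.fields": pk_fields,
            "auto.create": "false",
        },
    }
```

Such a dict would be POSTed to the Connect REST API at `localhost:8083/connectors`.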
## PostgreSQL Schemas

Initialization is in `postgresql/01_make_db.sql`, with schemas:
- `ref`: snapshots imported from CSVs (inventory and batches).
- `config`: policies and metadata (e.g. `shelf_policies`, `wh_policies`).
- `state`: state materializations (mirrors of compacted topics).
- `ops`: operational events, alerts, plans.
- `analytics`: analytics tables (e.g. `daily_discounts`).

CSVs are loaded by `postgresql/02_load_ref_from_csv.sql`, and default policies are populated by `postgresql/03_load_config.sql`.

Relevant analytics tables/views:
- `analytics.shelf_daily_features`: daily feature engineering (ML inputs).
- `analytics.v_ml_features`, `analytics.v_ml_train`: views for inference/training.
- `analytics.ml_models`, `analytics.ml_predictions_log`: model registry and prediction logs.
## Streamlit Dashboard

The Streamlit dashboard (`dashboard/app.py`) shows:
- interactive store map with clickable zones and alert overlay.
- shelf/warehouse state tables from Delta with filters by category, subcategory, and shelf.
- supplier plan view with next delivery date (Delta first, Postgres fallback).
- ML model registry and prediction explorer from Postgres.
- raw alerts table plus debug/config panel.
- uses simulated time when available; otherwise falls back to UTC.
Data sources:
- Inventory + layout: `data/db_csv/store_inventory_final.csv`, `dashboard/store_layout.yaml`, `dashboard/assets/supermarket_map.png`
- Delta: `delta/cleansed/shelf_state`, `delta/cleansed/wh_state`, `delta/cleansed/shelf_batch_state`, `delta/cleansed/wh_batch_state`, `delta/ops/wh_supplier_plan`, `delta/ops/wh_supplier_orders`
- Postgres (read-only): `ops.alerts`, `ops.wh_supplier_plan` (fallback), `analytics.ml_models`, `analytics.ml_predictions_log`

To start only the dashboard (port 8501):

```
docker compose up -d dashboard
```
The dashboard consolidates real-time-ish operations (alerts, states, supplier plans) with ML outputs to show how the pipeline behaves end-to-end during a simulated run.
Short clip of the live store view where alerts appear and clear over time.
This shows the moment a supplier plan is generated and marked as planned/pending around the cutoff.
This shows the plan after delivery/restock has been executed and the state is updated.
Model registry view showing that training completed and a model is available for inference.
Prediction explorer showing the suggested order quantity produced by the model.
Raw alerts table with the latest alerts as they stream through the pipeline.
## ML / Demand Forecasting

The ML pipeline consists of:
- Daily feature engineering – the `kafka-components/shelf-daily-features` service populates `analytics.shelf_daily_features` and appends the same daily snapshot to `delta/curated/features_store`.
- ML views – `analytics.v_ml_features` and `analytics.v_ml_train` (defined in `postgresql/05_views.sql`).
- Training – the `ml-model/training-service` service (XGBoost + time-series CV) writes model artifacts to `models/` and registers metadata in `analytics.ml_models`.
- Inference – the `ml-model/inference-service` service loads the latest model, generates pending supplier plans around cutoff times (Sun/Tue/Thu 12:00 UTC), logs predictions in `analytics.ml_predictions_log`, and appends them to `delta/curated/predictions` (with optional Kafka publish for plans).
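Time-series CV means validation folds must always come after their training window, otherwise the model sees the future. A minimal expanding-window split, as a sketch (fold sizing and names are our assumptions, not the exact scheme in `training-service`):

```python
def time_series_splits(n_days: int, n_folds: int, min_train: int):
    """Yield (train_idx, val_idx) expanding-window folds over day offsets.

    Each fold trains on all days strictly before its validation window,
    so no fold ever leaks future information into training."""
    fold_size = (n_days - min_train) // n_folds
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        val_end = min(train_end + fold_size, n_days)
        yield list(range(train_end)), list(range(train_end, val_end))
```

Each fold's indices can then select rows from `analytics.v_ml_train` ordered by date before fitting the XGBoost model.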
Features used:
- master data: category/subcategory.
- calendar: `day_of_week`, `is_weekend`, `refill_day`.
- pricing/discount: price, discount, current/future discount flags.
- demand: `people_count`, sales in the last 1/7/14 days.
- shelf stock: capacity, fill ratio, threshold, expired, recent alerts.
- warehouse stock: capacity, fill ratio, reorder point, pending supplier, expired, recent alerts.
- expirations and batches: `standard_batch_size`, `min/avg_expiration_days`, `qty_expiring_next_7d`.
- training target: `batches_to_order` (training only).
Outputs:
- `analytics.shelf_daily_features`
- `analytics.ml_models`
- `analytics.ml_predictions_log`
- `delta/curated/features_store`
- `delta/curated/predictions`
- `ops.wh_supplier_plan` (+ `delta/ops/wh_supplier_plan`)
- `models/<MODEL_NAME>_<YYYYMMDD_HHMMSS>.json`
- `models/<MODEL_NAME>_feature_columns.json`

The `models/` directory is mounted to `/models` in Docker Compose for both training and inference.

To run only the ML pipeline:

```
docker compose up -d sim-clock shelf-daily-features training-service inference-service
```

Note:
- `training-service` runs once and then exits (it may skip training if `RETRAIN_DAYS` is set and the model is still fresh).
- `inference-service` stays idle and triggers only around the configured cutoff schedule (see the `CUTOFF_*` env vars).
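The Sun/Tue/Thu 12:00 UTC schedule can be computed with a small helper like this sketch (the real service derives its schedule from the `CUTOFF_*` env vars; the function name is hypothetical):

```python
from datetime import datetime, timedelta, timezone

# Sunday, Tuesday, Thursday in Python's weekday numbering (Monday = 0).
CUTOFF_WEEKDAYS = {6, 1, 3}

def next_cutoff(now: datetime) -> datetime:
    """Return the next supplier cutoff (12:00 UTC on Sun/Tue/Thu) after `now`."""
    for d in range(8):  # at most one full week ahead is needed
        candidate = (now + timedelta(days=d)).replace(
            hour=12, minute=0, second=0, microsecond=0)
        if candidate.weekday() in CUTOFF_WEEKDAYS and candidate > now:
            return candidate
    raise RuntimeError("unreachable: a cutoff always exists within 7 days")
```

Comparing `next_cutoff(now)` against the simulated clock is enough for the service to stay idle between runs and wake only around cutoffs.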
## Performance

This project runs multiple stateful services (Kafka, Postgres, Redis, Spark streaming jobs, optional Kafka Connect, ML services, and a Streamlit UI). Running the full stack is CPU- and RAM-intensive, and the first run can be particularly heavy when Spark jobs start from `STARTING_OFFSETS=earliest` (they may backfill large portions of the topics).

Tuning knobs:
- Producers:
  - `kafka-producer-foot-traffic`, `kafka-producer-pos`: tune `SLEEP` and `TIME_SCALE`.
  - `kafka-producer-shelf`: tune `SHELF_SLEEP`.
- Simulated clock: tune `TIME_MULTIPLIER` and `STEP_SECONDS` (see `.env` and `simulated_time/README.md`).
- Spark consumers:
  - Prefer `STARTING_OFFSETS=latest` for iterative development; use `earliest` only when you intentionally want a full replay.
  - Reduce `MAX_OFFSETS_PER_TRIGGER` if you see memory pressure or very large micro-batches on startup.

The `delta/` directory (tables + `_checkpoints/`) can grow quickly during streaming runs. Checkpoint paths encode streaming progress; removing checkpoints forces a replay from Kafka offsets and should be done only when you explicitly want to reset consumer state.
## Lessons learned

One of the main challenges was designing Kafka topics that guarantee full traceability through append-only logs while still enabling fast access to state via compaction. Ensuring correctness in stateful streaming also required careful attention, particularly around stable keys and rigorous checkpointing. However, given the simulation-based architecture, it is not feasible to avoid resetting checkpoints every time the project is started: doing so would also require checkpointing the simulated time, which was not implemented. In addition, reliably orchestrating a large number of containers proved complex, especially when handling service dependencies such as database initialization, connector startup, and the execution of Spark jobs.
Using a simulated time source made demos and tests far more predictable, allowing expiry rules, reorder cycles, and cutoffs to be reproduced reliably. The clear separation between raw and derived data sped up debugging and made analytical queries easier to write and reason about.
Stateful processing was particularly sensitive to key drift and inconsistent checkpoints, where even small mistakes could cascade into incorrect state. Because the simulated data must be generated on the fly, the system cannot run on machines with very limited CPU and RAM. Container startup was fragile without explicit dependency sequencing and better operational runbooks.
## Limitations and future work

Limitations:
- Data is synthetic/simulated; real deployments need integration with real sensor systems and stronger data quality guarantees.
- Single-node Docker setup (single Kafka broker, local Delta filesystem) does not represent production scaling.
- Stateful Spark jobs may become bottlenecked by state size, shuffle-heavy joins, and checkpoint I/O as event rates increase.
- Checkpoints need to be deleted between runs.
- CPU and RAM usage are very high.
- Other potential bottlenecks: Kafka throughput/partitions (insufficient parallelism), Spark micro-batch latency, and Delta checkpoint growth/storage bandwidth.
- PostgreSQL write amplification if too many derived tables are synced at high frequency.
- Containers can go down not because of application errors but due to OOM kills, which need to be handled.
Future work:
- Move Delta tables to object storage to separate compute from storage and handle growth reliably.
- Use a schema registry and compatibility rules so producers can change event formats without silently breaking downstream consumers.
- Always compare XGBoost against a simple baseline (like a moving average) to prove the model actually adds value.
- Recover efficiently from checkpoints when the system is shut down mid-processing, instead of replaying from scratch.
- Optimise CPU and RAM usage.
## References and acknowledgments

- Drive Research. Grocery store statistics: Where, when, and how much people grocery shop. https://www.driveresearch.com/market-research-company-blog/grocery-store-statistics-where-when-how-much-people-grocery-shop/
- Huang, A., Zhang, Z., Yang, L., & Ou, J. (2024). Design of Intelligent Warehouse Management System. 2024 4th International Conference on Electronic Information Engineering and Computer (EIECT), 534–538. https://doi.org/10.1109/EIECT64462.2024.10866043
- Conad supermarket (Scandiano (RE), Italy) for providing real-world category/inventory structure used to model the synthetic data.
- External contributors/maintainers of the open-source libraries used in the stack.








