Sabeshragav/project-kandro
Project Kandro

Overview

Project Kandro is a decentralized platform for sharing and monetizing datasets. It features a system for evaluating data quality, enabling users to upload, view, and purchase datasets. The platform also includes a discussion forum for community interaction. The backend handles data quality checks and file uploads, while the frontend provides the user interface for interacting with datasets and smart contracts on a blockchain.

Project Structure

The project is organized into three main directories:

  • backend/: Contains the Node.js server with Express.js for API endpoints, Python scripts for data quality analysis, and a machine learning model for quality scoring.
  • frontend/: Contains the React application built with Vite, interacting with a Solidity smart contract for dataset management on the blockchain.
  • data/: Contains dataset metadata.

Technologies Used

Frontend

  • React: JavaScript library for building user interfaces.
  • Vite: Fast build tool and development server.
  • Tailwind CSS: Utility-first CSS framework.
  • Ethers.js/Web3.js: Libraries for interacting with Ethereum smart contracts.
  • Solidity: Language for writing smart contracts.
  • React Router: For navigation within the React application.
  • Axios: For making HTTP requests to the backend.

Backend

  • Node.js: JavaScript runtime environment.
  • Express.js: Web application framework for Node.js.
  • Python: For data quality analysis scripts.
    • Pandas, NumPy: For data manipulation.
    • XGBoost, scikit-learn: For the dataset quality prediction model.
    • Joblib: For saving/loading the trained model.
    • pyclamd, tika, pyod: For additional data validation and analysis.
  • Multer: Middleware for handling multipart/form-data (file uploads).
  • Pinata SDK / web3.storage: For decentralized file storage (likely IPFS).
  • ClamAV.js: For virus-scanning uploaded files.

Smart Contracts

  • Solidity: Used to write the DatasetStorage contract.
  • OpenZeppelin Contracts: For utility libraries like Strings.sol.

Setup and Installation

Prerequisites

  • Node.js and npm (or yarn)
  • Python and pip
  • A blockchain development environment (e.g., Hardhat, Ganache) if running smart contracts locally.
  • Access to an Ethereum-compatible blockchain and a wallet (e.g., MetaMask) for frontend interaction.

Backend Setup

  1. Navigate to the backend directory:
    cd backend
  2. Install Node.js dependencies:
    npm install
  3. Install Python dependencies:
    pip install -r requirements.txt
  4. Create a .env file in the backend directory and configure necessary environment variables (e.g., PYTHON_PATH, Pinata API keys). Example:
    PYTHON_PATH=python # or path to your python executable
    PINATA_API_KEY=your_pinata_api_key
    PINATA_SECRET_API_KEY=your_pinata_secret_api_key
  5. If you haven't trained the quality model yet, run the training script:
    python train_model.py
    This will generate dataset_quality_model_xgb.pkl.
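The repository's actual training code is not reproduced in this README. As a rough sketch of what a script like train_model.py might do, the example below trains a small regressor on synthetic features and persists it with Joblib; the feature names are assumptions, and scikit-learn's GradientBoostingRegressor stands in for the project's XGBoost model:

```python
# Hypothetical sketch of a dataset-quality training script.
# The feature layout, target, and regressor choice are assumptions;
# the real train_model.py uses XGBoost and its own training data.
import joblib
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(42)

# Synthetic per-dataset features, e.g.
# [missing_ratio, duplicate_ratio, dtype_inconsistency, outlier_ratio],
# mapped to a quality score: fewer issues -> higher score.
X = rng.random((200, 4))
y = 100 * (1 - X[:, 0]) * (1 - X[:, 1])

model = GradientBoostingRegressor(random_state=0)
model.fit(X, y)

# Persist the trained model, matching the filename the README mentions.
joblib.dump(model, "dataset_quality_model_xgb.pkl")
```

The saved .pkl file is what the backend would later load (again via Joblib) to score newly uploaded datasets.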

Frontend Setup

  1. Navigate to the frontend directory:
    cd frontend
  2. Install Node.js dependencies:
    npm install
  3. Deploy the DatasetStorage.sol smart contract to your chosen blockchain network. Update the contract address and ABI in the frontend code (likely in frontend/src/Context/DatasetStorageABI.jsx or a similar configuration file).
  4. Ensure your MetaMask or other wallet is configured for the network where the contract is deployed.

Running the Application

Backend

  1. Navigate to the backend directory.
  2. Start the backend server (defaults to port 9000):
    npm start
    This uses nodemon to automatically restart the server on file changes.

Frontend

  1. Navigate to the frontend directory.
  2. Start the Vite development server (defaults to port 3000):
    npm run dev
    or
    npm start
  3. Open your browser and go to http://localhost:3000.

Key Features

  • Decentralized Dataset Storage: Datasets are intended to be stored on decentralized systems like IPFS via Pinata.
  • Smart Contract Interaction: Manages dataset metadata and ownership on the blockchain.
  • Data Quality Check: The Python backend analyzes uploaded CSV files for quality metrics.
    • Missing values
    • Duplicate rows
    • Data type consistency
    • Outlier detection
    • Malicious/fake data indicators
  • Dataset Marketplace: Users can (presumably) list, browse, and acquire datasets.
  • User Authentication: Wallet connection (e.g., MetaMask) for interacting with the dApp.
  • Discussion Forum: A section for users to discuss topics.
  • File Upload: Users can upload CSV datasets.
  • Responsive UI: Built with React and Tailwind CSS.
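A subset of the quality checks listed above can be sketched in pandas. The function name and exact metrics below are assumptions for illustration, not the backend's actual implementation:

```python
# Hypothetical sketch of CSV quality metrics similar to those listed
# above (missing values, duplicate rows, outliers); the real backend's
# checks and its ML-based scoring may differ.
import numpy as np
import pandas as pd

def quality_metrics(df: pd.DataFrame) -> dict:
    n_rows = len(df)
    # Share of missing cells across the whole frame.
    missing_ratio = float(df.isna().mean().mean())
    # Share of fully duplicated rows.
    duplicate_ratio = float(df.duplicated().sum() / n_rows) if n_rows else 0.0
    # Simple IQR-based outlier count over numeric columns.
    numeric = df.select_dtypes(include=np.number)
    outliers = 0
    for col in numeric:
        q1, q3 = numeric[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        mask = (numeric[col] < q1 - 1.5 * iqr) | (numeric[col] > q3 + 1.5 * iqr)
        outliers += int(mask.sum())
    return {
        "rows": n_rows,
        "missing_ratio": missing_ratio,
        "duplicate_ratio": duplicate_ratio,
        "numeric_outliers": outliers,
    }

df = pd.DataFrame({"a": [1, 2, 2, None, 100], "b": ["x", "y", "y", "z", "z"]})
report = quality_metrics(df)
```

A real pipeline would also check data-type consistency per column and scan for malicious content (the README mentions pyclamd, tika, and pyod for those roles).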

Linting

To lint the frontend code:

  1. Navigate to the frontend directory.
  2. Run the lint command:
    npm run lint

Building for Production (Frontend)

  1. Navigate to the frontend directory.
  2. Run the build command:
    npm run build
    This will create a dist folder with the production-ready static assets.
