VENIC - Dialogue Management

Voice Enabled Intelligent Programming Assistant

Overview

VENIC Dialogue Management is a conversational AI system built on Rasa that enables voice-controlled programming assistance. The system understands natural language commands and translates them into programming actions, supporting Java development, IDE operations, and version control workflows.

This dialogue management module serves as the natural language understanding (NLU) and dialogue policy component of the VENIC ecosystem, processing user intents and managing conversational flows for programming assistance.

Key Capabilities

Programming Language Support: Comprehensive Java command recognition (classes, methods, variables, arrays, collections, OOP concepts)
IDE Automation: Voice commands for IDE operations (file management, editing, navigation, views)
Version Control: Git command interpretation and execution
Context-Aware Dialogues: Maintains conversation context for multi-turn interactions
Extensible Intent System: Easily add new intents and responses

Features

Programming Commands

Java Language Constructs
- Classes, Interfaces, and Enums
- Methods, Functions, and Procedures
- Variables, Attributes, Properties, and Constants
- Arrays and Collections (ArrayList, HashMap, HashSet, LinkedList)
- Control Flow (if-else, loops, switch)
- OOP Concepts (constructors, encapsulation, inheritance)

IDE Commands

File Operations: Create, open, save, delete files
Project Management: Build, compile, run, debug, clean projects
Editor Actions: Cut, copy, paste, undo, redo, find, replace
View Management: Split editor, toggle panels, zoom controls
Navigation: Go to file, switch editors, cursor movements
Configuration: Settings, extensions, themes, keyboard shortcuts

Version Control

Git initialization and configuration
Branch creation and switching
Add, commit, push, pull operations
Merge, diff, log, status, stash commands

Dialogue Management

Intent classification across 200+ programming, IDE, and Git intents
Entity extraction (names, types, numbers, messages)
Slot filling for context retention
Multi-turn conversation support
Fallback handling for out-of-scope queries
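The fallback behaviour listed above can be illustrated with a small sketch. It is not the project's code: it only mimics the idea behind Rasa's FallbackClassifier (a confidence threshold plus an ambiguity threshold) in plain Python. The threshold values and the intent names in the usage note are assumptions chosen for illustration.

```python
from typing import Dict, List

# Illustrative values only; the project's real thresholds would be set in config.yml.
CONFIDENCE_THRESHOLD = 0.3
AMBIGUITY_THRESHOLD = 0.1
FALLBACK_INTENT = "nlu_fallback"  # Rasa's default fallback intent name


def resolve_intent(ranking: List[Dict]) -> str:
    """Return the top intent name, or the fallback intent when the model is unsure.

    `ranking` is a list of {"name": str, "confidence": float} dicts.
    """
    if not ranking:
        return FALLBACK_INTENT
    ordered = sorted(ranking, key=lambda item: item["confidence"], reverse=True)
    top = ordered[0]
    # Case 1: the best guess is simply too weak.
    if top["confidence"] < CONFIDENCE_THRESHOLD:
        return FALLBACK_INTENT
    # Case 2: the two best guesses are too close to call.
    if len(ordered) > 1 and top["confidence"] - ordered[1]["confidence"] < AMBIGUITY_THRESHOLD:
        return FALLBACK_INTENT
    return top["name"]
```

For example, a ranking of `create_class` at 0.92 and `create_method` at 0.05 (both hypothetical intent names) resolves to `create_class`, while two candidates at 0.45 and 0.40 resolve to `nlu_fallback`.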

Architecture

┌────────────────────────────────────────────────────────────┐
│                        User Interface                      │
│                   (Voice/Text Input Layer)                 │
└───────────────────────────┬────────────────────────────────┘
                            │
                            ▼
┌────────────────────────────────────────────────────────────┐
│                   VENIC Dialogue Management                │
│                         (Rasa Core)                        │
├────────────────────────────────────────────────────────────┤
│  ┌──────────────────┐       ┌──────────────────────────┐   │
│  │   NLU Pipeline   │       │   Dialogue Policies      │   │
│  │  - Tokenization  │       │  - Memoization Policy    │   │
│  │  - Featurization │◄─────►│  - Rule Policy           │   │
│  │  - Intent Class. │       │  - TED Policy            │   │
│  │  - Entity Extract│       │  - Fallback Classifier   │   │
│  └──────────────────┘       └──────────────────────────┘   │
│                                                            │
│  ┌──────────────────────────────────────────────────────┐  │
│  │              Domain & Training Data                  │  │
│  │  - Intents (200+)  - Entities  - Responses           │  │
│  │  - Stories         - Rules     - Slots               │  │
│  └──────────────────────────────────────────────────────┘  │
└───────────────────────────┬────────────────────────────────┘
                            │
                            ▼
┌────────────────────────────────────────────────────────────┐
│                   Action Server / Backend                  │
│            (IDE Integration / Command Execution)           │
└────────────────────────────────────────────────────────────┘

Technology Stack

Framework: Rasa 3.0.8
Language: Python 3.7.7
NLU Pipeline: WhitespaceTokenizer, DIETClassifier, EntitySynonymMapper
Policies: MemoizationPolicy, RulePolicy, TEDPolicy, UnexpecTEDIntentPolicy
Containerization: Docker
Deployment: Okteto, Azure Kubernetes Service (AKS)
CI/CD: GitHub Actions
Development Tools: Jupyter Notebooks for training and analysis

Prerequisites

Before setting up the project, ensure you have the following installed:

Python: 3.7.7 (Rasa 3.0.x supports Python 3.7 and 3.8)
pip: Latest version
Docker: (Optional) For containerized deployment
Git: For version control
Rasa: 3.0.8 (will be installed via pip)

Installation

Local Setup

Clone the repository

git clone https://github.com/paradocx96/venic-dm.git
cd venic-dm

Install Rasa

pip install rasa==3.0.8

Verify installation

rasa --version

Docker Setup

Build the Docker image

docker build -t venic-dm:latest .

Run using Docker Compose

docker-compose up

With Docker Compose, the Rasa server will be available at http://localhost:5006. This differs from the default port 5005 used when running Rasa locally.

Usage

Training the Model

Train the Rasa model with your training data:

rasa train

This will create a new model in the models/ directory.

Running the Server

Start the Rasa server with API enabled:

rasa run --enable-api --cors "*" --debug

Default port: 5005
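With `--enable-api`, Rasa's HTTP API also accepts `POST /model/parse`, which returns the NLU result (intent and entities) for a text without advancing a conversation. The sketch below builds such a request using only the standard library. The function names are made up for this example, and the base URL assumes the default local port.

```python
import json
import urllib.request


def build_parse_request(text: str, base_url: str = "http://localhost:5005") -> urllib.request.Request:
    """Build a POST request for Rasa's /model/parse endpoint."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/model/parse",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_message(text: str, base_url: str = "http://localhost:5005") -> dict:
    """Send the request and return the decoded JSON. Requires a running server."""
    request = build_parse_request(text, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `parse_message("create a class named Student")` against a running server should return a dictionary whose `intent` and `entities` keys show how the model interpreted the command.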

Interactive Testing

Test the model interactively in the command line:

rasa shell

Using the REST API

Send a message to the bot:

curl -X POST http://localhost:5005/webhooks/rest/webhook \
  -H "Content-Type: application/json" \
  -d '{
    "sender": "user",
    "message": "create a class named Student"
  }'
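The same call can be made from Python. This is a minimal sketch using only the standard library; the helper names are hypothetical and the URL assumes a server on the default local port.

```python
import json
import urllib.request
from typing import List

WEBHOOK_URL = "http://localhost:5005/webhooks/rest/webhook"


def build_message_request(sender: str, message: str, url: str = WEBHOOK_URL) -> urllib.request.Request:
    """Build the POST request that the curl command above sends."""
    payload = json.dumps({"sender": sender, "message": message}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_message(sender: str, message: str, url: str = WEBHOOK_URL) -> List[dict]:
    """Send a message and return the bot's replies. Requires a running server."""
    request = build_message_request(sender, message, url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without a live server.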

Project Structure

venic-dm/
├── actions/                            # Custom Rasa actions
│   ├── __init__.py
│   └── actions.py                      # Custom action implementations
├── data/                               # Training data
│   ├── nlu.yml                         # NLU training examples
│   ├── stories.yml                     # Conversation stories
│   └── rules.yml                       # Conversation rules
├── dataset_generate/                   # Dataset generation tools
│   ├── data_nlu/                       # Organized NLU data by category
│   │   ├── GIT-Command/                # Git-related intents
│   │   ├── IDE-Command/                # IDE operation intents
│   │   ├── Java-Command/               # Java programming intents
│   │   ├── Other-Error/                # Error handling intents
│   │   └── data_story/                 # Story templates
│   ├── text_generate.py                # Script for generating training data
│   └── result_inform*.yml              # Generated data samples
├── models/                             # Trained Rasa models (generated)
├── notebooks/                          # Jupyter notebooks
│   ├── Rasa_Model_Train_Final.ipynb
│   ├── Google_Colab_Testing_v*.ipynb
│   └── Google_Colab_Train.ipynb
├── results/                            # Model evaluation results
│   └── [timestamp]/                    # Timestamped test results
│       ├── intent_confusion_matrix.png
│       ├── intent_report.json
│       ├── DIETClassifier_*.png
│       └── TEDPolicy_*.png
├── tests/                              # Test stories
│   └── test_stories.yml
├── .github/                            # GitHub Actions workflows
│   └── workflows/
│       ├── docker-image.yml
│       └── deploy-aks.yml
├── config.yml                          # Rasa NLU and Core configuration
├── domain.yml                          # Domain file (intents, entities, responses)
├── credentials.yml                     # Channel credentials
├── endpoints.yml                       # Endpoint configuration
├── Dockerfile                          # Docker image definition
├── docker-compose.yml                  # Docker Compose configuration
├── okteto.yml                          # Okteto deployment configuration
└── README.md                           # Project documentation

Configuration

config.yml

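Defines the NLU pipeline and dialogue policies. The default configuration uses:

NLU Pipeline: WhitespaceTokenizer, DIETClassifier, and EntitySynonymMapper, tuned for English input
Dialogue Policies: RulePolicy and MemoizationPolicy for rules and memorized stories, combined with the machine-learned TEDPolicy and UnexpecTEDIntentPolicy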

domain.yml

Contains:

Intents: 200+ intents for programming, IDE, and Git commands
Entities: name, type, process, number, message
Slots: Context storage for conversation management
Responses: Templated responses for each intent

endpoints.yml

Configures:

Action server endpoint
Tracker store
Event broker

Training the Model

Using Command Line

# Train a new model
rasa train

# Train NLU only
rasa train nlu

# Train Core only
rasa train core

Using Jupyter Notebooks

Training notebooks are provided in the notebooks/ directory:

Rasa_Model_Train_Final.ipynb: Complete training pipeline
Google_Colab_Train.ipynb: Training on Google Colab with GPU support

Training Data Organization

Training data is organized by command type:

GIT-Command: Git operations
IDE-Command: IDE interactions (cursor, file, font, project, statement)
Java-Command: Java language constructs (30+ categories)
Other-Error: Error handling and common queries

Testing

Interactive Testing

rasa shell

Test Stories

Run automated tests using test stories:

rasa test

Results are saved to the results/ directory and include:

Confusion matrices
Classification reports
Failed test stories
Precision, recall, and F1 scores
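To act on these reports, it can help to list the intents with the lowest F1 scores. The sketch below assumes that intent_report.json follows the scikit-learn classification-report layout: one entry per intent with `precision`, `recall`, `f1-score`, and `support`, plus summary rows such as `macro avg`. The function name is hypothetical.

```python
from typing import Dict, List, Tuple

# Summary rows that are not intents (assumed layout, see lead-in).
SUMMARY_KEYS = {"accuracy", "micro avg", "macro avg", "weighted avg"}


def weakest_intents(report: Dict, limit: int = 5) -> List[Tuple[str, float]]:
    """Return up to `limit` (intent, f1-score) pairs, lowest score first."""
    scores = [
        (name, metrics["f1-score"])
        for name, metrics in report.items()
        if name not in SUMMARY_KEYS
        and isinstance(metrics, dict)
        and "f1-score" in metrics
    ]
    return sorted(scores, key=lambda pair: pair[1])[:limit]
```

A typical use would load the file with `json.load` and pass the resulting dictionary to `weakest_intents`, then add training examples for the intents it returns.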

Using Jupyter Notebooks

Testing notebooks are available:

Google_Colab_Testing_v1.ipynb: Basic model testing
Google_Colab_Testing_v2.ipynb: Enhanced testing with metrics
Google_Colab_Testing_v3.ipynb: Comprehensive evaluation

Deployment

Local Deployment

rasa run --enable-api --cors "*" -p 5005

Docker Deployment

docker-compose up -d

Okteto Cloud Deployment

The project includes okteto.yml for deployment to Okteto Cloud:

okteto deploy

Azure Kubernetes Service (AKS)

A GitHub Actions workflow handles automated deployment to AKS:

Workflow: .github/workflows/deploy-aks.yml
Trigger: Push to main branch
Steps: Build Docker image → Push to registry → Deploy to AKS

API Endpoints

Local Development

Localhost: http://localhost:5005

Production

Okteto Cloud: https://rasa-paradocx96.cloud.okteto.net

REST API

Send Message

POST /webhooks/rest/webhook
Content-Type: application/json

{
  "sender": "user_id",
  "message": "create a class named Student"
}

Response

[
  {
    "recipient_id": "user_id",
    "text": "Student class created"
  }
]
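A client usually needs only the reply text from this response. The following sketch extracts it, skipping any reply that has no `text` field (the REST channel can also return other payload types, such as images). The function name is an assumption made for this example.

```python
import json
from typing import List


def reply_texts(raw_response: str, recipient_id: str) -> List[str]:
    """Collect the text of every reply addressed to `recipient_id`."""
    replies = json.loads(raw_response)
    return [
        reply["text"]
        for reply in replies
        if reply.get("recipient_id") == recipient_id and "text" in reply
    ]
```

Given the response shown above, `reply_texts(body, "user_id")` yields a list containing the single string `Student class created`.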

Dataset Generation

The dataset_generate/ directory contains tools for generating training data:

text_generate.py

Script for automated generation of NLU training examples:

cd dataset_generate
python text_generate.py
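The internals of text_generate.py are not described here, so the following is only a sketch of how template-based generation of annotated NLU examples could work. The template syntax, function names, and the `create_class` intent are all assumptions; only the `[value](entity)` annotation and the `- intent:` / `examples: |` layout come from Rasa's training data format.

```python
from itertools import product
from typing import Dict, List


def expand_template(template: str, values: Dict[str, List[str]]) -> List[str]:
    """Fill `{entity}` placeholders with annotated values such as [Student](name)."""
    entities = [entity for entity in values if "{" + entity + "}" in template]
    examples = []
    # One example per combination of values across all placeholders.
    for combo in product(*(values[entity] for entity in entities)):
        text = template
        for entity, value in zip(entities, combo):
            text = text.replace("{" + entity + "}", f"[{value}]({entity})")
        examples.append(text)
    return examples


def to_nlu_block(intent: str, examples: List[str]) -> str:
    """Render examples as one intent entry in Rasa's YAML training format."""
    lines = [f"- intent: {intent}", "  examples: |"]
    lines += [f"    - {example}" for example in examples]
    return "\n".join(lines)
```

For instance, expanding `create a class named {name}` with the names `Student` and `Teacher` gives two annotated examples, which `to_nlu_block("create_class", ...)` turns into a block that can be pasted under `nlu:` in a data file.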

Data Organization

Training data is modularized by command type for easier maintenance and extension:

Each category has dedicated YAML files
Examples follow consistent formatting
Entity annotations are clearly marked
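One way to keep annotations consistent is to check each example against the entities declared in domain.yml (name, type, process, number, message). This is a hedged sketch rather than a project tool: the function names are hypothetical, and the pattern covers only the short `[value](entity)` form, not Rasa's extended `[value]{"entity": ...}` syntax.

```python
import re
from typing import List, Tuple

# Entities listed for domain.yml in this README.
KNOWN_ENTITIES = {"name", "type", "process", "number", "message"}
ANNOTATION = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def find_annotations(example: str) -> List[Tuple[str, str]]:
    """Return (value, entity) pairs found in one training example."""
    return ANNOTATION.findall(example)


def unknown_entities(example: str) -> List[str]:
    """Return entity labels that are not declared in KNOWN_ENTITIES."""
    return [entity for _, entity in find_annotations(example) if entity not in KNOWN_ENTITIES]
```

Running `unknown_entities` over every generated example before training would flag misspelled or undeclared entity labels early.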

Model Evaluation

Evaluation Metrics

Model performance is tracked using:

Intent Classification: Precision, recall, F1-score
Entity Extraction: Exact match accuracy
Dialogue Prediction: Story success rate
Confusion Matrices: Visual representation of misclassifications

Latest Results

Results are stored in timestamped directories under results/:

results/
└── 220817 2254/
    ├── intent_confusion_matrix.png     # Intent classification visualization
    ├── intent_report.json              # Detailed metrics
    ├── DIETClassifier_report.json      # Entity extraction performance
    ├── TEDPolicy_report.json           # Dialogue policy performance
    └── failed_test_stories.yml         # Stories that failed testing
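To locate the most recent run programmatically, the directory names can be parsed as timestamps. The sketch below assumes that a name like `220817 2254` encodes year, month, day, hour, and minute (`YYMMDD HHMM`). That reading is a guess from the example above, and the function name is hypothetical.

```python
from datetime import datetime
from typing import Iterable, Optional

# Assumption: "220817 2254" means 2022-08-17 22:54.
TIMESTAMP_FORMAT = "%y%m%d %H%M"


def latest_run(names: Iterable[str]) -> Optional[str]:
    """Return the directory name with the newest timestamp, or None if none parse."""
    stamped = []
    for name in names:
        try:
            stamped.append((datetime.strptime(name, TIMESTAMP_FORMAT), name))
        except ValueError:
            continue  # Skip entries that are not timestamped run directories.
    return max(stamped)[1] if stamped else None
```

In practice the names could come from `pathlib.Path("results").iterdir()`, filtered to directories.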

Contributing

Contributions are welcome! Please follow these guidelines:

Fork the repository

Create a feature branch

git checkout -b feature/your-feature-name

Commit your changes

git commit -m "Add: Description of your feature"

Push to the branch

git push origin feature/your-feature-name

Open a Pull Request

Development Guidelines

Follow Rasa best practices for dialogue design
Add test stories for new intents
Update domain.yml with new intents and responses
Document new features in the README
Ensure all tests pass before submitting PR

License

This project is part of the VENIC system. Please refer to the repository for license information.

Contact

Website: https://venic.io

GitHub Repository: https://github.com/paradocx96/venic-dm

Author: paradocx96

Acknowledgments

Built with Rasa Open Source
Powered by Python and modern NLU technologies
Deployed on Okteto Cloud and Azure Kubernetes Service

Made with ❤️ by paradocx96

