- Overview
- Features
- Architecture
- Prerequisites
- Installation
- Usage
- Project Structure
- Configuration
- Training the Model
- Testing
- Deployment
- API Endpoints
- Dataset Generation
- Model Evaluation
- Contributing
- License
- Contact
VENIC Dialogue Management is a conversational AI system built on Rasa that enables voice-controlled programming assistance. The system understands natural language commands and translates them into programming actions, supporting Java development, IDE operations, and version control workflows.
This dialogue management module serves as the natural language understanding (NLU) and dialogue policy component of the VENIC ecosystem, processing user intents and managing conversational flows for programming assistance.
- Programming Language Support: Comprehensive Java command recognition (classes, methods, variables, arrays, collections, OOP concepts)
- IDE Automation: Voice commands for IDE operations (file management, editing, navigation, views)
- Version Control: Git command interpretation and execution
- Context-Aware Dialogues: Maintains conversation context for multi-turn interactions
- Extensible Intent System: Easily add new intents and responses
- Java Language Constructs
- Classes, Interfaces, and Enums
- Methods, Functions, and Procedures
- Variables, Attributes, Properties, and Constants
- Arrays and Collections (ArrayList, HashMap, HashSet, LinkedList)
- Control Flow (if-else, loops, switch)
- OOP Concepts (constructors, encapsulation, inheritance)
- File Operations: Create, open, save, delete files
- Project Management: Build, compile, run, debug, clean projects
- Editor Actions: Cut, copy, paste, undo, redo, find, replace
- View Management: Split editor, toggle panels, zoom controls
- Navigation: Go to file, switch editors, cursor movements
- Configuration: Settings, extensions, themes, keyboard shortcuts
- Git initialization and configuration
- Branch creation and switching
- Add, commit, push, pull operations
- Merge, diff, log, status, stash commands
- Intent classification with high accuracy
- Entity extraction (names, types, numbers, messages)
- Slot filling for context retention
- Multi-turn conversation support
- Fallback handling for out-of-scope queries
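The fallback behaviour listed above can be pictured with a small sketch. It assumes a parse result shaped like Rasa's NLU output (an `intent_ranking` list of `name`/`confidence` pairs) and follows the idea behind Rasa's `FallbackClassifier`: when the top intent is too uncertain, or too close to the runner-up, the message is treated as out of scope. The function name, the threshold values, and the sample intent names are illustrative assumptions, not values taken from this project's config.yml or domain.yml.

```python
from typing import Any, Dict

FALLBACK_INTENT = "nlu_fallback"  # intent name Rasa's FallbackClassifier predicts


def resolve_intent(parse_result: Dict[str, Any],
                   threshold: float = 0.3,
                   ambiguity_threshold: float = 0.1) -> str:
    """Return the intent to act on, or the fallback intent when unsure.

    `threshold` and `ambiguity_threshold` are illustrative values.
    """
    ranking = sorted(parse_result.get("intent_ranking", []),
                     key=lambda item: item["confidence"], reverse=True)
    if not ranking:
        return FALLBACK_INTENT
    top = ranking[0]
    # Case 1: the best guess is not confident enough.
    if top["confidence"] < threshold:
        return FALLBACK_INTENT
    # Case 2: the two best guesses are too close to tell apart.
    if len(ranking) > 1 and top["confidence"] - ranking[1]["confidence"] < ambiguity_threshold:
        return FALLBACK_INTENT
    return top["name"]


# Hypothetical parse results; the intent names are made up for illustration.
confident = {"intent_ranking": [{"name": "create_class", "confidence": 0.92},
                                {"name": "create_method", "confidence": 0.05}]}
ambiguous = {"intent_ranking": [{"name": "create_class", "confidence": 0.45},
                                {"name": "create_method", "confidence": 0.41}]}

print(resolve_intent(confident))
print(resolve_intent(ambiguous))
```

In a real Rasa project this decision is made inside the NLU pipeline; the sketch only shows the logic a client could apply to a parse result.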
┌────────────────────────────────────────────────────────────┐
│                       User Interface                       │
│                  (Voice/Text Input Layer)                  │
└──────────────────────────────┬─────────────────────────────┘
                               │
                               ▼
┌────────────────────────────────────────────────────────────┐
│                 VENIC Dialogue Management                  │
│                        (Rasa Core)                         │
├────────────────────────────────────────────────────────────┤
│  ┌──────────────────┐       ┌──────────────────────────┐   │
│  │   NLU Pipeline   │       │    Dialogue Policies     │   │
│  │ - Tokenization   │       │ - Memoization Policy     │   │
│  │ - Featurization  │◄─────►│ - Rule Policy            │   │
│  │ - Intent Class.  │       │ - TED Policy             │   │
│  │ - Entity Extract │       │ - Fallback Classifier    │   │
│  └──────────────────┘       └──────────────────────────┘   │
│                                                            │
│  ┌──────────────────────────────────────────────────────┐  │
│  │                Domain & Training Data                │  │
│  │ - Intents (200+)    - Entities      - Responses      │  │
│  │ - Stories           - Rules         - Slots          │  │
│  └──────────────────────────────────────────────────────┘  │
└──────────────────────────────┬─────────────────────────────┘
                               │
                               ▼
┌────────────────────────────────────────────────────────────┐
│                  Action Server / Backend                   │
│           (IDE Integration / Command Execution)            │
└────────────────────────────────────────────────────────────┘
- Framework: Rasa 3.0.8
- Language: Python 3.7.7
- NLU Pipeline: WhitespaceTokenizer, DIETClassifier, EntitySynonymMapper
- Policies: MemoizationPolicy, RulePolicy, TEDPolicy, UnexpecTEDIntentPolicy
- Containerization: Docker
- Deployment: Okteto, Azure Kubernetes Service (AKS)
- CI/CD: GitHub Actions
- Development Tools: Jupyter Notebooks for training and analysis
Before setting up the project, ensure you have the following installed:
- Python: 3.7.7 or compatible version
- pip: Latest version
- Docker: (Optional) For containerized deployment
- Git: For version control
- Rasa: 3.0.8 (will be installed via pip)
- Clone the repository
git clone https://github.com/paradocx96/venic-dm.git
cd venic-dm
- Install Rasa
pip install rasa==3.0.8
- Verify installation
rasa --version
- Build the Docker image
docker build -t venic-dm:latest .
- Run using Docker Compose
docker-compose up
The Rasa server will be available at http://localhost:5006
Train the Rasa model with your training data:
rasa train
This creates a new model in the models/ directory.
Start the Rasa server with API enabled:
rasa run --enable-api --cors "*" --debug
Default port: 5005
Test the model interactively in the command line:
rasa shell
Send a message to the bot:
curl -X POST http://localhost:5005/webhooks/rest/webhook \
-H "Content-Type: application/json" \
-d '{
"sender": "user",
"message": "create a class named Student"
}'
venic-dm/
├── actions/ # Custom Rasa actions
│ ├── __init__.py
│ └── actions.py # Custom action implementations
├── data/ # Training data
│ ├── nlu.yml # NLU training examples
│ ├── stories.yml # Conversation stories
│ └── rules.yml # Conversation rules
├── dataset_generate/ # Dataset generation tools
│ ├── data_nlu/ # Organized NLU data by category
│ │ ├── GIT-Command/ # Git-related intents
│ │ ├── IDE-Command/ # IDE operation intents
│ │ ├── Java-Command/ # Java programming intents
│ │ ├── Other-Error/ # Error handling intents
│ │ └── data_story/ # Story templates
│ ├── text_generate.py # Script for generating training data
│ └── result_inform*.yml # Generated data samples
├── models/ # Trained Rasa models (generated)
├── notebooks/ # Jupyter notebooks
│ ├── Rasa_Model_Train_Final.ipynb
│ ├── Google_Colab_Testing_v*.ipynb
│ └── Google_Colab_Train.ipynb
├── results/ # Model evaluation results
│ └── [timestamp]/ # Timestamped test results
│ ├── intent_confusion_matrix.png
│ ├── intent_report.json
│ ├── DIETClassifier_*.png
│ └── TEDPolicy_*.png
├── tests/ # Test stories
│ └── test_stories.yml
├── .github/ # GitHub Actions workflows
│ └── workflows/
│ ├── docker-image.yml
│ └── deploy-aks.yml
├── config.yml # Rasa NLU and Core configuration
├── domain.yml # Domain file (intents, entities, responses)
├── credentials.yml # Channel credentials
├── endpoints.yml # Endpoint configuration
├── Dockerfile # Docker image definition
├── docker-compose.yml # Docker Compose configuration
├── okteto.yml # Okteto deployment configuration
└── README.md # Project documentation
Defines the NLU pipeline and dialogue policies. The default configuration uses:
- NLU Pipeline: Optimized for English language processing
- Dialogue Policies: Combination of rule-based and machine learning policies
Contains:
- Intents: 200+ intents for programming, IDE, and Git commands
- Entities: name, type, process, number, message
- Slots: Context storage for conversation management
- Responses: Templated responses for each intent
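To make the role of slots as context storage concrete, here is a minimal sketch of slot filling across two turns. It assumes each slot is filled from the entity of the same name (Rasa's `from_entity` slot mapping is one way to configure this); whether this project's domain.yml maps slots that way is an assumption, and `fill_slots` is a hypothetical helper rather than part of Rasa.

```python
from typing import Any, Dict, List, Optional

# Entity names listed in the domain description above.
SLOT_NAMES = ("name", "type", "process", "number", "message")


def fill_slots(slots: Dict[str, Optional[str]],
               entities: List[Dict[str, Any]]) -> Dict[str, Optional[str]]:
    """Return a copy of `slots` updated with the entities of one user turn."""
    updated = dict(slots)
    for entity in entities:
        # Only entities that have a matching slot are stored.
        if entity["entity"] in SLOT_NAMES:
            updated[entity["entity"]] = entity["value"]
    return updated


# Turn 1: "create a class named Student"
slots = fill_slots({slot: None for slot in SLOT_NAMES},
                   [{"entity": "name", "value": "Student"}])
# Turn 2: "add a variable of type int" (the class name is still remembered)
slots = fill_slots(slots, [{"entity": "type", "value": "int"}])

print(slots)
```

Because the slot values persist between turns, a later command can refer back to the class created earlier without naming it again.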
Configures:
- Action server endpoint
- Tracker store
- Event broker
# Train a new model
rasa train
# Train NLU only
rasa train nlu
# Train Core only
rasa train core
Training notebooks are provided in the notebooks/ directory:
- Rasa_Model_Train_Final.ipynb: Complete training pipeline
- Google_Colab_Train.ipynb: Training on Google Colab with GPU support
Training data is organized by command type:
- GIT-Command: Git operations
- IDE-Command: IDE interactions (cursor, file, font, project, statement)
- Java-Command: Java language constructs (30+ categories)
- Other-Error: Error handling and common queries
rasa shell
Run automated tests using test stories:
rasa test
Results are saved to the results/ directory with:
- Confusion matrices
- Classification reports
- Failed test stories
- Precision, recall, and F1 scores
Testing notebooks are available:
- Google_Colab_Testing_v1.ipynb: Basic model testing
- Google_Colab_Testing_v2.ipynb: Enhanced testing with metrics
- Google_Colab_Testing_v3.ipynb: Comprehensive evaluation
rasa run --enable-api --cors "*" -p 5005
docker-compose up -d
The project includes okteto.yml for deployment to Okteto Cloud:
okteto deploy
A GitHub Actions workflow is configured for automated deployment to AKS:
- Workflow: .github/workflows/deploy-aks.yml
- Trigger: Push to main branch
- Steps: Build Docker image → Push to registry → Deploy to AKS
| Endpoint | URL |
|---|---|
| Localhost | http://localhost:5005 |
| Okteto Cloud | https://rasa-paradocx96.cloud.okteto.net |
Send Message
POST /webhooks/rest/webhook
Content-Type: application/json
{
"sender": "user_id",
"message": "create a class named Student"
}
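The same request can be issued from Python. The sketch below uses only the standard library and assumes the local endpoint listed above; `build_message_request` and `send_message` are hypothetical helper names, and `send_message` only works while a Rasa server is running with the REST channel enabled.

```python
import json
import urllib.request
from typing import Any, Dict, List

BASE_URL = "http://localhost:5005"  # local endpoint; adjust for other deployments


def build_message_request(sender: str, message: str,
                          base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request for the REST webhook without sending it."""
    payload = json.dumps({"sender": sender, "message": message}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/webhooks/rest/webhook",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_message(sender: str, message: str, base_url: str = BASE_URL,
                 timeout: float = 10.0) -> List[Dict[str, Any]]:
    """Send the message and return the decoded JSON list of bot responses."""
    request = build_message_request(sender, message, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_message_request("user_id", "create a class named Student")
print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without a running server.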
Response
[
{
"recipient_id": "user_id",
"text": "Student class created"
}
]
The dataset_generate/ directory contains tools for generating training data:
Script for automated generation of NLU training examples:
cd dataset_generate
python text_generate.py
Training data is modularized by command type for easier maintenance and extension:
- Each category has dedicated YAML files
- Examples follow consistent formatting
- Entity annotations are clearly marked
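As an illustration of how annotated examples can be generated, the sketch below expands sentence templates into Rasa-style training examples, using the `[value](entity)` annotation syntax and the Rasa 3.0 YAML layout. It is not the project's text_generate.py: the function names, templates, values, and the `create_class` intent name are all hypothetical.

```python
from itertools import product
from typing import Dict, List


def generate_examples(templates: List[str],
                      values: Dict[str, List[str]]) -> List[str]:
    """Expand templates such as 'create a class named {name}' into
    examples with Rasa entity annotations, e.g. '[Student](name)'."""
    examples = []
    for template in templates:
        # Only the entities that actually occur in this template are expanded.
        used = [key for key in values if "{" + key + "}" in template]
        for combo in product(*(values[key] for key in used)):
            annotated = {key: f"[{value}]({key})" for key, value in zip(used, combo)}
            examples.append(template.format(**annotated))
    return examples


def to_nlu_yaml(intent: str, examples: List[str]) -> str:
    """Render the examples for one intent in the Rasa 3.0 NLU YAML layout."""
    lines = ['version: "3.0"', "nlu:", f"- intent: {intent}", "  examples: |"]
    lines += [f"    - {example}" for example in examples]
    return "\n".join(lines) + "\n"


examples = generate_examples(
    ["create a class named {name}", "declare a {type} variable called {name}"],
    {"name": ["Student", "Account"], "type": ["int"]},
)
print(to_nlu_yaml("create_class", examples[:2]))
```

Generating examples from templates keeps the annotation format consistent, but the output still benefits from manual review so that the phrasing stays natural.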
Model performance is tracked using:
- Intent Classification: Precision, recall, F1-score
- Entity Extraction: Exact match accuracy
- Dialogue Prediction: Story success rate
- Confusion Matrices: Visual representation of misclassifications
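For reference, the intent classification metrics above can be computed from pairs of true and predicted labels as follows. This is a minimal sketch of the standard definitions; the label lists and intent names are made-up sample data, not results from this project.

```python
from typing import Dict, List


def intent_metrics(true_labels: List[str], predicted_labels: List[str],
                   intent: str) -> Dict[str, float]:
    """Compute precision, recall and F1 for a single intent."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == intent and p == intent)  # correct hits
    fp = sum(1 for t, p in pairs if t != intent and p == intent)  # false alarms
    fn = sum(1 for t, p in pairs if t == intent and p != intent)  # misses
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical sample: one create_class utterance was misread as git_commit.
true_labels = ["create_class", "create_class", "git_commit", "git_commit"]
predicted_labels = ["create_class", "git_commit", "git_commit", "git_commit"]

print(intent_metrics(true_labels, predicted_labels, "create_class"))
print(intent_metrics(true_labels, predicted_labels, "git_commit"))
```

In this sample, create_class has perfect precision but only half its utterances are recovered, while git_commit has perfect recall but absorbs one misclassified utterance.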
Results are stored in timestamped directories under results/:
results/
└── 220817 2254/
├── intent_confusion_matrix.png # Intent classification visualization
├── intent_report.json # Detailed metrics
├── DIETClassifier_report.json # Entity extraction performance
├── TEDPolicy_report.json # Dialogue policy performance
└── failed_test_stories.yml # Stories that failed testing
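A report file such as intent_report.json can be scanned for the intents that need more training data. The sketch below assumes the report is a JSON object mapping each intent name to a dictionary with an `f1-score` field, alongside aggregate entries such as `accuracy` and `macro avg`; that schema is an assumption and should be checked against an actual report. `weakest_intents` is a hypothetical helper, and the sample values are invented.

```python
import json
from typing import Any, Dict, List, Tuple

# Aggregate rows that are not intents (assumed names).
AGGREGATE_KEYS = {"micro avg", "macro avg", "weighted avg"}


def weakest_intents(report: Dict[str, Any],
                    limit: int = 5) -> List[Tuple[str, float]]:
    """Return up to `limit` (intent, f1) pairs, lowest F1 first."""
    scored = [
        (name, entry["f1-score"])
        for name, entry in report.items()
        if isinstance(entry, dict)
        and "f1-score" in entry
        and name not in AGGREGATE_KEYS
    ]
    return sorted(scored, key=lambda item: item[1])[:limit]


# Invented sample in the assumed schema; a real report would be loaded with
# json.load(open("results/<timestamp>/intent_report.json")).
sample_report = json.loads("""
{
  "create_class": {"precision": 1.0, "recall": 0.5, "f1-score": 0.67, "support": 2},
  "git_commit": {"precision": 0.67, "recall": 1.0, "f1-score": 0.8, "support": 2},
  "accuracy": 0.75,
  "macro avg": {"precision": 0.83, "recall": 0.75, "f1-score": 0.73, "support": 4}
}
""")

print(weakest_intents(sample_report, limit=1))
```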
Contributions are welcome! Please follow these guidelines:
- Fork the repository
- Create a feature branch
git checkout -b feature/your-feature-name
- Commit your changes
git commit -m "Add: Description of your feature"
- Push to the branch
git push origin feature/your-feature-name
- Open a Pull Request
- Follow Rasa best practices for dialogue design
- Add test stories for new intents
- Update domain.yml with new intents and responses
- Document new features in the README
- Ensure all tests pass before submitting PR
This project is part of the VENIC system. Please refer to the repository for license information.
Website: https://venic.io
GitHub Repository: https://github.com/paradocx96/venic-dm
Author: paradocx96
- Built with Rasa Open Source
- Powered by Python and modern NLU technologies
- Deployed on Okteto Cloud and Azure Kubernetes Service