Last Updated: 2024-10-31
Status: Phase 1-2 Complete (Backend), Phase 3 Starting (Tauri IDE)
OLD Vision: VS Code fork with embedded SLMs
NEW Vision: Lightweight Tauri-based IDE with local AI + optional cloud providers
Reason: User feedback requested a simpler IDE, full AI chat with project manipulation, YOLO mode, a learning system, and RAG for framework docs. The VS Code fork was too complex and restrictive.
NeuralForge is an open-source local-first AI IDE built with Tauri, offering intelligent code completion, AI chat with project manipulation, adaptive learning, and framework documentation search.
- Binary Size: ~50MB (Tauri efficiency)
- RAM Usage: ~300MB idle (75% less than Electron)
- Inference Speed: <50ms (local AI)
- Cost: $0 for local AI (optional cloud providers)
- Privacy: 100% local by default, optional cloud
"Empower developers with a lightweight, privacy-first AI IDE that learns their coding style, provides intelligent assistance, and optionally connects to cloud AI providers - all while running primarily on local hardware."
- Local-First: Default to local AI, optional cloud providers
- Lightweight: Tauri (50MB) vs Electron (200MB) architecture
- Privacy-Respecting: Code stays local unless user chooses cloud
- Community-Driven: Open-source with transparent development
- Adaptive Learning: AI learns user's coding style over time
- Developer-Centric: Built by developers, for developers
- Individual Developers: Privacy-conscious developers wanting local AI
- Small Teams: Startups needing lightweight tools with optional cloud
- Enterprise Developers: Companies preferring on-premise with optional external AI
- Students: Learning to code with AI mentoring
- Open Source Contributors: Developers building in public

Persona 1: Privacy-Conscious Professional
- 5+ years experience
- Works on sensitive/proprietary projects
- Wants AI help without mandatory cloud dependency
- Appreciates option to use cloud AI when needed

Persona 2: Resource-Constrained Student
- Student or junior developer
- 4GB-8GB RAM laptop
- Cannot afford heavy Electron apps
- Needs fast, responsive IDE

Persona 3: Hybrid Local+Cloud Developer
- Uses local AI for most tasks
- Occasionally needs GPT-4/Claude for complex problems
- Values choice: local by default, cloud optional
- Cares about performance and privacy
Features:
├── Syntax highlighting (50+ languages)
├── IntelliSense
├── Multi-file tabs
├── File explorer tree view
├── Settings panel
└── Theme support (dark/light)

Why Tauri?
- 50MB RAM vs 200MB (Electron)
- Native performance
- Smaller binary size (~50MB vs ~200MB)
- Faster startup (<3s vs 5-8s)
interface GhostTextEngine {
- Real-time suggestions as you type
- Gray text overlay (opacity 0.6)
- Tab to accept, Esc to reject
- 500ms debouncing (smart request throttling)
- YOLO mode (auto-accept completions)
}

- Latency: <50ms (local model)
- Accuracy: Context-aware (file + cursor position)
- Memory: <2GB model footprint (CodeT5+ 220M)
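
To make the 500ms throttling concrete, here is a minimal TypeScript sketch of the debounced request flow; `requestCompletion` and `acceptCompletion` are hypothetical hooks, not a finalized API:

```typescript
// Hypothetical backend call and accept hook (assumptions, not a fixed API).
declare function requestCompletion(context: string): Promise<string>;
declare function acceptCompletion(text: string): void;

// Debounce ghost-text requests so the local model is not hit on every keystroke.
class GhostTextScheduler {
  private timer: ReturnType<typeof setTimeout> | null = null;

  constructor(
    private readonly delayMs = 500,     // matches the 500ms debounce target
    private readonly yoloMode = false,  // YOLO mode: auto-accept completions
  ) {}

  onKeystroke(context: string, show: (text: string) => void): void {
    if (this.timer !== null) clearTimeout(this.timer); // drop the stale request
    this.timer = setTimeout(async () => {
      const suggestion = await requestCompletion(context);
      if (this.yoloMode) acceptCompletion(suggestion);
      else show(suggestion); // gray overlay: Tab accepts, Esc rejects
    }, this.delayMs);
  }
}
```

Each new keystroke cancels the pending request, so only pauses in typing reach the model.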
UI:
- Side panel (resizable)
- Message bubbles (user right, AI left)
- Code blocks with syntax highlighting
- Copy button for code snippets
Context-Awareness:
- Knows about open files
- Understands current selection
- Analyzes compile errors
Providers:
- Local (default, CodeT5+ 220M)
- OpenAI (optional, user's API key)
- Claude (optional, user's API key)
- Gemini (optional, user's API key)
- Custom (user's endpoint)

// AI can directly manipulate project files
interface CodeActions {
  createFile(path: string, content: string): void;
  modifyFile(path: string, changes: FileDiff): void;
  deleteFile(path: string): void;

  // With preview and confirmation
  previewChanges(actions: FileAction[]): void;
  applyChanges(actions: FileAction[]): void;
  undoChanges(actionId: string): void;
}
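
As an illustration of the preview-and-confirm flow, a hedged sketch of how a caller might chain these methods; `confirmWithUser` and `proposedActions` are hypothetical:

```typescript
// Hypothetical helpers: the AI's proposed actions and a confirmation dialog.
declare const proposedActions: FileAction[];
declare function confirmWithUser(): Promise<boolean>;

async function runAiEdit(codeActions: CodeActions): Promise<void> {
  codeActions.previewChanges(proposedActions); // show diffs before touching disk
  if (await confirmWithUser()) {               // explicit approval (skipped in YOLO mode)
    codeActions.applyChanges(proposedActions);
  }
  // undoChanges(actionId) remains available if the user reverts later
}
```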
class StyleAnalyzer {
  // Detects user preferences from codebase
- Naming conventions (camelCase vs snake_case)
- Formatting style (tabs vs spaces, brace position)
- Comment density and style
- Import organization
- Test patterns
}

class StyleAdapter {
// Applies learned style to AI output
- Match user's naming conventions
- Follow user's formatting
- Suggest libraries user already uses
- Maintain project consistency
}

// Show user what AI learned
interface LearningDashboard {
  namingStyle: "camelCase" | "snake_case" | "mixed";
  librariesUsed: string[];   // ["Spring Boot", "JUnit 5", "Lombok"]
  testFramework: string;     // "JUnit 5"
  codeStyle: StyleProfile;   // indent, braces, etc.
}

Features:
├── Embedded mode (no external services)
├── Semantic search (CodeBERT embeddings)
├── Framework auto-detection (pom.xml, package.json)
├── Auto-crawler (Spring Boot, React, etc. docs)
└── Code snippet extraction

// User asks in chat:
"How to use @Transactional in Spring Boot?"
// RAG System:
1. Detects Spring Boot from pom.xml
2. Queries Qdrant for "Spring @Transactional"
3. Returns top 5 relevant doc snippets
4. AI synthesizes answer with examples
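
A rough TypeScript sketch of that four-step flow, assuming illustrative `detectFramework`, `embed`, `vectorStore`, and `llm` helpers rather than a fixed API:

```typescript
// Illustrative assumptions: none of these helpers are finalized APIs.
declare function detectFramework(projectRoot: string): string;
declare function embed(text: string): Promise<number[]>;
declare const vectorStore: {
  search(q: { collection: string; vector: number[]; limit: number }): Promise<string[]>;
};
declare const llm: { synthesize(question: string, docs: string[]): Promise<string> };

async function answerWithDocs(question: string, projectRoot: string): Promise<string> {
  const framework = detectFramework(projectRoot); // e.g. "spring-boot" from pom.xml
  const queryVector = await embed(question);      // CodeBERT embedding
  const snippets = await vectorStore.search({     // embedded Qdrant
    collection: `${framework}-docs`,
    vector: queryVector,
    limit: 5,                                     // top 5 doc snippets
  });
  return llm.synthesize(question, snippets);      // answer with examples
}
```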
// AI understands:
- Compile errors (from editor diagnostics)
- Cursor position (what user is working on)
- Selected code (refactoring target)
- Project structure (files, dependencies)

// User: "Refactor this class to use dependency injection"
// AI:
1. Shows plan (5 steps)
2. User approves
3. AI executes with progress tracker:
✅ Create interface for service
✅ Update constructor with @Autowired
🔄 Modify test cases... (in progress)
⏸️ Update configuration
⏸️ Verify build passes
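
One way the progress tracker could model this, sketched in TypeScript (step statuses and names are illustrative):

```typescript
type StepStatus = "done" | "in-progress" | "pending";

interface PlanStep {
  description: string; // e.g. "Create interface for service"
  status: StepStatus;
}

// Run approved steps in order, updating the tracker as each one finishes;
// steps not yet reached stay "pending" (shown as ⏸️ above).
async function executePlan(
  steps: PlanStep[],
  run: (step: PlanStep) => Promise<void>,
): Promise<void> {
  for (const step of steps) {
    step.status = "in-progress";
    await run(step); // perform this step's file edits
    step.status = "done";
  }
}
```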
Project Profile:
├── Language distribution
├── Framework patterns
├── Naming conventions
├── Comment density
├── Testing patterns
├── Error handling style
└── Architecture patterns
→ Creates unique 512-dimension embedding
→ Finds similar projects for transfer learning

async def swarm_inference(prompt):
    # Three models generate in parallel at different temperatures
    results = await asyncio.gather(
        model1.generate(prompt, temp=0.7),
        model2.generate(prompt, temp=0.8),
        model3.generate(prompt, temp=0.9)
    )
    return weighted_voting(results)

- Models vote on best completion
- Confidence scores determine weight
- Disagreement triggers user choice
interface MentorFeatures {
  codeReview: {
    complexity: "O(n²) detected, optimize?",
    security: "SQL injection risk here",
    performance: "Consider caching this",
    style: "This violates team conventions"
  },
  teaching: {
    explanations: "Here's why this works...",
    alternatives: "3 other ways to do this",
    bestPractices: "Industry standard approach",
    edgeCases: "Don't forget null check"
  }
}

- Identifies similar past bugs
- Shows historical fixes
- Predicts potential issues
- Links to relevant commits
# Share your fine-tuned adapter
neural-forge push adapter my-style --public
# Download specialized adapters
neural-forge pull adapter facebook/react-hooks
neural-forge pull adapter google/angular-best
# Merge multiple adapters
neural-forge merge adapter1 adapter2 --output custom

Example community adapters:
- spring-boot-microservices
- react-hooks-typescript
- clean-architecture-java
- competitive-programming-cpp
- data-science-python
class FederatedLearning:
    def share_gradients(self, local_gradients):
        # Differential privacy
        noise = generate_gaussian_noise(epsilon=1.0)
        private_gradients = local_gradients + noise
        # Secure aggregation
        encrypted = homomorphic_encrypt(private_gradients)
        return send_to_federation(encrypted)

┌─────────────────────────────────────────────────┐
│              Tauri IDE (Rust + WebView)         │
├─────────────────────────────────────────────────┤
│ Extension API Compatibility │
├─────────────┬───────────────────────────────────┤
│ Editor │ AI Service │
│ Core │ ┌─────────────────────┐ │
│ │ │ Java Backend │ │
│ Monaco │◄───┤ Spring Boot │ │
│ Editor │ │ Embedded JVM │ │
│ │ └──────┬──────────────┘ │
│ │ │ │
│ File │ ┌──────▼──────────────┐ │
│ System │ │ Model Engine │ │
│ │ │ DJL/ONNX Runtime │ │
│ Terminal │ └──────┬──────────────┘ │
│ │ │ │
│ Git │ ┌──────▼──────────────┐ │
│ Integration│ │ Vector Store │ │
│ │ │ Qdrant Embedded │ │
└─────────────┴────┴─────────────────────┴───────┘
Model Server:
├── Model Registry
│ ├── Active Models (hot)
│ ├── Available Models (cold)
│ └── Downloading Queue
│
├── Inference Engine
│ ├── ONNX Runtime (CPU)
│ ├── CUDA Provider (GPU optional)
│ └── Quantization (INT4/INT8)
│
├── Adapter Manager
│ ├── LoRA Loader
│ ├── Merge Engine
│ └── Weight Calculator
│
└── Cache Layer
├── KV Cache (attention)
├── Result Cache (completions)
└── Embedding Cache (RAG)
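
As one example of the cache layer above, a minimal sketch of a completion result cache keyed by a context hash; the eviction policy and entry limit are assumptions:

```typescript
// Identical contexts skip inference entirely; oldest entries are evicted
// first (a JS Map preserves insertion order).
class CompletionCache {
  private cache = new Map<string, string>();

  constructor(private readonly maxEntries = 1000) {}

  get(contextHash: string): string | undefined {
    return this.cache.get(contextHash);
  }

  put(contextHash: string, completion: string): void {
    if (this.cache.size >= this.maxEntries) {
      const oldest = this.cache.keys().next().value;
      if (oldest !== undefined) this.cache.delete(oldest);
    }
    this.cache.set(contextHash, completion);
  }
}
```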
class IncrementalLearningPipeline:
    def __init__(self):
        self.buffer_size = 1000  # interactions
        self.replay_buffer = ExperienceReplay()
        self.ewc_fisher = {}  # Elastic weight consolidation (Fisher information)

    async def process_interaction(self, code, action, feedback):
        # 1. Store in buffer
        self.replay_buffer.add(code, action, feedback)
        # 2. Check if training needed
        if self.should_train():
            await self.mini_batch_training()
        # 3. Update Fisher information
        if self.is_important_knowledge():
            self.update_ewc_weights()

RAG Pipeline:
├── Code Indexer
│ ├── AST Parser (Java/TS/Python)
│ ├── Semantic Chunker
│ └── Dependency Analyzer
│
├── Embedding Engine
│ ├── CodeBERT embeddings
│ └── Custom fine-tuned embedder
│
├── Vector Database (Qdrant)
│ ├── Project vectors
│ ├── Documentation vectors
│ └── Stack Overflow cache
│
└── Retrieval Strategy
├── Semantic search
├── Keyword matching
└── Graph traversal
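
To illustrate the semantic chunker from the pipeline above, a simplified TypeScript stand-in; a real implementation would walk the AST rather than count lines:

```typescript
// Simplified stand-in for the AST-based semantic chunker: split source
// into roughly function-sized chunks before embedding.
interface Chunk {
  text: string;
  startLine: number;
}

function chunkSource(source: string, maxLines = 40): Chunk[] {
  const lines = source.split("\n");
  const chunks: Chunk[] = [];
  for (let i = 0; i < lines.length; i += maxLines) {
    chunks.push({
      text: lines.slice(i, i + maxLines).join("\n"),
      startLine: i + 1,
    });
  }
  return chunks;
}
```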
Candidate Models:

| Model | Size | License | Purpose | Context |
|---|---|---|---|---|
| CodeT5+ 770M | 1.5GB | Apache 2.0 | Primary completion | 2K |
| StableCode 3B | 6GB (Q4) | Apache 2.0 | Long context tasks | 16K |
| SantaCoder 1.1B | 2.2GB | Apache 2.0 | Multi-language | 2K |
| StarCoder 15B | 9GB (Q4) | Apache 2.0 | Complex generation | 8K |
| TinyStories-Code 33M | 66MB | MIT | Comments/docs | 512 |
Quantization Levels:
├── FP16: Development/Fine-tuning
├── INT8: Default deployment
├── INT4: Memory-constrained
└── INT2: Experimental ultra-light
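
A rough worked example of what these levels mean for memory, assuming footprint ≈ parameters × bits per weight ÷ 8 and ignoring activation/KV-cache overhead:

```typescript
// Rule of thumb only: real footprints include runtime overhead.
function estimateModelMemoryMB(params: number, bitsPerWeight: number): number {
  return (params * bitsPerWeight) / 8 / 1e6;
}

// CodeT5+ 770M as an example:
const fp16 = estimateModelMemoryMB(770e6, 16); // ≈ 1540 MB (the ~1.5GB in the table)
const int8 = estimateModelMemoryMB(770e6, 8);  // ≈ 770 MB (default deployment)
const int4 = estimateModelMemoryMB(770e6, 4);  // ≈ 385 MB (75% smaller than FP16)
```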
function selectModel(context) {
  const complexity = analyzeComplexity(context);
  const fileSize = context.file.length;
  const language = context.language;

  if (fileSize > 1000 && complexity.requiresFullContext) {
    return 'stablecode-3b'; // 16K context
  }
  if (complexity.score > 0.8) {
    return 'starcoder-15b'; // Most capable
  }
  if (['Java', 'Python', 'JavaScript'].includes(language)) {
    return 'codet5-770m'; // Best for common languages
  }
  return 'santacoder-1.1b'; // Good default
}
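
For example, the router above might be invoked per completion request (the context literal is illustrative):

```typescript
declare const sourceText: string; // current buffer contents (illustrative)

const model = selectModel({ file: sourceText, language: "Java" });
console.log(model); // e.g. "codet5-770m" for a small Java file
```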
Training Triggers:
├── Scheduled: Daily at low-activity hours
├── Threshold: Every 1000 interactions
├── Manual: User-initiated training
└── Git-hook: On major commits
Training Data Sources:
├── Accepted completions (positive)
├── Rejected completions (negative)
├── Manual corrections (highest weight)
├── Code review comments
├── Test results (pass/fail)
└── Build logs (success/error)

class AdapterStack:
    """
    Composable LoRA adapters for different aspects
    """
    def __init__(self):
        self.adapters = {
            'base': None,                             # Frozen base model
            'language': LoRAAdapter(r=4, alpha=8),    # 10MB
            'framework': LoRAAdapter(r=8, alpha=16),  # 40MB
            'project': LoRAAdapter(r=8, alpha=16),    # 40MB
            'personal': LoRAAdapter(r=4, alpha=8),    # 10MB
            'team': LoRAAdapter(r=4, alpha=8),        # 10MB
        }

    def combine_adapters(self, weights=None):
        """Weighted combination of trainable adapters (the frozen base is excluded)"""
        if weights is None:
            weights = self.auto_calculate_weights()
        trainable = [a for a in self.adapters.values() if a is not None]
        combined = sum(
            adapter * weight
            for adapter, weight in zip(trainable, weights)
        )
        return combined

class ContinualLearningStrategy:
    def __init__(self):
        self.strategies = {
            'ewc': ElasticWeightConsolidation(),
            'replay': ExperienceReplay(buffer_size=10000),
            'distill': KnowledgeDistillation(),
            'maml': ModelAgnosticMetaLearning(),
        }

    def train_incremental(self, new_data):
        # Prevent catastrophic forgetting
        old_knowledge = self.strategies['replay'].sample(100)
        # Mix old and new data
        training_data = self.mix_data(
            new_data,
            old_knowledge,
            ratio=0.8  # 80% new, 20% old
        )
        # Apply EWC regularization
        loss = self.compute_loss(training_data)
        loss += self.strategies['ewc'].penalty()
        return loss

Our focus is on:
- 100% Local: All processing on user's machine
- Privacy First: No data collection or telemetry
- Resource Efficient: Optimized for lower-end hardware
- Open Source: Fully transparent development
- Community Driven: Adapters and extensions from users
Features we're working toward:
- Continuous Learning: Model improves from user feedback
- Project-Specific Adapters: Unique model per project
- Multi-Model Support: Different models for different tasks
- Git Integration: Learn from commit history
- Offline First: No internet required
- Adapter Marketplace: Community-shared models
- Resource Efficiency: Runs on modest hardware
- Complete Privacy: Zero data leaves your machine
Installation (5 minutes):
1. Download (150MB base + lazy load models)
2. Launch → Automatic setup wizard
3. Select languages you use
4. Choose model size (small/medium/large)
5. Optional: Import Git history for learning
6. Ready to code!

// Morning startup
- Models load in background (3 seconds)
- Yesterday's learnings applied
- New team commits analyzed
// While coding
- Ghost text appears (30ms)
- Tab to accept, Esc to reject
- Cmd+K for AI commands
- Right-click for "AI Explain"
// End of day
- Auto-training from today's work
- Adapter saved to profile
- Optional: Share improvements with team

New UI Elements:
├── AI Status Bar
│ ├── Active model indicator
│ ├── Confidence percentage
│ ├── Learning progress
│ └── Resource usage
│
├── AI Side Panel
│ ├── Chat interface
│ ├── Explanation view
│ ├── Suggestion history
│ └── Training dashboard
│
├── Inline Overlays
│ ├── Complexity warnings
│ ├── Performance hints
│ ├── Security alerts
│ └── Alternative suggestions
│
└── Settings Page
├── Model management
├── Training configuration
├── Privacy settings
└── Adapter marketplace

| Shortcut | Action |
|---|---|
| Tab | Accept completion |
| Cmd+K | AI command palette |
| Cmd+I | Inline chat |
| Cmd+Shift+E | Explain selection |
| Cmd+Shift+R | Refactor with AI |
| Cmd+Shift+T | Generate tests |
| Cmd+Shift+D | Generate docs |
| Alt+Enter | Show alternatives |
Marketplace Structure:
├── Official Adapters
│ ├── Verified by core team
│ ├── Tested on large codebases
│ └── Guaranteed compatibility
│
├── Community Adapters
│ ├── User ratings/reviews
│ ├── Download counts
│ ├── Source code visible
│ └── Fork and modify
│
└── Enterprise Adapters
├── Private sharing
├── Team-only access
└── Encrypted storage

class FederationNode:
    def join_network(self, network_id):
        """Join learning network while preserving privacy"""
        self.network = network_id
        self.share_gradients = True
        self.receive_updates = True

    def contribute(self):
        """Share learning without sharing code"""
        gradients = self.compute_gradients()
        encrypted = self.apply_differential_privacy(gradients)
        self.network.submit(encrypted)

Current Status: Early development phase
For now, please use:
- GitHub Issues: Bug reports and feature requests
- GitHub Discussions: General questions and ideas
- Pull Requests: Code contributions welcome
Note: Dedicated community channels will be established as the project matures.
Minimum:
OS: Windows 10 / macOS 10.14 / Ubuntu 20.04
CPU: 2 cores @ 2.4GHz
RAM: 4GB
Storage: 10GB
GPU: None required

Recommended:
OS: Latest stable
CPU: 4 cores @ 3.0GHz
RAM: 8GB
Storage: 20GB
GPU: Optional (6GB VRAM)

| Metric | Target | Maximum |
|---|---|---|
| Startup time | <5s | 10s |
| First completion | <100ms | 200ms |
| Subsequent completions | <50ms | 100ms |
| Model switching | <1s | 2s |
| Training time (1K samples) | <5min | 10min |
| RAM usage (idle) | <1GB | 2GB |
| RAM usage (active) | <3GB | 4GB |
Optimizations:
├── Speculative Decoding
│ └── 3x speedup with draft model
├── KV Caching
│ └── 50% reduction in compute
├── Flash Attention
│ └── 2x memory efficiency
├── Quantization
│ └── 75% model size reduction
├── Lazy Loading
│ └── Load models on demand
└── Incremental Parsing
└── Parse only changed code
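
To illustrate lazy loading from the list above, a minimal sketch that keeps models cold until first use; `loadOnnxModel` is an assumed helper, not a real binding:

```typescript
// Hypothetical loader; ONNX Runtime bindings are assumed, not specified here.
interface Model { /* opaque handle */ }
declare function loadOnnxModel(path: string): Promise<Model>;

// Models stay cold on disk until first request, then stay hot in memory.
const hotModels = new Map<string, Promise<Model>>();

function getModel(name: string): Promise<Model> {
  let model = hotModels.get(name);
  if (!model) {
    model = loadOnnxModel(`models/${name}.onnx`); // cold start, paid once
    hotModels.set(name, model);
  }
  return model; // later calls reuse the in-memory model
}
```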
Privacy Guarantees:
✓ No telemetry collection
✓ No cloud connectivity required
✓ All models run locally
✓ Code never leaves machine
✓ Git data stays local
✓ Opt-in federation only

class SecurityLayer:
    def __init__(self):
        self.features = {
            'code_sanitization': self.sanitize_input,
            'injection_prevention': self.prevent_injection,
            'model_isolation': self.sandbox_execution,
            'adapter_verification': self.verify_adapter_signature,
        }

    def scan_adapter(self, adapter_file):
        """Verify adapter safety before loading"""
        checksum_ok = self.verify_checksum(adapter_file)    # hash matches manifest
        signature_ok = self.verify_signature(adapter_file)  # signed by trusted key
        malware_ok = self.scan_for_malware(adapter_file)
        return all([checksum_ok, signature_ok, malware_ok])

- GDPR: Full compliance (no data collection)
- SOC 2: Audit trail capabilities
- HIPAA: Medical code safe (no data exposure)
- Enterprise: On-premise deployment ready
Stable (Monthly):
├── Full testing suite passed
├── Community beta tested
└── Production ready
Beta (Weekly):
├── Feature complete
├── Known issues documented
└── Feedback collection
Nightly (Daily):
├── Latest features
├── Experimental
└── Automated builds

| Platform | Package | Size | Auto-Update |
|---|---|---|---|
| Windows | .exe installer | 200MB | Yes |
| macOS | .dmg | 200MB | Yes |
| Linux | .AppImage | 200MB | Yes |
| Linux | .deb/.rpm | 200MB | Via package manager |
| Universal | .tar.gz | 200MB | No |
# Progressive download
- Core editor: 200MB (immediate)
- Essential model: 500MB (background)
- Additional models: On-demand
# CDN Strategy
- GitHub Releases (primary)
- Hugging Face Hub (models)
- Torrent (community mirrors)

| Metric | Target Goal |
|---|---|
| GitHub Stars | Growing community interest |
| Active Contributors | Sustainable development |
| Adapters Created | Vibrant ecosystem |
| User Feedback | Positive reception |
Code Quality Goals:
├── Acceptance Rate: >60%
├── False Positive: <20%
├── Latency P95: <200ms
└── Stability: Minimal crashes
User Experience Goals:
├── Easy Installation
├── Smooth Performance
├── Clear Documentation
└── Responsive Support

Note: This is an early-stage project. The roadmap represents our vision and planned features.
✓ Project initialization
✓ Architecture design
✓ Documentation foundation

- [ ] Tauri app scaffold
- [ ] Basic completion engine
- [ ] Model integration (CodeT5+)
- [ ] Local model serving
- [ ] Essential UI components

- [ ] Multi-model support
- [ ] Smart model routing
- [ ] Git integration basics
- [ ] Basic adapter system
- [ ] Performance optimization

- [ ] Incremental learning
- [ ] Adapter marketplace
- [ ] Advanced completions
- [ ] Team collaboration features
- [ ] Documentation and polish

Timeline Disclaimer: These are aspirational targets subject to community involvement and development progress.
| Component | Technology | License | Purpose |
|---|---|---|---|
| Frontend | |||
| App Shell | Tauri | MIT/Apache 2.0 | Desktop app |
| Editor Component | Monaco | MIT | Code editing |
| Backend | |||
| Runtime | Java 21 + GraalVM | GPL+CE | Main backend |
| Framework | Spring Boot | Apache 2.0 | Service layer |
| ML Framework | DJL | Apache 2.0 | Model serving |
| Inference | ONNX Runtime | MIT | Model execution |
| AI/ML | |||
| Base Models | Various | Apache/MIT | See section 6.1 |
| Fine-tuning | Custom LoRA | Apache 2.0 | Adaptation |
| Vector DB | Qdrant | Apache 2.0 | Embeddings |
| Embeddings | CodeBERT | MIT | Code understanding |
| Data | |||
| Database | H2 | MPL 2.0/EPL | Metadata |
| Cache | Caffeine | Apache 2.0 | Performance |
| Serialization | Apache Arrow | Apache 2.0 | Data format |
| Infrastructure | |||
| Build | Gradle | Apache 2.0 | Build system |
| CI/CD | GitHub Actions | N/A | Automation |
| Package | Tauri bundler | MIT/Apache 2.0 | Distribution |
| Utilities | |||
| Math | Apache Commons | Apache 2.0 | Calculations |
| Async | Project Reactor | Apache 2.0 | Reactive |
| Logging | SLF4J + Logback | MIT/EPL | Logging |
Required:
├── Node.js 18+
├── Java 21+
├── Python 3.10+ (model conversion)
├── Git
└── 16GB RAM (for building desktop app)
Optional:
├── CUDA Toolkit (GPU acceleration)
└── VS Code (dogfooding)
NOT Used:
├── Docker (not needed - desktop app)
├── Kubernetes (not needed - desktop app)
└── Web servers (not needed - desktop app)

Technical Risks:

| Risk | Impact | Likelihood | Mitigation |
|---|---|---|---|
| Model performance issues | High | Medium | Multiple model fallbacks |
| Memory constraints | High | Low | Aggressive quantization |
| Compatibility breaks | Medium | Medium | Extensive testing |
| Security vulnerabilities | High | Low | Security audits |

Adoption Risks:

| Risk | Impact | Likelihood | Mitigation |
|---|---|---|---|
| Low adoption | High | Medium | Strong marketing |
| Competition from big tech | High | High | Focus on privacy/local |
| Maintaining momentum | Medium | Medium | Community building |
| Sustainability | Medium | Low | Sponsorships/donations |
The AI coding assistant market includes established players like GitHub Copilot, Cursor, and Tabnine. NeuralForge aims to differentiate through:
- 100% Local Processing: Complete privacy and offline functionality
- Open Source: Full transparency and community involvement
- Resource Efficiency: Optimized for modest hardware
- No Subscriptions: Permanently free
- Adaptable: Learn from individual coding patterns
- Privacy-First Architecture: No code ever leaves your machine
- Community-Driven: Shared adapters and improvements
- Transparent Development: Open source from day one
- Local Learning: Models adapt to your style
- Resource Conscious: Designed for real-world hardware constraints
Tagline: "Your AI pair programmer that learns from you, not about you"
Pre-Launch:
├── Beta program (100 developers)
├── Blog posts series
├── YouTube demos
└── Reddit/HN presence
Launch:
├── Product Hunt
├── Hacker News
├── Dev.to articles
├── Twitter/X campaign
└── Conference talks
Post-Launch:
├── Community building
├── Workshop series
├── Partner integrations
└── Enterprise outreach

- Open Source First: Full transparency
- Developer Evangelism: Conference talks
- Content Marketing: Tutorials and guides
- Community Challenges: Coding competitions
- Enterprise Pilots: Free trials for companies
Project License: Apache 2.0
Model Licenses: Apache 2.0 / MIT only
Dataset Licenses: Verified open source
Dependencies: Compatible licenses only

- No proprietary code in training
- GDPR compliance built-in
- Export control compliance
- Accessibility standards (WCAG)
- Trademark registration for "NeuralForge"
- Contributor License Agreement (CLA)
- Patent review for novel techniques
- Copyright assignment clarity
| Model | Target Context | Target Latency | Memory Budget |
|---|---|---|---|
| CodeT5+ 770M | 2K tokens | ~50ms | ~1.5GB |
| SantaCoder 1.1B | 2K tokens | ~60ms | ~2.2GB |
| StableCode 3B | 16K tokens | ~100ms | ~3.5GB |
Note: These are target specifications. Actual performance will vary based on hardware and optimization.
interface AIService {
  // Completion API
  complete(context: CodeContext): Promise<Completion>

  // Chat API
  chat(message: string): Promise<Response>

  // Training API
  train(dataset: Dataset): Promise<TrainingResult>

  // Adapter API
  loadAdapter(path: string): Promise<void>
  mergeAdapters(adapters: Adapter[]): Promise<Adapter>

  // Settings API
  configure(settings: Settings): Promise<void>
}
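
A short usage sketch against this interface; the `CodeContext` shape shown is illustrative, not the finalized type:

```typescript
// Sketch only: request a completion, then a chat answer, via AIService.
async function demo(ai: AIService): Promise<void> {
  const completion = await ai.complete({
    file: "User.java",
    cursorOffset: 42,
    language: "java",
  } as unknown as CodeContext); // illustrative context shape

  const answer = await ai.chat("Explain the code you just suggested");
  console.log(completion, answer);
}
```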
"version": "1.0",
"samples": [
{
"context": "public class User {",
"completion": "private String name;",
"accepted": true,
"timestamp": 1234567890,
"metadata": {
"language": "java",
"project": "my-app",
"file": "User.java"
}
}
]
}adapter:
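
For reference, a typed TypeScript mirror of this format (field names follow the JSON above):

```typescript
// Types mirroring the training-data JSON format above.
interface TrainingSample {
  context: string;    // code before the cursor
  completion: string; // what the model proposed
  accepted: boolean;  // user feedback signal
  timestamp: number;  // Unix epoch
  metadata: {
    language: string;
    project: string;
    file: string;
  };
}

interface TrainingDataset {
  version: string;
  samples: TrainingSample[];
}
```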
name: "spring-boot-expert"
version: "1.0.0"
base_model: "codet5-770m"
parameters:
r: 8
alpha: 16
dropout: 0.1
training:
samples: 50000
epochs: 3
learning_rate: 0.0001
metadata:
author: "community"
license: "Apache-2.0"
description: "Spring Boot patterns"| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0.0 | 2024-10-30 | Yamaç Bezirgan | Initial PRD |
End of PRD - This document represents the vision and planned features for NeuralForge, an early-stage open-source project.