VibeReader implements comprehensive HTML sanitization to prevent XSS (Cross-Site Scripting) attacks from malicious feed content.
Content from RSS, Atom, and JSON feeds can contain HTML, which may include malicious scripts. VibeReader sanitizes all feed content at two layers:
- Server-side sanitization - HTMLPurifier sanitizes content before it is stored in the database
- Client-side sanitization - DOMPurify provides defense-in-depth when rendering content
- Library: ezyang/htmlpurifier (v4.16+)
- Location: src/Utils/HtmlSanitizer.php
- Integration: Automatically applied in FeedParser when parsing feeds
- Feed titles - Plain text (HTML entities escaped)
- Feed descriptions - HTML sanitized
- Item titles - Plain text (HTML entities escaped)
- Item content - HTML sanitized (preserves formatting)
- Item summaries - HTML sanitized
- Item authors - Plain text (HTML entities escaped)
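The plain-text treatment of titles and authors amounts to escaping HTML metacharacters so that markup is displayed rather than interpreted. A minimal JavaScript sketch of that idea follows; the server does this in PHP through HtmlSanitizer::sanitizeText, whose implementation is not shown here, so `escapeHtmlText` and its escape table are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: illustrates the "plain text" treatment of titles and authors.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

function escapeHtmlText(value) {
  // Coerce to a string so null, undefined, and numbers do not throw.
  return String(value ?? '').replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

const title = '<img src=x onerror=alert(1)> Weekly & "Daily" News';
console.log(escapeHtmlText(title));
// &lt;img src=x onerror=alert(1)&gt; Weekly &amp; &quot;Daily&quot; News
```

Escaped text like this is safe to place in element content, which is why these fields need no tag allowlist.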
The sanitizer allows common formatting tags used in feed content:
- Text formatting: p, br, strong, b, em, i, u
- Links: a[href|title|target]
- Lists: ul, ol, li
- Code: pre, code
- Images: img[src|alt|width|height]
- Headings: h1, h2, h3, h4, h5, h6
- Structure: div, span[style], blockquote
- Tables: table, thead, tbody, tr, td, th
- Links: href, title, target, rel
- Images: src, alt, width, height
- Styling: style (limited CSS properties)
- Allowed CSS properties: color, background-color, font-size, font-weight, font-style, text-align, text-decoration, margin, padding, border
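To make the CSS property allowlist concrete, the sketch below filters an inline style string down to the properties listed above. It is an illustration only: `filterInlineStyle` is a hypothetical helper, the naive split on `;` mishandles values that themselves contain semicolons, and property values are not validated. In VibeReader the actual filtering is left to HTMLPurifier.

```javascript
// Property allowlist copied from the list above.
const ALLOWED_CSS_PROPERTIES = new Set([
  'color', 'background-color', 'font-size', 'font-weight', 'font-style',
  'text-align', 'text-decoration', 'margin', 'padding', 'border',
]);

// Hypothetical helper: keeps only allowlisted declarations from a style string.
function filterInlineStyle(styleText) {
  return String(styleText ?? '')
    .split(';')
    .map((declaration) => {
      const colon = declaration.indexOf(':');
      if (colon === -1) return null; // not a "property: value" pair
      const property = declaration.slice(0, colon).trim().toLowerCase();
      const value = declaration.slice(colon + 1).trim();
      if (!value || !ALLOWED_CSS_PROPERTIES.has(property)) return null;
      return `${property}: ${value}`;
    })
    .filter((declaration) => declaration !== null)
    .join('; ');
}

console.log(filterInlineStyle('color: red; position: fixed; FONT-WEIGHT: bold'));
// color: red; font-weight: bold
```

Layout-affecting properties such as `position` are dropped, which keeps feed content from overlaying the reader's own interface.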
Sanitization can be disabled via an environment variable:

    SANITIZATION_ENABLED=0 # Disable sanitization (not recommended)

Default: Enabled (SANITIZATION_ENABLED=1)
HTMLPurifier uses a cache directory at var/htmlpurifier/ to improve performance. This directory is automatically created and is excluded from Git.
- Library: DOMPurify v3.3.1 (via CDN)
- Location: Loaded in views/dashboard.php
- Integration: Applied in assets/js/modules/items.js when rendering item content
Even though content is sanitized server-side, DOMPurify provides an additional layer of protection:
- Protects against any content that might bypass server-side sanitization
- Handles edge cases in browser rendering
- Provides real-time sanitization when content is displayed
DOMPurify uses the same allowed tags and attributes as the server-side sanitizer for consistency.
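Because the two allowlists are maintained in separate files (PHP and JavaScript), they can drift apart over time. The sketch below shows one way to detect that; the two constants are copied from the tag lists in this document, and `diffAllowlists` is a hypothetical helper, not part of VibeReader.

```javascript
// Tag lists copied from this document; both constants are illustrative.
const SERVER_ALLOWED_TAGS = [
  'p', 'br', 'strong', 'b', 'em', 'i', 'u', 'a', 'ul', 'ol', 'li',
  'pre', 'code', 'img', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6',
  'div', 'span', 'blockquote', 'table', 'thead', 'tbody', 'tr', 'td', 'th',
];
const CLIENT_ALLOWED_TAGS = [
  'p', 'br', 'strong', 'b', 'em', 'i', 'u', 'a', 'ul', 'ol', 'li',
  'blockquote', 'pre', 'code', 'img', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6',
  'div', 'span', 'table', 'thead', 'tbody', 'tr', 'td', 'th',
];

// Hypothetical helper: reports names present in one list but not the other.
function diffAllowlists(serverList, clientList) {
  const server = new Set(serverList);
  const client = new Set(clientList);
  return {
    onlyOnServer: [...server].filter((name) => !client.has(name)).sort(),
    onlyOnClient: [...client].filter((name) => !server.has(name)).sort(),
  };
}

const drift = diffAllowlists(SERVER_ALLOWED_TAGS, CLIENT_ALLOWED_TAGS);
console.log(drift); // both arrays are empty when the lists agree
```

A tag that appears only on the client side is the case worth investigating first, since the client would then accept markup the server is expected to have removed.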
Server-side (PHP):

    use PhpRss\Utils\HtmlSanitizer;

    // Sanitize HTML content (preserves formatting)
    $cleanHtml = HtmlSanitizer::sanitize($feedContent);

    // Sanitize plain text (escapes HTML entities)
    $cleanText = HtmlSanitizer::sanitizeText($feedTitle);

Client-side (JavaScript):

    // Sanitize HTML before setting innerHTML
    const sanitized = DOMPurify.sanitize(content, {
        ALLOWED_TAGS: [
            'p', 'br', 'strong', 'b', 'em', 'i', 'u', 'a', 'ul', 'ol', 'li',
            'blockquote', 'pre', 'code', 'img', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6',
            'div', 'span', 'table', 'thead', 'tbody', 'tr', 'td', 'th'
        ],
        ALLOWED_ATTR: ['href', 'title', 'target', 'src', 'alt', 'width', 'height', 'style', 'rel'],
        ALLOW_DATA_ATTR: false
    });
    element.innerHTML = sanitized;

- Prevents Stored XSS - Malicious scripts in feed content are removed before storage
- Prevents Reflected XSS - Content is sanitized before being sent to the browser
- Defense in Depth - Multiple layers of sanitization (server + client)
- Preserves Formatting - Legitimate HTML formatting is maintained
- Configurable - Can be disabled if needed (though not recommended)
- HTMLPurifier: Uses caching to improve performance on repeated sanitization
- DOMPurify: Lightweight client-side library with minimal performance impact
- Caching: HTMLPurifier cache stored in var/htmlpurifier/ (excluded from Git)
If legitimate content is being removed:
- Check HTMLPurifier logs for warnings
- Verify the content uses allowed tags/attributes
- Review the src/Utils/HtmlSanitizer.php configuration
If sanitization does not appear to be applied:
- Verify SANITIZATION_ENABLED=1 in the environment
- Check that HTMLPurifier is installed: composer show ezyang/htmlpurifier
- Verify DOMPurify is loaded (check browser console)
- Check that the var/htmlpurifier/ directory is writable
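For the browser-console check, a small diagnostic such as the following can be pasted into the console. `describeDomPurify` is a hypothetical helper; it relies only on the `sanitize`, `version`, and `isSupported` members that DOMPurify exposes, and assumes the CDN script registers a global named `DOMPurify`.

```javascript
// Hypothetical diagnostic: pass `window` (or `globalThis`) from the browser console.
function describeDomPurify(scope) {
  const purify = scope && scope.DOMPurify;
  if (!purify || typeof purify.sanitize !== 'function') {
    return 'DOMPurify is not loaded';
  }
  if (purify.isSupported === false) {
    return 'DOMPurify is loaded but reports this environment as unsupported';
  }
  return `DOMPurify ${purify.version ?? '(unknown version)'} is loaded`;
}

console.log(describeDomPurify(globalThis));
```

If the message says the library is not loaded, the script tag in views/dashboard.php or a blocked CDN request is the likely place to look.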
Not Recommended - Only disable for debugging:

    SANITIZATION_ENABLED=0

This will bypass server-side sanitization. Client-side DOMPurify will still sanitize content.
- src/Utils/HtmlSanitizer.php (new) - HTML sanitization utility
- src/FeedParser.php - Integrated sanitization into all parsing methods
- src/Config.php - Added sanitization configuration
- assets/js/modules/items.js - Added DOMPurify client-side sanitization
- views/dashboard.php - Added DOMPurify CDN script
- composer.json - Added HTMLPurifier dependency
- ENV_CONFIGURATION.md - Added sanitization configuration documentation
- .gitignore - Added HTMLPurifier cache directory
To test sanitization:
- Test with malicious content:

      $malicious = '<script>alert("XSS")</script><p>Safe content</p>';
      $sanitized = HtmlSanitizer::sanitize($malicious);
      // Result: '<p>Safe content</p>' (script removed)

- Test with legitimate HTML:

      $legitimate = '<p>This is <strong>bold</strong> text with a <a href="https://example.com">link</a>.</p>';
      $sanitized = HtmlSanitizer::sanitize($legitimate);
      // Result: Same content (preserved)
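The same two checks can be mirrored on the client side. The sketch below is a hypothetical check runner, not part of VibeReader: it accepts any function that maps an HTML string to a sanitized string, so it can be pointed at DOMPurify in the browser without depending on it directly.

```javascript
// Hypothetical check runner: `sanitize` is any function (html) => string.
function runSanitizerChecks(sanitize) {
  const cases = [
    {
      name: 'script removed',
      input: '<script>alert("XSS")</script><p>Safe content</p>',
      forbidden: '<script',
      expected: '<p>Safe content</p>',
    },
    {
      name: 'formatting preserved',
      input: '<p>This is <strong>bold</strong> text.</p>',
      forbidden: null,
      expected: '<strong>bold</strong>',
    },
  ];

  const failures = [];
  for (const testCase of cases) {
    const output = String(sanitize(testCase.input));
    // Case-insensitive match so variants such as <SCRIPT> are also caught.
    const leaked = testCase.forbidden !== null
      && output.toLowerCase().includes(testCase.forbidden);
    const missing = !output.includes(testCase.expected);
    if (leaked || missing) failures.push(testCase.name);
  }
  return failures; // names of failed checks; empty when all pass
}

// In the browser, with DOMPurify loaded:
//   runSanitizerChecks((html) => DOMPurify.sanitize(html));
```

Passing the sanitizer in as a parameter keeps the checks reusable if the client-side configuration changes.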