Monitor XML files for changes using Playwright 🎭
Designed for the Portuguese Parliament's open data - triggers retrieval when resources update
- 🔄 Smart Detection - SHA256 hashing for efficient change detection
- 💾 Space Efficient - Only stores hash, not XML files
- 🌐 Multi-level Discovery - Discovers Resources → Legislatures → XML files
- 🛡️ Error Resilient - Handles network issues gracefully
- 📁 Configurable - Custom data directory support
- 🎯 Selective Monitoring - Monitor specific resources via CLI
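The hash-only storage described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper names `sha256` and `hasChanged`, the `<key>.sha256` file layout, and the use of plain files inside the data directory are all assumptions made for the example.

```typescript
import { createHash } from "node:crypto";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";

// Hypothetical helper: hex-encoded SHA256 digest of the XML content.
function sha256(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// Hypothetical helper: returns true when the content differs from the last
// stored hash (or when no hash exists yet) and persists the new hash.
// Only the digest is written to disk, never the XML itself.
function hasChanged(dataDir: string, key: string, content: string): boolean {
  mkdirSync(dataDir, { recursive: true });
  const hashPath = join(dataDir, `${key}.sha256`); // assumed file layout
  const newHash = sha256(content);
  const oldHash = existsSync(hashPath)
    ? readFileSync(hashPath, "utf8").trim()
    : null;
  if (oldHash === newHash) {
    return false;
  }
  writeFileSync(hashPath, newHash, "utf8");
  return true;
}
```

Because each stored entry is a 64-character hex string, the data directory stays small regardless of how large the monitored XML files are.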
npm install
npm run run # Monitor all resources
npm run run -- Deputados Sessoes # Monitor specific resources
# Monitor all resources
npm run run
# Monitor specific resources (resource names from URL paths)
npm run run -- Deputados Sessoes Iniciativas
# Examples of resource names:
# - Deputados (from /Cidadania/Paginas/DADeputados.aspx)
# - Sessoes (from /Cidadania/Paginas/DASessoes.aspx)
# - Iniciativas (from /Cidadania/Paginas/DAIniciativas.aspx)
npm test # Run tests
npm run test:watch # Watch mode
npm run test:coverage # With coverage
- baseUrl - Webpage containing XML links
- resourceNames - Array of resource names to monitor (optional, monitors all if empty)
- dataDir - Hash storage directory (default: ./data)
Returns XMLFileChangeResult[]:
{
  xmlFile: XMLFile;
  changeResult: ChangeDetectionResult;
}[]
Resource names are extracted from URLs with pattern /Cidadania/Paginas/DA<name>.aspx:
- DADeputados.aspx → Deputados
- DASessoes.aspx → Sessoes
- DAIniciativas.aspx → Iniciativas
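As an illustration of the naming rule, the extraction could look something like the sketch below. The function names `extractResourceName` and `filterResources` are hypothetical, and the regular expression is one plausible reading of the `/Cidadania/Paginas/DA<name>.aspx` pattern rather than the exact one used by the tool.

```typescript
// Hypothetical helper: returns the resource name embedded in a URL or path
// of the form /Cidadania/Paginas/DA<name>.aspx, or null if it does not match.
function extractResourceName(url: string): string | null {
  const match = /\/Cidadania\/Paginas\/DA([^/?#]+)\.aspx/.exec(url);
  return match?.[1] ?? null;
}

// Hypothetical helper: keeps only URLs whose resource name was requested.
// An empty resourceNames array means "monitor everything that matches".
function filterResources(urls: string[], resourceNames: string[]): string[] {
  return urls.filter((url) => {
    const name = extractResourceName(url);
    if (name === null) {
      return false;
    }
    return resourceNames.length === 0 || resourceNames.includes(name);
  });
}
```

With this shape, the names passed on the command line (for example `Deputados Sessoes`) can be compared directly against the names derived from each discovered link.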
- 🌐 Navigate to Portuguese Parliament open data page
- 🎯 Discover resources (optionally filtered by provided names)
- 📁 For each resource, discover legislatures
- 📂 For each legislature, discover XML files
- 📥 Download XML content for each file
- 🔒 Generate SHA256 hash
- 🔍 Compare with stored hash
- 💾 Update if changed
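The steps above amount to a three-level traversal followed by a hash comparison. The sketch below shows that control flow only; the `Discovery` interface and `checkAll` function are hypothetical stand-ins (in the real tool the discovery steps are performed with Playwright), and an in-memory `Map` is used in place of the on-disk hash store to keep the example self-contained.

```typescript
import { createHash } from "node:crypto";

// Hypothetical abstraction over the page-scraping steps.
interface Discovery {
  listResources(): Promise<string[]>;
  listLegislatures(resource: string): Promise<string[]>;
  listXmlFiles(legislature: string): Promise<string[]>;
  download(xmlUrl: string): Promise<string>;
}

// Walks Resources -> Legislatures -> XML files, hashes each file, and
// returns the URLs whose hash differs from the stored one.
async function checkAll(
  discovery: Discovery,
  storedHashes: Map<string, string>,
  resourceNames: string[] = [],
): Promise<string[]> {
  const changed: string[] = [];
  const resources = (await discovery.listResources()).filter(
    (r) => resourceNames.length === 0 || resourceNames.includes(r),
  );
  for (const resource of resources) {
    for (const legislature of await discovery.listLegislatures(resource)) {
      for (const xmlUrl of await discovery.listXmlFiles(legislature)) {
        const content = await discovery.download(xmlUrl);
        const hash = createHash("sha256").update(content).digest("hex");
        if (storedHashes.get(xmlUrl) !== hash) {
          storedHashes.set(xmlUrl, hash); // update only when changed
          changed.push(xmlUrl);
        }
      }
    }
  }
  return changed;
}
```

Keeping discovery behind an interface like this is a design choice for the example: it lets the traversal be exercised with fake data, without a browser or network access.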
Note
Uses a single browser instance for performance and stores only hashes to save disk space.
- playwright - Web automation
- typescript - Type safety
- jest - Testing framework