WikiHTMLcleaner

A Perl script that cleans the Wikipedia-specific detritus out of the HTML of article pages.

Test with:

curl "https://en.wikipedia.org/w/index.php?title=Hello&action=render" | perl WikiHTMLcleaner.txt | pbcopy

This copies the cleaned HTML to the clipboard (pbcopy is macOS-only). Paste it into Safari's Develop > Snippet Editor to see the results.

This works with the output of Wikipedia's &action=render URL option, which returns only the HTML of the article requested by the title= argument.
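
To try articles other than "Hello", the render URL can be built from a title. The following is only a sketch: `render_url` is a hypothetical helper, not part of WikiHTMLcleaner, and it assumes the title needs nothing more than spaces turned into underscores (titles with other special characters would need percent-encoding, which is not handled here).

```shell
# Hypothetical helper (not part of WikiHTMLcleaner):
# print the action=render URL for an English Wikipedia article title.
render_url() {
    # Wikipedia accepts underscores in place of spaces in titles.
    title=$(printf '%s' "$1" | tr ' ' '_')
    printf 'https://en.wikipedia.org/w/index.php?title=%s&action=render\n' "$title"
}

# Usage (needs network access; pbcopy is macOS-only):
# curl -s "$(render_url 'Hello world')" | perl WikiHTMLcleaner.txt | pbcopy
```

Keeping the URL construction in one small function leaves the rest of the pipeline identical to the test command above.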