This is a command line tool to convert the contents of a Confluence space into a MediaWiki import data format.
- PHP >= 7.4 with the XML extension must be installed
- The `pandoc` tool must be installed and available in the `PATH` (https://pandoc.org/installing.html)
- Download `migrate-confluence.phar` from https://github.com/hallowelt/migrate-confluence/releases/tag/latest
- Copy `migrate-confluence.phar` to `/usr/local/bin/migrate-confluence`
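The prerequisites listed above can be checked before installing the tool. The following is a minimal sketch, not part of the tool itself; the helper names `have_cmd` and `check_prerequisites` are made up for illustration.

```shell
#!/bin/sh
# Sketch of a prerequisite check; the helper names are illustrative only.

# Succeeds if the given command can be found in the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Prints one line per missing prerequisite; returns non-zero if any is missing.
check_prerequisites() {
    missing=0
    if ! have_cmd php; then
        echo "php not found in PATH"
        missing=1
    else
        # version_compare() and extension_loaded() are PHP built-ins.
        php -r 'exit(version_compare(PHP_VERSION, "7.4.0", ">=") ? 0 : 1);' \
            || { echo "PHP >= 7.4 is required"; missing=1; }
        php -r 'exit(extension_loaded("xml") ? 0 : 1);' \
            || { echo "PHP XML extension is not loaded"; missing=1; }
    fi
    have_cmd pandoc || { echo "pandoc not found in PATH"; missing=1; }
    return "$missing"
}
```

Calling `check_prerequisites` prints nothing and returns 0 when everything is in place.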
- Create an export of your Confluence space
- Save it to a location that is accessible by this tool (e.g. `/tmp/confluence/input/Confluence-export.zip`)
- Extract the ZIP file (e.g. `/tmp/confluence/input/Confluence-export`)
  - The folder should contain the files `entities.xml` and `exportDescriptor.properties`, as well as the folder `attachments`
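To confirm that the extracted export has the expected layout, a small check along these lines may help. This is only a sketch; `check_export_dir` is a hypothetical helper, not a command provided by the tool.

```shell
#!/bin/sh
# check_export_dir DIR: hypothetical helper that verifies the extracted export layout.
check_export_dir() {
    dir=$1
    status=0
    # The two files named above must exist in the extracted folder.
    for f in entities.xml exportDescriptor.properties; do
        [ -f "$dir/$f" ] || { echo "missing file: $dir/$f"; status=1; }
    done
    # The attachments folder must exist as well.
    [ -d "$dir/attachments" ] || { echo "missing folder: $dir/attachments"; status=1; }
    return "$status"
}
```

For example, `check_export_dir /tmp/confluence/input/Confluence-export` returns 0 only if both files and the `attachments` folder are present.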
- Create the "workspace" directory (e.g. `/tmp/confluence/workspace/`)
- From the parent directory (e.g. `/tmp/confluence/`), run the migration commands
  - Run `migrate-confluence analyze --src input/ --dest workspace/` to create "working files". After the script has run, you can check those files and apply changes if required (e.g. when applying structural changes).
  - Run `migrate-confluence extract --src input/ --dest workspace/` to extract all contents, such as wiki page contents, attachments and images, into the workspace
  - Run `migrate-confluence convert --src workspace/ --dest workspace/` to convert the wiki page contents from Confluence Storage XML to MediaWiki WikiText
  - Run `migrate-confluence compose --src workspace/ --dest workspace/` to create importable data
If you re-run the scripts you will need to clean up the "workspace" directory!
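The four migration commands and the workspace cleanup can be combined into one wrapper. The sketch below assumes the directory layout from the examples above (`input/` and `workspace/` under a common parent); `run_migration` and the `MIGRATE_BIN` variable are hypothetical names introduced here. Note that it runs all steps back to back, so it skips the optional manual review of the working files after `analyze`.

```shell
#!/bin/sh
# MIGRATE_BIN is an assumption of this sketch so the binary can be overridden.
MIGRATE_BIN=${MIGRATE_BIN:-migrate-confluence}

# run_migration BASE_DIR: hypothetical wrapper around the four documented steps.
run_migration() {
    base=${1:?usage: run_migration BASE_DIR}
    # Start from an empty workspace, as required when re-running the scripts.
    rm -rf "${base:?}/workspace"
    mkdir -p "$base/workspace"
    # Run from the parent directory so the relative paths match the documentation.
    (
        cd "$base" || exit 1
        "$MIGRATE_BIN" analyze --src input/ --dest workspace/ &&
        "$MIGRATE_BIN" extract --src input/ --dest workspace/ &&
        "$MIGRATE_BIN" convert --src workspace/ --dest workspace/ &&
        "$MIGRATE_BIN" compose --src workspace/ --dest workspace/
    )
}
```

With the example paths, this would be called as `run_migration /tmp/confluence`. The chain stops at the first step that fails.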
- Copy the "workspace/result" directory (e.g. `/tmp/confluence/workspace/result/`) to your target wiki server (e.g. `/tmp/result`)
- Go to your MediaWiki installation directory
- Make sure you have the target namespaces set up properly
- Use `php maintenance/importImages.php /tmp/result/images/` to first import all attachment files and images
- Use `php maintenance/importDump.php /tmp/result/output.xml` to import the actual pages
You may need to update your MediaWiki search index afterwards.
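The two import steps can be wrapped so that they always run in the documented order from the MediaWiki installation directory. This is a sketch only; `import_result` and the `PHP_BIN` variable are hypothetical names, and setting up the target namespaces still has to be done beforehand.

```shell
#!/bin/sh
# PHP_BIN is an assumption of this sketch so the interpreter can be overridden.
PHP_BIN=${PHP_BIN:-php}

# import_result MEDIAWIKI_DIR RESULT_DIR: hypothetical wrapper for the two import steps.
import_result() {
    mw_dir=${1:?usage: import_result MEDIAWIKI_DIR RESULT_DIR}
    result_dir=${2:?usage: import_result MEDIAWIKI_DIR RESULT_DIR}
    (
        cd "$mw_dir" || exit 1
        # Attachments and images first, then the pages, as described above.
        "$PHP_BIN" maintenance/importImages.php "$result_dir/images/" &&
        "$PHP_BIN" maintenance/importDump.php "$result_dir/output.xml"
    )
}
```

With the example paths, the call would be `import_result <your MediaWiki directory> /tmp/result`.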
If the tool cannot migrate some content or functionality, it will create a category, so you can fix the issues manually after the import:
- `Broken_link`
- `Broken_user_link`
- `Broken_page_link`
- `Broken_image`
- `Broken_layout`
- `Broken_macro/<macro-name>`
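To estimate the amount of manual follow-up work before importing, the generated dump can be scanned for these categories. The sketch below assumes that they appear in the page wikitext as `[[Category:Broken_...]]` links, which is not confirmed here; `count_broken_categories` is a hypothetical helper.

```shell
#!/bin/sh
# count_broken_categories FILE: hypothetical helper.
# Assumption: categories appear as [[Category:Broken_...]] links in the wikitext.
count_broken_categories() {
    # Extract each category link up to the closing bracket or a sort-key pipe,
    # strip the prefix, then count occurrences per category (most frequent first).
    grep -o '\[\[Category:Broken_[^]|]*' "$1" |
        sed 's/^\[\[Category://' |
        sort | uniq -c | sort -rn
}
```

For example, `count_broken_categories /tmp/result/output.xml` prints one line per category with its number of occurrences.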
- User identities
- Comments
- Various macros
- Various layouts
- Blog posts
- Clone this repo
- Run `composer update`
- Run `box build` to actually create the PHAR file in `dist/`. See also https://github.com/humbug/box
- Reduce multiple line breaks (`<br />`) to one
- Remove line breaks and arbitrary formatting (e.g. `<b>`) from headings
- Mask external images (`<img />`)
- Preserve filename of "Broken_attachment"
- Add `wikitable` as default class to `<table>`
- Merge multiple `<code>` lines into `<pre>`
- Remove bold/italic formatting from wikitext headings (e.g. `=== '''Some heading''' ===`)
- Fix unconverted HTML lists in wikitext (e.g. `<ul><li>==== Lorem ipsum ====</li><li>'''<span class="confluence-link"> </span>[[Media:Some_file.pdf]]'''</li></ul><ul>`)
- Remove empty Confluence storage format fragments (e.g. `<span class="confluence-link"> </span>`, `<span class="no-children icon">`)
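Two of the items above, reducing repeated `<br />` tags and stripping bold/italic markup from wikitext headings, can be illustrated with a small `sed` filter. This is only a sketch of the intended transformation, not the tool's implementation; `cleanup_wikitext` is a hypothetical name, and the filter only handles line breaks and headings that sit on a single line.

```shell
#!/bin/sh
# cleanup_wikitext: hypothetical post-processing filter (reads stdin, writes stdout).
cleanup_wikitext() {
    # 1st expression: collapse two or more consecutive <br /> tags into one.
    # 2nd expression: on heading lines (= ... =), drop runs of two or more
    #                 apostrophes, i.e. the bold/italic markers.
    sed -E \
        -e 's#(<br ?/>[[:space:]]*){2,}#<br />#g' \
        -e "/^=+.*=+\$/ s/'{2,}//g"
}
```

It can be used in a pipeline, for example `cleanup_wikitext < page.wiki > page.cleaned.wiki`, where both file names are placeholders.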


