This is a command line tool to convert the contents of a Confluence space into a MediaWiki import data format.
- PHP >= 8.2 with the `xml` extension must be installed
- `pandoc` >= 3.1.6. The `pandoc` tool must be installed and available in the `PATH` (https://pandoc.org/installing.html).
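
The requirements above can be checked before installing the tool. The following is a minimal sketch, not part of the tool itself: the function names `version_ge` and `check_prereqs` are hypothetical, and the version comparison assumes a `sort` that supports `-V` (GNU coreutils or BusyBox).

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of migrate-confluence.

# version_ge A B: succeeds if version A >= version B (assumes `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_prereqs: verifies PHP >= 8.2 with the xml extension and pandoc >= 3.1.6.
check_prereqs() {
    command -v php >/dev/null 2>&1 || { echo "php not found in PATH" >&2; return 1; }
    command -v pandoc >/dev/null 2>&1 || { echo "pandoc not found in PATH" >&2; return 1; }

    php_version=$(php -r 'echo PHP_MAJOR_VERSION . "." . PHP_MINOR_VERSION . "." . PHP_RELEASE_VERSION;')
    version_ge "$php_version" "8.2.0" || { echo "PHP >= 8.2 required, found $php_version" >&2; return 1; }

    php -r 'exit(extension_loaded("xml") ? 0 : 1);' || { echo "PHP xml extension is not loaded" >&2; return 1; }

    # The first line of `pandoc --version` looks like "pandoc 3.1.6".
    pandoc_version=$(pandoc --version | head -n 1 | awk '{print $2}')
    version_ge "$pandoc_version" "3.1.6" || { echo "pandoc >= 3.1.6 required, found $pandoc_version" >&2; return 1; }

    echo "All prerequisites satisfied"
}
```

After sourcing the snippet, run `check_prereqs`; it returns a non-zero status and prints the first unmet requirement to stderr.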
- Download `migrate-confluence.phar` from https://github.com/hallowelt/migrate-confluence/releases/latest/download/migrate-confluence.phar
- Make sure the file is executable, e.g. by running `chmod +x migrate-confluence.phar`
- Move `migrate-confluence.phar` to `/usr/local/bin/migrate-confluence` (or somewhere else in the `PATH`)
- Create an export of your Confluence space
- Save it to a location that is accessible by this tool (e.g. `/tmp/confluence/input/Confluence-export.zip`)
- Extract the ZIP file (e.g. `/tmp/confluence/input/Confluence-export`)
  - The folder should contain the files `entities.xml` and `exportDescriptor.properties`, as well as the folder `attachments`
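
The expected layout of the extracted export can be verified with a few file tests. This is a small sketch; the function name `check_export_dir` is hypothetical, and it only checks for the three entries named above.

```shell
#!/bin/sh
# Sketch only: check_export_dir is a hypothetical helper name.

# check_export_dir DIR: succeeds if DIR looks like an extracted Confluence export.
check_export_dir() {
    dir=$1
    status=0
    for f in entities.xml exportDescriptor.properties; do
        [ -f "$dir/$f" ] || { echo "missing file: $dir/$f" >&2; status=1; }
    done
    [ -d "$dir/attachments" ] || { echo "missing folder: $dir/attachments" >&2; status=1; }
    return "$status"
}
```

For the example paths used here, the call would be `check_export_dir /tmp/confluence/input/Confluence-export`.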
- Create the "workspace" directory (e.g. `/tmp/confluence/workspace/`)
- From the parent directory (e.g. `/tmp/confluence/`), run the migration commands
  - Run `migrate-confluence analyze --src input/ --dest workspace/` to create "working files". After the script has run you can check those files and maybe apply changes if required (e.g. when applying structural changes).
  - Run `migrate-confluence extract --src input/ --dest workspace/` to extract all contents, like wikipage contents, attachments and images into the workspace
  - Run `migrate-confluence convert --src workspace/ --dest workspace/` (yes, `--src workspace/`) to convert the wikipage contents from Confluence Storage XML to MediaWiki WikiText
  - Run `migrate-confluence compose --src workspace/ --dest workspace/` (yes, `--src workspace/`) to create importable data
If you re-run the scripts you will need to clean up the "workspace" directory!
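
The four migration commands and the workspace cleanup can be chained in a small wrapper. The sketch below assumes the directory layout from the examples (`input/` and `workspace/` below a common parent). The function name `run_migration` and the `MIGRATE_CONFLUENCE` override variable are hypothetical. It runs all steps back to back, so it skips the optional manual review of the working files after `analyze`.

```shell
#!/bin/sh
# Sketch only: run_migration and MIGRATE_CONFLUENCE are hypothetical names.

# run_migration BASE_DIR: runs analyze, extract, convert and compose in order.
# The body runs in a subshell, so the caller's working directory is unchanged.
run_migration() (
    base=${1:?usage: run_migration <base-dir>}
    bin=${MIGRATE_CONFLUENCE:-migrate-confluence}

    cd "$base" || exit 1

    # Start from an empty workspace. WARNING: this discards the results of
    # earlier runs, including any manual changes to the working files.
    rm -rf ./workspace && mkdir -p ./workspace || exit 1

    "$bin" analyze --src input/ --dest workspace/ || exit 1
    "$bin" extract --src input/ --dest workspace/ || exit 1
    "$bin" convert --src workspace/ --dest workspace/ || exit 1
    "$bin" compose --src workspace/ --dest workspace/ || exit 1
)
```

With the example paths, `run_migration /tmp/confluence` would leave the importable data in `/tmp/confluence/workspace/result/`. The wrapper stops at the first failing step.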
- Copy the "workspace/result" directory (e.g. `/tmp/confluence/workspace/result/`) to your target wiki server (e.g. `/tmp/result`)
- Go to your MediaWiki installation directory
- Make sure you have the target namespaces set up properly. See `workspace/space-id-to-prefix-map.php` for reference.
- Make sure `$wgFileExtensions` is set up properly. See `workspace/attachment-file-extensions.php` for reference.
- Use `php maintenance/importImages.php /tmp/result/images/` to first import all attachment files and images
- Use `php maintenance/importDump.php /tmp/result/output.xml` to import the actual pages
You may need to update your MediaWiki search index afterwards.
It is possible to use a YAML file to configure the commands `analyze`, `extract` and `convert`. For an example, see `/doc/config.sample.yaml`.
The configuration file is applied by adding the option `--config /tmp/config.yaml`.
Not all parameters from `config.sample.yaml` have to be set in the config file. Any parameter that is omitted falls back to its default.
The tool is compatible with the MediaWiki extension https://www.mediawiki.org/wiki/Extension:NSFileRepo, which restricts access to files and images to a given set of user groups associated with protected namespaces.
If NSFileRepo is used, the images cannot be uploaded with the script `maintenance/importImages.php`. Use `extensions/NSFileRepo/maintenance/importFiles.php` instead.
Example: `php extensions/NSFileRepo/maintenance/importFiles.php /tmp/result/images/`
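
The choice between the two image import scripts can be wrapped together with the page import. This is a sketch under assumptions: the function name `import_result`, its optional third argument `nsfilerepo` and the `PHP_BIN` override variable are hypothetical. Only the script paths and argument order are taken from the commands shown in this document.

```shell
#!/bin/sh
# Sketch only: import_result, its "nsfilerepo" argument and PHP_BIN are hypothetical.

# import_result MEDIAWIKI_DIR RESULT_DIR [nsfilerepo]
# RESULT_DIR should be an absolute path, because the function changes into
# MEDIAWIKI_DIR (inside a subshell) before calling the maintenance scripts.
import_result() (
    wiki_dir=${1:?usage: import_result <mediawiki-dir> <result-dir> [nsfilerepo]}
    result_dir=${2:?usage: import_result <mediawiki-dir> <result-dir> [nsfilerepo]}
    mode=${3:-default}
    php_bin=${PHP_BIN:-php}

    cd "$wiki_dir" || exit 1

    if [ "$mode" = "nsfilerepo" ]; then
        image_script=extensions/NSFileRepo/maintenance/importFiles.php
    else
        image_script=maintenance/importImages.php
    fi

    # Import attachments and images first, then the pages.
    "$php_bin" "$image_script" "$result_dir/images/" || exit 1
    "$php_bin" maintenance/importDump.php "$result_dir/output.xml" || exit 1
)
```

For a wiki that uses NSFileRepo, a call could look like `import_result /path/to/mediawiki /tmp/result nsfilerepo`, where `/path/to/mediawiki` stands for your MediaWiki installation directory.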
In Confluence, user spaces are protected. In MediaWiki, this is not possible for the namespace `User`. Therefore, user spaces are migrated to a namespace `User<username>`, which can be protected in BlueSpice for MediaWiki.
- `AttachmentsSectionEnd`
- `AttachmentsSectionStart`
- `Details`
- `DetailsSummary`
- `Excerpt`
- `ExcerptInclude`
- `Info`
- `InlineComment`
- `Layout`
- `Layouts.css`
- `Note`
- `Panel`
- `RecentlyUpdated`
- `SubpageList`
- `SubpageListRow`
- `Tip`
- `Warning`
- `PageTree`
- `SpaceDetails`
- `ViewFile`

Be aware that those pages may be overwritten by the import if they already exist in the target wiki.
- `Icon-info.svg`
- `Icon-note.svg`
- `Icon-tip.svg`
- `Icon-warning.svg`

Be aware that those files may be overwritten by the import if they already exist in the target wiki.
In case your pages contain a lot of external images (<img /> elements), be aware that MediaWiki does not show them by default. You'd need to configure $wgAllowExternalImages.
Read https://www.mediawiki.org/wiki/Manual:$wgAllowExternalImages for more information.
The output generated by the tool contains certain elements that need additional extensions to be enabled:
- TemplateStyles
- [ParserFunctions](https://www.mediawiki.org/wiki/Extension:ParserFunctions)
- DateTimeTools
- Checklists
- SimpleTasks
- EnhancedUploads
- Semantic MediaWiki
- HeaderTabs
- SubPageList
If the tool cannot migrate content or functionality, it creates a category, so you can fix the issues manually after the import:
- `Broken_link`
- `Broken_user_link`
- `Broken_page_link`
- `Broken_image`
- `Broken_layout`
- `Broken_macro/<macro-name>`

- User identities
- Comments
- Various macros
- Various layouts
- Blog posts
- Files of a space which cannot be assigned to a page
- Clone this repo
- Run `composer update --no-dev`
- Run `box compile` to actually create the PHAR file in `dist/`. See also https://github.com/humbug/box
- Reduce multiple linebreaks (`<br />`) to one
- Remove line breaks and arbitrary formatting (e.g. `<b>`) from headings
- Mask external images (`<img />`)
- Preserve filename of "Broken_attachment"
- Merge multiple `<code>` lines into `<pre>`
- Remove bold/italic formatting from wikitext headings (e.g. `=== '''Some heading''' ===`)
- Fix unconverted HTML lists in wikitext (e.g. `<ul><li>==== Lorem ipsum ====</li><li>'''<span class="confluence-link"> </span>[[Media:Some_file.pdf]]'''</li></ul><ul>`)
- Remove empty confluence storage format fragments (e.g. `<span class="confluence-link"> </span>`, `<span class="no-children icon">`)
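
One of the items above, removing bold/italic formatting from wikitext headings, can be illustrated as a standalone `sed` filter. This is only a sketch of the transformation, not the tool's implementation. The function name `strip_heading_markup` is hypothetical, and it assumes a `sed` that supports `-E` (GNU and BSD `sed` do).

```shell
#!/bin/sh
# Sketch only: strip_heading_markup is a hypothetical helper name.

# Reads wikitext on stdin. On heading lines (e.g. "=== ... ===") it removes
# runs of two or more apostrophes, which are wikitext bold/italic markers.
# All other lines pass through unchanged.
strip_heading_markup() {
    sed -E "/^=+[[:space:]].*[[:space:]]=+[[:space:]]*\$/ s/'{2,}//g"
}
```

For example, `printf "%s\n" "=== '''Some heading''' ===" | strip_heading_markup` prints `=== Some heading ===`, while a non-heading line such as `'''bold''' text` is left as it is.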


