Backup and Restore

Introduction

To ensure this MediaWiki's content will not be lost, we created a set of scripts and put them in the extensions/BackupAndRestore directory under $wgResourceBasePath.

The main challenge is to ensure both textual data and binary files are backed up and restored. There are four distinct steps:

  1. Official Database Backup Tools
  2. Official Media File Backup Tools
  3. Restoring Binary Files
  4. Restoring SQL Data

Database Backup

For textual data backup, the fastest way is to use "mysqldump". More detailed instructions can be found at the following link: [1]

To back up all the uploaded files, such as images, PDF files, and other binary files, you can refer to the following Stack Overflow answer.[2]

In the PKC docker-compose configuration, the backup file should be dumped to /var/lib/mysql for convenient file transfer to the host machine of the Docker runtime. Example of the command to run in a Linux/UNIX shell:

mysqldump -h hostname -u userid -p --default-character-set=whatever dbname > backup.sql

To run this command in PKC's Docker implementation, one needs to get into the Docker instance using something like:

docker exec -it pkc-mediawiki-1 /bin/bash
(pkc-mediawiki-1 may be replaced by xlp_mediawiki)
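
The dump can also be triggered from the host in a single line, without opening an interactive shell. The following is only a sketch: the container name follows the example above, the hostname, userid, password, and dbname are placeholders to be taken from LocalSettings.php as described below, and the output path follows the /var/lib/mysql location mentioned earlier.

# container name and connection parameters are placeholders; substitute your own values
docker exec pkc-mediawiki-1 /bin/bash -c 'mysqldump -h hostname -u userid -pPASSWORD_FOR_YOUR_DATABASE dbname > /var/lib/mysql/backup.sql'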

When running this command on the actual database host machine, hostname can be omitted, and the rest of the parameters are explained below:

mysqldump -u wikiuser -pPASSWORD_FOR_YOUR_DATABASE my_wiki > backup.sql
(note that you should NOT leave a space between -p and the password)

Substitute hostname, userid, whatever, and dbname as appropriate. All four may be found in your LocalSettings.php (LSP) file. hostname may be found under $wgDBserver; by default it is localhost. userid may be found under $wgDBuser. whatever may be found under $wgDBTableOptions, where it is listed after DEFAULT CHARSET=; if whatever is not specified, mysqldump will likely use the default of utf8, or, if using an older version of MySQL, latin1. dbname may be found under $wgDBname. After running this line from the command line, mysqldump will prompt for the server password (which may be found under Manual:$wgDBpassword in LSP).
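
As a worked example, suppose LocalSettings.php contained the following (hypothetical) values; the corresponding mysqldump invocation would then look like this:

# Hypothetical LocalSettings.php values (yours will differ):
#   $wgDBserver       = "localhost";
#   $wgDBname         = "my_wiki";
#   $wgDBuser         = "wikiuser";
#   $wgDBTableOptions = "ENGINE=InnoDB, DEFAULT CHARSET=utf8";
mysqldump -h localhost -u wikiuser -p --default-character-set=utf8 my_wiki > backup.sql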

For your convenience, the following command will compress the file as it is being dumped out.

mysqldump -h hostname -u userid -p dbname | gzip > backup.sql.gz
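
Before transferring the compressed dump off the machine, it can be worth verifying that the archive is intact. A minimal sketch:

gzip -t backup.sql.gz && echo "archive OK"   # test the gzip archive without extracting it
zcat backup.sql.gz | head -n 5               # peek at the first few lines of the dump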


Media File Backup

Before running the PHP maintenance script dumpUploads.php, you must first create a temporary working directory:

mkdir /tmp/MediaFiles

It is common for some files not to be dumped out, due to errors caused by escape characters in file names; this will be resolved in the future. In the standard PKC configuration, you must make sure you launch the following command from /var/www/html:

php maintenance/dumpUploads.php | sed -e '/\.\.\//d' -e "/'/d" | xargs --verbose cp -t /tmp/MediaFiles

Note that the second filtering expression tries to eliminate files with a ' character in their file names. After dumping all the files to the MediaFiles directory, make sure that you check whether any files are missing.
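
One rough way to check for missing files is to compare the number of files MediaWiki lists with the number actually copied. A minimal sketch, run from /var/www/html with the paths used above:

php maintenance/dumpUploads.php | wc -l   # files MediaWiki expects to exist
ls /tmp/MediaFiles | wc -l                # files actually copied out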

Then, compress the directory into a zip archive:

zip -r ~/UploadedFiles_date_time.zip /tmp/MediaFiles
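
Before deleting the temporary directory, listing the archive is a quick way to confirm that the expected number of files made it in. A minimal sketch:

unzip -l ~/UploadedFiles_date_time.zip | tail -n 3   # the last line reports the total file count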

Remember to remove the temporary files and their directory:

rm -r /tmp/MediaFiles

Restoring Binary Files

To load binary files into MediaWiki, one must use a maintenance script in the /maintenance directory. The command line information is shown below; it needs to be launched in the container that runs the MediaWiki instance.

Load images from the UploadedFiles location. In most cases, the $ResourceBasePath variable can be replaced by /var/www/html.
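
If the media files were archived with the zip command from the backup step, they first have to be extracted into that location. The following is only a sketch; it assumes the archive has already been copied into the container, and uses unzip's -j option to drop the stored tmp/MediaFiles/ path prefix so the files land directly in the target directory:

# -j drops the directory prefix stored in the archive (assumption: archive built as shown above)
unzip -j ~/UploadedFiles_date_time.zip -d /var/www/html/images/UploadedFiles/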

cd $ResourceBasePath
php $ResourceBasePath/maintenance/importImages.php $ResourceBasePath/images/UploadedFiles/

After all files are uploaded, one should run a maintenance script on the server that serves the MediaWiki service:

php $ResourceBasePath/maintenance/rebuildImages.php
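
If some uploads still do not appear after this step, the script also has a --missing option (documented on the manual page referenced below) that adds database records for files present on disk but absent from the image table:

php $ResourceBasePath/maintenance/rebuildImages.php --missing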

For more information, please refer to MediaWiki's documentation on Manual:rebuildImages.php.

Restoring SQL Data

The following command should be launched inside the container that hosts the mariadb/mysql service (reached from the host through docker exec or kubectl exec -it).

mysql -u $DATABASE_USER -p $DATABASE_NAME < BACKUP_DATA.sql
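
To run the restore from the Docker host in one step, something like the following can be used. This is only a sketch: the container name pkc-database-1 is a placeholder for whatever your database container is called, and the backup file is assumed to have been copied to /var/lib/mysql inside it.

# pkc-database-1, $DATABASE_USER, and $DATABASE_NAME are placeholders; substitute your own values
docker exec -it pkc-database-1 /bin/bash -c 'mysql -u $DATABASE_USER -p $DATABASE_NAME < /var/lib/mysql/BACKUP_DATA.sql'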

If the file is very large, it might have been compressed into gz or tar.gz form. In that case, just use a piped command to first uncompress it and send it directly to the mysql program for data loading.

gunzip -c BACKUP_DATA.sql.gz | mysql -u $DATABASE_USER -p $DATABASE_NAME

References

Related Pages