Automatic Robust Backup, or A.R.B., is an archiving and synchronization tool with automation, encryption, redundancy and performance as its goals. It is fast to deploy and provides pre-built use cases and sensible defaults. It is declarative and easy to customize.
- Automation: after configuration it should not require intervention.
- Encryption: neither a man-in-the-middle nor the server should be able to read the data content.
- Redundancy: data should not be lost even if a catastrophic failure happens.
- Performance: tasks should be done in a timely manner and conserve limited resources, such as storage space, when possible.
- Borg, the most popular chunk-based deduplication backup manager for home users.
- Gocryptfs, the spiritual successor to Ecryptfs that is mature, audited and actively developed.
- Rclone is a mature command line cloud storage manager that supports most if not all common providers.
- Rsync is the best way to sync a directory to another local or network directory.
- Git is the most popular version control system.
- Pass, the standard UNIX password manager, or gopass, its actively developed Go fork.
- Save space via Borg's fast and effective deduplication and compression.
- Create daily, or even more frequent, archives and mount any of them wherever you want to inspect the repository as it was at that time.
- Set complex retention policies to control repository size while preserving time span coverage.
- Ensure privacy and integrity of data stored in the cloud through Gocryptfs online encryption.
- Sync your data to over 40 cloud storage services, including all major providers with free tiers (Google Drive, Dropbox, OneDrive, Mega, etc.).
- Store secrets encrypted with a GPG key and versioned with Git.
- Archive system configuration files and package list.
- Archive your personal files and whatever files you wish.
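To illustrate what a retention policy can look like, here is a minimal sketch built on `borg prune`. The function name `prune_archives`, the `run` helper with its `DRY_RUN` switch, and the retention counts are assumptions made for this example; they are not A.R.B. defaults.

```shell
# run: execute a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# prune_archives REPO: keep 7 daily, 4 weekly and 6 monthly archives.
# The counts are illustrative; pick values that match your own needs.
prune_archives() {
    run borg prune --list \
        --keep-daily 7 --keep-weekly 4 --keep-monthly 6 \
        "$1"
}
```

Calling `DRY_RUN=1 prune_archives "$repo_path"` only prints the command, which is a cheap way to review a policy before it deletes anything.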
make
make install
Edit .config/arbie/config
to set up pipelines. Instructions and examples are included in the file.
Some of the tools require manual initialization or configuration. In the future there will be a tool to partially automate those.
Init a password repository
gopass setup
Note: A GPG key is needed.
Generate a long secure password
pass generate $secret_name
Insert a password manually
pass insert $secret_name
Note: These passwords will be needed later for encryption.
Init Git in System repository
git -C $repo_path init
Init Borg repository
borg init -e none $repo_path
Note: Encryption is done by gocryptfs.
Init reverse mode encryption in a dir
gocryptfs -extpass pass -extpass $secret_name -init -reverse $repo_path
Note: Reverse mode encryption mounts a plain directory and its files as encrypted files with encrypted directory names, which is ideal for storing in the cloud.
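Once reverse mode is initialized, the encrypted view can be mounted, synced and unmounted. The sketch below shows one possible sequence; the function name `sync_encrypted`, the `run` helper with its `DRY_RUN` switch, and the remote name used in the usage note are assumptions for illustration, not part of A.R.B.

```shell
# run: execute a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# sync_encrypted PLAIN_DIR SECRET_NAME MOUNT_POINT REMOTE
# Mounts the encrypted view of PLAIN_DIR, syncs it, then unmounts it.
sync_encrypted() {
    plain_dir=$1 secret_name=$2 mount_point=$3 remote=$4
    run mkdir -p "$mount_point"
    # The password is read from pass, as in the init command above.
    run gocryptfs -reverse -extpass pass -extpass "$secret_name" \
        "$plain_dir" "$mount_point" || return 1
    run rclone sync "$mount_point" "$remote"
    status=$?
    # Always unmount, even if the sync failed.
    run fusermount -u "$mount_point"
    return "$status"
}
```

For example, `DRY_RUN=1 sync_encrypted "$repo_path" "$secret_name" /tmp/cipher remote:backup` prints the four commands without running them, where `remote:backup` stands for a remote created with `rclone config`.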
Configure streams
rclone config
Enable the systemd timer as user
systemctl --user enable arbie.timer
By default it will try to run daily at midnight and, if a run was missed, run immediately after login. You can edit the timer to make it run whenever you want by using a cron-like syntax.
systemctl --user edit arbie.timer
More information about that is available on the Arch Wiki: Systemd/Timers
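As a non-interactive alternative to `systemctl --user edit`, the same kind of override can be written as a drop-in file. This is a sketch that assumes the shipped timer schedules itself with `OnCalendar=`; the function name `write_timer_override` and the 03:00 schedule are made up for the example.

```shell
# write_timer_override DROPIN_DIR: write a drop-in that replaces the schedule.
write_timer_override() {
    mkdir -p "$1" || return 1
    cat > "$1/override.conf" <<'EOF'
[Timer]
# An empty assignment clears the schedule inherited from the shipped unit.
OnCalendar=
# Run every day at 03:00 instead (illustrative value).
OnCalendar=*-*-* 03:00:00
EOF
}
```

A typical call would be `write_timer_override "$HOME/.config/systemd/user/arbie.timer.d"` followed by `systemctl --user daemon-reload`. Running `systemd-analyze calendar '*-*-* 03:00:00'` is a quick way to check how systemd interprets an expression.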
Before anything, export the repository path.
export BORG_REPO="$repo_path"
Show repository info
borg info
List archives
borg list
Mount an archive with FUSE
borg mount ::archiveName mountPoint
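A mounted archive behaves like a read-only directory, so restoring a file is an ordinary copy. The sketch below strings the steps together and assumes `BORG_REPO` is exported as shown above; the function name `restore_from_archive` and the `run` helper with its `DRY_RUN` switch are hypothetical.

```shell
# run: execute a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# restore_from_archive ARCHIVE MOUNT_POINT SRC DEST
# SRC is a path relative to the root of the mounted archive.
restore_from_archive() {
    archive=$1 mount_point=$2 src=$3 dest=$4
    run mkdir -p "$mount_point"
    run borg mount "::$archive" "$mount_point" || return 1
    run cp -a "$mount_point/$src" "$dest"
    status=$?
    # Unmount even if the copy failed.
    run borg umount "$mount_point"
    return "$status"
}
```

With `DRY_RUN=1` the function only prints the commands, which makes it easy to check the paths before touching any files.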
Security is a big concern. Rclone and Borg have their own encryption features, but following the principle of "do one thing and do it well", encryption is left to Gocryptfs, which is exclusively an audited encrypted file system.
Note: The repeated and thus predictable header pattern of Borg files may be a vector for a sophisticated attack.
There is no ideal backup method. But for most users, data can be classified in an ABC fashion: a few files that they really can't lose; data with average volume and importance; voluminous but unimportant data. Each of these categories will have its own ideal method.
Cloud providers are a cheap way to have an off-site copy replicated in data centers globally. Some people may have a limited Internet connection and may find it useful to instead sync a secondary archive with higher compression and heavy use of exclusion patterns, while syncing a full archive on premises.
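A secondary, slimmer archive of this kind could be created roughly as follows. This is only a sketch: the function name `create_slim_archive`, the `run` helper with its `DRY_RUN` switch, the compression level and the exclusion patterns are assumptions, and `zstd` compression requires a Borg release that supports it.

```shell
# run: execute a command, or only print it when DRY_RUN=1 (hypothetical helper).
# Note that the printed line does not preserve shell quoting.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# create_slim_archive REPO SOURCE_DIR
# Trades CPU time for size and skips bulky or disposable files.
create_slim_archive() {
    run borg create --compression zstd,19 \
        --exclude '*.iso' --exclude '*/.cache/*' \
        "$1::{hostname}-{now}" "$2"
}
```

The `{hostname}` and `{now}` placeholders are expanded by Borg itself, so each run gets a distinct archive name.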
File-based backup is simpler, but the controlled granularity of chunk-based backup is ideal. Syncing a few large files would be a pain because any modification would require re-uploading the whole file. On the other hand, syncing a great number of small files directly would exhaust the API request quota. Borg allows tuning the chunk size and is performant.
Reinstalling is faster and saner than doing whole-disk backups. It's more practical to back up the system configuration and a list of installed packages. After a fresh minimal install the user can run a script to recover the system settings. The advantages are: no need to restart; instantly done; no voluminous disk images or tar archives; high-granularity history of system changes.
Desktops generally don't stay on 24/7, so there's a need for a tool that will reschedule missed tasks. Anacron does that, but unfortunately it would require the scripts to run as root, while Systemd allows the scripts to run in the user environment and provides its own logging through Journalctl. Also, many distros now ship with only Systemd installed.