Disaster recovery using Snapshot & Replication (Bitwarden scenario using Docker)

Luka Manestar

15 Apr 2021 — 8 min read

About a year ago, I wrote an article on one of Synology's great apps, the Snapshot and Replication tool.

DR - disaster recovery
SR - Snapshot and Replication
BW - Bitwarden

In this article, I would like to focus on a specific disaster recovery (DR) test using the SR tool: recovering your Bitwarden server running in Docker.

Any service that is important to you or your business needs the maximum possible uptime, but from time to time problems will prevent you from using that service. If such a problem means the service can no longer run the way it is currently configured, you might have to do a DR switch.

If you are hosting your own BW platform for yourself, your business, or a 3rd party, you need to make sure that the service is always accessible with minimal or no downtime. If, for whatever reason, you end up in a situation where you need to perform a DR switch, you have to be ready for it in advance.

I have decided to use BW as a demo platform considering that not having access to your password manager might lead to a lot of problems for you or your users, so getting this back up as quickly as possible is imperative.

Also, for this case, we will use the bitwardenrs Docker image in an SQLite configuration (so not connected to an external database).

Let's see how Docker (Portainer is optional but recommended) and Synology's SR platform can help you get up and running in a matter of minutes in case you need to do a DR switch.

NOTE: This article assumes that you have the following requirements in place to meet this goal:

* Two separate NAS devices that are both Docker and SR capable (LAN or remotely)
* Access to your DNS and, potentially, your reverse proxy configuration (depending on how your BW instance is configured)
* Running BW instance
* (optional) Portainer for Docker container management

Links throughout this article will show you how to configure BW and set up SR, so I will not focus on those but rather on a simple walkthrough of how to perform a DR switch.

Main production side

If you already have BW (using this image) up and running, you probably have its volume bind configured so that a folder on your NAS is mapped to the container's internal /data directory. That location on your NAS could be, for example, /volume2/bitwarden/
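To make the volume bind concrete, here is a minimal Python sketch that only builds and prints a docker run command for such a container. The image tag bitwardenrs/server:latest, the container name, host port 8080, and the image's internal port 80 are assumptions for illustration; only the /volume2/bitwarden/ to /data mapping comes from this article.

```python
import shlex


def bitwarden_run_command(host_data_dir, name="bitwarden",
                          image="bitwardenrs/server:latest", host_port=8080):
    """Build (but do not execute) a docker run command for a BW container.

    All defaults are illustrative assumptions: adjust the image tag,
    container name, and published port to match your own setup.
    """
    bind = f"{host_data_dir.rstrip('/')}:/data"  # NAS folder -> container /data
    return [
        "docker", "run", "-d",
        "--name", name,
        "--restart", "unless-stopped",
        "-v", bind,
        "-p", f"{host_port}:80",  # assumes the image listens on port 80
        image,
    ]


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it by hand.
    print(shlex.join(bitwarden_run_command("/volume2/bitwarden/")))
```

Using the same bind path on both NAS devices keeps the DR container definition identical to production, which simplifies the switch later.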

What needs to be done now is to have that location covered by the SR replication process to your secondary NAS.

Replication setup

This is the process of setting up replication from the production site to the DR site:

Begin by clicking the "Create" button in the Replication section

Select "Remote" considering we are replicating to a separate NAS

Enter your DR NAS information (use FQDN if going over the Internet)

Select the target volume (if you have multiple ones) where you want the data to be replicated to

Select the bitwarden folder where the Docker data lives (the root "bitwarden" folder in this case)

You will have the option to sync data right away or not. Choose what fits your needs

Configure the replication schedule. This one is set to 1h because the data is important and changes frequently

Now that you have configured the SR task, replication will begin at some point and complete. Once that is done you are all set regarding replication to your DR site.
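The replication interval also bounds how much recent data you can lose in a failover. The sketch below is a rough illustration of that reasoning, assuming the 1h schedule from this example; the 15-minute grace period and the function names are hypothetical, and the timestamp of the last successful replication is something you would read from the SR task status yourself.

```python
from datetime import datetime, timedelta

REPLICATION_INTERVAL = timedelta(hours=1)  # schedule used in this article


def data_at_risk(last_replication, failure_time):
    """Window of changes that exist only on production at failure_time."""
    if failure_time < last_replication:
        raise ValueError("failure_time is earlier than last_replication")
    return failure_time - last_replication


def replica_is_stale(last_replication, now,
                     interval=REPLICATION_INTERVAL,
                     grace=timedelta(minutes=15)):
    """True if the replica is older than the schedule plus a grace period.

    The grace period is an arbitrary assumption that allows for the time
    a replication run itself takes.
    """
    return now - last_replication > interval + grace


if __name__ == "__main__":
    last = datetime(2021, 4, 15, 10, 0)
    print(data_at_risk(last, datetime(2021, 4, 15, 10, 40)))
    print(replica_is_stale(last, datetime(2021, 4, 15, 11, 30)))
```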

Once you are sure that the replication is in place and working as it should, you have one more thing to do before you are ready, and that's to set up a Docker BW instance on your DR site.

The reason behind this is that you should keep your production and DR side of BW up to date so that when you need DR, both sides are using the same Docker image version, and the only thing you need to do is copy over the /data element and boot up the container.
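One way to check that both sides run the same image is to compare image IDs. The following is a sketch under assumptions: it shells out to docker image inspect, and the optional docker_host value (for example an ssh:// address of the other NAS) is hypothetical and depends on how remote Docker access is set up in your environment. You could just as well run the inspect command on each NAS by hand and compare the results.

```python
import subprocess


def image_id(image, docker_host=None):
    """Return the ID of `image` as reported by `docker image inspect`.

    docker_host is optional and hypothetical here; when given, it is
    passed to the docker CLI through its -H option.
    """
    cmd = ["docker"]
    if docker_host:
        cmd += ["-H", docker_host]
    cmd += ["image", "inspect", "--format", "{{.Id}}", image]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout.strip()


def same_image(prod_id, dr_id):
    """True only when both IDs are non-empty and identical."""
    return bool(prod_id) and prod_id == dr_id
```

If the IDs differ, pull the same image tag on the DR side before you need it, not during the disaster.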

Disaster time, what now?

Now, let's say that your main production BW instance is not working (Docker side) for whatever reason (the NAS is down, the reverse proxy is down, a Docker update failed, etc.). You will need to activate your DR side to get up and running again.

There are three (3) main things that you need to do if your setup is in LAN:

01. make BW data content accessible on the DR site
02. start the DR BW container
03. change the reverse proxy BW entry to point to the new Docker destination

With these steps, your users will keep access to their BW instance under the same public domain name, and the only downtime will be the time it takes to execute the switch.

If this setup is configured so that the DR device is outside your current LAN, then these are the steps needed:

01. make BW data content accessible on the DR site
02. start the DR BW container
03. change your BW DNS record to point to the new DR destination
04. change the reverse proxy BW entry to point to the new Docker destination

The additional step here is the DNS change, which points your public BW name to the new (off-site) location. As before, your users will not need to change the domain name in any of their client apps.
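The two variants differ only in the DNS step, which a small runbook helper can make explicit. This is just an illustrative sketch; the function name dr_steps is hypothetical.

```python
def dr_steps(dr_outside_lan):
    """Return the ordered DR switch steps for a LAN or off-site DR device."""
    steps = [
        "make BW data content accessible on the DR site",
        "start the DR BW container",
    ]
    if dr_outside_lan:
        # Only needed when the public name must resolve to a new location.
        steps.append("change your BW DNS record to point to the new DR destination")
    steps.append("change the reverse proxy BW entry to point to the new Docker destination")
    return steps


if __name__ == "__main__":
    for number, step in enumerate(dr_steps(dr_outside_lan=True), start=1):
        print(f"{number:02d}. {step}")
```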

STEP01 - restore data using Snapshot & Replication (DR side)

At the moment of DR, you will first need to restore your main production replica on your DR site to use the content and start the container.

At this point, the bitwarden content folder is in read-only mode on the DR side. You can confirm this by looking at the status of the shared folders on the DR NAS:

To make this folder read-write, and to have it connected to the BW DR container, we need to initiate a failover.

As described in the initial Snapshot & Replication article you will have two options: Failover and Switchover. In a real DR case (meaning that your main production site is down), you will most definitely use the failover option. This will mean that your main site is completely inaccessible and you need to make your DR side your new "production" site.

Considering that the production side is still "alive", we can only do a "force failover" for this demonstration, but the result will be the same.

You can also select a snap version to failover to

Failover has been completed pending re-protect operation

Re-protect your replica in order to complete the entire process

Even though the shared folder on the DR side is already in read/write mode, we need to complete the process by using the re-protect option to make this DR folder the new (current) production side

No changes are needed at the moment, just select "re-protect"

NOTE: Depending on the size of your data, the re-protect step might take a while

Configuration on the original production side, showing that data is now being "replicated from" the DR side

Once you have data on your DR side in read/write mode (while in replication mode DR site will be read-only) you are ready to have that content connected with a Docker BW DR container.

STEP02 - start your BW DR container

Nothing special in this step. Once your BW data is ready (volume bind), simply fire up your BW DR container, and that's it.
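If you prefer the command line over Portainer or the Docker UI, starting the container and confirming that it is running could look like the sketch below. The container name bitwarden-dr is a hypothetical example, and the run parameter exists only so the function can be exercised without a Docker host.

```python
import subprocess


def start_container(name, run=subprocess.run):
    """Start an existing container and report whether Docker sees it running."""
    run(["docker", "start", name], check=True)
    result = run(
        ["docker", "inspect", "--format", "{{.State.Running}}", name],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip() == "true"


if __name__ == "__main__":
    # "bitwarden-dr" is an assumed container name; use your own.
    print(start_container("bitwarden-dr"))
```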

STEP03 - DNS and reverse proxy changes

Finally, depending on how your DR solution using the SR method is implemented (LAN or WAN), there are one or two changes left to make.

If you are running everything on-site (LAN) and you simply restored data on your secondary NAS unit, then you only have to change the reverse proxy entry to point to the new Docker host IP address (depending on where your Docker DR host is located).

If, on top of this, your DR site is at a remote location over the Internet, you will probably also have to change your BW DNS record to point to the new location, and then finish up with the reverse proxy change.

NOTE: reverse proxy changes are only needed if you are running BW behind a reverse proxy. If you are not, your DR instance should already be set up with the correct SSL certificate and ready to go. This article assumes that you are using a reverse proxy configuration
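After the DNS and reverse proxy changes, it is worth verifying that the public name really leads to the DR side. The sketch below uses only the Python standard library; the hostname and IP address in the usage example are placeholders, and the check covers IPv4 records only.

```python
import socket


def dns_points_to(hostname, expected_ip, resolve=socket.gethostbyname_ex):
    """True if one of the hostname's IPv4 addresses equals expected_ip."""
    _name, _aliases, addresses = resolve(hostname)
    return expected_ip in addresses


def port_open(host, port=443, timeout=5.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Placeholder values: replace with your BW domain and DR public IP.
    print(dns_points_to("bw.example.com", "203.0.113.10"))
    print(port_open("bw.example.com"))
```

Keep in mind that DNS changes can take a while to propagate, depending on the TTL of the record, so a failed check right after the change does not necessarily mean the change was wrong.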

Conclusion

As you can see, the process is quick and keeps downtime to a minimum. Of course, you can reduce your odds of ending up in a DR scenario in the first place (with a high availability setup, for example), but as I said, if you do face a real DR scenario, it is best to be prepared.

This example can be applied to many more Docker apps and services, so make sure that all your critical operations are up to date and DR-ready. The Snapshot and Replication package will make sure that all your data is up to date and ready to go on your DR site in the shortest possible time. Use this to your advantage.

As always, feel free to comment on the matter in the section below.