You can use a backup to restore a cluster to a prior state, restore it to a differently configured appliance, or move it to another appliance. Some advanced administrative operations also use backups.
When to Use a Backup
Backups are created from an existing snapshot, but they differ from snapshots in the following ways:
Backups are stored on disk in a directory, while snapshots are stored in HDFS.
You can use a backup to recover from data loss or corruption, even if your cluster has been destroyed. Snapshots can be lost if the HDFS name node fails, multiple disks fail at once, or the entire cluster is destroyed.
If you need to move data between two appliances, you must use a backup. Snapshots may only be used to restore to the cluster they were taken from.
Backups can be full, lightweight, or dataless:
Full backups contain the entire cluster with all of its data, whether loaded from the web interface or through tsload. They are written to a directory, which can be moved between clusters, even if the cluster configurations differ. Full backups are very large, so before taking one, make sure there is enough disk space in the directory where it will be stored. NAS (network-attached storage) is recommended for storing backups.
Lightweight backups contain everything that makes up a cluster, except for any data loaded through ThoughtSpot Loader (tsload). Any data loaded via tsload can be re-loaded after the cluster has been restored, using the same scripts or remote connections you used to load it initially.
Lightweight backups contain the following:
Cluster configuration (SSH, LDAP, etc.)
In-memory data cache
All data that is stored in HDFS
Data uploaded by users
Metadata for the data store
Users, groups and permissions
Objects created by users (answers, pinboards, worksheets, and formulas) with their shares and permissions
Data model and row-level security rules
Dataless backups save the schema, with no data. They are provided mainly for support purposes, so that you can send a copy of your cluster metadata to ThoughtSpot Support for troubleshooting without compromising data security and privacy. When restoring from a dataless backup, you must supply the correct release tarball, because this type of backup does not include the software release.
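The advice to confirm free disk space before taking a full backup can be scripted. The following is a minimal sketch, not part of ThoughtSpot: the function name, the size estimate, and the 20 percent headroom are all assumptions chosen for illustration. It uses only the Python standard library.

```python
import shutil


def has_room_for_backup(backup_dir: str, estimated_bytes: int,
                        headroom: float = 0.2) -> bool:
    """Return True if backup_dir has room for a backup of the given size.

    estimated_bytes is your own estimate of the backup size (an assumption;
    this sketch does not measure the cluster). headroom is an extra safety
    margin expressed as a fraction of that estimate.
    """
    if estimated_bytes < 0 or headroom < 0:
        raise ValueError("estimated_bytes and headroom must be non-negative")
    # Free bytes on the filesystem that holds backup_dir.
    free_bytes = shutil.disk_usage(backup_dir).free
    required_bytes = int(estimated_bytes * (1 + headroom))
    return free_bytes >= required_bytes
```

You would call it with the target directory and your size estimate, for example `has_room_for_backup("/path/to/backups", 500 * 1024**3)`, where the path is a placeholder for your own backup directory.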
Take a Snapshot
A snapshot is a point-in-time image of your running cluster. Create a snapshot before making any changes to the environment, loading a large amount of new data, or changing the structure of a table.
Take a Backup of a Snapshot
Taking a backup copies a snapshot to persistent storage on disk, typically in a network-mounted directory. Use this procedure when you want to create a backup.
Configure Periodic Backups
You can configure ThoughtSpot to take backups automatically at intervals you define. Old backups are discarded automatically in FIFO (first in, first out) order, so the oldest backup is removed first.
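To make the FIFO behavior concrete, here is a small sketch of how such a retention rule can be expressed. It illustrates the general technique only and is not ThoughtSpot's implementation: the `Backup` record, the `backups_to_discard` function, and the `keep` limit are hypothetical names introduced for this example.

```python
from typing import NamedTuple


class Backup(NamedTuple):
    name: str          # hypothetical backup label
    created_at: float  # creation time, in seconds since the epoch


def backups_to_discard(backups: list[Backup], keep: int) -> list[Backup]:
    """Return the backups a FIFO policy would discard, oldest first.

    At most `keep` of the newest backups are retained; everything older
    is selected for removal. The function only selects; deleting the
    selected backups is left to the caller.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    oldest_first = sorted(backups, key=lambda b: b.created_at)
    excess = len(oldest_first) - keep
    return oldest_first[:excess] if excess > 0 else []
```

With three backups and `keep=2`, only the oldest one is selected; with `keep` greater than or equal to the number of backups, nothing is selected.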