Backup Everything – The datacenter done fast!

I support a very diverse environment with a mix of Windows, NetWare, Linux, Solaris, and Mac clients running on standard servers as well as VMware ESX, plus two different brands of NAS, a few iSeries systems, and an Apple Xsan thrown in for good measure.  We have hundreds of applications running on these systems including SQL, Oracle, MySQL, SharePoint, Documentum, and Agile.  These applications are mostly contained in our primary datacenter, but we also have a few remote datacenters for specific applications and for disaster recovery, as well as a couple of remote business offices.

Recently I’ve been working on a project to replace our existing backup application with a new one.  We were experiencing extremely long backup windows, low throughput per client, and high backup failure rates with our existing solution, and it was time to make a change.  The goal was to protect all of our systems, regardless of their location, with both an onsite backup in our primary datacenter and an offsite copy for disaster recovery purposes.  Additionally, we wanted to use little or no tape.  After research, lots of vendor meetings, a consulting engagement, and lengthy debate, we chose Symantec NetBackup with Symantec NetBackup PureDisk and DataDomain.  This combination was chosen for several reasons, which will become clearer below.

For those of you who are not familiar with these products, here’s a brief description.

Symantec NetBackup is a traditional backup solution that is designed to move data from many clients, as fast as possible, to disk or tape.  It is similar to EMC NetWorker, Symantec Backup Exec, and any number of other backup products.  NetBackup supports a wide variety of clients, NAS devices, and applications (SQL, Exchange, etc.), as well as tape libraries and disk storage for the backed up data.  Since it simply copies all of the data that resides on the client directly to the backup server, it is not particularly tuned for backing up remote offices across the WAN, and it can easily flood a local LAN during a backup.

Symantec NetBackup PureDisk is currently a separate solution from the base NetBackup product; it is designed specifically for backing up data over the WAN.  PureDisk is a “source-dedupe” solution and is very similar in function to EMC’s Avamar product, with which I have a long-standing love affair.  PureDisk performs an incremental-forever style of backup where only the data that changed since the last backup is copied to the backup server.  It then uses deduplication technology to reduce the resulting backup dataset down to an even smaller size before it gets copied across the network.  The data is collected and stored (in its deduplicated form) on the backup server.  With this design PureDisk saves network bandwidth as well as disk space on the backup server, making it ideal for backups across the WAN, VPN, etc.  Symantec’s goal is to merge PureDisk into NetBackup as a single solution at some point, probably next year.  PureDisk backup servers can replicate backed up data to other PureDisk backup servers in deduplicated form for redundancy across sites.  The downside to PureDisk is that raw throughput on a PureDisk backup server is not high enough for datacenter use, and client support is more limited than the standard NetBackup product.
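To make the source-dedupe idea a little more concrete, here is a minimal Python sketch.  It is not how PureDisk actually works internally; the fixed chunk size, the SHA-256 fingerprints, and the `BackupServer` class are all assumptions I made up for illustration.  The point is only that the client fingerprints its data first and sends a chunk across the network only when the server does not already hold it.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical, tiny on purpose; real products use far larger chunks


class BackupServer:
    """Toy stand-in for a dedupe backup server (illustrative only)."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes

    def has(self, fingerprint):
        return fingerprint in self.chunks

    def store(self, fingerprint, chunk):
        self.chunks[fingerprint] = chunk


def backup(data, server):
    """Back up `data`; return (recipe, bytes_sent_over_the_network)."""
    recipe = []      # ordered fingerprints needed to rebuild the data
    bytes_sent = 0
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        # Source-side dedupe: only unseen chunks cross the network.
        if not server.has(fingerprint):
            server.store(fingerprint, chunk)
            bytes_sent += len(chunk)
        recipe.append(fingerprint)
    return recipe, bytes_sent


def restore(recipe, server):
    """Rebuild the original data from its list of fingerprints."""
    return b"".join(server.chunks[fp] for fp in recipe)
```

In this toy model, backing up `b"AAAABBBBAAAACCCC"` sends only 12 of the 16 bytes (the repeated `AAAA` chunk is skipped), and a second backup that differs in just the last chunk sends only 4 bytes.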

DataDomain (now part of EMC) has been making its DDR products for a while now and has been very successful (prompting the recent bidding war between NetApp and EMC to purchase the company).  DataDomain appliances are “target-dedupe” devices that are designed to replace tape libraries in traditional backup environments, like NetBackup.  The DDR appliance presents itself as a VTL (virtual tape library) via SAN, a CIFS (Windows) file server, and/or an NFS (UNIX) file server, making it compatible with pretty much any type of backup system.  DataDomain also supports Symantec’s OpenStorage (OST) API, which is available in NetBackup 6.5.  The DDR system receives all of the data that NetBackup copies from backup clients, deduplicates the data in real time, then stores it on its own internal disk.  Because the DDR is purpose-built and has fast processors, it can process data at relatively high throughput rates.  For example, a single DD690 model is rated at 2.7 TB/hour (about 6 Gbps) when using OST.  The deduplication in a DDR provides disk-space savings but does not reduce the amount of data copied from backup clients.  DDRs can also replicate data (in deduplicated form) to other DDRs across the LAN or WAN, which is great for offsite backups.
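If you want to check the TB/hour to Gbps conversion yourself, here is a quick Python sketch of the arithmetic.  It assumes decimal units (1 TB = 10^12 bytes, 1 Gbps = 10^9 bits per second); the function name is just something I picked for this example.

```python
def tb_per_hour_to_gbps(tb_per_hour):
    """Convert a throughput rating from TB/hour to gigabits per second.

    Assumes decimal units: 1 TB = 1e12 bytes, 1 Gbps = 1e9 bits/sec.
    """
    bytes_per_second = tb_per_hour * 1e12 / 3600  # 3600 seconds per hour
    bits_per_second = bytes_per_second * 8
    return bits_per_second / 1e9


# The DD690 rating quoted above: 2.7 TB/hour
print(round(tb_per_hour_to_gbps(2.7), 2))
```

That works out to 750 MB/sec, or about 6 Gbps, which is why a 10 Gbps network path to the appliance is worth considering.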

For an explanation of deduplication, check out my prior post on the topic.

Two of the challenges we faced when designing the final solution had to do with the cost per TB of DataDomain disk and the slightly limited client OS support of PureDisk.  But we had a clean slate to work from–there was no interest in utilizing any of the existing backup infrastructure aside from the two IBM tape libraries we had.  We were not required to use the libraries but we wouldn’t be buying new ones if we planned on using tape as part of the new solution.

For the primary datacenter we deployed NetBackup Master and Media servers and a DataDomain DD690, and connected them to each other with Cisco 4900M 10 Gbps switches.  We deployed a warm-standby master server plus a media server and another DD690 in our DR datacenter, but did not use 10 Gbps there due to the additional cost.

With this setup we covered all of the clients in our primary datacenter.  Systems that have large amounts of data (like Microsoft Exchange, SAS Financials, etc.) were connected directly to the 4900M switches (via 1 Gbps connections).  Aggregate throughput of the backups during a typical night averages 400-500 MB/sec with all of the data going to the DataDomain.  The Exchange servers flood their network links, pushing over 100 MB/sec per server when backing up the email databases.  We currently back up 9 TB of data per night with 3 media servers and a single DDR in about 5 hours.  Our primary bottlenecks are the VCB Proxy server (we need more of them) and the aging datacenter core network, which has an aggregate throughput of barely more than 1 Gbps.
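As a sanity check on those numbers, here is a small Python sketch of the backup-window arithmetic.  It assumes decimal units (1 TB = 10^6 MB) and an even spread of load across the media servers, which is a simplification; the function names are mine.

```python
def average_mb_per_sec(terabytes, hours):
    """Average throughput needed to move `terabytes` in `hours` (decimal units)."""
    return terabytes * 1e6 / (hours * 3600)


def link_limit_mb_per_sec(gbps):
    """Theoretical ceiling of a network link in MB/sec, ignoring protocol overhead."""
    return gbps * 1e9 / 8 / 1e6


nightly = average_mb_per_sec(9, 5)   # 9 TB in about 5 hours
print(nightly)                       # aggregate MB/sec
print(nightly / 3)                   # per media server, if spread evenly
print(link_limit_mb_per_sec(1))      # ceiling of one 1 Gbps client link
```

Nine TB in five hours averages out to 500 MB/sec, which lines up with the top of the observed 400-500 MB/sec range.  A 1 Gbps link tops out at 125 MB/sec in theory, so a server pushing over 100 MB/sec really is close to saturating its connection.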

But what about those remote sites?  What does OST really add?  How do you tackle the NAS backups without resorting to tape?  All that and more is coming up soon…