Why pNFS can be a big deal even if NFS4.1 isn’t…

It’s been a little while since I’ve posted, mostly because my life was turned on its rear after our first child was born 8 weeks ago.  As things start to settle into a rhythm (as much as is possible), I’ve been back online more, reading blogs, following Twitter, and working with customers regularly.  As some of you may know, EMC announced support for pNFS in Celerra with the release of DART 6.x, and there have been several recent posts about the technology which piqued my interest a little.

The other bloggers have done a good job of describing what pNFS is and what is new in NFS4.1 itself so I won’t repeat all of that.  I want to focus specifically on pNFS and why it IS a big deal.

Prior to my coming to work for EMC, I worked in internal IT at a company that deals with large binary files in support of product development, as well as video editing for marketing purposes.  I had a chance to evaluate, implement, and support multiple clustered file system technologies.  The first was for an HD video editing solution using Macs, and we followed the likely path of implementing Apple’s XSAN solution, which you may know is an OEM’d version of Quantum (ADIC) StorNext.  StorNext allows you to create large filesystems across many disks and access them as local disk on many clients.  File open, close, byte-range locking, etc. are handled by MetaData Controllers (MDCs) across an IP network, while the actual heavy lifting of read/write IO is done over FibreChannel from the clients to the storage directly.  The result is all the shared filesystem benefits of NAS with the performance benefits of SAN.

The second project was specifically targeted at moving large files (4+GB each) through a workflow across many computers as quickly as possible so we could ship products.  Faster processing of the workflow translated to more completed projects per person per day, which meant better margins and keeping our partners and customers happy.  The workflow was already established, using Windows-based computers and a file server.  The file server was running out of steam, and the amount of data being stored at any given time had increased from 500GB to 8TB over the past 12 months.  We needed a simple way to increase the performance of the file server and also allow for better scalability.  Working with our local EMC SE, we tested and deployed MPFSi using a Celerra NS40 with integrated storage.

MPFS (also known as High Road) has been around a long time and works with Windows and various *nix-based platforms.  It is similar to XSAN/StorNext in that open/close/locking activity is handled over IP by the metadata controller (the Celerra datamover in the case of MPFS) while the read/write IO is handled over block storage technology (MPFS supports FibreChannel and iSCSI connectivity to storage).  The advantage of MPFS over many other solutions is that the metadata controller and storage are all built into the EMC Celerra storage device, so you don’t have to deploy any other servers.

In our case we chose iSCSI due to the cost of FC (switches and HBAs) and used the GigE ports on the Celerra’s CX3 backend for block connectivity.  In testing we showed that CIFS alone provided approximately 240 Mbps of throughput over GigE connections, while enabling MPFSi netted about 750 Mbps, even when we used the same NIC on the client.  So we roughly tripled throughput over the same LAN by installing a software client.  Had we gone the extra mile to deploy FibreChannel for the block IO, we would likely have seen much higher throughput.
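To put those numbers in terms of the workflow, here is a rough back-of-the-envelope sketch of what they mean for a single 4 GB file.  It assumes the throughput figures are megabits per second and uses decimal gigabytes; the function name is just for illustration, and real transfers would carry protocol overhead that this ignores.

```python
def transfer_seconds(file_gb: float, throughput_mbps: float) -> float:
    """Seconds to move file_gb gigabytes at throughput_mbps megabits per second."""
    file_megabits = file_gb * 1000 * 8  # decimal GB -> megabytes -> megabits
    return file_megabits / throughput_mbps


cifs_seconds = transfer_seconds(4, 240)    # CIFS alone
mpfsi_seconds = transfer_seconds(4, 750)   # MPFSi over the same GigE NIC
speedup = 750 / 240

print(f"CIFS:  {cifs_seconds:.1f} s per 4 GB file")
print(f"MPFSi: {mpfsi_seconds:.1f} s per 4 GB file")
print(f"Speedup: {speedup:.3f}x")
```

Under those assumptions a 4 GB file drops from a little over two minutes to well under one, which adds up quickly when every project moves many such files between workstations.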

Even better, the use of MPFS did not preclude the use of NDMP for backup to tape directly from the Celerra, accelerating backup many times over the old fileserver.  Clients that did not have the MPFS software installed accessed the same files over traditional CIFS with no problems.  Another side benefit of MPFS over traditional CIFS is that the block I/O stack is much more efficient than the NAS I/O stack, so even with increased throughput, CPU utilization is lower on the client, returning cycles to the application that is doing work for your business.

There are many clustered file system / clustered NAS solutions on the market from a variety of vendors (StorNext, MPFS, GFS, Polyserve, etc.), and most of these products are trying to solve the same basic problems of storing more data and increasing performance.  The problem is that they are all proprietary, and because of that you end up with multiple solutions deployed in the same company.  In our case we couldn’t use MPFS for the video editing solution because EMC has not provided a client for Mac OS X.  And this is where pNFS really becomes attractive.  Storage vendors and operating system vendors alike will be upgrading the already ubiquitous NFS stack in their code to support NFS4.1 and pNFS.  That support means I could deploy an MPFS-like solution using the same Celerra-based storage, with no extra servers and no special client software, just the native NFS client in my operating system of choice.  Perhaps Apple will include a pNFS-capable client in a future version of Mac OS X.

If you look at the pNFS standard you’ll see that it supports the use of not only block storage, but object and file based storage as well.  So as we build out larger and larger environments and private clouds start to expand into public clouds, you could tier your pNFS data across FibreChannel storage, object storage (think Atmos on premises), and a service provider cloud (e.g., AT&T Synaptic).  Now you’ve dramatically increased performance for the data that needs it, saved money storing the data that you need to keep long term, and geographically dispersed the data that needs to be close to users, with a single protocol supported by most of the industry and a single point of management.

Personally I think pNFS could kill off proprietary solutions over the long run unless they include support for it in their products.

This is just my opinion of course…

Capacity vs Performance: Thin Provisioning-Reclaiming Free Space

A comment about HDS’s Zero Page Reclaim on one of my previous posts got me thinking about the effectiveness of thin provisioning in general.  In that previous post, I talked about the trade-offs between increased storage utilization through the use of thin-provisioning and the potential performance problems associated with it.

There are intrinsic benefits that come with the use of thin provisioning.  First, new storage can be provisioned for applications without nearly as much planning.  Next, application owners get what they want, while storage admins can show they are utilizing the storage systems effectively.  Also, rather than managing the growth of data in individual applications, storage admins are able to manage the growth of data across the enterprise as a whole.

Thin provisioning can also provide performance benefits.  For example, consider a set of virtual Windows servers running across several LUNs contained in the same RAID group.  Each Windows VM stores its OS files in the first few GB of its VMDK file.  Each VMDK file is stored in order in each LUN, with some free space at the end.  In essence, we have a whole bunch of OS sections separated by gaps of no data.  If all VMs were booting at approximately the same time, the disk heads would have to move continuously across the entire disk, increasing disk latency.

Now take the same disks, configured as a thin pool, and create the same LUNs (as thin LUNs) and the same VMs.  Because thin provisioning generally allocates physical space only as the application writes data, starting from the beginning of the disk, all of those Windows VMs’ OS files will be placed at the beginning of the disks.  This increased data locality will reduce IO latency across all of the VMs.  The effect is probably minor, but reduced disk latency can translate to higher IOPS from the same set of physical disks.  And the only change is the use of thin provisioning.
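A toy model makes the locality argument concrete.  The sketch below compares how much of the disk the OS sections are spread across in the thick and thin layouts.  The VM count, LUN size, and OS size are made-up numbers, and the model assumes the thin pool allocates strictly in write order with the OS installed first; it ignores RAID striping, pool chunk sizes, and everything else a real array does.

```python
def os_region_span_gb(num_vms: int, lun_gb: int, os_gb: int, thin: bool) -> int:
    """Distance in GB from the start of the first OS section to the end of the last.

    Thick: each OS section sits at the start of its own fully allocated LUN.
    Thin:  OS sections are packed back to back at the start of the pool
           (assumes allocation in write order, OS data written first).
    """
    stride = os_gb if thin else lun_gb
    starts = [i * stride for i in range(num_vms)]
    return starts[-1] + os_gb - starts[0]


# Hypothetical example: 10 VMs, 50 GB LUNs, 8 GB of OS files per VM.
thick_span = os_region_span_gb(10, 50, 8, thin=False)
thin_span = os_region_span_gb(10, 50, 8, thin=True)

print(f"Thick: OS data spread across {thick_span} GB")  # 458 GB
print(f"Thin:  OS data spread across {thin_span} GB")   # 80 GB
```

In this simplified picture the heads cover a much narrower band during a boot storm with the thin layout, which is the locality effect described above.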

So back to HDS Zero Page Reclaim.  The biggest problem with thin provisioning is that it doesn’t stay thin for long.  Windows NTFS, for example, is particularly NOT thin-friendly since it favors previously untouched disk space for new writes rather than overwriting deleted files.  This activity eventually causes a thin LUN to grow to its maximum size over time, even though the actual amount of data stored in the LUN may not change.  And Windows isn’t the only one with the problem.  This means that thin provisioning may make provisioning easier, or possibly improve IO latency, but it might not actually save you any money on disk.  This is where HDS’s Zero Page Reclaim can help.  Hitachi’s Dynamic Provisioning (with ZPR) can scan a LUN for sections where all the bytes are zero and reclaim that space for other thin LUNs.  This is particularly useful for converting thick LUNs into thin LUNs.  But it can only see blocks of zeros, so it won’t necessarily see space freed up by deleting files.  Hitachi’s own documentation points out that many file systems are not thin-friendly, and ZPR won’t help with long-term growth of thin LUNs caused by actively writing and then deleting data.
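The limitation is easy to see in a small sketch of the idea.  This is not Hitachi’s implementation, just an illustration of scanning for all-zero pages; the function name and the tiny 4-byte page size are made up so the example fits on screen.

```python
from typing import List


def find_zero_pages(lun_image: bytes, page_size: int) -> List[int]:
    """Return the indexes of pages whose bytes are all zero."""
    zero_page = bytes(page_size)  # page_size zero bytes to compare against
    return [
        offset // page_size
        for offset in range(0, len(lun_image) - page_size + 1, page_size)
        if lun_image[offset:offset + page_size] == zero_page
    ]


# Four 4-byte pages: never written, live data, never written,
# and stale bytes left behind by a deleted file.
lun_image = bytes(4) + b"data" + bytes(4) + b"old!"
reclaimable = find_zero_pages(lun_image, page_size=4)
print(reclaimable)  # [0, 2]
```

The last page belongs to a deleted file, but the file system never zeroed it, so a scan that only looks for zeros has no way to know it is free.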

Although there are ways to script the writing of zeros to free space on a server so that ZPR can reclaim that space, you would need to run that script on all of your servers, requiring a unique tool for each operating system in your environment.  The script would also have to run periodically, since the file system will grow again afterward.
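As a rough idea of what such a script does, here is a minimal sketch that writes a fixed amount of zeros to a scratch file and then deletes it.  The function name and sizes are hypothetical, and the demo runs against a temporary directory.  A real tool would have to size itself against the file system’s free space, leave headroom so applications don’t run out of disk, and account for file system features (compression, sparse files) that could keep the zeros from ever reaching the array.

```python
import os
import tempfile


def zero_fill(directory: str, total_bytes: int, chunk_bytes: int = 1024 * 1024) -> int:
    """Write total_bytes of zeros to a scratch file in directory, then delete it.

    Returns the number of bytes written.
    """
    chunk = bytes(chunk_bytes)
    written = 0
    fd, path = tempfile.mkstemp(dir=directory, prefix="zerofill_")
    try:
        with os.fdopen(fd, "wb") as scratch_file:
            while written < total_bytes:
                n = min(chunk_bytes, total_bytes - written)
                scratch_file.write(chunk[:n])
                written += n
            scratch_file.flush()
            os.fsync(scratch_file.fileno())  # push the zeros down to the disk
    finally:
        os.remove(path)  # free the space again, leaving zeroed blocks behind
    return written


# Demo against a throwaway directory rather than a real volume.
with tempfile.TemporaryDirectory() as scratch_dir:
    written = zero_fill(scratch_dir, 5_000_000)
    leftover = os.listdir(scratch_dir)

print(written, leftover)
```

Even in this simple form it is clear why the approach is awkward at scale: it generates a lot of write IO, temporarily consumes the very space you are trying to give back, and needs an equivalent for every operating system you run.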

NetApp’s SnapDrive tool for Windows can scan an NTFS file system, detect deleted files, then report the associated blocks back to the Filer to be added back to the aggregate for use by other volumes/LUNs.  The Space Reclamation scan can be run as needed, and I believe it can be scheduled, but it appears to be Windows only.  Again, this will have to be done periodically.

But what if you could solve the problem across most or all of your systems, regardless of operating system, regardless of application, with real-time reclamation?  And what if you could simultaneously solve other problems?  Enter Symantec’s Storage Foundation with Thin-Reclamation API.  Storage Foundation consists of VxFS, VxVM, DMP, and some other tools that together provide dynamic grow/shrink, snapshots, replication, thin-friendly volume usage, and dynamic SAN multipathing across multiple operating systems.  Storage Foundation’s Thin-Reclamation API is to thin-provisioning what OST is to Backup Deduplication.  Storage vendors can now add near-real-time zero page reclaim for customers that are willing to deploy VxFS/VxVM on their servers.  For EMC customers, DMP can replace PowerPath, thereby offsetting the cost.

As far as I know, 3PAR is the first and only storage vendor to write to Symantec’s thin-API, which means they now have the most dynamic, non-disruptive, zero-page-reclaim feature set on the market.  As a storage engineer myself, I have often wondered if VxVM/VxFS could make management of application data storage in our diverse environment easier and more dynamic.  Adding Thin-Reclamation to the mix makes it even more attractive.  I’d like to see more storage vendors follow 3PAR’s lead and write to Symantec’s API.  I’d also like to see Symantec open up both OST and the Thin-Reclamation API for others to use, but I doubt that will happen.