EMC, Isilon, and CSX possibilities…

As you’ve no doubt heard, EMC has completed the tender offer to acquire Isilon (www.isilon.com) for a Cajillion dollars (actually ~$2 Billion) and some people are asking why.  From where I sit, there are many reasons why EMC would want a company like Isilon, ranging from its media-minded customer base to the technical IP, like scale-out NAS, that sets Isilon apart from the rest.

This EMC Press Release, as well as this one, and Chuck’s Blog are some of the many places to find out more about the acquisition…

I was thinking a lot about that technology as I worked on a high-bandwidth NAS project with a customer recently.  Isilon’s primary product is an IP-based storage solution that uses commodity based hardware components, combined with their proprietary OneFS Operating System, to deliver scale-out NAS with super simple management and scalability.  A single Isilon OneFS based filesystem can scale to over 10PB across hundreds of nodes.  Isilon also provides various versions of hardware that can be intermixed to increase performance, capacity, or both depending on customer needs.  You don’t necessarily have to add disks to an Isilon cluster to increase performance.

When looking at EMC’s own product line, you’ll find that Atmos delivers similar scale-out clustering for object-based storage, while VMAX does a similar type of scaling for high-end block storage (FC, FCoE, and iSCSI), and Greenplum provides scale-out analytics as well.  Line up Isilon’s OneFS, EMC Greenplum, EMC Atmos, and EMC VMAX, and we can now deliver massive scale-out storage for database, object, file, and block data.  With VPLEX and Atmos, EMC also delivers block and object storage federation across distance.

Isilon’s OneFS also has technologies that mirror EMC’s but are implemented in such a way as to leverage the Scale-Out NAS model.  Take FlexProtect, for example, which is Isilon’s data protection mechanism (similar to RAID) and allows admins to apply different protection schemes (N+1 à la RAID5, N+2 à la RAID6, N+3, and even N+4 redundancy) on individual files and directories.  SmartPools, which is policy based, automatically tiers data at a file level based on read/write activity across different protection types and physical nodes, similar to how FAST VP tiers data at a block level on EMC Unified and VMAX.  Both EMC and Isilon realize that not all data is equal.
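
To put rough numbers on those protection levels, here is a minimal sketch of the capacity cost of an N+M scheme.  It assumes a simple model in which every stripe holds N data units and M protection units; the stripe width of 8 is a made-up example, and this is not a description of how OneFS actually lays data out.

```python
def usable_fraction(data_units: int, protection_units: int) -> float:
    """Fraction of raw capacity left for data in a simple N+M stripe model."""
    if data_units < 1 or protection_units < 0:
        raise ValueError("need at least one data unit and no negative protection")
    return data_units / (data_units + protection_units)


# Hypothetical stripe width of 8 data units, protected at N+1 through N+4.
for m in (1, 2, 3, 4):
    print(f"8+{m}: {usable_fraction(8, m):.1%} of raw capacity is usable")
```

The point of the per-file and per-directory granularity is that you only pay the higher overhead of N+3 or N+4 on the data that really warrants it.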

Rather than just repackage OneFS with an EMC logo (which I’m sure we’ll do at first), I wonder what else we can do with Isilon’s IP…

A recent series of blog posts by Steve Todd (Information Playground) on the topic of a Common Software Execution Environment (See CSX Technology and The Benefits of Component Assembly) got me thinking about deeper integration and how CSX can accelerate that integration.

For example…

What if EMC Engineering took the portions of code from Isilon’s OneFS that handle client load-balancing, file-level automated tiering, and flexible protection and turned them into CSX components?  Those components could be dropped into Celerra and immediately add Scale-Out NAS to EMC’s existing Unified storage platforms.  Or, imagine those components running directly in VMAX engines, providing scale-out NAS simultaneously with scale-out SAN across multiple, massive scale storage systems.  Combine the load balancing code and FlexProtect from Isilon with FAST VP in EMC Clariion to provide scale-out SAN in a midrange platform.

We could also reverse the situation and add the compression component that is in Clariion and Celerra, plus the federation technology in Atmos, to OneFS in order to reduce the storage footprint and extend Scale-Out NAS to many sites over any distance.  Add a Greenplum component and suddenly you have a massive analytics cluster that spans multiple sites for data where you need it, when you need it.

The possibilities here really are endless, and it will be very interesting to see what happens over the next 12 to 24 months.

Disclaimer: Even though I am an EMC employee, I am in no way involved in the EMC/Isilon acquisition, have no knowledge of future plans and roadmaps with regard to EMC and Isilon, and am not privy to any non-public information about this topic.  I am merely expressing my own personal views on this topic.

Unified of the Beholder???

Apart from “The Cloud”, “Unified Storage” is the other big buzzword in the storage industry of late.  But what exactly is Unified Storage?

Merriam-Webster defines unify as “to make into a unit or coherent whole.”

So how does this apply to storage systems?  If you look at marketing messages by EMC, NetApp, and other vendors you’ll find that they all use the term in different ways in order to fit nicely with the products they have.  Based on what I see, there are generally two different approaches.

Single HW/SW Stack Approach:

Some vendors want you to believe that the only way it can be called Unified Storage is if the same physical box and software stack provides all protocols and features, even if management of the single system is not perfectly cohesive.

NetApp’s FAS storage systems are an example of this strategy.  A single filer provides all services whether SAN or NAS, IP or Fibre Channel.  However, a single HA cluster is actually managed as two separate systems: each cluster node is managed independently through its own FilerView instance, and there are separate tools (NetApp System Manager, Operations Manager, Provisioning Manager, Protection Manager) that can bring all of the filer heads into one view.  Disks are captive to a specific filer head in a cluster and moving disks and/or volumes between filer heads is not seamless.

Single Point of Management Approach:

Others approach it more holistically and figure that as long as the customer manages it as a single system, it qualifies as “Unified”, even if there may be disparate hardware and software components providing the different services.  After all, once it’s installed you don’t really go in the datacenter to physically look at the hardware very often.

EMC’s Unified Storage (which is a combination of Celerra NAS and Clariion Block storage systems) is an example of this.  In a best-of-breed approach, EMC allows the Clariion backend to do what it does best, block storage via FC or IP, while the Celerra, which is purpose built for NAS, provides CIFS/NFS services while leveraging the disk capacity, processors, cache, and other features of the Clariion as a kind of offload engine.  Regardless of which services you use, all parts of the solution are managed from a single Unisphere instance, including other Clariions and/or Celerras in the environment.  Unisphere launches from any Clariion or Celerra management port, and regardless of which device you launch it from, all systems are manageable together.

Which approach is better?

I see advantages and disadvantages to both approaches.  As a former admin of both NetApp and EMC storage, I feel that while NetApp’s hardware and software stack is unified, their management stack is decidedly un-unified.  EMC’s Unified storage is physically “integrated” to work together as a system, but the unifying feature is the management infrastructure built in with Unisphere.

There are other advantages to EMC’s approach as well.  For example, if a particular workload seems to hammer the CPUs on the NAS but the backend is not a bottleneck, more Celerra datamovers can be added to take advantage of the same backend disks and improve front end performance.  Likewise, the backend can be augmented as needed to improve performance, increase capacity, etc., without having to scale up the front end NAS head.  With the NetApp approach, if your CPU or cache is stressed, you need to deploy more FAS systems (in pairs for HA) along with any required disks for that new system to store data.

Both approaches work, and both have their merits, but what do customers really want?

In my opinion, most customers don’t really care *how* the hardware works, so long as it DOES WORK, and is easy to manage.  In the grand scheme of things, if I, as an admin, can provision, replicate, snapshot, and clone storage across my entire environment, regardless of protocol,  from a “single pane of glass”, that is a strong positive.

EMC Unisphere makes it easy to do just that and it launches right from the array with no separate installation or servers required.  Unisphere can authenticate against Active Directory or LDAP and has role-based administration built in.  And since Unisphere launches from any Clariion Storage Processor or Celerra Control Station, there’s no single point of failure for storage management either.

So what do you think customers want?  If you are a customer, what do YOU want?

Why pNFS can be a big deal even if NFS4.1 isn’t…

It’s been a little while since I’ve posted, mostly due to my life being turned on its rear after our first child was born 8 weeks ago.  As things start to settle into a rhythm (as much as is possible) I’ve been back online more, reading blogs, following Twitter, and working with customers regularly.  As some of you may know, EMC announced support for pNFS in Celerra with the release of DART 6.x and there have been several recent posts about the technology which piqued my interest a little.

The other bloggers have done a good job of describing what pNFS is and what is new in NFS4.1 itself so I won’t repeat all of that.  I want to focus specifically on pNFS and why it IS a big deal.

Prior to my coming to work for EMC, I worked in internal IT at a company that deals with large binary files in support of product development, as well as video editing for marketing purposes.  I had a chance to evaluate, implement, and support multiple clustered file system technologies.  The first was for an HD video editing solution using Macs, and we followed the likely path of implementing Apple’s Xsan solution which, as you may know, is an OEM’d version of Quantum (ADIC) StorNext.  StorNext allows you to create large filesystems across many disks and access them as local disk on many clients.  File open, close, byte-range locking, etc. are handled by MetaData Controllers (MDCs) across an IP network while the actual heavy lifting of read/write IO is done over Fibre Channel from the clients to the storage directly.  All the shared filesystem benefits of NAS with the performance benefits of SAN.

The second project was specifically targeted at moving large files (4+GB each) through a workflow across many computers as quickly as possible so we could ship products.  Faster processing of the workflow translated to more completed projects per person/per day which meant better margins and keeping our partners and customers happy.  The workflow was already established, using Windows based computers and a file server.  The file server was running out of steam and the amount of data being stored at any given time had increased from 500GB to 8TB over the past 12 months.  We needed a simple way to increase the performance of the file server and also allow for better scalability.  Working with our local EMC SE, we tested and deployed MPFSi using a Celerra NS40 with integrated storage.

MPFS (also known as High Road) has been around a long time and works with Windows and various *nix based platforms.  It is similar to Xsan/StorNext in that open/close/locking activity is handled over IP by the metadata controller (the Celerra datamover in the case of MPFS) while the read/write IO is handled over block storage technology (MPFS supports Fibre Channel and iSCSI connectivity to storage).  The advantage of MPFS over many other solutions is that the metadata controller and storage are all built into the EMC Celerra storage device and you don’t have to deploy any other servers.
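
The split between the metadata path and the data path is easier to see in a toy model.  The sketch below is purely illustrative: the class names, the extent format, and the in-memory "devices" are all invented for this example and have nothing to do with the real MPFS, StorNext, or pNFS wire protocols.

```python
class MetadataServer:
    """Stands in for the metadata controller: it only answers 'where is the data?'."""

    def __init__(self, layouts):
        # layouts maps a file name to an ordered list of (device_name, block_number)
        self._layouts = layouts

    def open(self, filename):
        # In a real system this request would travel over the IP network.
        return self._layouts[filename]


class BlockDevice:
    """Stands in for a LUN that the client reaches directly over FC or iSCSI."""

    def __init__(self, blocks):
        self._blocks = blocks  # block_number -> bytes

    def read(self, block_number):
        return self._blocks[block_number]


def read_file(filename, metadata_server, devices):
    """Ask the metadata server for the layout, then read the blocks directly."""
    extents = metadata_server.open(filename)
    return b"".join(devices[dev].read(blk) for dev, blk in extents)


devices = {
    "lun0": BlockDevice({0: b"he", 1: b"o!"}),
    "lun1": BlockDevice({0: b"ll"}),
}
mds = MetadataServer({"demo.bin": [("lun0", 0), ("lun1", 0), ("lun0", 1)]})
print(read_file("demo.bin", mds, devices))
```

Notice that the metadata server never touches the file contents.  It hands out a map and gets out of the way, which is why the heavy read/write traffic can run at block-storage speeds.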

In our case we chose iSCSI due to the cost of FC (switches and HBAs) and used the GigE ports on the Celerra’s CX3 backend for block connectivity.  In testing we showed that CIFS alone provided approximately 240 Mbps of throughput over GigE connections while enabling MPFSi netted about 750 Mbps, even when we used the same NIC on the client.  So we roughly tripled throughput over the same LAN by installing a software client.  Had we gone the extra mile to deploy Fibre Channel for the block IO we would have seen much higher throughput.
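
For a feel of what that meant per file, here is a quick back-of-the-envelope calculation.  It assumes the two figures are megabits per second, uses decimal gigabytes, and ignores protocol overhead, so treat the results as rough estimates rather than measurements.

```python
def transfer_seconds(file_gb: float, throughput_mbps: float) -> float:
    """Seconds to move a file of file_gb (decimal GB) at throughput_mbps (megabits/s)."""
    bits = file_gb * 8 * 1000**3
    return bits / (throughput_mbps * 1_000_000)


CIFS_MBPS = 240    # approximate CIFS-only throughput from our testing
MPFSI_MBPS = 750   # approximate throughput with MPFSi enabled

speedup = MPFSI_MBPS / CIFS_MBPS
print(f"speedup: {speedup:.2f}x")
print(f"4 GB over CIFS:  {transfer_seconds(4, CIFS_MBPS):.1f} s")
print(f"4 GB over MPFSi: {transfer_seconds(4, MPFSI_MBPS):.1f} s")
```

Under those assumptions a single 4 GB file drops from a bit over two minutes to well under one, and that saving repeats at every step of the workflow.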

Even better, the use of MPFS did not preclude the use of NDMP for backup to tape directly from the Celerra, accelerating backup many times over the old fileserver.  Clients that did not have the MPFS software installed accessed the same files over traditional CIFS with no problems.  Another side benefit of MPFS over traditional CIFS is that the block I/O stack is much more efficient than the NAS I/O stack, so even with increased throughput, CPU utilization is lower on the client, returning cycles to the application which is doing work for your business.

There are many clustered file system / clustered NAS solutions on the market from a variety of vendors (StorNext, MPFS, GFS, Polyserve, etc.) and most of these products are trying to solve the same basic problems of storing more data and increasing performance.  The problem is they are all proprietary and because of that you end up with multiple solutions deployed in the same company.  In our case we couldn’t use MPFS for the video editing solution because EMC has not provided a client for Mac OS X.  And this is where pNFS really becomes attractive.  Storage vendors and operating system vendors alike will be upgrading the already ubiquitous NFS stack in their code to support NFS4.1 and pNFS.  And that support means that I could deploy an EMC Celerra MPFS-like solution using the same Celerra based storage, with no extra servers, and no special client software, just the native NFS client in my operating system of choice.  Perhaps Apple will include a pNFS capable client in a future version of Mac OS X.

If you look at the pNFS standard you’ll see that it supports the use of not only block storage, but object and file based storage as well.  So as we build out larger and larger environments and private clouds start to expand into public clouds you could tier your pNFS data across Fibre Channel storage, object storage (think Atmos on premises), as well as out to a service provider cloud (e.g., AT&T Synaptic).  Now you’ve dramatically increased performance for the data that needs it, saved money storing the data that you need to keep long term, and geographically dispersed the data that needs to be close to users, with a single protocol supported by most of the industry and a single point of management.

Personally I think pNFS could kill off proprietary solutions over the long run unless they include support for it in their products.

This is just my opinion of course…

EMC CLARiiON and Celerra Updates – Defining Unified Storage

This past week, during EMC World 2010 in Boston, EMC made several announcements of updates to the Celerra and CLARiiON midrange platforms.  Some of the most impressive were new capabilities coming to CLARiiON FLARE in just a couple short months.  Major updates to Celerra DART will coincide with the FLARE updates and if you are already running CLARiiON CX4 hardware, or are evaluating CX4 (or Celerra), you will want to check these new features out.  They will be available to existing CX4(120,240,480,960)/NS(120,480,960) systems as part of a software update.

Here’s a list of key changes in FLARE 30:

  • Unified management for midrange storage platforms including CLARiiON and Celerra today, plus RecoverPoint, Replication Manager and more in the future.  This is a true single pane of glass for monitoring AND managing SAN, NAS, and data protection and it’s built in to the platform.  “EMC Unisphere” replaces Navisphere Manager and Celerra Manager and supports multiple storage systems simultaneously in a single window. (Video Demo)
  • Extremely large cache (i.e., FASTCache) – Up to 2TB of additional read/write cache in CLARiiON using SSDs (Video Demo)
  • Block level Fully Automated Storage Tiering (i.e., sub-LUN FAST) – Fully automated assignment of data across multiple disk types
  • Block Level Compression – Compress LUNs in the CLARiiON to reduce disk space requirements
  • VAAI Support – Integrate with vSphere ESX for improved performance

These features are in addition to existing features like:

  • Seamless and non-disruptive mobility of LUNs within a storage array – (via Virtual LUNs)
  • Non-Disruptive Data Migration – (via PowerPath Migration Enabler)
  • VMware Aware Storage Management – (Navisphere, Unisphere, and vSphere Plugins giving complete visibility and self-service provisioning for VMware admins (Video Demo) AND Storage Admins)
  • CIFS and NFS Compression – Compress production data on Celerra to reduce disk space requirements including VMs
  • Dynamic SAN path load balancing – (via PowerPath)
  • At-Rest-Encryption – (via PowerPath w/RSA)
  • SSD, FC, and SATA drives in the same system – Balance performance and capacity as needed for your application
  • Local and Remote replication with array level consistency – (SnapView, MirrorView, etc)
  • Hot-swap, Hot-Add, Hot-Upgrade IO Modules – Upgrade connectivity for FC, FCoE, and iSCSI with no downtime
  • Scale to 1.8PB of storage in a single system
  • Simultaneously provide FC, iSCSI, MPFS, NFS, and CIFS access

All together, this is an impressive list of features for a single platform. In fact, while many of EMC’s competitors have similar features, none of them have all of them in the same platform, or leverage them all simultaneously to gain efficiency.  When CLARiiON CX4 and Celerra NS are integrated and managed as a single Unified storage system with EMC Unisphere there is tremendous value as I’ll point out below…

Improve Performance easily…

  • Install a couple of SSD drives into a CLARiiON and enable FASTCache to increase the array’s read/write cache from the industry-competitive 4GB-32GB up to 2TB of array based non-volatile Read AND Write cache available to ALL applications including NAS data hosted by the array.
  • Install PowerPath on Windows, Linux, Solaris, AND VMware ESX hosts to automatically balance IO across all available paths to storage.  PowerPath detects latency and queuing occurring on each path and adjusts automatically, improving performance at the storage array AND for your hosts.  This is a huge benefit in VMware environments especially.
  • When VMware releases the updated version of vSphere ESX that supports VAAI, ESX will be able to leverage VAAI support in the CLARiiON to reduce the amount of IO required to do many tasks, improving performance across the environment again.
  • Upgrade from 1GbE iSCSI to 10GbE iSCSI, or from 4Gb Fibre Channel to 8Gb Fibre Channel, without a screwdriver or downtime.
  • Provide NAS shared file access with block-level performance for any application using EMC’s MPFS protocol.
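
To illustrate the general idea behind latency-aware and queue-aware path selection, here is a toy sketch.  PowerPath’s actual algorithms are not described in this post, so the cost formula and every name below are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    queued_ios: int      # IOs currently waiting on this path
    latency_ms: float    # recent average service time per IO


def pick_path(paths):
    """Pick the path with the lowest estimated wait for one more IO."""
    # Estimated wait = (IOs already queued + the new one) x per-IO latency.
    return min(paths, key=lambda p: (p.queued_ios + 1) * p.latency_ms)


paths = [
    Path("hba0-spa0", queued_ios=4, latency_ms=1.0),   # fast but busy
    Path("hba1-spb0", queued_ios=1, latency_ms=2.0),   # slower but nearly idle
    Path("hba0-spb1", queued_ios=0, latency_ms=5.0),   # idle but congested link
]
print(pick_path(paths).name)
```

A simple round-robin scheme would keep feeding the busy and congested paths equally.  Weighing queue depth and latency together is what lets a multipathing driver steer around a hot spot.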

Improve Efficiency and cost easily…

  • Create a single pool of storage containing some SSD, some FC, and some SATA drives, that automatically monitors and moves portions of data to the appropriate disk type to both improve performance AND decrease cost simultaneously.
  • Non-disruptively compress volumes and/or files with a single click to save 50% of your disk space in many cases.
  • Convert traditional LUNs to more efficient Thin-LUNs non-disruptively using PowerPath Migration Enabler, saving more disk space.
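
As a rough picture of what automated tiering does with a mixed pool, the sketch below ranks chunks of data by how often they are accessed and fills the fastest tier first.  This is only a toy model: the chunk names, access counts, and slot counts are invented, and the real FAST policy engine is certainly more involved than a single sort.

```python
def place_chunks(access_counts, tiers):
    """Assign the hottest chunks to the fastest tier, then spill to slower tiers.

    access_counts: dict mapping chunk name -> recent access count
    tiers: list of (tier_name, slot_count), ordered fastest to slowest
    Chunks that do not fit in any tier are left out of the result.
    """
    ranked = sorted(access_counts, key=lambda c: access_counts[c], reverse=True)
    placement = {}
    start = 0
    for tier_name, slots in tiers:
        for chunk in ranked[start:start + slots]:
            placement[chunk] = tier_name
        start += slots
    return placement


counts = {"db-index": 900, "archive-2008": 2, "mail-store": 300, "home-dirs": 15}
pool = [("SSD", 1), ("FC", 2), ("SATA", 4)]
print(place_chunks(counts, pool))
```

Because only the hottest portion of the data lands on SSD, a small amount of expensive capacity can absorb most of the IO while the bulk of the data sits on cheap SATA.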

Increase and Manage Capacity easily…

  • Add additional storage non-disruptively with SSD, FC, and SATA drives in any mix up to 1.8PB of raw storage in a single CLARiiON CX4.
  • Using FASTCache, iSCSI, FC, and FCoE connectivity simultaneously does not reduce total capacity of the system.
  • Expanding LUNs, RAID Groups, and Storage Pools is non-disruptive.
  • Migrating LUNs between RAID groups and/or Storage Pools is non-disruptive using built-in CLARiiON LUN Migration, as is migrating data to a different storage array (using PowerPath Migration Enabler)!
  • Balancing workload between storage processors is non-disruptive and at individual LUN granularity.

Protect your data easily…

  • Snapshot, Clone, and Replicate any of the data to anywhere with built-in array tools that can maintain complete data consistency across a single application or multiple applications without installing software.
  • Maintain application consistency for Exchange, SQL, Oracle, SAP, and much more, even within VMware VMs, while replicating to anywhere with a single pane-of-glass.
  • Encrypt sensitive data seamlessly using PowerPath Encryption w/RSA.

Maintain Flexibility…

  • While you can do all of these things quickly and simply, you still have the flexibility to create traditional RAID sets using RAID 0, 1, 5, 6, and 10 where you need highly predictable performance, or tune read and write cache at the array and LUN level for specific workloads.  Do you want read/write snapshots? How about full copy clones on completely separate disks for workload isolation and failure protection? What about the ability to roll back data to different points in time using snapshots without deleting any other snapshots?  EMC Storage arrays have been able to do this for a long time and that hasn’t changed.

There are few manufacturers aside from EMC that can provide all of these capabilities, let alone provide them within a single platform.  That’s the definition of simple, efficient, Unified Storage in my opinion.

NetApp and EMC: Real world comparisons

I’ve recently been tasked with a project to increase availability of applications through the use of multiple/disparate storage systems.  This environment has heavily invested in EMC Clariion and Celerra storage systems over the past few years and needed a non-EMC platform from which to build the second half of a redundant storage environment.  For various reasons I won’t go into here, we chose IBM nSeries as that second platform. (Since the IBM system is rebranded NetApp FAS, I will refer to this as a NetApp filer.)  I’ve been working on implementing the new equipment as well as integrating it into the Business Continuity strategy.

The overall strategy is to continue to use the EMC Clariion/Celerra systems for production and disaster recovery replication and split applications between and across the two storage platforms for local redundancy.  The NetApp will also perform disaster recovery replication for some of the applications.  Here’s a really simple diagram that might help if the description is confusing:

EMC and NetApp Redundancy

Now this may sound easy, but it is, in fact, NOT straightforward.  This strategy requires close coordination with application owners and careful planning.  As we move forward on this project, I’ll talk about various idiosyncrasies, caveats, and problems we’ve faced, how we got around them, and I’ll also talk a lot about the differences between the Clariion/Celerra and NetApp platforms’ features and functionality, application support, and manageability.  These comparisons will include using both systems with Fibre Channel connections as well as CIFS/NFS NAS, all in conjunction with DR replication and failover.

To start off, I figure we should compare some of the terminology between EMC and NetApp systems.  Some terms don’t directly translate, but I matched them up as closely as I could and noted where there is no equivalent.  Below are two tables: one for Block Storage, and the other for NAS Storage.  Click on them to see full size versions.

EMC-NetApp Block Storage Terminology table

EMC-NetApp NAS Storage Terminology

In the next update, I’ll start talking about the deployment itself.  The point of these articles is to discuss the differences, advantages, and disadvantages of each platform so that you can understand how each one might work in your environment.  I do not intend to disparage either platform or vendor.  I will try to be vendor agnostic as much as possible, and I do feel like I have a somewhat unique position of comparing new and recent hardware and firmware from both vendors, in the same production capacities, simultaneously, in the same environment.  I am NOT comparing old ONTAP code to new FLARE/DART code or vice versa, nor am I comparing old Clariion CX hardware to new NetApp/IBM hardware, etc.

Stay tuned!