Frank Slootman, CEO, Data Domain

"Over 2007 we created a huge amount of distance between ourselves and the competition."

January 18, 2008


Frank Slootman has the gift of gab. That's not an uncommon trait in CEOs, for sure, but it's notable in his case for several reasons. There's his marketing background at Borland and Compuware that lends him a salesman's persistence. Then there's his confidence and the sheer volume of words he delivers -- this is an executive who could talk for hours. And he's easy to listen to, his words touched with the faintest of Dutch accents.

Slootman's verbal gifts will no doubt come in handy this year, as Data Domain faces mounting competition in the wake of its impressive 2007 IPO, which demonstrated the widespread market interest in data de-duplication. In the viciously competitive storage market, no supplier stands still for long, particularly at the level of success that Data Domain has achieved. Slootman's wearing an invisible target on his back.

If it bothers him, he isn't saying. "Over 2007 we created a huge amount of distance between ourselves and the competition," he boasts. In his view, catching up to his newly public firm will be a major challenge, even for companies like EMC.

"Building the technology is a gestation process. It doesn't matter if you have billions of dollars," he says. "If you want to have a baby, it takes nine months. You can throw 100 women at the problem and it's not going to make a difference. You have to learn just like we have over the last six years... If you got started a year ago, you're about five years behind Data Domain."

Figure 1: Frank Slootman, CEO, Data Domain

This is only part of what Frank Slootman told us when Byte and Switch caught up with him earlier this month. Check out the rest below:

Byte and Switch: How are things going at Data Domain?

Slootman: Well, you know, 2007 [was] a fantastic year for the company. I think what is sort of interesting that emerged during the year is how prominent this technology category [data de-duplication] has become.

Most IT executives we see out there are basically pursuing two main initiatives: One is server virtualization, the other is de-duplicated storage. I don't think any of us could have foreseen four or five years ago that that's where we'd be starting out in 2008.

It puts a company like ours in an interesting position because we've been able to establish a good run rate and a good position in this category.

You probably noticed at the end of Q3 that Cisco and others were sounding a negative tone around IT spending, and that has gotten worse going into 2008. That doesn't seem to affect this part of the business as much, because we are more subject to microeconomic shifts rather than the macroeconomic environment. What I mean by microeconomic is the shift from tape-based data protection to disk- and network-based data protection. And you know we're just in the very beginning stages of moving through that trend. It could take the better part of a decade for that to play out. Certainly, it's interesting to be part of that.

And, as you know, we went public back in June, followed by a secondary offering in November. So we're in a good position to continue to keep executing and to play this thing out.

Byte and Switch: You know, we've seen at least one survey that shows de-duplication has no traction in large enterprise shops. Is that true?

Slootman: Yes, I've seen that and it sort of baffles me. If you look at our customer roster, it's a "Who's Who" of all the companies you'd recognize. There's a little bit of competitive spin going on... We basically segment the marketplace in roughly three big buckets: One is a marketplace that measures its data in gigabytes, one that measures its data in terabytes, and one that measures its data in petabytes.

The marketplace that is defined by gigabytes is really where you see software replication guys, where you see Avamar and PureDisk, where you see Riverbed and Cisco with their WAN acceleration products. It's a very network-centric approach to the world.

The marketplace in the middle, which is where Data Domain lives, is a marketplace measured in terabytes, and we characterize it as a commercial and distributed enterprise marketplace. And when I say distributed enterprise, it's enterprise, but it's enterprise with many locations. And by the way, that's really where the highest degree of pain is around tape, because the more sites you have to deal with, the complexity goes up exponentially in terms of moving tapes around and getting data back, and so on. Also, you have pain around maintaining skill sets at all these different sites.

Now the marketplace that sits at the very top echelon of the storage market includes people in consolidated data centers dealing with petabytes of data. There aren't that many of those, but that's the place where you're still dealing with a very, very tape-centric environment. That's a marketplace that will be closing the door on tape, [but] they'll be on tape the longest and it will be a long time before they're coming off. We don't characterize that marketplace as enterprise, because I don't think there are that many data centers in the world that are managing petabytes of data in a single place. They're really far and few compared to the overall marketplace. You go to Europe, you see hardly any of those.

So I think when people say "enterprise," they're talking about Goldman Sachs in Jersey City with four petabytes in one building, but that's not the broader enterprise data center market at all. Most enterprises are distributed. They have many sites and they have data to manage all over the place. We're extremely active in those kinds of environments.


Byte and Switch: In our interview with Hugo Patterson [chief architect at Data Domain] last year, one of the things we spoke about was how Data Domain might expand its technology beyond de-duplication, or at least apply de-duplication in other ways than it does right now. Can you speak to that?

Slootman: We were first and foremost conceived as a storage company, not as a de-duplication company, or a backup company, or a data protection company. When you listen to people in the marketplace, they mostly regard de-duplication as a feature... That's not true for Data Domain. For us, de-dupe is an enabling technology.

We've always viewed our technology as being very general purpose, even though we obviously got started with one major class of application, which was enterprise backup, because there was such a weak incumbency in terms of tape, and the use case was very compelling for us. But going forward -- and we announced this last September and we've seen some really good uptake on this -- it is very, very obvious that people like this technology for things other than backup.

Just to characterize that a little bit, we've seen situations competitively against all the usual suspects where we have to store data for the long term, not backup but just long-term retention of data that just can't be on tape, where we're getting four- or five-to-one compression and the customers are ecstatic. If four- or five-to-one doesn't sound like a lot, well, you're dealing with a 75 percent reduction in data, reduction in footprint, reduction in power and cooling bills. It is usually compelling for people to see that kind of data reduction. So we're very encouraged by seeing this technology being very general purpose in the storage infrastructure rather than a feature tacked onto a different technology product like a virtual tape library, or a backup app, or a file system.
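As a quick illustration of the arithmetic Slootman describes, the sketch below converts an N-to-one compression ratio into the percentage of space saved. The function name `reduction_percent` is an illustrative assumption and does not come from Data Domain.

```python
def reduction_percent(ratio: float) -> float:
    """Percent of space saved for a given N-to-one compression ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    # At N:1, the stored data is 1/N of the original size.
    return (1 - 1 / ratio) * 100


for ratio in (4, 5):
    print(f"{ratio}:1 -> {reduction_percent(ratio):.0f}% reduction")
```

A four-to-one ratio leaves one quarter of the original data on disk, which is the 75 percent reduction quoted above; five-to-one leaves one fifth.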

Byte and Switch: Hugo also mentioned the importance of "staying on the processor side" in the storage market instead of concentrating on the size of disks on the other side of the controller. Can you elaborate?

Slootman: [Data de-duplication] technology has a pretty good appetite for CPU and memory. We very purposely set out to make that the design center of the technology because CPU power has become -- well, the price/performance trends are literally off the charts. We've broken every law in terms of price/performance, which is very, very beneficial for this technology because customers get more and more power constantly going to multicore environments, and they're getting it cheaper and cheaper and cheaper. And as a result, we're riding the curve very hard. So I don't think we need to go to the controller level rather than having full-blown Intel data center grid server heads doing the processing for this kind of technology.

The CPU-centric nature of this technology is very beneficial. It benefits VMware hugely because they need to run many, many servers inside one physical server, but it benefits Data Domain hugely as well, because we're dependent on the price/performance trends in that area to allow us to build bigger and faster systems all the time.

Byte and Switch: So being on the processor side: How is that going to play out in concrete terms, in terms of where you'll be devoting your R&D dollars?

Slootman: The basic notion of finding redundant, repeated patterns in data is not a new technology concept; it's existed in academia for really the better part of the last 10 to 15 years. When people tried to run this kind of technology at speed, at the performance that it needed to be in order to be commercially viable in very performance-constrained applications, they found out that to run it very, very quickly they either needed to use an I/O model -- in other words, run tons and tons of storage to drive performance -- or they needed to use a computational model. At the time, neither of these alternatives was very interesting, because using an I/O model defeats the performance: Here I am eliminating data from the data to be stored, but I have to add the storage back in to get the performance.

It's one step forward, one step backwards, and people threw up their hands and said, "This is not worth our time, because what we're saving we have to give right back to get performance." The computation model didn't work either, because we didn't have the price/performance; you needed to have a supercomputer to run this kind of a process.
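For readers unfamiliar with the "redundant, repeated patterns" idea, here is a minimal sketch of one common way to do it: split the data into chunks, fingerprint each chunk with a hash, and store each distinct chunk only once. Fixed-size 4 KiB chunks and SHA-256 are assumptions chosen for simplicity; the interview does not describe Data Domain's actual method, and the names `dedupe` and `rebuild` are hypothetical.

```python
import hashlib


def dedupe(data: bytes, chunk_size: int = 4096):
    """Split data into fixed-size chunks and store each distinct chunk once.

    Returns (store, recipe): store maps a SHA-256 digest to chunk bytes,
    recipe is the ordered list of digests needed to rebuild the data.
    """
    store = {}
    recipe = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).digest()  # CPU work replaces disk work
        store.setdefault(digest, chunk)          # keep only the first copy
        recipe.append(digest)
    return store, recipe


def rebuild(store, recipe) -> bytes:
    """Reassemble the original bytes from the chunk store and recipe."""
    return b"".join(store[d] for d in recipe)


# Hypothetical input: two 4 KiB blocks, each repeated 50 times.
block_a = b"A" * 4096
block_b = b"B" * 4096
data = (block_a + block_b) * 50

store, recipe = dedupe(data)
stored_bytes = sum(len(c) for c in store.values())
print(f"raw bytes: {len(data)}, stored bytes: {stored_bytes}")
assert rebuild(store, recipe) == data
```

The hashing step is where the processor does the work, which is one way to picture the "computational model" in this sketch: the savings come from CPU cycles rather than from adding spindles.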

So, what has really blown this marketplace wide open is exactly what Intel and AMD have done: They've delivered extremely cost-effective price/performance of microprocessor technology, and then you've seen people like Data Domain with the software to harvest it, and that's why this technology now exists.

Being on the storage side to drive performance is, of course, a tried-and-true method. People like EMC, etc., have made a living adding spindles -- high-RPM spindles, more and more spindles. People didn't need storage for capacity, but they sure needed it for performance, and that's been a boon for that business. We've gone the computational route and of course the price/performance gains have been so dramatic that people can now buy this technology relatively inexpensively compared to what it would be on the storage side.

Going forward, there are only two ways you can use this model to build bigger and faster systems. One is to build bigger nodes, exploiting the price/performance curve, buying Intel and AMD so you're going to go from single- to quad-core, or single-core to dual-core to quad-core and everything else that comes after that. We’re also going from two-socket, two-CPU-type systems to four-CPU-type systems. In other words, you'll have four CPUs, four cores each, 16 cores processing in a single head.

Building bigger nodes, more sockets, more cores is very advantageous, but the other approach you can take is clustering systems, getting a group of nodes to behave as a single node.

We're progressing on both fronts. We're building bigger nodes and we're clustering them together, so you end up with a very, very inexpensive, very, very software-centric approach that can process just huge rivers of data in a relatively short period of time. That's what Hugo was talking about when he said you need to be on the computational side in terms of your architecture and not drive performance with storage I/O, because it's insanely expensive and there is no price/performance improvement in that part of the technology world.

Byte and Switch: In terms of the current size of processors and scalability of nodes, where are you now, and what's on the roadmap?

Slootman: You'll see our quad-core environment relatively quickly here. We have dual-core heads right now. Once we have the quad-core, two-socket systems, you'll see us in 2008 also going to four-socket systems, quad-core based. So in total, we'll have 16-core heads with 16-core processing. We already have our first installs on our clustered implementation of our systems as well.

You'll see in 2008 an enormous push forward from Data Domain, both building and delivering much bigger nodes as well as nodes that will be clustered together to drive very large volumes of data. That gets us to this petabyte marketplace that we spoke about earlier.


Byte and Switch: Will Data Domain ever relinquish or move beyond inline processing?

Slootman: I'll give you my biased view on it. Honestly, we built this thing from the ground up to do what it does. De-dupe is not a feature to us -- it's an enabling technology. You can't touch it or turn it on or manage it or resource it. It can't conflict with anything, because it's just implicit in the storage. It just works while the data is making its way to disk.

Most [suppliers] in this marketplace came from another technology category and had to bolt [de-duplication] on. It is very difficult to bolt something on inline. Actually, it's impossible. You have to do open-heart surgery, and usually the patient will die on the table.

There are many reasons why inline is attractive to customers. Aside from religious arguments, you just have to look at what people are buying, and people are buying more inline than everything else combined, so there's probably some attraction to how it works. More specifically, the real benefits of inline are that you don't manage anything. It's just storage. The wall of data is making its way to disk, eliminating repeated data. You don’t have to start or stop it. There's no real monitoring going on in the process. Your elapsed time is a lot shorter because you don't have to land the data first and then start the de-duplication process. It all happens inline.

Another reason that is becoming a real important issue with customers is that once the data is landed in our world, it can be streamed to tape, it can be vaulted, it can be restored, or all those things at the same time. [Other suppliers] first have to land the data and then start de-duping it. While they are de-duping it, that data is virtually unusable, and it can conflict with other processes that have to be executed. If you have to do a restore in the middle of [the de-duplication process], you have to stop the de-dupe process. There are all these complexities that come into that equation. In our world, it's just storage. There's no added complexity from what you had before.
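The difference between the two approaches can be pictured with a toy model. The sketch below assumes a stream of fixed-size chunks and SHA-256 fingerprints, and compares how much disk is occupied at the worst moment when duplicates are removed before the write (inline) versus after the whole stream has landed (post-process). The function names and the sample workload are hypothetical and do not represent any vendor's implementation.

```python
import hashlib


def inline_peak_bytes(chunks):
    """De-duplicate each chunk before it is written; return peak bytes on disk."""
    seen = set()
    on_disk = 0
    for chunk in chunks:
        digest = hashlib.sha256(chunk).digest()
        if digest not in seen:
            seen.add(digest)
            on_disk += len(chunk)
    return on_disk  # usage only grows, so the final value is also the peak


def post_process_bytes(chunks):
    """Land every chunk first, then de-duplicate; return (peak, final) bytes."""
    landed = list(chunks)                       # phase 1: write everything
    peak = sum(len(c) for c in landed)          # full raw size sits on disk
    unique = {hashlib.sha256(c).digest(): c for c in landed}  # phase 2: crush
    final = sum(len(c) for c in unique.values())
    return peak, final


# Hypothetical workload: the same ten 1 KiB chunks written seven times over.
chunks = [bytes([i]) * 1024 for i in range(10)] * 7

inline_peak = inline_peak_bytes(chunks)
post_peak, post_final = post_process_bytes(chunks)
print(f"inline peak: {inline_peak}")
print(f"post-process peak: {post_peak}, after de-dupe pass: {post_final}")
```

In this toy model both approaches end at the same stored size, but the post-process path needs room for the full raw stream and a second pass before it gets there.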

That's a very attractive thing. One thing data centers are not looking for is more complexity, more processes, more points of failure. They are not looking for more things to resource. They want simplicity, they want consolidation. You need to buy storage anyway; I can't solve that problem for you. But if you buy storage from us, we don't bring any new complexity in terms of processes that have to be done to crush the data down. Customers just want to put our appliances in the rack and forget about them.

We can sit behind the VTL today. We already do it offline. If a customer says, "I want to move the data to a VTL first and then go to Data Domain," fine -- we're actually certified with EMC's disk library to do that. What customers will ask us is, "Why do we need that?" We don't have a really good answer because you don't need it.

And there's a lot of other cache-and-crush models out there that customers actually use. For example, database backups: Database administrators will sling the production images of the database to a secondary system, and then the system is backed up from there... So we're able to do post-process if that's what people want to do. It's just that they're choosing not to because they don't need to.

Hugo said that post-processing is a crutch for people that can't do inline. We still stand by that comment. It's what is really going on here.


Byte and Switch: In the process of product development, will you acquire any companies? There's been a lot of talk in different quarters that Data Domain seems to be in a position to be an acquirer.

Slootman: Obviously, we've had our IPO and secondary, and our balance sheet is in a heck of a lot better place than it was when we were a private company, and we have our stock which is fairly highly valued -- so we're certainly in a position to get interested in that sort of thing.

We're by culture and by nature not an acquisitive company. Our bias is toward organic development of technology. But that said, this is not a card that we wouldn't play if we thought that it would benefit our agenda. And there are certainly areas that have our interest.

But we are not going to be a consolidator of this space, where we just buy technologies and market share. You will not see that happen from Data Domain. If we are able to acquire other design centers that could really augment our strategy over the long haul, maybe that could play out.

We are a public company, so I'm not really commenting on this in any level of specificity. But that's certainly an arrow that's in the quiver that we would use if we thought it would benefit us.

Byte and Switch: How do you view the competitive situation at this point?

Slootman: Of course, Data Domain is always discussed by analysts and reporters as, "Yes, they are the leading vendor, but competition is coming." Over 2007 we created a huge amount of distance between ourselves and the competition. And investors always ask me, "Why can't EMC do this, why can't Quantum do this?" My answer is always, "They can!" But you need two things: You need to spend the time, because building the technology is a gestation process. It doesn't matter if you have billions of dollars. If you want to have a baby, it takes nine months. You can throw 100 women at the problem and it's not going to make a difference. You have to learn just like we have over the last six years... If you got started a year ago, you're about five years behind Data Domain.

Aside from time, you need the talent. Some companies have the talent. I certainly believe that companies like Network Appliance have it, but I don't know whether they have the willingness to apply that talent for the time it takes.

Obviously, other people would rather just muddy the water and confuse the positioning and see if they can slow this thing down. That's certainly what we've seen from incumbent companies.

In 2007, the marketplace really digested all the claims and all the messages [about de-duplication]. The great thing about data center technology is you can't get by with hyperbole and rhetoric. I mean you're going to have to put up or shut up. Your systems are going to be put into the data center. People are going to drive their workload through it, they're going to see what happens if the power goes out, what happens if a drive fails. There's just a lot of sophistication to this technology, and in the data center, people just don't have a lot of tolerance for stuff that becomes a management headache or doesn't work or doesn't perform. And because of the years and years of experience we have with this and thousands of customers and installations around the world, there's just a lot of maturity in the product and people see that when they evaluate it -- and that's what they're voting for.


Byte and Switch: Are there any particular vertical markets most in need of your technology right now?

Slootman: This is one of the things I like best about this business. It is a universal marketplace. Everyone that has to land data has to also protect it, and they're going to do it one way or the other. We're in every vertical you can possibly imagine, in commercial distributed enterprise. It runs the gamut, and we've never really had a vertical marketing focus, because it just really hasn't been appropriate for this category of technology.

We're a data center play, and you find data centers in every vertical out there. We've been really big in federal systems, we're strong in the Department of Defense. We have our own people in 22 countries out there. Regardless of geography, regardless of vertical -- there's just no concentration at all in our business.

I can give you a list of customers. That, by the way, is a big strength: We can go to customers and say, "We'll give you six references," and they'll all be in your vertical and all local. That represents a lot of strength. There's no way competitors can do that right now. From that standpoint, having a vertical presence is really helpful.

Byte and Switch: What about audio or video? Is there any interest Data Domain has in modifying de-duplication to fit those data types?

Slootman: That's definitely a different world. We do have customers in that area, but the top three data sets that we back up are really corporate data. Number one is corporate databases; number two is email databases; and number three is home directories -- basically, unstructured data of all kinds.

We're not a company that has focused on digital content as a marketplace. If people are repetitively backing up the same digital content, yes, you will see considerable de-duplication benefits, but the real benefit of this technology is that it knows how to deal very well with data that shows marginal differences from day to day. With digital, that's a whole different racket altogether.

Of course, there are all kinds of compression technology being used with video and audio data. But they're not lossless. There are fidelity differences with those compression technologies -- and, of course, in our business that's just not possible. We have to have lossless compression. There cannot be a difference between the image we compress and the image that we read back.
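The lossless requirement Slootman describes can be stated as a simple round-trip check: what is read back must be byte-for-byte identical to what was written. The sketch below uses Python's standard `zlib` module purely as a stand-in lossless compressor; it is not the compression used in Data Domain's products, and the sample payload is made up.

```python
import zlib

# Hypothetical payload standing in for a backup image.
original = b"corporate database page " * 1000

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

# Lossless means the round trip reproduces the input exactly.
assert restored == original
print(f"original: {len(original)} bytes, compressed: {len(compressed)} bytes")
```

Lossy audio and video codecs would fail this equality check by design, which is the fidelity difference referred to above.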


  • Advanced Micro Devices (NYSE: AMD)

  • Cisco Systems Inc. (Nasdaq: CSCO)

  • Data Domain Inc. (Nasdaq: DDUP)

  • EMC Corp. (NYSE: EMC)

  • Intel Corp. (Nasdaq: INTC)

  • Network Appliance Inc. (Nasdaq: NTAP)

  • Quantum Corp. (NYSE: QTM)

  • Riverbed Technology Inc. (Nasdaq: RVBD)

  • Symantec Corp.
