<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://it20.info/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>IT 2.0 Main Blog</title><link>http://it20.info/blogs/main/default.aspx</link><description /><dc:language>en-US</dc:language><generator>CommunityServer 2.0 (Build: 60217.2664)</generator><item><title>From Scale Up vs Scale Out... to Scale Down</title><link>http://it20.info/blogs/main/archive/2009/12/10/1427.aspx</link><pubDate>Thu, 10 Dec 2009 18:26:00 GMT</pubDate><guid isPermaLink="false">3066da22-6b27-4cf1-aa0f-2eff79b21f87:1427</guid><dc:creator>Massimo</dc:creator><slash:comments>2</slash:comments><comments>http://it20.info/blogs/main/comments/1427.aspx</comments><wfw:commentRss>http://it20.info/blogs/main/commentrss.aspx?PostID=1427</wfw:commentRss><description>&lt;P align=justify&gt;Those of you that have been following me on &lt;A href="http://twitter.com/mreferre"&gt;twitter&lt;/A&gt; and on &lt;A HREF="/"&gt;my blog&lt;/A&gt; know that I have been very focused on studying and monitoring the latest trends regarding which hardware platforms virtualization users are using for their infrastructures. This includes multiple points of view such as &lt;A HREF="/blogs/main/archive/2007/11/26/83.aspx"&gt;simple sizing rules of thumb&lt;/A&gt;, &lt;A HREF="/blogs/main/archive/2009/10/02/271.aspx"&gt;potential reference architectures&lt;/A&gt; and &lt;A HREF="/files/3/documentation/entry186.aspx"&gt;scale up vs. scale out strategies&lt;/A&gt;. I'd like to spend the next few minutes talking about what's going on lately in this respect, specifically in light of the latest (and future) hardware improvements we have seen or that we will see in the next few months. I am doing this because I have a very weird feeling about what's going on. Bear with me. &lt;/P&gt;
&lt;P align=justify&gt;When I started working with VMware software back in 2001, the only value proposition that we could imagine out of &lt;I&gt;the thing&lt;/I&gt; was the so-called &lt;I&gt;server consolidation&lt;/I&gt;: in essence the process of consolidating many &lt;I&gt;virtual instances&lt;/I&gt; - aka &lt;I&gt;partitions&lt;/I&gt; or &lt;I&gt;guests&lt;/I&gt; - onto fewer physical servers. To make a long story short, down the road we realized that the value proposition was way more than just &lt;I&gt;server consolidation&lt;/I&gt; as a means to reduce the costs of operation. It soon became pretty evident that there were many more advantages, including easier high availability for applications, easier Disaster Recovery scenarios, faster time-to-market for business applications, and many more. &lt;I&gt;Server consolidation&lt;/I&gt; was, at that point, just one of the many value items we know today.&lt;/P&gt;
&lt;P align=justify&gt;Right now my feeling is that the advantage of stuffing more and more OS instances on as few physical systems as possible is not even considered an advantage any more these days. To put it another way, it is still considered an advantage, but only to a certain extent. In fact, if consolidating more instances on fewer hardware pieces was still one of the strategic objectives of a virtualization process, what you would have seen was a progression in terms of the ratio &lt;I&gt;# of OS instances / physical system&lt;/I&gt;. Something like this:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P align=justify&gt;&lt;FONT color=#ff0000&gt;&lt;B&gt;4-Socket&lt;/B&gt;&lt;/FONT&gt; &lt;B&gt;single-core&lt;/B&gt; x86-based server with &lt;B&gt;n &lt;/B&gt;GB of memory could support &lt;B&gt;10 VMs&lt;/B&gt;&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;&lt;FONT color=#ff0000&gt;&lt;B&gt;4-Socket&lt;/B&gt;&lt;/FONT&gt; &lt;B&gt;dual-core&lt;/B&gt; x86-based server with &lt;B&gt;n*2 &lt;/B&gt;GB of memory could support &lt;B&gt;20 VMs&lt;/B&gt;&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;&lt;FONT color=#ff0000&gt;&lt;B&gt;4-Socket&lt;/B&gt;&lt;/FONT&gt; &lt;B&gt;quad-core&lt;/B&gt; x86-based server with &lt;B&gt;n*4 &lt;/B&gt;GB of memory could support &lt;B&gt;40 VMs&lt;/B&gt;&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P align=justify&gt;The numbers above are just examples, and are only used to outline the mathematical progression I was mentioning. The high-level idea behind it is that the more powerful the systems become, the more OS instances you could consolidate onto them. Once you have strategically chosen a given hardware platform (whose main characteristic is expressed in the # of CPUs it is capable of supporting) you will see higher consolidation ratios as the CPUs become more powerful (typically via doubling the number of cores from one generation to the next). Put into more mathematical language, the constant here should be the number of CPUs (in red). The speed of the CPU is a function of Moore's law, so to speak. As a result, the number of VMs that can be supported is a function of the CPU speed. Memory is also a function of the CPU speed and it needs to be configured accordingly to keep a balanced system with the proper CPU-to-Memory ratio.&lt;/P&gt;
&lt;P align=justify&gt;That's what would happen (naturally) if server consolidation was a priority. However, I have noticed that it doesn't seem to be what's actually happening in the industry. I can think of many such situations, but the most emblematic to me refers to a customer I have been working with very closely since 2001. We started deploying 16-Socket single-core servers, then they moved to 8-Socket dual-core servers, then to 4-Socket quad-core servers and are now in the process of migrating to 2-Socket Nehalem-based servers. In a way, what is happening is that customers are inverting the mathematical constants and variables compared to what would be natural (see above). This is the approach and mindset most customers are using these days to size their "brick": &lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P align=justify&gt;To support &lt;FONT color=#ff0000&gt;&lt;B&gt;20 VMs&lt;/B&gt;&lt;/FONT&gt; I would need an &lt;B&gt;8-Socket single-core&lt;/B&gt; system with &lt;B&gt;n &lt;/B&gt;GB of memory&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;To support &lt;FONT color=#ff0000&gt;&lt;B&gt;20 VMs&lt;/B&gt;&lt;/FONT&gt; I would need a &lt;B&gt;4-Socket dual-core&lt;/B&gt; system with &lt;B&gt;n &lt;/B&gt;GB of memory&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;To support &lt;FONT color=#ff0000&gt;&lt;B&gt;20 VMs&lt;/B&gt;&lt;/FONT&gt; I would need a &lt;B&gt;2-Socket quad-core&lt;/B&gt; system with &lt;B&gt;n &lt;/B&gt;GB of memory&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P align=justify&gt;Wow. This is neither &lt;I&gt;Scale Up&lt;/I&gt; nor &lt;I&gt;Scale Out&lt;/I&gt;. This is indeed &lt;I&gt;Scale Down&lt;/I&gt;! &lt;/P&gt;
&lt;P align=justify&gt;Again, while the numbers are not tremendously unrealistic, they are only used to demonstrate, at a very high level, the mathematical progression which maps the mindset. As you can see there is a trend in the industry right now that doesn't consider the number of VMs you can get on a system as a function of how fast and powerful the system is. It's quite the opposite. The speed of a system is determined as a function of the requirement to run a fixed number of VMs. Since the size of the memory is typically a function of the number of VMs, its configuration doesn't tend to vary drastically because the number of VMs tends to remain the same. By the way, 20 / 25 VMs seems to be the average number most customers are defaulting to on each physical host, based on what I have seen. &lt;/P&gt;
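&lt;P align=justify&gt;To make the two progressions concrete, here is a minimal Python sketch. It assumes a flat ratio of 2.5 VMs per core, which is simply what the example numbers above imply (10 VMs on four single-core sockets). The function names are made up for illustration, and this is a toy model of the mindset, not a sizing tool.&lt;/P&gt;
&lt;PRE&gt;
```python
import math

# Assumption: a flat 2.5 VMs per core, implied by the example lists in the post.
VMS_PER_CORE = 2.5


def consolidation_first(sockets, cores_per_socket, vms_per_core=VMS_PER_CORE):
    """Sockets are the constant; the VM count grows with cores per socket."""
    return int(sockets * cores_per_socket * vms_per_core)


def scale_down(target_vms, cores_per_socket, vms_per_core=VMS_PER_CORE):
    """The VM count is the constant; the sockets needed shrink as cores grow."""
    cores_needed = math.ceil(target_vms / vms_per_core)
    return math.ceil(cores_needed / cores_per_socket)


if __name__ == "__main__":
    for cores in (1, 2, 4):
        vms = consolidation_first(4, cores)
        sockets = scale_down(20, cores)
        print(f"{cores} core(s) per socket: 4 sockets host {vms} VMs; "
              f"20 VMs need {sockets} sockets")
```
&lt;/PRE&gt;
&lt;P align=justify&gt;Same arithmetic, opposite choice of what to hold constant: the first function keeps the red number (sockets) fixed, the second keeps the VM count fixed.&lt;/P&gt;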
&lt;P align=justify&gt;There are a few reasons why this is happening. One of the reasons is that most customers are not comfortable putting too many eggs in a single basket. They may be guessing that 20 / 25 partitions per host is a good trade-off between the disadvantage of the potential downtime of multiple partitions and the advantage of having fewer physical servers (compared to a non-virtualized environment). For example, having 5 partitions would diminish the value of the latter too much, and having 100 partitions would increase the potential risk of the former too much. The consensus today does seem to be 20 / 25 partitions.&lt;/P&gt;
&lt;P align=justify&gt;Another reason why this is happening is that there is a common perception that the smaller the &lt;I&gt;virtualization brick&lt;/I&gt; is, the cheaper it is (due to the commoditization process we are seeing in the low-end x86 market). I don't have a definitive position on this - as I think that it always depends. But there are a number of people in this industry who would claim that, while this may be a good approach for a small business that only has a few dozen partitions to deal with, it wouldn't work for an enterprise customer with thousands of partitions. The method would result in an improperly designed virtualized infrastructure due to the high number of physical low-end servers required.&lt;/P&gt;
&lt;P align=justify&gt;The third - and last - reason I am mentioning here is a bit trickier and more opportunistic in my opinion. The x86 virtualization industry is largely driven by software vendors rather than hardware vendors. Software vendors in this space tend to prefer the usage of low-end commodity servers because, this way, they can provide the value at the software layer. There is no magic: the better the hardware is (in terms of scalability / resiliency / efficiency / etc.), the fewer infrastructure software features you need to make it an enterprise platform. On the other hand, if you use many low-end commodity x86 servers you can tie them together into a single gigantic (virtual) enterprise platform through the value of the software running on them. The latter is what software vendors really love to hear these days and that's what they are after. &lt;/P&gt;
&lt;P align=justify&gt;If you are still following me and agree with the analysis to some extent, you'll realize that there are a number of implications caused by this trend. &lt;/P&gt;
&lt;P align=justify&gt;One of the implications is that servers are now &lt;I&gt;memory-bound&lt;/I&gt;. If you ask 10 virtualization architects in the x86 space they will all tell you that the limiting factor today in servers is the memory subsystem. To put it another way, you are reaching the physical memory usage limit far before you manage to saturate the processors in a virtualized server. Have you ever wondered why that is the case? As users move backwards from 8-Socket servers to 4-Socket servers to 2-Socket servers the number of memory slots available per server gets reduced. That's how x86-based servers have been designed over the years: the more sockets the server has, the more memory slots are available. What is happening now is that customers tend to use much smaller servers because they can support the same number of partitions per physical host, but the memory requirements haven't changed. That's because the amount of memory needed is a function of the number of partitions running, and if that number of partitions is kept constant you will always need the same amount of memory. &lt;/P&gt;
&lt;P align=justify&gt;That's the problem: you now have a lot fewer slots available to support the same amount of memory. While memory vendors have been able to squeeze more and more Gigabytes worth of circuitry into the same DIMMs, the fact is that this is not enough to create a balanced system, given that the speed of CPUs has improved at a faster pace than memory vendors have been able to shrink their parts to put more memory space into a single DIMM. The outcome? You either configure very dense - and expensive! - memory modules into those fewer slots in the low-end servers, or you configure reasonably cheap DIMMs into those slots. The first approach would send the price of that virtualization brick through the roof; the second approach would cause the system to be bottlenecked very soon by the memory subsystem, with the CPUs being used at a fraction of their potential. This is in fact what's happening, as it is not uncommon these days to see virtualized systems being used - from a CPU perspective - at about 30-40%, with memory already under heavy pressure and approaching the physical limit.&lt;/P&gt;
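&lt;P align=justify&gt;A rough sketch of the slot arithmetic, in Python. The 4 GB per partition and the 32 / 16 / 8 slot counts are hypothetical figures picked for illustration, not vendor specifications, and the helper name is made up. The only point is that, with the partition count held constant, halving the slots doubles the DIMM density you need.&lt;/P&gt;
&lt;PRE&gt;
```python
import math


def min_dimm_size_gb(partitions, gb_per_partition, slots):
    """Smallest power-of-two DIMM size (in GB) that fits the memory into the slots."""
    total_gb = partitions * gb_per_partition
    per_slot_gb = total_gb / slots
    return 2 ** math.ceil(math.log2(per_slot_gb))


if __name__ == "__main__":
    # Hypothetical: 20 partitions at 4 GB each = 80 GB, whatever the server size.
    for slots in (32, 16, 8):
        size = min_dimm_size_gb(20, 4, slots)
        print(f"{slots} slots: at least {size} GB per DIMM")
```
&lt;/PRE&gt;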
&lt;P align=justify&gt;There is another aspect to consider which is even more "interesting." The high density memory cost seems, frankly, to be the excuse for being stuck in such a situation. After all, it may even be convenient, in some cases, to configure more expensive memory parts to double the number of partitions and put those wasted CPU cycles to good use. However, the real problem seems to be that most customers are &lt;I&gt;mentally partitions-bound&lt;/I&gt;: "No matter the technology and its associated costs, I don't want to get beyond the 20 / 25 partitions per physical host." If that is really the case - it's just my feeling so far - in the near future we won't need cheaper high density memory DIMMs or more memory slots in low-end servers. Most likely what will happen in the near future is that these customers will either start using 1-Socket servers - assuming these have the same memory support characteristics as the 2-Socket servers - or more simply they will start populating a single CPU package in 2-Socket-capable servers. At this pace we will be running single socket Atom servers in about 24 to 36 months: Intel and AMD are warned!&lt;/P&gt;
&lt;P align=justify&gt;This will also have further (and funny) implications. For example, the structure of all the industry benchmarks out there may become irrelevant in the future (assuming you consider it relevant today). All these benchmarks are designed to load the CPUs at 100% (configuring all other subsystems to cope with that) and come out with a scalability number. In the server virtualization context, this number is typically expressed as the number of VMs a given &lt;I&gt;n-Socket&lt;/I&gt; server can support. In the scenario I am picturing, this is completely useless. First of all, because of what we have said, memory is becoming the bottleneck in most situations, so these benchmarks should - at least - assume 100% memory load as the limiting factor of a given server configuration. What's the point of benchmarking a server running at 100% CPU utilization for which you had to configure 1TB of memory and 3,000+ disk spindles to achieve that CPU load, when customers are using 128GB of memory and a few dozen spindles at best? &lt;/P&gt;
&lt;P align=justify&gt;To make things worse, the number of VMs is not even a function of the speed of the server any more - as we argued - but rather it's becoming a constant in the equation. In the currently available benchmarks, in fact, the constant is the number of Sockets and its 100% load. To build a benchmark that could map exactly what's happening in the industry and could be of use to the community, one would need to design a performance test that would give the number and type of CPUs and memory DIMMs needed to support a constant number of partitions (20 or 25). The lower the resources (and their price), the better the result.&lt;/P&gt;
&lt;P align=justify&gt;While there is nothing wrong with all this, at the same time we need to acknowledge it is the complete negation of the initial &lt;I&gt;Server Consolidation&lt;/I&gt; value item we started with back in 2001. The problem is that users may be leaving lots of money on the table because of inefficiencies due to underutilized resources and/or the management of many small Intel-based servers (think about the costs associated with power consumption or I/O cabling). This is far from being an attempt to convince you that &lt;I&gt;Scale Up&lt;/I&gt; is a better approach. I am ok with a &lt;I&gt;Scale Out&lt;/I&gt; approach, too, as I can see the value of it. However, I see this &lt;I&gt;Scale Down&lt;/I&gt; approach as a trend that won't allow users to exploit the full potential of what they could achieve using the technologies properly. Perhaps I have the wrong perception of what's going on; or perhaps I have the right perception and I am wrong in questioning it. Either way, I'd be curious to &lt;A href="mailto:massimo@it20.info"&gt;hear&lt;/A&gt; what you think, if you have a spare minute.&lt;/P&gt;
&lt;P align=justify&gt;Massimo. &lt;/P&gt;&lt;img src="http://it20.info/aggbug.aspx?PostID=1427" width="1" height="1"&gt;</description></item><item><title>XenServer: Why? (Updated)</title><link>http://it20.info/blogs/main/archive/2009/10/29/1422.aspx</link><pubDate>Thu, 29 Oct 2009 00:02:00 GMT</pubDate><guid isPermaLink="false">3066da22-6b27-4cf1-aa0f-2eff79b21f87:1422</guid><dc:creator>Massimo</dc:creator><slash:comments>3</slash:comments><comments>http://it20.info/blogs/main/comments/1422.aspx</comments><wfw:commentRss>http://it20.info/blogs/main/commentrss.aspx?PostID=1422</wfw:commentRss><description>&lt;P align=justify&gt;There have been lots discussions lately about what's happening around Citrix XenServer. Perhaps too many. For what it is worth, I was one of the people discussing this on the net (Twitter, Blogs etc) with some other folks. I originally drafted a blog post when Citrix bought XenSource but it never made it (officially because I was busy, unofficially because I couldn't figure out "why"). &lt;/P&gt;
&lt;P align=justify&gt;I think that what is happening is pretty clear at this point. The market landscape is being consolidated with Oracle acquiring VirtualIron as well as the "Sun Xen thing" within the overall grand plan of the acquisition of (what remains of) Sun. All these solutions have hardly, in the past few years, managed to make a difference in the industry and their names were floating around more with the hope that VMware could feel more pressure and competition, and hence lower the prices. In the meantime, VMware increased their prices, which speaks for itself. &lt;/P&gt;
&lt;P align=justify&gt;This is leaving (apparently) the x86 virtualization market with 3 relevant viable alternatives: VMware, Microsoft and Citrix. I have always said this is going to be a two-horse race and I still stand behind this statement. The first horse is VMware and the second horse is what I call Microtrix (tm). There was a nice Twitter discussion a few days ago on why Citrix bought XenSource, the future of it, etc. This was my tweet in the discussion which, in a way, summarizes my thinking: &lt;/P&gt;
&lt;P align=justify&gt;&lt;I&gt;&lt;FONT color=#0000ff size=4&gt;My XenServer in 140 chars: a non conventional weapon ordered by Microsoft for Citrix to use in the "meanwhile" (meanwhile Hyper-V matures)&lt;/FONT&gt;&lt;/I&gt;&lt;/P&gt;
&lt;P align=justify&gt;While I have always said I am a geek, you can't afford to not look at all this from a business perspective. So the discussion is not so much "features related" but it is rather more like "how a vendor is going to capitalize on something". Because, at the end of the day, all vendors are vendors for a single reason: $$$. &lt;/P&gt;
&lt;P align=justify&gt;And this is what never worked out for Citrix in my opinion. This is what I miss from a business perspective. Don't get me wrong, I am not saying "XenServer is not a good product!". I am rather asking ... "why XenServer?". &lt;/P&gt;
&lt;P align=justify&gt;So Citrix bought XenSource more than a couple of years ago (off the top of my head - I am on a train and not connected) and the idea was that they would have engaged with VMware to win a chunk of the promising business VMware was leading. 500M$, at that time, was a big investment but something you could afford to spend if your grand plan is to win a slice of that lucrative market. Immediately the whole thing sounded a bit weird for at least a couple of reasons: &lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P align=justify&gt;That was not Citrix's core business: they essentially deal (very well) with end-user application virtualization at multiple levels. They are not so much into the data center, other than for centralizing something that is otherwise distributed on the end-user desktops (oversimplification!). &lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Microsoft was to come out shortly with their very first implementation of Hyper-V and it was clear that XenServer was going to compete with it.&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P align=justify&gt;I was struggling to fit this Citrix strategy into the bigger picture, especially because of the strong Microsoft and Citrix relationship - some refer to Citrix as a fully independent Microsoft subsidiary, go figure. So while they were "in bed" at the Corporate level they would have forced their respective sales fields and channels to compete at the local level. And we are not talking about a mere add-on tool where there is slight competition. This would have been a fierce battle for a key layer (and a tremendous point of control) in the data room. Not peanuts, folks!&lt;/P&gt;
&lt;P align=justify&gt;Well that was it anyway. So we lived in this limbo for quite a while without bringing up this concern again until Citrix broke the news just before VMworld Europe 2009. Just prior to the event they made the announcement that XenServer Enterprise (I mean the high-end version with all the fireworks) was going to be given away for free. Yeah you got it right: the technology they bought from XenSource for 500M$ was to be given away for free. And you may rightly wonder "why?", especially if you consider that the Citrix business track record, as far as I can tell, is not that of a charity, nor can you say - more seriously - that Citrix is the kind of company that gives away licenses for free because they make money on professional services and support. Not at all: they have always been in the business of selling you a great piece of software (Metaframe / XenApp being an example) for a great amount of money and profits. Not only that, they were now putting lots of R&amp;amp;D efforts into a product that was going to generate 0 revenue and hence 0 profits. This can't be Citrix, I wondered! My assumption of "lots of R&amp;amp;D efforts" comes from what they used to tell customers asking "what is the value of Citrix XenServer as opposed to the freely available open source Xen package?". Their position was, in fact, that they were adding to the base open source code some additional functionalities and enterprise-grade testing of all components. That's what customers were paying for. &lt;/P&gt;
&lt;P align=justify&gt;Immediately afterwards, they made a new announcement where they stated they would be developing add-on management products for XenServer (called Citrix Essentials) to extend the basic capability of the XenServer technology. This was putting them somewhat on a track that would have made more sense if it was not for another part of the same announcement: in fact, they stated that these add-ons would be available to extend the functionalities of both XenServer and Microsoft Hyper-V / Virtual Machine Manager. And this, again, made me wonder: they now have the possibility of making money on either the free product they develop and maintain or the free product that Microsoft develops and maintains. So why bother with developing and maintaining your own free stuff if you can off-load the burden to your pals? &lt;/P&gt;
&lt;P align=justify&gt;Citrix didn't take long to answer that question (with facts). The latest news is that Citrix announced, a few days ago, that they are going to donate to the open source community not only the Xen hypervisor itself (which is already open source) but the whole proprietary stack that XenSource and then Citrix have been developing around it (and for which Citrix paid 500M$ I would add...). At least this makes more sense for them as, if we go back to the previous discussion, XenServer is now no longer on their R&amp;amp;D budget. However, it doesn't answer why they spent 500M$, in the first place, to get to this point after just a couple of years. &lt;/P&gt;
&lt;P align=justify&gt;Another weird thing I heard lately is that, in the latest discussions on the web, Citrix has also provided an interesting success metric for XenServer, which is the amount of profit loss that XenServer caused to VMware. Now, every single vendor is allowed to spend their own money as they wish (as long as the investors are happy) but they should allow end-users to wonder why they have invested 500M$ in a company just to hurt the (current) leader in that space. I would say that you don't enter a market, as a newcomer, spending a lot of money to buy something and turn it into freely available open source software in a couple of years... with the sole intent of making the leader lose money. However, you may want to do so if you are in a dominant position and you feel the pressure from the leader of a segment where you are still late-to-market. Are you guessing?&lt;/P&gt;
&lt;P align=justify&gt;To recap, this is what I have observed in the last few years:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P align=justify&gt;VMware has grown in relevance in this industry.&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Microsoft feels they may be losing an important point of control in the data center (to VMware) but are not ready to counter with Hyper-V (R1). &lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Citrix buys XenSource (one of VMware's most important potential competitors) for 500M$.&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Citrix engages in a battle with VMware (and apparently with Microsoft) to win the hypervisor war.&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Citrix gives away XenServer for free in an attempt to hurt VMware even more.&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Citrix announces the Citrix Essentials package that would extend hypervisor functionalities for both Citrix XenServer as well as Microsoft Hyper-V.&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Microsoft announces the availability of Hyper-V R2 (which fills many gaps they had with the VMware offering).&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Citrix is to donate the XenServer code to the open source community.&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P align=justify&gt;I am not sure about you, but I see something here between the lines.&lt;/P&gt;
&lt;P align=justify&gt;The latest Citrix take on this is that they didn't waste their money as XenServer is a key component of their XenDesktop strategy where they use XenServer as the hypervisor to serve the back-end infrastructure and they are using the Xen kernel to build the client hypervisor platform for off-line VDI scenarios and the like. I don't want to dispute this. There is nothing wrong with this strategy and I think that Citrix also has a technology lead vs. VMware when it comes to application virtualization and VDI (just like VMware has a technology lead for the back-end infrastructure). My mere argument is that, at this very point, they could have done exactly the same thing without spending the 500M$ in the first place back in 2007. For example:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P align=justify&gt;They could have added support for their XenDesktop to a XenSource backend (similarly to how they provide support for VMware and Microsoft hypervisors today).&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;They could have developed Citrix Essentials for both XenSource and Hyper-V if they really thought it made sense for them to do so.&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;They could have taken the already open sourced Xen hypervisor to create their own client hypervisor for off-line VDI.&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P align=justify&gt;I can't think of a single thing that they couldn't have done leveraging the Xen open source project or leveraging a partnership with XenSource and yet keeping 500M$ in their wallet... I have too much respect for Mark Templeton not to insinuate that there was a larger plan in this XenSource acquisition.&lt;/P&gt;
&lt;P align=justify&gt;Just to make sure we are all on the same page, this doesn't mean Xen(Server) is dead by any means. It will continue to live and grow in the open source community and it will evolve over time. For example it will be a very compelling building block for those (big) service providers trying to implement cloud services. If these players could afford to build everything in house (as Amazon did) and if they don't want to deal with the commercial tricks and license limitations of a more "commercial" package, such as VMware vSphere, then Xen(Server) is a great fit. These customers, in fact, may not see vSphere as a good fit since, while the ESXi hypervisor is free, it does require Virtual Center to fully exploit its basic functionalities. Nothing wrong with that, but these service providers may want to leverage something more flexible and build their in-house developed stuff on it without stringent licensing requirements posed by the vendors.&lt;/P&gt;
&lt;P align=justify&gt;Similarly, typical commercial customers may appreciate a more &lt;I&gt;off-the-shelf&lt;/I&gt; / &lt;I&gt;vendor owned&lt;/I&gt; product such as VMware ESX/vCenter/View or Microsoft Hyper-V/VMM/Citrix Essentials/XenDesktop. That's the two-horse race I was talking about. The VMware vs Microtrix (tm) positioning in the industry is beyond the scope of this post.&lt;/P&gt;
&lt;P align=justify&gt;As an example, I am finding it hard to understand why an SMB customer, with some 10 or 20 Windows servers to virtualize, should use XenServer as opposed to Microsoft Hyper-V with Virtual Machine Manager. While the Microsoft solution is not entirely free it would cost "negligible peanuts" and with the new R2 release it will pretty much map what the free XenServer offering can provide (&lt;STRIKE&gt;High Availability&lt;/STRIKE&gt;&lt;FONT color=#ff0000&gt;*&lt;/FONT&gt;, LiveMigration on top of all), especially in a pure Windows context as is often the case in SMB accounts. By the way, if some Linux support is required Microsoft is doing a great job at that too with Hyper-V, and if you want even more functionalities the Citrix Essentials package will do!&lt;/P&gt;
&lt;P align=justify&gt;Back to my tweet above, the warning I want to give you is this: watch out because weapons are used and then decommissioned when they become obsolete (from a business perspective). Perhaps I am wrong. Only time will tell. In the meanwhile, mark my words (I can't do worse than what Gartner/IDC did years ago when they speculated Itanium would have ruled the world by 2008 anyway).&lt;/P&gt;
&lt;P align=justify&gt;I have tried to interpret what I have seen in the past without bias (I hope). At least I tried to stick to straight facts. Perhaps my name will show up on some black-lists after this post; at least I hope it will give end-users an additional point of view to think about before committing to a strategic hypervisor decision. &lt;/P&gt;
&lt;P align=justify&gt;Massimo. &lt;/P&gt;
&lt;P align=justify&gt;P.S. What's in this post only reflects my personal opinions and not those of my employer.&lt;/P&gt;
&lt;P align=justify&gt;&lt;B&gt;&lt;FONT color=#ff0000&gt;*&lt;/FONT&gt;&lt;/B&gt; Roger Klorese from Citrix pointed me to the fact that &lt;I&gt;High Availability&lt;/I&gt; is not included in the free XenServer offering being open sourced but it's rather included in the fee-based Citrix Essentials package. Thanks Roger for the heads up. &lt;/P&gt;&lt;img src="http://it20.info/aggbug.aspx?PostID=1422" width="1" height="1"&gt;</description></item><item><title>Ad Hoc Designed Infrastructures: do they still make sense?</title><link>http://it20.info/blogs/main/archive/2009/10/02/271.aspx</link><pubDate>Fri, 02 Oct 2009 00:32:00 GMT</pubDate><guid isPermaLink="false">3066da22-6b27-4cf1-aa0f-2eff79b21f87:271</guid><dc:creator>Massimo</dc:creator><slash:comments>9</slash:comments><comments>http://it20.info/blogs/main/comments/271.aspx</comments><wfw:commentRss>http://it20.info/blogs/main/commentrss.aspx?PostID=271</wfw:commentRss><description>&lt;P align=justify&gt;The topic in this article is something that I have been thinking about for a while. It's about the methodology, the patterns, the habits - if you will - associated with how new IT infrastructures are being assessed, designed, sold and - in the final analysis - acquired by end-users for their datacenters. While it might not make a lot of sense to you initially, please bear with me as I go through my "internal mental brainstorming." It seems long but, as usual, it's full of pictures.&amp;nbsp; &lt;/P&gt;
&lt;P align=justify&gt;The Italian market is pretty interesting: the vast majority of the customers are (very) small organizations distributed across the entire territory. We also have a few medium-sized businesses (although not the core economy of the country), and then we have big organizations (a mix of public customers and privately held corporations). To turn this into IT terms, the vast majority of Italian customers' datacenters are very small - in the range of 5 to 15 x86-based servers. We then have customers - such as medium-sized businesses, big banks and big public organizations - that have hundreds to a few thousand x86-based servers. Having spent most of my IT career focusing on the optimization of the x86 infrastructures, I had to deal with all these scenarios above so I think I have a pretty complete view of the spectrum. This article is going to discuss specifically a couple of points that I had to deal with during the process: &lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;
&lt;P align=justify&gt;The &lt;I&gt;&lt;B&gt;assessment&lt;/B&gt;&lt;/I&gt; of the legacy infrastructures from a capacity and characteristics perspective.&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;The &lt;I&gt;&lt;B&gt;design&lt;/B&gt;&lt;/I&gt; of the target architecture of the virtualized infrastructures.&lt;/P&gt;&lt;/LI&gt;&lt;/OL&gt;
&lt;P align=justify&gt;These are two different aspects, and they could deserve a dedicated discussion, but I am trying to cover both in this article anyway.&amp;nbsp; &lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;&lt;B&gt;Assessing and Designing Optimized x86 Infrastructures for the &lt;I&gt;Small IT Shops&lt;/I&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P align=justify&gt;At the very beginning of the virtualization era (around 2002-2003), I was using a pretty standard methodology that would require the analysis of the current datacenter in terms of number of physical x86 servers deployed, their hardware configuration and their usage (average at least, historical at best). You would then take the data and work through it to get to a specific hardware sizing that was capable of consolidating those physical servers onto a lower number of physical boxes. This worked pretty well until a few months ago, when I sat down with my good fellow Maurizio Benassi and we drafted a brand new methodology for sizing. It all started with a joke:&lt;/P&gt;
&lt;P align=justify&gt;&lt;I&gt;&lt;FONT face="Franklin Gothic Medium" size=2&gt;"The majority of customers could be consolidated on either one single mainframe (which never breaks), two Unix boxes (which very rarely break) or three x86 servers (which happen to break from time to time)."&lt;/FONT&gt;&lt;/I&gt;&lt;/P&gt;
&lt;P align=justify&gt;A further analysis of the patterns resulted in an updated joke (err: statement) regarding the new pragmatic methodology: &lt;/P&gt;
&lt;P align=justify&gt;&lt;I&gt;&lt;FONT face="Franklin Gothic Medium" size=2&gt;"One x86 server could sustain the whole workload, the second x86 server is configured for high availability, the third server is used to sleep well at night."&lt;/FONT&gt;&lt;/I&gt;&lt;/P&gt;
&lt;P align=justify&gt;Fun aside, I guess you are starting to see a pattern here. Think about that for a moment: the fact is that the smallest &lt;B&gt;&lt;I&gt;x86 architecture&lt;/I&gt;&lt;/B&gt; you can configure today is capable of supporting the workload that the vast majority of customers have in place. And I am using the notion &lt;I&gt;x86 architecture&lt;/I&gt; here on purpose since you never - ever - configure a single x86 box for any given datacenter - no matter what the workload is. What happened in the last few months is that the majority of the virtualization requests I have seen coming in could be served efficiently with a standard configuration which comprises just a couple of Nehalem-based servers tied together with some sort of shared storage. Why would you bother assessing a common pattern and reinventing the wheel (er: the architecture) every time? More on this later. &lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;&lt;B&gt;Designing Optimized x86 Infrastructures for the &lt;I&gt;Medium and Big IT Shops&lt;/I&gt;&lt;/B&gt;&lt;/P&gt;
&lt;P align=justify&gt;This is a completely different realm; however, assessing, designing, selling and acquiring such infrastructures do have their own peculiarities which might contrast with the standard historical methodology I have mentioned above (deep level analysis of the installed base to produce a to-be new infrastructure). I have already discussed in the past a more pragmatic approach to &lt;A HREF="/blogs/main/archive/2007/11/26/83.aspx"&gt;sizing&lt;/A&gt; (virtual) infrastructures that I ended up using in the last few months. I still stand behind the controversial comments in that article regarding the opportunity to go through a detailed analysis of the entire environment vs. taking a shortcut like the one I have described in the post. It's interesting also to notice that, similarly to what happens for the small shops, the layout of the to-be virtualized infrastructure doesn't dramatically change across the different situations. Sure, the size might change dramatically: where most if not all small shops could be doing fine with two servers, these enterprise customers might require a different number of physical servers (along with a different amount of storage and network connections). However, the high-level architecture isn't so drastically different among all the configurations I have been working on. I am referring to common patterns we can learn from such as shared storage configurations, cluster(s) of virtualized servers and common network configurations. &lt;/P&gt;
&lt;P align=justify&gt;By the way, this isn't supposed to be shocking and the pattern could be easily explained. In the old days - when physical deployments were the norm - you had to take into account each application silo, and determine the best infrastructure configuration for each. That's how you ended up with complex and heterogeneous scenarios where some applications could be deployed on physical standalone servers with no redundancy, other applications had to be deployed on physical standalone servers with some degree of redundancy, others yet had to be deployed on dedicated physical clusters - forget active / active heterogeneous application clusters - for the most demanding high availability requirements. Virtualization, at least in the context of the 100% virtualized datacenter if I can steal Chad Sakac's mantra,&amp;nbsp;is changing all this complexity. First, applications are no longer bound to specific physical servers so you can start thinking in "MIPS" terms for the whole infrastructure rather than sizing each vertical silo on its own. This is when my &lt;A HREF="/blogs/main/archive/2007/11/26/83.aspx"&gt;rule of thumb&lt;/A&gt; comes in handy as you will always - most likely - end up in the average (the more servers you have the better it works). &lt;/P&gt;
&lt;P align=justify&gt;Another side effect of virtualization is that it has raised the bar of SLAs and you can tune your service levels on the fly without having to re-work your entire hardware infrastructure underneath. A good example is the possibility of moving your workload from SATA storage to Fibre Channel storage on-line (or nearly on-line) if you need it, or creating your application high availability policies at run time: in a VMware infrastructure, for example, this might be No-HighAvailability, HighAvailability or even FaultTolerance. At the end of the day, designing an enterprise infrastructure boils down to sizing the aggregated workload (where aggregated is the key word here) and providing the right set of infrastructure characteristics and attributes that an organization might require (with the flexibility to apply them to selected workloads only at workload deployment time).&lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;&lt;B&gt;Do the Functional Requirements Matter During the Design Phase?&lt;/B&gt;&lt;/P&gt;
&lt;P align=justify&gt;Simply put, IT&amp;nbsp;is comprised of two major building blocks: &lt;A href="http://en.wikipedia.org/wiki/Functional_requirements"&gt;Functional Requirements&lt;/A&gt; and &lt;A href="http://en.wikipedia.org/wiki/Non-functional_requirement"&gt;Non-Functional Requirements&lt;/A&gt;. This is how Wikipedia defines them: &lt;/P&gt;
&lt;P align=justify&gt;&lt;U&gt;Functional Requirements&lt;/U&gt;: "A functional requirement defines a function of a software system or its component. A function is described as a set of inputs, the behavior, and outputs (see also software)"&lt;/P&gt;
&lt;P align=justify&gt;&lt;U&gt;Non Functional Requirement&lt;/U&gt;: "A non-functional requirement is a requirement that specifies criteria that can be used to judge the operation of a system, rather than specific behaviors. This should be contrasted with functional requirements that define specific behavior or functions".&lt;/P&gt;
&lt;P align=justify&gt;So the question I have been thinking about for the last few years is simple: in a virtualization context, do I really need - during a customer engagement - to go through a deep level analysis of the applications currently being deployed or soon to be deployed? In addition, when defining the new virtualized infrastructure to support the applications mentioned, do I need to analyze all those applications one-by-one (from a Non Functional Requirement perspective) or can I treat them as a whole? You can infer the answer from the following two slides which are included in a &lt;A HREF="/files/3/documentation/entry54.aspx"&gt;set of charts&lt;/A&gt; I created back in 2007. &lt;/P&gt;&lt;IMG height=636 src="http://www.it20.info/misc/pictures/Ad-HocDesignedInfrastructures-dotheystillmakesense1.jpg" width=836 border=0&gt;
&lt;P&gt;&lt;IMG height=636 src="http://www.it20.info/misc/pictures/Ad-HocDesignedInfrastructures-dotheystillmakesense2.jpg" width=836 border=0&gt;&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;The yellow line &lt;SPAN&gt;"No Fly Zone" Buffer&lt;/SPAN&gt; pretty much captures the concept I am trying to articulate here: the application realm and the infrastructure realm don't need to be strictly correlated. The infrastructure underneath needs to be designed and architected to match the current and projected total workload of the functional requirements. In addition to that it needs to be designed to match the customer's policies around the required Non-Functional Requirements. Neither of these two items requires an in-depth analysis and assessment of the various application silos currently deployed in a non-virtualized datacenter. &lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;&lt;B&gt;Does the Public Cloud Concept Bother With Functional Requirements After All?&lt;/B&gt;&lt;/P&gt;
&lt;P align=justify&gt;You have heard the buzz lately about internal and external Cloud, haven't you? And I am sure you heard the concept of &lt;I&gt;Private&lt;/I&gt; (aka &lt;I&gt;Internal&lt;/I&gt;) and &lt;I&gt;Public&lt;/I&gt; (aka &lt;I&gt;External&lt;/I&gt;) Clouds. The idea is that you can have a given workload that you can choose to execute either internally on your infrastructure or externally on a third-party infrastructure (typically that of a service provider). This should happen transparently.&lt;/P&gt;
&lt;P align=justify&gt;It is obvious at this point that the Public Clouds out there have not been designed upfront with your own applications in mind, nor can they be. That is obviously impossible. First, they are shared infrastructures so they would have to be &lt;I&gt;ad hoc designed&lt;/I&gt; against more than a single customer (impossible). Plus they are ready to use so they need to be in place before the provider could even think about assessing your internal infrastructure - assuming it makes sense, but clearly it doesn't as I said above - to be able to support it in its Public Cloud. &lt;U&gt;All security concerns about running applications in a Public Cloud aside for a moment&lt;/U&gt;, let's agree that you can effectively run your application either internally or externally. And if that is possible, why would you need to purpose-design an ad hoc internal infrastructure based on an assessment and in-depth analysis of the legacy, if the public infrastructure allows you to do that without going through that pain? That's possible simply because the Public Cloud infrastructures are designed against standard well-known successful patterns that have been used to design internal virtualized infrastructures for years. &lt;/P&gt;
&lt;P align=justify&gt;This doesn't mean all Public Clouds are equal - they might vary greatly in terms of the characteristics they offer (Non-Functional Requirements). You might find Public Clouds that are optimized for costs, some others might be optimized for high availability, and others still might be optimized for Disaster Recovery scenarios. This is exactly similar, in concept, to how you would want your own private datacenter to behave: are HA and DR important to you (for all or just a selection of applications)? Is scalability important to you? Is data protection important to you? And so on. Again, this is somewhat unrelated to the fact you use IIS or Apache, Lotus Domino or Microsoft Exchange (you name your favorite application here). &lt;/P&gt;
&lt;P align=justify&gt;The problem we have today is that, while we define Public and Private Clouds as being very similar from a "plumbing" perspective, the way they are sold/bought by vendors/customers is too different. We tend to rent a service with some characteristic on the Public Cloud, whereas most customers still buy dispersed technology parts to build a Private Cloud. &lt;/P&gt;
&lt;P align=justify&gt;Sure there are big differences in the sense that while you "buy" a Private Cloud, you actually "rent" a Public Cloud (well, a part of it). Similarly a Private Cloud is dedicated whereas a Public Cloud is shared. Last but not least the management of a Private Cloud is on you whereas the management burden of the Public Cloud is on the service provider. However, if you look at the plumbing (the way servers, networks and storage are assembled and tied together with a hypervisor) the differences are not so drastic. What if the industry started hiding all the plumbing details of Private Clouds and started selling them like Public Clouds are sold? In a scenario like this customers wouldn't buy various pieces of technologies to assemble together; rather they'd buy and then manage a certain capacity with a certain level of Non-Functional Requirements (as opposed to rent and let the provider manage a part of a Public Cloud). What we have seen so far is hardware vendors (aka &lt;I&gt;Private Clouds vendors&lt;/I&gt;) adding Public Clouds services offerings. I wouldn't be surprised to see service providers of Public Clouds turning into &lt;I&gt;Private Clouds vendor&lt;/I&gt;s as well leveraging their know-how. &lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;&lt;B&gt;It's All About the Metadata!&lt;/B&gt;&lt;/P&gt;
&lt;P align=justify&gt;As I said, I have been thinking about this concept of simplifying the way virtualized x86 infrastructures are proposed by IT vendors and, in turn, acquired by the end-users. I knew there was a single word to define all this but I was struggling to find it until I read this very interesting &lt;A href="http://vinternals.com/2009/08/this-cloud-needs-an-enema/"&gt;post&lt;/A&gt; from &lt;A href="http://www.vinternals.com/"&gt;vinternals&lt;/A&gt;. Metadata: that's the word I was looking for. Thanks, Stu! In fact, this fits pretty nicely with the VMware mantra of vApps if you think about this for a moment. Those of you that have been working on the matter have probably seen this chart many times. &lt;/P&gt;&lt;IMG height=278 src="http://www.it20.info/misc/pictures/Ad-HocDesignedInfrastructures-dotheystillmakesense3.jpg" width=365 border=0&gt;
&lt;P&gt;
&lt;P align=justify&gt;The idea is that, through the OVF standard, a vApp (basically a collection of a number of virtual machines that can provide a service to the end user) publishes its Non-Functional Requirements to be satisfied. As Stu points out, while the vApp can &lt;U&gt;publish its requirements&lt;/U&gt;, there is no structured way - as of today - for the infrastructure underneath to &lt;U&gt;publish what it is capable of providing&lt;/U&gt;. However, if you have noticed, I am trying to push this concept a little bit further: not only is infrastructure metadata for Non-Functional Requirements a must to create the binary match between what the applications require and what the infrastructure is capable of providing, but it also could be used to revolutionize, as I said, how the new infrastructures (comprised of hardware, storage and networking) are designed, architected, built and sold/acquired. This in turn means a shorter&amp;nbsp;and easier sales cycle for vendors and proven, reliable, fully supported all-in-one infrastructures for customers. &lt;/P&gt;
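&lt;P align=justify&gt;As a rough illustration of that binary match, here is a minimal sketch in Python. Everything in it is an assumption made up for the example: the capability labels, the function name and the sample data are hypothetical, and none of them is part of the OVF standard or of any VMware API. It only shows the principle: if both sides publish their metadata, matching them becomes a trivial set comparison. &lt;/P&gt;
&lt;PRE&gt;
```python
def missing_capabilities(required, offered):
    """Return the capabilities a vApp requires that the infrastructure lacks."""
    # Set difference: whatever is required but not offered is a gap.
    return set(required) - set(offered)


# Hypothetical metadata labels, invented for this sketch only.
vapp_requires = {"high-availability", "disaster-recovery"}
infra_offers = {"high-availability", "shared-storage", "backup"}

gaps = missing_capabilities(vapp_requires, infra_offers)
print(sorted(gaps))  # prints ['disaster-recovery']
```
&lt;/PRE&gt;
&lt;P align=justify&gt;An empty result would mean the infrastructure can host the vApp as it is; anything else is the list of Non-Functional Requirements still to be addressed. &lt;/P&gt;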
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;&lt;B&gt;Reference Architectures: Examples&lt;/B&gt;&lt;/P&gt;
&lt;P align=justify&gt;In retrospect, this is exactly what I was trying to achieve (without using the terminology and the notion I am using in this article) when I started to talk about virtualized &lt;I&gt;reference architectures&lt;/I&gt; during customers' and partners' events in the last few months. I have used a fairly simple approach which might be the basis for a more sophisticated &lt;I&gt;speculative sizing algorithm&lt;/I&gt;. First of all, I made a few assumptions in terms of sizing based on the &lt;A HREF="/blogs/main/archive/2007/11/26/83.aspx"&gt;rules of thumb&lt;/A&gt; I have published in the past (and adjusted to map onto the new technology).&lt;/P&gt;
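&lt;P align=justify&gt;To make the idea a bit more concrete, this is a minimal sketch of what such a speculative sizing calculation might look like. The function name, its parameters and the sample numbers are all hypothetical (they are not the figures from my rule-of-thumb article): the only logic is to work out how many servers can carry the aggregated workload and then add spare servers for high availability. &lt;/P&gt;
&lt;PRE&gt;
```python
import math


def servers_needed(total_vms, vms_per_core, cores_per_server, spare_servers=1):
    """Estimate the physical servers needed for an aggregated workload."""
    # How many average VMs a single server can host.
    vms_per_server = vms_per_core * cores_per_server
    # Servers required to carry the whole workload with no failures,
    # never fewer than one.
    base = max(math.ceil(total_vms / vms_per_server), 1)
    # Add the spare capacity kept for high availability (N+1 by default).
    return base + spare_servers


# Hypothetical inputs: 30 VMs, 4 VMs per core, 8 cores per server.
print(servers_needed(30, 4, 8))  # prints 2
```
&lt;/PRE&gt;
&lt;P align=justify&gt;With these made-up inputs a single server sustains the whole workload and the second one is there for high availability; passing spare_servers=2 would give you the third server "to sleep well at night". &lt;/P&gt;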
&lt;P align=justify&gt;The above step covers the "sizing" part but it doesn't really cover the characteristic of the configuration (i.e. what we now call &lt;I&gt;Metadata&lt;/I&gt; in the context of this article). I then started to draft a few common scenarios (or reference architectures if you will) that I have seen being commonly and successfully used by many customers. Actual numbers and other assumptions we have used are not important in this context. I am just showing you the framework I have used and I am sure those numbers and overall assumptions might need more work to capture better patterns. &lt;/P&gt;
&lt;P align=justify&gt;The following is the first example that I presented at a joint IBM-LSI-Intel-VMware event last spring:&lt;/P&gt;&lt;IMG height=475 src="http://www.it20.info/misc/pictures/Ad-HocDesignedInfrastructures-dotheystillmakesense4.jpg" width=625 border=0&gt;
&lt;P&gt;
&lt;P align=justify&gt;This is obviously a very simplistic approach. In addition, it would be laughable (I agree) to call these two brief comments a &lt;I&gt;List of Non-Functional Requirements&lt;/I&gt;. Although the next few examples are a bit better, by no means is this a comprehensive implementation of the potential shift in the industry that I am discussing. &lt;/P&gt;
&lt;P align=justify&gt;The following chart illustrates another example which is a superset of the above configuration where we have added the backup solution.&lt;/P&gt;&lt;IMG height=478 src="http://www.it20.info/misc/pictures/Ad-HocDesignedInfrastructures-dotheystillmakesense5.jpg" width=625 border=0&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The following is another example which uses the &lt;A HREF="/blogs/main/archive/2008/11/14/162.aspx"&gt;BladeCenter S&lt;/A&gt; as a foundation. Note: don't pay too much attention to the number of VMs a configuration like this can support compared to the others. We have used HS12 blades which are single socket blades that don't use the brand new Intel Xeon 5500 (Nehalem) CPUs so the #VMs/Core is a bit lower. Again these are just examples. &lt;/P&gt;&lt;IMG height=479 src="http://www.it20.info/misc/pictures/Ad-HocDesignedInfrastructures-dotheystillmakesense6.jpg" width=625 border=0&gt;
&lt;P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;The chart below is an example of an infrastructure capable of supporting about 72 VMs and with a "DR counterpart" to be installed at a remote site. In this example, we didn't use the native storage mirroring capabilities and we opted for a cheaper software replication alternative. Notice the RPO (Recovery Point Objective) is greater than 0 since software-based replication like this does not keep the two storage systems completely in sync at every point in time. This is a typical Non-Functional Requirement discussion and a design point. This should be one of the first things that Metadata should publish as a characteristic of the underlying infrastructure. If you want you can use more sophisticated and native replication technologies as I discussed in this &lt;A HREF="/blogs/main/archive/2009/07/04/243.aspx"&gt;post&lt;/A&gt;.&lt;/P&gt;&lt;IMG height=484 src="http://www.it20.info/misc/pictures/Ad-HocDesignedInfrastructures-dotheystillmakesense7.jpg" width=625 border=0&gt;
&lt;P&gt;
&lt;P align=justify&gt;One interesting thing to notice is that the first configurations are comprised of the smallest hardware configurations you can buy today in the market. That's true for servers as well as for the storage components. Yet the workload they can sustain with this minimal configuration (expressed in &lt;I&gt;estimated number of VMs&lt;/I&gt;) exceeds the total amount of workload with which most of the SMB customers need to deal. This underlines again that an in-depth analysis to determine the size of the target environment is, in most cases, not even required.&lt;/P&gt;
&lt;P align=justify&gt;&lt;B&gt;Conclusions&lt;/B&gt;&lt;/P&gt;
&lt;P align=justify&gt;In this article, I questioned the value of two specific practices: first, assessing legacy infrastructures is becoming more and more useless because, on one hand, we have so much power available these days that for most customers in the SMB space the smallest thing vendors could design might be a bazooka to shoot a fly. For Enterprise customers, most of the time a rule-of-thumb approach (perhaps complemented with deferred purchases based on actual needs) seems to be a good compromise between quality of the output and the effort required to get to the output. &lt;/P&gt;
&lt;P align=justify&gt;Second, I questioned the value of designing ad hoc infrastructures: this is true for both SMB and Enterprise shops as we have enough experience in the industry at this point to start pushing reference architectures applying best practices we have learned in the last 10 years without having to reinvent the wheel (or the architecture if you will) every time.&lt;/P&gt;
&lt;P align=justify&gt;I know this is a bit of a stretch and, in fact, it's a sort of provocative article. However, while we are clearly not there today, my guess is that as we move toward the 100% virtualized datacenters, we might start to talk about selling and buying not just in terms of discrete components and technologies that can be used to create ad-hoc infrastructures, but rather in terms of &lt;I&gt;black boxes&lt;/I&gt; that have a &lt;I&gt;total aggregated throughput&lt;/I&gt; associated which could be expressed in "number of average VMs" or in any other metric that you can think of. &lt;/P&gt;
&lt;P align=justify&gt;Additionally the &lt;I&gt;black box&lt;/I&gt; would carry a label with a list of capabilities, or metadata, that describe the characteristics of the Non-Functional Requirements associated with that specific unit. A vendor might have more units in the catalog with different capacity and different levels of Non-Functional Requirements. The whole idea is to try to simplify the way these solutions are designed, architected and sold by people in the field on one side, and the way they are purchased by the end-users on the other. And with virtualization, which decouples functional and Non-Functional Requirements, we might see the light at the end of the tunnel this time. &lt;/P&gt;
&lt;P align=justify&gt;Massimo. &lt;/P&gt;&lt;img src="http://it20.info/aggbug.aspx?PostID=271" width="1" height="1"&gt;</description></item><item><title>The (Potential) Value of Blogging for Your Career</title><link>http://it20.info/blogs/main/archive/2009/09/12/262.aspx</link><pubDate>Sat, 12 Sep 2009 06:32:00 GMT</pubDate><guid isPermaLink="false">3066da22-6b27-4cf1-aa0f-2eff79b21f87:262</guid><dc:creator>Massimo</dc:creator><slash:comments>6</slash:comments><comments>http://it20.info/blogs/main/comments/262.aspx</comments><wfw:commentRss>http://it20.info/blogs/main/commentrss.aspx?PostID=262</wfw:commentRss><description>&lt;P align=justify&gt;Last night I posted a &lt;A HREF="/blogs/main/archive/2009/09/11/257.aspx"&gt;new article&lt;/A&gt; about the SpringSource/VMware story and the potential implications for the industry that this will have. After slightly more than 24 hours I am looking at the statistics and they say I am just south of 1000 views, which I think is amazing - for a casual blogger like myself at least. These days I have also come across a few comments on Twitter about a presentation that &lt;A href="http://boche.net/blog/"&gt;Jason Boche&lt;/A&gt; did at VMworld 2009 about the value of blogging for his visibility. Coincidentally, last night I read an excellent post from Duncan Epping on the same &lt;A href="http://www.yellow-bricks.com/2009/09/09/another-year-has-passed-by/"&gt;topic&lt;/A&gt;, that is, how much he has been able to capitalize on his "blogging hobby".&lt;/P&gt;
&lt;P align=justify&gt;I wanted to take a moment here to underline how true what Duncan was saying is. I can't agree more with his points as I have been through (some of) them myself. I have documented my blog experience in my revamped CV online &lt;A HREF="/aboutme/cv.htm"&gt;here&lt;/A&gt; (have a look at the third section which is dedicated to my blog experience). Blogging, and the Web 2.0 in general, have literally changed my professional life, specifically my exposure and visibility. Not only that, on a much larger scale, it is changing the way the "Power of Knowledge" in the IT world actually works. I built a small presentation on this concept a couple of years ago for an internal review, and I have posted it &lt;A HREF="/misc/files_download/myweb2.0english.ppt"&gt;here&lt;/A&gt; for you to download. It goes through some of the concepts and a point in time "life-cycle" of a blog post that evolved into a great deal of internal visibility. &lt;/P&gt;
&lt;P align=justify&gt;As Duncan pointed out the potential visibility you can get is very rewarding. By the way you don't need to "post like hell" in order to have good feedback and a good following. Take me as an example: even though I tend to post long articles that go through a specific concept and try to get into the details of the matter, I usually post no more than once a month on average. See? I don't post twice a day and you don't have to do so to end up in the &lt;A href="http://www.virtualization.info/2008/12/top-virtualization-blogs-of-2008.html"&gt;top 5 list&lt;/A&gt; of virtualization.info for the best blogs of 2008. The only piece of advice I have (on top of what Duncan suggested already) is that I wouldn't be too worried about having a blog... I would, on the other hand, be worried about having something interesting to say in a blog. I have been in countless IBM meetings where people were suggesting, for a particular topic we were working on, to create a Domino team-room or a wiki to collaborate and have people aggregating around it: most people don't understand that a wiki or a blog per se is nothing. It's just a frame... if you don't put a picture in it - i.e. the real content - people won't stop and won't look at it. &lt;/P&gt;
&lt;P align=justify&gt;I have to admit I have had so much exposure to end-users and business partners in the last few years that it is easy for me to write about stuff they are interested in. Even if you don't visit customers often these days, other technologies, such as forums, allow you to keep a real-life grip on what's going on in the field. When I spend a few hours on the VMware and Microsoft forums I feel like I have visited some 20 customers given the amount of information you can take out of those posts (pain points, requirements, constraints, even internal politics that have little to do with IT but do influence the IT choices). Sure, if you haven't been able to meet a customer in 10 years and you think Twitter is a bad word... well you can always post stuff on your blog but it will probably need to be about how you would suggest cooking pasta "al dente" or a good steak on the grill. &lt;/P&gt;
&lt;P align=justify&gt;I also agree 100% with Duncan's point that you post not only to share something you know with the community but also to have a chance to dig into something you need to understand in greater detail. This is for example what happened to me prior to this &lt;A HREF="/blogs/main/archive/2009/07/04/243.aspx"&gt;post&lt;/A&gt; about how DR works in a VMware scenario. I have used it as a challenge: there were many things that weren't clear to me about the various steps and I was too lazy to just sit down and read the manuals. I had to have a challenge and posting an article about how to do that was a good one. &lt;/P&gt;
&lt;P align=justify&gt;The only regret I have is that it seems Duncan has been able to capitalize more on his visibility than I have been able to, but that's ok. In the meanwhile I will enjoy the (almost) 1000 visits in 24 hours... talking to 1000 people of what I think about a given topic would mean 10 years on the road in the pre Web 2.0 era. &lt;/P&gt;
&lt;P align=justify&gt;I guess that having posted yesterday as well as today, I will have to wait another couple of months for the next post to be on my "average posting rate".&lt;/P&gt;
&lt;P&gt;Massimo. &lt;/P&gt;&lt;img src="http://it20.info/aggbug.aspx?PostID=262" width="1" height="1"&gt;</description></item><item><title>VMware, SpringSource and What's Not Appropriate to Say</title><link>http://it20.info/blogs/main/archive/2009/09/11/257.aspx</link><pubDate>Fri, 11 Sep 2009 00:22:00 GMT</pubDate><guid isPermaLink="false">3066da22-6b27-4cf1-aa0f-2eff79b21f87:257</guid><dc:creator>Massimo</dc:creator><slash:comments>6</slash:comments><comments>http://it20.info/blogs/main/comments/257.aspx</comments><wfw:commentRss>http://it20.info/blogs/main/commentrss.aspx?PostID=257</wfw:commentRss><description>&lt;P align=justify&gt;The acquisition of SpringSource that VMware has announced is going to change the way the industry as a whole perceives and segments the key players in the x86 virtualization market. I think most people (myself included) need to change gear and look at the whole thing from a new perspective. In this article I am going to talk more about a concept that I have been thinking about lately: &lt;I&gt;virtualization is becoming more and more broad and deep&lt;/I&gt;.&lt;/P&gt;
&lt;P align=justify&gt;This is clearly becoming a two-horse race between Microsoft and VMware whereas Citrix is going to be forced to gravitate around Microsoft in the "broad and deep" context I am going to discuss hereafter. &lt;/P&gt;
&lt;P align=justify&gt;When I heard about VMware and SpringSource, all of a sudden I realized the world is changing for all of us virtualization geeks. First and foremost those that have only been bothering about low level infrastructure virtualization details - such as VMotion compatibilities, cluster configurations, storage integrations and so forth - will have a hard time keeping up with what's going on in the industry. Virtualization vendors are "moving up the stack" very quickly so you'd better start familiarizing yourself with concepts and technologies around Development Frameworks, Integrated Development Environments (IDEs) and stuff like that. Not the sort of things Systems Engineers (aka infrastructure people) paid too much attention to - until now. &lt;/P&gt;
&lt;P align=justify&gt;Those who have grown up with VMware in the virtualization arena have always focused their efforts on hypervisor capabilities first (I still remember my very first customer implementation where we were piloting a beta version of ESX 1.1) and subsequently on the infrastructure capabilities that VMware made available throughout the years (things like Virtual Center with all its associated functionalities as well as add-on products such as SRM and the like). This is the "standard dimension" we are all very familiar with, and I would define this dimension as &lt;I&gt;broad&lt;/I&gt;. Basically, VMware &lt;I&gt;broadened&lt;/I&gt; its value proposition, moving from the hypervisor (which is a commodity from a business perspective but a tremendous asset from a sell-up perspective) all the way to making the infrastructure richer and more enterprise-ready with additional functionalities, specifically in the automation space.&lt;/P&gt;
&lt;P align=justify&gt;The SpringSource move opens up a whole different dimension, which is what I refer to as the &lt;I&gt;deep&lt;/I&gt; dimension. In fact, if VMware continues to only broaden the richness around their hypervisor, they will always be at the mercy of two things: &lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Their competitors might be able to catch up to the same level of ecosystem and thus functionalities.&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Their own potential customers, who might not need that vast ecosystem of functionalities and might be satisfied with VMware competitors' offerings (even if not so &lt;I&gt;broad&lt;/I&gt;).&lt;/P&gt;&lt;/LI&gt;&lt;/OL&gt;
&lt;P align=justify&gt;Now, everybody knows that the stuff you can find in your data center is not a function of a technology per se, but rather a function of the business applications it is able to support. Basically your platform (be it a processor, an operating system or a middleware - you define it) is only as good as the number of ISVs it has been able to attract over the years (I should trademark this). Back to VMware. One of the challenges they had was to not only grow &lt;I&gt;broad&lt;/I&gt; but also find a way to grow &lt;I&gt;deep&lt;/I&gt;. They had to try to differentiate that black box that they provide (i.e. the virtual hardware which describes their virtual machine), essentially moving up the stack and trying to foster the development of business code on top of their virtual hardware (and virtual infrastructure) that wouldn't run as well on someone else's virtual hardware (and virtual infrastructure). They basically can no longer afford (or will not be able to afford in the long run) to win deals based on infrastructure functionalities alone. They need to create a compelling reason for the ISVs to suggest using VMware, rather than relying on Systems Engineers who suggest using VMware because it makes things "easier and cleaner." Let me tell you what I think: if it were about making things easier and cleaner we would all be running mainframes in our data centers. And we wouldn't be here discussing how to optimize the Intel server sprawl as there wouldn't be any Intel server sprawl in the first place. &lt;/P&gt;
&lt;P align=justify&gt;If VMware doesn't do this, they are exposed in the long run to the risks in points #1 and #2 above. In trying to create a better and more integrated application + infrastructure duo - which is their current mantra when discussing the SpringSource acquisition - they also need to find a way to make sure applications that are being developed will run better on certain virtual infrastructures (namely VMware) vs. competitors' virtual infrastructures (namely Microsoft). Did I say lock-in? Nah, what a bad term. &lt;/P&gt;
&lt;P align=justify&gt;Let me draw this concept in a simple chart: &lt;/P&gt;
&lt;P&gt;&lt;IMG height=497 src="http://www.it20.info/misc/pictures/VMwareSpringSourceandwhatsnotappropriatetosay1.JPG" width=631 border=0&gt;&lt;/P&gt;
&lt;P align=justify&gt;How do I read this chart? Interestingly enough, the hypervisor is central in this vision; however, it's perceived as a commodity, which it truly is from a revenue perspective. Having said this, it's an incredible point of control for the vendors because hypervisor XYZ will drag fee-based management features (typically from the same vendor). The management features are on the &lt;I&gt;broad&lt;/I&gt; dimension (left and right). In the VMware camp here you can find the enterprise features included in vSphere as well as all VMware data center oriented add-on products. In the Microsoft camp you would find System Center Virtual Machine Manager along with the whole System Center product suite. &lt;/P&gt;
&lt;P align=justify&gt;The other dimension (&lt;I&gt;deep&lt;/I&gt;) is what the new SpringSource acquisition is all about. VMware wants to create a more integrated application layer, through virtualization hooks in the SpringSource framework, that will make new Java-based applications VMware-aware. Microsoft has a similar if not bigger potential (although they haven't exploited it so far) in the fact that they own the software stack/framework (Windows / .NET) that is being used in about 80% of the x86 deployments worldwide (be they virtual or physical). In light of this and with virtualization in mind, one might speculate that VMware has a very mature &lt;I&gt;broad&lt;/I&gt; dimension and they are starting to build a &lt;I&gt;deep&lt;/I&gt; dimension. On the other side, Microsoft has a very mature &lt;I&gt;deep&lt;/I&gt; dimension (although, as I said, they haven't really leveraged it other than for some Windows enlightenment integrations) while they are adapting their high-potential &lt;I&gt;broad&lt;/I&gt; dimension with more virtualization in mind - the System Center suite is very mature and complete but it's not virtualization-centric, so to speak. I guess you are starting to see now why I think this is going to be a two-horse race. How could Citrix keep up with all this?&lt;/P&gt;
&lt;P align=justify&gt;All this looks interesting, but I have mixed feelings about what's going on. In a sense, having virtualization-aware applications is going to provide a new level of features and functionalities that do not exist today, which is very positive. On the other hand, I have always evangelized (and hoped!) for a very clean separation between the infrastructure services and the application layer, as I outlined in the presentation I gave at VMworld 2007 (download it &lt;A HREF="/files/3/documentation/entry54.aspx"&gt;here&lt;/A&gt;). I strongly believe there is tremendous value for end-users in using a standard infrastructure where they could switch virtualization technologies back and forth without having to compromise on the way business applications are written. Understandably this is not a value proposition the virtualization vendors like to hear as - for their own good business reasons - they want to be able to have the customers &lt;I&gt;strategically standardized on their own platform&lt;/I&gt;. Did I say lock them in? Nah. Joking aside, I believe many others will have conflicting points of view in trying to determine whether it would be better to have a more generic application that runs well on all virtualization software platforms, or to have an application that rocks only on a single virtualization platform and runs so-so on all others. All this assumes industry standards will either be nonexistent, only used by a single vendor or simply ignored. At the end of the day they all lead to the same result from a user perspective, which is proprietary implementations.&lt;/P&gt;
&lt;P align=justify&gt;Assuming this is the right interpretation of where the industry is moving (well, at least it's my interpretation for now), I think VMware is making a big bet with these messages. They are somehow giving the idea (to me at least) that in the long run there will be two optimized stacks in the industry one will need to choose from strategically: the first one is the "VMware stack" with SpringSource-based VMware-aware applications, and the other one is the "Microsoft stack" with Windows/.NET optimized applications, where the former would run on top of ESX / vSphere and the latter would run on top of Hyper-V / System Center. Sure, VMware is going to support Windows as well, but this discussion is not about running legacy physical servers in virtual machines; it is about how to properly and strategically integrate newly developed applications on top of a brand new virtual infrastructure. On the other hand, Microsoft does support and will continue to support Linux variants on top of Hyper-V, but you probably wouldn't say Linux is (going to be) optimized to run on top of the Microsoft hypervisor. You might argue that VMware does a better job at running Windows than Microsoft does at running Linux, but I don't think I need to explain why this is the case (just look at the OS market share data and you'll find the answer). The key point I am trying to make here is that as long as you treat the VM (and its application) as a black box, you can always argue that your virtual infrastructure does a better job at running it, regardless of what runs inside of it. On the other hand, as soon as you start having first and second class citizens in terms of application support (not to be confused with base OS support), you are opening up a new dimension that basically didn't exist before... and that might be an assist to your competitor. 
Perhaps this is a risk that VMware has to take to move to the next level: if they want to compete head-to-head with Microsoft they need to turn hard at some point and not get in bed with the enemy all the time.&lt;/P&gt;
&lt;P align=justify&gt;The following is a very unofficial view of what's in my mind with regard to things like focus, commitment and interest each of the two vendors will map into their own technology efforts:&lt;/P&gt;&lt;IMG height=497 src="http://www.it20.info/misc/pictures/VMwareSpringSourceandwhatsnotappropriatetosay2.JPG" width=631 border=0&gt;
&lt;P align=justify&gt;There is another aspect to the SpringSource acquisition. Besides the application integration I have referred to, VMware was interested in something else: a Platform as a Service offering. There are a number of segmentations and definitions around the various cloud models, but two are dominant (so far): IaaS and PaaS. &lt;/P&gt;
&lt;P align=justify&gt;The first one is &lt;I&gt;Infrastructure-as-a-Service&lt;/I&gt;, and its characteristics can be summarized as follows: a software black box that can run whatever the customer requires, starting from the OS all the way to the software stack (middleware and applications) of choice. For the virtualization geeks of the old school this basically is an empty virtual machine... I am sure you are familiar with that black screen that says "OS not found" and that prompts you for a diskette. &lt;/P&gt;
&lt;P align=justify&gt;The characteristics of &lt;I&gt;Platform-as-a-Service&lt;/I&gt; are a bit different and a little higher in the stack. In a PaaS cloud model, the end-user wouldn't be presented with a &lt;I&gt;bare (virtual) metal VM&lt;/I&gt; (horrible definition, but I think you get the idea) but rather with a &lt;I&gt;software platform&lt;/I&gt; that includes functionalities generally associated with operating systems, development frameworks and data management services. Microsoft Azure anyone? That's where VMware was falling short compared to Microsoft in the PaaS space. Microsoft has a very strong potential here to attract a huge community of developers. VMware had to do something to address that very important layer of the cloud space with its own offering, as they had to provide an end-to-end stack for both IaaS (which was easy, and which they have had for a number of years) and PaaS to be credible players. This reason was perhaps even more compelling than the first reason discussed in this article (i.e. being able to create VMware-aware autonomic applications). After all, if it was only about application integration, they could have partnered with key middleware vendors to integrate these functionalities into a variety of leading frameworks including WebSphere, WebLogic and JBoss, to name a few. The fact that they wanted/needed to buy SpringSource to do this is partially due to the fact that they couldn't afford non-exclusive partnerships, as well as to the fact that they needed to move up the stack very quickly. This is not something that a standard, perhaps not even exclusive, technology partnership could provide. &lt;/P&gt;
&lt;P align=justify&gt;Last but not least, while we are in speculation mode, if I look at the two PaaS stacks from Microsoft and VMware, the latter seems to be missing a good data management layer to counter the SQL Services in Azure. With Sun Microsystems falling apart and speculation about selected spin-offs of various divisions, I am wondering if VMware isn't evaluating an additional move up in their brand new stack targeting MySQL (or similar technologies if Oracle isn't willing to help VMware become a new Microsoft)...&lt;/P&gt;
&lt;P align=justify&gt;In conclusion, this is a very blunt take on what's on the horizon. Certainly the marketing machines of the vendors will try to smooth the edges of my very simplistic view. As an example, VMware is preaching to the industry that they are going to open up the APIs so that all applications built on top of all sorts of development frameworks could be integrated into their own infrastructure. While this is technically true, most .NET developers might end up doing this on the Microsoft platform for their own convenience - which might or might not have anything to do with the technical reasons associated. Similarly, I don't see many Java developers integrating their applications into Hyper-V and the Microsoft virtualization tools as a whole. It's interesting that the world seems to be aligning for these two vendors although they are coming from two very different perspectives: VMware is coming from the virtual infrastructure expanding into the platform space, whereas Microsoft is coming from the platform space moving into the virtual infrastructure space. The giants are moving and the customers are going to see the benefits. May you (we!) live in interesting times - which seems to be the case. &lt;/P&gt;
&lt;P align=justify&gt;Massimo.&lt;/P&gt;&lt;img src="http://it20.info/aggbug.aspx?PostID=257" width="1" height="1"&gt;</description></item><item><title>Disaster Recovery Inside-Out for Dummies (with LSI)</title><link>http://it20.info/blogs/main/archive/2009/07/04/243.aspx</link><pubDate>Fri, 03 Jul 2009 22:38:00 GMT</pubDate><guid isPermaLink="false">3066da22-6b27-4cf1-aa0f-2eff79b21f87:243</guid><dc:creator>Massimo</dc:creator><slash:comments>6</slash:comments><comments>http://it20.info/blogs/main/comments/243.aspx</comments><wfw:commentRss>http://it20.info/blogs/main/commentrss.aspx?PostID=243</wfw:commentRss><description>&lt;P align=justify&gt;In this article, I'd like to document a setup I have been working on for a few days at the LSI office in Milano (great guys and free beverages there! Thanks!). &lt;A href="http://www.lsi.com/"&gt;LSI&lt;/A&gt; is the company from which IBM OEMs the DS3000, DS4000 and DS5000 lines of storage servers. Since I am trying to get a little bit more into the storage and network subsystems I wanted to spend a few days playing with those kits. I have concentrated on today's hot topic of Disaster Recovery and particularly the integration of LSI RVM (Remote Volume Mirroring) into VMware SRM (Site Recovery Manager). I have to admit that I am not a storage guru, nor have I looked much into SRM, so most of the stuff you will find here might be pretty basic. This is clearly not an advanced read for the likes of &lt;A href="http://www.yellow-bricks.com/"&gt;Duncan Epping&lt;/A&gt;, nor for those who go to bed with the VMware vmkfstools CLI or "talk UUID." (I guess Duncan will get what I mean.) Yet it's intended to provide a bit of background about what happens behind the scenes (the "scenes" would be the GUIs of the various products involved in this case). The SRM part is really focused on the storage integration, which was the thing I was most interested in for this two-day storage marathon. 
I like to treat these articles as a sort of personal log / documentation of what I have done (for future reference) so it will certainly serve me in the long run. Hopefully it will be of use for some of you, too. &lt;/P&gt;
&lt;P align=justify&gt;Last but not least, while the scroll bar on the right of your browser might suggest this is a long post... consider that it's full of screenshots! So without further ado, let's get started. &lt;/P&gt;
&lt;P align=justify&gt;&lt;B&gt;Basic Remote Mirror Setup&lt;/B&gt;&lt;/P&gt;
&lt;P align=justify&gt;This part doesn't involve any specific SRM concept in action. It's just meant to describe the basic infrastructure setup (both logical and physical) as well as the way the storage replicates and how the VMware hosts deal with replicated LUNs. It is important to understand what happens at a lower level in order to move on and plug SRM on top of this. The picture below outlines how the logical layout of the infrastructure looks (including SRM): &lt;/P&gt;&lt;IMG height=640 src="http://www.it20.info/misc/pictures/DisasterRecoveryfordummies(withLSI)1.JPG" width=1179 border=0&gt; 
&lt;P align=justify&gt;For completeness, the following picture describes how the physical infrastructure looks instead: &lt;/P&gt;&lt;IMG height=640 src="http://www.it20.info/misc/pictures/DisasterRecoveryfordummies(withLSI)2.JPG" width=1179 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;As the picture outlines, the Virtual Center VMs in both sites also host the SRM service. Depending on the scale of your project you might want to have dedicated virtual machines to host the SRM instances or even dedicated physical servers. Milano, in our lab scenario, is the primary site while Roma is the DR site. As you can imagine, LUNs need to be replicated from the DS4700 in Milano onto the DS4800 in Roma. LSI calls this storage feature RVM (Remote Volume Mirroring) and it's essentially an advanced function that allows you to keep a copy of your LUNs on a remote storage server.&lt;/P&gt;
&lt;P align=justify&gt;Notice that the DS4700 is a storage server that packs into a single 3U package both the controllers (A and B) and the first string of disks (more can be attached through FC ports on the rear). On the other hand, the DS4800 has a 4U "head" unit that hosts the controllers but doesn't include any disks in the base chassis. They can be added with external expansions (as in the picture above). You might guess that the DS4800 is a more powerful machine than the DS4700 and that, in a real-life scenario, you would want that situation inverted. Your guess is correct, but for the sake of the tests this didn't matter since we weren't looking for ultimate performance. Also consider that any DS4xxx type of storage is "replication compatible" both ways with any other DS4xxx type of storage. And even DS5xxx!&lt;/P&gt;
&lt;P align=justify&gt;Note: Besides the standard zoning that lets each of the servers with two HBAs see each of the two controllers on the storage array, please consider that for the RVM feature to work the controllers need to be connected in a specific way. For this scenario, the last FC port of ControllerA on the DS4700 needs to be connected to the last FC port of ControllerA on the DS4800. The same zoning applies to ControllerB. Without this extra SAN configuration RVM would not work. And no, having a single switch per site is not a best practice - you would need two in a real-life environment.&lt;/P&gt;
&lt;P align=justify&gt;The storage configuration (a summary of it) is described in the pictures below. Basically the DS4700 in Milano has a couple of LUNs that are dedicated to the local cluster and that do not replicate (these are VC-MILANO and SERVICE-MILANO). These LUNs host the Virtual Center instance as well as a Windows template. There are other LUNs (&lt;I&gt;SRM-1-MILANO&lt;/I&gt;, &lt;I&gt;SRM-2-MILANO&lt;/I&gt;, &lt;I&gt;SRM-3-MILANO&lt;/I&gt; and &lt;I&gt;SRM-4-MILANO&lt;/I&gt;) that are replicated onto the DS4800 in Roma. A simple synchronous mirroring configuration has been established. &lt;/P&gt;&lt;IMG height=640 src="http://www.it20.info/misc/pictures/DisasterRecoveryfordummies(withLSI)3.JPG" width=1179 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;&lt;IMG height=640 src="http://www.it20.info/misc/pictures/DisasterRecoveryfordummies(withLSI)4.JPG" width=1179 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;The way you set this up is that you first create companion LUNs on the target: they need to be at least as big as the source LUNs, or bigger if you want.&lt;/P&gt;
&lt;P align=justify&gt;Through the LSI Storage Manager (SANtricity) you then select the source LUN and you mirror it onto the remote storage: a list of DSxxx storage devices with the mirroring feature enabled is shown, as well as a list of compatible companion LUNs for each device. The DS4800 does not map (present) the replicated LUNs to the cluster in Roma. This means that the hosts in the cluster have no idea whatsoever that there are LUNs on that array that are in sync with the cluster in Milano. In our lab we manually created &lt;I&gt;SRM-1-ROMA&lt;/I&gt;, &lt;I&gt;SRM-2-ROMA&lt;/I&gt;, &lt;I&gt;SRM-3-ROMA&lt;/I&gt; and &lt;I&gt;SRM-4-ROMA&lt;/I&gt; on the DS4800 (as you can see in the picture above) and then we went through the steps described to create the mirror. &lt;/P&gt;
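&lt;P align=justify&gt;As a rough illustration of the companion-LUN rule, here is a minimal Python sketch that pairs each source LUN with its target and checks the "at least as big" constraint. The LUN names come from the lab; the capacities in GB and the function name are made up for illustration, and this is not how the Storage Manager itself performs the check.&lt;/P&gt;

```python
# Hypothetical capacities in GB; the real lab sizes are not given in the article.
SOURCE_LUNS = {"SRM-1-MILANO": 100, "SRM-2-MILANO": 100,
               "SRM-3-MILANO": 50, "SRM-4-MILANO": 20}
TARGET_LUNS = {"SRM-1-ROMA": 100, "SRM-2-ROMA": 120,
               "SRM-3-ROMA": 50, "SRM-4-ROMA": 20}


def plan_mirror_pairs(sources, targets, src_site="MILANO", dst_site="ROMA"):
    """Pair each source LUN with its companion and validate the size rule."""
    pairs = []
    for name, size_gb in sorted(sources.items()):
        companion = name.replace(src_site, dst_site)
        if companion not in targets:
            raise ValueError(f"no companion LUN {companion} on the target array")
        # The companion must be at least as big as the source LUN.
        if not targets[companion] >= size_gb:
            raise ValueError(f"{companion} is smaller than {name}")
        pairs.append((name, companion))
    return pairs


if __name__ == "__main__":
    for source, target in plan_mirror_pairs(SOURCE_LUNS, TARGET_LUNS):
        print(f"{source} -> {target}")
```

&lt;P align=justify&gt;A bigger companion (like the hypothetical &lt;I&gt;SRM-2-ROMA&lt;/I&gt; above) passes the check, while a smaller one is rejected before any mirror is attempted.&lt;/P&gt;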
&lt;P align=justify&gt;Now that the replication is in place, the first test we did at the storage infrastructure level was to create a snapshot of a replicated LUN. From the Storage Manager we created a snapshot of &lt;I&gt;SRM-1-ROMA&lt;/I&gt; leaving the mirror link between &lt;I&gt;SRM-1-MILANO&lt;/I&gt; and &lt;I&gt;SRM-1-ROMA &lt;/I&gt;in place as the picture below suggests:&lt;/P&gt;&lt;IMG height=640 src="http://www.it20.info/misc/pictures/DisasterRecoveryfordummies(withLSI)5.JPG" width=1179 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;This is how you would read the above picture: &lt;I&gt;SRM-1-ROMA&lt;/I&gt; is a replica of a LUN coming from another storage server. As such it's in a read-only state (in fact you don't want to write onto it since it's continuously being updated by its master LUN on a remote storage). However, we took a snapshot of that R/O LUN at a certain point in time and we called it &lt;I&gt;Snap-SRM-1-ROMA-1&lt;/I&gt;. This LUN is now enabled for R/W so it could be fully used as a point in time copy of an R/O LUN under replication. &lt;/P&gt;
&lt;P align=justify&gt;The next step was then to manually map this snapshot to the cluster in Roma so the servers would be able to recognize it:&lt;/P&gt;&lt;IMG height=809 src="http://www.it20.info/misc/pictures/DisasterRecoveryfordummies(withLSI)6.JPG" width=1179 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;And this is when the "fun" begins. &lt;/P&gt;
&lt;P align=justify&gt;&lt;FONT face=Arial color=#0000ff&gt;*************&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Background information that you need to understand and be familiar with before you move on&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; *********************************&lt;/FONT&gt;&lt;/P&gt;
&lt;P align=justify&gt;There are two key parameters that rule how an ESX host deals with the LUNs: &lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P align=justify&gt;EnableResignature (default = 0 = False)&lt;/P&gt;
&lt;LI&gt;
&lt;P align=justify&gt;DisallowSnapshotLUN (default = 1 = True) &lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P align=justify&gt;It took me a while to digest them (and right now I think I am only halfway there), but essentially DisallowSnapshotLUN (when active, which is the default) instructs the ESX host NOT to import the VMware Datastore if it recognizes it's a snapshot of an existing LUN. When the parameter is set to False (0), the ESX host is allowed to import the snapshot as a VMware Datastore without modifying its original name or its UUID.&lt;/P&gt;
&lt;P align=justify&gt;The first parameter, EnableResignature (when active, which is NOT the default), instructs the ESX host to &lt;U&gt;resignature&lt;/U&gt; the LUN and import it into the ESX host as a new VMware Datastore (which gets labeled snap-xxxxxx-&amp;lt;Original Datastore Name&amp;gt;) with a new UUID. When this parameter is turned on, the DisallowSnapshotLUN value is irrelevant as the LUN gets resignatured right away and imported as a new Datastore.&lt;/P&gt;
&lt;P align=justify&gt;These parameters become very important (and very critical) when you are dealing with snapshots and clones on the same storage server and you try to give the original ESX hosts visibility of these new spaces. For example, if you try to expose to a given host/cluster the original LUN as well as its snapshot without resignaturing it, you risk data loss and inconsistency, as the host/cluster will only make one of these two entities available (they are in fact essentially the same thing: same Datastore name, same UUID). When you are dealing with a remote copy of the LUN(s), this becomes a less important issue because you are basically importing a snapshot (or a mirror) into a different set of ESX hosts. &lt;/P&gt;
&lt;P align=justify&gt;This should be enough for a dummy (like myself), but if you want to get into deeper details about these two parameters and the UUID thing I suggest you read &lt;A href="http://www.yellow-bricks.com/2008/12/11/enableresignature-andor-disallowsnapshotlun/"&gt;one of Duncan's best articles&lt;/A&gt; as well as &lt;A href="http://virtualgeek.typepad.com/virtual_geek/2008/08/a-few-technic-1.html"&gt;this post from Chad&lt;/A&gt;. &lt;/P&gt;
&lt;P align=justify&gt;&lt;FONT face=Arial color=#0000ff&gt;************************************************************************************************************************************************************&lt;/FONT&gt;&lt;/P&gt;
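&lt;P align=justify&gt;To make the interplay between the two parameters concrete, here is a minimal Python sketch of the decision as described in the background above. It is a toy model of the behavior, not VMware code: the function and class names are invented for illustration, and the "snap-xxxxxx-" prefix is just the placeholder label used in this article.&lt;/P&gt;

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScanResult:
    imported: bool                  # does the LUN show up as a Datastore?
    datastore_name: Optional[str]   # name it gets, if imported
    new_uuid: bool                  # was the LUN resignatured?


def rescan_outcome(original_name, is_snapshot,
                   enable_resignature=0, disallow_snapshot_lun=1):
    """Toy model of how a host treats a VMFS LUN found during a rescan."""
    if not is_snapshot:
        # An ordinary LUN is mounted as-is.
        return ScanResult(True, original_name, False)
    if enable_resignature:
        # Resignaturing wins: DisallowSnapshotLUN is irrelevant in this case.
        return ScanResult(True, f"snap-xxxxxx-{original_name}", True)
    if disallow_snapshot_lun:
        # Default behavior: the LUN is visible but not imported as a Datastore.
        return ScanResult(False, None, False)
    # Snapshots allowed, no resignature: same name and same UUID as the original.
    return ScanResult(True, original_name, False)


if __name__ == "__main__":
    for resig, disallow in ((0, 1), (0, 0), (1, 1)):
        print(resig, disallow, rescan_outcome("srm-1", True, resig, disallow))
```

&lt;P align=justify&gt;The three combinations printed at the end correspond to the defaults, to allowing the snapshot to be imported, and to resignaturing it.&lt;/P&gt;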
&lt;P align=justify&gt;If you are now familiar with the background above, you can probably guess what happens. Mapping the snapshot &lt;I&gt;Snap-SRM-1-ROMA-1&lt;/I&gt; to the cluster in Roma made the ESX hosts recognize the LUN after the rescan:&lt;/P&gt;&lt;IMG height=638 src="http://www.it20.info/misc/pictures/DisasterRecoveryfordummies(withLSI)7.JPG" width=1179 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;Since we left the parameters above at their defaults (EnableResignature=0, DisallowSnapshotLUN=1), the LUN doesn't show up as a VMware Datastore on any of the hosts in Roma:&lt;/P&gt;&lt;IMG height=918 src="http://www.it20.info/misc/pictures/DisasterRecoveryfordummies(withLSI)8.JPG" width=1179 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;This is the desired behavior since the hosts recognize this is a LUN that is coming from a different storage subsystem (so with a sort of "incompatible" UUID). As a matter of fact, you can manually add a brand new Datastore and the LUN above is shown as available space for a new VMFS file system (which we didn't create as we didn't want to destroy the content):&lt;/P&gt;&lt;IMG height=640 src="http://www.it20.info/misc/pictures/DisasterRecoveryfordummies(withLSI)9.JPG" width=1179 border=0&gt; 
&lt;P align=justify&gt;At this point we changed the DisallowSnapshotLUN parameter to 0 (that setting should read "Allow Snapshot to be imported"):&lt;/P&gt;&lt;IMG height=640 src="http://www.it20.info/misc/pictures/DisasterRecoveryfordummies(withLSI)10.JPG" width=1179 border=0&gt; 
&lt;P align=justify&gt;After this change (which doesn't require a reboot of the host), the hypervisor imports the VMware Datastore simply after a rescan of the HBAs:&lt;/P&gt;
&lt;P&gt;&lt;IMG height=640 src="http://www.it20.info/misc/pictures/DisasterRecoveryfordummies(withLSI)11.JPG" width=1179 border=0&gt;&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;Similarly, by changing the EnableResignature parameter to 1 and rescanning the HBAs, the Datastore gets imported with a new UUID and a new name as you can see from the picture below: &lt;/P&gt;
&lt;P&gt;&lt;IMG height=640 src="http://www.it20.info/misc/pictures/DisasterRecoveryfordummies(withLSI)12.JPG" width=1179 border=0&gt;&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;What I have described above (at a very high level) are basically the steps you would need to implement in order to manually deal with a DR procedure. SRM does that under the covers along with a number of other things, such as reconfiguring the VMs on the DR site (alternatively you would have to manually add them to the DR cluster after importing the Datastores). It's a common misconception that VMware SRM is a layer of additional technologies on top of what VI3 already provides (SRM today is not compatible with vSphere, but it should be soon). I think a better way to describe what SRM does is that it's a &lt;I&gt;method to code&lt;/I&gt; all the actions you would have to manually implement in order to either test or run a Recovery Plan. Many refer to SRM as a "binary coded DR runbook." There is nothing you can't do without SRM. But having SRM might save you time... and spare you some risks (manual DR procedures can be error prone). &lt;/P&gt;
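&lt;P align=justify&gt;The "runbook as code" idea can be sketched in a few lines of Python. This is only an illustration of the concept and has nothing to do with how SRM is actually implemented: the step list mirrors the manual procedure described in this article, and every action is a do-nothing placeholder (a hypothetical stand-in for the real storage or Virtual Center operation).&lt;/P&gt;

```python
def run_runbook(steps, log=print):
    """Run DR steps in order; stop at the first failure and report progress."""
    completed = []
    for name, action in steps:
        log(f"running: {name}")
        try:
            action()
        except Exception as exc:
            log(f"FAILED: {name} ({exc})")
            return completed, name
        completed.append(name)
    return completed, None


def noop():
    """Placeholder for a real storage or Virtual Center operation (hypothetical)."""


# The manual procedure from this article, expressed as ordered steps.
MANUAL_DR_STEPS = [
    ("snapshot the replicated LUN on the DR array", noop),
    ("map the snapshot to the DR cluster", noop),
    ("rescan the HBAs on the DR hosts", noop),
    ("import the Datastore (allow snapshot or resignature)", noop),
    ("add the VMs to the DR cluster", noop),
]

if __name__ == "__main__":
    done, failed = run_runbook(MANUAL_DR_STEPS)
    print(f"completed {len(done)} steps, failed step: {failed}")
```

&lt;P align=justify&gt;Stopping at the first failed step is a design choice of this sketch: each step depends on the previous one, so continuing after a failed rescan, for example, would make little sense.&lt;/P&gt;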
&lt;P align=justify&gt;&lt;B&gt;Site Recovery Manager Setup (Test the Recovery Plan) &lt;/B&gt;&lt;/P&gt;
&lt;P align=justify&gt;In this section, we are going to essentially automate the manual process above by means of a &lt;I&gt;DR orchestrator&lt;/I&gt; (in this case, VMware Site Recovery Manager). This article is not intended to be a detailed description of the capabilities of SRM nor a step-by-step guide to its configuration. We will assume from now on that the reader has a basic understanding of the product. Before we get into the details it is important to describe the virtual environments (guest OSes) we created in the production site. Notice that there are additional VMs that we have used to host a number of infrastructure services (such as the Virtual Center servers themselves). These VMs generally would be either hosted on external physical hardware or would not be subject to any SRM DR plan anyway. We will focus on what we pretend to be "production VMs" in our lab test. From this perspective we have essentially created three VMs (&lt;I&gt;Web1&lt;/I&gt;, &lt;I&gt;Web2&lt;/I&gt;, &lt;I&gt;Web3&lt;/I&gt;) that we mapped onto the four LUNs described above (&lt;I&gt;SRM-1-MILANO&lt;/I&gt;, &lt;I&gt;SRM-2-MILANO&lt;/I&gt;, &lt;I&gt;SRM-3-MILANO&lt;/I&gt; and &lt;I&gt;SRM-4-MILANO&lt;/I&gt;). The following picture outlines the mappings. &lt;/P&gt;&lt;IMG height=640 src="http://www.it20.info/misc/pictures/DisasterRecoveryfordummies(withLSI)13.JPG" width=1179 border=0&gt; 
&lt;UL&gt;
&lt;LI&gt;
&lt;P align=justify&gt;&lt;B&gt;Web1&lt;/B&gt; has two VMDK files associated with it. One is on the &lt;I&gt;srm-1&lt;/I&gt; VMware Datastore (which in turn is on the &lt;I&gt;SRM-1-MILANO&lt;/I&gt; LUN) and the other is on &lt;I&gt;srm-2&lt;/I&gt;.&lt;/P&gt;
&lt;LI&gt;
&lt;P align=justify&gt;&lt;B&gt;Web2&lt;/B&gt; has one single VMDK file associated to it which is on the &lt;I&gt;srm-2&lt;/I&gt; Datastore.&lt;/P&gt;
&lt;LI&gt;
&lt;P align=justify&gt;&lt;B&gt;Web3&lt;/B&gt; is a bit more tricky. It has a VMDK on &lt;I&gt;srm-3&lt;/I&gt; and it also has an RDM (Raw Device Mapping) onto the &lt;I&gt;SRM-4-MILANO&lt;/I&gt; LUN. Notice this LUN doesn't have an &lt;I&gt;srm-4&lt;/I&gt; Datastore associated because it's raw. Since the RDM mapping is set to virtual, Web3 has a VMDK pointer (on &lt;I&gt;srm-3&lt;/I&gt;) to the &lt;I&gt;SRM-4-MILANO&lt;/I&gt; raw LUN. &lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P align=justify&gt;It is of paramount importance to understand how all the VMs interact with the Datastores / LUNs because there might be some &lt;U&gt;consistency dependencies&lt;/U&gt; that SRM will have to deal with. In fact, once we have installed SRM as well as the LSI SRA (Storage Replication Adapter), this is what the "Configure Array Managers" window displays: &lt;/P&gt;&lt;IMG height=640 src="http://www.it20.info/misc/pictures/DisasterRecoveryfordummies(withLSI)14.JPG" width=1179 border=0&gt; 
&lt;P align=justify&gt;Have you noticed how the various LUNs get grouped together? The first group includes the &lt;I&gt;srm-3&lt;/I&gt; Datastore as well as the &lt;I&gt;SRM-4-MILANO &lt;/I&gt;because there is a virtual RDM mapping from a VMDK file on srm-3 onto the fourth LUN. So they are somewhat dependent. &lt;/P&gt;
&lt;P align=justify&gt;Similarly, there is another group that includes both &lt;I&gt;srm-1&lt;/I&gt; and &lt;I&gt;srm-2&lt;/I&gt;. That's because there are interdependencies, as you can see from the picture showing the layout of the VM disk configuration: Web1 depends on both the first and the second LUN, so they need to be treated as a single Protection Group (you can't split them, as this would split the VM configuration and break data consistency!). However, now that you have to treat &lt;I&gt;srm-1&lt;/I&gt; and &lt;I&gt;srm-2&lt;/I&gt; as a single Datastore Group, SRM realizes what the other dependencies are. In fact, Web1 is not the only VM that is hosted (partially) on &lt;I&gt;srm-2&lt;/I&gt;: Web2 is hosted on &lt;I&gt;srm-2&lt;/I&gt; and it must be included in the very same Protection Group. This is what you would see from a GUI perspective when selecting this Datastore Group: &lt;/P&gt;&lt;IMG height=640 src="http://www.it20.info/misc/pictures/DisasterRecoveryfordummies(withLSI)15.JPG" width=1179 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;When you select the Datastore or the Datastore Group, SRM automatically displays the VMs that depend on that Datastore or those Datastores. That's a read-only field. Notice you can't select either &lt;I&gt;srm-1&lt;/I&gt; or &lt;I&gt;srm-2&lt;/I&gt; on its own: they are a single entity for SRM. &lt;/P&gt;
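&lt;P align=justify&gt;To make the grouping logic concrete, here is a minimal Python sketch of how such Datastore Groups could be derived from the VM layout of our lab. This is not how SRM or the LSI SRA actually implement it (I have no visibility into that); it simply treats every VM as a link between the storage units it touches and computes the connected groups. The function name and the dictionary layout are made up for illustration.&lt;/P&gt;

```python
from collections import defaultdict

# Lab layout from the text: VM name mapped to the storage it touches.
# Web3 reaches the raw SRM-4-MILANO LUN through its virtual RDM pointer on srm-3.
VM_STORAGE = {
    "Web1": ["srm-1", "srm-2"],
    "Web2": ["srm-2"],
    "Web3": ["srm-3", "SRM-4-MILANO"],
}


def datastore_groups(vm_storage):
    """Group storage units that must fail over together (connected components)."""
    parent = {}

    def find(item):
        # Return the representative of the group that contains item.
        parent.setdefault(item, item)
        while parent[item] != item:
            parent[item] = parent[parent[item]]  # path halving
            item = parent[item]
        return item

    def union(a, b):
        parent[find(a)] = find(b)

    # A VM that spans several storage units ties them into one group.
    for stores in vm_storage.values():
        for store in stores:
            find(store)
        for other in stores[1:]:
            union(stores[0], other)

    # Collect, per group, the storage units and the VMs that depend on them.
    groups = defaultdict(lambda: {"stores": set(), "vms": set()})
    for vm, stores in vm_storage.items():
        root = find(stores[0])
        groups[root]["vms"].add(vm)
        groups[root]["stores"].update(stores)
    return sorted((sorted(g["stores"]), sorted(g["vms"])) for g in groups.values())


for stores, vms in datastore_groups(VM_STORAGE):
    print(stores, "must be protected together; dependent VMs:", vms)
```

&lt;P align=justify&gt;With the layout above the sketch yields two groups: &lt;I&gt;srm-1&lt;/I&gt; plus &lt;I&gt;srm-2&lt;/I&gt; (Web1 and Web2), and &lt;I&gt;srm-3&lt;/I&gt; plus the raw &lt;I&gt;SRM-4-MILANO&lt;/I&gt; LUN (Web3), which matches what the Array Managers window shows.&lt;/P&gt;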
&lt;P align=justify&gt;What we did from here is simple. We created two Protection Groups on the SRM instance hosted on the production site (Milano). These PGs build on top of the &lt;I&gt;srm-1&lt;/I&gt; / &lt;I&gt;srm-2&lt;/I&gt; Datastore Group and the &lt;I&gt;srm-3&lt;/I&gt; Datastore (which includes the RDM on the fourth LUN). Subsequently, we created a Recovery Plan on the DR site (Roma) which contains the failover instructions for these two Protection Groups. That's it. &lt;/P&gt;
&lt;P align=justify&gt;Our production site is now protected. What we need to do next is "Test" our Recovery Plan. One of the advantages of SRM is that it has built-in intelligence to simulate a DR. Obviously this process is not (and should not be) disruptive: you want to keep the replication of the LUNs in place and avoid shutting down the VMs in production to run this test. How do we do so? It's easy. Let's push the &lt;B&gt;Test&lt;/B&gt; button on the SRM GUI and go through the plan.&lt;/P&gt;
&lt;P align=justify&gt;The trick here is that you want to create a dedicated environment (from a storage and network perspective) that doesn't interfere with the production environment. As soon as the test starts, a snapshot of the replicated LUNs is created (at least those that are in the Protection Group associated to the Recovery Plan that is being tested). It's conceptually identical to what we have already done with a manual snapshot (see above), but this time it is SRM that instructs the LSI SRA (Storage Replication Adapter) to create the snapshots and the SRA in turn talks natively to the LSI devices to do so. The SRA is basically the driver that SRM uses to communicate with the actual storage subsystem. You can see the snapshots being created in the next picture: &lt;/P&gt;&lt;IMG height=640 src="http://www.it20.info/misc/pictures/DisasterRecoveryfordummies(withLSI)16.JPG" width=1179 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;&lt;FONT face=Arial color=#0000ff&gt;*************&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Background information that you need to understand and be familiar with before you move on&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; *********************************&lt;/FONT&gt;&lt;/P&gt;
&lt;P align=justify&gt;VMware SRM is configured by default to set the EnableResignature parameter to 1 (that means TRUE) on each of the hosts in the receiving cluster. This means that, independent of the behavior you configured on the hosts, SRM will always re-sign the LUNs when they are imported into the remote cluster in the DR site. This will cause the Datastores on those LUNs to be renamed with the (in)famous naming convention &lt;I&gt;snap-xxxx-&amp;lt;Original Datastore Name&amp;gt;&lt;/I&gt;. &lt;/P&gt;
&lt;P align=justify&gt;If you want to keep things clear and "human readable," you can change the SRM configuration to rename the Datastores back to their original names. This is achieved through the SRM configuration file &lt;I&gt;vmware-dr.xml&lt;/I&gt;, located in the &lt;I&gt;C:\Program Files\Site Recovery Manager\Config&lt;/I&gt; directory of the SRM server in the DR site. You have to identify the line &lt;/P&gt;
&lt;P align=justify&gt;&amp;lt;fixRecoveredDatastoreNames&amp;gt;false&amp;lt;/fixRecoveredDatastoreNames&amp;gt;&lt;/P&gt;
&lt;P align=justify&gt;and modify it to: &lt;/P&gt;
&lt;P align=justify&gt;&amp;lt;fixRecoveredDatastoreNames&amp;gt;true&amp;lt;/fixRecoveredDatastoreNames&amp;gt;&lt;/P&gt;
&lt;P align=justify&gt;Thanks to Duncan E. and Mike L. for their &lt;A href="http://www.yellow-bricks.com/2009/04/18/srm-automatically-rename-your-datastore-back/"&gt;research&lt;/A&gt;.&lt;/P&gt;
&lt;P align=justify&gt;It's important to understand that this will not change the value of the EnableResignature parameter back to 0. In fact the LUN will be re-signed anyway, but SRM will take an extra step to rename the Datastore back to its original name (effectively just deleting the &lt;I&gt;snap-xxxx&lt;/I&gt; portion of the new Datastore name).&lt;/P&gt;
&lt;P align=justify&gt;Not being an expert on this, I can only think that doing so is important when you want to maintain a decent naming convention, especially when you consider that a failback onto the production site would cause SRM to rename the Datastore into something like &lt;I&gt;snap-xxxxx-snap-yyyyyy&amp;lt;Original Datastore Name&amp;gt;&lt;/I&gt; (which is indecent in my opinion). Apparently it would have been easier for SRM to configure the host to allow snapshot LUNs (DisallowSnapshotLUN = 0) and not bother in the first place with the resignature and the rename. But if VMware decided to do so, there must be other (hopefully good) reasons.&lt;/P&gt;
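&lt;P align=justify&gt;Just to illustrate what this rename boils down to, here is a small Python sketch that strips the &lt;I&gt;snap-xxxx-&lt;/I&gt; prefix (or a stack of them, as after a failback) from a Datastore label. It is an illustration only, not SRM code: I am assuming that the &lt;I&gt;xxxx&lt;/I&gt; part is a run of hexadecimal digits followed by a hyphen, and the function name is made up.&lt;/P&gt;

```python
import re

# Assumption: a re-signed label is "snap-" + hex digits + "-" + the original name.
# The exact label format is a guess based on the snap-xxxx- convention in the text.
SNAP_PREFIX = re.compile(r"^(?:snap-[0-9a-fA-F]+-)+")


def original_datastore_name(resigned_name):
    """Strip one or more leading snap-xxxx- prefixes from a Datastore label."""
    return SNAP_PREFIX.sub("", resigned_name)


# Hypothetical labels: one after a failover, one after a failover plus a failback.
for label in ("snap-0a1b2c3d-srm-1", "snap-1111-snap-2222-srm-2"):
    print(label, "becomes", original_datastore_name(label))
```

&lt;P align=justify&gt;A Datastore whose real name happens to start with &lt;I&gt;snap-&lt;/I&gt; followed by hex digits would be mangled by this pattern, so treat it as a sketch of the idea and nothing more.&lt;/P&gt;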
&lt;P align=justify&gt;&lt;FONT face=Arial color=#0000ff&gt;************************************************************************************************************************************************************&lt;/FONT&gt;&lt;/P&gt;
&lt;P align=justify&gt;Having said this, we have the background to understand the next picture, which outlines the storage configuration on the cluster at the DR site in Roma: &lt;/P&gt;&lt;IMG height=640 src="http://www.it20.info/misc/pictures/DisasterRecoveryfordummies(withLSI)17.JPG" width=1179 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;The Datastores have been imported with their original names due to the change in the &lt;I&gt;vmware-dr.xml&lt;/I&gt; file. The UUIDs of the Datastores, however, have changed since they have been re-signed. This is not a problem for SRM because the "place-holder vmx files" that are kept at the DR site do not contain any reference to the disk configuration of the VM. The Datastores are parsed during the execution of the Recovery Plan and the correct disks (with the actual UUIDs) get included in the final vmx prior to the startup of the VM.&lt;/P&gt;
&lt;P align=justify&gt;Notice that the production VMs are being started off the snapshots that the LSI SRA has created and they are now connected to a so-called "Bubble Network." The Bubble Network is a standard VMware Virtual Switch with no Physical NICs connected to it that gets created for the time of the test. This allows the system administrator to test the restart of a copy of the VMs (currently running in production) without bothering about potential network conflicts. Of course at this time, the replica between the primary and DR sites is still in place and we are still fully protected from a potential disaster. &lt;/P&gt;
&lt;P align=justify&gt;The test is being executed, and apparently everything has been running smoothly. At this point, SRM pauses for the system administrator to make an evaluation of the test (notice in the SANtricity Storage Manager how the snapshots also have been automatically mapped to the cluster):&lt;/P&gt;&lt;IMG height=640 src="http://www.it20.info/misc/pictures/DisasterRecoveryfordummies(withLSI)18.JPG" width=1179 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;Once the administrator is done with the checks he/she can push the "Continue" button, which essentially rolls back the Test. This, in a nutshell, includes shutting down the VMs in the DR site and deleting the snapshots taken from the replicated LUNs. Everything is now back to normal for the next Test to run (or a disaster to recover from).&lt;/P&gt;
&lt;P align=justify&gt;&lt;B&gt;Site Recovery Manager Setup (Run the Recovery Plan) &lt;/B&gt;&lt;/P&gt;
&lt;P align=justify&gt;Running the Recovery Plan is different than testing the Recovery Plan. The most important difference is that SRM doesn't create snapshots of the replicated LUNs; rather it uses the replicated LUNs directly. The other difference is that the VMs on the recovery site are connected to the actual physical network and no longer to the "Bubble Network" that is used in the Test. Everything else is pretty similar to what we have seen already. &lt;/P&gt;&lt;IMG height=640 src="http://www.it20.info/misc/pictures/DisasterRecoveryfordummies(withLSI)19.JPG" width=1179 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;As you can see, SRM instructed the LSI SRA to reverse the roles of the mirroring: now the LUNs on the DS4800 (the storage server at the DR site in Roma) are "Active" and get replicated onto the "Passive" LUNs on the DS4700 in Milano. Most likely this is not what would happen in a real life disaster. In that case, the DS4700 would probably not be available (due to the disaster) so SRM would only activate the replicas on the DS4800 in the DR site. &lt;/P&gt;
&lt;P align=justify&gt;At this point the VMs would be restarted on the cluster in Roma similarly to what happened in the Test scenario (with the exception that they would connect to the actual physical network since they are restarting there to really take over). Remember this is no longer a Test, it's a &lt;I&gt;real Run&lt;/I&gt; of a &lt;I&gt;real Recovery Plan&lt;/I&gt;. &lt;U&gt;Doing this on a production environment will have devastating results!&lt;/U&gt;&lt;/P&gt;
&lt;P align=justify&gt;At the end of the process, all production VMs (Web1, Web2 and Web3) would be running on the VI3 cluster in Roma which now effectively can be considered the new production site. &lt;/P&gt;
&lt;P align=justify&gt;&lt;B&gt;Failback&lt;/B&gt;&lt;/P&gt;
&lt;P align=justify&gt;Failback is a nightmare, at least in my opinion. Unfortunately there is no "Failback Button" on the SRM console. However, you could work on the VMware consoles to create a Recovery Plan that will move all the VMs currently running on the DR site (Roma, for us) onto the original production site (Milano, in our case). Rather than a real failback, I think it's more appropriate to define this as a new failover plan that happens to bring the workloads back to their original positions. VMware has published a useful &lt;A href="http://www.vmware.com/pdf/srm_10_eval_guide.pdf"&gt;document&lt;/A&gt; that, in chapter 6, describes the steps to fail back from an SRM failover. It's a good read. There is only one caveat in that paper that would need further investigation: at some point in the failback process it's suggested to set the DisallowSnapshotLUN parameter to 0 on the hosts in the original site (that would be the hosts in Milano, in our case). This means that when the storage is brought back to the original place, the ESX hosts on the original production site would be able to import the Datastores without re-signing them. Since this is done via SRM, it is inconsistent with the behavior we have noticed during the failover. SRM seems to automatically set (on the fly) EnableResignature to 1 on the hosts where the LUNs are being re-activated, effectively forcing the hosts to re-sign the volumes - and thus making the DisallowSnapshotLUN setting irrelevant. Further investigation would be required to nail down this inconsistency between the documentation and the behavior we have noticed. &lt;/P&gt;
&lt;P align=justify&gt;Massimo. &lt;/P&gt;
&lt;P align=justify&gt;&lt;SPAN class=Apple-style-span&gt;&lt;SPAN class=Apple-style-span&gt;&lt;FONT face="Times New Roman" size=3&gt;P.S. FOR DUMMIES® is a registered trademark of Wiley Publishing, Inc.&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;img src="http://it20.info/aggbug.aspx?PostID=243" width="1" height="1"&gt;</description></item><item><title>Xeon 5500 (aka Nehalem) Marks the Death of Itanium (and More)</title><link>http://it20.info/blogs/main/archive/2009/04/24/221.aspx</link><pubDate>Fri, 24 Apr 2009 20:03:00 GMT</pubDate><guid isPermaLink="false">3066da22-6b27-4cf1-aa0f-2eff79b21f87:221</guid><dc:creator>Massimo</dc:creator><slash:comments>6</slash:comments><comments>http://it20.info/blogs/main/comments/221.aspx</comments><wfw:commentRss>http://it20.info/blogs/main/commentrss.aspx?PostID=221</wfw:commentRss><description>&lt;P align=justify&gt;The last day of March 2009 Intel officially unveiled its brand new Nehalem core architecture under the &lt;I&gt;Xeon 5500&lt;/I&gt; product name umbrella. There is not much to say about it other than it's impressive from a performance perspective. Just to give you a sense of what we are talking about the new product - only available for 2-socket servers today and with up to 4 cores per socket - has published many benchmark numbers that are either on par or slightly better than 4-socket Intel based servers with up to as many as 24 cores. One might wonder why a successful (and clever) company like Intel is going to cannibalize their highly profitable multi-socket market with a lower profitable product such as the 5xxx Xeon series. And I think the answer to this question is in one of the slides they used to present Nehalem at the launch event: &lt;/P&gt;&lt;IMG height=377 src="http://www.it20.info/misc/pictures/Xeon5500marksthedeathofItanium1.jpg" width=739 border=0&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;These numbers are impressive, but I am pretty sure that if SUN and IBM marketing people were ever able to read the small text at the bottom (which seems to be technically impossible) they would come up with something to counter those numbers, as they are obviously presented in a way that favors Intel; however, I can't be certain, as I can't read the text myself and so I don't know the assumptions behind those numbers. What is important in this chart, however, is not the numbers (we know Nehalem has impressive performance per core) but the fact that Intel is now using Xeon to go after a 20+ Billion $ UNIX market. Up until now - and for the last 10 years - they have been using &lt;I&gt;Itanic&lt;/I&gt; (ehm... I mean Itanium... sorry for the typo) to go after the IBM Power or the SUN Sparc processors to get a slice of the Unix pie. This doesn't seem to be the case any longer. One might wonder where Itanium falls into all this: good question. &lt;/P&gt;
&lt;P align=justify&gt;A bit of history on Itanium might help. Originally the Intel vision for the 64-bit Itanium was that it would be the x86 32-bit follow-on product: basically the replacement for the Xeon brand. And they might have had a chance to succeed if AMD hadn't come out with a much smarter evolution for x86 32-bit processors: in case you are wondering, that would be an x86 64-bit architecture (namely AMD Opteron). When Intel understood they couldn't fight the Opteron with Itanium - since Opteron was 100% backward compatible with the Xeon software available whereas Itanium was basically not and would have required massive and painful application porting - they decided to introduce the same "enhancements" to their Xeon processors. This was initially referred to by Intel as &lt;I&gt;x86-32e&lt;/I&gt;: obviously they couldn't say Xeon was 64-bit as it would have overlapped too much with Itanium, so they preferred to stay with the ridiculous definition of "32-bit Extended". This was the time when they tried to pitch Itanium as the only "native" 64-bit processor whereas the Xeon (as well as the Opteron obviously) were "just extensions to current 32-bit architectures". And this is when they shot themselves in the foot: they tried to play with words (i.e. &lt;I&gt;native&lt;/I&gt; sounds better than &lt;I&gt;extended&lt;/I&gt;) but they forgot that, as far as IT is concerned,&amp;nbsp; &lt;I&gt;native&lt;/I&gt; means you have to port the application whereas &lt;I&gt;extended&lt;/I&gt; means it's compatible. So, for most of the customers, eventually &lt;U&gt;&lt;I&gt;extended&lt;/I&gt;&lt;/U&gt; sounded much (much!) better than &lt;U&gt;&lt;I&gt;native&lt;/I&gt;&lt;/U&gt;. And this is when the perception of Itanium started to decline. I did a presentation at an IBM System x Symposium in France back in 2004 where I shared these thoughts. 
Interestingly enough, at that time we had an Itanium based System x box in our portfolio - the x455 - and I basically implied that Itanium (hence the x455) was at a dead-end and a useless product given the historical context we were facing. This is for example a chart that I used in 2004 to predict that Windows on Itanium had no real place and didn't make any sense at all; it took a while but I think MS now thinks along the same lines: &lt;/P&gt;&lt;IMG height=530 src="http://www.it20.info/misc/pictures/Xeon5500marksthedeathofItanium2.jpg" width=739 border=0&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;Funnily enough, there was an Intel representative in the room who apparently didn't like these messages, and he decided to escalate and complain about my pitch up my management line, all the way to the General Manager of the IBM Systems and Technology Group (that reported directly to Lou Gerstner - CEO of IBM at that time). I have never been officially involved in this &lt;I&gt;complaint&lt;/I&gt; but the fact is that, later in the year, we dropped the x455. I like to think I gave a hint to the product marketing team on what to do, but more likely what I said in the session was a blessing from the field for what product management was going to do anyway (and for very good business reasons). For your information I have posted the entire Power Point deck in the &lt;I&gt;Files&lt;/I&gt; section of my site if you want to have a look. You can download it &lt;A HREF="/files/3/documentation/entry220.aspx"&gt;here&lt;/A&gt;.&lt;/P&gt;
&lt;P align=justify&gt;To make a long story short, Intel had nothing left to do but re-position Itanium as a high-end RISC replacement with the help of HP which, confident in its value and roadmap, decided to completely drop their own RISC offering - the HP PA-RISC processor - and jump onto the Intel Itanium processor as a strategic replacement. Intel tried to position Itanium as an open platform, mentioning they had dozens of OEMs offering servers based on that processor, but they usually forget to mention that the vast majority of the sales numbers they were seeing were coming from HP, which is the only tier 1 server vendor today offering such a processor (IBM and Dell used to but they withdrew it, and SUN never even attempted to). &lt;/P&gt;
&lt;P align=justify&gt;As Xeon (and the AMD Opteron) became more and more enterprise-ready, the Itanium potential started to shrink even further. And now Nehalem seems to be the last nail in the Itanium coffin. Consider also that the first Nehalem incarnation is a CPU model for 2-socket servers (Xeon 5xxx). This might leave the impression that Itanium can address a much larger window as it shines on highly scalable boxes. The truth is that this is only the first product iteration based on the Nehalem core. Later in the year Intel will announce a multi-socket Nehalem based CPU&amp;nbsp;- aka Nehalem EX - capable of scaling up to 8 sockets (Xeon 7xxx series). This CPU will feature 8 cores and Hyper-Threading, thus providing execution support for 128 simultaneous threads (8 sockets x 8 cores x 2 threads) in a single system image. Last but not least, this new CPU will also feature additional enterprise functionalities such as MCA (Machine Check Architecture), which was one of the few things Intel used to position Itanium as "more enterprise" than Xeon. On paper a system like this could address 99.9% of customers' requirements. This statement refers to performance, and we all know that performance is just one aspect of platform selection. This will cause some adjustments in the server market shares, and this goes back to the fact that Intel is apparently cannibalizing their current high-end market. Most likely what they have in mind, instead, is that they want to push the bar further and enter even more aggressively into the UNIX market with a more appealing and serious offering (than Itanium) like Xeon. The idea is: I will cannibalize a profitable high-end x86 market today, which is worth a few B$, with a lower-end and less profitable product, because I want to use its big brother (Nehalem EX) to go after a 20B$ UNIX market. 
Since a picture is worth 1000 words this is what I am trying to say: &lt;/P&gt;&lt;IMG height=573 src="http://www.it20.info/misc/pictures/Xeon5500marksthedeathofItanium3.jpg" width=739 border=0&gt;
&lt;P&gt;Note that I am not implying this is what I think will happen. As I said, performance is just one metric in platform selection. I am only speculating on the view that Intel has going forward. I am not completely ruling out (either) that this view has a point given what's going on, and if this happens it will impact not only Itanium in the RISC space but other UNIX platforms as well.&lt;/P&gt;
&lt;P align=justify&gt;&lt;STRIKE&gt;Back to the Itanium discussion, last but not least it's worth mentioning that there is going to be a convergence in the Itanium Tukwila time frame (unsurprisingly delayed again) where you can drop this new CPU into a Nehalem standard socket&lt;/STRIKE&gt; (see the &lt;I&gt;Update&lt;/I&gt; below). Intel has always pictured this flexibility as a means to lower Itanium development costs and make it more flexible/cheap for customers and OEMs to move from Xeon to Itanium. The reality is that at the end of the day you end up having a common system, with the same components, with the same CPU socket. At that point you'll have the choice of installing either a cheap, super fast Nehalem processor with an unmatched choice of OS flavours and ISV applications... or installing a more expensive, somewhat slow Itanium Tukwila processor with an embarrassingly limited choice of OSes and ISV applications (at least compared to the Xeon family). I am pretty sure there are some HP execs who regret porting HP-UX onto Itanium rather than onto the x86 architecture - if only they had known 10 years ago what the x86 architecture would look like 10 years later.&lt;/P&gt;
&lt;P align=justify&gt;It's well known that not only did Itanium fail to bring any profit, but its development costs have been impressive and were never recovered by its slow sales. In a word, Intel has lost tons of money on Itanium. Having said this, there are obviously a number of issues that prevent Intel from immediately dropping the dead processor: for example contracts that they have signed with "these dozens of OEMs" - and one in particular which I won't mention (again) - that dropped their in-house developed CPU architecture to jump on Itanium. They cannot just say "hey we are dropping Itanium" and leave these vendors in the mud (especially one). So I guess it's fair to say that, officially, Itanium is alive and healthy; you can imagine what the reality is. &lt;/P&gt;
&lt;P align=justify&gt;Massimo.&lt;/P&gt;
&lt;P align=justify&gt;&lt;FONT color=#ff0000&gt;Update (10th June 2009): while Tukwila and Nehalem EX will share the same QPI bus the sockets of the two processors will continue to remain incompatible for the moment.&lt;/FONT&gt; &lt;/P&gt;&lt;img src="http://it20.info/aggbug.aspx?PostID=221" width="1" height="1"&gt;</description></item><item><title>Cisco UCS: there is something I am still missing</title><link>http://it20.info/blogs/main/archive/2009/03/31/203.aspx</link><pubDate>Mon, 30 Mar 2009 23:18:00 GMT</pubDate><guid isPermaLink="false">3066da22-6b27-4cf1-aa0f-2eff79b21f87:203</guid><dc:creator>Massimo</dc:creator><slash:comments>3</slash:comments><comments>http://it20.info/blogs/main/comments/203.aspx</comments><wfw:commentRss>http://it20.info/blogs/main/commentrss.aspx?PostID=203</wfw:commentRss><description>&lt;P align=justify&gt;On Monday 16th Cisco unveiled its &lt;A href="http://newsroom.cisco.com/dlls/2009/prod_031609.html"&gt;Unified Computing System (UCS)&lt;/A&gt;. A few days ago I have been briefed by some local Cisco guys about the product (err, the architecture as they stressed). I assume that people reading this post know what Cisco is doing and are familiar with the announcement. In a nutshell they have announced a new &lt;I&gt;thing&lt;/I&gt; which is a mix of hardware (primarily) and software that is comprised of the following:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P align=justify&gt;their &lt;I&gt;Unified Fabric&lt;/I&gt; technology (as it can be found in other products like the Nexus family of switches)&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;their new Blade technology&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;their Management technology (which is an OEM and supposedly customized version of the BMC BladeLogic software)&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P align=justify&gt;Consider there is not a lot of information available at the moment so most of the discussions are based on preliminary - and poor - initial documentation. This picture explodes the pieces and it's one of the few diagrams that is being shared by Cisco at this stage: &lt;/P&gt;&lt;IMG height=497 src="http://www.it20.info/misc/pictures/Cisco-UCS-there-is-something-I-am-still-missing1.JPG" width=631 border=0&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;Never mind that I work for IBM and many of my colleagues see this as a potential threat to our server hardware business (which I am sure is the case). In the final analysis I am a technology geek and that's how I run this personal blog. What I write here is my own unbiased (believe it or not) personal opinion. &lt;/P&gt;
&lt;P align=justify&gt;I must admit I am fascinated by what Cisco is trying to achieve here. Ideally it sounds like a very compelling solution and something that anyone should be seriously evaluating for virtualization deployments. Having said this, as for all things in life - none excluded - there are pros and cons. I am not going to spend time talking about the pros as they are obvious and Cisco is certainly going to explain those to you in detail. These include, for example, the potential benefits of the Unified Fabric, which are enormous. I believe end-users reading this blog would be better served, at this point, by someone who starts to highlight the (potential) challenges of designing and implementing such a vision and architecture. This is done to balance the flow of "pros" you will be flooded with. Note this is nothing new on this blog: when VMware announced VMware 3i I wrote an &lt;A HREF="/blogs/main/archive/2007/09/19/48.aspx"&gt;article&lt;/A&gt; on the misleading marketing information that was associated with it; similarly I have done a &lt;A HREF="/blogs/main/archive/2007/12/29/86.aspx"&gt;reality check&lt;/A&gt; for VMware Site Recovery Manager to underline its deficiencies rather than magnifying its excellences (that's what the VMware marketing is paid for). &lt;/P&gt;
&lt;P align=justify&gt;This is exactly what I'd like to do here with this new article: I'd like to underline the challenges that Cisco is facing. However I don't want to do that from a &lt;I&gt;competitor blade vendor &lt;/I&gt;perspective (that's what the Dell/IBM/HP marketing organizations are for), but rather from a VMware virtualization expert (vExpert) perspective based on feedback from the field and various customers' projects I have been involved in now and in the past. &lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;&lt;I&gt;(Physically) Unified Fabric? No, Divide et Impera!&lt;/I&gt;&lt;/P&gt;
&lt;P align=justify&gt;Cisco is trying to capture a potential convergence in the datacenter. This is a process that started early in the 21st century when the major server vendors started to ship blade form factors: those blade chassis in fact integrate both Ethernet and Fibre Channel switches as well as compute nodes (i.e. blade servers). This wasn't an easy thing to do in organizations with very strong vertical specializations (and politics!) in the data center. That's why we still see an exaggerated number of "pass-through" technologies being used on blade chassis that basically externalize the thousands of Ethernet and Fibre Channel ports of each blade. This diminishes the intrinsic value of the blade technologies; however, it allows the blades to be connected to the legacy infrastructure switches. Most of the time in fact this is not done for technical reasons but merely for political reasons: "The server guys are responsible for servers, that's it; the network guys have their own infrastructure and that's (physically) separated from servers....". This is what usually happens with big organizations. I have been through that many times. &lt;/P&gt;
&lt;P align=justify&gt;Having said this, I support the Cisco message: what these big accounts are doing is very inefficient and there is room for a huge optimization if they could possibly get the internal political issues resolved. However I think this is one of the problems Cisco is going to face in promoting their Unified Fabric technologies. In reality this situation is exacerbated by the fact that we are talking about a convergence of IP and Storage networks, so even more politics are involved. &lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;&lt;I&gt;Unified Fabric, Weak security?&lt;/I&gt;&lt;/P&gt;
&lt;P align=justify&gt;Once we get passed the physical consolidation concerns I have discussed above and the customers have accepted to position the switches in a non conventional location (i.e. closer to the servers than to the infrastructure) Cisco might face another concern related to security. As a background, this will of consolidating and reducing the cabling complexity that each VMware ESX server has associated is nothing new. I have discussed this very exact topic back in 2007 in the article &lt;A HREF="/blogs/main/archive/2007/10/30/75.aspx"&gt;"Infiniband Vs 10Gbit Ethernet... with an eye on virtualization"&lt;/A&gt;. As you might see from the picture in the post (which I am attaching hereafter for your convenience) InfiniBand was supposed to deliver the same concept of I/O virtualization that is being evangelized by Cisco with their Unified Fabric:&lt;/P&gt;&lt;IMG height=497 src="http://www.it20.info/misc/pictures/Cisco-UCS-there-is-something-I-am-still-missing2.JPG" width=631 border=0&gt;
&lt;P&gt;This is very similar to the latest Cisco Nexus value proposition (and hence to this UCS announcement, as it's based on the Nexus core technology). No matter if it's InfiniBand or 10Gbit Unified Fabric, the biggest problem with this layout and architecture - as reported by customers and VMware network security experts in the forum threads linked below - is that each ESX server has a number of network security zones that best practices would require to be kept separate from each other. Many customers achieve this by creating network security zones (i.e. for the ConsoleOS, VMotion, iSCSI, VMs etc) by means of &lt;U&gt;physically different network adapters&lt;/U&gt; that connect to &lt;U&gt;physically separated network switches&lt;/U&gt;. For these customers VLAN and PortGroup technologies are not usually a viable option as they don't guarantee the level of security and separation these customers need. In the picture above the critical issue is that these physically and logically separated network segments need to collapse into a single Bridge/Switch for the whole I/O virtualization to work (be it InfiniBand or Cisco Unified Fabric). &lt;/P&gt;
&lt;P&gt;Last but not least, consider that this discussion is multidimensional. Not only is Cisco trying to unify all the different IP segments on the same wire - as already discussed - but they are also trying to unify both IP traffic and Fibre Channel traffic on the same wire (by means of a new technology called FCoE or &lt;I&gt;Fibre Channel over Ethernet&lt;/I&gt;). Obviously this additional dimension adds even more potential security concerns than "simply" collapsing heterogeneous network security zones. There have been a number of interesting discussions on the VMware forum that I highly encourage you to read if you are interested in the matter. You can find them &lt;A href="http://communities.vmware.com/thread/183114"&gt;here&lt;/A&gt; and &lt;A href="http://communities.vmware.com/message/1193931"&gt;here&lt;/A&gt;.&amp;nbsp; &lt;/P&gt;
&lt;P align=justify&gt;This is going to be another challenge for Cisco.&lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;&lt;I&gt;Unified Computing? More like &lt;U&gt;Partially&lt;/U&gt; Unified Computing&lt;/I&gt; &lt;/P&gt;
&lt;P align=justify&gt;I don't really get the Cisco message here. I have already talked about how I see the technology trends in this industry; in a nutshell what's happening is that data centers are being transformed from vertical silos of servers and storage that (statically) support applications into pools of physical resources that can be used when they are needed. You can read more about these trends in this other &lt;A HREF="/blogs/main/archive/2008/11/14/162.aspx"&gt;article&lt;/A&gt; I wrote. The picture in the original post doesn't call out one important element of the architecture, which is the network: I didn't call it out because it was obviously there, but let's refine that diagram to draw the complete picture of the elements that comprise a virtualized data center. &lt;/P&gt;&lt;IMG height=497 src="http://www.it20.info/misc/pictures/Cisco-UCS-there-is-something-I-am-still-missing3.JPG" width=631 border=0&gt;
&lt;P&gt;A properly designed and innovative x86 virtualized data center requires these 4 distinct elements: 
&lt;UL&gt;
&lt;LI&gt;A Shared Server infrastructure&lt;/LI&gt;
&lt;LI&gt;A Shared Network infrastructure &lt;/LI&gt;
&lt;LI&gt;A Shared Storage infrastructure &lt;/LI&gt;
&lt;LI&gt;The Virtualization software (which is the glue that ties together all these components)&lt;/LI&gt;&lt;/UL&gt;
&lt;P align=justify&gt;Note: in a traditional virtual infrastructure the storage network (be it Fibre Channel or Ethernet) is physically separated from the IP network (which is typically Ethernet). In the context of the Unified Fabric there is a single network (based on 10Gbit technologies) that carries both storage and IP.&amp;nbsp;This doesn't really change the idea of the diagram above; it actually reinforces the message, in that the Shared Network is also shared from a "protocol being carried" perspective.&lt;/P&gt;
&lt;P align=justify&gt;One of the challenges customers have today is that these 4 elements are managed and operated by different vertical (and specific) management tools: you have to use vCenter to manage VMware, the server tools to manage the Shared Server infrastructure, specific tools to manage and operate the Network infrastructure and ultimately specific GUIs to manage the shared disk space. This is not, by the way, a negative thing per se because it allows a customer to switch from one vendor to another at any level they want, thus avoiding lock-in. This concept has historically been at the very foundation of any x86 deployment and is one of the most important aspects that determined - and still determines - the success of this platform.&lt;/P&gt;
&lt;P align=justify&gt;The point I am trying to make is that Cisco has "unified" only two of these four elements with their offering, namely Servers and Network: &lt;/P&gt;&lt;IMG height=497 src="http://www.it20.info/misc/pictures/Cisco-UCS-there-is-something-I-am-still-missing4.JPG" width=631 border=0&gt;
&lt;P&gt;
&lt;P align=justify&gt;What is this going to mean for customers from a "unification" perspective? Very little, I think. Consider also that the servers themselves, frankly speaking, are probably the most commoditized of the four elements from a management perspective, simply because management standardization (such as &lt;A href="http://en.wikipedia.org/wiki/Intelligent_Platform_Management_Interface"&gt;IPMI and BMC&lt;/A&gt;) is allowing third parties to build an x86 management layer into their own products. A typical example of this is, funnily enough, the VMware effort to create a CIM-based interface to manage standard x86 servers (this implementation first appeared in ESXi and it's now available in the standard ESX version). This is an &lt;A href="http://www.virtuallifestyle.nl/2009/01/enabling-cim-on-esxi/"&gt;example&lt;/A&gt; of this concept:&lt;/P&gt;&lt;IMG height=497 src="http://www.it20.info/misc/pictures/Cisco-UCS-there-is-something-I-am-still-missing5.png" width=631 border=0&gt;
&lt;P&gt;
&lt;P align=justify&gt;I certainly don't want to downplay the challenges associated with managing a server farm but, if you ask me, extending an existing tool to add functions that properly manage an x86 server deployment is not something that deserves a technology Nobel prize, so to speak. Ironically, VMware is "unifying" Virtualization with server management whereas Cisco is "unifying" Network with server management. Not the holistic unification that is being discussed in the marketing announcements, though. &lt;/P&gt;
&lt;P align=justify&gt;Similarly to the "unified" management concept above, building a brand new x86 blade is a relatively easy task compared to building a brand new Storage subsystem or a brand new Virtualization software infrastructure element (ask Microsoft). So I am starting to wonder why they have chosen to (partially) "unify" starting from the easiest of the four elements. Here I am assuming that the innovative characteristics of their blades are either easily achievable by long-standing tier 1 server vendors (Dell, HP, IBM, Sun) or are not strictly necessary as of today: the speculated 500+GB of memory support per Cisco blade seems cool but I question the need for something like this given the current &lt;A HREF="/blogs/main/archive/2007/11/26/83.aspx"&gt;well-known rules of thumb&lt;/A&gt; for sizing ESX hosts. Sure, Nehalem will change these numbers, but even assuming a doubling of the amount of RAM required for a 2S/8Core system we are far, far away from the 500+GB Cisco specs. &lt;/P&gt;
&lt;P align=justify&gt;Moreover, Cisco has clearly stated that they want to leave the Software Virtualization as well as the Shared Storage elements &lt;I&gt;open&lt;/I&gt;. I don't want to provide more details here as I am not sure about the level of confidentiality associated with the info I have, but the key point is that they don't have a strategy that calls for a single Virtualization vendor or a single Storage vendor. Enough for now. And this again leads me to wonder what sort of "unification" this is all about. What I have learned, basically, is that you can buy UCS and use, now or in the future, your storage vendor of choice - with the management framework that comes with it - as well as your virtualization software of choice - again with the management framework that comes with it. You have to do this with all the benefits and challenges that end-users experience today in aggregating and integrating different vendors to create the &lt;I&gt;ultimate virtualized infrastructure&lt;/I&gt;. &lt;/P&gt;
&lt;P align=justify&gt;Don't get me wrong. I am a fan of this Unified Fabric concept and I hope it will take off, as it will solve many of the challenges enterprise customers face in managing the distributed infrastructure. There is lots of information available on the web, as I said, on the benefits of implementing this highly consolidated and "intelligent" fabric. &lt;A href="http://virtualgeek.typepad.com/virtual_geek/2009/03/interesting-dialog-on-the-cisco-ucs-stuff-and-a-bit-of-detail.html"&gt;This&lt;/A&gt; post from Chad Sakac (with EMC), for example, discusses some of these benefits.&lt;/P&gt;
&lt;P align=justify&gt;What I am questioning is this Cisco move to extend their value proposition from the Unified Fabric into a market (x86 blades) that isn't really adding any benefit to their &lt;I&gt;unification story&lt;/I&gt;. Reading through Chad's excellent post I can't really pin down what is unique about it compared with doing something like what he describes using alternative components such as Dell / HP / IBM / Sun servers and Dell / EMC / HP / IBM / NetApp / Sun storage, all tied together with the Cisco Nexus technology, which remains the real Cisco value-add in this context. That's what I am missing. &lt;/P&gt;
&lt;P align=justify&gt;That's the question I asked during the session a few days ago: what's in it for customers if they use a Cisco UCS infrastructure compared to an IBM BladeCenter + Cisco Nexus infrastructure? Granted, Nexus switches for the IBM BladeCenter do not exist today, so this is a hypothetical question. Sure, they have this "integrated management" framework, but what's the value in it if all it does is manage a subset of the entire infrastructure? Customers will still be forced to deal with a number of vertical management pieces to operate the infrastructure end-to-end.&amp;nbsp; &lt;/P&gt;
&lt;P align=justify&gt;I am missing it, unless there is some sort of grand plan behind the scenes to make the EMC and Cisco pair "more tied" (whatever that means). How about an "EMCisco"? I am going to copyright this term: a brief search on the Internet didn't find any results for this term used in the IT context (although apparently there is a DJ called EMCisco). This single IT &lt;I&gt;entity&lt;/I&gt; would, in fact, be able to provide an end-to-end infrastructure comprised of virtualization software, network, servers and storage, and they would be able to really integrate the whole thing into a single management and operational framework with potentially much deeper integration (beyond the standard public APIs that interconnect the four different elements). The interesting part is that, as I said, the x86 server market - and its surroundings - is literally modular and no customer that I know would be willing to be locked in in such a way (unless there are compelling reasons to do so - which I am not ruling out). &lt;/P&gt;
&lt;P align=justify&gt;The bottom line is that, if I were malicious, I would be led to think that today Cisco is more interested in getting a slice of the 30B+ US$ x86 server market - on top of what they can do with their Unified Fabric solutions - through the development and integration of the most commoditized of the four elements. I can easily see what's in it for Cisco: easy additional money. I can't really see, so far, what's in it for customers.&lt;/P&gt;
&lt;P align=justify&gt;I'll let Cisco give you the &lt;I&gt;bright&lt;/I&gt; side of their new UCS platform. My role here was to show you the &lt;I&gt;dark&lt;/I&gt; side of it (someone has to).&lt;/P&gt;
&lt;P align=justify&gt;Massimo.&lt;/P&gt;&lt;img src="http://it20.info/aggbug.aspx?PostID=203" width="1" height="1"&gt;</description></item><item><title>Hyper-V Server R2: a few additional thoughts</title><link>http://it20.info/blogs/main/archive/2009/03/19/196.aspx</link><pubDate>Thu, 19 Mar 2009 01:06:00 GMT</pubDate><guid isPermaLink="false">3066da22-6b27-4cf1-aa0f-2eff79b21f87:196</guid><dc:creator>Massimo</dc:creator><slash:comments>2</slash:comments><comments>http://it20.info/blogs/main/comments/196.aspx</comments><wfw:commentRss>http://it20.info/blogs/main/commentrss.aspx?PostID=196</wfw:commentRss><description>&lt;p align="justify"&gt;A few weeks ago I wrote a
&lt;a HREF="/blogs/main/archive/2009/02/09/177.aspx"&gt;tutorial on 
how to deploy Hyper-V R2 on the IBM BladeCenter S&lt;/a&gt; where I demonstrated, among other things, how 
to LiveMigrate from one blade to another. I didn't spend 
too much time commenting on the implications this will have in the market. 
In this article, I'd like to comment on some of those potential implications.&lt;/p&gt;
&lt;p align="justify"&gt;Reading my piece you might have had the impression that I was &amp;quot;backing&amp;quot; 
Microsoft and putting Hyper-V R2 in the 
spotlight. That was not my intention: the geek at the bottom of my heart 
just wanted to give it a try, as simple as that. While I was pretty happy with what I saw, 
I was certainly not implying that Hyper-V R2 will be able to 
match VMware Enterprise technologies (both current and future). In fact, I 
honestly don't think that this is the case. Part of the misunderstanding is that, for some reason, this 
industry has grown up with the stereotype that a virtualization product that is 
capable of moving a live workload from one server to another is to be considered 
enterprise-grade. VMotion has become the industry benchmark for being an enterprise product. I want to 
challenge this stereotype. &lt;/p&gt;
&lt;p align="justify"&gt;My article created a bit of confusion around this concept. &amp;quot;I 
saw your article. Are you saying that Microsoft is going to be on par with 
VMware?&amp;quot; is a common question I have heard a lot lately. I want to use this 
new article to give you the &amp;quot;other 
side of the coin&amp;quot; regarding these two important technologies Microsoft is going to bring 
to the market that are &lt;i&gt;LiveMigrate&lt;/i&gt; and &lt;i&gt;CSVs&lt;/i&gt; (Cluster Shared 
Volumes). While having these two capabilities in the new product will help 
Microsoft to overcome some limitations they have today for some deployment 
scenarios, this doesn't mean these features could be used in all scenarios 
(specifically Enterprise scenarios). &lt;/p&gt;
&lt;p align="justify"&gt;The devil is in the detail, so when you start digging a bit 
into the LiveMigration technology, for example, you can find that: &lt;/p&gt;
&lt;p align="justify"&gt;&lt;i&gt;&amp;quot;..... On a given server running Hyper-V, only one live migration (to or from 
the server) can be in progress at a given time. For example, if you have a 
four-node cluster, up to two live migrations can occur simultaneously if each 
live migration involves different nodes.....&amp;quot;&lt;/i&gt;&lt;/p&gt;
&lt;p align="justify"&gt;The full story is here:
&lt;a href="http://technet.microsoft.com/en-us/library/dd443539.aspx"&gt;
http://technet.microsoft.com/en-us/library/dd443539.aspx&lt;/a&gt;&lt;/p&gt;
&lt;p align="justify"&gt;This obviously is documentation that relates to an early beta 
of the product. But if they are going to stick with these limitations, it would be hard to imagine 
wide deployments in enterprise scenarios where you might require multiple live 
migration tasks going on cluster-wide at any point in time for resource 
optimization reasons. So assuming Microsoft (or Citrix with their new &lt;i&gt;Essentials 
for Hyper-V&lt;/i&gt; 
package) will come out with some 
sort of DRS-like product in the R2 timeframe, they might not have the 
underlying infrastructure ready to leverage these add-on tools. &lt;/p&gt;
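&lt;p align="justify"&gt;To put the constraint quoted above into numbers, here is a minimal Python sketch. It assumes that every live migration occupies exactly two nodes (its source and its destination) and that a node can take part in only one migration at a time. The function names and the node labels (n1 to n4) are made up for illustration; this is not how Hyper-V, or any DRS-like tool, actually schedules its migrations.&lt;/p&gt;
&lt;pre&gt;
```python
def max_concurrent_migrations(node_count):
    # Each live migration ties up two nodes (source and destination),
    # and a node can take part in only one migration at a time.
    return node_count // 2


def schedule_rounds(migrations):
    """Greedily pack (source, destination) pairs into rounds so that
    no node appears twice in the same round."""
    rounds = []
    for source, destination in migrations:
        for busy_nodes, batch in rounds:
            if source not in busy_nodes and destination not in busy_nodes:
                busy_nodes.update((source, destination))
                batch.append((source, destination))
                break
        else:
            # No existing round has both nodes free: open a new round.
            rounds.append(({source, destination}, [(source, destination)]))
    return [batch for _, batch in rounds]


# Hypothetical rebalancing request on a four-node cluster.
requested = [("n1", "n2"), ("n3", "n4"), ("n1", "n3")]
print(max_concurrent_migrations(4))
print(schedule_rounds(requested))
```
&lt;/pre&gt;
&lt;p align="justify"&gt;Under these assumptions the third request has to wait for a second round because n1 and n3 are already busy, which is the kind of serialization a cluster-wide resource optimizer would run into.&lt;/p&gt;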
&lt;p align="justify"&gt;The same goes for Cluster Shared Volumes: the devil is always in the details. If you 
have read my previous article you might have had the impression that CSV will 
deliver pretty much what 
VMFS delivers today. Well, apparently yes, but again, if you dig a bit into the details you will 
find some limitations that might not be relevant for small deployments 
but might be show-stoppers for enterprise deployments. &lt;/p&gt;
&lt;P align="justify"&gt;
&lt;IMG height=385 src="http://www.it20.info/misc/pictures/Hyper-VServerR2-afewadditionalthoughts1.jpg" width=513 border=0&gt;&amp;nbsp;&amp;nbsp;
&lt;IMG height=385 src="http://www.it20.info/misc/pictures/Hyper-VServerR2-afewadditionalthoughts2.jpg" width=512 border=0&gt;&lt;/P&gt;
&lt;p align="justify"&gt;At the time of writing, these slides were publicly available at this
&lt;a href="http://download.microsoft.com/download/5/E/6/5E66B27B-988B-4F50-AF3A-C2FF1E62180F/ENT-T588_WH08.pptx"&gt;
link&lt;/a&gt;. Kudos to Microsoft for not hiding these details and for letting 
people know about the limitations.&lt;/p&gt;
&lt;p align="justify"&gt;While it appears at first that CSVs are a &amp;quot;transparent&amp;quot; technology, the 
reality is that as soon as you start pushing the envelope, they are not. How 
many enterprise IT organizations today leverage storage replication technologies to implement 
Disaster Recovery scenarios? Based on my experience I would say many of them. 
Hyper-V R2 with CSVs will break this common implementation pattern if Microsoft isn't 
able to overcome this limitation. It is a pattern that I would imagine all these 
enterprise customers want to continue to leverage, and it is not just 
bound to current VMware deployments: it's a technique that is being leveraged 
by UNIX and Mainframe deployments as well to achieve High Availability and 
Disaster Recovery.&lt;/p&gt;
&lt;p align="justify"&gt;These are just two examples. As I said, supporting techniques that 
allow a live workload to &lt;i&gt;fly&lt;/i&gt; from one physical server to another is just 
one aspect, and probably not even the most important. The fact that you have a 
small Cessna - and so you can technically &lt;i&gt;fly&lt;/i&gt; - doesn't mean it's the 
best, safest and most comfortable means of transportation to go from Milan to New York. For that you want to &lt;i&gt;fly&lt;/i&gt; on a 767 (and in business class, if 
possible!). Of course there are a lot of Cessnas around as they fit a 
part of the market. &lt;/p&gt;
&lt;p align="justify"&gt;On the other hand, as I said in my previous article, Microsoft 
has a tremendous asset: they are making (almost) everything available for free. 
This leads to at least a couple of interesting comments.&lt;/p&gt;
&lt;p align="justify"&gt;&lt;i&gt;Does the price discussion really matter anyway?&lt;/i&gt;&lt;/p&gt;
&lt;p align="justify"&gt;The first comment is: does the price discussion really matter 
anyway? The Microsoft 
pricing strategy is such that when you have properly licensed your Windows guests 
(typically via either Windows Server Enterprise or Datacenter SKUs), your 
underlying Microsoft Hyper-V virtual infrastructure is already licensed by 
definition. And this is true today. Suppose you have 50 Windows guests to 
deploy on four 2-socket servers, for example. Most likely the cheapest way to 
license these 50 guests is via the Windows Server 2008 Datacenter SKU, which is 
licensed per physical socket and allows an unlimited number of guests. If you do 
so, it doesn't really matter whether you want to use Hyper-V 2008 Server or a full-GUI Windows 2008 Server 
w/ Hyper-V or a GUI-less Core Windows 2008 Server w/Hyper-V as your parent 
partition. You have the right to use everything you 
want for free including the Failover technology (Microsoft Cluster Server). This 
excludes the MS Virtual Machine Manager but this won't change in the Hyper-V R2 
time frame. So this claim that with Hyper-V R2 they will have more stuff 
for free is a bit misleading in my opinion in the sense that they already 
effectively provide many things today (not just the Hyper-V Server SKU) for free. &lt;/p&gt;
&lt;p align="justify"&gt;There is a caveat to this, though, and it boils down to how customers are 
going to license the Windows guests. The analysis above assumes that customers 
are going to buy brand new licenses for their new deployment (because they had OEM Windows licenses on their old physical servers that could 
not be repurposed, for example). If the customer has Windows licenses that they can repurpose 
on the new virtual infrastructure, then the discussion on the cost of the virtual 
infrastructure itself is no longer trivial. And, yes, there will be a big bonus in 
this regard during the R2 timeframe - as the free Hyper-V Server R2 version will 
have more features and fewer limitations than the current free version. The pricing discussion 
can get very complicated, as mentioned in my
&lt;a HREF="/blogs/main/archive/2008/08/05/135.aspx"&gt;blog&lt;/a&gt; a few 
months ago. It would be interesting to see some statistics on how customers have 
currently licensed their legacy physical servers.&lt;/p&gt;
&lt;p align="justify"&gt;Last but not least, I am assuming that these customers are using Windows 
guests on their Hyper-V infrastructure. While Microsoft supports a limited 
number of Linux distributions (today SUSE, but they announced future support for Red Hat, 
too), I don't see too many Linux-only customers leveraging 
Hyper-V for their virtual infrastructure deployments. &lt;/p&gt;
&lt;p align="justify"&gt;&lt;i&gt;Clearly the Microsoft virtualization strategy is different 
than the VMware virtualization strategy&lt;/i&gt;&lt;/p&gt;
&lt;p align="justify"&gt;The second comment regarding the (virtual) price war is this: clearly 
the Microsoft virtualization strategy is different than the VMware 
virtualization strategy. And the pricing strategy reflects that. I wrote another
&lt;a HREF="/blogs/main/archive/2008/11/04/157.aspx"&gt;article&lt;/a&gt; on this topic which I invite you to read. I am attaching hereafter 
the picture for your convenience because I want to use it to back my point.&lt;/p&gt;
&lt;p align="justify"&gt;&amp;nbsp;&lt;IMG height=508 src="http://www.it20.info/misc/pictures/Hyper-VServerR2-afewadditionalthoughts3.jpg" width=674 border=0&gt;&lt;/p&gt;
&lt;p align="justify"&gt;In a nutshell, Microsoft makes money out of the red part 
whereas VMware makes money out of the blue part. Microsoft is probably going to 
stick with their &amp;quot;Virtualization is a value item of the OS&amp;quot; strategy for some time 
to come if the pricing scheme for Hyper-V R2 (due early next year) is what they 
are pitching today. Basically what they are doing is growing the 
blue part and giving it away for free. The only way they can sustain this is by 
continuing to make money on the red part. This has at least a couple of 
implications that are worth underlining:&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;
	&lt;p align="justify"&gt;Their Linux strategy is pretty much opportunistic (well, it's obvious and 
	totally expected after all - it's a dumb statement) in the sense they want to give customers with 
	a &amp;quot;few Linux servers here and there&amp;quot; the possibility to leverage the 
	Hyper-V infrastructure these customers are using for the majority of the 
	(Windows) VMs. Even though a Linux shop (probably) would not want to use 
	Hyper-V for technical (or religious) reasons, it wouldn't even make sense 
	for Microsoft to go down that path because they would need to make the blue 
	part technologically compelling on its own while giving it away for free. There would be 
	no revenue stream for Microsoft in such a scenario so it is probably not worth the 
	effort for them.&lt;/p&gt;&lt;/li&gt;
	&lt;li&gt;
	&lt;p align="justify"&gt;Microsoft will not have a business interest in making the 
	blue part grow too much as long as they are going to give it away for free. 
	This means that they won't be able to afford to be so aggressive in the 
	Virtual Appliance space because the JEOS concept is pretty dangerous to 
	their current business model as you may detect from the 
	picture (the smaller the red part is the less leverage they have). Unless 
	they radically change their software licensing model - which I wouldn't 
	rule out - I don't see how they could sustain an aggressive move toward this 
	JEOS concept. Consider also that the smaller the red part is, the easier it 
	could be for the ISV to migrate to a different OS. This is a generic 
	statement obviously and might not be applicable to specific situations.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p align="justify"&gt;All in all what Microsoft is doing is interesting and it will 
benefit customers because it will keep VMware honest in what they are doing - 
in terms of both technology and pricing. My speculation is that this is going to be a two-horse 
race in the long run between VMware and a virtual 
agglomerate comprised of Microsoft and its historical partner Citrix. There are 
concerns and rumors in the industry - admittedly, I am personally backing them - 
that Citrix has sort of lost interest in battling at the XenServer level, which 
is now being distributed for free. Some people are speculating that 
Citrix is shifting their strategy to expand and provide value on top of someone 
else's basic virtualization offering (namely Microsoft Hyper-V) and losing focus 
on their own commodity hypervisor and management offerings (XenServer), similar to what they are already doing with XenApp, which expands the core 
Microsoft Terminal Services technology.&lt;/p&gt;
&lt;p align="justify"&gt;There is no question that the aggressive pricing move from 
Microsoft in the R2 time frame will garner some reaction from VMware. I 
don't have any insight but I wouldn't be totally surprised if VMware were to 
provide VMotion either for free or in one of the less prestigious future 
vSphere SKUs. There are enough technology deltas, on top of VMotion, that will 
differentiate VMware from Microsoft (especially for enterprise deployments) that 
will allow the guys in Palo Alto to continue to charge premium prices if they 
want to. &lt;/p&gt;
&lt;p align="justify"&gt;However, I think that VMware will be at a fork sooner or 
later: they could either continue to charge a premium for their unmatched 
features to fill a need some of the Enterprise customers have (and that no one 
in this industry can or will match), or they could substantially lower their 
prices to appeal to many more customers - especially those that can't afford their 
technologies. The theory is that you could earn $1,000 either by charging 1,000 
customers $1, or by charging 100 customers $10. This always holds true unless you 
figure out a way to charge 900 customers $1 and 100 customers $10. Their &lt;i&gt;
Acceleration Kits&lt;/i&gt; are an attempt to achieve that value proposition, but what Microsoft is doing in the R2 time frame might 
require a revisiting of the current VMware portfolio layout (which I am sure is in VMware's plans). Of course, we need to remember R2 is still 
about a year away so VMware has some time to think about this. &lt;/p&gt;
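&lt;p align="justify"&gt;For those who like to see the arithmetic, the pricing theory above can be written as a tiny Python sketch. The customer counts and prices are the illustrative ones from the paragraph above, not real VMware figures, and the revenue function is just a helper made up for this example.&lt;/p&gt;
&lt;pre&gt;
```python
def revenue(segments):
    # segments: list of (number_of_customers, price_per_customer) pairs
    return sum(customers * price for customers, price in segments)


low_price_only = revenue([(1000, 1)])    # many customers, low price
high_price_only = revenue([(100, 10)])   # few customers, premium price
tiered = revenue([(900, 1), (100, 10)])  # segmented offering

print(low_price_only, high_price_only, tiered)
```
&lt;/pre&gt;
&lt;p align="justify"&gt;The first two strategies earn the same $1,000, while the segmented one earns $1,900: that is why a tiered portfolio is attractive, provided the segments can actually be kept apart.&lt;/p&gt;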
&lt;p align="justify"&gt;Massimo.&lt;/p&gt;&lt;img src="http://it20.info/aggbug.aspx?PostID=196" width="1" height="1"&gt;</description></item><item><title>And the winner is... AppSpeed</title><link>http://it20.info/blogs/main/archive/2009/03/06/189.aspx</link><pubDate>Fri, 06 Mar 2009 18:41:00 GMT</pubDate><guid isPermaLink="false">3066da22-6b27-4cf1-aa0f-2eff79b21f87:189</guid><dc:creator>Massimo</dc:creator><slash:comments>2</slash:comments><comments>http://it20.info/blogs/main/comments/189.aspx</comments><wfw:commentRss>http://it20.info/blogs/main/commentrss.aspx?PostID=189</wfw:commentRss><description>&lt;P align=justify&gt;I have just got back from VMworld 2009 Europe in Cannes. It was an interesting week and not just because we were on the Côte d'Azur (Azur, not Azure like in Windows Azure). There were a few interesting announcements, demos and breakout sessions going on at the Palais des Festivals during the week, so it would be difficult to make a ranking, but if I had to give my "virtual Oscar" to something I have seen.... that would be AppSpeed. &lt;/P&gt;
&lt;P align=justify&gt;AppSpeed is a new technology that will take some sort of product shape during 2009 under the vSphere umbrella. Whether it's going to be part of the most expensive VDC-OS SKUs or a separate product, I don't know. The roots of this product are in an acquisition VMware made in the summer of 2008, when they acquired a company called B-Hive that developed a product called Conductor. Conductor - AppSpeed from now on - is an "SLA product" that basically takes apart the architecture of an application and creates a logical view of the sub-workloads taking place; a typical example is a multi-tier application that has web, application logic and database components. Not only that: the interesting part is that AppSpeed will monitor the performance of the workload the way end-users perceive it, that is, in terms of latency and execution time. This means that once AppSpeed has built the logical mapping of the applications, the system administrator will have at their fingertips information such as how long the web front end takes to respond to the request (i.e. web server response time), how long it takes for the transaction to get to the DB server (i.e. network latency), and how long the DB server takes to respond back to the front end (i.e. DB server response time). If you want more information about AppSpeed you can look &lt;A href="http://www.vmware.com/products/vcenter-appspeed/"&gt;here&lt;/A&gt;; there is also a very nice on-line demo &lt;A href="http://forms.b-hivenetworks.com/resources/demo-conductor.html"&gt;here&lt;/A&gt;.&lt;/P&gt;
&lt;P align=justify&gt;I see this as a huge step forward in virtual infrastructure deployments for two particular reasons that I am going to articulate hereafter. &lt;/P&gt;
&lt;P align=justify&gt;The first reason is that this is what customers implementing virtualization have been asking me for since I started deploying these technologies. "How much is the ESX overhead?" is probably the most frequently asked question that I have heard in the last 10 years or so of virtualization implementations and evangelism. The good news is that the answer was easy: "it depends". The bad news is that it was rarely satisfactory for the customer. The fundamental problem we have had so far is that VMware systems administrators and the application folks use different metrics to check the health of the implementation. Systems administrators would usually monitor &lt;I&gt;resource usage&lt;/I&gt; on the host (i.e. CPU, Memory etc), as in "your VM is only consuming 10% of its allotted resources so it's doing well". The end-users, however, use a different metric, as in "I don't care that it's only using 10% of its allotted resources, the fact of the matter is that the job takes 2 minutes to complete so it's slow!". AppSpeed is going to bridge these two disconnected worlds, giving the systems administrators higher-level monitoring techniques that are very close to the language the end-users speak. &lt;/P&gt;
&lt;P align=justify&gt;An interesting scenario that was pitched during the breakout session in Cannes was that AppSpeed could even be used in the pre-virtualization stage. The idea is that before virtualizing a given multi-tiered application (or part of it) you would use the AppSpeed sensors to build the logical map while the application is still running on one or more physical servers. That would give you a baseline for when you move the application into the virtual world. So for example if your transactional application deployed on your physical infrastructure has a 2-second response time, or your batch workload has a 5-minute elapsed execution time, you can then benchmark your new virtual deployments against these values to see whether virtualization has brought in some overhead (and how much). And with the "decomponentization" that AppSpeed does at the application level you should be able to drill down to the level where you can determine where the issue is. It's not yet clear to me whether the correlation between AppSpeed metrics and standard resource usage metrics is going to be done out-of-the-box by the VMware tools or whether the systems administrator will have to match the two sets of metrics. &lt;/P&gt;
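&lt;P align=justify&gt;To make the baseline idea concrete, here is a minimal Python sketch of the comparison. It is not AppSpeed output or an AppSpeed API: the tier names, the sample values and the &lt;I&gt;overhead_percent&lt;/I&gt; helper are all made up for illustration, and the median is just one reasonable way to summarize the samples.&lt;/P&gt;

```python
from statistics import median


def overhead_percent(physical_samples, virtual_samples):
    """Return the relative change (in percent) of the median response time
    after virtualization, using the physical run as the baseline."""
    baseline = median(physical_samples)
    virtual = median(virtual_samples)
    return (virtual - baseline) / baseline * 100.0


# Hypothetical per-tier response times in seconds (illustrative values only).
physical = {"web": [0.40, 0.42, 0.41], "db": [1.50, 1.60, 1.55]}
virtual = {"web": [0.44, 0.45, 0.46], "db": [1.55, 1.62, 1.58]}

for tier in physical:
    change = overhead_percent(physical[tier], virtual[tier])
    print(f"{tier}: {change:+.1f}% vs. physical baseline")
```

&lt;P align=justify&gt;Keeping the numbers per tier is the point: it is the per-tier breakdown, not the end-to-end total, that tells you where any overhead is coming from.&lt;/P&gt;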
&lt;P align=justify&gt;The second reason why I think this is an enormous step forward in virtualization deployments is that I have always laughed at those people who, in the early days, referred to VMware ESX as the mainframe software for x86 servers. There is a fundamental difference between a VMware ESX server and a mainframe: mainframe operations are usually driven by "&lt;I&gt;goal modes&lt;/I&gt;", in the sense that the administrator sets the goal - or the desired performance for a given workload - and lets the system figure out by itself the configuration of resources needed to deliver on that goal. While ESX has many of the knobs and parameters you could find on high end UNIX boxes and mainframes, its operations are still driven by "&lt;I&gt;let's try to add more resources to that workload and see what happens&lt;/I&gt;". The pattern on ESX usually is:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;
&lt;P align=justify&gt;The end-user complains that the application is &lt;I&gt;slow&lt;/I&gt; (what does slow mean by the way?)&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;the ESX administrator tries to add more resources (i.e. either increasing the CPU and Memory shares or increasing the number of vCPUs and Memory allocated to the VM)&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;the ESX administrator keeps his/her fingers crossed and goes back to the end-user to see if anything has changed&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;the end-user will either be happy or will continue to complain because the application is still &lt;I&gt;slow&lt;/I&gt; (and the discussion would go on and on).&lt;/P&gt;&lt;/LI&gt;&lt;/OL&gt;
&lt;P align=justify&gt;While AppSpeed won't magically add &lt;I&gt;goal mode&lt;/I&gt; capabilities to the VMware infrastructure, it's clearly a step in that direction. Most likely, in the first incarnation of the product the technology will only monitor "passively" the response time of a given application, which means a system administrator would still have to work the vSphere knobs to change the behaviour reported by AppSpeed. Continuing to speculate, it would be natural for VMware to get to that "goal mode" state where a system administrator (or the end-user directly through the vApp SLAs) would set the "response time" for the application and would let the infrastructure figure out how to achieve that level of performance (and perhaps charge back accordingly). &lt;/P&gt;
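&lt;P align=justify&gt;Purely as an illustration of what such a "goal mode" loop might look like, here is a naive Python sketch. Everything in it is an assumption of mine: the &lt;I&gt;next_shares&lt;/I&gt; function, the 25% step, the 80% tolerance band and the share limits are arbitrary, and it does not call any VMware or AppSpeed API. It only shows the shape of the idea: the administrator states a response time target and the loop, rather than a human, decides how to move the knob.&lt;/P&gt;

```python
def next_shares(current_shares, measured_ms, target_ms,
                step=0.25, min_shares=500, max_shares=8000):
    """One iteration of a naive goal-seeking loop.

    If the measured response time misses the target, raise the VM's shares
    by `step` (a fraction); if it beats the target comfortably, give some back.
    """
    if measured_ms > target_ms:
        proposed = current_shares * (1 + step)
    elif target_ms * 0.8 > measured_ms:
        proposed = current_shares * (1 - step)
    else:
        proposed = current_shares  # within the tolerance band: leave it alone
    # Clamp to the allowed range so the loop can never run away.
    return int(min(max_shares, max(min_shares, proposed)))


# Hypothetical sequence of measured response times (ms) against a 1500 ms goal.
shares = 1000
for measured in (2400, 1900, 1400, 900):
    shares = next_shares(shares, measured, target_ms=1500)
    print(f"measured {measured} ms, shares now {shares}")
```

&lt;P align=justify&gt;A real implementation would have to deal with everything this sketch ignores: workloads that do not respond to more shares at all, contention between competing goals, and the scale-out case where the right answer is another VM rather than a bigger share.&lt;/P&gt;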
&lt;P align=justify&gt;I am certainly not saying that vSphere (or any future incarnation of VMware products) will match mainframe operations any time soon, but AppSpeed is certainly a move in that direction. It is also worth noting the different nature of the applications deployed on the mainframe and those deployed on x86 infrastructure. While applications deployed on the mainframe can usually be tuned by increasing or decreasing priority access to physical resources while keeping the same number of application instances, on VMware infrastructure you can either use the same technique or - most likely - you might be forced to clone those workloads to scale out (think of a web or application layer comprised of more VMs). This certainly adds complexity to the automation and the "goal mode" scenario, since it's not just a matter of tuning &lt;I&gt;priority shares&lt;/I&gt; for an existing VM but rather a process that needs to provision and de-provision workload instances on the infrastructure. It can be done but it's not as trivial as tuning a &lt;I&gt;CPU power knob&lt;/I&gt;. The mainframe still rules in this space and it's always used as a &lt;I&gt;benchmark&lt;/I&gt; for this sort of functionality. And beating it is not trivial.&lt;/P&gt;
&lt;P align=justify&gt;The limited documentation and demos available for the technology suggest that AppSpeed is able to respond to events by automatically triggering resource reconfigurations (either &lt;I&gt;shares&lt;/I&gt; reconfigurations or spawning new VMs), although I am not sure whether the capability demonstrated was an ad-hoc scenario implemented for the demo or an out-of-the-box capability natively integrated with the VMware infrastructure underneath. Since, as I said, this is not a trivial thing to achieve, I would speculate that, initially, the product will only have monitoring capabilities based on which a system administrator could take corrective actions. We'll see as we learn more. &lt;/P&gt;
&lt;P align=justify&gt;There are a couple of downsides to this technology, however. The first one is that it's obviously a VMware oriented product, so one should expect meaningful end-to-end measurements only if the end-to-end application architecture runs on VMware. To be fair, VMware has countered this statement by saying that you can also &lt;I&gt;probe&lt;/I&gt; applications that run on physical boxes; this is the case, for example, with complex multi-platform and multi-tier applications where the front-end might run on a VMware infrastructure while the back-end might run on a UNIX box. This leads to the second concern: this technology doesn't require any agent to be installed in the VM or on the physical host running the application - which is a good thing - but it requires the AppSpeed server to sniff the network (virtual or physical) in promiscuous mode. This might be a security concern for some organizations. &lt;/P&gt;
&lt;P align=justify&gt;All in all I would say AppSpeed is what every VMware system administrator has been waiting for, hence it gets my "virtual Oscar" (I know they don't give Oscars at the Palais des Festivals.... but nonetheless it sounds nice).&lt;/P&gt;
&lt;P align=justify&gt;Massimo. &lt;/P&gt;
&lt;P align=justify&gt;P.S. I have just been informed that due to previous trademark registrations the name AppSpeed might change at the product general availability. Still up in the air, but watch out for the potential new name.&lt;/P&gt;&lt;img src="http://it20.info/aggbug.aspx?PostID=189" width="1" height="1"&gt;</description></item><item><title>Hyper-V Server R2 on BladeCenter S Tutorial</title><link>http://it20.info/blogs/main/archive/2009/02/09/177.aspx</link><pubDate>Mon, 09 Feb 2009 06:57:00 GMT</pubDate><guid isPermaLink="false">3066da22-6b27-4cf1-aa0f-2eff79b21f87:177</guid><dc:creator>Massimo</dc:creator><slash:comments>15</slash:comments><comments>http://it20.info/blogs/main/comments/177.aspx</comments><wfw:commentRss>http://it20.info/blogs/main/commentrss.aspx?PostID=177</wfw:commentRss><description>&lt;P align=justify&gt;My good friend at Microsoft, &lt;A href="http://blogs.technet.com/pgmalusardi/default.aspx"&gt;Giorgio Malusardi&lt;/A&gt;, noticed my post "&lt;A HREF="/blogs/main/archive/2008/11/14/162.aspx"&gt;Enterprise Virtualization in a Box&lt;/A&gt;" which was essentially an example of how to create a BladeCenter-contained VMware-enabled data center in a box (including servers, storage and networking). Giorgio challenged me with the task to create something similar using the Hyper-V Server R2 Beta&lt;I&gt; &lt;/I&gt;that has just been announced. And I accepted the challenge! &lt;/P&gt;
&lt;P align=justify&gt;This tutorial is going to document the setup of the environment based on what I have seen and I have done. I will share my point of view of what's going on and the implication this will or might have in the x86 market in another piece.&amp;nbsp; &lt;/P&gt;
&lt;P align=justify&gt;&lt;B&gt;Microsoft Virtualization Background&lt;/B&gt;&lt;/P&gt;
&lt;P align=justify&gt;For those of you that are missing the Microsoft basics it would be beneficial to set the stage. Right now, Microsoft is shipping the first version of their hypervisor - Hyper-V - by means of two different channels. The first one is as a component (or &lt;I&gt;&lt;B&gt;role&lt;/B&gt;&lt;/I&gt;) of their Microsoft Windows Server 2008 products. You can enable or disable this role in either a &lt;I&gt;&lt;B&gt;normal&lt;/B&gt;&lt;/I&gt; (GUI-based) Windows Server 2008 install, or a &lt;I&gt;&lt;B&gt;core&lt;/B&gt;&lt;/I&gt; (GUI-less) Windows Server 2008 install. Obviously, in order to get Hyper-V, you need to buy a Windows Server 2008 SKU (Hyper-V is included in any 64-bit x86 version of the &lt;I&gt;Standard&lt;/I&gt;, &lt;I&gt;Enterprise&lt;/I&gt; and &lt;I&gt;Datacenter&lt;/I&gt; SKUs). The license rights for guests and included features - such as Failover Clustering technology - are determined by which SKU is purchased.&lt;/P&gt;
&lt;P align=justify&gt;The second channel is as a free download from the Microsoft web site in a package called &lt;I&gt;Microsoft Hyper-V Server 2008&lt;/I&gt;. In a nutshell this is basically a scaled-down version of Windows Server 2008 with the following restrictions and peculiarities: &lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P align=justify&gt;It is a core install only (i.e. GUI-less as the only option)&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;The only role that it supports - which is enabled by default - is Hyper-V (for example, you can't enable the Failover Clustering role)&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;It doesn't include any license for Windows guest OS'es&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;It does have a number of artificial limitations in terms of number of CPUs and amount of system memory supported.&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P align=justify&gt;That's what's available as of today. However, Microsoft recently announced the availability of the Beta version of Windows Server 2008 R2 and Hyper-V Server 2008 R2. Both these products will ship the second generation of the Hyper-V hypervisor and are currently scheduled to ship &lt;U&gt;in about a year from now&lt;/U&gt; (roughly). With this Beta, Microsoft announced new features and new restrictions for the free package. The following table is a summary of the features in the current and future offerings:&lt;/P&gt;
&lt;P&gt;&lt;IMG height=640 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS000.jpg" width=821 border=0&gt;&lt;/P&gt;&amp;nbsp; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;&lt;B&gt;* &lt;/B&gt;Cluster Shared Volumes is a technology currently in Beta that will ship along with the second generation of Hyper-V. It allows the NTFS file system to be used as if it were a "cluster file system" (à la VMFS, so to speak). See below in the document for more information on the CSV technology. &lt;/P&gt;
&lt;P align=justify&gt;Those of you familiar with the Microsoft Virtualization technology will notice that the Windows Server 2008 R2 SKUs will have similar restrictions and limitations compared to the current releases. This statement obviously doesn't take into account new features introduced with the second generation of the hypervisor (such as Live Migration, for example). As you may have noticed, the biggest delta both in terms of new features and artificial limitations is between the currently shipping Hyper-V Server 2008 (first column from the left) and the future Hyper-V Server 2008 R2 (second column from the left). Among many differences, it's particularly worth noting that the new (free!) product will support:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P align=justify&gt;8 sockets (vs. current artificially limited 4)&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;1TB of memory (vs. current artificially limited 32GB)&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Quick and Live Migration (vs. nothing) &lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Failover Clustering (vs. nothing)&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Cluster Shared Volumes (vs. nothing) &lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;&lt;B&gt;The Hyper-V Server R2 Based Self-Contained Data Center&lt;/B&gt;&lt;/P&gt;
&lt;P align=justify&gt;Back on track. As I said, the challenge was to replicate the VMware-based setup we did on the BladeCenter S. We used the very same hardware setup we used for the VMware test. While we wanted to test the Hyper-V Server R2 Beta, it should be noted that the currently shipping Hyper-V solution works on the BladeCenter S today as well. This is a (generic) picture of the BladeCenter S chassis: &lt;/P&gt;&lt;IMG height=383 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS001.jpg" width=487 border=0&gt; 
&lt;P align=justify&gt;For this proof of concept, I decided to look at the things from the following perspective:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P align=justify&gt;I wanted to focus on the Hyper-V Server R2 free product (and not on the general purpose Windows Server 2008 R2 w/ Hyper-V role enabled)&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;I wanted to focus on new technologies that will be shipping in the R2 timeframe. This includes CSV, Failover Clustering and Live Migration&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;I wanted to focus on what you could do with the future Microsoft &lt;B&gt;free&lt;/B&gt; offering. This includes the standard free tools to manage the environment and obviously doesn't include the fee-based products such as Virtual Machine Manager (the current version wouldn't support Hyper-V Server R2 anyway and there is not a "sister Beta version" of VMM to test with the Hyper-V R2 Beta bits). &lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P align=justify&gt;All this being said we can "replay" what I have done. &lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;&lt;B&gt;Hyper-V Server R2 Nodes Setup&lt;/B&gt;&lt;/P&gt;
&lt;P align=justify&gt;First, I started installing Windows Hyper-V Server R2 on the two local disks of the two blades in the chassis. This is a picture taken from the Management Module of the BladeCenter S during the setup (remote attended install):&lt;/P&gt;&lt;IMG height=653 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS002.jpg" width=837 border=0&gt; 
&lt;P align=justify&gt;I could have installed the base OS on the shared storage as well, by dedicating a small LUN to each of the two blades, but I remember that in the Windows 2003 timeframe a registry tweak was needed to allow a single shared SAS/FC bus to handle both the C:\ drive and the shared storage in an MSCS scenario. I didn't want to get into that level of complexity, especially as it was not one of the main goals I had for this Proof of Concept. Suffice it to say that I am sure you could get rid of the local disks if you really wanted to. &lt;/P&gt;
&lt;P align=justify&gt;The setup doesn't really ask too many things. Actually nothing. At the next reboot you are asked to change the Administrator password and off you go. This is what you get on a Hyper-V Server R2 Beta local console:&lt;/P&gt;&lt;IMG height=661 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS003.jpg" width=847 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;Through the Hyper-V Configuration panel (blue window), I did the following: &lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Changed the default Host Name (into HVR2NODO1 and HVR2NODO2)&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Restarted server to apply the computer name settings&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Changed the IP to static addresses (192.168.88.131/132)&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Enabled RDP support&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Configured Remote Management to allow WinRM and relax Firewall settings&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Enabled an extra firewall setting (through the command &lt;I&gt;netsh advfirewall firewall set rule group="Remote Volume Management" new enable=yes&lt;/I&gt;) for managing the disks through a remote MMC snap-in&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Joined the domain (Windows 2008 R2 Domain created on a separate server on the network) &lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Added the domain Administrator to the local Administrators group (option 4 of the Hyper-V Configuration tool).&lt;BR&gt;&amp;nbsp;&lt;/P&gt;&lt;/LI&gt;&lt;/OL&gt;
&lt;P align=justify&gt;At this point - before enabling Failover Clustering support - I configured both blades to access two shared LUNs created with the IBM Storage Configuration Manager, which is the tool you can use to configure the BladeCenter S integrated storage. This picture shows that a Quorum LUN (10GB) and a CSV LUN (100GB) have been assigned to both blades in the chassis. &lt;/P&gt;&lt;IMG height=518 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS004.jpg" width=662 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;A restart of both blades allowed the domain change to take effect as well as the disks to be recognized by the two Hyper-V Server R2 instances (alternatively, a disk rescan would do this job). &lt;/P&gt;
&lt;P align=justify&gt;Because of the fully redundant fabric architecture of the BladeCenter S, the two disks we have just configured (Quorum and CSV1) are seen twice by the hypervisor OS: each blade has two paths to each disk (this is, by the way, the big plus of this chassis with the integrated storage). Multipath I/O software needs to be installed on the Hyper-V hosts to manage the disks properly. The first step is to enable the base Microsoft MPIO support, which is not installed by default. The command &lt;I&gt;"oclist"&lt;/I&gt; displays all the features that have been enabled/disabled on the host, as you can see from the picture below:&lt;/P&gt;&lt;IMG height=507 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS005.jpg" width=648 border=0&gt; 
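&lt;P align=justify&gt;Conceptually, the job of the multipath layer is to recognize that two disk entries are really the same LUN reached through two different paths, and to present them as a single disk. The Python sketch below illustrates just that grouping step. It is not how MPIO or the IBM driver is implemented, and the LUN identifiers and path names are invented for the example.&lt;/P&gt;

```python
from collections import defaultdict


def group_paths_by_lun(paths):
    """Collapse per-path disk entries into one logical disk per LUN id."""
    luns = defaultdict(list)
    for lun_id, path in paths:
        luns[lun_id].append(path)
    return dict(luns)


# Hypothetical view from one blade before multipath software is installed:
# each LUN shows up once per path. Identifiers are made up.
raw_disks = [
    ("quorum-10GB", "sas-switch-1"),
    ("quorum-10GB", "sas-switch-2"),
    ("csv1-100GB", "sas-switch-1"),
    ("csv1-100GB", "sas-switch-2"),
]

for lun_id, lun_paths in group_paths_by_lun(raw_disks).items():
    print(f"{lun_id}: {len(lun_paths)} paths -> 1 logical disk")
```

&lt;P align=justify&gt;Four entries in, two logical disks out: that is the difference between what the host sees before and after the multipath software is in place.&lt;/P&gt;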
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;On one of the two hosts, I manually enabled base Microsoft MPIO support (via the command "&lt;I&gt;start /w ocsetup MultipathIo&lt;/I&gt;"), but this is not enough. I also had to install the storage-specific multipath software that interacts with the base Microsoft MPIO code. In IBM terms this is called the IBM Subsystem Device Driver and it can be downloaded from the IBM external website. At the time of this writing, the package is located at this &lt;A href="http://www-01.ibm.com/support/docview.wss?rs=540&amp;amp;context=ST52G7&amp;amp;dc=D430&amp;amp;uid=ssg1S4000350&amp;amp;loc=en_US&amp;amp;cs=utf-8&amp;amp;lang=en"&gt;link&lt;/A&gt; and it's called the "SDDDSM Package for RSSM" (SDDDSM = Subsystem Device Driver Device Specific Module; RSSM = RAID SAS Switch Module). Interestingly, the package in question has a typical Windows setup, so I was wondering how it could be installed on a GUI-less system. Well, launching setup.exe did the job, as you can see in the following pictures.&lt;/P&gt;&lt;IMG height=539 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS006.jpg" width=691 border=0&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;&lt;IMG height=545 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS007.jpg" width=701 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;First impression was that this was not really a GUI-less system, but rather a standard Windows system where explorer.exe was disabled. Well, never mind....&lt;/P&gt;
&lt;P align=justify&gt;After the reboot the system was up and running again, and the hypervisor correctly reported only two disks being assigned to the blade (the 68GB disk is the local hard drive whereas the 100GB and the 10GB are the two LUNs I created with the Storage Configuration Manager utility). &lt;/P&gt;&lt;IMG height=518 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS008.jpg" width=663 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;On the second blade we found out right away that installing the IBM SDD software automatically enabled Windows base MPIO support (if it doesn't just use the command above to enable it).&lt;/P&gt;
&lt;P align=justify&gt;At this point we enabled the Failover Clustering feature on both hosts via option #8 of the Hyper-V Configuration window. This enables the Microsoft Cluster Server code on the two hosts. The picture below shows what happens on the console when you enable this feature. The Cluster itself will be configured later. &lt;/P&gt;&lt;IMG height=537 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS009.jpg" width=681 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;This is pretty much it for the Hyper-V hosts setup. This concludes the configuration of the base support that needs to be done on the Hyper-V Server Configuration console. From now on we can do pretty much everything from the Microsoft remote tools. &lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;&lt;B&gt;Hyper-V Server R2 Nodes Configuration from a Remote Workstation&lt;/B&gt;&lt;/P&gt;
&lt;P align=justify&gt;We can now switch focus to a Windows 2008 R2 Server that we previously installed and configured to be a Domain Controller for our test bed. Remote administration of the Hyper-V hosts can be accomplished either from this host (after enabling some remote administrative tools that are disabled by default) or from a Vista / Windows 7 workstation using the latest RSAT tools available from the Microsoft web site. These tools include advanced Remote Administration MMC Snap-Ins that don't ship with the base client OS and that let you perform advanced tasks such as Live Migration. The latest release of these tools (in beta) can be downloaded &lt;A href="http://www.microsoft.com/downloads/details.aspx?FamilyID=82516c35-c7dc-4652-b2ea-2df99ea83dbb&amp;amp;displaylang=en"&gt;here&lt;/A&gt;.&lt;/P&gt;
&lt;P align=justify&gt;If you use the workstation, it must be in the same domain you joined the Hyper-V R2 hosts to. If it is not in the same domain, extra configuration steps on the Hyper-V servers are required to relax cross-domain security restrictions.&lt;BR&gt;Since one of the purposes of this test was to demonstrate how you can remotely manage advanced hypervisor features using free tools, we have created an MMC configuration (that we called "MasterMMC") which includes the following Snap-Ins:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Remote Disk Management&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Failover Cluster Manager&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;Hyper-V Manager.&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P align=justify&gt;I have used the Remote Disk Management tool to configure partitions and file systems on the two shared disks on both blades: I have assigned the Quorum LUN the Q: letter and the CSV1 LUN the X: letter on both nodes to prepare for cluster enablement. Initially I had a hard time getting to the Hyper-V nodes via this applet. I eventually managed to get to a stable state where I could manage the disks, but I have had many connection issues ("RPC Server unavailable") that I couldn't pin down to a particular problem. Potential causes include firewall issues as well as bugs in the code (the pane didn't refresh properly, so I had to close and re-open the MasterMMC).&lt;/P&gt;
&lt;P align=justify&gt;The Hyper-V Manager Snap-In was more straightforward. The only thing I have done here is assigning the second Gigabit adapter on the blade to a VirtualSwitch (called VMs in the screenshot below) that I have defined on both Hyper-V nodes. The first NIC (which I have configured with a static IP address at the beginning of the setup) remains assigned/dedicated to the parent partition. &lt;/P&gt;&lt;IMG height=614 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS010.jpg" width=783 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;This is the "network" (aka VirtualSwitch) to which you will connect the guests to get physical network access. &lt;/P&gt;
&lt;P align=justify&gt;Notice that the BladeCenter S supports blades with up to 4 NICs configured. For this test only two NICs have been configured on each blade. Remember, Hyper-V currently does not allow NIC teaming at the hypervisor level (i.e. assigning more than one NIC to the same VirtualSwitch). Microsoft advises using third-party NIC software to create bonds of network adapters and assigning the resulting "bonded NIC" to the VirtualSwitch. It's not clear whether Hyper-V R2 is going to change this when the gold code ships.&lt;/P&gt;
&lt;P align=justify&gt;The next step is to configure the cluster across the nodes. This is not really Hyper-V Server specific, as the procedure is pretty similar to what you would do on a Windows 2008 Enterprise Server. It involves validating the hardware setup first with the built-in utility and then configuring the cluster properties (clustername, IP address etc). &lt;/P&gt;
&lt;P align=justify&gt;Next, I enabled Cluster Shared Volumes. Those of you that are familiar with Hyper-V and Failover Clustering know that in order to manage a Guest as a single entity (i.e. independently "Quick Migrating" a VM from one host to another) the VM needs to be created on a dedicated shared LUN. This is, by the way, the configuration Microsoft usually advises. This has a number of implications: you could easily run out of drive letters in the cluster (this can, however, be by-passed using specific mounting techniques), but more importantly it introduces a management overhead, since you need to create a LUN for each VM you need to deploy rather than leveraging one big shared LUN cluster-wide (like VMware VMFS allows you to do). That is what CSVs are all about: they provide a "cluster file system"-like environment where you can run a number of different guests on different hosts pointing to the same shared LUN. In fact, it's not by chance that I have assigned the blades a 10GB disk to be used as a dedicated Quorum, as well as a single 100GB CSV1 LUN to be used concurrently as a shared repository to host multiple VMs. This is obviously a new and big benefit, since with the current Microsoft Cluster Server architecture, if a node owns and can access a LUN, the other host in the cluster is prevented from accessing it (at least until the group containing the LUN fails over and the cluster changes its ownership).&lt;/P&gt;
&lt;P align=justify&gt;The picture below shows the disclaimer about CSV: they can only be used to host virtual machines in a Hyper-V R2 environment! This means they can't be used in a general purpose Windows Server 2008 Microsoft Failover Clustering scenario.&lt;/P&gt;&lt;IMG height=780 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS011.jpg" width=1000 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;The cluster configuration wizard asks me which volumes I want to enable: CSV1 is the only remaining partition I have (the Q: drive has already been used for the Quorum):&lt;/P&gt;&lt;IMG height=678 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS012.jpg" width=866 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;Once the CSV has been enabled, &lt;U&gt;on each cluster node&lt;/U&gt; a new directory structure appears. The default is "&lt;A&gt;C:\ClusterStorage\Volume1&lt;/A&gt;" &lt;/P&gt;&lt;IMG height=419 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS013.jpg" width=534 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;This is a "virtual pointer" that refers to the CSV1 LUN and it shows up on each of the two blades and describes a sort of common/shared name space that both blades can access at the same time. This concept applies to virtual machines only and the usage of CSV cannot be extended to a general purpose cluster file system at the moment. &lt;/P&gt;
&lt;P align=justify&gt;Now that we have a cluster set up and a CSV volume available, we are going to create a virtual machine. We point to the Hyper-V Manager Snap-In in the MasterMMC window and we configure the VM to be hosted on the CSVs explicitly choosing the common local name space that identifies the CSV on the Storage Area Network:&lt;/P&gt;&lt;IMG height=869 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS014.jpg" width=1099 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;At first it seems odd to create a to-be-clustered virtual machine on a "C:" drive, but that's the way it works. Obviously the VM files won't be created on the local drive of the blade because, as I said, that path represents a location that is actually on the SAN. This is what our MasterMMC looks like once we have done all this: &lt;/P&gt;&lt;IMG height=840 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS015.jpg" width=1073 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;So far we have only created the VM. It's not yet clustered, because with the free management tools there is no integration between the Hyper-V Manager and the Failover Clustering applet. Microsoft Virtual Machine Manager is supposed to provide this integrated view and these operations but, as I said at the beginning, the currently shipping VMM version doesn't manage Hyper-V R2 Beta hosts yet. Besides, it would be beyond the scope of this document anyway. So in order to cluster the VM we have to explicitly and manually declare it as a clustered resource. The steps are similar to how you would configure any cluster resource; just make sure you select "Virtual Machine" as the resource type and you are then presented with a list of VMs that are running on the cluster hosts (i.e. both Hyper-V R2 Beta servers). Notice that the virtual machine needs to be powered off to be clustered (otherwise the wizard will fail).&lt;/P&gt;
&lt;P align=justify&gt;Once we have configured the resource we can bring the virtual machine on-line:&lt;/P&gt;&lt;IMG height=809 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS016.jpg" width=1039 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;The resource (virtual machine) is now online and it's running on the second Hyper-V R2 node (HVR2NODO2) as you can see from the picture below:&lt;/P&gt;&lt;IMG height=809 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS017.jpg" width=1039 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;At this point you can invoke from the Failover Clustering interface a "Live Migration" of the resource as you can see below:&lt;/P&gt;&lt;IMG height=809 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS018.jpg" width=1039 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;&lt;BR&gt;And the virtual machine will start the live migration onto the other host:&lt;/P&gt;&lt;IMG height=809 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS019.jpg" width=1039 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;During my test I was able to move the virtual machine from one node to the other with basically no downtime except for a lost ping or two:&lt;/P&gt;&lt;IMG height=809 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS020.jpg" width=1039 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;Consider that the networking configuration might not be optimal, and we will have to see what Microsoft suggests in terms of network subsystem setup for Live Migrating a virtual machine. Having said this, losing one or two pings is usually something most web and client/server applications can handle, and it's not too different from the experience you would have using alternative live migration technologies from other vendors such as VMware, Citrix, VirtualIron etc. &lt;/P&gt;
&lt;P align=justify&gt;The last test of this proof of concept is to create another virtual machine and demonstrate that the two can run simultaneously on the two Hyper-V R2 Beta hosts while residing on a common shared LUN through the CSV technology. These are the screenshots of the two virtual machines running on different hosts but residing on the same repository, which is the CSV1 volume mapped on both hosts as "&lt;A&gt;C:\ClusterStorage\Volume1&lt;/A&gt;":&lt;/P&gt;&lt;IMG height=809 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS021.jpg" width=1039 border=0&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;&lt;IMG height=809 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS022.jpg" width=1039 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;There is one pretty interesting thing in these screenshots, if you noticed. Despite the fact that both nodes can access the CSV at the very same time (otherwise they couldn't simultaneously run two virtual machines hosted on the same volume), the actual LUN is, at any point in time, "officially owned" by one of the two nodes (in this case the owner of the LUN is always HVR2NODO2). I must admit I have to dig more into CSVs, but they seem to be arbitrated and controlled by one node at a time. My assumption is that the cluster node that is NOT the owner of the LUN does not use the owner as a proxy to get to the disk, because this would substantially hurt disk access performance (i.e. one node would have direct access while the other would have pass-through access via the owner of the LUN - not a viable scenario). Somehow the other node (i.e. HVR2NODO1) has direct access to the LUN performance-wise, but it must also coordinate access rights with the official owner of the LUN itself (that is HVR2NODO2). &lt;/P&gt;
&lt;P align=justify&gt;In a scenario like this it would be interesting to understand what happens when the node that is the owner of the CSV crashes. &lt;/P&gt;
&lt;P align=justify&gt;To recap, this is the summary of my current setup:&lt;/P&gt;
&lt;TABLE border=1&gt;
&lt;TR&gt;&lt;TH&gt;Virtual machine&lt;/TH&gt;&lt;TH&gt;Running on&lt;/TH&gt;&lt;TH&gt;CSV Owner&lt;/TH&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;&lt;B&gt;VirtualMachine ( )&lt;/B&gt;&lt;/TD&gt;&lt;TD&gt;HVR2NODO2&lt;/TD&gt;&lt;TD&gt;HVR2NODO2&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;&lt;B&gt;VirtualMachine (2)&lt;/B&gt;&lt;/TD&gt;&lt;TD&gt;HVR2NODO1&lt;/TD&gt;&lt;TD&gt;HVR2NODO2&lt;/TD&gt;&lt;/TR&gt;
&lt;/TABLE&gt;
&lt;P align=justify&gt;In a cluster file system environment, if HVR2NODO2 fails, VirtualMachine(2) would continue to run on the other node (HVR2NODO1) without any interruption and VirtualMachine( ) would go off-line to restart on the same surviving node (HVR2NODO1). &lt;/P&gt;
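&lt;P align=justify&gt;The expectation above can be written down as a tiny model. This is only a sketch in Python of how a two-node cluster with a shared volume would be expected to behave; the function name and the data layout are hypothetical and have nothing to do with the actual Failover Clustering interfaces:&lt;/P&gt;
&lt;PRE&gt;
```python
# Hypothetical model of the two-node cluster described in the article.
# It is NOT a Microsoft API: it only encodes the expected failover behaviour.

NODES = ["HVR2NODO1", "HVR2NODO2"]


def simulate_node_failure(nodes, vm_hosts, csv_owner, failed_node):
    """Return (new_vm_hosts, new_csv_owner, restarted_vms) after one node fails."""
    survivors = [n for n in nodes if n != failed_node]
    if not survivors:
        raise ValueError("no surviving node to fail over to")
    target = survivors[0]

    new_hosts = {}
    restarted = []
    for vm, host in vm_hosts.items():
        if host == failed_node:
            # The VM was running on the failed node: it goes off-line
            # and is restarted (rebooted) on a surviving node.
            new_hosts[vm] = target
            restarted.append(vm)
        else:
            # The VM was already on a surviving node: it keeps running.
            new_hosts[vm] = host

    # If the failed node owned the CSV, ownership must move to a survivor.
    new_owner = target if csv_owner == failed_node else csv_owner
    return new_hosts, new_owner, restarted


vm_hosts = {
    "VirtualMachine ( )": "HVR2NODO2",
    "VirtualMachine (2)": "HVR2NODO1",
}
hosts, owner, restarted = simulate_node_failure(
    NODES, vm_hosts, csv_owner="HVR2NODO2", failed_node="HVR2NODO2"
)
print("VM placement:", hosts)
print("CSV owner:", owner)
print("Restarted VMs:", restarted)
```
&lt;/PRE&gt;
&lt;P align=justify&gt;With the setup in the table, the model reports that only VirtualMachine ( ) has to be restarted, while the CSV ownership moves to HVR2NODO1. Whether the ownership change is really transparent to the surviving VM is exactly what the test is meant to find out.&lt;/P&gt;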
&lt;P align=justify&gt;So I turned off blade #2 in the chassis (which is HVR2NODO2) via the remote BladeCenter S Management Module (MM):&lt;/P&gt;&lt;IMG height=584 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS023.jpg" width=750 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;VirtualMachine(2) didn't experience any issue, either from a ping perspective or in terms of Failover Cluster Manager notifications. This leads me to think that CSV ownership changes transparently, without any service interruption. This was somewhat expected; the only point of concern was the ownership of the CSV (which apparently can be managed in a smart way). However, the other virtual machine experienced downtime. This was expected as well, since VirtualMachine( ) was running on HVR2NODO2, which was turned off "the hard way", so the failover algorithms had to kick in to bring it back on-line on the surviving node (HVR2NODO1) with a standard boot-up procedure.&lt;/P&gt;&lt;IMG height=809 src="http://www.it20.info/misc/pictures/Hyper-VServerR2onBladeCenterS024.jpg" width=1039 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;Notice that the ping window first loses the link, then it starts to get a host destination unreachable message from the local IP address (192.168.88.133 is the host from which I am pinging). Eventually it starts to ping the guest again once it's brought back on-line. &lt;/P&gt;
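&lt;P align=justify&gt;The sequence seen in the ping window can be reduced to a short list of phases. The sketch below is a hedged Python example: the output format is assumed to be the usual Windows ping text, the guest address 192.168.88.50 is made up for illustration (only the pinging host 192.168.88.133 comes from the test above), and both function names are hypothetical. Note that the "unreachable" message is itself a reply coming from the local address, so it has to be checked first:&lt;/P&gt;
&lt;PRE&gt;
```python
def classify_ping_line(line, guest_ip):
    """Classify one line of Windows-style ping output (format is an assumption)."""
    text = line.strip().lower()
    # "Destination host unreachable" also starts with "Reply from ...",
    # so it must be tested before the ordinary reply case.
    if "unreachable" in text:
        return "unreachable"
    if "timed out" in text:
        return "timeout"
    if text.startswith("reply from " + guest_ip + ":"):
        return "reply"
    return "other"


def summarize(lines, guest_ip):
    """Collapse consecutive identical states into an ordered list of phases."""
    phases = []
    for line in lines:
        state = classify_ping_line(line, guest_ip)
        if state == "other":
            continue
        if not phases or phases[-1] != state:
            phases.append(state)
    return phases


GUEST_IP = "192.168.88.50"  # hypothetical guest address, not from the article
sample = [
    "Reply from 192.168.88.50: bytes=32 time=1ms TTL=128",
    "Request timed out.",
    "Request timed out.",
    "Reply from 192.168.88.133: Destination host unreachable.",
    "Reply from 192.168.88.50: bytes=32 time=2ms TTL=128",
]
print(summarize(sample, GUEST_IP))
```
&lt;/PRE&gt;
&lt;P align=justify&gt;For the sample lines this prints the four phases in order: reply, timeout, unreachable and reply again, which matches the behaviour in the screenshot.&lt;/P&gt;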
&lt;P align=justify&gt;&lt;B&gt;Preliminary Conclusions and Impressions&lt;/B&gt;&lt;/P&gt;
&lt;P align=justify&gt;As I said at the beginning, I will write another piece on what I think the implications of these technologies will be in the market. From what I have seen so far, the Hyper-V R2 platform seems to be pretty stable (once I got past some weird issues with the Remote Disk Management stuff). Let's not forget that we will not see these technologies before year end 2009 or the beginning of 2010. This is the common speculation in the industry, anyway. While this will allow plenty of time for Microsoft to fix these problems, the fact that these are still one year away will give VMware some time to think about their main competitor... although I am sure all this is already on their radar in Palo Alto.&lt;/P&gt;
&lt;P align=justify&gt;There are a number of aspects in the Microsoft technologies that I think are a long way from catching up with what VMware is doing. VMware had the advantage of starting to develop a true virtualization platform from a &lt;U&gt;blank sheet&lt;/U&gt;. Microsoft, on the other hand, has a legacy of technologies, so virtualization for Microsoft seems more &lt;U&gt;hammered-in&lt;/U&gt; than anything else. An example is the fact that when you create a Virtual Machine from the Hyper-V Manager, the default location is &lt;I&gt;"&lt;A&gt;C:\ProgramData\Microsoft\Windows\Hyper-V&lt;/A&gt;"&lt;/I&gt;, which is not what I would define as a proper default location for hosting enterprise workloads (in fact, it looks more like a Microsoft Office document default location). This might sound simple, but it tells you a lot about the heritage Microsoft wants and needs to protect.&lt;/P&gt;
&lt;P align=justify&gt;That's pretty much it for the negative part. As far as the positive aspects are concerned, everything you have seen here (except the BladeCenter S and the Windows guests!) is all software that is free of charge. And this is not a trivial aspect or something to overlook.&lt;/P&gt;
&lt;P align=justify&gt;Massimo. &lt;/P&gt;&lt;img src="http://it20.info/aggbug.aspx?PostID=177" width="1" height="1"&gt;</description></item><item><title>VMworld 2009 Europe is coming: do you want to Scale Up or Scale Out?</title><link>http://it20.info/blogs/main/archive/2009/02/04/175.aspx</link><pubDate>Wed, 04 Feb 2009 19:56:00 GMT</pubDate><guid isPermaLink="false">3066da22-6b27-4cf1-aa0f-2eff79b21f87:175</guid><dc:creator>Massimo</dc:creator><slash:comments>2</slash:comments><comments>http://it20.info/blogs/main/comments/175.aspx</comments><wfw:commentRss>http://it20.info/blogs/main/commentrss.aspx?PostID=175</wfw:commentRss><description>
&lt;P align=justify&gt;&lt;A href="http://www.vmworld.com/index.jspa"&gt;VMworld 2009 Europe&lt;/A&gt; is coming (last week of February). I was planning to go and I have just found out that they have also accepted one of the two topics I submitted for the break-out sessions. The title of the session that got selected is: &lt;/P&gt;
&lt;P align=justify&gt;&lt;I&gt;&lt;B&gt;Virtual Infrastructures: Scale Up or Scale Out? Rack or Blade form factors?&lt;/B&gt;&lt;/I&gt;&lt;/P&gt;
&lt;P align=justify&gt;This is the abstract as I entered it originally (I assume it will remain the same):&lt;/P&gt;
&lt;P align=justify&gt;-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------&lt;/P&gt;
&lt;P align=justify&gt;&lt;I&gt;As virtualization is becoming mainstream many organizations are undergoing design efforts to properly deploy their new virtual infrastructures. These organizations usually want to do this within known best practices boundaries. &lt;BR&gt;&lt;BR&gt;Two of the most common concerns in the design criteria surround the hardware footprint. Specifically two of the most frequently asked questions are: &lt;BR&gt;&lt;BR&gt;1) Should I use many small servers or fewer bigger servers? &lt;BR&gt;2) Should I use rack optimized servers or a blade form factor? &lt;BR&gt;&lt;BR&gt;This session will briefly discuss the history of virtualization deployments in the context of the underlying hardware infrastructure and how it is morphing. Pros and cons of the Scale Up and Scale Out models will be discussed with real life examples and general recommendations for deploying many small boxes or few bigger high-end nodes. The session will also outline major differences and design considerations for deploying different form factors including rack servers, blade servers as well as non conventional x86 server footprints. &lt;BR&gt;&lt;BR&gt;The objective for this session is to demonstrate that one solution doesn’t fit all needs and that each organization needs to assess its own requirements and pain points to determine the best hardware layout among the many. This session is supposed to empower these organizations with a list of design considerations in order to elaborate the server infrastructure layout that best meets their needs. &lt;/I&gt;&lt;BR&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------&lt;/P&gt;
&lt;P align=justify&gt;For those of you that are not patient I will give you the answer right away: &lt;I&gt;&lt;B&gt;It depends! &lt;/B&gt;&lt;/I&gt;(or&lt;I&gt;&lt;B&gt; IT depends?&lt;/B&gt;&lt;/I&gt;)&lt;/P&gt;
&lt;P align=justify&gt;This is clearly not an ad for my session: I am not paid by the number of people that sit down! By the way, as far as the salary is concerned, most of you know that I work for a hardware vendor (IBM). Despite that I am trying (well no... I guarantee!) to keep that session (fairly) technical and not a sales/marketing advertisement. The good thing about IBM is that we have hardware technologies in the x86 space that span pretty much the whole spectrum, so there is no (evident) conflict of interest in talking about one scenario Vs the other.&lt;/P&gt;
&lt;P align=justify&gt;The ESX Scale Out Vs Scale Up dilemma is something that has always (professionally) fascinated me. In 2004 I was tired of hearing all the religious wars on the VMTN community forums about the advantages of one model Vs the other so I decided to write a (hopefully balanced) Redpaper on the subject. The reviews were pretty favorable, as you can see (no, that was not my family voting - at least I don't think so), and most of the content and philosophies still apply these days. &lt;/P&gt;
&lt;P&gt;&lt;IMG height=651 src="http://www.it20.info/misc/pictures/VMwareInfrastructures-ScaleOutorScaleUp.JPG" width=868 border=0&gt;&lt;/P&gt;
&lt;P&gt;You can still download this Redpaper at this link: &lt;A href="http://www.redbooks.ibm.com/abstracts/redp3953.html?Open"&gt;http://www.redbooks.ibm.com/abstracts/redp3953.html?Open&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;Another "project" I have been lately working on with respect to this dilemma of scaling Up Vs Out is a table I have on my site whose title is &lt;A href="http://www.it20.info/misc/virtualizationplatformofchoice.htm"&gt;Virtual Infrastructure: Platforms of Choice&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;The idea behind that is that someone would look at the attributes and track down which hardware form factor can deliver what she/he is looking for. It's still very much work in progress as you may notice. One of the many challenges of filling a table like that (as well as of presenting a topic like this) is that the matter in subject is, at least, bi-dimensional. Scale Up Vs Scale Out is one dimension (i.e. big servers Vs small servers) and the "hardware form factor" is another dimension (i.e. racks Vs blades). There are rack optimized designs that scale out, other rack optimized designs that scale up, there are blades whose design is a natural fit for scaling out and there are also other blades whose design resemble a scale up solution (albeit with a number of limitations). &lt;/P&gt;
&lt;P&gt;This discussion is not trivial. To add complexity to an already complex matter other non-conventional form factors are emerging in the market such as the IBM iDataPlex which would be hard to define a rack design (or even a blade design). At this point in time I am thinking about including some iDataPlex charts in the deck just to describe this new trend/architecture (as you can depict my deck is well before draft stage - how would you define a PowerPoint document with one blank page?). &lt;/P&gt;
&lt;P&gt;All in all if you have comments or feedbacks on what you would like to see in a session like this feel free to send me an e-mail: &lt;A href="mailto:massimo@it20.info"&gt;massimo@it20.info&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;Looking forward to Cannes and if you come by, please stop and say hi. &lt;/P&gt;
&lt;P&gt;Massimo. &lt;/P&gt;&lt;img src="http://it20.info/aggbug.aspx?PostID=175" width="1" height="1"&gt;</description></item><item><title>Enterprise Virtualization In-a-Box</title><link>http://it20.info/blogs/main/archive/2008/11/14/162.aspx</link><pubDate>Fri, 14 Nov 2008 06:19:00 GMT</pubDate><guid isPermaLink="false">3066da22-6b27-4cf1-aa0f-2eff79b21f87:162</guid><dc:creator>Massimo</dc:creator><slash:comments>12</slash:comments><comments>http://it20.info/blogs/main/comments/162.aspx</comments><wfw:commentRss>http://it20.info/blogs/main/commentrss.aspx?PostID=162</wfw:commentRss><description>&lt;P align=justify&gt;In this post I am going to talk about a specific piece of hardware technology that is intercepting a specific virtualization industry trend. This piece of technology is called &lt;B&gt;BladeCenter S&lt;/B&gt;. Those of you that have been reading my blog know I don't usually talk about IBM specific stuff (I work for IBM) but this time I felt like &lt;I&gt;the infringement of the law&lt;/I&gt; was worth it. Believe it or not, I would have posted this anyway. &lt;/P&gt;
&lt;P align=justify&gt;Before we get into the specifics of the technology let me take a step back and briefly touch on the industry trend I was referring to. This is going to be basic stuff for most of the virtualization experts out there; besides, these concepts are not new and I have written/talked about them in the &lt;A HREF="/files/3/documentation/entry54.aspx"&gt;past&lt;/A&gt;. Having said this, sometimes it's good to pause for a second and try to summarize what is happening in this industry. Up until the late nineties (almost) every data center looked something like this:&lt;/P&gt;&lt;IMG height=411 src="http://www.it20.info/misc/pictures/EnterpriseVirtualizationIn-a-Box1.jpg" width=548 border=0&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;Very inflexible and vertical silos. Each silo was comprised of the following building blocks:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P align=justify&gt;A server&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;A local disk subsystem (aka DAS - Direct Attached Storage)&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;An operating system&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;An application&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P align=justify&gt;Do you have 100 application services? Deploy 100 of these independent silos! Have you ever heard virtualization experts (true or self-appointed) talking about how bad life was in those days? Look at the picture... and you can imagine how life was. I can tell you: it was very bad (compared to what we have today, obviously; at that time it was... OK).&lt;/P&gt;
&lt;P align=justify&gt;At the beginning of the 21st century we started to see the very first form of "visible" virtualization of an x86 IT infrastructure. I am using the word "visible" because someone might argue that the concept of virtualization was already included in the OS in the form of memory virtualization (physical memory Vs virtual memory etc.); I am not interested in these academic discussions and I am not interested in determining where virtualization first appeared in the x86 ecosystem (we could stay here for days without getting to any useful outcome). I want to focus more on tangible things that end-users/human beings (not IT geeks) understand and can appreciate. Having defined the context, the first form of "visible" virtualization of an x86 IT infrastructure was the storage and particularly the consolidation of all Direct Attached Storage into a single pool of storage resources called SAN (Storage Area Network). And since my mantra is that a picture is worth 1000 words, here is what a common x86 IT infrastructure looked like at the beginning of this century:&lt;/P&gt;&lt;IMG height=411 src="http://www.it20.info/misc/pictures/EnterpriseVirtualizationIn-a-Box2.jpg" width=548 border=0&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;Note: If you ask 100 storage specialists nowadays what storage virtualization is you might very well get 100 responses (perhaps more?) ranging from "Raid 0 is the basic form of storage virtualization" all the way to "a storage grid (whatever that is) is the only form of storage virtualization". I am using here the word virtualization in the context of storage to describe the high level practice of decoupling the disk subsystem from the servers and locating it in a &lt;I&gt;common resource pool&lt;/I&gt;.&lt;/P&gt;
&lt;P align=justify&gt;Back to basics: this is what customers have been doing for the last 10 years or so, getting rid of this locally attached / inefficient / inflexible disk subsystem and moving (almost) all the disk spindles into a central repository, the so-called &lt;I&gt;Storage Server &lt;/I&gt;(the physical data repository attached to the SAN). The very first advantage that this has brought to customers is a more efficient and flexible way to use the storage space; someone might refer to this as &lt;I&gt;Storage Consolidation&lt;/I&gt;. On the other hand, shared consolidated storage brought in (as a bonus, I would say) a brand new architecture that allowed customers to do things that were simply not possible before. One example for all is High Availability clusters: in the good old days of DAS (and the inflexible silos described at the beginning) your application data would most likely be held physically on the same server that was running the application. Should that server fail, you could no longer access your data (unless you restored it from a backup); with SAN shared storage this changed, as you can now "attach on the fly" the same set of data to another server and restart the application from there while remaining consistent in terms of data persistency. Microsoft Cluster Server, anyone? &lt;/P&gt;
&lt;P align=justify&gt;Well, time goes by and right now storage virtualization is no longer the hot topic (I guess everyone recognizes it as more of a prerequisite to run an efficient IT). The buzzword today is server virtualization and, if you think about it, it's the natural progression of what we have seen happening in the past: it's about taking the silo apart and moving additional stuff below the virtualization bar. We have done that with storage, who's next? Did I ever say a picture is worth 1000 words? &lt;/P&gt;&lt;IMG height=411 src="http://www.it20.info/misc/pictures/EnterpriseVirtualizationIn-a-Box3.jpg" width=548 border=0&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;This is basically where we are today. VMware pioneered this concept some 10 years ago and there is now a string of companies that have realized the benefits of this and are working hard to deliver products to implement this idea. I started working on server virtualization some 8 years ago and at that time it was all about server consolidation (i.e. how many servers do you have? 100? we can bring them down to 5 etc.). The more I worked on it the more I understood that we were only scratching the surface of its potential. Today server consolidation is still a huge advantage for those customers virtualizing but it's clearly only one of the many advantage line items. As it was for storage virtualization, we started with the consolidation concept only to find out that there were many other hidden and indirect advantages as a bonus of doing that. One example for all is that, as you virtualize your Windows or Linux systems, it becomes far easier to create a Disaster/Recovery plan for your x86 IT infrastructure. &lt;/P&gt;
&lt;P align=justify&gt;Last but not least, the server virtualization trend is closely tied to the storage virtualization (i.e. SAN) trend for two key reasons:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;
&lt;P align=justify&gt;the standard server virtualization best practices require shared storage to exploit all the benefits &lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P align=justify&gt;server virtualization is allowing customers to get rid (completely) of locally attached storage. While data has historically been moved to a shared repository (SAN), the standard "2 x Raid1 drives pair" remained a (negative) legacy of the x86 deployments. The latest trends (embedded hypervisors on flash disks and/or PXE boot techniques for the hypervisors) will help get rid of all the local server spindles for good!&lt;/P&gt;&lt;/LI&gt;&lt;/OL&gt;
&lt;P align=justify&gt;So why am I so excited about the BladeCenter S, you might wonder? Well, the BladeCenter S maps exactly to the industry trend I have described above. Instead of requiring you to shop for all these elements (servers, SANs, etc) and cable them together, BladeCenter S is a single package that contains them all: &lt;I&gt;&lt;B&gt;servers&lt;/B&gt;&lt;/I&gt;, &lt;I&gt;&lt;B&gt;storage&lt;/B&gt;&lt;/I&gt; and &lt;I&gt;&lt;B&gt;network&lt;/B&gt;&lt;/I&gt;! Enterprise Virtualization In-a-box! Or a &lt;FONT color=#ff0000&gt;&lt;B&gt;data-center-in-a-box&lt;/B&gt;&lt;/FONT&gt; if you will! &lt;/P&gt;&lt;IMG height=489 src="http://www.it20.info/misc/pictures/EnterpriseVirtualizationIn-a-Box4.jpg" width=643 border=0&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;What you see here is basically the physical view/package of the de-facto-standard hardware architecture to support virtual environments. The key point I am trying to outline here is that the disks you see integrated into the chassis are really connected to a true fully redundant internal SAN comprised of 2 x SAS redundant RAIDed switches. It essentially maps the standard servers to storage architecture blue-prints we have been using in the last few years to implement shared storage virtualized deployments. The following picture, for example, is an extract from the standard VMware SAN configuration guide and it illustrates this standard blue-print (which is mapped into the BladeCenter S internal architecture):&lt;/P&gt;&lt;IMG height=364 src="http://www.it20.info/misc/pictures/EnterpriseVirtualizationIn-a-Box5.jpg" width=369 border=0&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;Notice that the only slight difference is that the SAS switches integrated into the BladeCenter S deliver both &lt;B&gt;switch&lt;/B&gt; as well as &lt;B&gt;SP&lt;/B&gt; functionalities. &lt;/P&gt;
&lt;P align=justify&gt;It might help to share with you some more documentation I have been working on and that we presented at the local VMware Virtualization Forum that took place in Milan a few days ago. The following picture describes the internal architecture of the BladeCenter S in further detail: &lt;/P&gt;&lt;IMG height=562 src="http://www.it20.info/misc/pictures/EnterpriseVirtualizationIn-a-Box6.jpg" width=967 border=0&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Notice how the servers-storage connections are similar in concept to those in the standard VMware blueprint (but not limited to VMware deployments though) attached above. Each blade is equipped with a dual-port SAS HBA which in turn connects to 2 x SAS RAIDed switches which control the disks. For those of you familiar with the IBM storage products family this is very similar to what happens when you connect ESX servers to an external DS3200 SAS Storage Server configured with dual controllers. Since in the last few months I have been talking to customers and partners that were pretty confused about what this &lt;U&gt;really is&lt;/U&gt; and how it compares to other implementations available in the industry I did want to outline what other blade vendors are doing to underline the differences:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;&lt;IMG height=536 src="http://www.it20.info/misc/pictures/EnterpriseVirtualizationIn-a-Box7.jpg" width=967 border=0&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;While from a physical standpoint it might look pretty similar (i.e. "&lt;I&gt;a chassis with a bunch of blades and a bunch of disks&lt;/I&gt;"), if you dig into the internals it's of course completely different. The other option outlined in the picture above involves dedicating a single blade (hence a Single Point Of Failure) running Windows Server 2003 Storage Server, with a bunch of disks attached to it. The Windows instance running on the Storage Blade controls the disks and exposes them onto the internal Ethernet network via NFS/iSCSI protocols. This is how other blades in the chassis can "share" those disks. There are, obviously, fundamental differences between having a multi-purpose Windows blade sharing disks over the network and using a standard, fully redundant SAN approach comprised of a dedicated pair of purpose-designed SAS RAID switches that control the disks and map those disks to compute nodes (i.e. the blades dedicated to the virtual infrastructure). The following picture recalls the physical layout of the BladeCenter S with the integrated SAN. &lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;&lt;IMG height=516 src="http://www.it20.info/misc/pictures/EnterpriseVirtualizationIn-a-Box8.jpg" width=967 border=0&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;On the left hand side you can see the front of the chassis where the disks (we had 4 of them in our demo on-site) and the blades (2 x HS21XM in our setup) are installed. On the right hand side the rear view of the BC S chassis shows the 2 x Ethernet switches (that can support up to 4 Ethernet connections from each of the blades) and 2 x SAS RAIDed switches (that control the disks on the front of the chassis and are connected to the blades by means of the SAS daughter cards). &lt;/P&gt;
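&lt;P align=justify&gt;The difference between the two designs compared above (redundant SAS RAID switches versus a single storage blade) comes down to whether any one component sits on every path between a compute blade and the disks. The following Python sketch makes that explicit. The component names are purely illustrative, and the two Ethernet paths in the storage-blade design are an assumption made to give that design the benefit of the doubt:&lt;/P&gt;
&lt;PRE&gt;
```python
def single_points_of_failure(paths):
    """Return the components that appear on every path from blade to disks.

    Each path is a list of intermediate components; losing a component that
    is on all paths cuts the compute blade off from its storage.
    """
    components = sorted({c for path in paths for c in path})
    return [c for c in components if all(c in path for path in paths)]


# BladeCenter S: a dual-port SAS HBA, each port wired to a different
# SAS RAID switch (illustrative names).
bladecenter_s = [
    ["hba_port_1", "sas_raid_switch_1"],
    ["hba_port_2", "sas_raid_switch_2"],
]

# Storage-blade design: even with two Ethernet paths (an assumption),
# every path ends at the one Windows storage blade.
storage_blade = [
    ["nic_1", "ethernet_switch_1", "windows_storage_blade"],
    ["nic_2", "ethernet_switch_2", "windows_storage_blade"],
]

print(single_points_of_failure(bladecenter_s))  # []
print(single_points_of_failure(storage_blade))  # ['windows_storage_blade']
```
&lt;/PRE&gt;
&lt;P align=justify&gt;The model deliberately ignores the compute blade and the disks themselves, since those are covered by clustering and RAID respectively.&lt;/P&gt;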
&lt;P align=justify&gt;Another interesting point I wanted to outline via this setup is that the BladeCenter S is really meant to be a self-contained data center. This doesn't only include the standard &lt;I&gt;User Workloads&lt;/I&gt; (i.e. the guests that are going to support the customer's own environment such as Active Directory, Databases, Web Servers, Application Servers etc) but it also includes all the additional services that are required to configure, monitor and maintain the data center (in a box). Examples of these &lt;I&gt;System Services&lt;/I&gt; include the &lt;I&gt;vCenter&lt;/I&gt; service (red rectangle in the figure above) which can be installed on top of the virtual infrastructure as well as what I refer to as the &lt;I&gt;HW Management&lt;/I&gt; service which is the suite of software products that are used to manage the hardware and its configuration (the yellow rectangle in the figure above - it might include things like IBM Systems Director, IBM Storage Configuration Manager etc). The logical view shows these two services (&lt;I&gt;vCenter&lt;/I&gt; and &lt;I&gt;HW Management&lt;/I&gt;) as external entities that map respectively to the ESX hosts comprising the virtual infrastructure and to the Management Module (MM for short) that is the heart of the BladeCenter chassis. There is no reason, though, why these services need to be installed physically outside of the BladeCenter "domain". A forward-looking take on these services is to consider them a sort of &lt;I&gt;System Partition&lt;/I&gt;s that run side by side with the end-user workloads. These &lt;I&gt;System Services&lt;/I&gt;, as of today, need to be installed manually but ideally in the future they could be distributed as Virtual Appliances (yes Virtual Appliances is my obsession, sorry) for a more streamlined and fast deployment. &lt;/P&gt;
&lt;P align=justify&gt;In the next few screenshots I'd like to give you a high-level feeling of what happens when you connect to the &lt;I&gt;&lt;SPAN&gt;HW Management&lt;/SPAN&gt;&lt;/I&gt; service to configure the hardware components (the shared storage in this case). For this setup I have only installed the IBM Storage Configuration Manager in that &lt;I&gt;HW Management&lt;/I&gt; &lt;I&gt;System Partition&lt;/I&gt;.&lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;First you connect, via web, to the SCM service. One of the main screens summarizes the current internal storage hardware configuration, which is a RAID subsystem comprised of 2 x SAS switches:&lt;/P&gt;&lt;IMG height=652 src="http://www.it20.info/misc/pictures/EnterpriseVirtualizationIn-a-Box9.jpg" width=963 border=0&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;Next is the physical view of the chassis. As you can see we have 4 x physical disks plugged into the front of the chassis and 2 x physical SAS switches in the back of the chassis (the two additional devices you notice in the front are the SAS controller caches). A maximum of 12 physical disks can be installed:&lt;/P&gt;&lt;IMG height=652 src="http://www.it20.info/misc/pictures/EnterpriseVirtualizationIn-a-Box10.jpg" width=963 border=0&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The following view details the characteristics of the physical hard disks:&lt;/P&gt;&lt;IMG height=652 src="http://www.it20.info/misc/pictures/EnterpriseVirtualizationIn-a-Box11.jpg" width=963 border=0&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;Next we create a Storage Pool (aka Array) comprised of these 4 physical drives. This is a very basic configuration where we designate one of the disks as a global hot spare and the other three as a RAID 5 Storage Pool. Total usable capacity is the equivalent of 2 disks (in a RAID 5 array one disk's worth of space is used for parity). Notice that the space available is basically 0 because I have already created LUNs out of this array (see next):&lt;/P&gt;&lt;IMG height=652 src="http://www.it20.info/misc/pictures/EnterpriseVirtualizationIn-a-Box12.jpg" width=963 border=0&gt;
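&lt;P align=justify&gt;For those who like to double check the numbers, the capacity arithmetic in the paragraph above can be sketched in a few lines of Python. This is only a minimal illustration: the function name and its parameters are made up for this example and have nothing to do with the SCM tool itself.&lt;/P&gt;
&lt;PRE&gt;
```python
def raid5_usable_disks(total_disks, hot_spares):
    """Return how many disks' worth of capacity a RAID 5 pool exposes.

    Hypothetical helper for illustration only: hot spares are set aside
    first, then one disk's worth of space goes to parity.
    """
    members = total_disks - hot_spares
    # RAID 5 needs at least three member disks; this check also rejects
    # a negative hot spare count.
    if members not in range(3, total_disks + 1):
        raise ValueError("RAID 5 needs at least 3 member disks")
    return members - 1


# The setup described in the post: 4 physical disks, 1 global hot spare.
usable = raid5_usable_disks(total_disks=4, hot_spares=1)
print(f"Usable capacity: {usable} disks' worth")
```
&lt;/PRE&gt;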
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;These are the two Logical Units (aka LUNs) that I have created using the Storage Pool described above. One is 90GB and the other one is 43GB in capacity:&lt;/P&gt;&lt;IMG height=652 src="http://www.it20.info/misc/pictures/EnterpriseVirtualizationIn-a-Box13.jpg" width=963 border=0&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;The following view lists the discovered SAS daughter cards (hence the corresponding blades) on the SAS fabric. Notice that each blade has two ports for redundancy and each port has its own SAS WWN. This is not any different from a standard FC configuration for those of you used to Storage Area Networks:&lt;/P&gt;&lt;IMG height=652 src="http://www.it20.info/misc/pictures/EnterpriseVirtualizationIn-a-Box14.jpg" width=963 border=0&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;This is how I have mapped Servers to LUNs. On the left-hand side I have listed both blades whereas on the right-hand side I have listed both LUNs I have created. By doing so I allowed both blades to share both LUNs. There is no particular reason why I created 2 LUNs. I could have created 1, 3 or 4 if I wanted or needed to, and I would still have been able to share them with both blades:&lt;/P&gt;&lt;IMG height=652 src="http://www.it20.info/misc/pictures/EnterpriseVirtualizationIn-a-Box15.jpg" width=963 border=0&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;So far we have been working with the &lt;I&gt;HW Management&lt;/I&gt; service to configure the hardware (this example is limited to configuring the shared storage). Now we can switch gears and connect to the other &lt;I&gt;System Partition&lt;/I&gt; to manage the virtual infrastructure software. In this case we will connect to the &lt;I&gt;&lt;SPAN&gt;vCenter&lt;/SPAN&gt;&lt;/I&gt; service to configure our VMware infrastructure. Notice that, although I have been using a beta version of the next VMware virtual infrastructure product, everything you will see here can be done with the latest VI3 version &lt;U&gt;available today&lt;/U&gt;.&lt;/P&gt;
&lt;P align=justify&gt;The following screenshot outlines the overall configuration of our data-center-in-a-box. As you can see there are 2 blades equipped with ESX and they belong to a cluster. On these blades we have created the two management partitions we have been discussing (&lt;I&gt;vCenter&lt;/I&gt; and &lt;I&gt;HW Management&lt;/I&gt;). There are also some Guest templates I have created. One important thing to notice from this screenshot is that the first blade can access both shared SAS LUNs (for the record, it can also access its own dedicated/local Storage1 VMFS volume):&lt;/P&gt;&lt;IMG height=652 src="http://www.it20.info/misc/pictures/EnterpriseVirtualizationIn-a-Box16m.jpg" width=963 border=0&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;The next picture confirms that &lt;U&gt;both blades&lt;/U&gt; can access the shared LUNs created. Shared storage is what enables the VMware advanced features such as VMotion, DRS, HA etc:&lt;/P&gt;&lt;IMG height=652 src="http://www.it20.info/misc/pictures/EnterpriseVirtualizationIn-a-Box17m.jpg" width=963 border=0&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;Here we will attempt a VMotion of the &lt;I&gt;HW Management&lt;/I&gt; partition running on &lt;B&gt;esx1 &lt;/B&gt;onto the other host in the cluster:&lt;/P&gt;&lt;IMG height=652 src="http://www.it20.info/misc/pictures/EnterpriseVirtualizationIn-a-Box18m.jpg" width=963 border=0&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;The Guest is being moved from one host onto the other. Notice the status bar at the bottom:&lt;/P&gt;&lt;IMG height=652 src="http://www.it20.info/misc/pictures/EnterpriseVirtualizationIn-a-Box19m.jpg" width=963 border=0&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align=justify&gt;And here the Guest has moved and it's now running on &lt;B&gt;esx2&lt;/B&gt; as you can see from the Summary pane (and the status bar at the bottom):&lt;/P&gt;&lt;IMG height=652 src="http://www.it20.info/misc/pictures/EnterpriseVirtualizationIn-a-Box20m.jpg" width=963 border=0&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;I truly believe that the BladeCenter S is a piece of technology that is sometimes undervalued. There is an enormous potential in it that many people haven't fully exploited. It's really what I would describe as a no-compromise Enterprise "pocket" data center. Not so much "pocket" after all: if you consider that an HS21XM blade could support, on average, some 15 to 20 VMs (depending on the workload) and that a fully populated chassis holds six blades, we are talking about a 7U Enterprise solution that could support around 100 VMs. Far more than what an average SMB shop might require.&lt;/P&gt;
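&lt;P align=justify&gt;The "around 100 VMs" figure is simple back-of-the-envelope arithmetic, sketched below. The per-blade range comes from the paragraph above; the six blade bays are my assumption for a fully populated BladeCenter S chassis, and the function name is purely illustrative.&lt;/P&gt;
&lt;PRE&gt;
```python
# Assumption: a fully populated BladeCenter S chassis with six blades.
BLADE_BAYS = 6
# Average VMs per HS21XM blade quoted in the post (workload dependent).
VMS_PER_BLADE = (15, 20)


def chassis_vm_range(blades, vms_per_blade):
    """Return the (low, high) VM count for a chassis with the given blades."""
    low, high = vms_per_blade
    return blades * low, blades * high


low, high = chassis_vm_range(BLADE_BAYS, VMS_PER_BLADE)
print(f"Estimated capacity: {low} to {high} VMs in a 7U chassis")
```
&lt;/PRE&gt;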
&lt;P align=justify&gt;Massimo. &lt;/P&gt;&lt;img src="http://it20.info/aggbug.aspx?PostID=162" width="1" height="1"&gt;</description></item><item><title>Will Microsoft sunset VMware? - 18 months later -</title><link>http://it20.info/blogs/main/archive/2008/11/04/157.aspx</link><pubDate>Tue, 04 Nov 2008 00:35:00 GMT</pubDate><guid isPermaLink="false">3066da22-6b27-4cf1-aa0f-2eff79b21f87:157</guid><dc:creator>Massimo</dc:creator><slash:comments>736</slash:comments><comments>http://it20.info/blogs/main/comments/157.aspx</comments><wfw:commentRss>http://it20.info/blogs/main/commentrss.aspx?PostID=157</wfw:commentRss><description>&lt;P align=justify&gt;Early in 2007 I wrote a post whose title was "Will Microsoft Sunset VMware?". You can read it &lt;A href="/blogs/main/archive/2007/04/15/7.aspx"&gt;here&lt;/A&gt;. The closing of that post was: &lt;/P&gt;
&lt;P align=justify&gt;&amp;gt; &lt;I&gt;This analysis is as of April 2007. I am sure many things can and will change and I might be proven wrong. Let's see what happens.&lt;/I&gt;&lt;/P&gt;
&lt;P align=justify&gt;I went through it this morning and I have to say that (so far) I have gotten it right. I could even republish it "as is" and it would still hold true even 18 months later (except Microsoft did change the name of their hypervisor!): Xen didn't really take over the world (and the KVM speculations I made are materializing now with RedHat and SUSE switching to KVM and abandoning Xen) and also all the thoughts about innovation, add-on value, cost and so forth do still make some sort of sense as of (end of) October 2008. &lt;/P&gt;
&lt;P align=justify&gt;The reason I bring this topic to the foreground again on my blog is that more than ever I read comments on the blogosphere about how VMware is going to be eclipsed by Microsoft now that the Redmond giant is engaging seriously. I am not ruling out this possibility as no one knows what will happen in the future (one can only speculate given past and present experiences) but I want to stress the fact that these people don't get (in my opinion) what's really going on here. And what's going on ... is a very big thing. &lt;/P&gt;
&lt;P align=justify&gt;Let me try to be concise (something that I have never really mastered). Overall at VMware I think they are working out their plan at two different levels which I refer to as the &lt;I&gt;&lt;B&gt;tactical level&lt;/B&gt;&lt;/I&gt; and the &lt;I&gt;&lt;B&gt;strategic level&lt;/B&gt;&lt;/I&gt;. &lt;/P&gt;
&lt;P align=justify&gt;At the &lt;B&gt;tactical level&lt;/B&gt;, VMware is committed to providing the best hypervisor and the best management tools to create a virtual infrastructure. At this level, they position VMware ESX as the best hypervisor vs. Microsoft Hyper-V, and VMware VI3 (along with all the other tools they have announced in the last year or so) as the best management tools vs. the Microsoft System Center suite (which includes Virtual Machine Manager), and so on. All of this is aimed at supporting &lt;I&gt;legacy Linux and Windows types of workloads&lt;/I&gt; in the best possible way. &lt;/P&gt;
&lt;P align=justify&gt;After all, if you think about how today's virtual infrastructure - built on various software platforms such as VMware, Microsoft, Citrix or VirtualIron - is used, I think it's fair to say that your virtual machine can be defined as super flexible and powerful (virtual) hardware, but the software stack you run within the VM (i.e. the black box) is hardly different from the software stack you would be running on a physical box. So given a legacy Linux or Windows stack comprised of many dozens, hundreds or even thousands of physical servers, what is the best target virtualization platform to make a giant P2V, so to speak? This is the tactical battle VMware is engaged in to stay ahead of Microsoft.&lt;/P&gt;
&lt;P align=justify&gt;I agree that if you only look at things from this level, VMware could be in a dangerous position when it's all about "just" writing code to catch up with your competitor's feature set. We know MS is pretty good at that, plus they have deep pockets and can throw tons of developers at the problem to shrink the gap. Well, it's clearly not that easy and I am obviously exaggerating but you get the idea: if it's just about "a tool" there is always a possibility that your competitors will catch up with you if they become serious about it. I think this is why many people think that VMware could become the next Netscape.&lt;/P&gt;
&lt;P align=justify&gt;The &lt;B&gt;strategic level&lt;/B&gt; at which VMware is engaged... actually I touched on this 18 months ago and that very same thought remains very much true, and it's materializing with the latest VMware messages. In that blog post (April 2007) I wrote: &lt;/P&gt;
&lt;P align=justify&gt;&lt;I&gt;&lt;FONT color=#ff0000&gt;&amp;gt;Changing the rules&lt;/FONT&gt;: perhaps one of the most important thing which is leading me to think that VMware will not be sunset is the fact that they (VMware) are thinking about "changing the rules" in the datacenter and of IT in general rather than viewing virtualization as a means to reduce the number of servers from 20 to 1. While the use of virtualization has originally being considered for Server Consolidation projects clearly this is now one of the many facets of the advantages that a virtualized Datacenter and a virtualized IT will gain (Disaster Recovery is certainly one example of these new scenarios). Another example of these new use cases for virtualization are Virtual Desktops hosted in the Datacenter that are changing the way Administrators are thinking about their distributed IT. The next frontier would be Virtual Appliances which is a very different way to develop and deploy applications compared to what we are doing today. In such a scenario the role of the Operating System would change drastically where some of the OS features would be drained into the virtual infrastructure while some others will be distributed as part of the application in a consolidated virtual machine file (that is the virtual appliance). This is a fascinating scenario and as you can imagine it involves more than just developing a hypervisor with a management interface to it: it involves creating a new culture on how we deal with IT, taking all the pieces apart and rebuild our datacenters in a much more efficient way. &lt;/I&gt;&lt;/P&gt;
&lt;P align=justify&gt;I wouldn't know how to say it better in October 2008. Perhaps the only thing I can do is add a couple of pictures that would graphically outline this concept: &lt;/P&gt;&lt;IMG height=426 src="http://www.it20.info/misc/pictures/will%20MS%20sunset%20VMware%20-18%20months%20later-.jpg" width=847 border=0&gt; 
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;The silo on the left outlines what I think to be the Microsoft systems virtualization strategy. &lt;I&gt;Systems &lt;/I&gt;being here a key word: MS does have a more articulated virtualization strategy that goes beyond virtualizing a piece of server hardware (so do VMware and Citrix, for the record). However this discussion is really centered on systems virtualization and the corresponding stack. Back to the point... at Microsoft they can't afford to compromise a very successful (and healthy) business such as the Windows OS, so Windows does need to remain central to their systems virtualization strategy. Windows is the means by which they deliver their value and Windows will be their strategic play. It's not by chance that they pitch Hyper-V as a Windows 2008 value item, for example. It's not by chance that they pitch Microsoft System Center as a toolset to properly manage both virtual and physical Windows deployments. It's not by chance that all of their products are Windows-based (except perhaps Office for Mac and a few others which it would be fair to describe as "not the bulk of their business" anyway). We can go on and on but in the end we will always be gravitating around one central and critical word: Windows. &lt;/P&gt;
&lt;P align=justify&gt;The silo on the right, on the other hand, outlines what I think to be the ultimate VMware strategy. They basically want the virtualization layer to become the Datacenter OS. I speculated about this at VMworld 2007 and they announced this at VMworld 2008 (read &lt;A href="/blogs/main/archive/2008/09/21/143.aspx"&gt;this irreverent post&lt;/A&gt; if you have time). VMware would like to challenge the current notion of the OS: they would like to take apart the OS we know and redistribute part of its features into their new &lt;I&gt;Virtual Datacenter OS&lt;/I&gt; concept and part of its features into this new Just Enough OS (JEOS) concept. JEOS wraps the application and only provides minimal assistance to it (to the point that it only needs to provide boot capabilities and a proper minimal run-time environment). &lt;/P&gt;
&lt;P align=justify&gt;As you can see from the pictures it would be very difficult to map what Microsoft and VMware are trying to drive strategically and come up with an apples-to-apples comparison. This is the strategic challenge in which VMware is engaged. And the interesting thing is that they are not engaged against Microsoft, they are engaged against a whole industry that is used to looking at the x86 stack in a "slightly" different (and much less aggressive) way than VMware is, in my opinion, envisioning. As a matter of fact we are still trying to get users to digest "virtualization" to support standard legacy software stacks (and it's not always easy). I am sure you can imagine what it will take for the industry as a whole to digest this new software stack layout. This is in fact, not by chance, one of the strongest value propositions Microsoft is promoting: all the benefits of virtualization without disruption and discontinuity from the past. &lt;/P&gt;
&lt;P align=justify&gt;The final analysis: this is where the real battleground is for the next few years to come. If the industry embraces the VMware message and strategy and starts to redefine the software boundaries in the data center, then VMware will have the lead. If the industry does not embrace the VMware message and settles for the advantages of running a legacy software stack in a slim software bubble (VM) as opposed to running the same software stack on top of a dedicated physical box... then MS can cause much trouble for the VMware business, and VMware will be forced to continue the &lt;B&gt;tactical battle&lt;/B&gt; I talked about at the beginning.&lt;/P&gt;
&lt;P align=justify&gt;My speculation is that virtual appliances will have a huge role in this. Virtual appliances, by definition, implement the ultimate VMware vision. The success (or lack thereof) of virtual appliances will determine VMware's future as a winner or as a loser in the data center. VMware could well be the next Netscape but, what if it is the next Microsoft? Interesting dilemma. I don't know who is going to win and who is going to lose in the end, but I am certain Microsoft will not sunset VMware nor will VMware sunset Microsoft. The x86 market is healthy enough that, while the winners can really make tons of money, the losers will have their slice of the pie, too, for some time to come.&lt;/P&gt;
&lt;P align=justify&gt;Massimo. &lt;/P&gt;&lt;img src="http://it20.info/aggbug.aspx?PostID=157" width="1" height="1"&gt;</description></item><item><title>Distributed IT is (definitely) broken</title><link>http://it20.info/blogs/main/archive/2008/10/25/155.aspx</link><pubDate>Fri, 24 Oct 2008 22:38:00 GMT</pubDate><guid isPermaLink="false">3066da22-6b27-4cf1-aa0f-2eff79b21f87:155</guid><dc:creator>Massimo</dc:creator><slash:comments>0</slash:comments><comments>http://it20.info/blogs/main/comments/155.aspx</comments><wfw:commentRss>http://it20.info/blogs/main/commentrss.aspx?PostID=155</wfw:commentRss><description>&lt;P align=justify&gt;I have been working in IT for about 17 years now, 14 of which at IBM. Since the first day I was immediately exposed to the concept of a centralized IT where everything is fully controlled, fully secured, fully automated and easy to manage within the data center boundaries; on the other hand whatever sits outside of the server room should be &lt;I&gt;dumb&lt;/I&gt; and wouldn't require any (major) maintenance tax onto the IT organization. For those that have been around for a while this exactly describes how a mainframe operates (more or less).&lt;/P&gt;
&lt;P align=justify&gt;"Unfortunately" (you can speculate on the quotation marks if you want) I have built my career at IBM on something that sits at exactly the opposite end of the spectrum from the mainframe: that is the x86-based server business (was PC Servers, was Netfinity, was xSeries, is now System x / BladeCenter). That's why I have enjoyed, in the last few years, looking at the mainframe as the &lt;I&gt;holy grail&lt;/I&gt; (or the &lt;I&gt;North Star&lt;/I&gt;) toward which I'd like to push my "little" x86 servers. &lt;/P&gt;
&lt;P align=justify&gt;So why is distributed IT broken? Simply because I think businesses have sold their soul to the devil: they gave up things like control, security, automation and low costs of operations for the nirvana of flexibility and low acquisition costs that came with x86 servers (and PCs). And since this is a client-server model, it has affected both the x86-based server portion of the data center and the (even more distributed) client environment. Client-Server here doesn't strictly pertain to the architecture of the applications; rather it pertains to the devices one will end up managing no matter what the application architecture is: the application of choice might be Web-based but at the end of the day the IT organization will most likely be running the web server on an x86 Windows or Linux box and the end-user browser will run on a fully featured PC/Laptop with a Windows client OS. It's going to be a Client/Server world anyway, no matter the application architecture.&lt;/P&gt;
&lt;P align=justify&gt;In this brief post I just want to show a couple of proof points of this broken IT model. The first one is a screenshot of a "server" I found during a local customer visit. Ready? Fasten your seat-belt please:&lt;/P&gt;&lt;IMG height=411 src="http://www.it20.info/misc/pictures/Distributed-IT-is-definitely-broken1.JPG" width=548 border=0&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;Now, this is not a guess, this is for sure (I did ask) a Microsoft Software Update Services (SUS) "Server". While the first sticker (on the green bezel) says "Test..." the other one features a "NON SPEGNERE" that means "DO NOT POWER OFF", so those of you that are thinking this was a sort of quick and dirty trial on the desk... should think twice about it. A couple of additional things you might want to notice: first, this "server" was physically located on an office desk, which means that the x86-based portion of that data center has basically left the actual physical data center rooms and has spread outside of them (very scary). Second, this is by no means a small SMB shop (I have seen production MAIL servers at those accounts that were even worse than this); no, this is a big enterprise customer with many thousands of (actual) servers. If such big organizations are doing things like these, what's going on in "our" server rooms (and outside of them!) is pretty scary to say the least. &lt;/P&gt;
&lt;P align=justify&gt;So much for the server side of things. How about the clients (desktops/laptops)? Do you remember those zero-maintenance 3270/5250 terminals we all used to access our AS/400 and mainframe programs? Well, I took this other picture a few days ago and while it's not as scary as the one above it tells a lot about where we have ended up with desktop/laptop management:&lt;/P&gt;
&lt;P&gt;&lt;IMG height=412 src="http://www.it20.info/misc/pictures/Distributed-IT-is-definitely-broken2.JPG" width=372 border=0&gt;&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P align=justify&gt;It literally says: &lt;/P&gt;
&lt;P align=justify&gt;--------------------------------------------------------------------------------------------------------------------------&lt;/P&gt;
&lt;P align=justify&gt;&lt;I&gt;&lt;B&gt;&lt;FONT size=4&gt;Distribution point for 1GB additional memory (RAM) to install Lotus Notes 8.0.1 &lt;/FONT&gt;&lt;/B&gt;&lt;/I&gt;&lt;/P&gt;
&lt;P align=justify&gt;&lt;I&gt;&lt;B&gt;&lt;FONT color=#ff0000 size=5&gt;The laptop needs to be Powered Off! Not Hibernated!!!&lt;/FONT&gt;&lt;/B&gt;&lt;/I&gt;&lt;/P&gt;
&lt;P align=justify&gt;--------------------------------------------------------------------------------------------------------------------------&lt;/P&gt;
&lt;P align=justify&gt;The scary thing about this is that the organization going through this massive process has roughly 9,000 employees. If you compare this (little example) to the way a central processing unit with dumb terminals used to work you start getting a feeling for how broken things are in the x86 (client-server) space. &lt;/P&gt;
&lt;P align=justify&gt;Now I am 100% sure we won't go back to those days (nor am I suggesting that we try to do that), also because no one would want to give up the GUI experience for a green character interface (how the h%&amp;amp;l can I watch YouTube on a 3270 terminal?), and yet clearly something needs to be done. The good news is that there are technologies that will allow IT organizations to do this and get to the point where they do not need to trade off control, security and other important data center aspects to get the flexibility and experience end-users demand (and expect) in the 21st century.&lt;/P&gt;
&lt;P align=justify&gt;Imagine... a world where your SUS "Server" will just be a service running in your server room (or someone else's server room out in the cloud) that doesn't require a "dedicated server" in your data center (and not even a dedicated desktop in the office - can you believe it?) and where your e-mail client update won't require anyone to go to the office (and waste half a day) to get an additional 1GB of memory.... &lt;/P&gt;
&lt;P align=justify&gt;You may say I am a dreamer, but I am not the only one (where did I hear this?). &lt;/P&gt;
&lt;P align=justify&gt;Massimo. &lt;/P&gt;&lt;img src="http://it20.info/aggbug.aspx?PostID=155" width="1" height="1"&gt;</description></item></channel></rss>