Monday, December 23, 2013

Error while updating VMTools in Fusion 6

Like many Mac users, I run VMware Fusion so I can use Windows when needed. I've got a VM running Windows 8.1 that I use primarily for Visio and RDP sessions (I've never liked the native RDP client for OS X). And to be honest, it's one more machine for me to patch and update; this is therapeutic for my OCD.

I ran into a problem a few weeks ago while trying to update VMTools in my Windows 8.1 guest. It's an error I've seen many times over the years with various Windows applications, and when it popped up, I had flashbacks from regedt32 and blowing away keys on Windows 2000.

I figured this was a common issue, so I searched VMware's KB and found this article: Unable to upgrade existing VMware Tools (1001354). I ran through the section on Windows 8, but none of those steps applied to my installation. The last section (finding and deleting all occurrences of vmware in the registry) was a little more than I needed. I didn't want to corrupt all of the VMware software I've got loaded (PowerCLI is a MUST!). So I tried something different.

In the error screenshot, you'll notice that the source field references a CLSID (it's the data that begins with {8AE43F1). The CLSID uniquely identifies the application that you're working with. I decided to search the registry for any data that matched the first portion of this CLSID. Here's the result of that search:


You can see that the search turned up an entry that referred to VMware Tools and the path to the installer. I decided to delete this whole key and then restart the VM. (This is where timid bloggers will say "make a backup." I say just delete it. I'm bold like that.)
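If you'd rather script the hunt than click through regedit, reg.exe can run the same search, and (despite the bravado above) take a backup before you delete anything. A sketch, with a placeholder key path since yours will differ:

```powershell
# Search HKLM recursively for data matching the start of the CLSID.
reg query HKLM /s /f "{8AE43F1" /d

# Export the key your search turns up before deleting it.
# NOTE: the path below is a placeholder, not the actual key from my VM.
reg export "HKLM\SOFTWARE\Example\VMwareToolsInstaller" vmtools-backup.reg
reg delete "HKLM\SOFTWARE\Example\VMwareToolsInstaller" /f
```

The export gives you a .reg file you can double-click to restore the key if the deletion turns out to be a mistake.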

Success! I was able to install VMTools without error after getting rid of that key.

Now VMTools is current and running. Don't waste time wondering if this is a Windows problem or a VMTools problem. You'll drive yourself mad in the process. Just fix it and move on. Nothing to see here.

Friday, December 20, 2013

A Hammer for Every Nail

I spend a lot of time at Home Depot. A lot.
You know how it is: every time you start a new project at home, you go out and buy a whole new set of tools specifically for that job, right? The hammer you bought to hang those picture frames can't possibly be used to attach wheels to a pinewood derby car. Or maybe advances in hammer technology have made your old hammers obsolete, so you're compelled to purchase a new one every few years. You still keep the others around, though. Just in case.

Note: Only insane people do this.

Tools perform tasks, not projects

You buy a hammer to drive nails, not just nails for a single project. When another project comes along, you use the same tools, as long as those tools are suited for the job. You certainly do not buy a hammer for every nail you drive.

Now take this approach to tools, and apply it to enterprise monitoring.

In large organizations, it's common for each application, service, or portion of infrastructure to have its own monitoring tools. Exchange is monitored by Exchange monitoring tools. Storage is monitored by storage monitoring tools. And so it is with all of the technologies in use. Each project has its own hammer. And thanks to institutionalized silos (both organizational and cultural), tools shall not be shared.

This doesn't sound so bad. Is this bad?

I took the photo of hammers at Home Depot to prove a point. Yes, if you are so inclined and well funded, you can absolutely buy a hammer for every nail. Load up a shopping cart with 100 hammers and go through the check-out. (On a side note, if you actually do this, please please please send me a picture.) Use each hammer for a single nail.

Or at the office, go ahead and buy a monitoring solution every time you stand up a new service. Load up your budget with Tivoli, BSM, SolarWinds, ManageEngine, Quest, and whatever else you can find. Total up the costs and get your manager to approve it. (On a side note, if you actually do this, please please please send me a picture.) Use each monitoring solution for a single service.

What's the point?

The point is this: if you approach enterprise monitoring differently for each service, you're doing it wrong. If you currently have a spreadsheet of the 40-50 "enterprise" monitoring tools you have in your environment, you're doing it wrong.

Instead, survey the services and applications that your IT department provides to your users. Build out requirements for a true enterprise monitoring solution (which will likely be a combination of tools, yes, but the combination should be intentional and complementary). Invest in a tool to monitor your enterprise. That's the perspective you'll want to have. You'll reduce costs and complexity by relying on fewer, more powerful tools.

Friday, December 6, 2013

HA vs FT

First things first: I'm not talking exclusively about vSphere here. I'm writing about concepts, expectations, and operations, not any one product in particular.

I'm wrapping up my first week at a new project. It was a first week like most others: spending time on administrative tasks, completing various applications for access to stuff, and meeting people that I'll be spending some quality time with in the near future. I even sat in on a few meetings that helped me start to wrap my head around the infrastructure, which is rapidly changing.

Dude, this picture doesn't even make sense.
One of the topics that came up in these meetings had to do with a pair of firewalls and their configuration as an active/passive pair. From what I gather, the conversation (during the design phase) went like this:

Management - "Are the firewalls designed to be redundant?"
Engineer - "Yes."

A seemingly innocuous, normal exchange. But the difference was in the way the engineer interpreted management's question. Management, by redundant, meant that a failure at the hardware level would not affect the flow of traffic. Not even for a second. The engineer, upon hearing redundant, took it to mean that yes, there were two firewalls, and if one failed, the other would handle the load after a brief outage.

This is where HA vs FT becomes important. In a vSphere cluster, we know that HA will recover virtual machines automatically, AFTER the hosts determine that a failure has occurred. In order to reduce unnecessary failover, there's some logic in HA that prevents VM recovery until multiple checks and heartbeats have failed. The effect of this logic is that VM restart can take a minute or two (or three; I don't have my vSphere bible (aka the Clustering Deepdive book) with me at the moment). As vSphere people, we're accepting of this time. It's still WAY faster than any manual failover or recovery that we could do. But this does mean that there's an outage.
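As a toy sketch of that "wait for multiple misses" logic (the threshold and the heartbeat stream below are made up for illustration, not vSphere's actual timings):

```shell
# Declare a host failed only after several consecutive missed
# heartbeats, so one dropped packet doesn't trigger a failover.
misses_needed=3
streak=0
declared=no
for beat in ok ok miss miss miss; do   # hypothetical heartbeat stream
  if [ "$beat" = "miss" ]; then
    streak=$((streak + 1))
  else
    streak=0
  fi
  if [ "$streak" -ge "$misses_needed" ]; then
    declared=yes
  fi
done
echo "host declared failed: $declared"   # prints: host declared failed: yes
```

The waiting is what buys you protection from false positives, and it's also exactly where the outage window comes from.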

This truly is HA: High Availability. It's not the avoidance of outages; it's the rapid recovery from outages.

FT is a different beast altogether. Now we're talking active/active. And that brings up lots of other considerations (tracking sessions across devices, addressing, how to monitor, load balancing, et cetera). It's FT that management is expecting when they say "redundant," not HA. They're looking for a solution that has no impact to their business customers during a hardware failure.

Engineers will say, "But fault tolerant systems cost 10x more! Diminishing returns! Unnecessary complexity!" Those all may be true. But management needs to hear that and make the decision on whether pursuing fault tolerance is worth it. Don't assume that it's too expensive. Find out what the functional requirements are at the start of your project, document, and get approval. There's always money for a good solution, and there's rarely money for a bad one.

The management-to-engineering interface is always a challenge in the IT world. Learning to speak both languages helps to avoid the problem of miscommunicated and misunderstood requirements.

Saturday, November 23, 2013

Ever the New Guy

My penmanship sank years ago.
Show of hands: who among you has a hilarious anecdote about a FNG at your job? You know, that annoyingly enthusiastic technologist who wants to share their experience from other workplaces?

An often overlooked benefit of being a consultant in the IT world is that you get to see how technology is used in many diverse business settings. Your eyes open to the variety of functional and non-functional requirements that can shape the design of things like vSphere and UCS. Your assumptions about how technology is used will be challenged, and you'll end up developing your consulting skills (and probably your coping skills, too).

But these are all benefits for you, dear consultant. What about the client? What do they gain from some consultant showing up for a few weeks, maybe even a few months?

Consultants are the bumblebees of the IT world. We cross-pollinate infrastructures by sharing our knowledge of what works and what doesn't. And for many organizations with a low turn-over rate, we provide a much needed infusion of fresh ideas and new approaches to problem solving.

Of course, the ultimate responsibility for adopting change falls on the customer. I can share the best, most current thinking on business concepts like enterprise mobility, client virtualization, and BYOD (if you're looking for BYOB, check out my doppelganger's blog here). But if the existing staff is hell-bent on convincing me why they can't do things the right way, or why their outdated, ill-informed understanding of technology should be the basis for an agency's enterprise mobility strategy, change will be... difficult.

tl;dr - Consultants are ever the new guy, and new ideas are worth listening to.

PS - Looking back at this post, it seems to be a bit passive-aggressive. It's really not, though. Sure, it was the product of a difficult debate at the office this week, but I did my best to turn a frustrating argument into a positive experience. And the guy I was arguing with is a good engineer, so the debate was every bit as technical as it was rhetorical. Of course, I'm an IT professional with a degree in rhetoric, so he didn't stand a chance. :)

Tuesday, November 19, 2013

Cooperative Puzzle Solving

Most of us have completed a lesson in cooperative puzzle solving while we were in grade school. I recall my third grade class being divided into groups, with each group given a set of puzzle pieces. When each group tried to complete its puzzle, it ended up with a piece that didn't fit into the last space. Then we realized that every group had a leftover piece and a missing piece, and the only way to solve the puzzles was to trade pieces with the other groups.

It's a great lesson in cooperation and divergent thinking. Students are tempted to look for a problem within the limits of their group, but they need to think about solutions that are external to their group to find the right answer.

Now let's talk about how information technology professionals, and the technology they manage, are typically "organized." We've got a box for server people, a box for network people, a box for storage people, and then there's that weird "developers" box that is, like, totally on a different org chart. If server, network, and storage were continents, developers would be the moon.

Humans are hard-wired to sort things. It's what we do. The need to sort people based on skills seems straightforward and harmless enough. But before you know it, you've created the antithesis of the goal-oriented engineer: THE SILO. (See my post on a related org chart malady.)

Silos not only trap engineers in neat little boxes; they also tend to isolate applications and functionality.

Earlier today, I was discussing strategy for the adoption of cloud computing when someone pointed out that many of the custom applications in use had their own authentication mechanisms, in contrast to using a centralized directory service like Active Directory. Migrating to the cloud is surprisingly easy when you're using centralized authentication. However, migration to the cloud is crippled by multiple, discrete authentication sources. It's possible, just messy.

In this example, the organizational silos were effectively interfering with the agency's ability to quickly move to the cloud. What's most frustrating here is that all of the pieces to this puzzle are distributed among the various IT groups, but those groups aren't willing to share their pieces, nor are they open to the idea that all groups benefit from cooperation and collaboration. If developers and server teams could work together to address access and authentication to applications via Active Directory, the agency would be moving towards a more portable and secure application infrastructure. Developers would save time by not having to build auth from scratch, and server admins would maintain complete control over all users in the environment. Divergent thinking would prevail.

Please keep this in mind as you go about your career in IT. The solutions to your problems might not be within your box, but they're rarely more than a few inches to the left or right on your org chart.

Friday, November 15, 2013

Cisco Champions for Data Center

Looks like a Mantis, therefore cool.
I had just slinked into the driver's seat of my beloved S40 at the end of a long day at work. I always glance at my email before hitting the road to make sure I don't have any grocery store requests to satisfy en route, so I unlocked my phone and did a double-take. Something about "Welcome to Cisco Champ..." but the rest was truncated. I had assumed it was an acknowledgement of my interest in the program, and headed home.

Fast forward to later that night.  At my desk, I read the full text of the message. After a few careful reads (and after dismissing thoughts of being an unintended recipient), I finally realized that I had been accepted into the Cisco Champions Program for Data Center! Not too shabby for what many coworkers dismiss as a "server guy." :)

I'll save the litany of thanks for an evening at DuClaw with friends. But I do want to thank David Yarashus of Chesapeake NetCraftsmen for encouraging me to be active in the community, and supporting my efforts to "reach across the aisle" in order to break free from the server vs. network approach to IT that I've ranted about in the past. David is wicked smart, and while I can't quite call him a co-worker any longer, I am fortunate to call him a friend. Thank you, David!

One of the guidelines for the CCDC program says to continue to be yourself. So expect more photos of insects, the occasional technical break-fix article, and plenty of posts from an English major with an IT addiction. Hell, maybe I'll toss in a few unicorns just so you know I'm legit.

Tuesday, November 5, 2013

Wrestling OpenNMS

All jesting aside, I love OpenNMS.
I've been using OpenNMS since before it was cool. Though to be honest, so has everyone, since OpenNMS is still not cool. But it is a great open-source NMS solution, and it has matured significantly in the ten years since my first exposure to the software.

The current release of OpenNMS is much easier to implement than previous versions, even more so when deploying via yum. You no longer have to treat each component as an individual installation; the installer takes care of everything from the postgres database up through the web application. (This is not an insignificant matter. Years ago, simply getting OpenNMS up and running was a mark of honor.)

Last week, I was asked to troubleshoot an issue with OpenNMS. The issue was that it was broken.

Turns out that the application couldn't run because the postgres instance wasn't running. Like any three-tiered web app, if the database layer is down, then the upper layers can't function. So I dug into the logs to find out what the matter was.

If you're not familiar with troubleshooting on Linux, let me tell you about a great command to use when diagnosing a problem with a system service: systemctl status <servicename>. In this case, I issued systemctl status postgresql and got the following:

FATAL: could not create shared memory segment: Invalid argument 
DETAIL: Failed system call was shmget(key=5432001, size=4194304, 03600).
HINT: This error usually means that PostgreSQL's request for a shared memory segment exceeded your kernel's SHMMAX parameter. You can either reduce the request size or reconfigure the kernel with larger SHMMAX.

You can see pretty easily what the problem is. Postgres is asking for a bigger shared memory segment than the kernel's SHMMAX limit allows (shared memory is different from "physical memory"). So how do we fix that? It's surprisingly simple. Just drop this line into /etc/sysctl.conf:

kernel.shmmax=4194304

Now a reboot, and you're back in business. Technically, you could issue a different command (sysctl -w kernel.shmmax=4194304) and then start postgres, but I like to confirm that the change will survive a reboot so there are no surprises down the road.
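To sanity-check the change, you can compare the kernel's current limit against the size postgres asked for; the 4194304 below comes straight from the shmget() call in the error above:

```shell
# Compare the kernel's shared memory ceiling to postgres's request.
requested=4194304                      # size= value from the FATAL error
current=$(cat /proc/sys/kernel/shmmax)
awk -v cur="$current" -v req="$requested" 'BEGIN {
  if (cur + 0 < req + 0) print "shmmax too low: " cur " < " req
  else                   print "shmmax OK: " cur " >= " req
}'
```

If the comparison still reports "too low" after your edit, you probably changed the file but never applied it (sysctl -p re-reads /etc/sysctl.conf without a reboot).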

You might be wondering why postgres stopped working. Ask the guy who made some changes to the config file without understanding the effect of his changes. (He owes me a few beers for this one. :))

Friday, October 18, 2013

Moving Exchange Server 2010 Transaction Logs

An Exchange 2010 server that I've recently inherited has been ill lately. Symptoms include low disk space warnings, Back Pressure, and queued messages on the sendmail smarthost. I treated these symptoms with ad hoc backups (to commit the transaction logs to the database) and by adding some space to the VM's VMDKs, but after two rounds of treatment I decided it was time to deal with the root cause.

It didn't take long to find the problem: the VM had two volumes: C: and E:. C: was the OS, and E: was Exchange. All of it. Application and data. And logs. And E: was running out of space because backups weren't being run frequently enough. In other words: root cause was poor planning.

Drill down to the Mailbox Database. 
The operational issues were easy enough to fix, though. So here's what I did, and what you can do if you run into this problem.

The Fix

In the Exchange Management Console (EMC), drill down until you find the Mailbox Database associated with the transaction logs you want to move.

On the right side of the EMC, click "Move database path." A window will appear that lets you specify the location for your database and the transaction logs. In the example to the right, note that I've changed the location of the logs to the F: drive that I added to the VM.

Click Move to begin the operation.

NOTE: Moving either the database or the transaction logs will temporarily dismount the information store that you're working with. In other words, don't do this when users are connected; plan for a brief outage.
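If you prefer the shell to the EMC, the same move should work from the Exchange Management Shell; a sketch with a hypothetical database name and log path:

```powershell
# Moves only the transaction logs for this database to F:\Logs.
# The store is dismounted while the files are copied, same as in the EMC.
Move-DatabasePath -Identity "Mailbox Database" -LogFolderPath "F:\Logs"
```

The same dismount caveat applies here, so run it during a planned outage window.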

Why are we doing this?

Most engineers will tell you that separating your database files from your logfiles is a best practice for performance reasons. This is typically true, but keep in mind that if you're running your database or Exchange server as a VM, you need to consider that your VMDKs may be on the same LUN. The performance benefit can't be realized in this configuration. However, there is a compelling reason to move your logs even in this case: let's say that your backups aren't running as often as they should. Without a successful backup, your Exchange transaction logs will never be committed to the database, and they will continue to grow at a sometimes alarming rate. If these logs grow too large, the Exchange Back Pressure feature will kick in and defer receipt of inbound email.

Of course, the solution to this problem is prevention. Plan your Exchange server database and logfile placement with availability and performance in mind. Monitor your backups, drive space, and mail queues. Easy. Basic.

Monday, October 14, 2013

VMware vCenter Converter Standalone Error - PEM_read_bio:no start line

I started my day off with a planning meeting to review a group presentation for the client's new CIO. You know, discussions about how much detail to go into with regard to the infrastructure, do we talk about SLAs, management type stuff. We were making great progress when suddenly, one of our help desk guys was lurking in the doorway with a worried look.

"A server just crashed."

No worries, I said. We'll sort it out. Turns out that one of the hard drives failed, but thankfully the RAID saved us (hey, that should be the name of a blog!). The server came back online, but with one failed drive. The hardware was an old Dell PE 1950 that was scheduled for p2v later this year. So we agreed to p2v that server tonight instead.

I fired up VMware vCenter Converter Standalone, and made the classic mistake: not running it as administrator. Of course, this mistake isn't apparent until you click Finish after defining the conversion job. No worries, I thought. I'll just run it as admin and knock this out. But... no. Here's the error I got after defining a new conversion job:

[01880 error 'Default'] class Vmacore::Ssl::SSLException(SSL Exception: error:0906D06C:PEM routines:PEM_read_bio:no start line)

I checked that I had the correct permissions on the source server, and I did. Checked DNS settings on the source machine; they were correct. Then remembered that I had installed the converter agent during my first attempt to convert the VM, and did so with different AD credentials. This was the root of the problem.

The Fix

To resolve this problem, manually remove the Converter Agent from the source machine. Then go back to the beginning of the conversion task wizard to install the agent again (use the back button so you don't lose the values you've entered). Then run the conversion job and you'll be back in business.

Tuesday, September 24, 2013

Why Customers Don't Speak at VMUGs

The muggle quadrant.
I saw an interesting question posted on Twitter last week by Scott Lowe. To paraphrase: why don't users volunteer to speak at VMUGs? (He has a great post that you should read to see what he's doing about this.) Scott's question got me thinking about my experience with the Maryland VMUG, and I'd like to share an observation on why I think most customers aren't eager to step up and grab the mic.

VMware is years ahead of most of its customers.

VMware is forging ahead with its technology and continues to lead the compute, storage, and network virtualization space. The company needs to set the pace to keep its competitive advantage. VMware geeks like me eat up new VMware technologies faster than we can refresh our Twitter streams. But what about the actual users of the software? Where are they?

In my time with Chesapeake NetCraftsmen, I had the opportunity to visit many, many sites that were in various stages of deploying VMware products. Federal, healthcare, commercial, financial... all using VMware. But were they using the latest and greatest innovations? Of course not. In some cases, they were running ESX 4.0. Some were using stand-alone ESXi hosts because they didn't need the protections offered by a cluster. Some were on ESXi 4.1 and didn't see a reason to change what was working. But the common theme was that many customers, especially the SMB segment, were still working on their first or second implementation of vSphere.

Now keep in mind that it's typical for a VMUG meeting to include a technical marketing session for a vendor or two. (And this is good! Someone has to pay for that delicious food!) These presentations are GREAT for VMware geeks, because we get some face-time with smart people, and can ask good questions to learn something new. But can you blame a customer, who is trying to figure out how to build a cluster or work with a vDS, for not wanting to follow a cutting-edge presentation? Even if you don't know what you're doing, you know when you're behind.

Again, I understand that VMware can't afford to sit still; look what's happening to BlackBerry this week. VMware needs to innovate and drive their business forward. I think we all understand that. But don't be surprised when many of their customers fall farther and farther behind as the gap between innovation and implementation widens.

A Possible Solution

What if, once a quarter, we could have a VMUG meeting without vendors? Just VMware users in a group. No intimidating presentations on server-side caching, or vCAC integration, or anything that's too far out from where most users are. Instead, just a few hours of customers talking about their progress, their successes, and if they are willing to share, their failures. I suspect we'll foster some great discussion amongst peers, and maybe even solve a problem or two in the process.

Monday, September 16, 2013

sTop THe MadNess - CamelCase has got to go.

Want to know how to evoke the ire of Twitter? It's easy. Just tweet the following:

The stuff vAneurysms are made of.
You'll be able to hear the vExperts collectively lose their minds*. Purists insist on accepting only the corporate-approved capitalization of company and product names. But can you blame filthy casuals, er, regular customers for making a mistake? Even VMware can't keep it straight (at last week's MDVMUG, a few VMware guys shared some slides that contained the most common error: VMWare).

CamelCase is thriving in this 140-character age. Who wants to waste a character on a space anyway? #NotMe. Plus it looks cool, right? But for customers who care less about the vibrant, detail-oriented virtualization community and its affinity for being technically correct, the capitalization of tech company names and their product lines is largely ignored. Their concerns lie in the functionality of the product and its ability to support their businesses. Cut them some slack if they screw it up and tell you they're using Vmware. Share the correct capitalization of the word and move on to solving problems.

On the other hand, I do find it interesting that there is rarely a problem when it comes to capitalizing names of ubiquitous devices such as iPhones and iPods. So consumers can master non-standard capitalization, as long as it's clearly delineated. Everyone gets Apple's iName. So why is there so much trouble with VMware and its products? It's a combination of two things:
  1. The CamelCase isn't consistent across products - vSphere set our expectations, but then we got dvSwitches, and now VSAN. 
  2. VMware can't even get it right - Internally-produced slides and other marketing material are (well in fairness, were; seems like they've improved lately) littered with Vmware and VMWare.
So don't let it get to you if someone has some vMware questions to ask you. Or if they need help with a vSan. Just help them out.

* If you're confused, that proposed tweet should be capitalized as follows: Loving my new VMware vSphere cluster on NetApp storage!!1

Sunday, September 1, 2013

VMTools on Fedora19

So. You decided that you'd rather use VMware's VMTools package on your Fedora19 VMs than the open-vm-tools package that's installed by default? No worries. Here's how to load VMTools on Fedora19.

Overview
  1. Remove open-vm-tools
  2. Install Perl
  3. Mount the VMTools ISO
  4. Install VMTools
Remove open-vm-tools

To get rid of open-vm-tools, open Terminal and enter the following command:

sudo yum remove open-vm-tools

This will locate the open-vm-tools packages and prompt you to confirm their removal.

If the account you're using isn't in sudoers, you'll need to become root before removing the packages:

su -
yum remove open-vm-tools

Now open-vm-tools is gone from your Fedora19 system.

Install Perl

Since the VMTools installer is a Perl script, you'll need to have Perl loaded on your VM to install VMTools. Easy. In your Terminal, enter the following

sudo yum install perl

YUM will find the packages needed to install Perl, and prompt you to confirm the installation.

Again, if you're not in sudoers, you'll need to do this:

su -
yum install perl

Once YUM has finished installing Perl, you're ready to go on.

Mount the VMTools ISO

Just to clarify: when you choose to Install / Upgrade VMTools on a VM from the vSphere Web Client (or even that old and busted vSphere Client), you're actually mounting an ISO to the VM. The ISO contains the packages to install the tools.

Once you've started the Tools install from the client, go to your Fedora19 VM and open the ISO with Files. Select the VMTools package, and Extract it to /tmp. You'll now have a directory named /tmp/vmware-tools-distrib.
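If you'd rather skip the Files GUI, the mount-and-extract step can be done from the Terminal as well; a sketch, assuming the attached ISO shows up as /dev/sr0 on your VM:

```shell
# Mount the attached VMTools ISO and unpack the installer into /tmp.
sudo mount /dev/sr0 /mnt               # device name is an assumption
tar xzf /mnt/VMwareTools-*.tar.gz -C /tmp
ls /tmp/vmware-tools-distrib           # the installer lands here
```

Either way, you end up with the same vmware-tools-distrib directory for the install step below.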

Install VMTools

Finally, go back to your Terminal window, change to the /tmp/vmware-tools-distrib directory, and enter the following command:

sudo ./vmware-install.pl

As usual, unless you've got a good reason to choose otherwise, you can accept the defaults during installation. Once you're done, VMTools will be running and current. Just log out and back in (or restart) to enable the graphics-related features in VMTools. 

Wednesday, August 28, 2013

Community Fatigue

No, not this community.
Seems like #community is all we talk / tweet / blog / podcast about these days. And for a good reason: establishing and maintaining a presence in the technical community of your choice is a great way to develop your personal brand, build relationships, and promote your skills a bit. Software and hardware companies figured this out a while ago, which is why the number of communities has grown so rapidly over the last year or two (or in cases where the community has been established, you'll notice a recent marketing push to attract new members).

Many communities have adopted gamification as a means to attract new eyeballs and keep them on the site longer. I know that lately, I'm addicted to thwack. I'm chasing those points like I used to chase 'chieves in WoW. I scored a coffee cup (almost as nice as my old NPR mug) and a t-shirt in the process. I watched my rank in the community climb, and my level climb, too. I've had a few great discussions along the way, so it's not entirely artificial. But it does make me wonder if I'd participate as frequently without the incentive.

After a few weeks of thwack, I realized I hadn't paid attention to the VMware and Cisco communities as much as I'd like. And then I thought about Spiceworks. And then NetApp and EMC's communities. And then I realized that there truly are too many technical communities for a single person to participate in and actually share & learn, as opposed to just rack up points. At least, you can't participate in them all if you're employed and like your family. Which I am, and I do.

We've hit community fatigue, people.

You should quit worrying, too. See how happy he is?
My advice is to pick one or two communities that you like and stick with them. I do like thwack; it's fun and goofy, kind of like SolarWinds in general. But don't let the laid-back atmosphere fool you: you'll find plenty of battle-tested IT pros there who are happy to share what they've learned. And since I'm still in love with vSphere, I browse the threads at VMTN daily. But I've learned to quit worrying about my points with Cisco and others. I'd rather build a strong reputation with a few communities than weak reputations with many.

Thursday, August 8, 2013

Target CPU Utilization on ESXi Hosts

I was asked an interesting question a few days ago during a discussion about virtualization and cloud computing. The context of the question was observed CPU percent utilization on physical hosts prior to converting them to VMs. Here's the question:

What do you think a good target is for CPU Utilization on an ESXi host?

I admit that it was tempting to throw out a number to answer the question. But I realized that I had just been given a great opportunity to educate someone on some vSphere design considerations. And you'll be happy to learn that at no time did I ever utter those ill-fated words: IT DEPENDS. (Even though in this case, it really does.)

Instead of blurting out a target, I explained that we needed to establish some facts about the vSphere environment. Do they have a cluster? Seems like a stupid question, but you'd be surprised how many sites I visit with plenty of stand-alone vSphere hosts. If you do have a cluster, how many hosts does the cluster have? What's your expected failover capability? (Note the use of "expected;" actual configurations sometimes don't provide the expected failover capability.) What's your workload profile? What's the planned growth factor for your workloads?
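Those questions feed directly into the math. As a rough illustration (this is back-of-the-napkin arithmetic, not vSphere's actual admission control, and the function name and headroom figure are my own inventions), here's how cluster size and expected failover capability drive a sensible per-host ceiling:

```python
# Toy calculation: a per-host CPU utilization ceiling for an N-host cluster
# that must tolerate a given number of host failures. Illustrative only.

def target_cpu_ceiling(hosts: int, host_failures_to_tolerate: int = 1,
                       headroom: float = 0.10) -> float:
    """Return a rough per-host CPU utilization ceiling (0.0 - 1.0).

    The idea: the surviving hosts must absorb the failed hosts' load,
    and you still want some headroom for workload spikes.
    """
    if host_failures_to_tolerate >= hosts:
        raise ValueError("cannot tolerate losing every host")
    survivors = hosts - host_failures_to_tolerate
    # The whole cluster's load must fit on the survivors at (1 - headroom).
    return (survivors / hosts) * (1.0 - headroom)

# A 4-host cluster tolerating one failure, with 10% headroom:
print(round(target_cpu_ceiling(4, 1), 3))   # 0.675 -> roughly 67% per host
# A 2-host cluster tolerating one failure:
print(round(target_cpu_ceiling(2, 1), 3))   # 0.45 -> barely 45% per host
```

Same question, wildly different answers depending on the cluster. Which is exactly why a single number is the wrong response.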

Basically, I found myself having flashbacks from reading the vSphere 5.1 Clustering Deep Dive book.

I love recognizing opportunities like this one. After all, that's what consulting is about. Not just answering questions, but helping people ask the right questions, and walking them through the solution.

Saturday, August 3, 2013

open-vm-tools installation on Fedora 17, 18, and 19

Most people end up at the #eager0 looking for help installing VMTools on various Linux distros and releases. My posts about Fedora 17 and Fedora 18 sit at the top of my most viewed posts list (that is until VMware's Facebook page gave me some love, and my crash course on Cloud Computing shot up to the number two spot).

I'm rebuilding my home lab (which wouldn't be possible without the support from Chesapeake NetCraftsmen!) these days, and as part of that effort I've been deploying some Fedora VMs again. But this time I thought I'd try something different: using open-vm-tools instead of the native VMware supplied VMTools. I've had success, and thought I'd share what I learned with you.

VMTools Status in the vSphere Client.
If you've ever run a virtual appliance, you've used open-vm-tools. You probably just didn't know it. The image to the right is the give-away.

open-vm-tools is easy to get up and running on a Fedora VM. MUCH easier than using the VMware provided VMTools. Of course, you'll trade functionality for simplicity. open-vm-tools provides the core functionality of VMTools: the ability to request a graceful shutdown of your guest OS from your vSphere Web Client (or god forbid, the vSphere Client). VMware's VMTools for Linux has many other features that may be of use to you; I'll catalog those in a future post. For now, let's go over the open-vm-tools bit.

Now for the good news: open-vm-tools ships with Fedora 19, and will start automatically if Fedora detects that it's running as a VM on VMware software. So... you don't need to do anything! Pretty easy, right?

For Fedora 17 and 18, you'll need to grab the open-vm-tools package through yum:

sudo yum install open-vm-tools

Then reboot to complete the install and verify that open-vm-tools starts up properly. You're looking for that tell-tale sign above: VMware Tools: Running (3rd-party/Independent).

It's nice to see the inclusion of open-vm-tools in Fedora 19. It's even nicer that it is smart enough to run when it's needed. Unless you require those other features (many of which are beta) of VMTools, I highly recommend sticking with open-vm-tools for your Fedora boxes.

Saturday, July 27, 2013

Fix Your Org Chart, Fix Your Infrastructure

See this? Don't do this.
Spend any measurable amount of time thinking about how to solve infrastructure problems, and you'll most likely end up thinking about organizational problems. This is doubly true when you're rooted in virtualization, and you don't really consider yourself a server, network, or storage engineer.

You've probably seen many org charts like the one to the right. The boxes for each team seem innocent enough: group staff by discipline and responsibility. Job descriptions can be pretty generic for each group. Status reports flow up the chain nicely. All of that great management structure from the 1970s.

It just doesn't work today.

Virtualization adds a layer of abstraction to org charts, too. You might not be able to distinguish between a server engineer and a storage engineer anymore. Network engineers are getting smarter about virtual servers. All of a sudden, everyone is speaking in the same technical jargon, and they're even understanding one another!
Almost named it Virtualization Team. Close one.

That is, of course, unless you still insist on sticking with the server, network, and storage team approach. And maybe we're overlooking the commonality in the naming convention: Team. In the sports world, when you have more than one team playing in the same league, they're competitors. I argue it's the same for IT organizations.

Organizational architects of the world: when you put engineers in nice little boxes like the three in the diagram above, you're designing for conflict. It's like building a vSphere cluster and disabling vMotion. Don't force your engineers to stay within the bounds of their "team." Let them move around in the organization freely, based on their resource requirements (without violating availability constraints, of course).

Fixing technology is easy; fixing organizations is assuredly not. But technology can't solve organizational problems, and frankly shouldn't.

Tuesday, July 23, 2013

More to Storage than x86 Virtualization

If you're following Nutanix and their evangelists on Twitter lately, you've no doubt seen lots of tweets using the #NoSAN hashtag. If you're not familiar with their products, it's worth checking out. They combine four server blades in a single chassis, and sweeten the deal with internal shared storage (both SSD and HDD). If you're looking for a great platform for your vSphere cluster and don't want to invest in a SAN, Nutanix deserves your consideration.

In my opinion, however, Nutanix has gone a little overboard with their claim that SAN is dead, and that traditional SAN deployments have no future. It's easy to get caught up in the x86 virtualization space and think of storage (SAN or NAS) as just a resource to be consumed by vSphere. But data centers aren't just filled with vSphere hosts, no matter what VMware would have you believe. I see AIX hosts in a surprising number of environments. I see lots of physical Windows hosts (usually with unique hardware requirements, e.g., PCI cards for fax servers). I've even seen clustered OS X Servers on Apple hardware (NSF, I'm looking at you). All of these servers need access to fast and reliable storage, and that usually means SAN.

Look. I'm a VMware geek like the rest of you. I love running Converter on physical servers and sharing the benefits of virtualization with clients and coworkers (and random Internet people like you!). But as I noted previously, VMware is not the world. Unless you've realized the Software-Defined Data Center, your SAN is more than just a place to stick your datastores.

Sunday, July 21, 2013

Overallocating vCPUs

I'm always looking for a way to explain why overallocating vCPUs to a VM is a bad idea. At best, it doesn't help. I've seen some discussion this morning on Twitter about this, so I'm sharing how I explain this to people.

Let's say you're going to dinner alone. When you talk to the hostess, you tell her you'd like a table for four because you're really hungry. You can eat faster at a table for four, right? Like, four times as fast? Of course not.

It's the same with vCPUs. If your workload is based on a single-threaded application, and you give it four vCPUs, that workload is dining alone at a table for four.

The metaphor can be extended to explain why doing this over and over again has ripple effects on the host (or the restaurant in this case). But I'll leave that up to you to think about. After all, you're the hostess.
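If you prefer numbers to restaurant metaphors, here's a deliberately simplified model of the same point (my own toy, not how the vSphere scheduler actually works): a single-threaded workload can only ever occupy one vCPU at a time, so extra vCPUs buy it nothing.

```python
# Toy model of the "table for four" metaphor. Each runnable thread retires
# one unit of work per tick; idle vCPUs retire nothing. Illustrative only --
# real CPU scheduling and co-scheduling overhead are far more complicated.

def completion_time(work_units: int, vcpus: int, threads: int) -> int:
    """Ticks needed to finish `work_units` of work."""
    runnable = min(threads, vcpus)   # threads that can actually run at once
    ticks = 0
    remaining = work_units
    while remaining > 0:
        remaining -= runnable
        ticks += 1
    return ticks

# Single-threaded app, 100 units of work:
print(completion_time(100, vcpus=1, threads=1))  # 100 ticks
print(completion_time(100, vcpus=4, threads=1))  # still 100 ticks: 3 empty seats
# Only a genuinely multi-threaded app benefits from the bigger table:
print(completion_time(100, vcpus=4, threads=4))  # 25 ticks
```

And that's the best case; on a busy host, those three empty seats make the VM harder to schedule at all.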

Monday, July 15, 2013

Monday Morning Funny: VMware Fusion Locked... What?

Now that Fedora 19 is out, I thought I'd load it up on a Fusion VM and test it out a bit. The alpha releases I checked out a few months ago didn't run on VMware products, so I was anxious to see how the final image worked. I'm happy to say that Fedora 19 loaded without any problems, so you'll see a post soon about VMTools on this release (and a discussion on open-vm-tools while I'm at it).

After I loaded the OS and booted the VM, I went to install VMTools. But I hadn't removed the .iso from the install yet. Fusion gave me the following error message:

I had to read this one a few times before I started laughing. I'm using a MacBook Air (which has no built-in optical drive), and the media is an ISO file. Sure, I get the intent of the message. But the thought of a non-existent door being locked made me laugh.

Virtualization changes not only the technology we use, but the language we use to talk about the technology. How often do you say, "eject the disc" when you're referring to an ISO file that you mounted to a VM?

Thursday, July 11, 2013

SolarWinds Certified Professional

Since I've been working with SolarWinds software lately, I thought I'd take a crack at earning the SolarWinds Certified Professional certification. Happy to say that I passed with flying colors last night!

The test itself was a good measure of how much you know about network monitoring in general, with a healthy dose of SolarWinds Orion NPM administration thrown in for good measure. I've been using SolarWinds software since my days as a network administrator at Avectra (good grief, that was like 13 years ago!). I introduced SolarWinds Orion NPM and NTA to the National Science Foundation years ago, and I hope it's still being used to monitor their growing infrastructure. Now I'm using it for a new project I'm working on.

And on that note, I'm off to see what my Top 10s are for today.

Wednesday, July 10, 2013

Power Saving Modes in vSphere and Cisco UCS

If you've ever had a slow Friday and spent time poking around in vCenter Server or UCS Manager, you've probably come across some promising eco-friendly features like Distributed Power Management (DPM) and N+1 PSU redundancy. If you haven't, here's a summary of these technologies.

VMware's DPM - DPM is a feature available to vSphere clusters that determines if the cluster's workload can be satisfied using a subset of cluster members. If so, the VMs are vMotioned to free up one or more hosts, which are then powered down into stand-by mode. Your cluster's HA settings are taken into account, so using DPM won't violate your availability constraints. Should the cluster's workload suddenly increase, vCenter will wake up the stand-by hosts, then redistribute the workload across the additional hosts. Cool stuff indeed. You save on power and cooling costs for each server that DPM puts into stand-by mode.
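The core idea behind DPM is just consolidation: if current demand fits on fewer hosts (with headroom), the rest can sleep. Here's a first-fit sketch of that idea (my own illustration with made-up numbers, NOT VMware's actual DPM algorithm, which weighs far more than raw CPU demand):

```python
# Toy sketch of DPM's premise: pack VM CPU demands onto as few hosts as
# possible; anything left over is a candidate for standby. Illustrative only.

def hosts_needed(vm_demands, host_capacity, headroom=0.2):
    """First-fit-decreasing pack of VM demands; return hosts powered on."""
    usable = host_capacity * (1 - headroom)   # keep some slack per host
    hosts = []                                # running load per powered-on host
    for demand in sorted(vm_demands, reverse=True):
        for i, load in enumerate(hosts):
            if load + demand <= usable:
                hosts[i] += demand
                break
        else:
            hosts.append(demand)              # "power on" another host
    return len(hosts)

# A 4-host cluster, 20 GHz per host, light weekend workload (made-up GHz):
demands = [2.0, 1.5, 3.0, 2.5, 1.0, 2.0]
print(hosts_needed(demands, host_capacity=20.0))  # 1 host suffices...
# ...so a DPM-like policy could park the other three in standby.
```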

Cisco's UCS N+1 PSU Redundancy - N+1 is sometimes a tricky thing to wrap your head around, since its meaning changes depending on context. In the case of UCS, N+1 means the number of PSUs required to provide non-redundant power to your chassis, plus one additional PSU. So with a 5108 chassis, with all four PSU slots populated, N+1 would mean 3 PSUs active and one in "power save" mode. If one of the active PSUs fails, you still have redundancy, and the fourth PSU will be brought online to restore N+1 redundancy.

So that's the good news. Here's the bad news: DPM basically confirms that you overbought on hardware. And N+1 PSU redundancy may not give you the redundancy you're looking for. Here's why.

If you find that DPM is shutting down servers in your cluster more often than not, you purchased more hardware than you needed. This indicates that you didn't properly assess your workloads prior to creating your logical and physical designs. And that indicates that maybe you didn't account for other design factors. And that is not cool. Ever the pessimist, I suspect this is why many vSphere clusters do not have DPM enabled.

On the topic of Cisco UCS, N+1 PSU redundancy, and a false sense of security: chances are that what you really want to use here is Grid Redundancy, not N+1 redundancy. Grid means that you have power from two PDUs running to your 5108, and you want to spread your PSUs across those two PDUs. So you connect PSU1 and 3 to PDUA, and PSU 2 and 4 to PDUB. All four PSUs are online, and should a PDU fail, you still have two PSUs running. With N+1 and PSUs spread across two PDUs, you could encounter a situation where only one PSU is active while the "power save" PSU is turned up. One PSU may not be able to provide sufficient power to your chassis and blades, which can be... you guessed it: not cool.
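To make the Grid vs. N+1 difference concrete, here's a quick sketch of what survives the instant a PDU dies (wattage and the helper function are my own illustration; check Cisco's power calculator for your actual 5108 loadout):

```python
# Illustrative comparison of Grid vs. N+1 PSU redundancy on a 2-PDU feed.

def surviving_capacity(psu_watts, active_psus, pdu_map, failed_pdu):
    """Watts still available immediately after `failed_pdu` dies.
    pdu_map maps PSU slot -> PDU; only currently-active slots count."""
    return sum(psu_watts for slot in active_psus if pdu_map[slot] != failed_pdu)

PSU_W = 2500  # made-up per-PSU rating
pdu_map = {1: "A", 2: "B", 3: "A", 4: "B"}  # PSUs 1/3 on PDU A, 2/4 on PDU B

# Grid: all four PSUs active. Lose PDU A, and PSUs 2 and 4 carry on.
print(surviving_capacity(PSU_W, [1, 2, 3, 4], pdu_map, "A"))  # 5000 W

# N+1: three active (say 1, 2, 3), PSU 4 in power save. Lose PDU A and
# you're down to PSU 2 alone until PSU 4 is brought online.
print(surviving_capacity(PSU_W, [1, 2, 3], pdu_map, "A"))     # 2500 W
```

Whether 2500 W is enough for a fully loaded chassis during that window is exactly the question the post is warning you about.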

Looking back on this post, I'm not sure why I lumped these two together, other than that they both deal with power. DPM and PSU configuration options solve different problems. There's no shame in including these features in your designs. Just make certain that you understand the benefits and pitfalls of each.

ps - It's late, and I'm listening to the Beastie Boys, and I'm low on Yuengling. Were I so inclined, I could add a footnote for nearly every claim above. But the point here is that you need to understand what these options do for you, and that means understanding other design requirements like total power consumption of your b-series blades.

Tuesday, July 2, 2013

New Post at netcraftsmen.net - The End of FibreChannel?

I'm in a writing mood lately, and thought I'd share some observations on virtualization, storage, and certain storage protocols that think they're too good to share cabling with other protocols. Read "The End of FibreChannel?" and let me know what you think!

Monday, July 1, 2013

vBeers - Washington, D.C. on Thursday, July 25, 2013

Turns out that setting up a vBeers really is that easy! Join us at The Dubliner on Thursday, July 25 at 5:00pm. We'll be talking virtualization, at least initially. After a few pints, topics will most likely include unicorns, bacon, office politics, VMworld, and IT war stories.

Click here for the vBeers official post.

See you there!

Saturday, June 29, 2013

Is Cloud Just Another Name for Hosting?

I made this in PowerPoint for Mac. Forgive me.
If you've spent any decent amount of time in IT, you've no doubt seen more than your share of "network diagrams" that always have one component in common: a big empty cloud with the word Internet at the top. Something like the diagram on the right.

For server engineers, the cloud at the top of these diagrams was a place holder. It symbolized the domain of the network engineer, where crazy things like BGP and MPLS do... stuff. Packets come in, packets go out: you can't explain that!



Then, 5 years ago, that cloud at the top started to have a new meaning. The cloud was no longer a placeholder, no longer just boilerplate for enterprise architects to pad their Visio drawings. Now it was an active and critical part of the enterprise.

At first, many veteran IT professionals were quick to cast doubt on "cloud computing." It was just another name for hosting, or application service provider, or managed service provider. "In the cloud" was just another buzzphrase that meant "your server is in someone else's data center." It was still your server, still a physical server (ok, 5 years ago, it may have been a VM, but at least in the government IT space, it was probably still a physical machine), still running a traditional workload. It just happened to be somewhere else.

Cloud computing is just another name for hosting, right?

Well, no. It's easy to understand why so many people think of cloud computing this way. It's all those network diagrams, and a healthy suspicion of new technology that promises to change everything. But to condense the many concepts of cloud computing into "hosting" is to truly miss the point.

Cloud computing is less about where your server is, and more about how your IT infrastructure supports your business. Sure, hosting may be part of cloud computing. But the core tenets of cloud (elastic computing, automation, orchestration, security, and availability) are individually and collectively more significant than the raised floor that supports your hardware.

Remember the IBM marketing term "On Demand?" (full disclosure: I worked at Big Blue a long time ago. (fuller disclosure: I was an eight bar specialist in SecondLife)) Combine "On Demand" with hosting, and you're starting to get to the meaning of cloud computing. It's about the instant, automated provisioning of VMs to respond to changes in workloads. It's about the seamless migration of workloads to meet resource requirements. It's about the application of policy in a consistent, organized manner. And most importantly, it's about the efficient use of IT resources to satisfy your business requirements.

In contrast, I'll give you one example of what cloud computing is not. Cloud computing is not a single software package that you download and install to create your cloud. Regardless of your hypervisor of choice, cloud computing is more than clicking Next Next Finish. It's aligning the operation of your infrastructure with your business. And it's about letting your IT people solve good problems, instead of running around maintaining physical servers. (Have you heard of #DevOps? Search twitter for it. That's where we'll see big changes in how infrastructures are built and managed in the next 3 years.)

So there you have it. A crash course in cloud computing. Feedback and questions are welcomed and encouraged!

Tuesday, June 18, 2013

vExpert 2013 Rejection: Just What I Needed

No, really.

When the vExpert 2013 recipients were announced a few weeks ago, I quickly went to the winners list and did a Control-F to search for my name. Chrome found a match for Stump, and I started to get excited. Then the disappointment set in, as I realized that it was another Stump. And like that, I joined the minority of VMware geeks who applied but were rejected from the program.

Self-doubt set in. I started to worry that I didn't actually know vSphere that well. Maybe I was spending too much time in the lab, and couldn't see the forest for the trees. Maybe I was so caught up in the details of VMTools on Linux, or banging away at vami, or digging through fdm.log, that I had lost that larger perspective on how people were using vSphere in the wild.

Eager to find a positive outcome for the situation, I set up a quick call with John Troyer to talk about my application, and how to prepare for the next go-round. I didn't anticipate how important that call would turn out to be.

We talked about participating in the community, and how to raise my profile a bit.

  • Attend VMUG and user conferences.
  • Stay active on social media (for me that means Twitter. Like any self-respecting hipster, I quit Facebook years ago.)
  • Work up the nerve to present at VMUG.
John knows the VMware community so well that I started to get excited just listening to his advice. All of a sudden, I had a clear idea of how to get out there.

As we ended the call, John suggested I contact Amy Lewis at Cisco. Of course, I had known of Amy (or rather @CommsNinja), but aside from a very brief chat at VMwarePEX this year, I hadn't connected with her. But if John thought it was a good idea to reach out to her, it was worth pursuing.

Amy was kind enough to make time to talk with me about all things community, both from a Cisco Data Center and a VMware perspective. She's connected to so many of the stand-outs in these communities that it was a little surreal. We talked about the convergence of skillsets in the modern data center, and how the old dichotomy of server guys vs. network guys doesn't fly anymore. The importance of building a personal "brand" as opposed to having an online identity that's permanently tethered to your current employer (which by the way, sounds like a nearly universal problem!). We even talked about how to introduce more virtualization topics to the CMUG (the Cisco Mid-Atlantic User Group) that Chesapeake NetCraftsmen runs.

I left that call with a renewed sense of community, and an even clearer idea of how to quit lurking in the virtualization circles.

In retrospect, I'm now convinced that rejection from the vExpert 2013 program was a good thing. It forced me to analyze my current participation in the community, and reach out to some wicked smart people. It reinforced the important role that social media plays in staying educated and informed in data center and virtualization concepts, and yes, even marketing. And it encouraged me to focus my efforts on giving back to the community in more meaningful and visible ways.

Now I just need to brush up on my #v0dgeball skills. :)

Saturday, June 8, 2013

No CDP on UCS ESXi vmnics?

I spend most of my professional life digging through UCS and vSphere configurations. Many times I find that UCS has been configured to work, but not always work well. Here's a quick example:

In the process of reviewing the host networking config for some new ESXi hosts, I wanted to see what upstream switches they were connected to. Normally, I'd just click on CDP to see this information. But when I looked, here's what I saw:

Fresh out of CDP.
It's an easy thing to fix. Here's how:

  1. Log into UCS Manager
  2. Navigate to the LAN tab, and locate the Network Control Policies (path is LAN | Policies | root)
  3. Expand this item to see all of the policies that have been defined.
  4. To see which service profiles are using each policy, select the policy, then click Show Policy Usage. This is a great way to confirm how your changes will affect your hosts. It also works for all other policies, FYI.
  5. Find the Network Control Policy that your ESXi hosts are using, and take a look at the settings. Is CDP enabled?

  6. Click the Enabled radio button, then Save Changes. That's it!
Your hosts will not need a reboot, and the vNICs won't experience any disruption in their operation as a result of this change. The only difference is that now, you'll be able to collect CDP data in your vSphere Web Client, like this:

CDP in the house.

Cisco UCS can be a challenging platform to work with, especially if you're unfamiliar with network, server, and virtualization concepts (I should lump storage in there, too!). If you've got questions, or are looking at transitioning to Cisco UCS, consider contacting Chesapeake NetCraftsmen (that's where I work!). I've done enough UCS installs and upgrades to know how to avoid common pitfalls, especially when running vSphere on UCS.

Saturday, June 1, 2013

Adventures in CloudCred

Lately, I've been spending some free time with www.cloudcredibility.com. I'd started it a few months back, but when work started to get busy I let it slide for a while. Now I've decided to catch up and see what's new.

Of course, the big change since the beginning of #CloudCred is VMware's major marketing push for vCloud Hybrid Service. So it should come as no surprise that CloudCred now features many tasks related to learning about vCloud Hybrid Service and its benefits. In fact, there's a contest that lasts about another week to win a FitBit One. All you need to do is complete a bunch (maybe two dozen, I haven't counted (yes I have, it's twenty six)) of tasks to be entered into the drawing. And in the process you'll earn a ton of CloudCred points.

In a short time a few days back, I managed to win the following:

  • A CloudCred Pen
  • A CloudCred Hat
  • A CloudCred Bumper Sticker
  • A CloudCred Twitter mention
It's that last one that makes me laugh, because the Twitter mention isn't automatic (ironic, given that we're talking about cloud here). I received an email that says I'll be contacted by someone in the coming days to discuss the "prize details and approval process." It's funny that a simple tweet can be bound by such bureaucracy. Perhaps I should prepare a 27B stroke 6 in advance.

I'm more of a Tuttle kind of guy, which means I have this to look forward to one day:

Tuesday, May 28, 2013

Congrats to the 2013 vExperts!

I didn't make the cut for my first application to the VMware vExpert program, but congratulations to the nearly 600 people who did! Here's the official announcement (with a full list of vExperts).

Looking forward to the "sorry you didn't make it" message to learn what I could do better for next time around. Can't say that I'm happy about not making the list, but I'm pleased with the following that I've built up over the past 5 months since starting this blog. By the looks of it, I've helped quite a few people with some VMTools installs on Fedora. That's what I'm most proud of.

I'll save the grumbling and regret for my six pack of Natty Boh. In the meantime, if you made the list, it's time to celebrate! Congrats again!

Friday, May 24, 2013

Self-referential Post.

Quick post while I write a longer post for netcraftsmen.net.

Last night I ran into a quirky error with HA. A few hosts weren't able to connect to the HA Master. A quick review of the /var/log/fdm.log turned up lots of the following:

[29904B90 verbose 'Cluster' opID=SWI-d0de06e1] [ClusterManagerImpl::IsBadIP] <ip of the ha master> is bad ip
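When fdm.log is noisy, it helps to tally how often each IP gets flagged. Here's a quick sketch (the sample log lines and the 10.0.0.21 address are made up for illustration; on a host you'd read /var/log/fdm.log instead of a hard-coded string):

```python
# Illustrative: count "is bad ip" entries per IP in an fdm.log extract.
import re
from collections import Counter

sample_log = """\
[29904B90 verbose 'Cluster' opID=SWI-d0de06e1] [ClusterManagerImpl::IsBadIP] 10.0.0.21 is bad ip
[29904B90 verbose 'Cluster' opID=SWI-d0de06e1] [ClusterManagerImpl::IsBadIP] 10.0.0.21 is bad ip
[29904B90 verbose 'Election' opID=SWI-1a2b3c4d] some unrelated entry
"""

pattern = re.compile(r"\[ClusterManagerImpl::IsBadIP\] (\S+) is bad ip")
counts = Counter(m.group(1) for line in sample_log.splitlines()
                 if (m := pattern.search(line)))
print(dict(counts))   # {'10.0.0.21': 2}
```

If one address dominates the tally (and it's your HA master), you know where to focus.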

But the hosts were able to vmkping the other cluster members' management IP address, including the HA Master. (I admit that I checked DNS first; old habits die hard). It was late, I was two pots of coffee into a change window, and I was looking for help. So I posted a question to VMTN: vSphere HA Agent Unreachable.

Well wouldn't you know it, Duncan Epping replied pretty quickly with some suggestions. He asked if I had changed SSL certificates recently, which I hadn't, and then included a link to some other steps to take. I ended up resolving the issue just by disabling HA on the cluster, then re-enabling it. Go figure.

Fast forward to this morning, and what do I see from DuncanYB? A link to a new post he wrote about vSphere HA Agents in an unreachable state.

Most interesting is the log snippet he included:

[29904B90 verbose 'Cluster' opID=SWI-d0de06e1] [ClusterManagerImpl::IsBadIP] <ip of the ha master> is bad ip

Look familiar?

It's great to know that the VMware community follows the sun, too.

Wednesday, May 22, 2013

For Veteran LastPass Users - Update Your Password Iterations Value!

I'm going to assume you're using LastPass to manage all of your passwords. For us VMware nerds, it's perfect for keeping track of the various vSphere Web Client logins we accumulate over time. You are using separate credentials for each site, right?

I've been a user and fan of LastPass for years. So long in fact, that I've apparently been overlooking a few settings that LastPass has introduced lately. Specifically, I've neglected to update my Password Iterations value.

Here's what it looks like for us old timers:


You'll notice right away the red text suggesting that you raise the number of iterations used to create your master encryption key. If you're into light cryptography, click the More link to learn how LastPass uses SHA-256 and PBKDF2. Otherwise, simply update the field above to 5000 and click the Increase Iterations button. You'll need to re-enter your password to start the key generation process.
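Why does the iteration count matter? PBKDF2 runs the hash over and over, so every extra iteration makes each guess at your master password proportionally more expensive for an attacker. Here's a sketch using Python's standard library (the password, salt, and timing loop are my own illustration; LastPass's exact parameters are described in their own documentation):

```python
# Demonstrate PBKDF2-HMAC-SHA256 key derivation at increasing iteration
# counts: same inputs, different iteration count -> different key, more work.
import hashlib
import time

password = b"correct horse battery staple"   # illustrative only!
salt = b"user@example.com"                   # illustrative salt value

def derive(iterations: int) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations)

for iters in (1, 500, 5000):
    start = time.perf_counter()
    key = derive(iters)
    elapsed = time.perf_counter() - start
    print(f"{iters:>5} iterations -> {len(key) * 8}-bit key in {elapsed:.4f}s")

# The derived key changes completely with the iteration count:
assert derive(500) != derive(5000)
```

Ten times the iterations means roughly ten times the work per guess, for you once at login, and for an attacker on every single attempt. That asymmetry is the whole point.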

It will only take a few seconds for this process to complete. When it's done, you will need to login to LastPass again. This is because your encryption key has been re-created, so your current session is no longer valid.

If you're using Firefox, here's what you'll see:



Just log in again (your password hasn't changed as a result of this process), and you're done!

It's important to keep in mind that, when you're an early adopter, it's easy to miss out on new features or capabilities that are introduced but are not retroactively applied to your account.

Tuesday, May 14, 2013

Missing another MDVMUG Meeting!

Missing a single MDVMUG meeting? Ungood.

Missing back-to-back MDVMUG meetings? Doubleplus ungood.

Newspeak aside, I'll be missing another great evening in Catonsville. Though my excuse this time around is better than the last: I'll be doing some UCS work during an evening maintenance window. I'm just glad it's not the old NSF maintenance window from 00:00 to 08:00 Sunday.

If you're in the area, and have some free time in the evening, I strongly encourage you to attend. I've cajoled a few coworkers to swing by and listen to the presentation from Nutanix. If you're not familiar with Nutanix, spend some time on their site and learn about their hardware solution for virtualization. Along with PernixData, I think Nutanix is in the front of the pack when it comes to storage virtualization.

You'll also get an overview of the VMware Horizon suite, which can be refreshing for those of us who spend the bulk of our time in the data center space. The more I learn about Horizon, the more I like it.

Still need more convincing? Nutanix is giving away an iPad. So there's that.

Of course, I'll be attending the Potomac Regional VMware User Conference on June 13 in Washington, D.C. That's an amazing event put on by a ton of great sponsors, with heavy participation from VMware luminaries (last year, Cody Bunch and Alan Renouf were present to talk about automation and PowerCLI). Register here, and if you're going, I'll see you there.

Wednesday, May 8, 2013

Stuck Glove Box Door on a 2005 Volvo S40

I love my Volvo S40. I've posted about my adventures with repairing the windshield washer fluid system, and I'm surprised how many people end up here to read that article. So here I am at 3am, writing more Volvo S40 linkbait for my fellow shabby arrogant bastards.

Many people, myself included, have run into a problem with the glove box. The problem is that it won't open. You can lock and unlock it all you want, but the handle doesn't engage the locking bar under the dash. I've read a few posts that say you should drill a hole in the front of the door which will destroy the lock and open the box. But you'll still need to fix the lock at some point, and you'll have a hole in the door.

I managed to find a way to open the glove compartment when this happens, and it doesn't cause permanent, or even temporary damage to the car. Here's how:


  1. Pop off the access panel on the passenger side of the dash (do this while the passenger door is open).
  2. Remove the door vent deflector (a hard, black piece of plastic behind the access panel).
  3. Use a flashlight to locate the locking pin. It's about 4" into the access space. You should see a small black pin, round with a flat side. See image to the right. 
  4. Use a long screwdriver to push that pin in until the glove box door opens.
Next, I'll figure out how to repair the locking mechanism. Until then, this will at least let you get into the glove box. My advice is to keep your registration and insurance information elsewhere until that pesky lock is fixed.

Monday, May 6, 2013

VMware is Not the World

I heard it mentioned at the VMware Partner Exchange in Las Vegas this year. Two women speaking to each other, wondering aloud what the convention was about. One of the women remarked to the other, "I think it's just some software."

I suppressed a chuckle. I bristled at the suggestion that VMware, and its vSphere software that I've devoted so many years of my life to learn and master, was "just some software." This stuff is IMPORTANT. It revolutionizes the modern data center, squeezes every last penny out of otherwise underutilized hardware, and come on: it's cool!

But it's been creeping up on me since then: she was absolutely right. In the grand scheme of things, VMware, even virtualization as a whole, is just a tool to be considered for use in an enterprise IT operation. Don't use VMware? Guess what: it's still possible to run a successful enterprise IT shop, and therefore a successful business.

Make no mistake: I love VMware. I love virtualization. I love all things Data Center. I'm good at it. But I'm finally getting some much-needed perspective, and it's just in time.

Part of this realization is driven by a new project I'll be working on, where VMware is just one piece of a puzzle. There's storage, there's servers, there's network, there's email, there's applications, et cetera. So I'm pulling back a bit to recognize the importance of an enterprise where all facets, not just virtualization, are managed properly. It's refreshing.

tl;dr - VMware is important. VMware is Not the World.

Sunday, April 28, 2013

VMware Public Sector Technology Exchange East 2013 - Summary

One of the greatest benefits of working for Chesapeake NetCraftsmen is the constant encouragement to attend industry events and participate in the technical community. In my case, that means going to VMware events when they're in town. Last Thursday, the VMware Public Sector Technology Exchange East 2013 set up shop in the Grand Hyatt in Washington, D.C., and I had a chance to attend and listen to some great sessions on cloud vs. virtualization, security and compliance, and end-user computing.

Honestly, I really just wanted to hear Scott Lowe's keynote on VMware's network virtualization strategy. He didn't disappoint. He laid out the progress made through compute virtualization over the last ten years, and explained that network virtualization is the next logical step. The logical construct of the "virtual network" is the foundation for VMware NSX, a product which seeks to provide a virtual network infrastructure on top of an existing layer 3 network. Read the product announcement here.

Other interesting points from Scott's presentation:

  • ~40% of VMware administrators manage the virtual switches for their environment. I really thought this would be much higher.
  • Half of all server access ports are virtual. This should be a wake-up call to people stuck on the traditional physical switch approach.
  • VMware NSX requires vSS or vDS; Cisco's 1000v cannot be used with NSX. Given the 1000v's low adoption and the vDS's ever-growing feature set, this shouldn't affect too many people.
I also got a chance to talk with a few vendors about their products, ran into some colleagues from Cisco and F5 (including the guy who inspired me to pursue virtualization, Lyle Marsh, who is wicked smart and funny as hell), and managed to keep the swag to a minimum.

I highly recommend attending these local events. If you can sneak out of the office for a day, it's a great way to learn and network with your peers.

Friday, April 26, 2013

New Post at Chesapeake NetCraftsmen - Updating VCSA to 5.1 U1

I posted an article on how to update your VCSA to 5.1 U1 over at Chesapeake NetCraftsmen. Go check it out, and take a few minutes to browse through the other staff blogs there. I won't be offended. Promise.

Sunday, April 21, 2013

PernixData - A Solution to a Problem You Wish You Had

I've been reading about PernixData over the last few days. It's an interesting company with an interesting product: virtualization and commoditization of server-side flash storage. Not familiar with server-side flash? Here's a quick primer:

Take your classic SAN-attached server topology. You've got a server connected to a Fibre Channel network, an FC switch or two, and a pair of storage processors at the other end. On your server, you've got an application that relies on the SAN to provide its data. But while CPU speeds and core density increase, and memory capacity improves, storage latency remains largely unchanged.

Enter server-side flash. Server-side flash acts as a cache for data on your SAN. Each time your application needs to read data from the SAN, that data is cached on the flash card. The next time that data is requested, it's read from cache, which greatly reduces the time it takes to access the data. It's cool stuff. Fusion-io is a big player here, but there are others as well.

PernixData's FVP (Flash Virtualization Platform) software promises to take all of your server-side flash and make it available to your hypervisor. There's not much technical information available on their web site yet, so the details of how this will be implemented are unknown. Duncan Epping and Frank Denneman posted great write-ups of the company and its technology on their blogs; they're privy to more technical information than most. I would guess it will work like a VSA: create datastores on local flash, then replicate those datastores out across your cluster.

In my opinion, PernixData has a solution to a problem you wish you had: what to do with all that server-side flash. Flash is still somewhat expensive, and it's not common to see server-side flash in small to mid-sized environments. But take a look at some of the people behind PernixData. They're not just techies; they're some of the guys who made VMware the virtualization powerhouse it is today. They're probably not far off in their thinking, just a year or two ahead.

Incidentally, I traded a few messages with Satyam Vaghani over the weekend via LinkedIn. I'm happy to share that he's not only a wicked smart guy who can say things like "wrote VMFS" on his resume, but he's also a nice guy who's happy to meet people in the industry.

Keep an eye on PernixData. They're onto something here.

Saturday, April 13, 2013

Running ESXi in a VMware Workstation VM in a Fusion VM

Let me start by saying that this is a bad idea: running an ESXi host in VMware Workstation on a VM running within Fusion. You're getting into I N C E P T I O N territory here.

The intrepid among you may forge ahead regardless of the warning. If so, here's a quick tip.

When you first created your VM in Fusion, you probably skipped over an Advanced Option on the Processors & Memory page. If so, and you try to boot ESXi in a Workstation VM within your Fusion VM, you'll see an error instead of a boot screen.

In Fusion, go to your VM's settings page, and click Processors & Memory. Under Advanced options, look for the checkbox that enables hypervisor applications in the virtual machine:

Check this box, boot up your VM, fire up Workstation, and ESXi will start up without complaining.
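For the curious, that checkbox appears to just flip a flag in the VM's .vmx file. If you'd rather edit the file by hand (with the VM powered off), this is the setting VMware uses for nested virtualization in Workstation 9+ / Fusion 5+:

```
# In the Fusion VM's .vmx file, with the VM shut down:
vhv.enable = "TRUE"
```

Same effect as the checkbox: the guest gets virtualized VT-x/EPT, which is what ESXi needs to boot.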

Why would you want to do this? Have you heard of CloudCred? :)

mike

Wednesday, April 3, 2013

vCenter Operations Manager - Troubleshooting Network Connectivity

Isn't vCenter Operations Manager great? Finally, you can do something with all the data that vCenter has been quietly and dutifully collecting over the years (you knew it was doing that, right?).

You can't get there from here. - vCOPS
vCOPS is a vApp that consists of two VMs: one to run analytics on vCenter's collected data, and one to handle the presentation components of the application. To work properly, these two VMs need to communicate with one another, and with vCenter. But what do you do if the IP address for one of these hosts changes?

In this example, I'm trying to access the GUI on the UI VM at 192.168.1.19. But I'm getting an error message that indicates the UI VM cannot communicate with the Analytics VM.

First, confirm that the Analytics VM has, in fact, decided to use a different IP address (this happens often on lab networks, where DHCP is used for IP allocation. Please don't do this in production. :) ). Use your vSphere Web Client to see the address it's using. Notice in the image below that the VM is using 192.168.1.68, while the UI VM is expecting it to be at 192.168.1.4.
Right host, wrong address.


To correct this, you'll need to connect to the VM either via ssh or the remote console. Remember that to connect to the shell, you'll use the root account (admin is used only for the GUI).

Changing the IP address of the Analytics VM requires the use of a utility named vami_set_network. This utility is located in /opt/vmware/share/vami. It's easy to correct this problem. You'll pass a series of values to this utility, and it changes the properties of your ethernet adapter for you. A few assumptions: you are using eth0 for your VM (this is the default interface), and you are using IPv4.

Here's the syntax (for a static IPv4 address, as in our case):

vami_set_network <interface> STATICV4 <ip> <netmask> <gateway>
In the example, we need to change the Analytics VM's IP back to 192.168.1.4. Here's how you'd do that:

cd /opt/vmware/share/vami
./vami_set_network eth0 STATICV4 192.168.1.4 255.255.255.0 192.168.1.1

And here's how it looks in action:


Give vCenter a minute to detect the IP change and refresh the Summary page for this VM. Then you'll see that the IP has been properly changed. Now when you browse to the vCOPS UI, you'll get the login page that you were looking for.
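If you find yourself doing this often in a lab, a small wrapper can guard against fat-fingering an address (which would leave the appliance unreachable until you fix it from the console). This is just a sketch of mine, not anything shipped with the appliance; the vami_set_network path and arguments come from the example above, and the function names are made up:

```shell
#!/bin/bash

# Quick IPv4 sanity check: four dotted octets, each 0-255.
valid_ipv4() {
  local ip=$1 octet
  [[ $ip =~ ^([0-9]{1,3}\.){3}[0-9]{1,3}$ ]] || return 1
  local IFS=.
  for octet in $ip; do
    # 10# forces base-10 so leading zeros don't trip octal parsing.
    (( 10#$octet <= 255 )) || return 1
  done
}

# Validate all three addresses before touching the network config.
# Usage: set_analytics_ip <ip> <netmask> <gateway>
set_analytics_ip() {
  local addr
  for addr in "$1" "$2" "$3"; do
    valid_ipv4 "$addr" || { echo "invalid address: $addr" >&2; return 1; }
  done
  /opt/vmware/share/vami/vami_set_network eth0 STATICV4 "$1" "$2" "$3"
}
```

In our example you'd call set_analytics_ip 192.168.1.4 255.255.255.0 192.168.1.1 and it would refuse to run if any of the three addresses were malformed.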



Note: It's possible that a restart of the vCenter Operations Manager application is required to restore connectivity. If you get 503 errors when trying to hit the login page, here's what to do:
  1. Connect to the IP of your UI VM, and add /admin to the URL (e.g., https://192.168.1.19/admin).
  2. Log in as admin.
  3. Click the Status Tab.
  4. In the Application Controls tab, click the Restart button. This will take about 5 minutes to stop and start the application.
  5. Log out of the admin console, and refresh the vCOPS login page.
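If you'd rather not mash refresh for five minutes, something like this will poll the login page for you from any machine with curl (the function name and defaults are mine; the URL in the example is the UI VM from above):

```shell
#!/bin/bash

# Poll a URL until it returns HTTP 200, or give up.
# Args: URL, max attempts (default 30), delay between attempts in seconds (default 10).
wait_for_ui() {
  local url=$1 tries=${2:-30} delay=${3:-10} code i
  for ((i = 1; i <= tries; i++)); do
    # -k: the appliance uses a self-signed cert; we only care about the status code.
    code=$(curl -k -s -o /dev/null -w '%{http_code}' "$url")
    if [ "$code" = "200" ]; then
      echo "UI is up (HTTP $code after $i attempt(s))"
      return 0
    fi
    echo "attempt $i: HTTP $code, retrying in ${delay}s..."
    sleep "$delay"
  done
  return 1
}

# Example: wait_for_ui https://192.168.1.19/ 30 10
```

While the application is restarting you'll see 503s scroll by; once it flips to 200, you're back in business.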
Getting familiar with the utilities in /opt/vmware/share/vami is a GOOD THING. Spend some time learning what's there. I have a feeling that all virtual appliances from VMware will share this same framework, which means these skills will come in handy in the future.