It turns out that my issue came from the difference in default behavior between a vSS (Standard Switch) and a vDS (Distributed Switch). For those who want the TL;DR -> set all three security settings to “Accept”.
There are three security settings for switches in VMware.
- Promiscuous Mode – Allows the virtual network adapter to observe any of the traffic passing through the vSwitch.
- MAC Address Changes – Allows the effective MAC address to differ from the initial MAC address for incoming traffic.
- Forged Transmits – Allows the effective MAC address to differ from the initial MAC address for outgoing traffic.
The initial MAC is the one assigned by VMware, and the effective MAC is the one used by the guest OS to transmit data. By default these are the same, but as a system administrator you can change the MAC in the guest OS.
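The interplay between the two MAC addresses and these policies can be sketched as a toy model (this is my own simplification for illustration, not VMware’s implementation — promiscuous mode is a separate, orthogonal setting and is left out here):

```python
# Toy model (my own simplification, not VMware code) of how the
# initial/effective MAC distinction drives two of the policies.

def allows_outbound(initial_mac, source_mac, forged_transmits):
    """Forged Transmits: an outgoing frame whose source MAC differs
    from the port's initial MAC is dropped unless the policy is Accept."""
    return forged_transmits or source_mac == initial_mac

def allows_inbound(initial_mac, effective_mac, mac_changes):
    """MAC Address Changes: a port whose effective MAC differs from
    its initial MAC stops receiving traffic unless the policy is Accept."""
    return mac_changes or effective_mac == initial_mac

# A guest admin changed the MAC in the guest OS, so effective != initial
# (MAC values are made up):
initial, effective = "00:50:56:aa:bb:01", "00:50:56:aa:bb:99"

print(allows_outbound(initial, effective, forged_transmits=False))  # False
print(allows_outbound(initial, effective, forged_transmits=True))   # True
```

With the policy set to Reject the mismatched frame is silently dropped, which is exactly the kind of “everything looks configured correctly but nothing gets through” failure described below.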
My issue came from the default settings for these three options on the vSS versus the vDS.
On the vSS, the defaults are:
- Promiscuous Mode – Reject
- MAC Address Changes – Accept
- Forged Transmits – Accept

On the vDS, the defaults are:
- Promiscuous Mode – Reject
- MAC Address Changes – Reject
- Forged Transmits – Reject
Every blog post I have seen about doing this mentions setting “Promiscuous Mode” to Accept, but none of them tell you that “Forged Transmits” needs to be set to Accept as well. To be safe I actually set all three to Accept.
I think this is the result of most people doing this in their home lab using only a vSS; since I was using a vDS, I didn’t notice at the time that the default options are different.
This makes sense: as soon as you add another vmk to the virtual switch, you are transmitting from a different virtual MAC address, which “Forged Transmits” set to Reject will block.
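That reasoning can be sketched the same way. The one ESXi detail I am relying on is that the default vmk0 reuses the MAC of the host’s (here, virtual) NIC, while any additional vmk gets its own MAC; all MAC values below are made up:

```python
# Toy model of why a second vmk breaks when the OUTER switch has
# Forged Transmits = Reject: the outer vSwitch port knows only the
# vESXi VM's vNIC MAC (its "initial" MAC), but the inner vmk1
# transmits with its own, different MAC.

def outer_switch_passes(vnic_initial_mac, frame_src_mac, forged_transmits):
    # Same check as a forged-transmit filter: drop frames whose
    # source MAC does not match the port's initial MAC.
    return forged_transmits or frame_src_mac == vnic_initial_mac

vnic_mac = "00:50:56:01:01:01"   # outer vNIC of the nested ESXi VM
vmk0_mac = "00:50:56:01:01:01"   # default vmk0 inherits the NIC's MAC
vmk1_mac = "00:50:56:77:77:77"   # a new vMotion vmk gets its own MAC

# vmk0 works even under Reject, which is why the default vmk is fine:
assert outer_switch_passes(vnic_mac, vmk0_mac, forged_transmits=False)
# vmk1 is silently dropped until Forged Transmits is set to Accept:
assert not outer_switch_passes(vnic_mac, vmk1_mac, forged_transmits=False)
assert outer_switch_passes(vnic_mac, vmk1_mac, forged_transmits=True)
```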
For the past few months we have been running multiple labs, two physical blades each, to do development against the vCloud 5.1 beta. With the capacity we have available in our overall lab, a nested approach like the VMworld labs would be a nice fit for our development team; ideally we could spin up full vCloud instances in minutes. My experience thus far with nested environments was running vSphere in VMware Workstation to study for my VCP exam.
The first thing I did was try to build this out in a brand new vCloud 5.1 environment. For the actual setup I followed this article by William Lam. This also provided a nice graphic of what my network setup would look like.
Inside the nested vCenter I migrated the management vmk NIC to the vDS. Next I wanted to add another vmk for vMotion on the isolated vMOT network I had created. I initially assigned static IPs to the NICs on each vESXi host, and my first vMotion attempt failed to complete due to a connectivity problem. I eventually set up DHCP on the isolated vMOT network, and neither host would receive a DHCP address (they end up with the typical 169.254.x.x address).
To test that my isolated network was set up properly, I added NICs to my Windows JumpBox and built a new Ubuntu box on the vMOT network. Both received DHCP addresses and could ping each other successfully.
In an effort to take as many variables out of the picture as I could, I tried to strip this down to the most basic setup. I started this morning with a clean install of ESXi 5.1 on a physical blade, built two new virtual ESXi hosts, and also threw in two Ubuntu boxes for testing network connectivity. The vESXi hosts have vhv.enable = “true” set in the vmx file, and promiscuous mode is turned on at the physical host’s vSS. As far as networking goes, I tried to keep this really basic and used a vSS. On the physical switch that this blade’s two NICs are plugged into, VLANs 3, 4, and 500–599 are trunked.
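For reference, the relevant lines in a nested host’s .vmx file look roughly like this (the guestOS value is the one commonly used for nested ESXi 5.x guests; treat it as an assumption rather than a copy of my file):

```
guestOS = "vmkernel5"    # guest OS type commonly used for nested ESXi 5.x
vhv.enable = "true"      # expose hardware virtualization to the guest
```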
The virtual ESXi hosts have three NICs, connected to the vMGMT, vMOT, and vCUST networks. Again, the vmk that is there by default works perfectly, but as soon as I add another vmk for vMotion there is no network connectivity. Below is the vSS from the nested ESXi host.
I have also tested this by putting another vmk on vSwitch0, and it results in the same outcome: no connectivity. Again, in this case I am able to put Ubuntu boxes on these networks and verify network connectivity. It is only with virtual ESXi that I cannot get another vmk or port group to have connectivity.