It should be clear that this is quite a change from the way we've traditionally operated networks. You can't just point a network protocol analyzer, such as Wireshark, at a physical link and figure out the problem from there. That's not because the protocol analyzer can no longer see the traffic in the overlay: it's quite straightforward to parse a VXLAN header, see the protocol headers of the VM-to-VM traffic inside it, and thus observe the inter-VM traffic on a given link. It's just that the analyzer can't be the only tool in the toolbox. It is one tool, complemented by the large amount of information that can be gathered from the network virtualization layer.
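To make "quite straightforward" concrete, here is a minimal sketch of pulling the virtual network identifier (VNI) out of a VXLAN header, following the 8-byte layout in RFC 7348. The function name `parse_vxlan_header` is ours, chosen for illustration, and the sketch assumes you have already extracted the UDP payload from the outer packet.

```python
import struct

VXLAN_HEADER_LEN = 8
VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag defined in RFC 7348


def parse_vxlan_header(udp_payload: bytes):
    """Return (vni, inner_frame) from the payload of a VXLAN UDP datagram.

    Illustrative helper, not part of any library. Layout per RFC 7348:
    1 byte flags, 3 bytes reserved, 3 bytes VNI, 1 byte reserved.
    """
    if len(udp_payload) < VXLAN_HEADER_LEN:
        raise ValueError("payload too short for a VXLAN header")

    # "!" = network byte order; B = flags, 3x = skip reserved bytes,
    # I = 32-bit word holding the 24-bit VNI followed by a reserved byte.
    flags, vni_word = struct.unpack("!B3xI", udp_payload[:VXLAN_HEADER_LEN])
    if not flags & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")

    vni = vni_word >> 8  # drop the trailing reserved byte
    inner_frame = udp_payload[VXLAN_HEADER_LEN:]  # the VM's Ethernet frame
    return vni, inner_frame
```

Everything after the header is the original Ethernet frame sent by the VM, so the usual protocol dissection can continue from there.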
Note that visibility isn't only about troubleshooting; it also covers the problem of mapping VM-to-VM traffic onto physical network resources. When you build overlay tunnels between hypervisors, traffic follows whatever path the physical (or underlay) network thinks is best. That path is determined by the outer tunnel header and the routing algorithms of the physical network.
The good news is that, by moving a lot of network functions into the vSwitches, overlays make it appealing to use layer 3 routing in the physical network. Many modern data centers are now adopting layer 3 "leaf-spine" (Clos) designs with large amounts of equal-cost multipath (ECMP) across the core of the data center. These designs efficiently spread traffic across many parallel paths, which often does away with the need to carefully traffic-engineer specific flows onto specific paths.
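As a rough illustration of how tunneled traffic still spreads across ECMP paths, consider VXLAN: RFC 7348 recommends that the encapsulating endpoint derive the outer UDP source port from a hash of the inner packet's headers, so that different VM-to-VM flows between the same pair of hypervisors look different to the underlay. The sketch below assumes CRC32 as a stand-in hash (real vSwitches and switches use their own, often proprietary, hash functions), and the function names are hypothetical.

```python
import zlib

VXLAN_UDP_PORT = 4789  # IANA-assigned VXLAN destination port


def outer_source_port(inner_flow: tuple) -> int:
    """Map an inner flow (e.g. its 5-tuple) to an outer UDP source port.

    CRC32 is an assumption for illustration only. The result stays in
    the dynamic port range 49152-65535.
    """
    digest = zlib.crc32(repr(inner_flow).encode())
    return 49152 + digest % 16384


def ecmp_path(outer_tuple: tuple, n_paths: int) -> int:
    """Pick one of n_paths equal-cost paths by hashing the OUTER header."""
    return zlib.crc32(repr(outer_tuple).encode()) % n_paths


def path_for_flow(src_hv: str, dst_hv: str, inner_flow: tuple, n_paths: int) -> int:
    """Path index the underlay would choose for one tunneled VM-to-VM flow."""
    sport = outer_source_port(inner_flow)
    outer = (src_hv, dst_hv, "udp", sport, VXLAN_UDP_PORT)
    return ecmp_path(outer, n_paths)
```

The point of the design is that the underlay never needs to look inside the tunnel: the inner flow's identity is folded into the outer source port, and an ordinary ECMP hash on the outer header does the rest.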
If it turns out that ECMP isn't possible in the physical network, then there are a few options. One is to use VM placement as a tool to reduce congestion. This has been done in the past, but network virtualization overlays go further by allowing complete freedom to place workloads anywhere in the data center. So if traffic between a pair of VMs is traversing a link and causing congestion, one option is to move a VM so that the traffic no longer traverses that link. In fact, data gathered from the network virtualization layer can be used to guide VM placement decisions to reduce congestion. In other words, the greater visibility that we have in the virtual networking layer can be used to improve performance at the physical network layer.
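Here is a toy sketch of what placement guided by overlay data could look like. It assumes a deliberately simplified topology in which each rack has a single uplink and traffic between VMs in different racks crosses both racks' uplinks. The flow rates stand in for per-flow statistics collected from the vSwitches, and all names (`uplink_load`, `best_rack_for`) are hypothetical.

```python
from collections import defaultdict


def uplink_load(placement: dict, flows: dict) -> dict:
    """Sum traffic per rack uplink.

    placement maps VM name -> rack name; flows maps (vm_a, vm_b) -> rate.
    Traffic between VMs in the same rack never touches an uplink.
    """
    load = defaultdict(float)
    for (vm_a, vm_b), rate in flows.items():
        rack_a, rack_b = placement[vm_a], placement[vm_b]
        if rack_a != rack_b:
            load[rack_a] += rate
            load[rack_b] += rate
    return load


def best_rack_for(vm: str, racks: list, placement: dict, flows: dict) -> str:
    """Return the rack that minimizes the busiest uplink if vm moved there."""

    def peak_load(rack: str) -> float:
        trial = {**placement, vm: rack}  # placement with vm relocated
        return max(uplink_load(trial, flows).values(), default=0.0)

    return min(racks, key=peak_load)


placement = {"vm1": "rackA", "vm2": "rackB", "vm3": "rackB"}
flows = {("vm1", "vm2"): 8.0, ("vm2", "vm3"): 1.0}  # e.g. Gbit/s
print(best_rack_for("vm1", ["rackA", "rackB"], placement, flows))  # rackB
```

A real placement engine would also weigh CPU, memory, and migration cost; this sketch only shows how flow-level data from the overlay can feed the decision.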
Finally, we note that some underlays are appearing with explicit routing capabilities, for example to set up a high-speed, fiber-optic network dedicated to high-bandwidth uses, such as moving large amounts of real-time exchange information to trading desks. The network virtualization overlay is perfectly positioned to provide information about such flows, because it is already gathering flow-level information directly from vSwitches as described above. It's not a big leap to start passing that information to an underlay that can take advantage of it.
To sum up, network virtualization shouldn't be seen as a big problem for network visibility. Instead, it's opening up new possibilities for improved visibility into the communication between VMs. With rich APIs for inspecting virtual networks and a combination of new and existing tools, the ability to understand what's going on in data center networks is only getting better.