Network Brouhaha: Networking, Cloud, Automation, Infrastructure, Containers and General Geekery (http://www.networkbrouhaha.com/)

Cloud Director V to T Migration Videos

Recently I recorded a couple of videos with my teammate, Joseph Polcar, on Cloud Director V to T migration. The first video provides an overview of the migration tool, walks through running and evaluating an assessment, and covers the other steps needed to prepare for a migration. The second video provides an overview of the YAML configuration file used by the migration tool, a walkthrough of what happens during each phase of the migration, and a demonstration of how to perform a rollback. Hopefully you find these videos helpful. Feel free to leave any questions in the comments, or contact me on LinkedIn. ✌️

Fri, 31 Mar 2023 00:00:00 +0000 http://www.networkbrouhaha.com/2023/03/vcd-v2t-videos/
Introducing IP Spaces for VMware Cloud Director <p class="center"><a href="/resources/2023/01/sd-computer-network.png" class="drop-shadow"><img src="/resources/2023/01/sd-computer-network.png" alt="" /></a></p> <p>Welcome! This blog post is about a new feature in <a href="https://www.vmware.com/products/cloud-director.html">VMware Cloud Director</a> (VCD), IP Spaces. As a VMware employee, I want to make it clear that the thoughts and opinions expressed in this post are my own and do not necessarily reflect the position of my employer. With that out of the way, let’s try to wrap our heads around IP Spaces in Cloud Director!</p> <p>I find myself asking the question “Why?” frequently in customer conversations (shout out to <a href="https://www.ted.com/talks/simon_sinek_how_great_leaders_inspire_action">Simon Sinek</a> and the <a href="https://simonsinek.com/books/start-with-why/">Golden Circle</a>!). In this blog post, my goal is to get to the “why” of IP Spaces. I will touch on the “how” and “what”, but these are fully covered in the Cloud Director documentation and other blog posts, which are linked at the bottom of this post.</p> <p class="center"><a href="/resources/2023/01/golden-circle2.png" class="drop-shadow"><img src="/resources/2023/01/golden-circle2.png" alt="" width="400" /></a> <br /><em>Simon Sinek’s Golden Circle</em></p> <h1 id="the-background">The Background</h1> <p>When backed by NSX-V, IP address management in Cloud Director is simple. The typical architecture consists of an external network to which tenant edge gateways connect. The provider specifies a block of usable IPs that can be assigned to the external interface of each edge. If needed, additional IPs can be pulled from the block and assigned to the edge external interface for NAT, Load Balancing VIPs, VPN endpoints, etc. 
Everything the tenant needs to connect to the outside world can be accomplished by assigning one or more IPs to an edge interface, and routing is very simple.</p> <p class="center"><a href="/resources/2023/01/vcd-nsxv-connectivity.png" class="drop-shadow"><img src="/resources/2023/01/vcd-nsxv-connectivity.png" alt="" /></a> <br /><em>Cloud Director External Connectivity with NSX-V</em></p> <p>External connectivity is quite different when Cloud Director is backed by NSX-T. External networking is provided via a T0 Gateway, which is created by the provider and imported into Cloud Director. Each tenant edge gateway is a T1 router that is connected to the T0 (or in some cases, a T0 VRF). Addresses used by the tenant are no longer assigned to an interface, but rather assigned via an endpoint IP, which is essentially a loopback address assigned to the T1. Since there are now multiple hops to get from the data center network, through the T0, to the tenant T1, dynamic routing (e.g. BGP) is typically used to advertise the endpoint IPs that are assigned to the T1. These endpoint IPs can be used to SNAT workloads to the internet or terminate IPsec tunnels, providing very similar functionality to what is available in NSX-V.</p> <p>This change in behavior led to IP address sprawl, and providers struggled to keep track of which tenants were using which IPs. To address this challenge, IP Spaces was born.</p> <p class="center"><a href="/resources/2023/01/vcd-nsxt-connectivity.png" class="drop-shadow"><img src="/resources/2023/01/vcd-nsxt-connectivity.png" alt="" /></a> <br /><em>Cloud Director External Connectivity with NSX-T</em></p> <h1 id="ip-spaces-overview">IP Spaces Overview</h1> <p>In VCD 10.4.1, there is a new configuration section to define IP Spaces. IP Spaces can be Public, Private, or Shared. Public IP Spaces are defined by the provider and specify what public IPs can be consumed by tenants. 
Private IP Spaces are defined by the tenant and are intended to simplify the process of connecting a tenant virtual data center (VDC) to a corporate WAN. Shared IP Spaces are similar to Private IP Spaces, but give providers a streamlined way to deliver dedicated services to tenants, such as NTP, software repos, managed services, etc.</p> <p>The scopes of an IP Space define which networks are internal and which are external, or in other words, which networks are local to VCD, and which are remote. If you are familiar with the old Cisco terminology for NAT, think inside and outside networks. Relating this to NAT is helpful because that is one of the primary reasons that these scopes are defined. In future VCD releases, this information may be used to automatically create NAT and NONAT rules to simplify the configuration of typical architectures.</p> <p>Rounding out the concepts included in an IP Space are IP ranges, IP prefixes, and quota settings. IP ranges can be supplied in list form or CIDR notation and must be within the range defined as the internal scope. Tenants can request individual IPs out of the range to assign for services like NAT or a load balancer VIP. IP prefixes are also constrained to the internal scope, and they define specific subnets that tenants can consume. Quota settings define how many individual IPs and prefixes each tenant can use.</p> <h1 id="the-why">The Why</h1> <p>Defining these parameters – IP Space type, scope, ranges, prefixes, and quotas – provides VCD with far more information than was available with the basic IP address management in previous versions. Providers have fine-grained control over exactly which IP addresses and ranges tenants are allowed to consume. This also means that future VCD releases will have enough information to potentially configure NAT/NONAT rules, firewall rules, and BGP policy (prefix lists/filtering/etc.) for a variety of common topologies. 
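</p>

<p>The containment rule described in the overview (tenant IP ranges and prefixes must fall within the internal scope, and quotas cap how much a tenant can claim) can be sketched with Python’s <code>ipaddress</code> module. This is purely illustrative with made-up addresses, not VCD code:</p>

```python
import ipaddress

def within_scope(scope: str, candidate: str) -> bool:
    """Return True if a candidate prefix or /32 falls inside the internal scope CIDR."""
    scope_net = ipaddress.ip_network(scope)
    return ipaddress.ip_network(candidate).subnet_of(scope_net)

# Hypothetical internal scope and tenant requests
internal_scope = "192.0.2.0/24"
requests = ["192.0.2.64/26", "192.0.2.10/32", "198.51.100.0/28"]
for r in requests:
    print(r, "allowed" if within_scope(internal_scope, r) else "rejected")
```

<p>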
The initial release of IP Spaces is just the beginning, providing a much more manageable and coherent IP address management system for providers and tenants. I am looking forward to seeing what other new capabilities will be unlocked as this feature evolves.</p> <h1 id="helpful-links">Helpful Links</h1> <p>Release Notes: <a href="https://docs.vmware.com/en/VMware-Cloud-Director/10.4.1/rn/vmware-cloud-director-1041-release-notes/index.html">https://docs.vmware.com/en/VMware-Cloud-Director/10.4.1/rn/vmware-cloud-director-1041-release-notes/index.html</a></p> <p>Documentation: <a href="https://docs.vmware.com/en/VMware-Cloud-Director/10.4/VMware-Cloud-Director-Tenant-Portal-Guide/GUID-FB230D89-ACBC-4345-A11A-D099D359ED1B.html">https://docs.vmware.com/en/VMware-Cloud-Director/10.4/VMware-Cloud-Director-Tenant-Portal-Guide/GUID-FB230D89-ACBC-4345-A11A-D099D359ED1B.html</a></p> <p>Other blog posts on IP Spaces:</p> <ul> <li>New Networking Features in VMware Cloud Director 10.4.1: <a href="https://fojta.wordpress.com/2022/12/16/new-networking-features-in-vmware-cloud-director-10-4-1/">https://fojta.wordpress.com/2022/12/16/new-networking-features-in-vmware-cloud-director-10-4-1/</a></li> <li>IP Spaces in VMware Cloud Director 10.4.1 – Part 1 – Introduction &amp; Public IP Spaces: <a href="https://kiwicloud.ninja/?p=69005">https://kiwicloud.ninja/?p=69005</a></li> <li>IP Spaces in VMware Cloud Director 10.4.1 – Part 2 – Private IP Spaces: <a href="https://kiwicloud.ninja/?p=69028">https://kiwicloud.ninja/?p=69028</a></li> <li>IP Spaces in VMware Cloud Director 10.4.1 – Part 3 – Tenant Experience, Compatibility &amp; Summary: <a href="https://kiwicloud.ninja/?p=69044">https://kiwicloud.ninja/?p=69044</a></li> </ul> <h1 id="notes">Notes</h1> <p>The <a href="/resources/2023/01/sd-computer-network.png">two</a> <a href="/resources/2023/01/golden-circle2.png">images</a> at the top of this post were made using <a href="https://en.wikipedia.org/wiki/Stable_Diffusion">Stable 
Diffusion</a>, an AI image generator. The first was generated by a prompt to create a picture with computer networking and clouds. The second was created by modifying a <a href="/resources/2023/01/golden-circle.png">simple diagram</a> using pix2pix and img2img. I find it weird, and I like it.</p> Tue, 31 Jan 2023 00:00:00 +0000 http://www.networkbrouhaha.com/2023/01/vcd-intro-ip-spaces/ 2022 Update: Simple Cloud Automation with VCD, Terraform, ZeroTier and Slack <p>In 2018 I wrote a blog titled <a href="https://networkbrouhaha.com/2018/03/vcd-terraform-example/">Simple cloud automation with vCD, Terraform, ZeroTier and Slack</a>. A lot has changed since I wrote that post, so it’s time to update it. The goal is still the same: deploy a VM (inside a vApp) in Cloud Director and automate network connectivity with ZeroTier. Slack is used to monitor the progress and display the IP address assigned by ZeroTier. Overall, I want to be able to deploy a VM that has outbound internet connectivity and be able to connect to it without having to configure any firewall rules, NAT, or SSL/IPsec VPN.</p> <p>I did make some adjustments to my approach while preparing to write this post. Instead of relying on <a href="https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.vm_admin.doc/GUID-58E346FF-83AE-42B8-BE58-253641D257BC.html">Guest Customization</a> with VMware tools, I chose to use <a href="https://cloud-init.io/">cloud-init</a>. This went so poorly that I wrote a dedicated post on it 😂: <a href="https://networkbrouhaha.com/2022/03/cloud-init-vcd/">Using cloud-init for Customization with VCD and Terraform</a>. 
VCD also has a completely different <a href="https://registry.terraform.io/providers/vmware/vcd/latest">Terraform provider</a> than the one I demoed in 2018, which I will dig into at the end of this post.</p> <h1 id="tools-used-and-prerequisites">Tools Used and Prerequisites</h1> <ul> <li><a href="https://www.vmware.com/products/cloud-director.html">VMware Cloud Director</a> - VMware’s cloud service delivery platform, typically used by service providers in the VMware Cloud Provider Program. I used VCD 10.3 in my lab when using the Terraform code you will see below.</li> <li><a href="https://terraform.io/">HashiCorp Terraform</a> - An open-source tool written in Go, Terraform allows users to define infrastructure as code. Many public cloud <a href="https://registry.terraform.io/browse/providers">providers</a> are supported in Terraform, as well as on-prem infrastructure like <a href="https://registry.terraform.io/providers/hashicorp/vsphere/latest">vSphere</a> and <a href="https://registry.terraform.io/providers/vmware/nsxt/latest">NSX-T</a>. The Terraform provider for VCD is available at <a href="https://registry.terraform.io/providers/vmware/vcd/latest">https://registry.terraform.io/providers/vmware/vcd/latest</a>.</li> <li><a href="https://www.zerotier.com/">ZeroTier</a> - The ZeroTier docs state that “ZeroTier is a smart Ethernet switch for planet Earth.” ZeroTier uses an agent to provide connectivity between endpoints connected to the same ZeroTier network. Anyone can create a free account on the ZeroTier website and create multiple networks. Endpoints connected to ZeroTier are managed through the web portal (or API). In other words, ZeroTier is a simple, free<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>, fast<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup> VPN. 
If you’re wondering how ZeroTier works, check out their awesome <a href="https://docs.zerotier.com/zerotier/manual">whitepaper</a>. My friend and uber-network nerd <a href="https://twitter.com/showipintbri">Tony Efantis</a> provides a deep dive into ZeroTier on YouTube: <a href="https://www.youtube.com/watch?v=Lao9T_RQTak">https://www.youtube.com/watch?v=Lao9T_RQTak</a></li> <li><a href="https://slack.com/">Slack</a> - I’m assuming everyone is familiar with Slack by now. For this example, Slack is used to provide visibility into the process of connecting a new VM to ZeroTier. Slack’s free tier is great for testing simple automation and receiving notifications via webhooks.</li> <li><a href="https://github.com/">GitHub</a> - I’m hosting scripts on GitHub, but any web host could fill this need. If you choose another host, you should still use Git for version control for Terraform code and other scripts. The current script I’m using is at <a href="https://github.com/shamsway/zerotier-installer">https://github.com/shamsway/zerotier-installer</a>, and it is a simplified and modified version of the install script provided by ZeroTier at <a href="https://install.zerotier.com/">https://install.zerotier.com/</a>.</li> </ul> <p>Before deploying anything with Terraform, I installed ZeroTier on my local workstation, uploaded an Ubuntu cloud image OVA to my VCD catalog, and configured an incoming webhook for Slack. 
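</p>

<p>A Slack incoming webhook is just an HTTP POST of a small JSON document with a <code>text</code> field. A minimal sketch with Python’s standard library (the webhook URL below is a placeholder; use the one Slack generates for your workspace):</p>

```python
import json
import urllib.request

def build_slack_message(webhook_url: str, text: str) -> urllib.request.Request:
    """Build the POST that a Slack incoming webhook expects: JSON with a "text" field."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
    )

def notify_slack(webhook_url: str, text: str) -> bytes:
    """Send the message. Slack replies with the body b"ok" on success."""
    with urllib.request.urlopen(build_slack_message(webhook_url, text)) as resp:
        return resp.read()

# Example (placeholder URL):
# notify_slack("https://hooks.slack.com/services/T000/B000/XXXX", "New VM is online")
```

<p>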
My VCD environment is preconfigured to allow outbound internet traffic, but nothing else.</p> <h1 id="terraform-example">Terraform Example</h1> <p>Below is the <code class="language-plaintext highlighter-rouge">main.tf </code>file to create a vApp, attach an existing Org network to the vApp, and clone a VM into the vApp using cloud-init for customization.</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">terraform</span> <span class="p">{</span> <span class="nx">required_providers</span> <span class="p">{</span> <span class="nx">vcd</span> <span class="p">=</span> <span class="p">{</span> <span class="nx">source</span> <span class="p">=</span> <span class="s2">"vmware/vcd"</span> <span class="p">}</span> <span class="p">}</span> <span class="p">}</span> <span class="k">variable</span> <span class="s2">"ztnetwork"</span> <span class="p">{</span> <span class="nx">type</span> <span class="p">=</span> <span class="nx">string</span> <span class="nx">description</span> <span class="p">=</span> <span class="s2">"ZeroTier Network to join"</span> <span class="p">}</span> <span class="k">variable</span> <span class="s2">"ztapi"</span> <span class="p">{</span> <span class="nx">type</span> <span class="p">=</span> <span class="nx">string</span> <span class="nx">sensitive</span> <span class="p">=</span> <span class="kc">true</span> <span class="nx">description</span> <span class="p">=</span> <span class="s2">"ZeroTier API Access Token"</span> <span class="p">}</span> <span class="k">variable</span> <span class="s2">"slack_webhook_url"</span> <span class="p">{</span> <span class="nx">type</span> <span class="p">=</span> <span class="nx">string</span> <span class="nx">description</span> <span class="p">=</span> <span class="s2">"Slack webhook URL"</span> <span class="nx">default</span> <span class="p">=</span> <span class="s2">""</span> <span class="p">}</span> <span class="k">variable</span> <span 
class="s2">"vcd_vm_name"</span> <span class="p">{</span> <span class="nx">type</span> <span class="p">=</span> <span class="nx">string</span> <span class="nx">description</span> <span class="p">=</span> <span class="s2">"Name of new vApp created from template"</span> <span class="p">}</span> <span class="k">resource</span> <span class="s2">"vcd_vapp"</span> <span class="s2">"ubuntu"</span> <span class="p">{</span> <span class="nx">org</span> <span class="p">=</span> <span class="s2">"my-org"</span> <span class="nx">vdc</span> <span class="p">=</span> <span class="s2">"my-vdc"</span> <span class="nx">name</span> <span class="p">=</span> <span class="s2">"ubuntu"</span> <span class="nx">power_on</span> <span class="p">=</span> <span class="kc">true</span> <span class="p">}</span> <span class="k">resource</span> <span class="s2">"vcd_vapp_org_network"</span> <span class="s2">"ubuntu-network"</span> <span class="p">{</span> <span class="nx">org</span> <span class="p">=</span> <span class="s2">"my-org"</span> <span class="nx">vdc</span> <span class="p">=</span> <span class="s2">"my-vdc"</span> <span class="nx">vapp_name</span> <span class="p">=</span> <span class="nx">vcd_vapp</span><span class="p">.</span><span class="nx">ubuntu</span><span class="p">.</span><span class="nx">name</span> <span class="nx">org_network_name</span> <span class="p">=</span> <span class="s2">"org-network"</span> <span class="p">}</span> <span class="k">resource</span> <span class="s2">"vcd_vapp_vm"</span> <span class="s2">"ubuntu"</span> <span class="p">{</span> <span class="nx">org</span> <span class="p">=</span> <span class="s2">"my-org"</span> <span class="nx">vdc</span> <span class="p">=</span> <span class="s2">"my-vdc"</span> <span class="nx">vapp_name</span> <span class="p">=</span> <span class="s2">"ubuntu"</span> <span class="nx">catalog_name</span> <span class="p">=</span> <span class="s2">"my-catalog"</span> <span class="nx">template_name</span> <span class="p">=</span> <span 
class="s2">"ubuntu-2110-cloud"</span> <span class="nx">name</span> <span class="p">=</span> <span class="s2">"ubuntu-vm"</span> <span class="nx">memory</span> <span class="p">=</span> <span class="mi">4096</span> <span class="nx">cpus</span> <span class="p">=</span> <span class="mi">1</span> <span class="nx">os_type</span> <span class="p">=</span> <span class="s2">"ubuntu64Guest"</span> <span class="nx">power_on</span> <span class="p">=</span> <span class="kc">true</span> <span class="nx">network</span> <span class="p">{</span> <span class="nx">type</span> <span class="p">=</span> <span class="s2">"org"</span> <span class="nx">name</span> <span class="p">=</span> <span class="s2">"org-network"</span> <span class="nx">ip_allocation_mode</span> <span class="p">=</span> <span class="s2">"MANUAL"</span> <span class="nx">ip</span> <span class="p">=</span> <span class="s2">"192.168.1.10"</span> <span class="p">}</span> <span class="nx">guest_properties</span> <span class="p">=</span> <span class="p">{</span> <span class="s2">"user-data"</span> <span class="p">=</span> <span class="nx">base64encode</span><span class="p">(</span><span class="nx">templatefile</span><span class="p">(</span><span class="s2">"cloud-config.yaml"</span><span class="p">,</span> <span class="p">{</span> <span class="nx">ztnetwork</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">ztnetwork</span><span class="p">,</span> <span class="nx">ztapi</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">ztapi</span><span class="p">,</span> <span class="nx">slack_webhook_url</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">slack_webhook_url</span><span class="p">,</span> <span class="nx">hostname</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">vcd_vm_name</span> <span class="p">}))</span> <span 
class="p">}</span> <span class="p">}</span> </code></pre></div></div> <p>Most of this is straightforward, but the magic happens in the <code class="language-plaintext highlighter-rouge">guest_properties</code> block of the <code class="language-plaintext highlighter-rouge">vcd_vapp_vm</code> resource. The <code class="language-plaintext highlighter-rouge">user-data</code> property contains a base 64 encoded version of my cloud-init configuration. You can see that the <code class="language-plaintext highlighter-rouge">templatefile()</code> function is used to insert some values needed for the ZeroTier install script: the ZeroTier network to connect to, an API key for ZeroTier, the webhook URL for Slack, and the VM hostname.</p> <p>Here is my cloud-config.yaml, which performs the customization of the VM upon first boot:</p> <div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">#cloud-config</span> <span class="na">hostname</span><span class="pi">:</span> <span class="s">${hostname}</span> <span class="na">users</span><span class="pi">:</span> <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">ubuntu</span> <span class="na">sudo</span><span class="pi">:</span> <span class="pi">[</span><span class="s2">"</span><span class="s">ALL=(ALL)</span><span class="nv"> </span><span class="s">NOPASSWD:ALL"</span><span class="pi">]</span> <span class="na">groups</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">sudo</span><span class="pi">]</span> <span class="na">shell</span><span class="pi">:</span> <span class="s">/bin/bash</span> <span class="na">ssh_authorized_keys</span><span class="pi">:</span> <span class="pi">-</span> <span class="s">ssh-rsa alongstringthatisansshkey</span> <span class="na">manage_resolv_conf</span><span class="pi">:</span> <span class="no">true</span> <span class="na">packages</span><span class="pi">:</span> <span 
class="pi">-</span> <span class="s">python3-pip</span> <span class="pi">-</span> <span class="s">jq</span> <span class="na">runcmd</span><span class="pi">:</span> <span class="pi">-</span> <span class="s">export ZTNETWORK=${ztnetwork}</span> <span class="pi">-</span> <span class="s">export ZTAPI=${ztapi}</span> <span class="pi">-</span> <span class="s">export SLACK_WEBHOOK_URL=${slack_webhook_url}</span> <span class="pi">-</span> <span class="s">wget https://raw.githubusercontent.com/shamsway/zerotier-installer/master/zerotier-installer.sh</span> <span class="pi">-</span> <span class="s">chmod +x zerotier-installer.sh</span> <span class="pi">-</span> <span class="s">./zerotier-installer.sh</span> <span class="pi">-</span> <span class="s">rm zerotier-installer.sh</span> <span class="na">final_message</span><span class="pi">:</span> <span class="s2">"</span><span class="s">The</span><span class="nv"> </span><span class="s">system</span><span class="nv"> </span><span class="s">is</span><span class="nv"> </span><span class="s">ready</span><span class="nv"> </span><span class="s">and</span><span class="nv"> </span><span class="s">prepped</span><span class="nv"> </span><span class="s">(took</span><span class="nv"> </span><span class="s">$UPTIME</span><span class="nv"> </span><span class="s">seconds)"</span> </code></pre></div></div> <p>This cloud-init config will configure the local ubuntu user with sudo privileges, disable password-based logins, add my desired SSH key and install some necessary packages. 
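</p>

<p>Before this file reaches the VM, Terraform renders and encodes it: the <code>templatefile()</code> plus <code>base64encode()</code> combination in <code>main.tf</code> amounts to roughly the following Python. This is only a sketch with a trimmed-down template and hypothetical values, not part of the actual workflow:</p>

```python
import base64
from string import Template

# A trimmed stand-in for cloud-config.yaml; ${hostname} mirrors the Terraform template variable
cloud_config = "#cloud-config\nhostname: ${hostname}\n"

def render_user_data(template: str, **values: str) -> str:
    """Substitute ${var} placeholders and base64-encode the result,
    like templatefile() followed by base64encode()."""
    rendered = Template(template).substitute(values)  # string.Template also uses ${var} syntax
    return base64.b64encode(rendered.encode("utf-8")).decode("ascii")

user_data = render_user_data(cloud_config, hostname="ubuntu-vm")
print(user_data)  # the value that lands in the "user-data" guest property
```

<p>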
The <code class="language-plaintext highlighter-rouge">runcmd</code> block is the bit that actually downloads my ZeroTier installer from GitHub and executes it, connecting the VM to my ZeroTier network and providing output to Slack.</p> <p>Now, let’s see this in action.</p> <h1 id="workflow">Workflow</h1> <p>The output from <code class="language-plaintext highlighter-rouge">terraform apply</code> looks just as you’d expect if you’ve ever seen Terraform run:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Plan: 3 to add, 0 to change, 0 to destroy. Do you want to perform these actions? Terraform will perform the actions described above. Only 'yes' will be accepted to approve. Enter a value: yes vcd_vapp.ubuntu-zt: Creating... vcd_vapp.ubuntu-zt: Still creating... [10s elapsed] vcd_vapp.ubuntu-zt: Creation complete after 16s [id=urn:vcloud:vapp:db4d4ee7-b171-45dc-a98a-67cd717db127] vcd_vapp_org_network.ubuntu-zt-network: Creating... vcd_vapp_vm.ubuntu: Creating... vcd_vapp_org_network.ubuntu-zt-network: Creation complete after 5s [id=urn:vcloud:network:1b61037f-dc6d-4ae5-aefc-59962de1e647] vcd_vapp_vm.ubuntu: Still creating... [10s elapsed] [snip] vcd_vapp_vm.ubuntu: Creation complete after 1m58s [id=urn:vcloud:vm:d20caca3-8b80-45da-8435-c4d44c988ccb] Apply complete! Resources: 3 added, 0 changed, 0 destroyed. </code></pre></div></div> <p>VCD creates the vApp, clones a template VM into the vApp, and powers it on. When the VM boots, cloud-init runs and executes each step specified in cloud-config.yaml, which will ultimately connect the new VM to my ZeroTier network. API calls are used to authorize the new VM to connect to my ZeroTier network automatically, so I don’t have to go in and manually accept the new VM in the ZeroTier portal. 
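</p>

<p>The authorization call itself is small. As I understand the ZeroTier Central API, it is a POST to the network member endpoint setting <code>config.authorized</code>; the endpoint shape here is from memory, so verify against the current API docs before relying on it:</p>

```python
import json
import urllib.request

API_BASE = "https://api.zerotier.com/api/v1"

def build_authorize_request(network_id: str, node_id: str, token: str) -> urllib.request.Request:
    """Build the request that marks a member as authorized on a ZeroTier network."""
    body = json.dumps({"config": {"authorized": True}}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/network/{network_id}/member/{node_id}",
        data=body,
        headers={"Authorization": f"token {token}", "Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical IDs; to actually send it:
# req = build_authorize_request("8056c2e21c000001", "abcdef1234", "MY_API_TOKEN")
# urllib.request.urlopen(req)
```

<p>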
The process of connecting the VM to ZeroTier is output to Slack, and once complete I can grab the provided IP and immediately connect to the new VM.</p> <p class="center"><a href="/resources/2022/03/vcd-automation-slack.png" class="drop-shadow"><img src="/resources/2022/03/vcd-automation-slack.png" alt="" /></a></p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>user@ubuntu:~$ ssh [email protected] The authenticity of host '172.29.189.205 (172.29.189.205)' can't be established. ECDSA key fingerprint is SHA256:sOGaDtQ6D6bvIhmr/YhKt6Olt9EsVNRNGAomfVuIW1o. Are you sure you want to continue connecting (yes/no/[fingerprint])? yes Warning: Permanently added '172.29.189.205' (ECDSA) to the list of known hosts. Welcome to Ubuntu 21.10 (GNU/Linux 5.13.0-28-generic x86_64) * Documentation: https://help.ubuntu.com * Management: https://landscape.canonical.com * Support: https://ubuntu.com/advantage System information as of Wed Mar 16 23:47:03 UTC 2022 System load: 0.03 Processes: 138 Usage of /: 23.0% of 9.52GB Users logged in: 0 Memory usage: 6% IPv4 address for ens192: 192.168.1.10 Swap usage: 0% IPv4 address for ztmjfe5xok: 172.29.189.205 The programs included with the Ubuntu system are free software; the exact distribution terms for each program are described in the individual files in /usr/share/doc/*/copyright. Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by applicable law. To run a command as administrator (user "root"), use "sudo &lt;command&gt;". See "man sudo_root" for details. ubuntu@ubuntu-impish-21:~$ ping google.com PING google.com (142.250.191.238) 56(84) bytes of data. 
64 bytes from ord38s32-in-f14.1e100.net (142.250.191.238): icmp_seq=1 ttl=113 time=3.54 ms 64 bytes from ord38s32-in-f14.1e100.net (142.250.191.238): icmp_seq=2 ttl=113 time=3.60 ms </code></pre></div></div> <p>Notice that SSH key-based authentication is used instead of a password, which is common practice for instances running in the cloud.</p> <p>So there it is - a VM deployed into VCD and automatically connected to ZeroTier, making it available without having to configure any sort of inbound firewall rules, NAT, or IPSec/SSL VPN.</p> <h1 id="state-of-the-vcd-terraform-provider-in-2022">State of the VCD Terraform Provider in 2022</h1> <p>When I wrote about this in 2018, the VCD Terraform provider was written by HashiCorp and was based on a go library named <code class="language-plaintext highlighter-rouge">govcloudair</code>. This library was not maintained by VMware and it was not actively developed, meaning that the VCD provider supported a limited number of features. I am happy to report that the <a href="https://registry.terraform.io/providers/vmware/vcd/latest">current VCD provider</a> is in a much better state. The provider is actively developed by VMware along with the underlying go library, <a href="https://github.com/vmware/go-vcloud-director">go-vcloud-director</a>. As of March 2022, there were <strong>over 2 million installs</strong> of the VCD Terraform provider, and new features are being added regularly. Many of the workarounds and caveats I mentioned in my 2018 post are no longer required. Huzzah!</p> <p class="center"><img src="https://media.giphy.com/media/d7qN2d6ktQphUeDoQ4/giphy.gif" alt="" /></p> <h1 id="final-thoughts">Final Thoughts</h1> <p>Here are a few random thoughts/potential improvements:</p> <ul> <li>This same workflow could be used in any cloud environment. It would require outbound internet access to be enabled, and cloud-init is well supported across cloud providers. 
Each cloud provider’s Terraform provider documentation should contain examples for using cloud-init.</li> <li>Cloud-init could be used to install ZeroTier and send the output to Slack, but I didn’t want to spend the time to convert my install script. Initially, I used a script hosted on GitHub because there was a limit on the size of a script that can be used with Guest Customization, but cloud-init does not have that limit. I may convert my install script over to cloud-init at a later date.</li> <li>The ZeroTier install script uses <a href="https://github.com/philippbosch/slack-webhook-cli">https://github.com/philippbosch/slack-webhook-cli</a> to send messages to Slack, which requires Python to be installed. Installing Python adds time to the process. Sending messages to Slack is just a webhook, so a bash script could be used instead. This would remove the requirement to install Python and the whole process would be a bit faster.</li> </ul> <h1 id="resources">Resources</h1> <ul> <li>VCD Terraform provider: <a href="https://registry.terraform.io/providers/vmware/vcd/latest">https://registry.terraform.io/providers/vmware/vcd/latest</a></li> <li>Go-vcloud-director library: <a href="https://github.com/vmware/go-vcloud-director">https://github.com/vmware/go-vcloud-director</a></li> <li>ZeroTier documentation: <a href="https://docs.zerotier.com/zerotier/manual/">https://docs.zerotier.com/zerotier/manual/</a></li> <li>ZeroTier overview on Wikipedia: <a href="https://en.wikipedia.org/wiki/ZeroTier">https://en.wikipedia.org/wiki/ZeroTier</a></li> <li>How Does ZeroTier Actually Work? <a href="https://www.youtube.com/watch?v=Lao9T_RQTak">https://www.youtube.com/watch?v=Lao9T_RQTak</a></li> </ul> <div class="footnotes" role="doc-endnotes"> <ol> <li id="fn:1" role="doc-endnote"> <p>GPL license / Up to 100 devices / Requires license to embed in commercial products. 
<a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p> </li> <li id="fn:2" role="doc-endnote"> <p>Quick setup, but actual traffic may proxy through ZeroTier servers. There is no throughput guarantee. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p> </li> </ol> </div> Thu, 10 Mar 2022 00:00:00 +0000 http://www.networkbrouhaha.com/2022/03/vcd-verraform-example/ Using cloud-init for Customization with VCD and Terraform <p>Recently I decided to update a blog post I wrote in 2018, <a href="https://networkbrouhaha.com/2018/03/vcd-terraform-example/">Simple cloud automation with vCD, Terraform, ZeroTier and Slack</a>. At a very high level, this blog post walks through deploying a vApp to VCD that is customized to run a script at first boot. In the original blog post, I relied on <a href="https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.vm_admin.doc/GUID-58E346FF-83AE-42B8-BE58-253641D257BC.html">Guest Customization</a> with VMware tools to accomplish this. For a variety of reasons - primarily curiosity - I decided to use <a href="https://cloud-init.io/">cloud-init</a> to run the script instead. Cloud-init is quite flexible and well supported, but in hindsight, my choice led me down quite a rabbit hole. This post covers the details of how cloud-init reads its configuration through VMware tools, tips for troubleshooting cloud-init, and some other lessons learned along the way. Of course, I’ll share a working example that deploys a vApp to VCD using cloud-init for customization.</p> <p>The act that set the stage for this post is something I have done many times: I uploaded an Ubuntu ISO to a VCD catalog and used it to create a vApp. That vApp, and the single VM it contained, would be added to the same VCD catalog as a vApp template. 
This was my first mistake, but it took me several hours to figure out why.</p> <p class="center"><img src="https://media.giphy.com/media/xUPGcl3ijl0vAEyIDK/giphy.gif" alt="" /></p> <p>Before we get into that, let’s level set on how cloud-init works.</p> <h1 id="the-basics-of-cloud-init">The Basics of cloud-init</h1> <p>Here is how cloud-init describes itself:</p> <p>“Cloud-init is the industry standard multi-distribution method for cross-platform cloud instance initialization. It is supported across all major public cloud providers, provisioning systems for private cloud infrastructure, and bare-metal installations.” -<a href="https://cloudinit.readthedocs.io/">https://cloudinit.readthedocs.io/</a></p> <p>Taking a look at <a href="https://cloudinit.readthedocs.io/en/latest/topics/examples.html">the provided configuration examples</a> makes it clear what the capabilities are:</p> <ul> <li>Add/configure users</li> <li>Create files</li> <li>Install or update software</li> <li>Configure networking</li> <li>Configure Certificate Authorities</li> <li>Run scripts/arbitrary commands</li> <li>And <a href="https://cloudinit.readthedocs.io/en/latest/topics/modules.html">much more</a></li> </ul> <p>The typical scenario for cloud-init is that a config file is supplied when a server boots, then read and executed by cloud-init. The cloud-init docs refer to the config file as <code class="language-plaintext highlighter-rouge">user-data</code>. So, how is <code class="language-plaintext highlighter-rouge">user-data</code> supplied? The details vary, but a datasource is the vehicle that delivers configuration files to cloud-init. 
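Before digging into datasources, it helps to see what <code class="language-plaintext highlighter-rouge">user-data</code> actually looks like: a YAML file that starts with a <code class="language-plaintext highlighter-rouge">#cloud-config</code> header. Here is a minimal, hypothetical sketch (the user name, key, and commands are placeholders, not from this post) that exercises a few of the capabilities listed above:

```yaml
#cloud-config
# Hypothetical minimal user-data: create a user, install a package,
# and run a command at first boot.
users:
  - name: demo
    sudo: ALL=(ALL) NOPASSWD:ALL
    shell: /bin/bash
    ssh_authorized_keys:
      - ssh-ed25519 AAAA_placeholder_key demo@example.com
packages:
  - htop
runcmd:
  - echo "cloud-init finished" >> /var/tmp/first-boot.log
```

cloud-init only applies this file on first boot, which is why the delivery mechanism — the datasource — matters so much.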
Cloud-init supports several <a href="https://cloudinit.readthedocs.io/en/latest/topics/datasources.html">datasources</a> to deliver <code class="language-plaintext highlighter-rouge">user-data</code> (there are datasources available for major cloud providers), but in a VMware environment the most promising options are <a href="https://cloudinit.readthedocs.io/en/latest/topics/datasources/ovf.html">OVF</a> and <a href="https://cloudinit.readthedocs.io/en/latest/topics/datasources/vmware.html">VMware</a>.</p> <ul> <li>The <a href="https://cloudinit.readthedocs.io/en/latest/topics/datasources/vmware.html">VMware datasource docs</a> state that it supports <code class="language-plaintext highlighter-rouge">GuestInfo</code> keys for supplying <code class="language-plaintext highlighter-rouge">user-data</code>. <code class="language-plaintext highlighter-rouge">GuestInfo</code> is metadata in the form of key/value pairs set in a VM’s <code class="language-plaintext highlighter-rouge">extraConfig</code> property, which can be read by VMware tools. As long as this metadata can be set via the VCD Terraform provider, this sounds like the datasource that would be used by cloud-init.</li> <li>The <a href="https://cloudinit.readthedocs.io/en/latest/topics/datasources/ovf.html">OVF datasource docs</a> state “The OVF Datasource provides a datasource for reading data from on an Open Virtualization Format ISO transport.” That sounds less promising. I’m not interested in building an ISO to bootstrap cloud-init.</li> </ul> <p>Cue my surprise when I finally got cloud-init working, and the logs indicated that it used the OVF datasource. 
The datasource used by cloud-init can be checked with the <code class="language-plaintext highlighter-rouge">cloud-id</code> command, and this was the output I received:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ubuntu@ubuntu-impish-21:~$ cloud-id ovf </code></pre></div></div> <p>Since all of the cloud-init code is available on GitHub, it’s not too difficult to see how the various data sources work. After a bit of snooping, it’s clear that the OVF datasource also reads the <code class="language-plaintext highlighter-rouge">extraConfig</code> metadata through VMware tools. In this case, it appears that the cloud-init docs are out of date. That was one of many valuable lessons during this process. Let me share two important ones with you.</p> <h1 id="lesson-1-check-github-issues">Lesson #1: Check GitHub issues</h1> <p>The VCD Terraform Provider docs have a <a href="https://registry.terraform.io/providers/vmware/vcd/latest/docs/guides/vm_guest_customization">section on guest customization</a>, but it doesn’t mention cloud-init specifically. It does show an example of configuring metadata with the provider, so I felt confident that I could supply cloud-init <code class="language-plaintext highlighter-rouge">user-data</code> with that method. I mentioned in the intro that I made a mistake by attempting to use cloud-init with an Ubuntu server that I built from an ISO. I’m quite sure there is a way to make it work, but I kept hitting roadblocks. Had I skimmed the resolved issues in the VCD Terraform Provider repo, I would have found <a href="https://github.com/vmware/terraform-provider-vcd/issues/667#issuecomment-844030920">this helpful comment</a>:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>The problem that I had was the OVA machine I tried to use. A standard version of Ubuntu. 
First part to make this working correctly is to download the cloud image at: http://cloud-images.ubuntu.com/ </code></pre></div></div> <p>The commenter then goes on to provide a working example of using cloud-init with the VCD Terraform Provider. Normally I do a search through GitHub issues when I’m troubleshooting something. In this case, inexplicably, I did not. If I had read that comment first, I would have saved a lot of time. However, I would not have learned so many useful strategies for troubleshooting cloud-init.</p> <p class="center"><img src="https://media.giphy.com/media/3o7aD4ubUVr8EkgQF2/giphy.gif" alt="" /></p> <h1 id="lesson-2-use-a-cloud-image">Lesson #2: Use a Cloud Image</h1> <p>I was aware cloud images existed, but I was set in my ways. I’d used a bootable ISO to build a Linux VM template so many times and I didn’t consider that there was an easier option. I also assumed cloud images were purely for cloud providers, and I didn’t bother to check if there was a VMware flavor available. Lesson learned. There’s a great post on using the Ubuntu cloud image on vSphere here: <a href="https://d-nix.nl/2021/04/using-the-ubuntu-cloud-image-in-vmware/">https://d-nix.nl/2021/04/using-the-ubuntu-cloud-image-in-vmware/</a>. That only covers the vSphere side of things, but that post is a great explainer.</p> <h1 id="deploying-and-customizing-a-vcd-vapp-with-terraform">Deploying and Customizing a VCD vApp with Terraform</h1> <p>With those (rather obvious) lessons learned, <strong>let’s do this thing</strong>.</p> <p class="center"><img src="https://media.giphy.com/media/tyxovVLbfZdok/giphy.gif" alt="" /></p> <p>You will need the following:</p> <ul> <li>A <code class="language-plaintext highlighter-rouge">cloud-config.yaml</code> file, containing the cloud-init <code class="language-plaintext highlighter-rouge">user-data</code>. The file extension is a clue that this is a YAML-formatted file. 
If you have cloud-init installed locally, you can verify that it is a valid config with <code class="language-plaintext highlighter-rouge">cloud-init devel schema -c cloud-config.yaml</code>. I highly recommend that you do this.</li> <li>A cloud image OVA downloaded on your local workstation. For Ubuntu, these are available at <a href="http://cloud-images.ubuntu.com/">http://cloud-images.ubuntu.com/</a></li> </ul> <h2 id="creating-a-catalog">Creating a Catalog</h2> <p>Creating a catalog in VCD with Terraform is pretty simple. Here is an example:</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">resource</span> <span class="s2">"vcd_catalog"</span> <span class="s2">"mycatalog"</span> <span class="p">{</span> <span class="nx">org</span> <span class="p">=</span> <span class="s2">"my-org"</span> <span class="nx">name</span> <span class="p">=</span> <span class="s2">"my-catalog"</span> <span class="nx">description</span> <span class="p">=</span> <span class="s2">"Catalog created by Terraform"</span> <span class="nx">delete_recursive</span> <span class="p">=</span> <span class="s2">"true"</span> <span class="nx">delete_force</span> <span class="p">=</span> <span class="s2">"true"</span> <span class="p">}</span> </code></pre></div></div> <h2 id="uploading-an-ova-to-a-catalog">Uploading an OVA to a Catalog</h2> <p>Similarly, adding the cloud image OVA to the new catalog is straightforward. 
The upload time will be dependent on the bandwidth available, but the Ubuntu 21.10 cloud image is only about 540 MB.</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">resource</span> <span class="s2">"vcd_catalog_item"</span> <span class="s2">"ubuntu-2110-cloud"</span> <span class="p">{</span> <span class="nx">org</span> <span class="p">=</span> <span class="nx">vcd_catalog</span><span class="p">.</span><span class="nx">mycatalog</span><span class="p">.</span><span class="nx">org</span> <span class="nx">catalog</span> <span class="p">=</span> <span class="nx">vcd_catalog</span><span class="p">.</span><span class="nx">mycatalog</span><span class="p">.</span><span class="nx">name</span> <span class="nx">name</span> <span class="p">=</span> <span class="s2">"ubuntu-2110-cloud"</span> <span class="nx">description</span> <span class="p">=</span> <span class="s2">"Ubuntu 21.10 cloud image"</span> <span class="nx">ova_path</span> <span class="p">=</span> <span class="s2">"./impish-server-cloudimg-amd64.ova"</span> <span class="nx">upload_piece_size</span> <span class="p">=</span> <span class="mi">10</span> <span class="p">}</span> </code></pre></div></div> <h2 id="deploying-the-vapp">Deploying the vApp</h2> <p>This is the final step, and it requires a few different Terraform resources, but it’s not too difficult to follow.</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">resource</span> <span class="s2">"vcd_vapp"</span> <span class="s2">"ubuntu"</span> <span class="p">{</span> <span class="nx">org</span> <span class="p">=</span> <span class="s2">"my-org"</span> <span class="nx">vdc</span> <span class="p">=</span> <span class="s2">"my-vdc"</span> <span class="nx">name</span> <span class="p">=</span> <span class="s2">"ubuntu"</span> <span class="nx">power_on</span> <span class="p">=</span> <span class="kc">true</span> <span class="p">}</span> 
<span class="k">resource</span> <span class="s2">"vcd_vapp_org_network"</span> <span class="s2">"ubuntu-network"</span> <span class="p">{</span> <span class="nx">org</span> <span class="p">=</span> <span class="s2">"my-org"</span> <span class="nx">vdc</span> <span class="p">=</span> <span class="s2">"my-vdc"</span> <span class="nx">vapp_name</span> <span class="p">=</span> <span class="nx">vcd_vapp</span><span class="p">.</span><span class="nx">ubuntu</span><span class="p">.</span><span class="nx">name</span> <span class="nx">org_network_name</span> <span class="p">=</span> <span class="s2">"org-network"</span> <span class="p">}</span> <span class="k">resource</span> <span class="s2">"vcd_vapp_vm"</span> <span class="s2">"ubuntu"</span> <span class="p">{</span> <span class="nx">org</span> <span class="p">=</span> <span class="s2">"my-org"</span> <span class="nx">vdc</span> <span class="p">=</span> <span class="s2">"my-vdc"</span> <span class="nx">vapp_name</span> <span class="p">=</span> <span class="nx">vcd_vapp</span><span class="p">.</span><span class="nx">ubuntu</span><span class="p">.</span><span class="nx">name</span> <span class="nx">catalog_name</span> <span class="p">=</span> <span class="s2">"my-catalog"</span> <span class="nx">template_name</span> <span class="p">=</span> <span class="s2">"ubuntu-2110-cloud"</span> <span class="nx">name</span> <span class="p">=</span> <span class="s2">"ubuntu-vm"</span> <span class="nx">memory</span> <span class="p">=</span> <span class="mi">4096</span> <span class="nx">cpus</span> <span class="p">=</span> <span class="mi">1</span> <span class="nx">os_type</span> <span class="p">=</span> <span class="s2">"ubuntu64Guest"</span> <span class="nx">power_on</span> <span class="p">=</span> <span class="kc">true</span> <span class="nx">network</span> <span class="p">{</span> <span class="nx">type</span> <span class="p">=</span> <span class="s2">"org"</span> <span class="nx">name</span> <span class="p">=</span> <span 
class="s2">"org-network"</span> <span class="nx">ip_allocation_mode</span> <span class="p">=</span> <span class="s2">"MANUAL"</span> <span class="nx">ip</span> <span class="p">=</span> <span class="s2">"192.168.1.10"</span> <span class="p">}</span> <span class="nx">guest_properties</span> <span class="p">=</span> <span class="p">{</span> <span class="s2">"user-data"</span> <span class="p">=</span> <span class="nx">base64encode</span><span class="p">(</span><span class="nx">file</span><span class="p">(</span><span class="s2">"cloud-config.yaml"</span><span class="p">))</span> <span class="p">}</span> <span class="p">}</span> </code></pre></div></div> <ul> <li>The <code class="language-plaintext highlighter-rouge">vcd_vapp</code> resource creates the new vApp that contains a single VM running the cloud image template in my catalog</li> <li>The <code class="language-plaintext highlighter-rouge">vcd_vapp_org_network</code> resource attaches an existing org network to the new vApp</li> <li>The <code class="language-plaintext highlighter-rouge">vcd_vapp_vm </code>resource provides all of the configuration for the single VM that will be in the new vApp, including the cloud-init <code class="language-plaintext highlighter-rouge">user-data</code></li> </ul> <p>Most of the config in the <code class="language-plaintext highlighter-rouge">vcd_vapp_vm </code>resource is what you’d expect - compute, memory, and networking settings. The <code class="language-plaintext highlighter-rouge">guest_properties</code> section is the important bit. It configures the <code class="language-plaintext highlighter-rouge">extraConfig</code> property on the VM, which is where cloud-init will read the <code class="language-plaintext highlighter-rouge">user-data</code> from. Notice that <a href="https://www.terraform.io/language/functions/file">file()</a> reads the <code class="language-plaintext highlighter-rouge">cloud-config.yaml</code> file, and the <a href="https://www.terraform.io/language/functions/base64encode">base64encode()</a> function converts its contents into a single, long, encoded string. 
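If you want to eyeball that encoded payload outside of Terraform, the same transformation is easy to reproduce with standard shell tools. This quick sketch uses a throwaway two-line config, not the post's actual file:

```shell
# Reproduce Terraform's base64 encoding of a cloud-config file with plain
# shell tools, then decode it again to show the round trip is lossless.
cat > cloud-config.yaml <<'EOF'
#cloud-config
hostname: demo-vm
EOF

# Encode and strip newlines so the result is one long string, similar to
# what ends up in the VM's guest properties.
encoded=$(base64 cloud-config.yaml | tr -d '\n')
echo "$encoded"

# Decode to inspect what cloud-init will actually receive.
printf '%s' "$encoded" | base64 -d
```

Decoding the string you pull back from the VM's guest properties the same way is a quick check that the right config actually made it onto the VM.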
This is how cloud-init expects the <code class="language-plaintext highlighter-rouge">user-data</code> to be passed over.</p> <p>If you have values in your <code class="language-plaintext highlighter-rouge">cloud-config.yaml</code> file that you need to change on the fly, like credentials or API keys, you can use the <a href="https://www.terraform.io/language/functions/templatefile">templatefile()</a> function to insert those values into the config file before encoding it. Keep in mind that <code class="language-plaintext highlighter-rouge">user-data</code> may contain sensitive data, and base64 is trivial to decode. In a production environment, you should remove the <code class="language-plaintext highlighter-rouge">user-data</code> from the VM after first boot.</p> <p>I traveled down a winding road to get here, but I finally assembled all of the pieces needed to do what I originally set out to do: <a href="/2022/03/vcd-verraform-example/">update an old blog post</a>. If all you needed was some tips on using cloud-init with Terraform and VCD, you can go along your merry way. Stick around if you want some tips on troubleshooting cloud-init.</p> <h1 id="troubleshooting-cloud-init">Troubleshooting cloud-init</h1> <p>Here are some basic troubleshooting steps for cloud-init with vSphere/VCD:</p> <ul> <li>Make sure you have a recent version of VMware Tools installed. This is required to read the metadata associated with the VM.</li> <li>Make sure you are using a cloud image <em>or</em> you have taken the steps to ensure that your VM is properly configured to work with cloud-init. You can see an example of this with the <code class="language-plaintext highlighter-rouge">govc</code> tool at <a href="https://github.com/vmware/govmomi/blob/master/govc/USAGE.md#vmchange">https://github.com/vmware/govmomi/blob/master/govc/USAGE.md#vmchange</a>.</li> <li>Verify that VMware Tools is able to access VM metadata. 
You can use the command <code class="language-plaintext highlighter-rouge">vmware-rpctool 'info-get guestinfo.ovfEnv' </code>to check this. If the command returns a slew of XML, it is working as expected.</li> <li>Verify the VM metadata. You can view this in vSphere by browsing to the <code class="language-plaintext highlighter-rouge">VM -&gt; Settings -&gt; vApp Options</code>. Base64 encoded <code class="language-plaintext highlighter-rouge">user-data</code> should be visible under the properties section, and you can click the “View OVF Environment” button to see the XML formatted version of the metadata. This is the same information you should see from running the <code class="language-plaintext highlighter-rouge">vmware-rpctool</code> command on the VM. You can also view these properties in VCD by viewing the Guest Properties section in the VM properties.</li> <li>Check the cloud-init logs at <code class="language-plaintext highlighter-rouge">/var/log/cloud-init.log</code> and <code class="language-plaintext highlighter-rouge">/var/log/cloud-init-output.log</code> for errors and warnings.</li> <li>Run <code class="language-plaintext highlighter-rouge">cloud-id</code> to verify that the correct datasource is being used. If the output is <code class="language-plaintext highlighter-rouge">fallback</code> or <code class="language-plaintext highlighter-rouge">none</code>, cloud-init was not able to detect the datasource.</li> <li><code class="language-plaintext highlighter-rouge">ds-identify</code> is used by cloud-init to find all available datasources. Check the logs at <code class="language-plaintext highlighter-rouge">/run/cloud-init/ds-identify.log</code> to see why the desired datasource is not found.</li> <li>While troubleshooting, you can completely reset cloud-init with <code class="language-plaintext highlighter-rouge">sudo cloud-init clean --logs</code>, and reboot to have cloud-init run again. 
This saves time over redeploying a template.</li> </ul> <h1 id="resources">Resources</h1> <ul> <li>Terraform VCD provider: <a href="https://registry.terraform.io/providers/vmware/vcd/3.5.1">https://registry.terraform.io/providers/vmware/vcd/3.5.1</a></li> <li><code class="language-plaintext highlighter-rouge">vcd_catalog </code>resource: <a href="https://registry.terraform.io/providers/vmware/vcd/latest/docs/resources/catalog">https://registry.terraform.io/providers/vmware/vcd/latest/docs/resources/catalog</a></li> <li><code class="language-plaintext highlighter-rouge">vcd_catalog_item </code>resource: <a href="https://registry.terraform.io/providers/vmware/vcd/latest/docs/resources/catalog_item">https://registry.terraform.io/providers/vmware/vcd/latest/docs/resources/catalog_item</a></li> <li><code class="language-plaintext highlighter-rouge">vcd_vapp </code>resource: <a href="https://registry.terraform.io/providers/vmware/vcd/latest/docs/resources/vapp">https://registry.terraform.io/providers/vmware/vcd/latest/docs/resources/vapp</a></li> <li><code class="language-plaintext highlighter-rouge">vcd_vapp_org_network </code>resource: <a href="https://registry.terraform.io/providers/vmware/vcd/latest/docs/resources/vapp_org_network">https://registry.terraform.io/providers/vmware/vcd/latest/docs/resources/vapp_org_network</a></li> <li><code class="language-plaintext highlighter-rouge">vcd_vapp_vm </code>resource: <a href="https://registry.terraform.io/providers/vmware/vcd/latest/docs/resources/vapp_vm">https://registry.terraform.io/providers/vmware/vcd/latest/docs/resources/vapp_vm</a></li> <li>OVF Runtime Environment: <a href="https://williamlam.com/2012/06/ovf-runtime-environment.html">https://williamlam.com/2012/06/ovf-runtime-environment.html</a></li> <li>Using the Ubuntu Cloud Image in VMware: <a href="https://d-nix.nl/2021/04/using-the-ubuntu-cloud-image-in-vmware/">https://d-nix.nl/2021/04/using-the-ubuntu-cloud-image-in-vmware/</a></li> <li>Terraform, vSphere, 
and Cloud-Init oh my! <a href="https://grantorchard.com/terraform-vsphere-cloud-init/">https://grantorchard.com/terraform-vsphere-cloud-init/</a></li> <li>Cloud-init config examples: <a href="https://cloudinit.readthedocs.io/en/latest/topics/examples.html">https://cloudinit.readthedocs.io/en/latest/topics/examples.html</a></li> </ul> Thu, 10 Mar 2022 00:00:00 +0000 http://www.networkbrouhaha.com/2022/03/cloud-init-vcd/ http://www.networkbrouhaha.com/2022/03/cloud-init-vcd/ Intro to Google Cloud VMware Engine – Common Networking Scenarios <p>This post will cover some common networking scenarios in Google Cloud VMware Engine (GCVE), like exposing a VM via public IP, accessing cloud-native services, and configuring a basic load balancer in NSX-T. I’ll also recap some important and useful features in GCP and GCVE. There is a lot of material covered, so I’ve provided a table of contents to allow you to skip to the topic you’re interested in.</p> <div style="position: relative;"> <a href="#toc-skipped" class="screen-reader-only">Skip table of contents</a> </div> <h1 class="no_toc" id="table-of-contents">Table of Contents</h1> <ul id="markdown-toc"> <li><a href="#creating-workload-segments-in-nsx-t" id="markdown-toc-creating-workload-segments-in-nsx-t">Creating Workload Segments in NSX-T</a></li> <li><a href="#exposing-a-vm-via-public-ip" id="markdown-toc-exposing-a-vm-via-public-ip">Exposing a VM via Public IP</a> <ul> <li><a href="#creating-firewall-rules" id="markdown-toc-creating-firewall-rules">Creating Firewall Rules</a></li> </ul> </li> <li><a href="#load-balancing-with-nsx-t" id="markdown-toc-load-balancing-with-nsx-t">Load Balancing with NSX-T</a></li> <li><a href="#accessing-cloud-native-services" id="markdown-toc-accessing-cloud-native-services">Accessing Cloud-Native Services</a> <ul> <li><a href="#google-private-access" id="markdown-toc-google-private-access">Google Private Access</a></li> </ul> </li> <li><a href="#viewing-routing-information" 
id="markdown-toc-viewing-routing-information">Viewing Routing Information</a> <ul> <li><a href="#vpc-routes" id="markdown-toc-vpc-routes">VPC Routes</a></li> <li><a href="#vpc-network-peering-routes" id="markdown-toc-vpc-network-peering-routes">VPC Network Peering Routes</a></li> <li><a href="#nsx-t" id="markdown-toc-nsx-t">NSX-T</a></li> </ul> </li> <li><a href="#vpn-connectivity" id="markdown-toc-vpn-connectivity">VPN Connectivity</a></li> <li><a href="#dns-notes" id="markdown-toc-dns-notes">DNS Notes</a></li> <li><a href="#wrap-up" id="markdown-toc-wrap-up">Wrap Up</a></li> <li><a href="#helpful-resources" id="markdown-toc-helpful-resources">Helpful Resources</a></li> </ul> <div id="toc-skipped"></div> <p><strong>Other posts in this series:</strong></p> <ul> <li><a href="/2021/02/gcve-sddc-with-hcx/">Deploying a GCVE SDDC with HCX</a></li> <li><a href="/2021/02/gcp-vpc-to-gcve/">Connecting a VPC to GCVE</a></li> <li><a href="/2021/03/gcve-bastion/">Bastion Host Access with IAP</a></li> <li><a href="/2021/03/gcve-network-overview/">Network and Connectivity Overview</a></li> <li><a href="/2021/04/gcve-hcx-config/">HCX Configuration</a></li> </ul> <h1 id="creating-workload-segments-in-nsx-t">Creating Workload Segments in NSX-T</h1> <p>Your GCVE SDDC initially comes with networking pre-configured, and you don’t need to worry about configuring and trunking VLANs. Instead, any new networking configuration will be done in NSX-T. 
If you are new to NSX-T, the GCVE documentation <a href="https://cloud.google.com/vmware-engine/docs/networking/howto-create-vlan-subnet">covers creating new workload segments</a>, which should be your first step before creating or migrating any VMs to your GCVE SDDC.</p> <p class="center"><a href="/resources/2021/05/68_gcve_diagram_1.png" class="drop-shadow"><img src="/resources/2021/05/68_gcve_diagram_1.png" alt="" /></a></p> <p>This diagram represents the initial setup of my GCVE environment, and I will be building on this example over the following sections. If you’ve been following along with this blog series, this should look familiar. You can see a “Customer Data Center” on the left, which in my case is a lab, but it could be any environment connected to GCP via Cloud VPN or Cloud Interconnect. There is also a VPC peered with my GCVE environment, which is where my bastion host is running.</p> <p>I’ve created a workload segment, <code class="language-plaintext highlighter-rouge">192.168.83.0/24</code>, and connected three Ubuntu Linux VMs to it. A few essential steps must be completed outside of NSX-T when new segments are created while using VPC peering or dynamic routing over Cloud VPN or Cloud Interconnect.</p> <p class="center"><a href="/resources/2021/05/52_vpc_peering_imported_edited.png" class="drop-shadow"><img src="/resources/2021/05/52_vpc_peering_imported_edited.png" alt="" /></a></p> <p>First, you must have <code class="language-plaintext highlighter-rouge">Import/export custom routes</code> enabled in private service access for the VPC peered with GCVE. Custom routes are covered in my previous post, <a href="/2021/02/gcp-vpc-to-gcve/">Connecting a VPC to GCVE</a>. 
Notice that my newly created segment shows up under <code class="language-plaintext highlighter-rouge">Imported Routes</code>.</p> <p class="center"><a href="/resources/2021/05/50_cloud_router_adv_edited.png" class="drop-shadow"><img src="/resources/2021/05/50_cloud_router_adv_edited.png" alt="" /></a></p> <p>Second, any workload segments must be added as a custom IP range to any Cloud Router participating in BGP peering to advertise routes back to your environment. This would apply to both Cloud Interconnect and Cloud VPN, where BGP is used to provide dynamic routing. Configuring this will ensure that the workload subnet will be advertised to your environment. More information can be found <a href="https://cloud.google.com/vmware-engine/docs/networking/howto-connect-to-onpremises#end-to-end_connectivity_and_routing_considerations">here</a>.</p> <p>NSX-T has an excellent <a href="https://registry.terraform.io/providers/vmware/nsxt/latest/docs">Terraform provider</a>, and I have already covered several GCP Terraform examples in previous posts. My recommendation is to add new NSX-T segments via Terraform and add the custom subnet advertisement for the segment to any Cloud Routers via Terraform in the same workflow. This way, you will be sure you never forget to update your Cloud Router advertisements after adding a new segment.</p> <h1 id="exposing-a-vm-via-public-ip">Exposing a VM via Public IP</h1> <p>Let’s add an application into the mix. I have a test webserver running on <code class="language-plaintext highlighter-rouge">VM1</code> that I want to expose to the internet.</p> <p class="center"><a href="/resources/2021/05/69_gcve_diagram_2.png" class="drop-shadow"><img src="/resources/2021/05/69_gcve_diagram_2.png" alt="" /></a></p> <p>In GCVE, public IPs are not assigned directly to a VM. Instead, public IPs are allocated through the GCVE portal and assigned to the private IP of the relevant VM. 
This creates a simple destination NAT from the allocated public IP to the internal private IP.</p> <p class="center"><a href="/resources/2021/05/54_allocate_public_ip.png" class="drop-shadow"><img src="/resources/2021/05/54_allocate_public_ip.png" alt="" /></a></p> <p>Browse to <code class="language-plaintext highlighter-rouge">Network &gt; Public IPs</code> and click <code class="language-plaintext highlighter-rouge">Allocate</code> to allocate a public IP. You will be prompted to supply a name and the region for the public IP. Click <code class="language-plaintext highlighter-rouge">Submit</code>, and you will be taken back to the <code class="language-plaintext highlighter-rouge">Public IPs</code> page. This page will now show the public IP that has been allocated. The internal address it is assigned to is listed under the <code class="language-plaintext highlighter-rouge">Attached Address</code> column.</p> <p>You can find more information on public IPs in the <a href="https://cloud.google.com/vmware-engine/docs/concepts-public-ip-address">GCVE documentation</a>.</p> <h2 id="creating-firewall-rules">Creating Firewall Rules</h2> <p class="center"><a href="/resources/2021/05/55_create_fw_table.png" class="drop-shadow"><img src="/resources/2021/05/55_create_fw_table.png" alt="" /></a></p> <p>GCVE also includes a firewall beyond the NSX-T boundary, so it will need to be configured to allow access to the public IP that was just allocated. To do this, browse to <code class="language-plaintext highlighter-rouge">Network &gt; Firewall tables</code> and click <code class="language-plaintext highlighter-rouge">Create new firewall table</code>. 
Provide a name for the firewall table and click <code class="language-plaintext highlighter-rouge">Add Rule</code>.</p> <p class="center"><a href="/resources/2021/05/56_create_fw_rule.png" class="drop-shadow"><img src="/resources/2021/05/56_create_fw_rule.png" alt="" /></a></p> <p>Configure the rule to allow the desired traffic, choosing <code class="language-plaintext highlighter-rouge">Public IP</code> as the destination. Choose the newly allocated public IP from the dropdown, and click <code class="language-plaintext highlighter-rouge">Done</code>.</p> <p class="center"><a href="/resources/2021/05/57_firewall_config.png" class="drop-shadow"><img src="/resources/2021/05/57_firewall_config.png" alt="" /></a></p> <p>The new firewall table will be displayed. Click <code class="language-plaintext highlighter-rouge">Attached Subnets</code>, then <code class="language-plaintext highlighter-rouge">Attach to a Subnet</code>. This will attach the firewall table to a network.</p> <p class="center"><a href="/resources/2021/05/58_attach_fw_edited.png" class="drop-shadow"><img src="/resources/2021/05/58_attach_fw_edited.png" alt="" /></a></p> <p>Choose your SDDC along with <code class="language-plaintext highlighter-rouge">System management</code> from the <code class="language-plaintext highlighter-rouge">Select a Subnet</code> dropdown, and click <code class="language-plaintext highlighter-rouge">Save</code>. <code class="language-plaintext highlighter-rouge">System management</code> is the correct subnet to use when applying the firewall table to traffic behind NSX-T per the GCVE documentation.</p> <p class="center"><a href="/resources/2021/05/61_ubuntu_webserver_edited.png" class="drop-shadow"><img src="/resources/2021/05/61_ubuntu_webserver_edited.png" alt="" /></a></p> <p>I am now able to access my test webserver via the allocated public IP. Huzzah! 
More information on firewall tables can be found in the <a href="https://cloud.google.com/vmware-engine/docs/concepts-firewall-tables">GCVE documentation</a>.</p> <h1 id="load-balancing-with-nsx-t">Load Balancing with NSX-T</h1> <p>Now that the test webserver is working as expected, it’s time to implement a load balancer in NSX-T. Keep in mind that GCP also has a <a href="https://cloud.google.com/load-balancing/docs/load-balancing-overview">native load balancing service</a>, but that is beyond the scope of this post.</p> <p class="center"><a href="/resources/2021/05/70_gcve_diagram_3.png" class="drop-shadow"><img src="/resources/2021/05/70_gcve_diagram_3.png" alt="" /></a></p> <p>Public IPs can be assigned to any private IP, not just IPs assigned to VMs. For this example, I’ll configure the NSX-T load balancer and move the previously allocated public IP to the load balancer VIP. There are several steps needed to create a load balancer, so let’s dive in.</p> <p class="center"><a href="/resources/2021/05/62_web_lb_1.png" class="drop-shadow"><img src="/resources/2021/05/62_web_lb_1.png" alt="" /></a></p> <p>The first step is to create a new load balancer via the <code class="language-plaintext highlighter-rouge">Load Balancing</code> screen in NSX-T Manager. Provide a name, choose a size, and the tier 1 router to host the load balancer. Click <code class="language-plaintext highlighter-rouge">Save</code>. Now, expand the <code class="language-plaintext highlighter-rouge">Virtual Servers</code> section and click <code class="language-plaintext highlighter-rouge">Set Virtual Servers</code>.</p> <p class="center"><a href="/resources/2021/05/62_web_lb_2.png" class="drop-shadow"><img src="/resources/2021/05/62_web_lb_2.png" alt="" /></a></p> <p>This is where the virtual server IP (VIP) will be configured, along with a backing server pool. Provide a name and internal IP for the VIP. 
I used an IP that lives in the same segment as my servers, but you could create a dedicated segment for your VIP. Click the dropdown under <code class="language-plaintext highlighter-rouge">Server Pool</code> and click <code class="language-plaintext highlighter-rouge">Create New</code>.</p> <p class="center"><a href="/resources/2021/05/62_web_lb_3.png" class="drop-shadow"><img src="/resources/2021/05/62_web_lb_3.png" alt="" /></a></p> <p>Next, provide a name for your server pool, and choose a load balancing algorithm. Click <code class="language-plaintext highlighter-rouge">Select Members</code> to add VMs to the pool.</p> <p class="center"><a href="/resources/2021/05/62_web_lb_4_edited.png" class="drop-shadow"><img src="/resources/2021/05/62_web_lb_4_edited.png" alt="" /></a></p> <p>Click <code class="language-plaintext highlighter-rouge">Add Member</code> to add a new VM to the pool and provide the internal IP and port. Rinse and repeat until you’ve added all of the relevant VMs to your virtual server pool, then click <code class="language-plaintext highlighter-rouge">Apply</code>.</p> <p class="center"><a href="/resources/2021/05/62_web_lb_5.png" class="drop-shadow"><img src="/resources/2021/05/62_web_lb_5.png" alt="" /></a></p> <p>You’ll be taken back to the server pool screen, where you can add a monitor to check the health of the VMs in your pool. Click <code class="language-plaintext highlighter-rouge">Set Monitors</code> to choose a monitor.</p> <p class="center"><a href="/resources/2021/05/62_web_lb_6_edited.png" class="drop-shadow"><img src="/resources/2021/05/62_web_lb_6_edited.png" alt="" /></a></p> <p>My pool members are running a simple webserver on port 80, so I’m using the <code class="language-plaintext highlighter-rouge">default-http-lb-monitor</code>. 
After choosing the appropriate monitor, click <code class="language-plaintext highlighter-rouge">Apply</code>.</p> <p class="center"><a href="/resources/2021/05/62_web_lb_7_edited.png" class="drop-shadow"><img src="/resources/2021/05/62_web_lb_7_edited.png" alt="" /></a></p> <p>Review the settings for the VIP and click <code class="language-plaintext highlighter-rouge">Close</code>.</p> <p class="center"><a href="/resources/2021/05/62_web_lb_8_edited.png" class="drop-shadow"><img src="/resources/2021/05/62_web_lb_8_edited.png" alt="" /></a></p> <p>Finally, click <code class="language-plaintext highlighter-rouge">Save</code> to apply the new settings to your load balancer.</p> <p class="center"><a href="/resources/2021/05/63_edit_public_ip.png" class="drop-shadow"><img src="/resources/2021/05/63_edit_public_ip.png" alt="" /></a></p> <p>The last step is to browse to <code class="language-plaintext highlighter-rouge">Network &gt; Public IPs</code> in the GCVE portal and edit the existing public IP allocation. Update the name as appropriate, and change the attached local address to the load balancer VIP. No firewall rules need to be changed since the traffic is coming in over the same port (<code class="language-plaintext highlighter-rouge">tcp/80</code>).</p> <p class="center"><a href="/resources/2021/05/64_lb_test.gif" class="drop-shadow"><img src="/resources/2021/05/64_lb_test.gif" alt="" /></a></p> <p>Browsing to the allocated public IP and pressing refresh a few times shows that our load balancer is working as expected!</p> <h1 id="accessing-cloud-native-services">Accessing Cloud-Native Services</h1> <p>The last addition to this example is to include a GCP cloud-native service. I’ve chosen to use Cloud Storage because it is a simple example, and it provides incredible utility. 
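</p>

<p>Before wiring up Cloud Storage, a quick aside: in keeping with the “automation first” theme of this series, the load balancer assembled by hand in NSX-T Manager above can also be declared as code. The sketch below uses the <a href="https://registry.terraform.io/providers/vmware/nsxt/latest/docs">vmware/nsxt</a> Terraform provider. The resource types are real, but every display name, IP address, and the Tier-1 gateway name are placeholders from my lab, so treat this as a starting point and verify the arguments against the provider documentation.</p>

<div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Look up the Tier-1 gateway and a built-in TCP application profile
data "nsxt_policy_tier1_gateway" "t1" {
  display_name = "Tier1" # placeholder: your Tier-1 gateway name
}

data "nsxt_policy_lb_app_profile" "tcp" {
  type = "TCP"
}

# Server pool with two web servers, mirroring the pool built in the UI
resource "nsxt_policy_lb_pool" "web_pool" {
  display_name = "Web-Servers"
  algorithm    = "ROUND_ROBIN"

  member {
    display_name = "web01"
    ip_address   = "10.30.28.10" # placeholder pool member IPs
    port         = "80"
  }

  member {
    display_name = "web02"
    ip_address   = "10.30.28.11"
    port         = "80"
  }
}

# The load balancer itself, attached to the Tier-1 gateway
resource "nsxt_policy_lb_service" "web_lb" {
  display_name      = "Web-LB"
  connectivity_path = data.nsxt_policy_tier1_gateway.t1.path
  size              = "SMALL"
}

# Virtual server tying the VIP, pool, and load balancer together
resource "nsxt_policy_lb_virtual_server" "web_vip" {
  display_name             = "Web-VIP"
  ip_address               = "10.30.28.100" # placeholder VIP
  ports                    = ["80"]
  service_path             = nsxt_policy_lb_service.web_lb.path
  pool_path                = nsxt_policy_lb_pool.web_pool.path
  application_profile_path = data.nsxt_policy_lb_app_profile.tcp.path
}
</code></pre></div></div>

<p>A health monitor can be attached to the pool as well, via the pool’s <code class="language-plaintext highlighter-rouge">active_monitor_path</code>-style argument; check the provider docs for the exact name in your version.</p>

<p>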
This diagram illustrates my desired configuration.</p> <p class="center"><a href="/resources/2021/05/72_gcve_diagram_4.png" class="drop-shadow"><img src="/resources/2021/05/72_gcve_diagram_4.png" alt="" /></a></p> <p>My goal is to stage a simple static website in a Google Storage bucket, then mount the bucket as a read-only filesystem on each of my webservers. The bucket will be mounted to <code class="language-plaintext highlighter-rouge">/var/www/html</code> and will replace the testing page that had been staged on each server. You may be thinking, “This is crazy. Why not serve the static site directly from Google Storage?!” This is a valid question, and my response is that this is merely an example, not necessarily a best practice. I could have chosen to use Google Filestore instead of Google Storage as well. This illustrates that there is more than one way to do many things in the cloud.</p> <p>The first step is to create a Google Storage bucket, which I completed with this simple Terraform code:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">provider</span> <span class="s">"google"</span> <span class="p">{</span> <span class="n">project</span> <span class="o">=</span> <span class="n">var</span><span class="p">.</span><span class="n">project</span> <span class="n">region</span> <span class="o">=</span> <span class="n">var</span><span class="p">.</span><span class="n">region</span> <span class="n">zone</span> <span class="o">=</span> <span class="n">var</span><span class="p">.</span><span class="n">zone</span> <span class="p">}</span> <span class="n">resource</span> <span class="s">"google_storage_bucket"</span> <span class="s">"melliott-vmw-static-site"</span> <span class="p">{</span> <span class="n">name</span> <span class="o">=</span> <span class="s">"melliott-vmw-static-site"</span> <span class="n">location</span> <span class="o">=</span> <span class="s">"US"</span> <span 
class="n">force_destroy</span> <span class="o">=</span> <span class="n">true</span> <span class="n">storage_class</span> <span class="o">=</span> <span class="s">"STANDARD"</span> <span class="p">}</span> <span class="n">resource</span> <span class="s">"google_storage_bucket_acl"</span> <span class="s">"melliott-vmw-static-site-acl"</span> <span class="p">{</span> <span class="n">bucket</span> <span class="o">=</span> <span class="n">google_storage_bucket</span><span class="p">.</span><span class="n">melliott</span><span class="o">-</span><span class="n">vmw</span><span class="o">-</span><span class="n">static</span><span class="o">-</span><span class="n">site</span><span class="p">.</span><span class="n">name</span> <span class="n">role_entity</span> <span class="o">=</span> <span class="p">[</span> <span class="s">"OWNER:[email protected]"</span> <span class="p">]</span> <span class="p">}</span> </code></pre></div></div> <p>Next, I found a simple static website example, which I stored in the bucket and modified for my needs. After staging this, I completed the following steps on each webserver to mount the bucket.</p> <ul> <li>Install the Google Cloud SDK (<a href="https://cloud.google.com/sdk/docs/install">https://cloud.google.com/sdk/docs/install</a>)</li> <li>Install gcsfuse (<a href="https://github.com/GoogleCloudPlatform/gcsfuse/blob/master/docs/installing.md">https://github.com/GoogleCloudPlatform/gcsfuse/blob/master/docs/installing.md</a>), which is used to mount Google Storage buckets in linux via <a href="https://en.wikipedia.org/wiki/Filesystem_in_Userspace">FUSE</a></li> <li>Authenticate to Google Cloud with <code class="language-plaintext highlighter-rouge">gcloud auth application-default login</code>. This will provide a URL that will need to be pasted into a browser to complete authentication. 
The verification code returned will then need to be pasted back into the prompt on the webserver.</li> <li>Remove existing files in <code class="language-plaintext highlighter-rouge">/var/www/html</code></li> <li>Mount the bucket as a read-only filesystem with <code class="language-plaintext highlighter-rouge">gcsfuse -o allow_other -o ro [bucket-name] /var/www/html</code></li> </ul> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>root@ubuntu:/var/www# gcsfuse <span class="nt">-o</span> allow_other <span class="nt">-o</span> ro melliott-vmw-static-site /var/www/html 2021/05/04 16:19:10.680365 Using mount point: /var/www/html 2021/05/04 16:19:10.686743 Opening GCS connection... 2021/05/04 16:19:11.037846 Mounting file system <span class="s2">"melliott-vmw-static-site"</span>... 2021/05/04 16:19:11.042605 File system has been successfully mounted. root@ubuntu:/var/www# root@ubuntu:/var/www# root@ubuntu:/var/www# <span class="nb">ls</span> /var/www/html assets error images index.html LICENSE.MD README.MD </code></pre></div></div> <p>After mounting the bucket and running an <code class="language-plaintext highlighter-rouge">ls</code> on <code class="language-plaintext highlighter-rouge">/var/www/html</code>, I can see that my static website is mounted correctly.</p> <p class="center"><a href="/resources/2021/05/73_static_website.png" class="drop-shadow"><img src="/resources/2021/05/73_static_website.png" alt="" /></a></p> <p>Browsing to the public IP fronting my load balancer VIP now displays my static website, hosted in a Google Storage bucket. Pretty snazzy!</p> <h2 id="google-private-access">Google Private Access</h2> <p>My GCVE environment has internet access enabled, so native services are accessed via the internet gateway. 
If you don’t want to allow internet access for your environment, you can still access native services via <a href="https://cloud.google.com/vpc/docs/configure-private-google-access">Private Google Access</a>. Much of the GCP documentation for this feature focuses on access to Google APIs from locations other than GCVE, but it is not too difficult to apply these practices to GCVE.</p> <p class="center"><a href="/resources/2021/05/71_vpc_private_google_access.png" class="drop-shadow"><img src="/resources/2021/05/71_vpc_private_google_access.png" alt="" /></a></p> <p>Private Google Access is primarily enabled via DNS, but you still need to enable the feature on the relevant subnets in your VPC. The domain names used for this service are <code class="language-plaintext highlighter-rouge">private.googleapis.com</code> and <code class="language-plaintext highlighter-rouge">restricted.googleapis.com</code>. I was able to resolve both of these from my GCVE VMs, but my VMs are configured to use the resolvers in my GCVE environment. If you cannot resolve these hostnames, make sure you are using the GCVE DNS servers. As a reminder, these server addresses can be found under <code class="language-plaintext highlighter-rouge">Private Cloud DNS Servers</code> on the summary page for your GCVE cluster. You can find more information on Private Google Access <a href="https://cloud.google.com/vpc/docs/configure-private-google-access">here</a>.</p> <h1 id="viewing-routing-information">Viewing Routing Information</h1> <p>Knowing where to find routing tables is incredibly helpful when troubleshooting connectivity issues. There are a handful of places to look in GCP and GCVE to find this information.</p> <h2 id="vpc-routes">VPC Routes</h2> <p>You can view routes for a VPC in the GCP portal by browsing to <code class="language-plaintext highlighter-rouge">VPC networks</code>, clicking on the desired VPC, then clicking on the <code class="language-plaintext highlighter-rouge">Routes</code> tab.
If you are using VPC peering, you will notice a message that says, “<em>This VPC network has been configured to import custom routes using VPC Network Peering. Any imported custom dynamic routes are omitted from this list, and some route conflicts might not be resolved. Please refer to the VPC Network Peering section for the complete list of imported custom routes, and the <a href="https://cloud.google.com/vpc/docs/routes?authuser=1#routeselection">routing order</a> for information about how GCP resolves conflicts.</em>” Basically, this message says that you will not see routes for your GCVE environment in this table.</p> <h2 id="vpc-network-peering-routes">VPC Network Peering Routes</h2> <p>To see routes for your GCVE environment, browse to <code class="language-plaintext highlighter-rouge">VPC Network Peering</code> and choose the <code class="language-plaintext highlighter-rouge">servicenetworking-googleapis-com</code> entry for your VPC. You will see routes for your GCVE environment under <code class="language-plaintext highlighter-rouge">Imported Routes</code> and any subnets in your VPC under <code class="language-plaintext highlighter-rouge">Exported Routes</code>. 
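</p>

<p>The import/export behavior of the peering can itself be managed as code. If routes you expect are missing from these lists, check the custom route flags on the peering. Here is a hedged sketch using the <code class="language-plaintext highlighter-rouge">google_compute_network_peering_routes_config</code> resource from the hashicorp/google Terraform provider; the network name is a placeholder from my environment.</p>

<div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Ensure custom routes are exchanged over the GCVE service networking peering
resource "google_compute_network_peering_routes_config" "gcve_peering" {
  peering = "servicenetworking-googleapis-com"
  network = "gcve-usw2" # placeholder: your VPC name

  import_custom_routes = true
  export_custom_routes = true
}
</code></pre></div></div>

<p>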
You can also view these routes using the <code class="language-plaintext highlighter-rouge">gcloud</code> tool.</p> <ul> <li>View imported routes: <code class="language-plaintext highlighter-rouge">gcloud compute networks peerings list-routes servicenetworking-googleapis-com --network=[VPC Name] --region=[REGION] --direction=INCOMING</code></li> <li>View exported routes: <code class="language-plaintext highlighter-rouge">gcloud compute networks peerings list-routes servicenetworking-googleapis-com --network=[VPC Name] --region=[REGION] --direction=OUTGOING</code></li> </ul> <p>Example results:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>melliott@melliott-a01 gcp-bucket % gcloud compute networks peerings list-routes servicenetworking-googleapis-com <span class="nt">--network</span><span class="o">=</span>gcve-usw2 <span class="nt">--region</span><span class="o">=</span>us-west2 <span class="nt">--direction</span><span class="o">=</span>INCOMING DEST_RANGE TYPE NEXT_HOP_REGION PRIORITY STATUS 192.168.80.0/29 DYNAMIC_PEERING_ROUTE us-west2 0 accepted 192.168.80.0/29 DYNAMIC_PEERING_ROUTE us-west2 0 accepted 192.168.80.16/29 DYNAMIC_PEERING_ROUTE us-west2 0 accepted 192.168.80.16/29 DYNAMIC_PEERING_ROUTE us-west2 0 accepted 192.168.80.8/29 DYNAMIC_PEERING_ROUTE us-west2 0 accepted 192.168.80.8/29 DYNAMIC_PEERING_ROUTE us-west2 0 accepted 192.168.80.112/28 DYNAMIC_PEERING_ROUTE us-west2 0 accepted 192.168.80.112/28 DYNAMIC_PEERING_ROUTE us-west2 0 accepted 10.30.28.0/24 DYNAMIC_PEERING_ROUTE us-west2 0 accepted 10.30.28.0/24 DYNAMIC_PEERING_ROUTE us-west2 0 accepted 192.168.81.0/24 DYNAMIC_PEERING_ROUTE us-west2 0 accepted 192.168.81.0/24 DYNAMIC_PEERING_ROUTE us-west2 0 accepted 192.168.83.0/24 DYNAMIC_PEERING_ROUTE us-west2 0 accepted 192.168.83.0/24 DYNAMIC_PEERING_ROUTE us-west2 0 accepted </code></pre></div></div> <h2 id="nsx-t">NSX-T</h2> <p>Routing and forwarding tables can be downloaded from the NSX-T
manager web interface or via API. It’s also reasonably easy to grab the routing table with PowerCLI. The following example displays the routing table from the T0 router in my GCVE environment.</p> <div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">Import-Module</span><span class="w"> </span><span class="nx">VMware.PowerCLI</span><span class="w"> </span><span class="n">Connect-NsxtServer</span><span class="w"> </span><span class="nt">-Server</span><span class="w"> </span><span class="nx">my-nsxt-manager.gve.goog</span><span class="w"> </span><span class="nv">$t0s</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Get-NsxtPolicyService</span><span class="w"> </span><span class="nt">-Name</span><span class="w"> </span><span class="nx">com.vmware.nsx_policy.infra.tier0s</span><span class="w"> </span><span class="nv">$t0_name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nv">$t0s</span><span class="o">.</span><span class="nf">list</span><span class="p">()</span><span class="o">.</span><span class="nf">results</span><span class="o">.</span><span class="nf">display_name</span><span class="w"> </span><span class="nv">$t0</span><span class="o">.</span><span class="nf">list</span><span class="p">(</span><span class="nv">$t0_name</span><span class="p">)</span><span class="o">.</span><span class="nf">results</span><span class="o">.</span><span class="nf">route_entries</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">Select-Object</span><span class="w"> </span><span class="nx">network</span><span class="p">,</span><span class="nx">next_hop</span><span class="p">,</span><span class="nx">route_type</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">Sort-Object</span><span class="w"> </span><span class="nt">-Property</span><span class="w"> 
</span><span class="nx">network</span><span class="w"> </span><span class="n">network</span><span class="w"> </span><span class="nx">next_hop</span><span class="w"> </span><span class="nx">route_type</span><span class="w"> </span><span class="o">-------</span><span class="w"> </span><span class="o">--------</span><span class="w"> </span><span class="o">----------</span><span class="w"> </span><span class="mf">0.0</span><span class="o">.</span><span class="nf">0</span><span class="o">.</span><span class="nf">0</span><span class="n">/0</span><span class="w"> </span><span class="nx">192.168.81.225</span><span class="w"> </span><span class="nx">t0s</span><span class="w"> </span><span class="mf">0.0</span><span class="o">.</span><span class="nf">0</span><span class="o">.</span><span class="nf">0</span><span class="n">/0</span><span class="w"> </span><span class="nx">192.168.81.241</span><span class="w"> </span><span class="nx">t0s</span><span class="w"> </span><span class="mf">10.30</span><span class="o">.</span><span class="nf">28</span><span class="o">.</span><span class="nf">0</span><span class="n">/24</span><span class="w"> </span><span class="nx">169.254.160.3</span><span class="w"> </span><span class="nx">t1c</span><span class="w"> </span><span class="mf">10.30</span><span class="o">.</span><span class="nf">28</span><span class="o">.</span><span class="nf">0</span><span class="n">/24</span><span class="w"> </span><span class="nx">169.254.160.3</span><span class="w"> </span><span class="nx">t1c</span><span class="w"> </span><span class="mf">169.254</span><span class="o">.</span><span class="nf">0</span><span class="o">.</span><span class="nf">0</span><span class="n">/24</span><span class="w"> </span><span class="nx">t0c</span><span class="w"> </span><span class="mf">169.254</span><span class="o">.</span><span class="nf">160</span><span class="o">.</span><span class="nf">0</span><span class="n">/31</span><span class="w"> </span><span class="nx">t0c</span><span 
class="w"> </span><span class="mf">169.254</span><span class="o">.</span><span class="nf">160</span><span class="o">.</span><span class="nf">0</span><span class="n">/31</span><span class="w"> </span><span class="nx">t0c</span><span class="w"> </span><span class="mf">169.254</span><span class="o">.</span><span class="nf">160</span><span class="o">.</span><span class="nf">2</span><span class="n">/31</span><span class="w"> </span><span class="nx">t0c</span><span class="w"> </span><span class="mf">169.254</span><span class="o">.</span><span class="nf">160</span><span class="o">.</span><span class="nf">2</span><span class="n">/31</span><span class="w"> </span><span class="nx">t0c</span><span class="w"> </span><span class="mf">192.168</span><span class="o">.</span><span class="nf">81</span><span class="o">.</span><span class="nf">224</span><span class="n">/28</span><span class="w"> </span><span class="nx">t0c</span><span class="w"> </span><span class="mf">192.168</span><span class="o">.</span><span class="nf">81</span><span class="o">.</span><span class="nf">240</span><span class="n">/28</span><span class="w"> </span><span class="nx">t0c</span><span class="w"> </span><span class="mf">192.168</span><span class="o">.</span><span class="nf">83</span><span class="o">.</span><span class="nf">0</span><span class="n">/24</span><span class="w"> </span><span class="nx">169.254.160.1</span><span class="w"> </span><span class="nx">t1c</span><span class="w"> </span><span class="mf">192.168</span><span class="o">.</span><span class="nf">83</span><span class="o">.</span><span class="nf">0</span><span class="n">/24</span><span class="w"> </span><span class="nx">169.254.160.1</span><span class="w"> </span><span class="nx">t1c</span><span class="w"> </span></code></pre></div></div> <h1 id="vpn-connectivity">VPN Connectivity</h1> <p>I haven’t talked much about VPNs in this blog series, but it is an important component that deserves more attention. 
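</p>

<p>At a high level, a Cloud VPN (HA VPN) connection comes down to a handful of resources. The abbreviated Terraform sketch below uses real resource types from the hashicorp/google provider, but every name, address, and ASN is a placeholder, and only one tunnel is shown — a production setup needs a second tunnel (and ideally a second Cloud Router interface and BGP peer) for redundancy.</p>

<div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code># HA VPN gateway and Cloud Router in the VPC that is peered with GCVE
resource "google_compute_ha_vpn_gateway" "vpn_gw" {
  name    = "gcve-vpn-gw"
  network = "gcve-usw2" # placeholder VPC name
  region  = "us-west2"
}

resource "google_compute_router" "vpn_router" {
  name    = "gcve-vpn-router"
  network = "gcve-usw2"
  region  = "us-west2"

  bgp {
    asn = 64514 # placeholder private ASN for the Cloud Router
  }
}

# Represents the on-prem VPN device
resource "google_compute_external_vpn_gateway" "on_prem" {
  name            = "on-prem-gw"
  redundancy_type = "SINGLE_IP_INTERNALLY_REDUNDANT"

  interface {
    id         = 0
    ip_address = "203.0.113.10" # placeholder on-prem public IP
  }
}

# One IPsec tunnel; add a second for redundancy
resource "google_compute_vpn_tunnel" "tunnel0" {
  name                            = "tunnel0"
  region                          = "us-west2"
  vpn_gateway                     = google_compute_ha_vpn_gateway.vpn_gw.id
  vpn_gateway_interface           = 0
  peer_external_gateway           = google_compute_external_vpn_gateway.on_prem.id
  peer_external_gateway_interface = 0
  router                          = google_compute_router.vpn_router.id
  shared_secret                   = "use-a-variable-here"
  ike_version                     = 2
}

# BGP session over the tunnel
resource "google_compute_router_interface" "if0" {
  name       = "if-tunnel0"
  router     = google_compute_router.vpn_router.name
  region     = "us-west2"
  ip_range   = "169.254.0.1/30" # link-local BGP addressing
  vpn_tunnel = google_compute_vpn_tunnel.tunnel0.name
}

resource "google_compute_router_peer" "peer0" {
  name            = "peer-tunnel0"
  router          = google_compute_router.vpn_router.name
  region          = "us-west2"
  peer_ip_address = "169.254.0.2"
  peer_asn        = 64515 # placeholder on-prem ASN
  interface       = google_compute_router_interface.if0.name
}
</code></pre></div></div>

<p>In practice, pull the shared secret, addresses, and ASNs from variables rather than hard-coding them.</p>

<p>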
Provisioning a VPN to GCP is an easy way to connect to your GCVE environment if you are waiting on a Cloud Interconnect to be installed. It can also be used as backup connectivity if your primary connection fails. NSX-T can terminate an IPsec VPN, but I would recommend using <a href="https://cloud.google.com/network-connectivity/docs/vpn/concepts/overview">Cloud VPN</a> instead. This will ensure you have connectivity to any GCP-based resources along with GCVE.</p> <p>I’ve put together some example Terraform code to provision the necessary VPN-related resources in GCP. The example code is available at <a href="https://github.com/shamsway/gcp-terraform-examples">https://github.com/shamsway/gcp-terraform-examples</a> in the <code class="language-plaintext highlighter-rouge">gcve-ha-vpn</code> subdirectory. Using this example will create the minimum configuration needed to stand up a VPN to GCP/GCVE. It is assumed that you have already created a VPC and <a href="https://networkbrouhaha.com/2021/02/gcp-vpc-to-gcve/">configured peering with your GCVE cluster</a>. This example does not create a redundant VPN solution, but it can be easily extended to do so by creating a secondary Cloud Router, interface, and BGP peer. You can find more information on HA VPN topologies in the <a href="https://cloud.google.com/network-connectivity/docs/vpn/concepts/topologies">GCP documentation</a>. After using the example code, you will still need to configure the VPN settings at your site. Google provides configuration examples for several different vendors at <a href="https://cloud.google.com/network-connectivity/docs/vpn/how-to/interop-guides">Using third-party VPNs with Cloud VPN</a>. I’ve written previously about VPNs for cloud connectivity, as well as other connection methods, in <a href="/2020/11/cloud-connectivity-101/">Cloud Connectivity 101</a>.</p> <h1 id="dns-notes">DNS Notes</h1> <p>I’ve saved the most important topic for last.
DNS is a crucial component when operating in the cloud, so here are a few tips and recommendations to make sure you’re successful. <a href="https://cloud.google.com/dns">Cloud DNS</a> has a 100% uptime SLA, which is not something you see very often. This service is so vital to GCP that Google has essentially guaranteed that it will always be available. That is the type of guarantee that provides peace of mind, especially when you will have so many other services and applications relying on it.</p> <p>In terms of GCVE, you must be able to properly resolve the hostnames for vCenter, NSX, HCX, and other applications deployed in your environment. These topics are covered in detail at these links:</p> <ul> <li><a href="https://cloud.google.com/vmware-engine/docs/networking/howto-dns-on-premises">Configuring DNS for management appliance access</a></li> <li><a href="https://cloud.google.com/vmware-engine/docs/networking/howto-dns-profiles">Creating and applying DNS profiles</a></li> <li><a href="https://cloud.google.com/vmware-engine/docs/vmware-platform/howto-identity-sources">Configuring authentication using Active Directory</a></li> </ul> <p>The basic gist is this: the DNS servers running in your GCVE environment will be able to resolve A records for the management applications running in GCVE (vCenter, NSX, HCX, etc.). If you have <a href="/2021/02/gcp-vpc-to-gcve/">configured VPC peering with GCVE</a>, Cloud DNS will be automatically configured to forward requests to the GCVE DNS servers for any <code class="language-plaintext highlighter-rouge">gve.goog</code> hostname. This will allow you to resolve GCVE-related A records from your VPC or bastion host. The last step is to make sure that you can properly resolve GCVE-related hostnames in your local environment. If you are using Windows Server for DNS, you need to configure a conditional forwarder for <code class="language-plaintext highlighter-rouge">gve.goog</code>, using the DNS servers running in GCVE.
Other scenarios, like configuring BIND, are covered in the documentation links above.</p> <h1 id="wrap-up">Wrap Up</h1> <p>This is a doozy of a post, so I won’t waste too many words here. I genuinely hope you enjoyed this blog series. There will definitely be more GCVE-related blogs in the future, and you can hit me up any time <a href="https://twitter.com/NetworkBrouhaha">@NetworkBrouhaha</a> and let me know what topics you’d like to see covered. Thanks for reading!</p> <h1 id="helpful-resources">Helpful Resources</h1> <ul> <li><a href="https://cloud.google.com/vmware-engine/docs">Google Cloud VMware Engine documentation</a></li> <li><a href="https://cloud.google.com/architecture/private-cloud-networking-for-vmware-engine">Private cloud networking for Google Cloud VMware Engine</a> Whitepaper</li> <li><a href="https://cloud.google.com/vmware-engine/docs/workloads/howto-migrate-vms-using-hcx">Migrating VMware VMs using VMware HCX</a></li> <li><a href="https://cloud.vmware.com/community/2021/02/25/introducing-google-cloud-vmware-engine-logical-design-poster-workload-mobility/">Google Cloud VMware Engine Logical Design Poster for Workload Mobility</a></li> <li><a href="https://cloud.google.com/dns">Cloud DNS</a></li> <li><a href="https://cloud.google.com/storage/docs/gcs-fuse">Cloud Storage FUSE</a></li> <li><a href="https://github.com/GoogleCloudPlatform/gcsfuse">gcsfuse</a></li> <li><a href="https://cloud.google.com/sdk/docs/install">Installing Google Cloud SDK</a></li> <li><a href="https://cloud.google.com/network-connectivity/docs/vpn">Cloud VPN documentation</a></li> <li><a href="https://cloud.google.com/community/tutorials/deploy-ha-vpn-with-terraform">Tutorial: Deploy HA VPN with Terraform</a></li> <li><a href="https://cloud.google.com/network-connectivity/docs/vpn/concepts/topologies">Cloud VPN Topologies</a></li> <li><a href="https://cloud.google.com/network-connectivity/docs/vpn/how-to/interop-guides">Using third-party VPNs with Cloud VPN</a></li> <li><a
href="https://cloud.google.com/blog/products/compute/how-to-use-multi-vpcs-with-google-cloud-vmware-engine">How to use multi-VPC networking in Google Cloud VMware Engine</a></li> <li><a href="https://registry.terraform.io/providers/hashicorp/google/latest/docs">Google Cloud Platform Provider</a> for Terraform</li> <li>My <a href="https://github.com/shamsway/gcp-terraform-examples">GCP Terraform Examples</a></li> </ul> <p>You can find a hands-on lab for GCVE by browsing to <a href="https://labs.hol.vmware.com/">https://labs.hol.vmware.com/</a> and searching for <code class="language-plaintext highlighter-rouge">HOL-2179-01-ISM</code>.</p> Tue, 04 May 2021 00:00:00 +0000 http://www.networkbrouhaha.com/2021/05/gcve-networking-scenarios/ http://www.networkbrouhaha.com/2021/05/gcve-networking-scenarios/ Intro to Google Cloud VMware Engine – HCX Configuration <p>Now that we have an SDDC running in Google Cloud VMware Engine, it is time to migrate some workloads into the cloud! <a href="https://cloud.vmware.com/vmware-hcx">VMware HCX</a> will be the tool I use to migrate Virtual Machines to GCVE. If you recall from the first post in this series, HCX was included in our SDDC deployment, so there is no further configuration needed in GCVE for HCX. The GCVE docs <a href="https://cloud.google.com/vmware-engine/docs/workloads/howto-migrate-vms-using-hcx#prepare-for-hcx-manager-installation-on-premises">cover installing and configuring the on-prem components for HCX</a>, so I’m not going to cover those steps in this post. As with previous posts, I will be taking an “automation first” approach to configuring HCX with Terraform.
All of the code referenced in this post is available at <a href="https://github.com/shamsway/gcp-terraform-examples">https://github.com/shamsway/gcp-terraform-examples</a> in the <code class="language-plaintext highlighter-rouge">gcve-hcx</code> sub-directory.</p> <p><strong>Other posts in this series:</strong></p> <ul> <li><a href="/2021/02/gcve-sddc-with-hcx/">Deploying a GCVE SDDC with HCX</a></li> <li><a href="/2021/02/gcp-vpc-to-gcve/">Connecting a VPC to GCVE</a></li> <li><a href="/2021/03/gcve-bastion/">Bastion Host Access with IAP</a></li> <li><a href="/2021/03/gcve-network-overview/">Network and Connectivity Overview</a></li> <li><a href="/2021/05/gcve-networking-scenarios/">Common Networking Scenarios</a></li> </ul> <p>Before we look at configuring HCX with Terraform, there are a few items to consider. The provider I’m using to configure HCX, <a href="https://registry.terraform.io/providers/adeleporte/hcx/">adeleporte/hcx</a>, is a community provider. It is not supported by VMware. It is also under active development, so you may run across a bug or some outdated documentation. In my testing of the provider, I have found that it works well for an environment with a single service mesh but needs some improvements to support environments with multiple service meshes.</p> <p>Part of the beauty of open-source software is that anyone can contribute code. If you would like to submit an issue to track a bug, update documentation, or add new functionality, cruise over to the <a href="https://github.com/adeleporte/terraform-provider-hcx">GitHub repo</a> to get started.</p> <h1 id="hcx-configuration-with-terraform">HCX Configuration with Terraform</h1> <p>Configuring HCX involves configuring network profiles and a compute profile, which are then referenced in a service mesh configuration. The service mesh facilitates the migration of VMs to and from the cloud. 
The <a href="https://docs.vmware.com/en/VMware-HCX/4.0/hcx-user-guide/GUID-5D2F1312-EB62-4B25-AF88-9ADE129EDB57.html">HCX documentation</a> describes these components in detail, and I recommend reading through the user guide if you plan on performing a migration of any scale.</p> <p>The example Terraform code linked at the beginning of the post will do the following:</p> <ul> <li>Create a <a href="https://docs.vmware.com/en/VMware-HCX/4.0/hcx-user-guide/GUID-4BA6FBD4-ED66-4BE0-A216-6F6FFE1E8A20.html">site pairing</a> between your on-premises data center and your GCVE SDDC</li> <li>Add two <a href="https://docs.vmware.com/en/VMware-HCX/4.0/hcx-user-guide/GUID-184FCA54-D0CB-4931-B0E8-A81CD6120C52.html">network profiles</a>, one for management traffic and another for vMotion traffic. Network profiles for uplink and replication traffic can also be created, but in this example, I will use the management network for those functions.</li> <li>Create a <a href="https://docs.vmware.com/en/VMware-HCX/4.0/hcx-user-guide/GUID-BBAC979E-8899-45AD-9E01-98A132CE146E.html">compute profile</a> consisting of the network profiles created, and other parameters specific to your environment, like the datastore in use.</li> <li>Create a <a href="https://docs.vmware.com/en/VMware-HCX/4.0/hcx-user-guide/GUID-46AED982-8ED2-4CB1-807E-FEFD18FAC0DD.html">service mesh</a> between your on-prem data center and GCVE SDDC. This links the two compute profiles at each site for migration and sets other parameters, like the HCX features to enable.</li> <li><a href="https://docs.vmware.com/en/VMware-HCX/4.0/hcx-user-guide/GUID-DD9C3316-D01C-4088-B3EA-84ADB9FED573.html">Extend a network</a> from your on-prem data center into your GCVE SDDC.</li> </ul> <p>After Terraform completes the configuration, you will be able to migrate VMs from your on-prem data center into your GCVE SDDC. 
To get started, clone the example repo with <code class="language-plaintext highlighter-rouge">git clone https://github.com/shamsway/gcp-terraform-examples.git</code>, then change to the <code class="language-plaintext highlighter-rouge">gcve-hcx</code> sub-directory. You will find these files:</p> <ul> <li><code class="language-plaintext highlighter-rouge">main.tf</code> – Contains the primary Terraform code to complete the steps mentioned above</li> <li><code class="language-plaintext highlighter-rouge">variables.tf</code> – Defines the input variables that will be used in <code class="language-plaintext highlighter-rouge">main.tf</code></li> </ul> <p>Let’s take a look at the code that makes up this example.</p> <h2 id="maintf-contents">main.tf Contents</h2> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">terraform</span> <span class="p">{</span> <span class="nx">required_providers</span> <span class="p">{</span> <span class="nx">hcx</span> <span class="p">=</span> <span class="p">{</span> <span class="nx">source</span> <span class="p">=</span> <span class="s2">"adeleporte/hcx"</span> <span class="p">}</span> <span class="p">}</span> <span class="p">}</span> </code></pre></div></div> <p>Unlike previous examples, this one does not start with a <code class="language-plaintext highlighter-rouge">provider</code> block. 
Instead, this <code class="language-plaintext highlighter-rouge">terraform</code> block will download and install the <code class="language-plaintext highlighter-rouge">adeleporte/hcx</code> provider from <code class="language-plaintext highlighter-rouge">registry.terraform.io</code>, which is a handy shortcut for installing community providers.</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">provider</span> <span class="s2">"hcx"</span> <span class="p">{</span> <span class="nx">hcx</span> <span class="p">=</span> <span class="s2">"https://your.hcx.url"</span> <span class="nx">admin_username</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">hcx_admin_username</span> <span class="nx">admin_password</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">hcx_admin_password</span> <span class="nx">username</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">hcx_username</span> <span class="nx">password</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">hcx_password</span> <span class="p">}</span> </code></pre></div></div> <p>The <code class="language-plaintext highlighter-rouge">provider</code> block specifies the URL for your HCX appliance, along with admin credentials (those used to access the appliance management UI over port 9443) and user credentials for the standard HCX UI. During my testing, I had to use an IP address instead of an FQDN for my HCX appliance. Note that this example has the URL specified directly in the code instead of using a variable. 
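</p>

<p>If you would rather not hardcode the appliance URL, you could promote it to an input variable. This is a sketch of my own, not part of the example repo — the <code class="language-plaintext highlighter-rouge">hcx_url</code> variable name is hypothetical and would need a matching entry in <code class="language-plaintext highlighter-rouge">variables.tf</code>:</p>

```tf
# Hypothetical variable, added to variables.tf
variable "hcx_url" {
  description = "URL (or IP address) of the on-prem HCX appliance"
  type        = string
}

# The provider block in main.tf would then reference it
provider "hcx" {
  hcx            = var.hcx_url
  admin_username = var.hcx_admin_username
  admin_password = var.hcx_admin_password
  username       = var.hcx_username
  password       = var.hcx_password
}
```

<p>The value could then be supplied with a <code class="language-plaintext highlighter-rouge">TF_VAR_hcx_url</code> environment variable, the same way the credentials are passed in.</p>

<p>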
You will need to edit <code class="language-plaintext highlighter-rouge">main.tf</code> to set this value, along with a few other values that you will see below.</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">resource</span> <span class="s2">"hcx_site_pairing"</span> <span class="s2">"gcve"</span> <span class="p">{</span> <span class="nx">url</span> <span class="p">=</span> <span class="s2">"https://gcve.hcx.url"</span> <span class="nx">username</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">gcve_hcx_username</span> <span class="nx">password</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">gcve_hcx_password</span> <span class="p">}</span> </code></pre></div></div> <p>The <code class="language-plaintext highlighter-rouge">hcx_site_pairing</code> resource creates a site pairing between your on-prem and GCVE-based HCX appliances. This allows both HCX appliances to exchange information about their local environments and is a prerequisite to creating the service mesh. I used the FQDN of the HCX server running in GCVE for the <code class="language-plaintext highlighter-rouge">url</code> parameter, but I had previously configured DNS resolution between my lab and my GCVE environment. 
You can find the IP and FQDN of your HCX server in GCVE by browsing to <code class="language-plaintext highlighter-rouge">Resources &gt; [Your SDDC] &gt; vSphere Management Network</code>.</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">resource</span> <span class="s2">"hcx_network_profile"</span> <span class="s2">"net_management_gcve"</span> <span class="p">{</span> <span class="nx">site_pairing</span> <span class="p">=</span> <span class="nx">hcx_site_pairing</span><span class="p">.</span><span class="nx">gcve</span> <span class="nx">network_name</span> <span class="p">=</span> <span class="s2">"Management network name"</span> <span class="nx">name</span> <span class="p">=</span> <span class="s2">"Management network profile name"</span> <span class="nx">mtu</span> <span class="p">=</span> <span class="mi">1500</span> <span class="nx">ip_range</span> <span class="p">{</span> <span class="nx">start_address</span> <span class="p">=</span> <span class="s2">"172.17.10.10"</span> <span class="nx">end_address</span> <span class="p">=</span> <span class="s2">"172.17.10.13"</span> <span class="p">}</span> <span class="nx">gateway</span> <span class="p">=</span> <span class="s2">"172.17.10.1"</span> <span class="nx">prefix_length</span> <span class="p">=</span> <span class="mi">24</span> <span class="nx">primary_dns</span> <span class="p">=</span> <span class="s2">"172.17.10.2"</span> <span class="nx">secondary_dns</span> <span class="p">=</span> <span class="s2">"172.17.10.3"</span> <span class="nx">dns_suffix</span> <span class="p">=</span> <span class="s2">"yourcompany.biz"</span> <span class="p">}</span> </code></pre></div></div> <p>This block and the block immediately following it add new network profiles to your local HCX server. Network profiles specify a local network to use for specific traffic (management, uplink, vMotion, or replication) as well as an IP range reserved for use by HCX appliances. 
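</p>

<p>The companion vMotion profile follows the same pattern. Here is a sketch — the network names and addresses are placeholders to adapt to your environment, and DNS settings are typically unnecessary for a vMotion network:</p>

```tf
# vMotion network profile, referenced later by the compute profile
resource "hcx_network_profile" "net_vmotion_gcve" {
  site_pairing = hcx_site_pairing.gcve
  network_name = "vMotion network name"
  name         = "vMotion network profile name"
  mtu          = 1500

  # Small IP range reserved for the HCX appliances
  ip_range {
    start_address = "172.17.11.10"
    end_address   = "172.17.11.13"
  }

  gateway       = "172.17.11.1"
  prefix_length = 24
}
```

<p>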
For smaller deployments, it is OK to use one network profile for multiple traffic types. This example creates a management network profile, which will also be used for uplink and replication traffic, and another profile dedicated for vMotion.</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">resource</span> <span class="s2">"hcx_compute_profile"</span> <span class="s2">"compute_profile_1"</span> <span class="p">{</span> <span class="nx">name</span> <span class="p">=</span> <span class="s2">"SJC-CP"</span> <span class="nx">datacenter</span> <span class="p">=</span> <span class="s2">"San Jose"</span> <span class="nx">cluster</span> <span class="p">=</span> <span class="s2">"Compute Cluster"</span> <span class="nx">datastore</span> <span class="p">=</span> <span class="s2">"comp-vsanDatastore"</span> <span class="nx">depends_on</span> <span class="p">=</span> <span class="p">[</span> <span class="nx">hcx_network_profile</span><span class="p">.</span><span class="nx">net_management_gcve</span><span class="p">,</span> <span class="nx">hcx_network_profile</span><span class="p">.</span><span class="nx">net_vmotion_gcve</span> <span class="p">]</span> <span class="nx">management_network</span> <span class="p">=</span> <span class="nx">hcx_network_profile</span><span class="p">.</span><span class="nx">net_management_gcve</span><span class="p">.</span><span class="nx">id</span> <span class="nx">replication_network</span> <span class="p">=</span> <span class="nx">hcx_network_profile</span><span class="p">.</span><span class="nx">net_management_gcve</span><span class="p">.</span><span class="nx">id</span> <span class="nx">uplink_network</span> <span class="p">=</span> <span class="nx">hcx_network_profile</span><span class="p">.</span><span class="nx">net_management_gcve</span><span class="p">.</span><span class="nx">id</span> <span class="nx">vmotion_network</span> <span class="p">=</span> <span 
class="nx">hcx_network_profile</span><span class="p">.</span><span class="nx">net_vmotion_gcve</span><span class="p">.</span><span class="nx">id</span> <span class="nx">dvs</span> <span class="p">=</span> <span class="s2">"nsx-overlay-transportzone"</span> <span class="nx">service</span> <span class="p">{</span> <span class="nx">name</span> <span class="p">=</span> <span class="s2">"INTERCONNECT"</span> <span class="p">}</span> <span class="nx">service</span> <span class="p">{</span> <span class="nx">name</span> <span class="p">=</span> <span class="s2">"WANOPT"</span> <span class="p">}</span> <span class="nx">service</span> <span class="p">{</span> <span class="nx">name</span> <span class="p">=</span> <span class="s2">"VMOTION"</span> <span class="p">}</span> <span class="nx">service</span> <span class="p">{</span> <span class="nx">name</span> <span class="p">=</span> <span class="s2">"BULK_MIGRATION"</span> <span class="p">}</span> <span class="nx">service</span> <span class="p">{</span> <span class="nx">name</span> <span class="p">=</span> <span class="s2">"NETWORK_EXTENSION"</span> <span class="p">}</span> <span class="p">}</span> </code></pre></div></div> <p>The <code class="language-plaintext highlighter-rouge">hcx_compute_profile</code> resource defines the compute, storage, and networking components at the local site that will participate in a service mesh. Compute and storage settings are defined at the beginning of the block. The management profile previously created is also specified for the replication and uplink networks. Finally, the <code class="language-plaintext highlighter-rouge">service</code> statements define which HCX features are enabled for the compute profile. If you attempt to enable a feature that you are not licensed for, Terraform will return an error.</p> <p>There are two things to note with this resource. First, the <code class="language-plaintext highlighter-rouge">dvs</code> parameter is not accurately named. 
It would be more accurate to name this parameter <code class="language-plaintext highlighter-rouge">network_container</code> or something similar. In this example, I am referencing an NSX transport zone instead of a DVS. This is a valid setup as long as you have NSX registered with your HCX server, so some work is needed to update this provider to reflect that capability. Second, I’ve added a <code class="language-plaintext highlighter-rouge">depends_on</code> statement. I noticed during my testing that this provider would occasionally attempt to remove resources out of order, which ultimately would cause <code class="language-plaintext highlighter-rouge">terraform destroy</code> to fail. Using the <code class="language-plaintext highlighter-rouge">depends_on</code> statement fixes this issue, but some additional logic will need to be added to the provider to better understand resource dependencies. I’ve also added <code class="language-plaintext highlighter-rouge">depends_on</code> statements to the following blocks for the same reason.</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">resource</span> <span class="s2">"hcx_service_mesh"</span> <span class="s2">"service_mesh_1"</span> <span class="p">{</span> <span class="nx">name</span> <span class="p">=</span> <span class="s2">"Service Mesh Name"</span> <span class="nx">site_pairing</span> <span class="p">=</span> <span class="nx">hcx_site_pairing</span><span class="p">.</span><span class="nx">gcve</span> <span class="nx">local_compute_profile</span> <span class="p">=</span> <span class="nx">hcx_compute_profile</span><span class="p">.</span><span class="nx">compute_profile_1</span><span class="p">.</span><span class="nx">name</span> <span class="nx">remote_compute_profile</span> <span class="p">=</span> <span class="s2">"GCVE Compute Profile"</span> <span class="nx">depends_on</span> <span class="p">=</span> <span class="p">[</span> <span
class="nx">hcx_compute_profile</span><span class="p">.</span><span class="nx">compute_profile_1</span> <span class="p">]</span> <span class="nx">app_path_resiliency_enabled</span> <span class="p">=</span> <span class="kc">false</span> <span class="nx">tcp_flow_conditioning_enabled</span> <span class="p">=</span> <span class="kc">false</span> <span class="nx">uplink_max_bandwidth</span> <span class="p">=</span> <span class="mi">10000</span> <span class="nx">service</span> <span class="p">{</span> <span class="nx">name</span> <span class="p">=</span> <span class="s2">"INTERCONNECT"</span> <span class="p">}</span> <span class="nx">service</span> <span class="p">{</span> <span class="nx">name</span> <span class="p">=</span> <span class="s2">"WANOPT"</span> <span class="p">}</span> <span class="nx">service</span> <span class="p">{</span> <span class="nx">name</span> <span class="p">=</span> <span class="s2">"VMOTION"</span> <span class="p">}</span> <span class="nx">service</span> <span class="p">{</span> <span class="nx">name</span> <span class="p">=</span> <span class="s2">"BULK_MIGRATION"</span> <span class="p">}</span> <span class="nx">service</span> <span class="p">{</span> <span class="nx">name</span> <span class="p">=</span> <span class="s2">"NETWORK_EXTENSION"</span> <span class="p">}</span> <span class="p">}</span> </code></pre></div></div> <p>The <code class="language-plaintext highlighter-rouge">hcx_service_mesh</code> resource is where the magic happens. This block creates the service mesh between your on-prem data center and your GCVE SDDC by deploying multiple appliances at both sites and building encrypted tunnels between them. Once this process is complete, you will be able to migrate VMs into GCVE. Notice that the configuration is relatively basic, referencing the site pairing and local compute profile configured by Terraform. 
You will need to know the name of the compute profile in GCVE, but if you are using the default configuration, it should be <code class="language-plaintext highlighter-rouge">GCVE Compute Profile</code>. Similar to the compute profile, the <code class="language-plaintext highlighter-rouge">service</code> parameters define which features are enabled on the service mesh. Typically, the services enabled in your compute profile should match the services enabled in your service mesh.</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">resource</span> <span class="s2">"hcx_l2_extension"</span> <span class="s2">"l2_extension_1"</span> <span class="p">{</span> <span class="nx">site_pairing</span> <span class="p">=</span> <span class="nx">hcx_site_pairing</span><span class="p">.</span><span class="nx">gcve</span> <span class="nx">service_mesh_id</span> <span class="p">=</span> <span class="nx">hcx_service_mesh</span><span class="p">.</span><span class="nx">service_mesh_1</span><span class="p">.</span><span class="nx">id</span> <span class="nx">source_network</span> <span class="p">=</span> <span class="s2">"Name of local network to extend"</span> <span class="nx">network_type</span> <span class="p">=</span> <span class="s2">"NsxtSegment"</span> <span class="nx">depends_on</span> <span class="p">=</span> <span class="p">[</span> <span class="nx">hcx_service_mesh</span><span class="p">.</span><span class="nx">service_mesh_1</span> <span class="p">]</span> <span class="nx">destination_t1</span> <span class="p">=</span> <span class="s2">"Tier1"</span> <span class="nx">gateway</span> <span class="p">=</span> <span class="s2">"192.168.10.1"</span> <span class="nx">netmask</span> <span class="p">=</span> <span class="s2">"255.255.255.0"</span> <span class="p">}</span> </code></pre></div></div> <p>This final block is optional but helpful in testing a migration. 
This block extends a network from your data center into GCVE using HCX Network Extension. This example extends an NSX segment, but the <a href="https://registry.terraform.io/providers/adeleporte/hcx/latest/docs/resources/l2_extension">hcx_l2_extension resource documentation</a> provides the parameters needed to extend a DVS-based network. You will need to know the name of the tier 1 router in GCVE you wish to connect this network to.</p> <h2 id="variables-used">Variables Used</h2> <p>The following input variables are required for this example:</p> <ul> <li><code class="language-plaintext highlighter-rouge">hcx_admin_username</code>: Username for on-prem HCX appliance management. Default value is <code class="language-plaintext highlighter-rouge">admin</code>.</li> <li><code class="language-plaintext highlighter-rouge">hcx_admin_password</code>: Password for on-prem HCX appliance management</li> <li><code class="language-plaintext highlighter-rouge">hcx_username</code>: Username for on-prem HCX instance</li> <li><code class="language-plaintext highlighter-rouge">hcx_password</code>: Password for on-prem HCX instance</li> <li><code class="language-plaintext highlighter-rouge">gcve_hcx_username</code>: Username for GCVE HCX instance. Default value is <code class="language-plaintext highlighter-rouge">[email protected]</code></li> <li><code class="language-plaintext highlighter-rouge">gcve_hcx_password</code>: Password for GCVE HCX instance</li> </ul> <h3 id="using-environment-variables">Using Environment Variables</h3> <p>You can use the following commands on macOS or Linux to provide these variable values via environment variables. 
This is a good practice when passing credentials to Terraform.</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">export </span><span class="nv">TF_VAR_hcx_admin_username</span><span class="o">=</span><span class="s1">'admin'</span> <span class="nb">export </span><span class="nv">TF_VAR_hcx_admin_password</span><span class="o">=</span><span class="s1">'password'</span> <span class="nb">export </span><span class="nv">TF_VAR_hcx_username</span><span class="o">=</span><span class="s1">'[email protected]'</span> <span class="nb">export </span><span class="nv">TF_VAR_hcx_password</span><span class="o">=</span><span class="s1">'password'</span> <span class="nb">export </span><span class="nv">TF_VAR_gcve_hcx_username</span><span class="o">=</span><span class="s1">'[email protected]'</span> <span class="nb">export </span><span class="nv">TF_VAR_gcve_hcx_password</span><span class="o">=</span><span class="s1">'password'</span> </code></pre></div></div> <p>You can use the <code class="language-plaintext highlighter-rouge">unset</code> command to remove these environment variables, if necessary.</p> <h2 id="initializing-and-running-terraform">Initializing and Running Terraform</h2> <p>See the <a href="https://github.com/shamsway/gcp-terraform-examples/blob/main/gcve-hcx/README.md">README</a> included in the example repo for the steps required to initialize and run Terraform. This is the same process as previous examples.</p> <h1 id="final-thoughts">Final Thoughts</h1> <p>It feels good to finally be able to migrate some workloads into our GCVE environment! Admittedly, this example is a bit of a stretch and may not be useful for all HCX users. My team works heavily with HCX, and we are frequently standing up or removing an HCX service mesh for various environments. This provider will be a huge time saver for us and will be especially valuable once there are a few fixes and improvements. 
Configuring HCX via the UI is an excellent option for new users, but once you are standing up your tenth service mesh, it becomes apparent that using Terraform is much quicker than clicking through several dialogs. I also believe that seeing the HCX configuration represented in Terraform code provides an excellent overview of all of the configuration needed, and how the different components stack together like Legos to form a complete service mesh.</p> <p>What about automating the actual migration of VMs? This example prepares our environment for migration, but automating VM migration is better suited to a different tool than Terraform. Luckily, there are plenty of HCX-specific cmdlets in <a href="https://developer.vmware.com/powercli">PowerCLI</a>. Check out these <a href="https://blogs.vmware.com/PowerCLI/2019/02/getting-started-hcx-module.html">existing</a> <a href="https://code.vmware.com/samples?categories=Sample&amp;tags=HCX">resources</a> for some examples of using PowerCLI with HCX.</p> <p>This blog series is approaching its conclusion, but in my next post I’ll dive into configuring some common network use cases, like exposing a VM to the internet and configuring a load balancer in GCVE.</p> <h1 id="helpful-links">Helpful Links</h1> <ul> <li><a href="https://cloud.google.com/vmware-engine/docs/workloads/howto-migrate-vms-using-hcx">Migrating VMware VMs using VMware HCX</a></li> <li><a href="https://labs.hol.vmware.com/HOL/catalogs/lab/8843">Google Cloud VMware Engine Overview</a> Hands-on Lab, which includes HCX configuration.</li> <li><a href="https://registry.terraform.io/providers/adeleporte/hcx/">adeleporte/hcx</a> community Terraform provider for HCX</li> <li><a href="https://registry.terraform.io/providers/adeleporte/hcx/latest/docs/guides/lab">HCX Lab - Full HCX Connector configuration</a> Terraform example</li> <li><a href="https://registry.terraform.io/providers/adeleporte/hcx/latest/docs/resources/site_pairing">hcx_site_pairing 
Resource</a></li> <li><a href="https://registry.terraform.io/providers/adeleporte/hcx/latest/docs/resources/network_profile">hcx_network_profile Resource</a></li> <li><a href="https://registry.terraform.io/providers/adeleporte/hcx/latest/docs/resources/compute_profile">hcx_compute_profile Resource</a></li> <li><a href="https://registry.terraform.io/providers/adeleporte/hcx/latest/docs/resources/service_mesh">hcx_service_mesh Resource</a></li> <li><a href="https://registry.terraform.io/providers/adeleporte/hcx/latest/docs/resources/l2_extension">hcx_l2_extension Resource</a></li> <li><a href="https://blogs.vmware.com/PowerCLI/2019/02/getting-started-hcx-module.html">Getting Started with the PowerCLI HCX Module</a></li> <li><a href="https://code.vmware.com/samples?categories=Sample&amp;tags=HCX">PowerCLI Example Scripts for HCX</a></li> </ul> <h1 id="screenshots">Screenshots</h1> <p>Below are screenshots from HCX showing the results of running this Terraform example in my lab, for reference. 
I have modified the example code to match the configuration of my lab environment.</p> <p class="center"><a href="/resources/2021/04/39_hcx_np.png" class="drop-shadow"><img src="/resources/2021/04/39_hcx_np.png" alt="" /></a> HCX Network Profiles</p> <p class="center"><a href="/resources/2021/04/40_hcx_cp.png" class="drop-shadow"><img src="/resources/2021/04/40_hcx_cp.png" alt="" /></a> HCX Compute Profile</p> <p class="center"><a href="/resources/2021/04/41_hcx_sm_edited.png" class="drop-shadow"><img src="/resources/2021/04/41_hcx_sm_edited.png" alt="" /></a> HCX Service Mesh</p> <p class="center"><a href="/resources/2021/04/42_hcx_sm_appliance_details.png" class="drop-shadow"><img src="/resources/2021/04/42_hcx_sm_appliance_details.png" alt="" /></a> HCX Service Mesh Appliance Details</p> <p class="center"><a href="/resources/2021/04/43_hcx_ne_edited.png" class="drop-shadow"><img src="/resources/2021/04/43_hcx_ne_edited.png" alt="" /></a> HCX Network Extension</p> <p class="center"><a href="/resources/2021/04/45_hcx_vmotion_edited.png" class="drop-shadow"><img src="/resources/2021/04/45_hcx_vmotion_edited.png" alt="" /></a> HCX vMotion Test</p> Mon, 12 Apr 2021 00:00:00 +0000 http://www.networkbrouhaha.com/2021/04/gcve-hcx-config/ http://www.networkbrouhaha.com/2021/04/gcve-hcx-config/ Intro to Google Cloud VMware Engine – Network and Connectivity Overview <p>In previous posts, I’ve shown you how to deploy an SDDC in Google Cloud VMware Engine, connect the SDDC to a VPC, and deploy a bastion host for managing your environment. In this post, we’ll take a pause on deploying anything new to take a closer look at our SDDC. 
This post will provide an overview of the SDDC’s networking configuration and capabilities, and how to connect to it from an external site.</p> <p><strong>Other posts in this series:</strong></p> <ul> <li><a href="/2021/02/gcve-sddc-with-hcx/">Deploying a GCVE SDDC with HCX</a></li> <li><a href="/2021/02/gcp-vpc-to-gcve/">Connecting a VPC to GCVE</a></li> <li><a href="/2021/03/gcve-bastion/">Bastion Host Access with IAP</a></li> <li><a href="/2021/04/gcve-hcx-config/">HCX Configuration</a></li> <li><a href="/2021/05/gcve-networking-scenarios/">Common Networking Scenarios</a></li> </ul> <h1 id="sddc-networking-overview">SDDC Networking Overview</h1> <p class="center"><a href="/resources/2021/03/gcve_arch.png" class="drop-shadow"><img src="/resources/2021/03/gcve_arch.png" alt="" /></a> Google Cloud VMware Engine Overview by Google, licensed under <a href="https://creativecommons.org/licenses/by/3.0/">CC BY 3.0</a></p> <p>An SDDC running in GCVE consists of VMware vSphere, vCenter, vSAN, NSX-T, and optionally HCX, all running on top of Google Cloud infrastructure. Let’s take a peek at an SDDC deployment.</p> <h3 id="vds-and-n-vds-configuration">VDS and N-VDS Configuration</h3> <p class="center"><a href="/resources/2021/03/25_gcve_dvs_edited.png" class="drop-shadow"><img src="/resources/2021/03/25_gcve_dvs_edited.png" alt="" /></a></p> <p>Configuration of the single VDS in the SDDC is basic and is used to provide connectivity for HCX. The VLANs listed are locally significant to Google’s infrastructure and not something we need to worry about.</p> <p class="center"><a href="/resources/2021/03/26_gcve_virtual_switches_edited.png" class="drop-shadow"><img src="/resources/2021/03/26_gcve_virtual_switches_edited.png" alt="" /></a></p> <p>The virtual switch settings for one of the ESXi hosts provide a better picture of the networking landscape. Here we can see the vanilla VDS alongside the N-VDS managed by NSX-T. 
Almost all of the networking configuration we will perform will be in NSX-T, but I wanted to show the underlying configuration for curious individuals.</p> <p class="center"><a href="/resources/2021/03/36_nsxt_nvds_visual_edited.png" class="drop-shadow"><img src="/resources/2021/03/36_nsxt_nvds_visual_edited.png" alt="" /></a></p> <p>We’ll look at NSX-T further below, but this screenshot from NSX-T is a simple visualization of the N-VDS deployed.</p> <h3 id="vmkernel-and-vmnic-configuration">VMkernel and vmnic Configuration</h3> <p class="center"><a href="/resources/2021/03/28_gcve_vmk_edited.png" class="drop-shadow"><img src="/resources/2021/03/28_gcve_vmk_edited.png" alt="" /></a></p> <p>VMkernel configuration is straightforward, with dedicated adapters for management, vSAN, and vMotion. The IP addresses correspond with the management, vSAN, and vMotion subnets that were automatically created when the SDDC was deployed.</p> <p class="center"><a href="/resources/2021/03/27_gcve_phys_adapters_edited.png" class="drop-shadow"><img src="/resources/2021/03/27_gcve_phys_adapters_edited.png" alt="" /></a></p> <p>There are four 25 Gbps vmnics (physical adapters) in each host, providing an aggregate of 100 Gbps per host. Two vmnics are dedicated to the VDS, and two are dedicated to the N-VDS.</p> <h3 id="nsx-t-configuration">NSX-T Configuration</h3> <p class="center"><a href="/resources/2021/03/30_gcve_t0_bgp.png" class="drop-shadow"><img src="/resources/2021/03/30_gcve_t0_bgp.png" alt="" /></a></p> <p>The out-of-the-box NSX-T configuration for GCVE should look very familiar to you if you have ever deployed <a href="https://www.vmware.com/products/cloud-foundation.html">VMware Cloud Foundation</a>. 
The T0 router has redundant BGP connections to Google’s infrastructure.</p> <p class="center"><a href="/resources/2021/03/31_gcve_nsx_firewall.png" class="drop-shadow"><img src="/resources/2021/03/31_gcve_nsx_firewall.png" alt="" /></a></p> <p>There are no NAT rules configured, and the firewall has a default <code class="language-plaintext highlighter-rouge">allow any any</code> rule. This may not be what you were expecting, but by the end of this post, it should be more clear. We will look at traffic flows in the <strong>SDDC Networking Capabilities</strong> section below.</p> <p class="center"><a href="/resources/2021/03/32_gcve_tzs.png" class="drop-shadow"><img src="/resources/2021/03/32_gcve_tzs.png" alt="" /></a></p> <p>The configured transport zones consist of three VLAN TZs, and a single overlay TZ. The VLAN TZs facilitate the plumbing between the T0 router and Google infrastructure for BGP peering. The <code class="language-plaintext highlighter-rouge">TZ-OVERLAY</code> zone is where workload segments will be placed.</p> <p class="center"><a href="/resources/2021/03/35_gcve_edge_nodes_edited.png" class="drop-shadow"><img src="/resources/2021/03/35_gcve_edge_nodes_edited.png" alt="" /></a></p> <p>Finally, there is one edge cluster consisting of two edge nodes to host the NSX-T logical routers.</p> <h1 id="sddc-networking-capabilities">SDDC Networking Capabilities</h1> <p>Now that we’ve peeked behind the curtain, let’s talk about what you can actually <em>do</em> with your SDDC. 
This is by no means an exhaustive list, but here are some common use cases:</p> <ul> <li>Create workload segments in NSX-T</li> <li>Expose VMs or services to the internet via public IP</li> <li>Leverage NSX-T load balancing capabilities</li> <li>Create north-south firewall policies with the NSX-T gateway firewall</li> <li>Create east-west firewall policies (i.e., micro-segmentation) with the NSX-T distributed firewall</li> <li>Access and consume Google Cloud native services</li> <li>Migrate VMs from your on-prem data center to your GCVE SDDC with VMware HCX</li> </ul> <p>I will be covering many of these topics in future posts, including automation examples. Next, let’s look at the options for ingress and egress traffic.</p> <h3 id="egress-traffic">Egress Traffic</h3> <p class="center"><a href="/resources/2021/03/gcve_egress.png" class="drop-shadow"><img src="/resources/2021/03/gcve_egress.png" alt="" /></a> Google Cloud VMware Engine Egress Traffic Flows by Google, licensed under <a href="https://creativecommons.org/licenses/by/3.0/">CC BY 3.0</a></p> <p>One of the strengths of GCVE is that it provides you with options. As you can see on this diagram, you have three options for egress traffic:</p> <ol> <li>Egress through the GCVE internet gateway</li> <li>Egress through an attached VPC</li> <li>Egress through your on-prem data center via Cloud Interconnect or Cloud VPN</li> </ol> <p>In <a href="/2021/02/gcve-sddc-with-hcx/">Deploying a GCVE SDDC with HCX</a>, I walked through the steps to enable <code class="language-plaintext highlighter-rouge">Internet Access</code> and <code class="language-plaintext highlighter-rouge">Public IP Service</code> for your SDDC. This is all that is needed to provide egress internet access through the internet gateway. 
Internet-bound traffic will be routed from the T0 router to the internet gateway, which NATs all traffic behind a public IP.</p> <p>Egress through an attached VPC or on-prem data center requires additional steps that are beyond the scope of this post, but I will provide documentation links for these scenarios at the end of the post.</p> <h3 id="ingress-traffic">Ingress Traffic</h3> <p class="center"><a href="/resources/2021/03/gcve_ingress.png" class="drop-shadow"><img src="/resources/2021/03/gcve_ingress.png" alt="" /></a> Google Cloud VMware Engine Ingress Traffic Flows by Google, licensed under <a href="https://creativecommons.org/licenses/by/3.0/">CC BY 3.0</a></p> <p>Ingress traffic to GCVE follows paths similar to egress traffic. You can ingress via the public IP service, a connected VPC, or through your on-prem data center. Using the public IP service is the least complicated option and requires that you’ve enabled <code class="language-plaintext highlighter-rouge">Public IP Service</code> for your SDDC.</p> <p class="center"><a href="/resources/2021/03/37_allocate_public_ip.png" class="drop-shadow"><img src="/resources/2021/03/37_allocate_public_ip.png" alt="" /></a></p> <p>Public IPs are not assigned directly to VMs. Instead, a public IP is allocated and NATed to a private IP in your SDDC. You can allocate a public IP in the GCVE portal by supplying a name for the IP allocation, the region, and the private address.</p> <h1 id="connecting-to-your-sddc">Connecting to your SDDC</h1> <p>My previous post, <a href="/2021/02/gcve-sddc-with-hcx/">Deploying a GCVE SDDC with HCX</a>, outlines the steps to set up client VPN access to your SDDC, and <a href="/2021/03/gcve-bastion/">Bastion Host Access with IAP</a> provides an example bastion host setup for managing your SDDC. These are “day 1” options for connectivity, so you will likely need some other method to connect your on-prem data center to your GCVE SDDC. 
I covered cloud connectivity options in <a href="/2020/11/cloud-connectivity-101/">Cloud Connectivity 101</a>, and many of the methods outlined in that post are available for connecting to GCVE. Today, your options are to use <a href="https://cloud.google.com/network-connectivity/docs/interconnect">Cloud Interconnect</a> or an IPSec tunnel via <a href="https://cloud.google.com/network-connectivity/docs/vpn/concepts/overview">Cloud VPN</a> or <a href="https://docs.vmware.com/en/VMware-NSX-T-Data-Center/3.1/administration/GUID-A8B113EC-3D53-41A5-919E-78F1A3705F58.html">NSX-T IPSec VPN</a>.</p> <p>In our lab, we are lucky to have a connection to <a href="https://www.megaport.com/">Megaport</a>, so I am using Partner Interconnect for my testing with GCVE. This is a very easy solution for connecting to the cloud, and their documentation provides simple step-by-step instructions to get up and running. Once complete, BGP peering will be established between the Megaport Cloud Router and a Google Cloud Router.</p> <h3 id="advertising-routes-to-gcve">Advertising Routes to GCVE</h3> <p class="center"><a href="/resources/2021/03/38_cloud_router_custom_ip_range_edited.png" class="drop-shadow"><img src="/resources/2021/03/38_cloud_router_custom_ip_range_edited.png" alt="" /></a></p> <p>VPC peering in Google Cloud does not support transitive routing. This means that I had to add a custom advertised IP range for my GCVE subnets to the Google Cloud Router. After adding this configuration, I was able to ping IPs in my SDDC. 
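</p> <p>If you prefer the <code class="language-plaintext highlighter-rouge">gcloud</code> CLI over the console for this step, a custom advertised range can be added to the Cloud Router with something like the following sketch. The router name, region, and CIDR are placeholders, not values from my lab:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Switch the Cloud Router to custom advertisement mode, keep advertising
# its local subnets, and additionally advertise the GCVE range (placeholders)
gcloud compute routers update my-cloud-router \
  --region=us-west2 \
  --advertisement-mode=custom \
  --set-advertisement-groups=all_subnets \
  --set-advertisement-ranges=10.100.0.0/16
</code></pre></div></div> <p>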
You will need to <a href="https://cloud.google.com/vmware-engine/docs/networking/howto-dns-on-premises">configure your DNS server to resolve queries for <code class="language-plaintext highlighter-rouge">gve.goog</code></a> to be able to access vCenter, NSX and HCX by their hostnames.</p> <h3 id="icmp-in-gcve">ICMP in GCVE</h3> <p>One nuance in GCVE that threw me off is that ICMP is not supported by the internal load balancer, which is in the path for egress traffic if you are using the internet gateway. Trying to ping 8.8.8.8 will fail, even if your SDDC is correctly connected to the internet. To test internet connectivity from a VM in your SDDC, use another tool like <code class="language-plaintext highlighter-rouge">curl</code> or follow the instructions <a href="https://www.xmodulo.com/how-to-install-tcpping-on-linux.html">here</a> to install <code class="language-plaintext highlighter-rouge">tcpping</code> for testing.</p> <h1 id="next-steps">Next Steps</h1> <p>Next, we will stage our SDDC networking segments and connect HCX to begin migrating workloads to GCVE. I highly recommend you read the <a href="https://cloud.google.com/solutions/private-cloud-networking-for-vmware-engine">Private cloud networking for Google Cloud VMware Engine</a> whitepaper, which goes into many of the subjects I’ve touched on in this blog in greater detail.</p> Thu, 18 Mar 2021 00:00:00 +0000 http://www.networkbrouhaha.com/2021/03/gcve-network-overview/ http://www.networkbrouhaha.com/2021/03/gcve-network-overview/ Intro to Google Cloud VMware Engine – Bastion Host Access with IAP <p>Welcome back! This post will build on the previous posts in this series by deploying a Windows Server 2019 bastion host to manage our Google Cloud VMware Engine (GCVE) SDDC. Access to the bastion host will be provided with <a href="https://cloud.google.com/iap">Identity-Aware Proxy</a> (IAP). 
Everything will be deployed and configured with Terraform, and all of the code referenced in this post is available at <a href="https://github.com/shamsway/gcp-terraform-examples">https://github.com/shamsway/gcp-terraform-examples</a> in the <code class="language-plaintext highlighter-rouge">gcve-bastion-iap</code> sub-directory.</p> <p><strong>Other posts in this series:</strong></p> <ul> <li><a href="/2021/02/gcve-sddc-with-hcx/">Deploying a GCVE SDDC with HCX</a></li> <li><a href="/2021/02/gcp-vpc-to-gcve/">Connecting a VPC to GCVE</a></li> <li><a href="/2021/03/gcve-network-overview/">Network and Connectivity Overview</a></li> <li><a href="/2021/04/gcve-hcx-config/">HCX Configuration</a></li> <li><a href="/2021/05/gcve-networking-scenarios/">Common Networking Scenarios</a></li> </ul> <h1 id="identity-aware-proxy-overview">Identity-Aware Proxy Overview</h1> <p>Standing up initial cloud connectivity is challenging. I walked through the steps to deploy a client VPN in <a href="/2021/02/gcve-sddc-with-hcx/">Deploying a GCVE SDDC with HCX</a>, but this post will show how to use IAP as a method for accessing a new bastion host. Using IAP means that the bastion host will be accessible without having to configure a VPN or expose it to the internet. I am a massive fan of this approach, and while there are some tradeoffs to discuss, it is simpler and more secure than traditional access methods.</p> <p>IAP can be used to access various resources, including App Engine and GKE. Accessing the bastion host over RDP (TCP port 3389) will be accomplished using <a href="https://cloud.google.com/iap/docs/using-tcp-forwarding">IAP for TCP forwarding</a>. Once configured, IAP will allow us to establish a connection to our bastion host over an encrypted tunnel on demand. Configuring this feature will require some specific IAM roles, as well as some firewall rules in your VPC. 
If you have <code class="language-plaintext highlighter-rouge">Owner</code> permissions in your GCP project, then you’re good to go. Otherwise, you will need the following roles assigned to complete the tasks outlined in the rest of this post:</p> <ul> <li>Compute Admin (<code class="language-plaintext highlighter-rouge">roles/compute.admin</code>)</li> <li>Service Account Admin (<code class="language-plaintext highlighter-rouge">roles/iam.serviceAccountAdmin</code>)</li> <li>Service Account User (<code class="language-plaintext highlighter-rouge">roles/iam.serviceAccountUser</code>)</li> <li>IAP Policy Admin (<code class="language-plaintext highlighter-rouge">roles/iap.admin</code>)</li> <li>IAP settings Admin (<code class="language-plaintext highlighter-rouge">roles/iap.settingsAdmin</code>)</li> <li>IAP-secured Tunnel User (<code class="language-plaintext highlighter-rouge">roles/iap.tunnelResourceAccessor</code>)</li> <li>Service Networking Admin (<code class="language-plaintext highlighter-rouge">roles/servicenetworking.networksAdmin</code>)</li> <li>Project IAM Admin (<code class="language-plaintext highlighter-rouge">roles/resourcemanager.projectIamAdmin</code>)</li> </ul> <p>The VPC firewall will need to allow traffic sourced from <code class="language-plaintext highlighter-rouge">35.235.240.0/20</code>, which is the range that IAP uses for TCP forwarding. 
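</p> <p>The Terraform code below creates this rule for you, but if you want to create it by hand, a minimal sketch with <code class="language-plaintext highlighter-rouge">gcloud</code> looks like this (the network name and target tag are placeholders for your environment):</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Allow IAP's TCP forwarding range to reach tagged instances (placeholders)
gcloud compute firewall-rules create allow-iap-ingress \
  --network=my-vpc \
  --direction=INGRESS \
  --action=ALLOW \
  --rules=tcp:3389,tcp:22 \
  --source-ranges=35.235.240.0/20 \
  --target-tags=bastion
</code></pre></div></div> <p>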
This rule can be further limited to specific TCP ports, like 3389 for RDP or 22 for SSH.</p> <h1 id="bastion-host-deployment-with-terraform">Bastion Host Deployment with Terraform</h1> <p>The example Terraform code linked at the beginning of the post will do the following:</p> <ul> <li>Create a <a href="https://cloud.google.com/compute/docs/access/create-enable-service-accounts-for-instances">service account</a>, which will be associated with the bastion host</li> <li>Create a Windows Server 2019 instance, which will be used as a bastion host</li> <li>Create <a href="https://cloud.google.com/iap/docs/using-tcp-forwarding#create-firewall-rule">firewall rules</a> for accessing the bastion host via IAP, and accessing resources from the bastion host</li> <li>Assign <a href="https://cloud.google.com/iap/docs/using-tcp-forwarding#grant-permission">IAM roles needed for IAP</a></li> <li>Set a password on the bastion host using the <code class="language-plaintext highlighter-rouge">gcloud</code> tool</li> </ul> <p>After Terraform completes configuration, you will be able to use the <code class="language-plaintext highlighter-rouge">gcloud</code> tool to enable TCP forwarding for RDP. Once connected to the bastion host, you will be able to log into your GCVE-based vSphere portal. To get started, clone the example repo with <code class="language-plaintext highlighter-rouge">git clone https://github.com/shamsway/gcp-terraform-examples.git</code>, then change to the <code class="language-plaintext highlighter-rouge">gcve-bastion-iap</code> sub-directory. 
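</p> <p>The clone-and-enter steps look like this:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Clone the example repo and enter the bastion example directory
git clone https://github.com/shamsway/gcp-terraform-examples.git
cd gcp-terraform-examples/gcve-bastion-iap
</code></pre></div></div> <p>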
You will find these files:</p> <ul> <li><code class="language-plaintext highlighter-rouge">main.tf</code> – Contains the primary Terraform code to complete the steps mentioned above</li> <li><code class="language-plaintext highlighter-rouge">variables.tf</code> – Defines the input variables that will be used in <code class="language-plaintext highlighter-rouge">main.tf</code></li> <li><code class="language-plaintext highlighter-rouge">terraform.tfvars</code> – Supplies values for the input variables defined in <code class="language-plaintext highlighter-rouge">variables.tf</code></li> <li><code class="language-plaintext highlighter-rouge">outputs.tf</code> – Defines the output variables to be returned from <code class="language-plaintext highlighter-rouge">main.tf</code></li> </ul> <p>Let’s take a closer look at what is happening in each of these files.</p> <h2 id="maintf-contents">main.tf Contents</h2> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">provider</span> <span class="s2">"google"</span> <span class="p">{</span> <span class="nx">project</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">project</span> <span class="nx">region</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">region</span> <span class="nx">zone</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">zone</span> <span class="p">}</span> <span class="k">data</span> <span class="s2">"google_compute_network"</span> <span class="s2">"network"</span> <span class="p">{</span> <span class="nx">name</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">network_name</span> <span class="p">}</span> <span class="k">data</span> <span class="s2">"google_compute_subnetwork"</span> <span class="s2">"subnet"</span> <span class="p">{</span> <span 
class="nx">name</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">subnet_name</span> <span class="nx">region</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">region</span> <span class="p">}</span> </code></pre></div></div> <p>Just like the example from my <a href="/2021/02/gcp-vpc-to-gcve/">last post</a>, <code class="language-plaintext highlighter-rouge">main.tf</code> begins with a <code class="language-plaintext highlighter-rouge">provider</code> block to define the Google Cloud project, region, and zone in which Terraform will create resources. The following data blocks, <code class="language-plaintext highlighter-rouge">google_compute_network.network</code> and <code class="language-plaintext highlighter-rouge">google_compute_subnetwork.subnet</code>, reference an existing VPC network and subnetwork. These data blocks will provide parameters necessary for creating a bastion host and firewall rules.</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">resource</span> <span class="s2">"google_service_account"</span> <span class="s2">"bastion_host"</span> <span class="p">{</span> <span class="nx">project</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">project</span> <span class="nx">account_id</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">service_account_name</span> <span class="nx">display_name</span> <span class="p">=</span> <span class="s2">"Service Account for Bastion"</span> <span class="p">}</span> </code></pre></div></div> <p>The first resource block creates a new <a href="https://cloud.google.com/compute/docs/access/service-accounts">service account</a> that will be associated with our bastion host instance.</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code><span class="k">resource</span> <span class="s2">"google_compute_instance"</span> <span class="s2">"bastion_host"</span> <span class="p">{</span> <span class="nx">name</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">name</span> <span class="nx">machine_type</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">machine_type</span> <span class="nx">boot_disk</span> <span class="p">{</span> <span class="nx">initialize_params</span> <span class="p">{</span> <span class="nx">image</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">image</span> <span class="p">}</span> <span class="p">}</span> <span class="nx">network_interface</span> <span class="p">{</span> <span class="nx">subnetwork</span> <span class="p">=</span> <span class="k">data</span><span class="p">.</span><span class="nx">google_compute_subnetwork</span><span class="p">.</span><span class="nx">subnet</span><span class="p">.</span><span class="nx">self_link</span> <span class="nx">access_config</span> <span class="p">{}</span> <span class="p">}</span> <span class="nx">service_account</span> <span class="p">{</span> <span class="nx">email</span> <span class="p">=</span> <span class="nx">google_service_account</span><span class="p">.</span><span class="nx">bastion_host</span><span class="p">.</span><span class="nx">email</span> <span class="nx">scopes</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">scopes</span> <span class="p">}</span> <span class="nx">tags</span> <span class="p">=</span> <span class="p">[</span><span class="kd">var</span><span class="p">.</span><span class="nx">tag</span><span class="p">]</span> <span class="nx">labels</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">labels</span> <span class="nx">metadata</span> <span 
class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">metadata</span> <span class="p">}</span> </code></pre></div></div> <p>The <code class="language-plaintext highlighter-rouge">google_compute_instance.bastion_host</code> block creates the bastion host. There are a few things to take note of in this block. <code class="language-plaintext highlighter-rouge">subnetwork</code> is set based on one of the data blocks at the beginning of <code class="language-plaintext highlighter-rouge">main.tf</code>, <code class="language-plaintext highlighter-rouge">data.google_compute_subnetwork.subnet.self_link</code>. The <code class="language-plaintext highlighter-rouge">self_link</code> property provides a unique reference to the subnet that Terraform will use when submitting the API call to create the bastion host. Similarly, the service account created by <code class="language-plaintext highlighter-rouge">google_service_account.bastion_host</code> is assigned to the bastion host.</p> <p><code class="language-plaintext highlighter-rouge">tags</code>, <code class="language-plaintext highlighter-rouge">labels</code>, and <code class="language-plaintext highlighter-rouge">metadata</code> all serve similar, but distinct, purposes. <code class="language-plaintext highlighter-rouge">tags</code> are network tags, which will be used in firewall rules. <code class="language-plaintext highlighter-rouge">labels</code> are informational data that can be used for organizational or billing purposes. 
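</p> <p>As a hypothetical illustration of the distinction, values for these three arguments might look like the following. These values are invented for this sketch, not taken from the example repo:</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code>tags   = ["bastion"]            # network tags, matched by firewall rules
labels = {
  owner        = "gcve-team"    # informational, e.g. for billing reports
  created_with = "terraform"
}
metadata = {
  # a PowerShell script run at first boot
  "windows-startup-script-ps1" = "Write-Output 'first boot'"
}
</code></pre></div></div> <p>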
<code class="language-plaintext highlighter-rouge">metadata</code> has numerous uses, the most common of which is supplying a boot script that the instance will run on first boot.</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">resource</span> <span class="s2">"google_compute_firewall"</span> <span class="s2">"allow_from_iap_to_bastion"</span> <span class="p">{</span> <span class="nx">project</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">project</span> <span class="nx">name</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">fw_name_allow_iap_to_bastion</span> <span class="nx">network</span> <span class="p">=</span> <span class="k">data</span><span class="p">.</span><span class="nx">google_compute_network</span><span class="p">.</span><span class="nx">network</span><span class="p">.</span><span class="nx">self_link</span> <span class="nx">allow</span> <span class="p">{</span> <span class="nx">protocol</span> <span class="p">=</span> <span class="s2">"tcp"</span> <span class="nx">ports</span> <span class="p">=</span> <span class="p">[</span><span class="s2">"3389"</span><span class="p">]</span> <span class="p">}</span> <span class="c1"># https://cloud.google.com/iap/docs/using-tcp-forwarding#before_you_begin</span> <span class="c1"># This range is needed to allow IAP to access the bastion host</span> <span class="nx">source_ranges</span> <span class="p">=</span> <span class="p">[</span><span class="s2">"35.235.240.0/20"</span><span class="p">]</span> <span class="nx">target_tags</span> <span class="p">=</span> <span class="p">[</span><span class="kd">var</span><span class="p">.</span><span class="nx">tag</span><span class="p">]</span> <span class="p">}</span> <span class="k">resource</span> <span class="s2">"google_compute_firewall"</span> <span class="s2">"allow_access_from_bastion"</span> <span 
class="p">{</span> <span class="nx">project</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">project</span> <span class="nx">name</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">fw_name_allow_mgmt_from_bastion</span> <span class="nx">network</span> <span class="p">=</span> <span class="k">data</span><span class="p">.</span><span class="nx">google_compute_network</span><span class="p">.</span><span class="nx">network</span><span class="p">.</span><span class="nx">self_link</span> <span class="nx">allow</span> <span class="p">{</span> <span class="nx">protocol</span> <span class="p">=</span> <span class="s2">"icmp"</span> <span class="p">}</span> <span class="nx">allow</span> <span class="p">{</span> <span class="nx">protocol</span> <span class="p">=</span> <span class="s2">"tcp"</span> <span class="nx">ports</span> <span class="p">=</span> <span class="p">[</span><span class="s2">"22"</span><span class="p">,</span> <span class="s2">"80"</span><span class="p">,</span> <span class="s2">"443"</span><span class="p">,</span> <span class="s2">"3389"</span><span class="p">]</span> <span class="p">}</span> <span class="c1"># Allow management traffic from bastion</span> <span class="nx">source_tags</span> <span class="p">=</span> <span class="p">[</span><span class="kd">var</span><span class="p">.</span><span class="nx">tag</span><span class="p">]</span> <span class="p">}</span> </code></pre></div></div> <p>The next two blocks create firewall rules: one for accessing the bastion host via IAP, and the other for accessing resources from the bastion host. 
<code class="language-plaintext highlighter-rouge">google_compute_firewall.allow_from_iap_to_bastion</code> allows traffic from <code class="language-plaintext highlighter-rouge">35.235.240.0/20</code> on <code class="language-plaintext highlighter-rouge">tcp/3389</code> to instances that have the same network tag as the one that was assigned to the bastion host. <code class="language-plaintext highlighter-rouge">google_compute_firewall.allow_access_from_bastion</code> allows traffic sourced from the bastion host, matched by the same network tag, to everything else in our project on common management ports/protocols.</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">resource</span> <span class="s2">"google_iap_tunnel_instance_iam_binding"</span> <span class="s2">"enable_iap"</span> <span class="p">{</span> <span class="nx">project</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">project</span> <span class="nx">zone</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">zone</span> <span class="nx">instance</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">name</span> <span class="nx">role</span> <span class="p">=</span> <span class="s2">"roles/iap.tunnelResourceAccessor"</span> <span class="nx">members</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">members</span> <span class="nx">depends_on</span> <span class="p">=</span> <span class="p">[</span><span class="nx">google_compute_instance</span><span class="p">.</span><span class="nx">bastion_host</span><span class="p">]</span> <span class="p">}</span> </code></pre></div></div> <p>The <code class="language-plaintext highlighter-rouge">google_iap_tunnel_instance_iam_binding.enable_iap</code> block assigns the <code class="language-plaintext 
highlighter-rouge">roles/iap.tunnelResourceAccessor</code> IAM role to the accounts defined in the <code class="language-plaintext highlighter-rouge">members</code> variable. This value can be any valid IAM member, such as a specific user account or a group. This role is required to be able to access the bastion host via IAP.</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">resource</span> <span class="s2">"google_service_account_iam_binding"</span> <span class="s2">"bastion_sa_user"</span> <span class="p">{</span> <span class="nx">service_account_id</span> <span class="p">=</span> <span class="nx">google_service_account</span><span class="p">.</span><span class="nx">bastion_host</span><span class="p">.</span><span class="nx">id</span> <span class="nx">role</span> <span class="p">=</span> <span class="s2">"roles/iam.serviceAccountUser"</span> <span class="nx">members</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">members</span> <span class="p">}</span> </code></pre></div></div> <p>The <code class="language-plaintext highlighter-rouge">google_service_account_iam_binding.bastion_sa_user</code> block allows accounts specified in the <code class="language-plaintext highlighter-rouge">members</code> variable to use the newly created service account via the <code class="language-plaintext highlighter-rouge">Service Account User</code> role (<code class="language-plaintext highlighter-rouge">roles/iam.serviceAccountUser</code>). This allows the users or groups defined in the <code class="language-plaintext highlighter-rouge">members</code> variable to access all of the resources that the service account has rights to access. 
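</p> <p>To illustrate what this Terraform binding does, here is a rough <code class="language-plaintext highlighter-rouge">gcloud</code> equivalent as a one-off command. The service account email and member are placeholders built from the example tfvars values:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Grant a user permission to act as the bastion service account (placeholders)
gcloud iam service-accounts add-iam-policy-binding \
  bastion-sa@your-gcp-project.iam.gserviceaccount.com \
  --member="user:admin@example.com" \
  --role="roles/iam.serviceAccountUser"
</code></pre></div></div> <p>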
More information on this can be found <a href="https://cloud.google.com/iam/docs/service-accounts#user-role">here</a>.</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">resource</span> <span class="s2">"google_project_iam_member"</span> <span class="s2">"bastion_sa_bindings"</span> <span class="p">{</span> <span class="nx">for_each</span> <span class="p">=</span> <span class="nx">toset</span><span class="p">(</span><span class="kd">var</span><span class="p">.</span><span class="nx">service_account_roles</span><span class="p">)</span> <span class="nx">project</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">project</span> <span class="nx">role</span> <span class="p">=</span> <span class="nx">each</span><span class="p">.</span><span class="nx">key</span> <span class="nx">member</span> <span class="p">=</span> <span class="s2">"serviceAccount:</span><span class="k">${</span><span class="nx">google_service_account</span><span class="p">.</span><span class="nx">bastion_host</span><span class="p">.</span><span class="nx">email</span><span class="k">}</span><span class="s2">"</span> <span class="p">}</span> </code></pre></div></div> <p><code class="language-plaintext highlighter-rouge">google_project_iam_member.bastion_sa_bindings</code> completes the IAM-related configuration by granting roles defined in the <code class="language-plaintext highlighter-rouge">service_account_roles</code> variable to the service account. This service account is assigned to the bastion host, which defines what the bastion host can do. 
The default roles assigned are listed below, but they can be modified in <code class="language-plaintext highlighter-rouge">variables.tf</code>.</p> <ul> <li>Log Writer (<code class="language-plaintext highlighter-rouge">roles/logging.logWriter</code>)</li> <li>Monitoring Metric Writer (<code class="language-plaintext highlighter-rouge">roles/monitoring.metricWriter</code>)</li> <li>Monitoring Viewer (<code class="language-plaintext highlighter-rouge">roles/monitoring.viewer</code>)</li> <li>Compute OS Login (<code class="language-plaintext highlighter-rouge">roles/compute.osLogin</code>)</li> </ul> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">resource</span> <span class="s2">"time_sleep"</span> <span class="s2">"wait_60_seconds"</span> <span class="p">{</span> <span class="nx">create_duration</span> <span class="p">=</span> <span class="s2">"60s"</span> <span class="nx">depends_on</span> <span class="p">=</span> <span class="p">[</span><span class="nx">google_compute_instance</span><span class="p">.</span><span class="nx">bastion_host</span><span class="p">]</span> <span class="p">}</span> <span class="k">data</span> <span class="s2">"external"</span> <span class="s2">"gcloud_set_bastion_password"</span> <span class="p">{</span> <span class="nx">program</span> <span class="p">=</span> <span class="p">[</span><span class="s2">"bash"</span><span class="p">,</span> <span class="s2">"-c"</span><span class="p">,</span> <span class="s2">"gcloud compute reset-windows-password </span><span class="k">${</span><span class="kd">var</span><span class="p">.</span><span class="nx">name</span><span class="k">}</span><span class="s2"> --user=</span><span class="k">${</span><span class="kd">var</span><span class="p">.</span><span class="nx">username</span><span class="k">}</span><span class="s2"> --format=json --quiet"</span><span class="p">]</span> <span class="nx">depends_on</span> <span class="p">=</span> <span 
class="p">[</span><span class="nx">time_sleep</span><span class="p">.</span><span class="nx">wait_60_seconds</span><span class="p">]</span> <span class="p">}</span> </code></pre></div></div> <p>These final two blocks are what I refer to as “cool Terraform tricks.” The point of these blocks is to set the password on the bastion host. There are a few ways to do this, but unfortunately, there is no way to set a Windows instance password with a native Terraform resource. Instead, an <code class="language-plaintext highlighter-rouge">external</code> data source is used to run the appropriate <code class="language-plaintext highlighter-rouge">gcloud</code> command, with JSON formatted results returned (this is a requirement of the <code class="language-plaintext highlighter-rouge">external</code> data source). The password cannot be set until the bastion host is fully booted, so <code class="language-plaintext highlighter-rouge">data.external.gcloud_set_bastion_password</code> depends on <code class="language-plaintext highlighter-rouge">time_sleep.wait_60_seconds</code>, which is a simple 60-second timer that gives the bastion host time to boot up before the <code class="language-plaintext highlighter-rouge">gcloud</code> command is run.</p> <p>There is a chance that 60 seconds may not be long enough for the bastion host to boot. If you receive an error stating that the instance is not ready for use, you have two options:</p> <ol> <li>Run <code class="language-plaintext highlighter-rouge">terraform destroy</code> to remove the bastion host. 
Edit <code class="language-plaintext highlighter-rouge">main.tf</code> and increase the <code class="language-plaintext highlighter-rouge">create_duration</code> to a higher value, then run <code class="language-plaintext highlighter-rouge">terraform apply</code> again.</li> <li>Run the <code class="language-plaintext highlighter-rouge">gcloud compute reset-windows-password</code> command manually</li> </ol> <p>Ideally, the password reset functionality would be built into the Google Cloud Terraform provider, and I wouldn’t be surprised to see it added in the future. If you’re reading this post in 2022 or beyond, it’s probably worth a quick investigation to see if this has happened.</p> <h2 id="outputtf-contents">outputs.tf Contents</h2> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">output</span> <span class="s2">"bastion_username"</span> <span class="p">{</span> <span class="nx">value</span> <span class="p">=</span> <span class="k">data</span><span class="p">.</span><span class="nx">external</span><span class="p">.</span><span class="nx">gcloud_set_bastion_password</span><span class="p">.</span><span class="nx">result</span><span class="p">.</span><span class="nx">username</span> <span class="p">}</span> <span class="k">output</span> <span class="s2">"bastion_password"</span> <span class="p">{</span> <span class="nx">value</span> <span class="p">=</span> <span class="k">data</span><span class="p">.</span><span class="nx">external</span><span class="p">.</span><span class="nx">gcloud_set_bastion_password</span><span class="p">.</span><span class="nx">result</span><span class="p">.</span><span class="nx">password</span> <span class="p">}</span> </code></pre></div></div> <p>These two outputs are the results of running the <code class="language-plaintext highlighter-rouge">gcloud</code> command. Once Terraform has completed running, it will display the username and password set on the bastion host. 
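</p> <p>If you want to keep the password out of the console output, the password output block from the repo can be amended like this (a sketch):</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code>output "bastion_password" {
  value     = data.external.gcloud_set_bastion_password.result.password
  sensitive = true  # redacted in CLI output, but still stored in state
}
</code></pre></div></div> <p>With the output marked sensitive, the value can still be read on demand with <code class="language-plaintext highlighter-rouge">terraform output bastion_password</code>. 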
A password is sensitive data, so if you want to prevent it from being displayed, add <code class="language-plaintext highlighter-rouge">sensitive = true</code> to the <code class="language-plaintext highlighter-rouge">bastion_password</code> output block. Output values are stored in the Terraform state file, so you should take precautions to protect the state file from unauthorized access. Additional information on Terraform outputs is available <a href="https://www.terraform.io/docs/language/values/outputs.html">here</a>.</p> <h2 id="terraformtfvars-contents">terraform.tfvars Contents</h2> <p><code class="language-plaintext highlighter-rouge">terraform.tfvars</code> is the file that supplies values for the variables referenced in <code class="language-plaintext highlighter-rouge">main.tf</code>. All you need to do is supply the desired values for your environment, and you are good to go. Note that the variables below are all examples, so simply copying and pasting may not lead to the desired result.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">members</span> <span class="o">=</span> <span class="p">[</span><span class="s">"user:user@example.com"</span><span class="p">]</span> <span class="n">project</span> <span class="o">=</span> <span class="s">"your-gcp-project"</span> <span class="n">region</span> <span class="o">=</span> <span class="s">"us-west2"</span> <span class="n">zone</span> <span class="o">=</span> <span class="s">"us-west2-a"</span> <span class="n">service_account_name</span> <span class="o">=</span> <span class="s">"bastion-sa"</span> <span class="n">name</span> <span class="o">=</span> <span class="s">"bastion-vm"</span> <span class="n">username</span> <span class="o">=</span> <span class="s">"bastionuser"</span> <span class="n">labels</span> <span class="o">=</span> <span class="p">{</span> <span class="n">owner</span> <span class="o">=</span> <span class="s">"GCVE 
Team"</span><span class="p">,</span> <span class="n">created_with</span> <span class="o">=</span> <span class="s">"terraform"</span> <span class="p">}</span> <span class="n">image</span> <span class="o">=</span> <span class="s">"gce-uefi-images/windows-2019"</span> <span class="n">machine_type</span> <span class="o">=</span> <span class="s">"n1-standard-1"</span> <span class="n">network_name</span> <span class="o">=</span> <span class="s">"gcve-usw2"</span> <span class="n">subnet_name</span> <span class="o">=</span> <span class="s">"gcve-usw2-mgmt"</span> <span class="n">tag</span> <span class="o">=</span> <span class="s">"bastion"</span> </code></pre></div></div> <p>Additional information on the variables used is available in <a href="https://github.com/shamsway/gcp-terraform-examples/blob/main/gcve-bastion-iap/README.md">README.md</a>. You can also find information on these variables, including their default values should one exist, in <code class="language-plaintext highlighter-rouge">variables.tf</code>.</p> <h2 id="initializing-and-running-terraform">Initializing and Running Terraform</h2> <p>Terraform will use <a href="https://cloud.google.com/sdk/gcloud/reference/auth/application-default">Application Default Credentials</a> to authenticate to Google Cloud. Assuming you have the <code class="language-plaintext highlighter-rouge">gcloud</code> CLI tool installed, you can set these by running <code class="language-plaintext highlighter-rouge">gcloud auth application-default login</code>. Additional information on authentication can be found in the <a href="https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/getting_started">Getting Started with the Google Provider</a> Terraform documentation. 
To run the Terraform code, follow the steps below.</p> <p><strong>Following these steps will create resources in your Google Cloud project, and you will be billed for them.</strong></p> <ol> <li>Run <code class="language-plaintext highlighter-rouge">terraform init</code> and ensure no errors are displayed</li> <li>Run <code class="language-plaintext highlighter-rouge">terraform plan</code> and review the changes that Terraform will perform</li> <li>Run <code class="language-plaintext highlighter-rouge">terraform apply</code> to apply the proposed configuration changes</li> </ol> <p>Should you wish to remove everything created by Terraform, run <code class="language-plaintext highlighter-rouge">terraform destroy</code> and answer <code class="language-plaintext highlighter-rouge">yes</code> when prompted. This will only remove the bastion host and related configuration created by Terraform. Your GCVE environment will have to be deleted using <a href="https://cloud.google.com/vmware-engine/docs/private-clouds/howto-delete-private-cloud">these instructions</a>, if desired.</p> <h1 id="accessing-the-bastion-host-with-iap">Accessing the Bastion Host with IAP</h1> <p>Now, you should have a fresh Windows Server 2019 instance running in Google Cloud to serve as a bastion host. Use this command to create a tunnel to the bastion host:</p> <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>gcloud compute start-iap-tunnel <span class="o">[</span>bastion-host-name] 3389 <span class="nt">--zone</span> <span class="o">[</span>zone] </code></pre></div></div> <p class="center"><img src="https://networkbrouhaha.com/resources/2021/03/20_gcloud_iap_tunnel.png" alt="" class="drop-shadow" /></p> <p>You will see a message that says <code class="language-plaintext highlighter-rouge">Listening on port [random number]</code>. This random high port is proxied to your bastion host port 3389. 
Fire up your favorite RDP client and connect to <code class="language-plaintext highlighter-rouge">localhost:[random number]</code>. Login with the credentials that were output from running Terraform. Once you’re able to connect to the bastion host, install the vSphere-compatible browser of your choice, along with any other management tools you may need.</p> <p>If you’re a Windows user, there is an IAP-enabled RDP client available <a href="https://github.com/GoogleCloudPlatform/iap-desktop">here</a>.</p> <h1 id="accessing-gcve-resources-from-the-bastion-host">Accessing GCVE Resources from the Bastion Host</h1> <p>Open the GCVE portal, browse to <code class="language-plaintext highlighter-rouge">Resources</code>, and click on your SDDC, then <code class="language-plaintext highlighter-rouge">vSphere Management Network</code>. This will display the hostnames for your vCenter, NSX and HCX instances. Copy the hostname for vCenter and paste it into a browser in your bastion host to verify you can access your SDDC.</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2021/03/21_cloud_dns_forwarding_edited.png" alt="" class="drop-shadow" /> <em>Cloud DNS forwarding config to enable resolution of GCVE resources</em></p> <p>Access to GCVE from your VPC is made possible by private service access and a DNS forwarding configuration in Cloud DNS. The DNS forwarding configuration enables name resolution from your VPC for resources in GCVE. It is automatically created in Cloud DNS when private service access is configured between your VPC and GCVE. This is a relatively new feature and a nice improvement. 
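Since the forwarding zone is created automatically, there is nothing for you to configure here, but for illustration, a Cloud DNS private forwarding zone like this can also be described in Terraform. This is a hypothetical sketch — the zone name, domain, network path, and resolver address below are placeholders, not values taken from GCVE:

```tf
# Hypothetical sketch of a Cloud DNS private forwarding zone. GCVE creates
# the real zone automatically when private service access is configured;
# the dns_name, network_url, and resolver IP here are placeholders.
resource "google_dns_managed_zone" "gcve_forwarding" {
  name       = "gcve-forwarding"
  dns_name   = "example.gve.goog."
  visibility = "private"

  private_visibility_config {
    networks {
      network_url = "projects/your-gcp-project/global/networks/gcve-usw2"
    }
  }

  forwarding_config {
    target_name_servers {
      ipv4_address = "192.168.80.2" # placeholder GCVE DNS resolver
    }
  }
}
```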
Previously, name resolution for GCVE required manually changing resolvers on your bastion host or configuring a standalone DNS server.</p> <h1 id="wrap-up">Wrap Up</h1> <p>A quick recap of everything we’ve accomplished if you’ve been following this blog series from the beginning:</p> <ul> <li>Deployed an SDDC in GCVE</li> <li>Created a new VPC and configured private service access to your SDDC</li> <li>Deployed a bastion host in your VPC, accessible via IAP</li> </ul> <p>Clearly, we are just getting started! My next post will look at configuring Cloud Interconnect and standing up an HCX service mesh. With that in place, we can begin migrating some workloads into our SDDC.</p> <h1 id="terraform-documentation-links">Terraform Documentation Links</h1> <ul> <li><a href="https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/provider_reference">Google Provider Configuration Reference</a></li> <li><a href="https://registry.terraform.io/providers/hashicorp/google/latest/docs/data-sources/compute_network">google_compute_network Data Source</a></li> <li><a href="https://registry.terraform.io/providers/hashicorp/google/latest/docs/data-sources/compute_subnetwork">google_compute_subnetwork Data Source</a></li> <li><a href="https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/google_service_account">google_service_account Resource</a></li> <li><a href="https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_instance">google_compute_instance Resource</a></li> <li><a href="https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall">google_compute_firewall Resource</a></li> <li><a href="https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/iap_tunnel_instance_iam">google_iap_tunnel_instance_iam_binding Resource</a></li> <li><a 
href="https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/google_service_account_iam">google_service_account_iam_binding Resource</a></li> <li><a href="https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/google_project_iam">google_project_iam_member Resource</a></li> <li><a href="https://registry.terraform.io/providers/hashicorp/time/latest/docs/resources/sleep">time_sleep Resource</a></li> <li><a href="https://registry.terraform.io/providers/hashicorp/external/latest/docs/data-sources/data_source">external Data Source</a></li> </ul> Wed, 03 Mar 2021 00:00:00 +0000 http://www.networkbrouhaha.com/2021/03/gcve-bastion/ http://www.networkbrouhaha.com/2021/03/gcve-bastion/ Intro to Google Cloud VMware Engine – Connecting a VPC to GCVE <p>My <a href="/2021/02/gcve-sddc-with-hcx/">previous post</a> walked through deploying an SDDC in Google Cloud VMware Engine (GCVE). This post will show the process of connecting a VPC to your GCVE environment, and we will use Terraform to do the vast majority of the work. The diagram below shows the basic concept of what I will be covering in this post. Once connected, you will be able to communicate from your VPC to your SDDC and vice versa. 
If you would like to complete this process using the cloud console instead of Terraform, see <a href="https://cloud.google.com/vmware-engine/docs/networking/howto-setup-private-service-access">Setting up private service access</a> in the VMware Engine documentation.</p> <p><strong>Other posts in this series:</strong></p> <ul> <li><a href="/2021/02/gcve-sddc-with-hcx/">Deploying a GCVE SDDC with HCX</a></li> <li><a href="/2021/03/gcve-bastion/">Bastion Host Access with IAP</a></li> <li><a href="/2021/03/gcve-network-overview/">Network and Connectivity Overview</a></li> <li><a href="/2021/04/gcve-hcx-config/">HCX Configuration</a></li> <li><a href="/2021/05/gcve-networking-scenarios/">Common Networking Scenarios</a></li> </ul> <p class="center"><a href="/resources/2021/02/gcve-vpc-peeing.png" class="drop-shadow"><img src="/resources/2021/02/gcve-vpc-peeing.png" alt="" /></a></p> <p>I’m assuming you have a working SDDC deployed in VMware Engine and some basic knowledge of how Terraform works so you can use the provided Terraform examples. If you have not yet deployed an SDDC, please do so before continuing. If you need to get up to speed with Terraform, browse over to <a href="https://learn.hashicorp.com/terraform">https://learn.hashicorp.com/terraform</a>. All of the code referenced in this post will be available at <a href="https://github.com/shamsway/gcp-terraform-examples">https://github.com/shamsway/gcp-terraform-examples</a> in the <code class="language-plaintext highlighter-rouge">gcve-network</code> sub-directory. 
You will need to have git installed to clone the repo, and I highly recommend using <a href="https://github.com/microsoft/vscode">Visual Studio Code</a> with the Terraform add-on installed to view the files.</p> <h1 id="private-service-access-overview">Private Service Access Overview</h1> <p>GCVE SDDCs can establish connectivity to native GCP services with <a href="https://cloud.google.com/vpc/docs/private-services-access">private services access</a>. This feature can be used to establish connectivity from a VPC to a third-party “service producer,” but in this case, it will simply plumb connectivity between our VPC and SDDC. Configuring private services access requires allocating one or more reserved ranges that cannot be used in your local VPC network. In this case, we will supply the ranges that we have allocated for our VMware Engine SDDC networks. Doing this prevents issues with overlapping IP ranges.</p> <h1 id="leveraging-terraform-for-configuration">Leveraging Terraform for Configuration</h1> <p>I have provided Terraform code that will do the following:</p> <ul> <li>Create a VPC network</li> <li>Create a subnet in the new VPC network that will be used to communicate with GCVE</li> <li>Create two Global Address pools that will be used to reserve addresses used in GCVE</li> <li>Create a private connection in the new VPC, using the two Global Address pools as reserved ranges</li> <li>Enable import and export of custom routes for the VPC</li> </ul> <p>After Terraform completes configuration, you will be able to establish peering with the new VPC in GCVE. To get started, clone the example repo with <code class="language-plaintext highlighter-rouge">git clone https://github.com/shamsway/gcp-terraform-examples.git</code>, then change to the <code class="language-plaintext highlighter-rouge">gcve-network</code> sub-directory. 
You will find these files:</p> <ul> <li><code class="language-plaintext highlighter-rouge">main.tf</code> – Contains the primary Terraform code to complete the steps mentioned above</li> <li><code class="language-plaintext highlighter-rouge">variables.tf</code> – Defines the input variables that will be used in <code class="language-plaintext highlighter-rouge">main.tf</code></li> <li><code class="language-plaintext highlighter-rouge">terraform.tfvars</code> – Supplies values for the input variables defined in <code class="language-plaintext highlighter-rouge">variables.tf</code></li> </ul> <p>Let’s take a look at what is happening in <code class="language-plaintext highlighter-rouge">main.tf</code>, then we will supply the necessary variables in <code class="language-plaintext highlighter-rouge">terraform.tfvars</code> and run Terraform. You will see <code class="language-plaintext highlighter-rouge">var.[name]</code> appear over and over in the code, as this is how Terraform references variables. You may think it would be easier to place the desired values directly into <code class="language-plaintext highlighter-rouge">main.tf</code> instead of defining and supplying variables, but it is worth the time to get used to using variables with Terraform. 
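To make the relationship between these three files concrete, here is a minimal sketch of how one value (the <code class="language-plaintext highlighter-rouge">region</code> variable used in this example) flows through them:

```tf
# variables.tf — declares the variable, optionally with a default
variable "region" {
  description = "Region where resources will be created"
  type        = string
  default     = "us-west2"
}

# terraform.tfvars — supplies a value for your environment:
#   region = "us-west2"

# main.tf — references the value wherever it is needed:
#   region = var.region
```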
Hardcoding values in your code is rarely a good idea, and most Terraform code that I have consumed from other authors uses variables heavily.</p> <h2 id="maintf-contents">main.tf Contents</h2> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">provider</span> <span class="s2">"google"</span> <span class="p">{</span> <span class="nx">project</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">project</span> <span class="nx">region</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">region</span> <span class="nx">zone</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">zone</span> <span class="p">}</span> </code></pre></div></div> <p>The file begins with a provider block, which is common in Terraform. This block defines the Google Cloud project, region, and zone in which Terraform will create resources. The values used are specified in <code class="language-plaintext highlighter-rouge">terraform.tfvars</code>, which is the same method we will use throughout this example.</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">resource</span> <span class="s2">"google_compute_network"</span> <span class="s2">"vpc_network"</span> <span class="p">{</span> <span class="nx">name</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">network_name</span> <span class="nx">description</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">network_descr</span> <span class="nx">auto_create_subnetworks</span> <span class="p">=</span> <span class="kc">false</span> <span class="p">}</span> </code></pre></div></div> <p>The first resource block creates a new VPC in the project specified in the provider block. 
Setting <code class="language-plaintext highlighter-rouge">auto_create_subnetworks</code> to <code class="language-plaintext highlighter-rouge">false</code> specifies that we want a custom VPC instead of auto-creating subnets for each region.</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">resource</span> <span class="s2">"google_compute_subnetwork"</span> <span class="s2">"vpc_subnet"</span> <span class="p">{</span> <span class="nx">name</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">subnet_name</span> <span class="nx">ip_cidr_range</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">subnet_cidr</span> <span class="nx">region</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">region</span> <span class="nx">network</span> <span class="p">=</span> <span class="nx">google_compute_network</span><span class="p">.</span><span class="nx">vpc_network</span><span class="p">.</span><span class="nx">id</span> <span class="p">}</span> </code></pre></div></div> <p>The next block creates a subnet in the newly created VPC. 
Notice that the last line references <code class="language-plaintext highlighter-rouge">google_compute_network.vpc_network.id</code> for the network value, meaning that it uses the ID value of the VPC created by Terraform.</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">resource</span> <span class="s2">"google_compute_global_address"</span> <span class="s2">"private_ip_alloc_1"</span> <span class="p">{</span> <span class="nx">name</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">reserved1_name</span> <span class="nx">address</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">reserved1_address</span> <span class="nx">purpose</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">address_purpose</span> <span class="nx">address_type</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">address_type</span> <span class="nx">prefix_length</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">reserved1_address_prefix_length</span> <span class="nx">network</span> <span class="p">=</span> <span class="nx">google_compute_network</span><span class="p">.</span><span class="nx">vpc_network</span><span class="p">.</span><span class="nx">id</span> <span class="p">}</span> </code></pre></div></div> <p>This block and the following block (<code class="language-plaintext highlighter-rouge">google_compute_global_address.private_ip_alloc_2</code>) create a private IP allocation used for the private services configuration.</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">resource</span> <span class="s2">"google_service_networking_connection"</span> <span class="s2">"gcve-psa"</span> <span class="p">{</span> <span 
class="nx">network</span> <span class="p">=</span> <span class="nx">google_compute_network</span><span class="p">.</span><span class="nx">vpc_network</span><span class="p">.</span><span class="nx">id</span> <span class="nx">service</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">service</span> <span class="nx">reserved_peering_ranges</span> <span class="p">=</span> <span class="p">[</span><span class="nx">google_compute_global_address</span><span class="p">.</span><span class="nx">private_ip_alloc_1</span><span class="p">.</span><span class="nx">name</span><span class="p">,</span> <span class="nx">google_compute_global_address</span><span class="p">.</span><span class="nx">private_ip_alloc_2</span><span class="p">.</span><span class="nx">name</span><span class="p">]</span> <span class="nx">depends_on</span> <span class="p">=</span> <span class="p">[</span><span class="nx">google_compute_network</span><span class="p">.</span><span class="nx">vpc_network</span><span class="p">]</span> <span class="p">}</span> </code></pre></div></div> <p>These last two blocks are where things get interesting. The block above configures the private services connection using the VPC network and private IP allocation created by Terraform. <code class="language-plaintext highlighter-rouge">Service</code> is a specific string, <code class="language-plaintext highlighter-rouge">servicenetworking.googleapis.com</code>, since Google is the service provider in this scenario. This value is set in <code class="language-plaintext highlighter-rouge">terraform.tfvars</code>, as we will see in a moment. 
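The second allocation block is omitted above, but judging from the <code class="language-plaintext highlighter-rouge">reserved2_*</code> variables defined in <code class="language-plaintext highlighter-rouge">terraform.tfvars</code>, it presumably mirrors the first:

```tf
# Presumed shape of the second reserved range, mirroring private_ip_alloc_1
# with the reserved2_* variables from terraform.tfvars.
resource "google_compute_global_address" "private_ip_alloc_2" {
  name          = var.reserved2_name
  address       = var.reserved2_address
  purpose       = var.address_purpose
  address_type  = var.address_type
  prefix_length = var.reserved2_address_prefix_length
  network       = google_compute_network.vpc_network.id
}
```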
If you find this confusing, check the available documentation for this resource, and it should help you to understand it.</p> <div class="language-tf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">resource</span> <span class="s2">"google_compute_network_peering_routes_config"</span> <span class="s2">"peering_routes"</span> <span class="p">{</span> <span class="nx">peering</span> <span class="p">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">peering</span> <span class="nx">network</span> <span class="p">=</span> <span class="nx">google_compute_network</span><span class="p">.</span><span class="nx">vpc_network</span><span class="p">.</span><span class="nx">name</span> <span class="nx">import_custom_routes</span> <span class="p">=</span> <span class="kc">true</span> <span class="nx">export_custom_routes</span> <span class="p">=</span> <span class="kc">true</span> <span class="nx">depends_on</span> <span class="p">=</span> <span class="p">[</span><span class="nx">google_service_networking_connection</span><span class="p">.</span><span class="nx">gcve</span><span class="err">-</span><span class="nx">psa</span><span class="p">]</span> <span class="p">}</span> </code></pre></div></div> <p>The final block enables the import and export of custom routes for our VPC peering configuration.</p> <p>Note that the final two blocks contain an argument that none of the others do: <code class="language-plaintext highlighter-rouge">depends_on</code>. The Terraform documentation describes <code class="language-plaintext highlighter-rouge">depends_on</code> in-depth <a href="https://www.terraform.io/docs/language/meta-arguments/depends_on.html">here</a>, but basically, this is a hint for Terraform to describe resources that rely on each other. Typically, Terraform can determine this automatically, but there are occasional cases where this statement needs to be used. 
Running <code class="language-plaintext highlighter-rouge">terraform destroy</code> without this argument in place may lead to errors, as Terraform could delete the VPC before removing the private services connection or route peering configuration.</p> <h2 id="terraformtfvars-contents">terraform.tfvars Contents</h2> <p><code class="language-plaintext highlighter-rouge">terraform.tfvars</code> is the file that defines all the variables that are referenced in <code class="language-plaintext highlighter-rouge">main.tf</code>. All you need to do is supply the desired values for your environment, and you are good to go. Note that the variables below are all examples, so simply copying and pasting may not lead to the desired result.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">project</span> <span class="o">=</span> <span class="s">"your-gcp-project"</span> <span class="n">region</span> <span class="o">=</span> <span class="s">"us-west2"</span> <span class="n">zone</span> <span class="o">=</span> <span class="s">"us-west2-a"</span> <span class="n">network_name</span> <span class="o">=</span> <span class="s">"gcve-usw2"</span> <span class="n">network_descr</span> <span class="o">=</span> <span class="s">"Network for testing of GCVE in USW2"</span> <span class="n">subnet_name</span> <span class="o">=</span> <span class="s">"gcve-usw2-mgmt"</span> <span class="n">subnet_cidr</span> <span class="o">=</span> <span class="s">"192.168.82.0/24"</span> <span class="n">reserved1_name</span> <span class="o">=</span> <span class="s">"gcve-managemnt-ip-alloc"</span> <span class="n">reserved1_address</span> <span class="o">=</span> <span class="s">"192.168.80.0"</span> <span class="n">reserved1_address_prefix_length</span> <span class="o">=</span> <span class="mi">23</span> <span class="n">reserved2_name</span> <span class="o">=</span> <span class="s">"gcve-workload-ip-alloc"</span> <span 
class="n">reserved2_address</span> <span class="o">=</span> <span class="s">"192.168.84.0"</span> <span class="n">reserved2_address_prefix_length</span> <span class="o">=</span> <span class="mi">23</span> <span class="n">address_purpose</span> <span class="o">=</span> <span class="s">"VPC_PEERING"</span> <span class="n">address_type</span> <span class="o">=</span> <span class="s">"INTERNAL"</span> <span class="n">service</span> <span class="o">=</span> <span class="s">"servicenetworking.googleapis.com"</span> <span class="n">peering</span> <span class="o">=</span> <span class="s">"servicenetworking-googleapis-com"</span> </code></pre></div></div> <p>Additional information on the variables used is available in <a href="https://github.com/shamsway/gcp-terraform-examples/blob/main/gcve-vpc-peering/README.md">README.md</a>. You can also find information on these variables, including their default values should one exist, in <code class="language-plaintext highlighter-rouge">variables.tf</code>.</p> <h2 id="initializing-and-running-terraform">Initializing and Running Terraform</h2> <p>Terraform will use <a href="https://cloud.google.com/sdk/gcloud/reference/auth/application-default">Application Default Credentials</a> to authenticate to Google Cloud. Assuming you have the <code class="language-plaintext highlighter-rouge">gcloud</code> CLI tool installed, you can set these by running <code class="language-plaintext highlighter-rouge">gcloud auth application-default login</code>. Additional information on authentication can be found in the <a href="https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/getting_started">Getting Started with the Google Provider</a> Terraform documentation. 
To run the Terraform code, follow the steps below.</p> <p><strong>Following these steps will create resources in your Google Cloud project, and you will be billed for them.</strong></p> <ol> <li>Run <code class="language-plaintext highlighter-rouge">terraform init</code> and ensure no errors are displayed</li> <li>Run <code class="language-plaintext highlighter-rouge">terraform plan</code> and review the changes that Terraform will perform</li> <li>Run <code class="language-plaintext highlighter-rouge">terraform apply</code> to apply the proposed configuration changes</li> </ol> <p>Should you wish to remove everything created by Terraform, run <code class="language-plaintext highlighter-rouge">terraform destroy</code> and answer <code class="language-plaintext highlighter-rouge">yes</code> when prompted. This will only remove the VPC network and related configuration created by Terraform. Your GCVE environment will have to be deleted using <a href="https://cloud.google.com/vmware-engine/docs/private-clouds/howto-delete-private-cloud">these instructions</a>, if desired.</p> <h1 id="review-vpc-configuration">Review VPC Configuration</h1> <p>Once <code class="language-plaintext highlighter-rouge">terraform apply</code> completes, you can see the results in the <a href="https://console.cloud.google.com/">Google Cloud Console</a>.</p> <p class="center"><a href="/resources/2021/02/network_allocated_ips_edited.png" class="drop-shadow"><img src="/resources/2021/02/network_allocated_ips_edited.png" alt="" /></a></p> <p>IP ranges allocated for use in GCVE are reserved.</p> <p class="center"><a href="/resources/2021/02/network_service_connection_edited.png" class="drop-shadow"><img src="/resources/2021/02/network_service_connection_edited.png" alt="" /></a></p> <p>Private service access is configured.</p> <p class="center"><a href="/resources/2021/02/network_peering_edited.png" class="drop-shadow"><img src="/resources/2021/02/network_peering_edited.png" alt="" /></a></p> 
<p>Import and export of custom routes on the <code class="language-plaintext highlighter-rouge">servicenetworking-googleapis-com</code> private connection is enabled.</p> <h1 id="complete-peering-in-gcve">Complete Peering in GCVE</h1> <p>The final step is to create the private connection in the VMware Engine portal. You will need the following information to configure the private connection.</p> <ul> <li>Project ID (found under <code class="language-plaintext highlighter-rouge">Project info</code> on the console dashboard.) <code class="language-plaintext highlighter-rouge">Project ID</code> may be different from <code class="language-plaintext highlighter-rouge">Project Name</code>, so verify you are gathering the correct information.</li> <li>Project Number (also found under <code class="language-plaintext highlighter-rouge">Project info</code> on the console dashboard.)</li> <li>Name of the VPC (<code class="language-plaintext highlighter-rouge">network_name</code> in your <code class="language-plaintext highlighter-rouge">variables.tf</code> file.)</li> <li>Peered project ID from VPC Network Peering screen</li> </ul> <p>Save all of these values somewhere handy, and follow these steps to complete peering:</p> <p class="center"><a href="/resources/2021/02/15b_add_private_connection_edited.png" class="drop-shadow"><img src="/resources/2021/02/15b_add_private_connection_edited.png" alt="" /></a></p> <ol> <li>Open the VMware Engine portal, and browse to <code class="language-plaintext highlighter-rouge">Network &gt; Private connection</code>.</li> <li>Click <code class="language-plaintext highlighter-rouge">Add network connection</code> and paste the required values. 
Supply the peered project ID in the <code class="language-plaintext highlighter-rouge">Tenant project ID</code> field, VPC name in the <code class="language-plaintext highlighter-rouge">Peer VPC ID</code> field, and complete the remaining fields.</li> <li>Choose the region your VMware Engine private cloud is deployed in, and click <code class="language-plaintext highlighter-rouge">submit</code>.</li> </ol> <p class="center"><a href="/resources/2021/02/16_add_private_connection_edited.png" class="drop-shadow"><img src="/resources/2021/02/16_add_private_connection_edited.png" alt="" /></a></p> <p>After a few moments, <code class="language-plaintext highlighter-rouge">Region Status</code> should show a status of <code class="language-plaintext highlighter-rouge">Connected</code>. Your VMware Engine private cloud is now peered with your Google Cloud VPC. You can verify peering is working by checking the routing table of your VPC.</p> <h1 id="verify-vpc-routing-table">Verify VPC Routing Table</h1> <p>Once peering is completed, you should see routes for networks in your GCVE SDDC in your VPC routing table. You can view these routes in the cloud console or with:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> gcloud compute networks peerings list-routes servicenetworking-googleapis-com --network=[VPC Name] --region=[Region name] --direction=incoming </code></pre></div></div> <p class="center"><a href="/resources/2021/02/19_gcloud_routes_output.png" class="drop-shadow"><img src="/resources/2021/02/19_gcloud_routes_output.png" alt="" /></a> Verifying routes with the gcloud cli</p> <p class="center"><a href="/resources/2021/02/17_peering_imported_routes_edited.png" class="drop-shadow"><img src="/resources/2021/02/17_peering_imported_routes_edited.png" alt="" /></a> Viewing routes in the console</p> <h1 id="wrap-up">Wrap Up</h1> <p>Well, that was fun! 
You should now have established connectivity between your VMware Engine SDDC and your Google Cloud VPC, but we are only getting started. My next post will cover creating a bastion host in GCP to manage your GCVE environment, and I may take a look at Cloud DNS as well.</p> <p>This post comes at a good time, as Google has just announced <a href="https://cloud.google.com/blog/products/compute/whats-new-in-google-cloud-vmware-engine-in-february-2021">several enhancements to GCVE</a>, including multiple VPC peering. I’m planning on exploring these enhancements in future posts.</p> <h1 id="terraform-documentation-links">Terraform Documentation Links</h1> <ul> <li><a href="https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/provider_reference">Google Provider Configuration Reference</a></li> <li><a href="https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_network">google_compute_network Resource</a></li> <li><a href="https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_subnetwork">google_compute_subnetwork Resource</a></li> <li><a href="https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_global_address">google_compute_global_address Resource</a></li> <li><a href="https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/service_networking_connection">google_service_networking_connection Resource</a></li> <li><a href="https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_network_peering_routes_config">google_compute_network_peering_routes_config Resource</a></li> </ul> Fri, 19 Feb 2021 00:00:00 +0000 http://www.networkbrouhaha.com/2021/02/gcp-vpc-to-gcve/ http://www.networkbrouhaha.com/2021/02/gcp-vpc-to-gcve/ Intro to Google Cloud VMware Engine - Deploying a GCVE SDDC with HCX <p>Welcome to the first post in a new series focusing on <a href="https://cloud.google.com/vmware-engine">Google Cloud 
VMware Engine</a> (GCVE)! This first post will walk through prerequisites, deploying an SDDC with VMware HCX, and accessing vCenter via VPN Gateway (i.e., OpenVPN).</p> <p>Before we dive into deploying an SDDC, I want to set expectations for this blog series. My goal when working in the cloud is to create, modify and destroy resources programmatically. My tool of choice is <a href="https://www.terraform.io/">Terraform</a>, but I will also use CLI-based tools like <a href="https://cloud.google.com/sdk/gcloud">gcloud</a>. Occasionally I will inspect API calls directly and make them with Python or <a href="https://github.com/curl/curl">cURL</a>. I have found that learning a product’s API is an excellent way to master it. Cloud consoles (GUIs) are adequate when getting started, but interfacing with the API, whether through Terraform or an SDK, is how these platforms are designed to work.</p> <p>This first post will be different from the others because the GCVE API documentation is not yet public, nor is there any Terraform functionality available to create or destroy GCVE resources. API documentation and Terraform for GCVE are coming, so when they are available, I will certainly blog about it! For now, I will walk through the GCVE GUI to detail SDDC and VPN gateway creation.
Have no fear – there will be plenty of Terraform in future posts.</p> <p><strong>Other posts in this series:</strong></p> <ul> <li><a href="/2021/02/gcp-vpc-to-gcve/">Connecting a VPC to GCVE</a></li> <li><a href="/2021/03/gcve-bastion/">Bastion Host Access with IAP</a></li> <li><a href="/2021/03/gcve-network-overview/">Network and Connectivity Overview</a></li> <li><a href="/2021/04/gcve-hcx-config/">HCX Configuration</a></li> <li><a href="/2021/05/gcve-networking-scenarios/">Common Networking Scenarios</a></li> </ul> <h1 id="prerequisites-for-creating-a-gcve-sddc">Prerequisites for Creating a GCVE SDDC</h1> <p>If you’ve read any of my previous blog posts on cloud networking, you will already know that the most important thing to do before deploying anything into the cloud is rigorous planning. Deploying an SDDC in GCVE is no different. You will need to designate several <em>unique</em> IP ranges to be used for SDDC infrastructure and workloads, ensure the proper firewall ports are allowed to manage your SDDC, and prepare your GCP environment before deploying an SDDC. All of these prerequisites are detailed in the <a href="https://cloud.google.com/vmware-engine/docs/quickstart-prerequisites">GCVE prerequisites documentation</a>, which I highly recommend reading. Google’s documentation is thorough, and there is nothing better than reading through all of the docs if you want to understand how this solution works. Here is an overview of the required steps:</p> <ul> <li>Plan the IP ranges you will use with GCVE. These are all <a href="https://en.wikipedia.org/wiki/Private_network">RFC 1918 private addresses</a>. You will need ranges for each of the following: <ul> <li><strong>vSphere and vSAN</strong> (/21 - /24 accepted). Depending on the size of the range you choose, it will be divided into additional subnets for management, vMotion, vSAN, and NSX. 
Details on the layout for these subnets are available <a href="https://cloud.google.com/vmware-engine/docs/concepts-vlans-subnets#management_network_cidr_range_breakdown">here</a>.</li> <li><strong>HCX</strong> (/27 or higher)</li> <li><strong>Edge Services</strong>, required for client VPN and internet access (/26)</li> <li><strong>Client subnet</strong>, assigned to clients connecting via VPN Gateway (/24)</li> <li><strong>Workload subnets</strong>, which will be configured in NSX-T after your SDDC is deployed. These are entirely up to you to determine, but my advice is to reserve plenty of IPs to use.</li> </ul> </li> <li>Ensure your local firewall is configured for communication with vCenter and workload VMs. Ports used for communication are documented in the <a href="https://cloud.google.com/vmware-engine/docs/quickstart-prerequisites#firewall-port-requirements">prerequisites</a>.</li> <li>Enable the VMware Engine API in your Google Cloud Project</li> <li>Enable the VMware Engine <a href="https://cloud.google.com/vmware-engine/quotas">node quota</a></li> </ul> <p>Once these are completed, you are ready to create your SDDC!</p> <h1 id="creating-a-gcve-sddc">Creating a GCVE SDDC</h1> <p>To create a GCVE SDDC, browse to <code class="language-plaintext highlighter-rouge">Compute &gt; VMware Engine</code> in the GCP Console. This will bring you to the GCVE homepage.</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2021/02/01_create_private_cloud_edited.png" alt="" class="drop-shadow" /></p> <p>Click <code class="language-plaintext highlighter-rouge">Create a Private Cloud</code> to get started.</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2021/02/02_create_private_cloud.png" alt="" class="drop-shadow" /></p> <p>Specify your cloud name, location, node count, and predetermined network ranges. If you cannot choose your desired region, ensure you have requested VMware Engine nodes quota for that region. 
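Before you type in those predetermined network ranges, it is worth double-checking the planning work above programmatically. Here is a minimal Python sketch using the standard library `ipaddress` module; the ranges shown are invented examples, not recommendations:

```python
import ipaddress

# Hypothetical ranges earmarked during planning (substitute your own)
ranges = {
    "vsphere_vsan": "192.168.0.0/21",
    "hcx": "192.168.8.0/27",
    "edge_services": "192.168.9.0/26",
    "client_subnet": "192.168.10.0/24",
    "workloads": "10.10.0.0/16",
}

nets = {name: ipaddress.ip_network(cidr) for name, cidr in ranges.items()}

# Every range must be RFC 1918 private address space
for name, net in nets.items():
    assert net.is_private, f"{name} is not RFC 1918"

# No two ranges may overlap each other
names = list(nets)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        assert not nets[a].overlaps(nets[b]), f"{a} overlaps {b}"

print("All ranges are private and non-overlapping")
```

A few seconds of validation here can save a redeploy later, since the vSphere/vSAN range cannot be changed after the private cloud is created.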
Your quota will also determine how many nodes you can request. The minimum node count is three nodes. After clicking <code class="language-plaintext highlighter-rouge">Review and Create</code>, you will be shown a confirmation page. Review your choices and click <code class="language-plaintext highlighter-rouge">Create</code>.</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2021/02/04_create_private_cloud_edited.png" alt="" class="drop-shadow" /></p> <p>You will be taken to a summary page for your new cluster once provisioning begins. Note that the state is <code class="language-plaintext highlighter-rouge">Provisioning</code> in the screenshot above, and it will take between 30 minutes and 2 hours to complete. My experience has been that it takes just over 30 minutes to provision an SDDC, which is pretty impressive. You can click on the <code class="language-plaintext highlighter-rouge">Activity</code> tab to view recent events, tasks, and alerts. Drilling into those will provide specifics on any activity in your SDDC, including the provisioning process.
To do this, browse to <code class="language-plaintext highlighter-rouge">Network &gt; Regional</code> settings in the GCVE portal, and click <code class="language-plaintext highlighter-rouge">Add Region</code>.</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2021/02/06_region_edge_services.png" alt="" class="drop-shadow" /></p> <p>Choose the region where your SDDC is deployed and enable <code class="language-plaintext highlighter-rouge">Internet Access</code> and <code class="language-plaintext highlighter-rouge">Public IP Service</code>. Supply the Edge Services range you earmarked during planning and click <code class="language-plaintext highlighter-rouge">Submit</code>. Enabling these services will take 10-15 minutes. Once complete, they will show as <code class="language-plaintext highlighter-rouge">Enabled</code> on the Regional Settings page. Enabling these settings will allow Public IPs to be allocated to your SDDC, which is a requirement for deploying a VPN Gateway. 
To begin the deployment, browse to <code class="language-plaintext highlighter-rouge">Network &gt; VPN Gateways</code> and click <code class="language-plaintext highlighter-rouge">Create New VPN Gateway</code>.</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2021/02/08_create_vpn_gw.png" alt="" class="drop-shadow" /></p> <p>Supply the name for the VPN gateway and the client subnet reserved during planning and click <code class="language-plaintext highlighter-rouge">Next</code>.</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2021/02/09_create_vpn_gw_edited.png" alt="" class="drop-shadow" /></p> <p>Choose specific users to grant VPN access, or enable <code class="language-plaintext highlighter-rouge">Automatically add all users</code>, and click <code class="language-plaintext highlighter-rouge">Next</code>.</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2021/02/10_create_vpn_gw.png" alt="" class="drop-shadow" /></p> <p>Next, specify which networks to make accessible over VPN. I opted to add all subnets automatically. Click <code class="language-plaintext highlighter-rouge">Next</code>, and a summary screen will be displayed. Verify your choice and click <code class="language-plaintext highlighter-rouge">Submit</code> to create the VPN Gateway.</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2021/02/12_create_vpn_gw_edited.png" alt="" class="drop-shadow" /></p> <p>You will be returned to the VPN Gateways page, and the new VPN gateway will have a status of <code class="language-plaintext highlighter-rouge">Creating</code>. 
Once the status shows as <code class="language-plaintext highlighter-rouge">Operational</code>, click on the new VPN gateway.</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2021/02/13_create_vpn_gw_edited.png" alt="" class="drop-shadow" /></p> <p>Click <code class="language-plaintext highlighter-rouge">Download my VPN configuration</code> to download a ZIP file containing pre-configured OpenVPN profiles for the VPN gateway. Profiles for connecting via UDP/1194 and TCP/443 are available. Choose whichever is your preference and import it into OpenVPN, then connect. In the GCVE portal, browse to <code class="language-plaintext highlighter-rouge">Resources</code> and click on your SDDC.</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2021/02/13_launch_vsphere_edited.png" alt="" class="drop-shadow" /></p> <p>Finally, you can click <code class="language-plaintext highlighter-rouge">Launch vSphere Client</code>. Log in with username <code class="language-plaintext highlighter-rouge">[email protected]</code> and password <code class="language-plaintext highlighter-rouge">VMwareEngine123!</code>. Huzzah! You are now free to explore your newly created SDDC in GCVE. Your first task should be updating the password for the <code class="language-plaintext highlighter-rouge">[email protected]</code> account.</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2021/02/14_launch_vsphere_edited.png" alt="" class="drop-shadow" /></p> <h1 id="wrap-up">Wrap Up</h1> <p>As you can see, deploying an SDDC in GCVE is easier than setting up client VPN access. Now, a standalone SDDC is cool, but in the next post we will look at connecting it to a VPC. This will be almost entirely automated with Terraform, apart from a tiny bit of work that needs to be done in the GCVE portal.
Later posts will cover creating a bastion host, connecting with Cloud VPN and Cloud Interconnect, configuring HCX for workload migration, and all sorts of other use cases. Are you using GCVE? If so, please reach out to me on Twitter (<a href="https://www.twitter.com/networkbrouhaha">@NetworkBrouhaha</a>) and let me know what topics you’d like to see covered.</p> Thu, 04 Feb 2021 00:00:00 +0000 http://www.networkbrouhaha.com/2021/02/gcve-sddc-with-hcx/ http://www.networkbrouhaha.com/2021/02/gcve-sddc-with-hcx/ Cloud Connectivity 202 - Extending Layer 2 Into the Cloud <p class="center"><img src="https://networkbrouhaha.com/resources/2021/01/dragons.jpg" alt="" /></p> <p>In this post, I will talk about extending layer 2 into the cloud, including scenarios for when it is a good idea, and the numerous dangers involved. I let out a good, long sigh after writing that sentence. Beware, there are dragons ahead.</p> <p>If you’ve been around networks long enough, you’ve probably seen a network taken to its knees by a loop. The <a href="https://en.wikipedia.org/wiki/Spanning_Tree_Protocol">Spanning Tree Protocol</a> (STP) exists to prevent this issue, but a simple misconfiguration can prevent STP from doing its job. I’ve personally seen a hospital network taken down for days by a network loop, and I’ve heard many similar stories from other network engineers. A lot of time has been spent coming up with alternatives to STP-based networks, and for good reason. A major part of the CCIE Data Center curriculum I studied was alternatives to STP like TRILL and VXLAN EVPN with MP-BGP. Here’s the point: extending layer 2 can be dangerous, especially when done without precautions in place.</p> <h1 id="why-do-we-need-layer-2">Why Do We Need Layer 2?</h1> <p>The very first commercially available Ethernet standard, <a href="https://en.wikipedia.org/wiki/10BASE5">10Base5</a>, used a coaxial cable as a shared medium.
Multiple devices could be attached to the same cable, and each was identified by its MAC address. Later Ethernet standards kept backward compatibility with this model. Although significant improvements have been made over the years, much of the complexity of layer 2 forwarding semantics stems from this initial design.</p> <p>Devices can be connected to a hub or a switch without any sort of central coordination. Neighbors are discovered by broadcasting ARP requests, which makes building an ad-hoc network very easy. Things get messier when a broadcast packet enters a looped network. There is no <a href="https://en.wikipedia.org/wiki/Time_to_live">time to live</a> (TTL) value associated with a layer 2 frame, so it will be forwarded and reforwarded ad infinitum, resulting in a <a href="https://en.wikipedia.org/wiki/Broadcast_storm">broadcast storm</a>. Before we even discuss extending layer 2, it’s important to know that a network can be completely crushed if STP is improperly configured and a loop is introduced.</p> <h1 id="risks-of-extending-layer-2">Risks Of Extending Layer 2</h1> <p>Before cloud providers were available, extending layer 2 was typically accomplished over a Data Center Interconnect (DCI) that could serve as a trunk, so one or more VLANs could be extended across the link. The risk here is from fate-sharing. Before extending layer 2, any issue would be confined to one site. Once layer 2 is extended, a problem at one site will extend to the other. There are many good reasons why cloud providers use availability zones, and <a href="https://en.wikipedia.org/wiki/Fate-sharing">fate-sharing</a> is one of them. The other risk, of course, is that the larger the layer 2 domain, the more opportunities there are to introduce a loop.</p> <p>Beyond the risks of creating a network outage, extending layer 2 also has some basic disadvantages.
A default gateway can only exist at one site, so routed traffic between hosts at the site without the gateway may be <a href="https://en.wikipedia.org/wiki/Anti-tromboning">tromboned</a> across the DCI. If using an overlay to facilitate layer 2 extension, you need to be mindful of the implications it has for MTU across the link. “Silent hosts”, or hosts that don’t properly respond to ARP requests, can also be problematic in this scenario.</p> <h1 id="reasons-for-extending-layer-2">Reasons For Extending Layer 2</h1> <p>Hopefully, I’ve made a good case for why you should be very careful when extending layer 2. There are some good reasons to do so, but it is never something I would recommend doing for a long period of time. The main use cases I see for extending layer 2 are data center evacuation and migrating to the cloud while preserving assigned IP addresses. These are good use cases for layer 2 extension since the extension will be finite. Once the evacuation or migration is complete, the extension can be removed. I have seen layer 2 extensions used for disaster recovery purposes, and I would urge anyone who wants to do this to exercise extreme caution. Indefinite layer 2 extension is a recipe for trouble.</p> <h1 id="methods-for-extending-layer-2-to-the-cloud">Methods for Extending Layer 2 to the Cloud</h1> <p>In my post on <a href="/2020/11/cloud-connectivity-101/">cloud connectivity</a>, I listed the typical methods for connecting to a cloud provider. You will notice that all of the connections are based on layer 3, apart from a layer 2 VPN, which rides on top of a layer 3 connection. The prior approach of extending VLANs over a circuit simply isn’t an option in the cloud. In the same post, I pointed out that most clouds don’t use traditional layer 2 forwarding in their networks. This certainly presents a problem when trying to extend layer 2 as well!
The best solution available is to use an overlay, like <a href="https://en.wikipedia.org/wiki/Virtual_Extensible_LAN">VXLAN</a> or <a href="https://en.wikipedia.org/wiki/Generic_Networking_Virtualization_Encapsulation">GENEVE</a>, in your cloud of choice. This is where VMware-powered cloud offerings shine since they leverage NSX-T in the cloud-based SDDC. NSX-T uses GENEVE to emulate the needed layer 2 functions, and it provides a layer 2 VPN (L2VPN) appliance to perform the extension. <a href="https://cloud.vmware.com/vmware-hcx">VMware HCX</a> is also compatible with these solutions and provides layer 2 extension for migrated workloads.</p> <p>NSX-T L2VPN and HCX provide guidelines in their documentation that should be carefully studied before deployment. It is also important to know that you cannot use NSX-T L2VPN and HCX at the same time. The HCX documentation states “Virtual machine networks should only be extended with a single solution. For example, HCX Network Extension or NSX L2 VPN can be used to provide connectivity, but both should not be used simultaneously. Using multiple bridging solutions simultaneously can result in a network outage.” Since HCX supports extension to multiple sites, you must deploy the service meshes in a way that does not introduce a loop. See this diagram:</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2021/01/hcx-loop2.png" alt="" /></p> <p>Other solutions exist beyond VMware products, like <a href="https://en.wikipedia.org/wiki/Locator/Identifier_Separation_Protocol">LISP</a>, but I have not personally seen them used. Ultimately you are restricted to solutions supported by your cloud of choice, which are typically delivered in the form of supported third-party network appliances. If you’re aware of another viable method for extending layer 2 into the cloud, please leave a comment or send me a message on Twitter.
I’d love to see what others are using to accomplish this.</p> <h1 id="alternatives-to-extending-layer-2">Alternatives To Extending Layer 2</h1> <p>Imagine a world where there was never a hard-coded IP address anywhere. Addresses are assigned automatically, and DNS is instantly updated whenever an address is assigned or changed. Changing addresses is a non-issue since everything relies on DNS name resolution instead of an IP address. Sound too good to be true? It’s not! This is possible today, and it has been for years. Cloud-native networking works exactly in this manner, but it is possible to accomplish on any network, with some effort.</p> <p>For a variety of reasons, enterprise networks have not operated this way, and it is the primary driver behind the desire to extend layer 2. IPs are frequently hard-coded instead of relying on DHCP for address assignment and DNS for name resolution and service discovery. Running a network is difficult work, and forward-thinking practices like the ones I’m describing aren’t always prioritized. It’s easy to decry this as laziness, or poor planning, but I spent years working in networks similar to what I’ve described. I understand the amount of effort it takes to change the way a network fundamentally works, especially when you’re on a small team managing a huge network. If you never need to migrate a workload to another location, it may not be worth the effort.</p> <p>The primary alternative to extending layer 2 is to use layer 3 routed connectivity. This is exactly how the cloud was designed to work: it’s why there isn’t any real concept of layer 2 in the cloud, and why cloud providers don’t allow you to extend your VLANs onto their network. Unfortunately, this concept is difficult to swallow if you have a network like the one I describe above - heavily reliant on hard-coded addresses in many places.
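The DNS-first habit described above is easy to adopt in application code: resolve a name at connect time instead of baking in an address. A tiny Python sketch (using <code class="language-plaintext highlighter-rouge">localhost</code> as a stand-in for a hypothetical internal service record):

```python
import socket

def resolve(host, port):
    """Look the service up in DNS at connect time instead of hard-coding an IP."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address
    return sorted({info[4][0] for info in infos})

# "localhost" stands in for an internal record like db.example.internal
print(resolve("localhost", 5432))
```

If the service moves to another subnet, site, or cloud, only the DNS record changes; every client that resolves at connect time follows it automatically.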
If this is the case, you will need to work with management to acquire the resources needed to convert to a “cloud friendly” network on premises before moving workloads to the cloud. This may be a small effort or a massive one, depending on the size of your network, but it will position you to be better prepared for the future. The cloud is here to stay, and it’s a wonderful environment to work in once you get the hang of it.</p> <h1 id="wrap-up">Wrap Up</h1> <p>Whether or not it’s a good idea to extend layer 2 is a debate that has been going on for years, and I doubt it will stop any time soon. The guidance I will leave you with is to use layer 3 everywhere you can, and extend layer 2 only if you must. Do everything you can to configure your applications to use DNS instead of hard-coded IPs. Study cloud-native networks and start embracing those concepts in your own network. By doing this, you will be much better prepared for migrating your workloads to the cloud or moving to a hybrid cloud environment.</p> Mon, 11 Jan 2021 00:00:00 +0000 http://www.networkbrouhaha.com/2021/01/cloud-connectivity-202/ http://www.networkbrouhaha.com/2021/01/cloud-connectivity-202/ Cloud Connectivity 201 - Reliable Connectivity <p class="center"><img src="https://networkbrouhaha.com/resources/2020/12/clouds-trees.png" alt="" /></p> <p>My <a href="/2020/11/cloud-connectivity-101/">last post</a> laid out several options for connecting to the cloud. In this post, I’ll dive into reliable connectivity to the cloud. In the real world, circuits drop, transceivers fail, and software bugs cause lockups or unexpected reboots. This is one of the fundamental tenets of working with technology.
If you expect <a href="https://en.wikipedia.org/wiki/High_availability#Percentage_calculation">five nines</a> of uptime, you must plan for component failures and unexpected outages.</p> <p>When it comes to networking, reliability is typically achieved with <a href="https://en.wikipedia.org/wiki/Redundancy_(engineering)">redundancy</a> and <a href="https://en.wikipedia.org/wiki/Resilience_(network)">resiliency</a>. Redundancy means two or more components (e.g. switches, routers, circuits) are in place to tolerate the failure of any single device. Resiliency means that the overall system continues to operate when a component failure occurs. This usually requires some configuration on the part of the operator to ensure an automated switchover occurs, and hopefully testing of every failure scenario to ensure that switchover works as expected. Having two of everything doesn’t provide any benefit if the redundant component doesn’t take over when the primary fails!</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/12/first-rule-government-spending.png" alt="" /></p> <p>The rest of this post will focus on the specifics of reliable cloud connectivity, but it is important to consider that reliability is constructed in stages. I’m assuming that your existing network infrastructure is already reliable. If that is not the case, it won’t matter how awesome your cloud connectivity is. You must build upon a stable foundation.</p> <h1 id="software-defined-wan-sd-wan">Software Defined WAN (SD-WAN)</h1> <p>In my last post, I said that SD-WAN “may be the most exciting advancement in the world of networking in the past decade.” I’m referring to SD-WAN in general terms since there is a lot of vendor-specific secret sauce baked into the various offerings. With that in mind, here are the things that SD-WAN gets right when it comes to reliability:</p> <ul> <li>Redundancy is <em>assumed</em>. 
Reference architectures for SD-WAN assume there are at least two paths available for connectivity. This may be as simple as a primary internet circuit and an LTE connection for backup.</li> <li>Failover is <em>automatic</em>. SD-WAN devices constantly verify connectivity between the participating edge devices, as well as the health of that connection. When a connection goes down, traffic is automatically moved to another available connection. Even more impressively, if a connection is experiencing packet loss or high latency, traffic can be migrated to prevent a performance hit. With a traditional solution, this is a difficult problem to detect unless you are on top of your monitoring game.</li> </ul> <p>SD-WAN is still gaining traction, but the technology is exciting. There are several other benefits in terms of security, manageability, monitoring, and automation. If you’re building a new solution, or looking to replace aging edge hardware, SD-WAN deserves a hard look. You can build a fast, fault tolerant connection to the cloud, and even replace legacy WAN technologies like MPLS. The most important thing to consider with cloud connectivity via SD-WAN is that your chosen vendor has supported software appliances in your cloud(s) of choice. Read through their cloud reference architecture to ensure they meet your requirements. Cloud-based SD-WAN appliances are software-based, so they will have bandwidth limitations that should be taken into account as well.</p> <h1 id="dynamic-routing-and-automated-failover">Dynamic Routing and Automated Failover</h1> <p>If you’re not in a position to roll out SD-WAN, you will need to use the tried-and-true networking technologies that have been around for decades. Dynamic routing is the primary tool used to provide reliable network connectivity and facilitate traffic failover during an outage. 
There are several dynamic routing protocols that can achieve this, but when it comes to cloud connectivity you will be using <a href="https://en.wikipedia.org/wiki/Border_Gateway_Protocol">BGP</a>. Why not <a href="https://en.wikipedia.org/wiki/Open_Shortest_Path_First">OSPF</a> or <a href="https://en.wikipedia.org/wiki/Enhanced_Interior_Gateway_Routing_Protocol">EIGRP</a>? These protocols rely on IP broadcast or multicast to find neighbor routers and form peering adjacencies, and are generally intended for use within a LAN. BGP peering is established using <a href="https://en.wikipedia.org/wiki/Transmission_Control_Protocol">TCP</a>, and it was designed to operate over a WAN or the Internet. It is, in fact, the duct tape and baling wire that holds the internet together!</p> <p>BGP requires a few bits of information to get working. First, you will need to enable the BGP process on your router and assign an Autonomous System Number (ASN), which will be explained in the next section. Next, you will define which networks will be advertised by BGP. The final step is to configure a neighbor router to peer with, along with the ASN of that router. Once the peering relationship is formed, your router receives all of the known BGP routes from the neighbor.</p> <p>In some cases, BGP relies on another routing protocol to be able to operate. When a router tries to establish a peering relationship, the TCP packets it sends need to be able to make it to the intended neighbor router. When using BGP to connect to the cloud, the routers involved are typically assigned addresses from within a /30 (or /31) subnet. This means no routing is required for the two devices to communicate – they are both connected to the same subnet and can communicate directly. 
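Putting the earlier steps together (enable the BGP process with an ASN, define the networks to advertise, configure a neighbor across the /30), a minimal generic Cisco-style sketch might look like this; the ASN, prefix, and peer address are purely illustrative:

```
router bgp 65010
 ! Advertise the on-prem prefix to the cloud peer
 network 10.1.0.0 mask 255.255.255.0
 ! Peer across the /30 with the cloud-side router (GCP, for example, peers from ASN 16550)
 neighbor 169.254.100.2 remote-as 16550
```

The exact syntax varies by vendor, but every implementation needs these same three pieces of information.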
Whether this communication happens over an IPsec tunnel or a point-to-point circuit doesn’t really matter, as long as the two devices can talk to each other.</p> <p>The magic happens when BGP peering is established over multiple paths to the same destination, which is what provides both redundancy and resiliency. We’ll get into the details in a moment, but this is the way we can automatically fail over if one link goes down. BGP will see the destination subnets advertised from multiple neighbors, and it will pick which path is the “best” based on an algorithm. If a circuit goes down and the neighbor is no longer reachable, BGP will choose the next best path, and traffic will be forwarded accordingly. These paths can be the same type of connection, or different. Here are a few examples:</p> <ul> <li>Redundant, route-based IPsec VPNs. Route-based VPNs allow BGP peering across them, providing a failover method if one tunnel fails.</li> <li>A combination of a direct connection and a route-based VPN as a backup</li> <li>Multiple direct connections</li> </ul> <p>If you’re using a solution like Megaport for connectivity, their website has several <a href="https://docs.megaport.com/connections/common-scenarios/">example network diagrams</a> for redundant connectivity.</p> <h2 id="autonomous-system-numbers-asns">Autonomous System Numbers (ASNs)</h2> <p>BGP uses the term “Autonomous System” to represent an entity or location, and each autonomous system is assigned a number (ASN). There are a few different flavors of ASNs, but for our purpose, we only need to worry about public and private ASNs. Much like IP addresses, public ASNs must be assigned by a <a href="https://en.wikipedia.org/wiki/Regional_Internet_registry">regional internet registry (RIR)</a>, while private ASNs (64512-65535) can be used freely within a private network. Each ASN in the routing topology must be unique, so you will need to plan your ASN usage to prevent overlaps, just as you would with your IP ranges.
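A quick way to enforce that planning discipline is a small helper that checks a candidate ASN against the private range and against ASNs already in use; the numbers below are invented for illustration:

```python
# 16-bit private ASN range, per the text above (64512-65535 inclusive)
PRIVATE_ASN_RANGE = range(64512, 65536)

# Hypothetical: ASNs already assigned somewhere in your topology
in_use = {64512, 64513, 65515}

def plan_asn(candidate):
    """Validate a candidate private ASN before assigning it."""
    if candidate not in PRIVATE_ASN_RANGE:
        raise ValueError(f"{candidate} is not a 16-bit private ASN")
    if candidate in in_use:
        raise ValueError(f"{candidate} is already in use")
    in_use.add(candidate)
    return candidate

print(plan_asn(64900))  # → 64900
```

Keeping this inventory in source control alongside your IP plan makes overlaps easy to catch before they reach a router.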
In some cases, a cloud provider will allow you to specify a private ASN for your cloud resources, and in other cases, it is set by the provider and cannot be changed. Check your cloud provider’s documentation to see what ASN they use so you can plan accordingly. If you have been assigned a public ASN then your life is a bit easier, since it is unique to your organization. Some commonly used ASNs are listed below.</p> <table> <thead> <tr> <th><strong>Cloud Provider</strong></th> <th><strong>ASN</strong></th> <th><strong>Notes</strong></th> </tr> </thead> <tbody> <tr> <td>AWS</td> <td>7224<br />64512</td> <td>ASN 7224 is used for Direct Connect peering. ASN 64512 is the default ASN for VGW or Direct Connect Gateway, but you can specify a different private ASN when those resources are created.</td> </tr> <tr> <td>Azure</td> <td>8074<br />8075<br />12076<br />65515-65520</td> <td>ASN 12076 is used for ExpressRoute peering. ASNs 65515-65520 are reserved by Azure and should not be used on the customer side.</td> </tr> <tr> <td>GCP</td> <td>16550</td> <td>ASN 16550 is used for Interconnect peering.</td> </tr> <tr> <td>Oracle</td> <td>31898<br />64555</td> <td>ASN 31898 is used for Private and Public peering. ASN 64555 is reserved by Oracle and should not be used on the customer side.</td> </tr> <tr> <td> </td> <td>23456<br />4294967295<br />64496-64511<br />65535-65551</td> <td>These ASNs are reserved by <a href="https://en.wikipedia.org/wiki/Internet_Assigned_Numbers_Authority">IANA</a> and should not be used on the customer side.</td> </tr> </tbody> </table> <p>Each route that BGP learns also includes the AS Path needed to reach that destination. The path is the list of ASNs that are traversed to reach the destination. Consider this diagram:</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/12/as_path_ex1.png" alt="" /></p> <p>The AS path from router A to router D is <code class="language-plaintext highlighter-rouge">20 30</code>.
Likewise, the AS path from router D to router A is <code class="language-plaintext highlighter-rouge">20 10</code>. The AS path is an important part of the algorithm BGP uses to choose the best path to a destination, which we will learn more about in a moment. It is also used to prevent loops. If a BGP router receives a route with its own AS in the path, the route is discarded as a loop-prevention mechanism.</p> <h2 id="internal-bgp-ibgp-vs-external-bgp-ebgp">Internal BGP (iBGP) vs External BGP (eBGP)</h2> <p>I’m not going to get too in-depth on the inner workings of BGP, but it is worth knowing the difference between internal BGP (iBGP) and external BGP (eBGP). iBGP is the term for BGP peering between two routers within the same autonomous system. eBGP is the term for peering between two different autonomous systems. In the example diagram above, routers B and C are connected via iBGP, and the remaining connections are eBGP. Connections to a cloud provider will always use eBGP. If you are running iBGP within your network, then there is a good chance you already know more about BGP than I’m going to cover in this post!</p> <h2 id="bgp-best-path-algorithm">BGP Best Path Algorithm</h2> <p>If you’ve ever studied for a CCNA, you may get cold chills reading the phrase “BGP Best Path Algorithm”. Immediately your brain recalls the “N WLLA OMNI” acronym used to remember the steps in the algorithm, although you may struggle to remember what the letters stand for. For our purposes, the primary metric we need to worry about is the “A” – AS path length. As long as the other metrics are equal, which they typically are, BGP simply counts the number of autonomous systems it has to traverse to reach a destination to determine the AS path length. The path with the least number of autonomous systems traversed is considered the best path. 
Whether or not this path is the best-performing or lowest-latency path is another question altogether, but these are not metrics that BGP considers as part of its algorithm. Here is another example topology to consider:</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/12/as_path_ex2.png" alt="" /></p> <p>There are now two paths from router A to router D, <code class="language-plaintext highlighter-rouge">20 30</code> and <code class="language-plaintext highlighter-rouge">40 50 30</code>. The latter may be higher bandwidth or more reliable, but BGP will pick the path through AS 20 since that will result in the shortest AS Path (<code class="language-plaintext highlighter-rouge">20 30</code>).</p> <p>One tool that can be used to influence BGP path selection is AS prepending. AS prepending means you artificially add additional AS numbers into the path. These prepending rules are applied to a neighbor relationship between routers and should use the ASN of the local system. Prepending an ASN other than your own may have unintended consequences. AS prepending may be performed outbound on routes being advertised to neighbors or inbound on routes being received from neighbors.</p> <p>Here is an example to illustrate this concept. If router D (AS 30) prefers traffic from router A (AS 10) to arrive over the link with router F (AS 50), it can prepend <code class="language-plaintext highlighter-rouge">30 30</code> to the AS path it advertises to router C (AS 20). This new path will be advertised to router A, and the two paths router A will see to router D are <code class="language-plaintext highlighter-rouge">20 30 30 30</code> and <code class="language-plaintext highlighter-rouge">40 50 30</code>. Now, the best path is through AS 40 and 50, since it is the shorter AS path. Using AS prepending is a way to influence the routing topology to behave differently from the default BGP behavior.
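To make the arithmetic concrete, here is a toy Python sketch of this AS-path-length comparison. It ignores every other BGP attribute, which the real algorithm only does when those attributes are equal, and the AS numbers match the example topology above.

```python
# Toy BGP best-path comparison using only AS-path length
# (assumes all other attributes -- weight, local preference,
# origin, MED, etc. -- are equal, as in the example above).
def best_path(paths):
    """Pick the advertisement with the shortest AS path."""
    return min(paths, key=len)

# Two paths from router A to router D, as in the diagram.
paths = [[20, 30], [40, 50, 30]]
assert best_path(paths) == [20, 30]

# Router D prepends its own ASN (30) twice toward AS 20, so the
# path via AS 20 grows to four hops and loses the comparison.
prepended = [[20, 30, 30, 30], [40, 50, 30]]
assert best_path(prepended) == [40, 50, 30]
```

The prepended path loses purely because it is longer; nothing in the comparison knows whether the newly preferred path is faster or slower.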
You will frequently see this practice referred to as “traffic engineering”, although AS prepending is just one of many tools available to manipulate BGP.</p> <h2 id="equal-cost-multipath-ecmp">Equal Cost Multipath (ECMP)</h2> <p>By default, BGP will calculate a single path to each destination. As I mentioned earlier, that path may change if a failure happens somewhere in the network, which is exactly what we want to see. But what about the scenario where we have two high bandwidth links to a destination? It may feel like a waste of money to have an expensive circuit provisioned to merely serve as a backup. This is where Equal Cost Multipath (ECMP) comes in. ECMP allows for two or more “equal cost” routes to be installed in the routing table. ECMP is used frequently in networking to leverage multiple links at the same time, but different routing protocols handle it differently. There may even be differences in behavior between hardware vendors. Generally with BGP, ECMP has to be enabled, so refer to the vendor documentation to figure out the right knob to turn.</p> <p>When properly configured, ECMP will allow you to take full advantage of the available links. A common scenario is two direct connections to a cloud provider (ideally purchased from two different carriers for diversity). When both links are working, traffic is transferred across either link based on a “hash”. Usually this is done by reading the source and destination addresses and port numbers of a packet and feeding them into a predefined calculation to produce a hash value. Even values would be assigned to one link, and odd to the other, providing a rudimentary way of balancing traffic across the links. If one circuit fails, that route is removed from the routing table, and all traffic would traverse the remaining link. 
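The hash-based link selection and failover described above can be sketched as a toy model. Real devices hash in hardware with vendor-specific inputs; the link names and flow values below are made up for illustration.

```python
import hashlib

# Toy ECMP model: the 5-tuple fields of a packet feed a hash,
# and the result selects one of the equal-cost links.
def pick_link(src, dst, sport, dport, proto, links):
    key = f"{src}|{dst}|{sport}|{dport}|{proto}".encode()
    digest = int(hashlib.sha256(key).hexdigest(), 16)
    return links[digest % len(links)]

links = ["carrier-a", "carrier-b"]

# Packets from the same flow always hash to the same link,
# so per-flow packet ordering is preserved.
first = pick_link("10.0.1.5", "172.16.0.9", 49152, 443, "tcp", links)
assert pick_link("10.0.1.5", "172.16.0.9", 49152, 443, "tcp", links) == first

# If one circuit fails, its route is withdrawn from the routing
# table and every flow lands on the remaining link.
assert pick_link("10.0.1.5", "172.16.0.9", 49152, 443, "tcp", ["carrier-b"]) == "carrier-b"
```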
Redundant <em>and</em> cost-effective!</p> <h2 id="additional-considerations">Additional Considerations</h2> <p>By now you’ve probably read all you care to about BGP, but there are a few more components I’d like to mention.</p> <ul> <li><a href="https://en.wikipedia.org/wiki/Bidirectional_Forwarding_Detection">Bi-directional Forwarding Detection (BFD)</a> is a network protocol that is complementary to BGP. By default, BGP can take several seconds or minutes to detect a failure. BFD can be used to speed up failure detection. If you cannot tolerate an outage of more than a few seconds, configure BFD along with BGP.</li> <li>Prefix lists define a list of routes, and they can be used to filter the routes that are either advertised or received by BGP. Applying prefix lists to your BGP neighbors is a best practice. By default, your equipment will receive all routes advertised by its BGP neighbors, and you certainly want to prevent unintended routes from being advertised into your network. Refer to your network vendor documentation for prefix-list syntax and application.</li> <li>BGP peering over a public network can be a security risk since attackers could use the advertised routing information to perform reconnaissance on your network, or attempt to disrupt or hijack the session. To mitigate this, BGP sessions can be authenticated with a pre-shared key. This is also considered a best practice, and you will need to read the docs to determine how this is configured on your equipment. To prevent unnecessary troubleshooting, I typically try to stand up a BGP peering connection without authentication. I’ll go back and enable authentication once I’ve confirmed it’s working as expected.</li> </ul> <h2 id="wrap-up">Wrap Up</h2> <p>Multiple redundant links with automated failover are your goal if you want reliable connectivity to the cloud. If it’s up to me, I’m using SD-WAN everywhere that I can. All of the complicated bits of BGP are abstracted away, leaving only the benefits.
There is always a trade-off, which in this case is vendor lock-in, and perhaps cost depending on how large your network is. But BGP has been around for a long time, and it’s not going anywhere. There are plenty of network engineers who have spent hundreds of hours designing and supporting BGP, so you will seldom have to worry about finding someone who can provide support for your environment. Choose the solution that works best for you, plan, design, deploy, test, and test again. If everything goes as planned, you will have rock-solid connectivity to the cloud of your choice.</p> Tue, 08 Dec 2020 00:00:00 +0000 http://www.networkbrouhaha.com/2020/12/cloud-connectivity-201/ http://www.networkbrouhaha.com/2020/12/cloud-connectivity-201/ Cloud Connectivity 101 <p class="center"><img src="https://networkbrouhaha.com/resources/2020/11/lightning.png" alt="" /></p> <p>With my <a href="/2020/11/network-princples-cloud/">prior post</a> in mind, we can look at the various methods available for connecting to the cloud. This isn’t intended to be an exhaustive list, but it should cover the vast majority of your options.</p> <h2 id="public-ip">Public IP</h2> <p><strong>Pros</strong>: Ubiquitous, inexpensive<br /> <strong>Cons</strong>: Potentially insecure, no performance guarantees<br /> <strong>What you’ll need</strong>: A business-class internet circuit<br /></p> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/11/cloud-public-ip.png" alt="" /></p> <p>Regardless of the other methods listed, you’ll connect to a cloud provider over Public IP the first time you connect to their console to set up your account. Depending on what you host in the cloud, this may be the only connectivity you need. It’s the easiest way to get to the cloud, but also the least secure.
If you’re transferring sensitive information over the public internet, make sure you leverage modern application-based encryption, whether over TLS/HTTPS, SSH or some other protocol.</p> <p>Assuming we’re talking about <a href="https://en.wikipedia.org/wiki/Infrastructure_as_a_service">Infrastructure as a Service (IaaS)</a> and running virtual instances in the cloud, there are variations between providers around how public IPs are assigned. It’s important to understand how your provider assigns addresses, and how they recommend exposing them to the internet. Frequently a load balancer is used for this, and it’s a good idea to use one even if you’re starting small. It’s easier to start with a load balancer than move to one later.</p> <p>Public connectivity is not well suited for the typical backend administrative access you’d see in a traditional data center. No security team would recommend exposing all your servers to the internet. Addressing this will likely require adding in one of the connection methods listed below, like a VPN.</p> <p>Cloud providers allow tight integration between resources you deploy and their hosted DNS solution, but it will still be up to you to configure it properly. Take time to learn the various network services provided, as they will be similar to the network appliances you deploy on prem, but the capabilities and requirements may be different from what you’re used to. 
As I mentioned previously, there are many options for applying security policy in the cloud, so developing a solid security strategy should be one of the first things you do.</p> <p>The remaining connection methods, apart from Direct Connection and Network as a Service, build on top of Public IP connectivity.</p> <h2 id="traditional-vpn">Traditional VPN</h2> <p><strong>Pros</strong>: Secures sensitive traffic<br /> <strong>Cons</strong>: Requires additional hardware or software, potential performance bottleneck, does not scale well<br /> <strong>What you’ll need</strong>: Hardware or software capable of terminating an IPSec or SSL VPN tunnel<br /></p> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/11/cloud-trad-ipsec.png" alt="" /></p> <p>Layering a <a href="https://en.wikipedia.org/wiki/Virtual_private_network">VPN</a> on top of Public IP connectivity will provide a secure connection into your cloud resources. All cloud providers support <a href="https://en.wikipedia.org/wiki/IPsec">IPSec</a> tunnels, and most support establishing <a href="https://en.wikipedia.org/wiki/Border_Gateway_Protocol">BGP</a> peering over the tunnel to exchange routes. If you need client-based SSL VPN you will need to deploy a network appliance in your cloud environment, but there are <a href="https://aws.amazon.com/marketplace/b/Network-Infrastructure/2649366011">numerous</a> <a href="https://azuremarketplace.microsoft.com/en-us/marketplace/apps/category/networking?subcategories=firewalls&amp;page=1">options</a> <a href="https://console.cloud.google.com/marketplace/browse?filter=category:networking">available</a>.</p> <p>Using a VPN is the easiest way to build a <a href="https://en.wikipedia.org/wiki/Cloud_computing#Hybrid_cloud">hybrid cloud</a> solution, and it will give you a secure way to access any instances you’ve deployed via their private IP. 
If you were paying attention to what I wrote earlier in my previous post, you know that you should have a good IP addressing scheme that includes non-overlapping IP ranges for everything you’re deploying on prem and in the cloud. This becomes even more important if you’re deploying resources into multiple regions in a single cloud provider, or multiple cloud providers.</p> <p>VPNs, like most technologies, evolve over time, so if you have an aging VPN solution in your environment you may want to consider replacing it with a modern one. Policy-based IPSec VPNs have been around for decades, but cloud providers are encouraging route-based VPNs, which is what you’ll have to use if you want dynamic routing over VPN. As your environment scales, dynamic routing becomes increasingly important, so if you aren’t comfortable with BGP now is the time to learn it.</p> <h2 id="layer-2-vpn">Layer 2 VPN</h2> <p><strong>Pros</strong>: IP Mobility<br /> <strong>Cons</strong>: Requires additional hardware or software, potential performance bottleneck, not recommended for long term deployments<br /> <strong>What you’ll need</strong>: Specialized hardware or software capable of building an L2 tunnel<br /></p> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/11/cloud-l2vpn.png" alt="" /></p> <p>A Layer 2 VPN is used to span a Layer 2 segment, typically a VLAN, across a WAN link. I don’t have scientific data to back this up, but I’d bet a milkshake that if you asked a hundred network engineers if spanning Layer 2 across sites is a good idea, ninety-nine of them would say no. Spanning layer 2 across sites, or over a VPN, introduces complexity and does not scale well. There are some good reasons to do it, but I would never recommend it as a long-term solution.</p> <p>Normally, a layer 2 VPN is used to migrate existing VMs to another site or cloud provider, while preserving assigned IP addresses. Common scenarios are disaster recovery and data center evacuation. 
I won’t go on another rant about the importance of DNS, but you can see why I climbed up on that soapbox. Putting that aside, if everyone involved knows the risks introduced by stretched L2, and it’s temporary, it can be a handy tool.</p> <p>There are a handful of options for L2 VPN, including <a href="https://cloud.vmware.com/vmware-hcx">VMware HCX</a> and <a href="https://docs.vmware.com/en/VMware-NSX-T-Data-Center/2.5/administration/GUID-86C8D6BB-F185-46DC-828C-1E1876B854E8.html">NSX L2VPN</a>. These have been verified to work in supported cloud providers, but if you choose another solution, be careful to make sure that it is supported by your cloud provider. There is no traditional layer 2 forwarding in most native cloud provider networks, so an overlay like <a href="https://en.wikipedia.org/wiki/Virtual_Extensible_LAN">VXLAN</a> or <a href="https://en.wikipedia.org/wiki/Generic_Networking_Virtualization_Encapsulation">GENEVE</a> is used to emulate layer 2 semantics. These overlays encapsulate packets in UDP, so large packets will be fragmented when transmitted over a WAN link. Unless there is a solution for local traffic egress, there will be tromboning of traffic across the L2VPN for any remote endpoints to reach their default gateway.</p> <p>My advice is to stick to routed layer 3 traffic if possible, even if it takes some work to get there. 
L2VPN is a tool that can be deployed if absolutely necessary.</p> <h2 id="sd-wan">SD-WAN</h2> <p><strong>Pros</strong>: Flexibility, scalability, potential cost savings<br /> <strong>Cons</strong>: Requires additional hardware or software, potential for vendor lock-in<br /> <strong>What you’ll need</strong>: Hardware or software capable of building an SD-WAN mesh, like <a href="https://www.vmware.com/products/sd-wan-by-velocloud.html">VeloCloud</a><br /></p> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/11/cloud-sdwan.png" alt="" /></p> <p><a href="https://en.wikipedia.org/wiki/SD-WAN">Software Defined WAN (SD-WAN)</a> may be the most exciting advancement in the world of networking in the past decade. Building and maintaining VPN connections is a tough job, especially at scale. SD-WAN makes this process much simpler, since all the heavy lifting of creating tunnels, monitoring connectivity, and intelligently routing traffic between locations is handled by a controller. There are potential cost savings as well. Many businesses have replaced their expensive MPLS networks with SD-WAN meshes running over redundant internet connections.</p> <p>Currently SD-WAN is not standardized, and each vendor offering has its own unique feature set. You will need to do some homework on your own to find the solution that works best for your environment and cloud providers. 
If you’re considering hybrid-cloud or multi-cloud deployments, you should certainly look at SD-WAN for connecting your environments.</p> <h2 id="direct-connection">Direct Connection</h2> <p><strong>Pros</strong>: High-bandwidth, low-latency cloud connectivity<br /> <strong>Cons</strong>: Cost<br /> <strong>What you’ll need</strong>: A point-to-point circuit from a local telco that provides connectivity to the cloud provider of your choice, and a router or firewall capable of terminating the circuit<br /></p> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/11/cloud-directconnect.png" alt="" /></p> <p>One of the challenges of working with the cloud is nomenclature. Cloud providers offer so many capabilities, and each offering has a name that has been carefully crafted by their marketing department. You may see a direct connection to a cloud provider referred to as Direct Connect, Express Route, Cloud Interconnect, Fast Connect, or Direct Express Cloud Bonanza. Okay, the last one is fake, but you get the point.</p> <p>While there is some variation between how the various cloud providers handle direct connections, this is a straightforward path to the cloud. If you are in a standalone data center, you will likely be working with your local telco to provision a circuit from your data center to the cloud provider of your choice. If you have an existing MPLS network, you may be able to have a “leg” connected to a cloud provider as an alternative. Many colocation facilities are offering direct circuits or cross-connects to the closest geographic cloud regions, so check your colocation offerings if that is where your equipment resides.</p> <p>Review your cloud provider’s documentation for the technical requirements and ordering process for a direct connection. Once the physical circuit is installed, there will be a setup process to complete in the cloud provider portal. Depending on whether you want connection to private resources (e.g. 
virtual instances deployed with private addresses) or public resources (e.g. provider offerings like object storage), you will need to follow the provider documentation to set up routing across your connection. Most likely this will involve bringing up a BGP peering with the provider between your network and theirs. You may be required to have your own <a href="https://en.wikipedia.org/wiki/Autonomous_system_(Internet)">Autonomous System Number (ASN)</a> and dedicated public IP address range to access public resources over a direct connection.</p> <p>I will explore the specifics of the various cloud provider direct connection options in a future post.</p> <h2 id="network-as-a-service-naas">Network as a Service (NaaS)</h2> <p><strong>Pros</strong>: High-bandwidth, low-latency cloud connectivity, and the ability to dynamically provision cloud connections<br /> <strong>Cons</strong>: Cost, only available in limited locations<br /> <strong>What you’ll need</strong>: A cross connect to the NaaS provider, and compatible hardware to terminate the connection<br /></p> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/11/cloud-naas.png" alt="" /></p> <p>The last connection method I’ll mention is what some refer to as <a href="https://en.wikipedia.org/wiki/Network_as_a_service">Network as a Service (NaaS)</a>. This is similar to a direct connection, but with much more flexibility. <a href="https://www.megaport.com/">Megaport</a> and <a href="https://www.equinix.com/interconnection-services/cloud-exchange-fabric/">Equinix Cloud Exchange Fabric</a> are two examples of this type of service. Typically, you will need to be in a colocation facility to connect to a NaaS provider. 
If you’re in a standalone data center, you could provision a circuit to your closest NaaS provider and connect via that method.</p> <p>Once physically connected to your network, NaaS providers allow you to dynamically provision virtual circuits to cloud providers, managed service providers (MSPs), other data centers, or directly to an ISP. The strength of this solution is in its flexibility. Many NaaS providers provide an API to provision virtual circuits, meaning you could dynamically create and destroy connections to various cloud providers as needed.</p> <p>If you are looking for high-speed, low-latency connectivity to the cloud, and NaaS is available to you, it’s a great choice.</p> <h2 id="wrap-up">Wrap Up</h2> <p>I’ll be exploring additional topics pertaining to cloud connectivity in future posts, but I hope this is a helpful rundown of the options. To recap, you should have a fully developed plan before you provision any sort of cloud connectivity. The actual connectivity, whether it be over the internet, VPN, or direct connection, will depend on several factors. Budget is likely the biggest hurdle for most, and you will pay for better performance.</p> <p>When making your decision, consider the words of <a href="https://twitter.com/SharpNetwork/status/1326917912347734025">Eyvonne Sharp</a>, “A myopic focus on cost, instead of business value, is the bane of IT. It is also a harbinger of irrelevance.”</p> Tue, 17 Nov 2020 00:00:00 +0000 http://www.networkbrouhaha.com/2020/11/cloud-connectivity-101/ http://www.networkbrouhaha.com/2020/11/cloud-connectivity-101/ Network Principles for Cloud Connectivity <p class="center"><img src="https://networkbrouhaha.com/resources/2020/11/clouds.png" alt="" /></p> <p>Over the past few years, I’ve encountered all sorts of confusion about how cloud resources should be architected, accessed, and consumed. In this post, which is the first in a series, I will walk through some networking basics relevant to cloud connectivity. 
The <a href="/2020/11/cloud-connectivity-101/">next post</a> covers methods for connecting to cloud providers, and subsequent posts will dive deeper into specific topics.</p> <h1 id="network-basics">Network Basics</h1> <p>The internet runs on – you guessed it – the <a href="https://en.wikipedia.org/wiki/Internet_Protocol">Internet Protocol (IP)</a>, which is what you’ll use to connect to the various cloud providers. Overall, we’re running out of <a href="https://en.wikipedia.org/wiki/IPv4_address_exhaustion">unallocated public IPv4 addresses</a>, although that shortage doesn’t seem to <a href="https://docs.aws.amazon.com/general/latest/gr/aws-ip-ranges.html">apply to cloud providers just yet</a>. IPv4 is well understood, and it is not going anywhere for many years, but now is the time to consider your IPv6 strategy. There is no reason to be afraid of IPv6, and it eliminates the need for many of the workarounds put in place for IPv4 over the past decades, like <a href="https://en.wikipedia.org/wiki/Network_address_translation">Network Address Translation (NAT)</a>.</p> <p>Overwhelmingly, the problem that comes up again and again when it comes to cloud connectivity is basic routing and overlapping IP address spaces. With IPv4, this is a planning exercise that many people fail to do. With IPv6, this problem disappears, due to the massive number of unique addresses available. The crux of this issue is that every router that moves packets across a network can only have one entry in its routing table for each destination subnet. When all subnets are unique, this is easily accomplished, but almost every corporate network is using <a href="https://en.wikipedia.org/wiki/Private_network">RFC 1918 private addresses</a> somewhere. To put it simply, if you’re using 10.0.1.0/24 on your corporate network, you can’t use that specific range anywhere else.
If you do, anything deployed in that duplicate network will not be able to communicate with the original range.</p> <p>One trick commonly deployed to address overlapping network ranges is to use <a href="https://en.wikipedia.org/wiki/Longest_prefix_match">longest prefix match</a>. Basically, this means that the route to the most specific network is the one that will be used. For example, if a router has valid routes for 10.0.0.0/8, 10.0.1.0/24, and 10.0.1.0/30, the last route is the most specific. This is easy to spot, as its prefix length, /30, is a larger number than those of the other routes listed. Some newer network fabrics even create “host routes” (/32 in IPv4 or /128 in IPv6) for every endpoint. Since these are the most specific routes possible, they always “win”, and they allow for great flexibility in terms of where endpoints are connected. In the world of networking, every decision like this involves a tradeoff. In this case, the tradeoff is flexibility versus a potentially large routing table. Luckily, we are in an era where network hardware is not as resource constrained as it was in decades past, so having hundreds of thousands of known routes is less of a risk. I’m placing a bet that we’ll see this approach used more and more. Regardless, this is a tool you can use when designing your network to make efficient use of your IP address ranges. If I’m able to get across one point through this post, it’s the importance of planning ahead. Plan your IP addressing scheme, for both IPv4 and IPv6, since it’s inevitable that you will need to use both eventually.</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/11/lpm-example.png" alt="" /></p> <p>One last point to mention is the importance of DNS. In a perfect world, no one would ever memorize an IP address (unless it’s for a DNS server, and 8.8.8.8 is mercifully easy to remember). Unfortunately, this is not the case.
Hardcoding IP addresses is common practice, and I still run across people who don’t “trust” DNS or claim that some obscure application doesn’t support it. Name resolution and service discovery are incredibly important functions in the cloud, to the point that DNS has a 100% uptime SLA with some providers. In many cases, DNS is the only service that is guaranteed to be available all the time. If everything on your network is using DNS hostnames, and DNS is accurate and easy to update, changing the IP address of an endpoint becomes a much easier task. DNS has been around almost as long as the internet itself, so it’s high time that we do the heavy lifting to use DNS hostnames everywhere instead of IP addresses.</p> <h1 id="security-and-statefulness">Security and Statefulness</h1> <p>I would be remiss to write about cloud connectivity without mentioning security. Figuring out how to allow legitimate traffic through firewalls and Access Control Lists (ACLs) is a joy that every network engineer gets to experience, and the sheer number of malicious actors means we all must keep security top of mind when planning and operating our networks. Securing networks means adding complexity, and, like an IP addressing scheme, needs careful planning.</p> <p>While properly securing networks is critical, it’s important to understand exactly how it’s done, and what effect it has on our network. IP was designed with the <a href="https://en.wikipedia.org/wiki/End-to-end_principle">end-to-end principle</a> in mind. The original designers envisioned networks that purely moved packets, with any necessary intelligence implemented at the endpoints. A destination address would be all that is needed to get a packet where it needs to go. While that is not the reality of modern networking, it is still a worthy goal.</p> <p>Network appliances, like firewalls, typically introduce some sort of state tracking.
This is why some firewalls are referred to as “stateful firewalls” - they track the connection state of traffic flows as they traverse the firewall, and they use that state to determine whether to forward additional traffic. NAT Gateways and Load Balancers track state for similar reasons. As more state is stored on network appliances, we move further and further away from the ideal of the end-to-end principle. This isn’t necessarily a bad thing, but it is worth understanding, as well as architecting networks that minimize the reliance on state stored in appliances. I have seen massive web applications that are completely reliant on proprietary load balancer features. This is a situation that should be avoided. Just because you can use your network to solve a specific problem doesn’t mean that you should.</p> <p>In terms of security policy, my best advice is to choose specific points for policy enforcement, and make sure they are well understood. In a typical enterprise network, this is usually obvious since policy enforcement happens at the firewall, although this has started to change with the advent of micro-segmentation. Cloud providers provide several options for applying security policy. Using AWS as an example, security policy can be applied at the instance (VM) level with <a href="https://docs.aws.amazon.com/vpc/latest/userguide/VPC_SecurityGroups.html">security groups</a>, or at the subnet level with <a href="https://docs.aws.amazon.com/vpc/latest/userguide/vpc-network-acls.html">network ACLs</a>. Whichever method you choose, stay consistent, and document your security practices. A good rule of thumb is to apply security policy as close to the endpoint as possible.</p> <h1 id="wrap-up">Wrap Up</h1> <p>Certainly, there are many technical requirements to consider when it comes to connecting to the cloud. I hope you will consider the planning and design requirements first, to put yourself in a position for success. 
Networking basics, like IP schemes and routing, as well as a good understanding of network security, and how state affects your traffic flows are subjects every technical cloud consumer needs to understand.</p> <p>Please leave a comment or <a href="https://www.twitter.com/networkbrouhaha">reach out to me on Twitter</a> and let me know if there is anything I’ve missed! My next post will cover the various methods available for connecting to the cloud.</p> Thu, 12 Nov 2020 00:00:00 +0000 http://www.networkbrouhaha.com/2020/11/network-princples-cloud/ http://www.networkbrouhaha.com/2020/11/network-princples-cloud/ Running Ansible Playbooks with GitHub Actions <p>I recently co-presented a session titled <a href="https://bit.ly/3g51lBR">Codify Your Environment with Terraform and Ansible</a> at the inaugural <a href="https://forward.rubrik.com/">Rubrik Forward Digital Summit</a>. My demo used <a href="https://github.com/features/actions">GitHub Actions</a> to run a number of Ansible playbooks. The code is hosted at <a href="https://github.com/rfitzhugh/Forward-2020-Codify-Your-Environment">https://github.com/rfitzhugh/Forward-2020-Codify-Your-Environment</a>. One of my main takeaways for those attending the session is that <a href="https://en.wikipedia.org/wiki/Continuous_integration">Continuous Integration</a> (CI) tools like GitHub Actions can be used to test and run automation against your infrastructure fairly easily. Let’s take a look.</p> <h1 id="the-github-actions-flow">The GitHub Actions Flow</h1> <p>CI tools were created to make life easier for developers by allowing them to test changes to their code rapidly. Modern CI tools leverage cloud infrastructure and containers to run these tests when code is committed. 
A typical workflow looks something like this:</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/05/ci1-bg.png" alt="" /></p> <p>Normally, someone creates a branch on a Git repository, commits some changes, and opens a Pull Request. This alerts the repo owner(s) that there are proposed code changes. A pipeline is usually triggered when the pull request is opened to run tests against the new code. The test result is returned when completed, and is a major consideration as to whether or not the new code is merged. Another pipeline may be triggered after the pull request is merged, or at other points in the process. The timing for when pipelines run is specified in the CI pipeline configuration, and varies by project and need.</p> <p>Rubrik Chief Technologist Chris Wahl wrote a <a href="https://wahlnetwork.com/2020/05/12/continuous-integration-with-github-actions-and-terraform/">snazzy blog post</a> that demonstrates this workflow with Terraform. His post goes into greater detail on GitHub Actions than I will, so if you’re new to the tool then I recommend taking a moment to read his post before continuing.</p> <p>When it comes to using this process to execute Ansible Playbooks against your on-prem infrastructure, the process is slightly different. It looks like this:</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/05/ci2-bg.png" alt="" /></p> <p>The beginning is the same, up to where the CI pipeline is triggered to run. In this case, we only want the pipeline to run <em>after</em> a Pull Request is merged into the master branch. The master branch should be considered the source of truth for your infrastructure, and changes must never be made directly to the master branch. All modifications to the Playbooks in the repo will be proposed via Pull Request, reviewed, commented on and ultimately accepted or denied. Only after the PR is accepted and merged to master will the Ansible Playbook(s) be executed. 
This is an important concept to grasp: you certainly do not want unintended changes to be made because a pipeline ran unexpectedly.</p> <p>The other difference from the first example is that a local runner is used to execute the pipeline instead of a Docker container. This runner is installed in a Virtual Machine, in this case in our lab, so it has connectivity to all of the infrastructure that we’d like to automate. Instructions for adding a local runner to a repo and making it available to GitHub Actions are located here: <a href="https://help.github.com/en/actions/hosting-your-own-runners/adding-self-hosted-runners">https://help.github.com/en/actions/hosting-your-own-runners/adding-self-hosted-runners</a>. Notice the ominous “<strong>Warning</strong>: We recommend that you do not use self-hosted runners with public repositories.” This is indeed sound advice - some additional security considerations will be covered near the end of this post.</p> <h1 id="configuring-github-actions-for-automation">Configuring GitHub Actions for Automation</h1> <p>A basic GitHub Actions configuration file needs to contain three things:</p> <ul> <li>When to Run</li> <li>Where to Run</li> <li>What to Run</li> </ul> <p>There are, of course, a bazillion different ways this can be configured. The examples below will focus on the Ansible workflow described above. A full configuration can be found here: <a href="https://github.com/rfitzhugh/Forward-2020-Codify-Your-Environment/blob/master/.github/workflows/run-playbooks.yml">https://github.com/rfitzhugh/Forward-2020-Codify-Your-Environment/blob/master/.github/workflows/run-playbooks.yml</a>.</p> <h2 id="when-to-run">When to Run</h2> <p>For many CI pipelines, this will be when a Pull Request is opened, but as mentioned above, that won’t work for our case. We only want the pipeline to run after a Pull Request is <em>merged</em> (i.e. approved). 
Here’s what that configuration looks like:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>on: push: branches: - master </code></pre></div></div> <p class="center"><sub><sup><strong>Do this</strong></sup></sub></p> <p>The “push” event is triggered when code is merged into the master branch. The first time I tried to figure out how to set this configuration, I had trouble finding the correct keywords. I ended up digging around in issues and forums, and found this solution:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>on: pull_request: branches: [ master ] types: [closed] </code></pre></div></div> <p class="center"><sub><sup><strong>Don’t do this</strong></sup></sub></p> <p>While this does execute a pipeline after a Pull Request is merged, it will also run the pipeline when the PR is simply <em>closed</em>. Closing the PR means no code was merged in, hence no changes, but it still results in the pipeline running when you wouldn’t expect it to. Hopefully this wouldn’t be a major problem since Ansible is idempotent, and the repo still matches the running state of your infrastructure. Still, stick with the first example. Do <em>not</em> use <code class="language-plaintext highlighter-rouge">on: pull_request</code> for your automation CI pipeline!</p> <h2 id="where-to-run">Where to Run</h2> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>jobs: run-playbooks: runs-on: self-hosted </code></pre></div></div> <p>There is one job (“run-playbooks”) in this example, made up of one or more steps. This example instructs GitHub Actions to execute the steps on your self-hosted runner. The main reason for this is that the runner has connectivity to the infrastructure you’re attempting to automate. You also have the ability to pre-stage any necessary dependencies on the runner. 
This may be Ansible collections, Python modules, or other prerequisites.</p> <h2 id="what-to-run">What to Run</h2> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>steps: - uses: actions/checkout@v2 - name: Run Ansible Playbook run: ansible-playbook your_playbook.yaml </code></pre></div></div> <p>Finally, the files from the repo are checked out via Git, and the Ansible playbook is fired off. This can be repeated, as necessary, if there are multiple playbooks. If Ansible returns an error, the pipeline aborts and the repo owners will be notified that the pipeline failed. If you are running multiple playbooks, you may have a workflow that is partially complete. In this case, there is probably some clean-up to do. Revert things to their previous state, and troubleshoot the error that was returned before trying again.</p> <h1 id="security-considerations">Security Considerations</h1> <p>First, you are placing sensitive information about your environment in a Git repository. There is no reason I can think of to use a public repo for this. Private repos are now free on GitHub, so use that capability and keep prying eyes away from your carefully designed playbooks.</p> <p>Second, whatever you put in your CI pipeline will be executed <em>in your environment, on your local runner</em>. This is why all Pull Requests should be reviewed by someone other than the original author. Have a second set of eyes review any proposed changes, and if possible, test thoroughly in a sandbox environment before proposing a Pull Request.</p> <p>Third, rely on <a href="https://help.github.com/en/actions/configuring-and-managing-workflows/creating-and-storing-encrypted-secrets">GitHub Secrets</a> to store passwords, tokens, and other sensitive information. The full configuration linked above demonstrates how to use secrets associated with a repo, and inject them into environment variables on the local runner so Ansible can reference them. 
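As a rough sketch of what that injection can look like, a step can map a repo secret into an environment variable (the secret name below is illustrative, not one defined in the linked repo):

```yaml
steps:
  - uses: actions/checkout@v2
  - name: Run Ansible Playbook
    run: ansible-playbook your_playbook.yaml
    env:
      # Hypothetical secret; define it under the repo's
      # Settings -> Secrets before referencing it here
      ANSIBLE_NET_PASSWORD: ${{ secrets.ANSIBLE_NET_PASSWORD }}
```

The secret is decrypted only at run time, and the playbook can read it with an `ansible.builtin.env` lookup or a `lookup('env', ...)` expression rather than storing the password in the repo.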
Keep in mind that some secrets may be displayed in your CI logs if you have verbose logging enabled, so use caution.</p> <p>I’ve seen recommendations to build an “emergency off switch” that can abort automation workflows from running. This could be a firewall rule that blocks all communication to and from the local runner, or a script that immediately powers it off. An abort mechanism is worth considering as you increase your reliance on automation.</p> <h1 id="wrap-up">Wrap Up</h1> <p>Back in 2018 when I wrote my review of <a href="https://networkbrouhaha.com/2018/03/network-automation-book-review/">Network Programmability and Automation</a>, one of my few complaints about the book was that it was light on how CI pipelines are used within an automation workflow. I hope this post was helpful in shedding some light on this topic. There are many different available tools and approaches to this, so you may end up with a different way of tackling this in your own environment. I’d love to hear how you’re managing automation, or if there is anything I can add to my approach to improve it. Hit me up any time on Twitter <a href="https://www.twitter.com/networkbrouhaha">@NetworkBrouhaha</a>.</p> Tue, 19 May 2020 00:00:00 +0000 http://www.networkbrouhaha.com/2020/05/ansible-with-github-actions/ http://www.networkbrouhaha.com/2020/05/ansible-with-github-actions/ Budget Bliss with AWS Lambda and the You Need A Budget API <p class="center"><img src="https://networkbrouhaha.com/resources/2020/01/logos.png" alt="" /></p> <p>I’ve been a crappy budgeter my whole life. It’s a chore, it’s difficult to manage, and for some reason there is never quite enough money. I’ve tried several products over the years, like Microsoft Money and Mint. I was happy with Mint for a long time, but it never quite fit the bill (forgive the pun). About a year ago I heard about <a href="https://www.youneedabudget.com/">You Need A Budget</a> (YNAB), and I was immediately intrigued. 
The YNAB “method” made more sense to me than other tools out there, and it offers a robust and well-documented <a href="https://api.youneedabudget.com/">API</a>. A little poking around in the API docs confirmed that it would be fairly easy to set up a system I had wanted for many years: a <a href="https://en.wikipedia.org/wiki/Positive_feedback">positive feedback loop</a> of daily budget updates. My vision was to get a text message once or twice a day with the amount of money left in some specific budget categories.</p> <p>There is a lot of research on feedback loops, both positive and negative, with regard to human behavior. A simple example is a speed trailer placed in a neighborhood. Research shows that the simple act of displaying someone’s speed causes them to slow down. I had the idea that the same principle would apply to budgeting. If my wife and I had a better idea of where things were at budget-wise on a daily basis, we could make smarter spending decisions. I knew this would require some diligence since every transaction has to be assigned a category and approved within YNAB before it is reflected in the budget. This is not too difficult to keep up with since the text message would also serve as a reminder to get into YNAB and categorize any new transactions.</p> <h1 id="enter-aws-lambda">Enter AWS Lambda</h1> <p>If you’re even remotely paying attention to technology, you’ve heard of “serverless” or “Functions as a Service” (FaaS). <a href="https://aws.amazon.com/lambda/">AWS Lambda</a> is Amazon’s offering in this space, and it has quickly gained adoption due to its ease of use and comprehensive functionality. It’s also cheap - there is no charge when a Lambda function is not in use, and even if I ran my budget alert twice a day that is a max of 62 function executions a month. (Spoiler alert: I’ve had this running for over a month and my AWS bill has gone up by about $0.15. 
That includes the fees to send text messages via Amazon SNS.)</p> <p>Using Python to interact with a REST API is easy. <a href="https://www.getpostman.com/">Postman</a> will <a href="https://learning.getpostman.com/docs/postman/sending_api_requests/generate_code_snippets/">generate the code</a> needed to make a call in a matter of clicks. From there, it is just a matter of executing the code, and sending the results in a text message. There are many ways to automate text messaging, but I chose to use <a href="https://aws.amazon.com/sns/">Amazon SNS</a> due to its simple integration with Lambda and low cost.</p> <h1 id="the-setup">The Setup</h1> <p>This example will pull the current amount of money left in one budget category via a Lambda function communicating with the YNAB API, and send it to one or more people via SMS. It’s not too difficult to expand this example to send more budget category balances, if desired. Personally, I’ve used AWS for years to host my personal DNS zones via Route53, and I have some home movies stored in Glacier. If you’ve never used AWS before, now is a good time to create an account using <a href="https://aws.amazon.com/premiumsupport/knowledge-center/create-and-activate-aws-account/">these instructions</a>.</p> <h2 id="ynab">YNAB</h2> <p>There are few requirements for YNAB beyond having completed the initial setup and created a budget. You will need an API key, and to obtain that you must have an account with a username and password. If you are using your Google account (or some other service) to log in to YNAB, you will need to create a password before obtaining your key.</p> <p>Instructions for obtaining a key, along with API documentation, are available at <a href="https://api.youneedabudget.com/">https://api.youneedabudget.com/</a>. 
To create a key, follow these steps:</p> <ol> <li><a href="https://app.youneedabudget.com/settings">Sign in to the YNAB web app</a> and go to the “My Account” page and then to the “Developer Settings” page.</li> <li>Under the “Personal Access Tokens” section, click “New Token”, enter your password and click “Generate” to get an access token.</li> <li>Open a terminal window and run this (the <code class="language-plaintext highlighter-rouge">-i</code> flag tells curl to include the response headers): <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl -i -H "Authorization: Bearer &lt;ACCESS_TOKEN&gt;" https://api.youneedabudget.com/v1/budgets </code></pre></div> </div> </li> <li>You should receive a result beginning with <code class="language-plaintext highlighter-rouge">HTTP/1.1 200 OK</code> followed by a JSON payload with your budget information.</li> <li>Save your API key in a safe place, like 1Password or LastPass. You will need this key when you are configuring your Lambda function. <strong>This key is essentially the same as your username and password, so don’t share it with anyone</strong>. If you share code on GitHub, remember to remove your key!</li> </ol> <p>Now, use the API explorer to grab the IDs from YNAB to use in your Lambda script. You will need to find the ID for the budget category (or categories) you want to track.</p> <ol> <li>From <a href="https://api.youneedabudget.com/">https://api.youneedabudget.com/</a>, click “API Endpoints” in the top right of the page.</li> <li>Scroll down to “Categories” and click the lock icon beside <code class="language-plaintext highlighter-rouge">GET /budgets/{budget_id}/categories</code></li> </ol> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/01/ynab01.png" alt="" /></p> <ol start="3"> <li>When prompted, paste in your API key, click “Authorize”, then close the authorization page. 
Do not click “Logout”.</li> </ol> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/01/ynab02.png" alt="" height="50%" width="50%" /></p> <ol start="4"> <li>Click the <code class="language-plaintext highlighter-rouge">GET /budgets/{budget_id}/categories</code> line to expand the information for this API endpoint. Click “Try it out”.</li> </ol> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/01/ynab03.png" alt="" /></p> <ol start="5"> <li>For budget_id enter <code class="language-plaintext highlighter-rouge">last-used</code>. If you have multiple budgets this will load the categories for the last budget used. I only have one budget, so this will always default to the correct budget.</li> <li>Click “Execute”</li> </ol> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/01/ynab04.png" alt="" /></p> <ol start="7"> <li>Scroll down and examine the response. You will see your budget categories displayed in JSON format. Find the category you want to track, and save the ID listed. Below is a screenshot of one budget category, with the ID partially obfuscated. You should see this pattern repeated for every budget category you have.</li> </ol> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/01/ynab05.png" alt="" /></p> <h2 id="amazon-sns">Amazon SNS</h2> <p>Log into the <a href="https://console.aws.amazon.com/">AWS Console</a>, type “SNS” into the “Find Services” bar, and click “Simple Notification Service” when it comes up. Within SNS, there are two sections to configure: Subscriptions and Topics.</p> <ol> <li>Click “Topics”, then “Create Topic”</li> </ol> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/01/sns01.png" alt="" /></p> <ol start="2"> <li>Provide a name for your topic as well as a display name, and click “Create topic”. I used “ynab-alerts” for both. Note the “ARN” provided after the topic is created. 
You will need this value for your Lambda function.</li> </ol> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/01/sns02.png" alt="" /></p> <ol start="3"> <li>Click “Subscriptions”, then “Create Subscription”</li> </ol> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/01/sns03.png" alt="" /></p> <ol start="4"> <li>Choose the ARN for the topic that you just created, set “Protocol” to “SMS”, and plug in your cell phone number for “Endpoint”. You must follow the format given, e.g. +15555551234. Click “Create subscription”. Amazon will send a confirmation text message to confirm that the owner of the phone number entered does indeed wish to participate in the subscription. Without that confirmation step, you would have a very inexpensive way to spam friends (or enemies) with all sorts of texts!</li> </ol> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/01/sns04.png" alt="" height="50%" width="50%" /></p> <ol start="5"> <li>Repeat step four for any additional cell phones that you want to send budget updates to.</li> </ol> <h2 id="aws-lambda">AWS Lambda</h2> <p>Click the “Services” dropdown at the top of the page, type “Lambda” and hit enter. This is where we will configure the function used to retrieve information from the YNAB API, and send it to a cell phone via SMS.</p> <ol> <li>Click “Create function” on the Lambda dashboard. Leave “Author from scratch” selected, provide a name for your function, choose “Python 3.7” as the runtime, and click “Create function”. Note that the default execution role is sufficient and does not need to be changed. You will now be at the configuration page for your function.</li> </ol> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/01/lambda01.png" alt="" /></p> <ol start="2"> <li>Click “Add trigger” and choose “CloudWatch Events”. Click the dropdown under “Rule” and choose “Create a new rule”. Provide a name (e.g. 
“YNAB-schedule”) and a Schedule expression, and uncheck “Enable trigger” for now. This is a <a href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/ScheduledEvents.html#CronExpressions">cron-formatted rule</a>, evaluated in UTC, so you can specify the times that work best for you to get alerts. I get alerts at noon and 8:00pm, so my expression looks like <code class="language-plaintext highlighter-rouge">cron(0 16,00 * * ? *)</code>. Once your schedule is set, click “Add”.</li> </ol> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/01/lambda02.png" alt="" /></p> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/01/lambda03.png" alt="" /></p> <ol start="3"> <li>You should be back on the Lambda function configuration page. If “CloudWatch Events” is still highlighted in the designer, click on the Lambda function name so that the code editor is displayed.</li> </ol> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/01/lambda04.png" alt="" /></p> <ol start="4"> <li>Scroll down to view the built-in code editor. In another tab, open the example code from this <a href="https://gist.github.com/shamsway/49e2a7f32a18cc9563c50cd1ba59f2ae">GitHub Gist</a>. Copy that code and paste it into the Lambda code editor. Replace the placeholders for API Key (line 6), Budget category (line 7), and SNS Topic ARN (line 31). The editor will save code automatically, but to be safe click File-&gt;Save.</li> </ol> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/01/lambda05.png" alt="" /></p> <ol start="5"> <li>At the top of the page, click the dropdown beside the “Test” button and choose “Configure test events”. Provide an event name (e.g. “Test”) and click “Create”. 
Now click the “Test” button, and if everything is working as expected, you will receive a text with the amount of money left in your budget category!</li> </ol> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/01/lambda08.png" alt="" /></p> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/01/lambda06.png" alt="" /></p> <ol start="6"> <li>Feel free to adjust the script to change wording or check additional categories based on the steps outlined. It will require some Python knowhow to adjust the existing script and add the logic and outputs for additional categories, but I’m happy to help you if needed. Once you are satisfied with the alert you’re getting when the function runs, click on “CloudWatch Events” in the “Designer” section, then click the slider to enable your schedule. You will now receive texts matching your defined schedule.</li> </ol> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/01/lambda07.png" alt="" /></p> <h1 id="final-thoughts">Final Thoughts</h1> <p>Here’s a screenshot of the text that my wife and I get twice a day. I’m tracking four different categories, but the code is a bit hacky so I’ve opted not to share it.</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/01/ynab-alert.png" alt="" height="50%" width="50%" /></p> <p>Here is my monthly AWS bill with the relevant services highlighted. Clearly this is easy to fit into the budget.</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2020/01/aws-bill.png" alt="" /></p> <p>I hope this is a helpful exercise for other YNAB users out there, or at least an example showing the power of APIs and serverless functions. There are many APIs available for consumption, so with a little imagination you can build all sorts of useful tools by following this example. I chose not to share the script I’m using to monitor multiple categories because it is fairly ugly and could use some enhancements. 
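For readers who would rather sketch their own version, here is a minimal, hypothetical outline of the moving parts. The milliunits conversion reflects how the YNAB API reports amounts ($1.00 is represented as 1000 milliunits), but the environment variable names, category handling, and overall structure are illustrative assumptions, not the code from the Gist.

```python
# Hypothetical sketch of the Lambda function; names and env vars are
# placeholders, not the author's actual script.
import json
import os
import urllib.request

YNAB_API = "https://api.youneedabudget.com/v1"

def milliunits_to_dollars(milliunits: int) -> str:
    """YNAB reports currency amounts in milliunits: $1.00 == 1000."""
    return f"${milliunits / 1000:,.2f}"

def get_category_balance(api_key: str, category_id: str) -> str:
    """Fetch one category from the YNAB API and format its balance."""
    req = urllib.request.Request(
        f"{YNAB_API}/budgets/last-used/categories/{category_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        category = json.load(resp)["data"]["category"]
    return f"{category['name']}: {milliunits_to_dollars(category['balance'])} left"

def lambda_handler(event, context):
    import boto3  # bundled with the Lambda Python runtime
    message = get_category_balance(
        os.environ["YNAB_API_KEY"], os.environ["YNAB_CATEGORY_ID"]
    )
    boto3.client("sns").publish(
        TopicArn=os.environ["SNS_TOPIC_ARN"], Message=message
    )
    return {"statusCode": 200}
```

Reading the key, category ID, and topic ARN from environment variables (configurable on the Lambda function's configuration page) is one way to keep them out of the code itself, rather than pasting them inline as the Gist's placeholders suggest.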
Ultimately I’d like to trigger this function with an API call, and pass in the categories to track via a JSON payload. This would allow me to schedule the same twice-daily messages, but I could also use the script to check categories from Alexa or some other method.</p> <p>Did I miss anything? Please comment below with questions, or find me on Twitter at <a href="https://twitter.com/NetworkBrouhaha">@NetworkBrouhaha</a>.</p> Wed, 22 Jan 2020 00:00:00 +0000 http://www.networkbrouhaha.com/2020/01/budget-bliss/ http://www.networkbrouhaha.com/2020/01/budget-bliss/ I'm joining Rubrik <p>This post will be short and sweet. I’m joining Rubrik as a Technical Marketing Engineer, focusing on Networking and Security.</p> <p>Why:</p> <ul> <li>To challenge myself and learn new things</li> <li>To contribute some of my networking (and other) knowledge</li> <li>To focus my energy on a solid product that I believe in</li> <li>To work with some incredibly bright minds</li> <li>Rubrik has a growing customer base with increasingly complex networks</li> </ul> <p>Leaving SIS was a difficult decision. I made some great friends there and genuinely hope those friendships continue. I have nothing but kind words for the SIS folks, and this new direction isn’t because of anything negative happening there. Simply put, an opportunity to work at Rubrik is one that is too good to pass up. They are disruptive in their market, and by my appraisal they are doing things the right way. No one has anything but good things to say about their leadership, and they have a compelling story about their product.</p> <p>I can’t wait to see what the next few months will bring. I’m excited to learn everything about Rubrik and get to know my new team members. 
Unfortunately this means that I’ll have to pause work on my “Hybrid Home Lab” setup, but I will continue that effort as soon as I can.</p> Mon, 22 Oct 2018 00:00:00 +0000 http://www.networkbrouhaha.com/2018/10/im-joining-rubrik/ http://www.networkbrouhaha.com/2018/10/im-joining-rubrik/ VMworld 2018 Recap <p>My journey to VMworld 2018 began in an unexpected way - a <a href="https://twitter.com/hcmccain/status/1023994969810460674">tweet from Chris McCain</a>.</p> <blockquote> <p>u a CCIE? Apply for a full VMworld pass.</p> </blockquote> <p>What did I have to lose? Never mind that VMworld was only a month away, and I had no approval to actually travel to Las Vegas. I filled out a little survey about how much I love NSX, and pressed submit. <em>Fast forward a week, a wild email appears:</em></p> <blockquote> <p>Thanks for filling out the application, I’d love to offer you the NSX Mindset scholarship to VMworld!</p> </blockquote> <p>Uh oh. That is not what I expected. After a good amount of scrambling on my part, and some very gracious actions by my employer, I was approved for travel and lodging in Las Vegas for VMworld 2018. What an exciting and unexpected turn of events! I registered, made arrangements, and started prepping to head to a conference I never expected to attend. With CiscoLive fresh on my mind, I made a concerted effort to keep my schedule realistic. I definitely wanted plenty of time to meet new people, hit some hands-on labs, and some down time so I didn’t exhaust myself. I exercised self-control while scheduling sessions, but it was not easy. There was a long list of options that piqued my interest.</p> <p>It’s no secret that my roots are in networking. I’ve spent plenty of time in the compute and storage silos, and I attend my local VMUG, but I am an “outsider” when it comes to the #vCommunity. I expected that VMworld would be a lot like CiscoLive in terms of form and function. 
I found this to be mostly true, but there are some differences that I will call out throughout this post.</p> <p>August crept by slowly, but the time finally came to board a plane bound for Las Vegas. After checking into my hotel and grabbing my VMworld badge, it was time for <a href="https://blog.vmunderground.com/opening-acts-2018/">Opening Acts</a>, followed by <a href="https://blog.vmunderground.com/vmunderground-2018/">VMunderground</a>. This was the first of several differences I noticed between VMworld and CiscoLive. If VMworld were a planet, it would have several orbiting moons that represent all the community events happening in conjunction with it. vDodgeball, vSoccer, vFit runs, vBeards gatherings - there is something for everyone. From what I can tell, these all have roots in VMUG (or vBrownbag). VMware made a smart decision in supporting and empowering VMUG leaders. It has spawned a vibrant community, and it sets VMworld apart from other events.</p> <p>Opening Acts was a great way to kick off VMworld. The panel on <a href="https://www.youtube.com/watch?v=D2CMVJQPZio">“Beating IT Burnout”</a> was a highlight, and it was fun watching my friend <a href="https://twitter.com/tbgree00">Thom</a> up on stage. <a href="https://twitter.com/MindfulAlicia">Alicia Preston</a> spoke about practicing mindfulness to combat burnout. This presentation spawned several other hallway conversations throughout the week. If you missed it, take the time to watch. VMunderground was also a great time, and I got the opportunity to meet several folks that I would continue to see at blogger tables and the VMTN area. I definitely recommend this event for anyone that is new to VMworld.</p> <h3 id="sessions">Sessions</h3> <p>Overall the session content was very good, and I was surprised at the depth of the networking material. In general, I found the sessions to be a bit more technical at CLUS than VMworld, but not by much. 
One thing I missed from CLUS was having access to a copy of the slides for each session. There were several times a presenter blew past a slide that I wanted to digest a bit more. Having the slides also keeps people from feeling like they have to snap a picture of every one. Here are my highlights:</p> <ul> <li><a href="https://videos.vmworld.com/searchsite/2018/videoplayer/18995">NSX Mindset: Clouds Collide, Opportunity Strikes (NET1919BU)</a> - This is not a technical talk, but I’d recommend it to anyone working in IT. Chris McCain is a fantastic presenter and could probably work the motivational speaker circuit.</li> <li><a href="https://videos.vmworld.com/searchsite/2018/videoplayer/20207">Kubernetes NSX-T Deep Dive (NET1677BU)</a> - I’ve spent plenty of hours trying to detangle networking in Kubernetes. This presentation lays out k8s topics and constructs in an easy-to-understand way, and makes a great case for NSX-T as one of the best ways to “do networking” in Kubernetes.</li> <li><a href="https://videos.vmworld.com/searchsite/2018/videoplayer/22674">Next-Generation Reference Design with NSX-T Data Center: Part 1 (NET1561BU)</a></li> <li><a href="https://videos.vmworld.com/searchsite/2018/videoplayer/22675">Next-Generation Reference Design with NSX-T Data Center: Part 2 (NET1562BU)</a></li> <li><a href="https://videos.vmworld.com/searchsite/2018/videoplayer/23018">VMware Cloud on AWS with NSX: Use Cases, Design, and Implementation (NET1327BU)</a> - Good overview of networking in VMWonAWS, plus a preview of things to come with NSX-T support.</li> </ul> <h3 id="keynotes-and-announcements">Keynotes and announcements</h3> <p>Honestly, I don’t really care about keynotes at conferences. The only ones I’m truly interested in are the non-technical ones, a la Michio Kaku &amp; Amy Webb at CLUS, and Malala Yousafzai at VMworld. All of the announcements are already well covered, so I’m not going to generate yet another list. 
I was absolutely thrilled at the opportunity to hear Malala speak, and I give VMware major credit for bringing her to speak, along with committing to supporting her charity. There were some grumbles about the increased security, but in my opinion it was all worth it. I am so inspired by this young woman and her commitment to fighting for education for girls everywhere. Someone - I’m not saying who - recorded her talk on Periscope, and you can watch <em>here</em>. Thinking about it still gives me all the feels.</p> <h3 id="parties">Parties</h3> <p>Maybe it’s because VMworld is in Las Vegas, but it would be an understatement to say that there were lots of parties going on. My MO for conferences is to treat them like work. I’m there to learn, and my employer is paying for me to be there. However, there were a few baller parties that are worth mentioning.</p> <ul> <li>Rubrik had the party of the week in my opinion. RUN-DMC <em>and</em> The Roots?! It was non-stop awesome and I danced my butt off. I have been a fan of The Roots since 1996, and I had only seen them live once. I made my way to the front of the stage and enjoyed a once-in-a-lifetime show. RUN-DMC was also great and Jam Master Jay’s son is a hell of a turntablist. Kudos to Rubrik for throwing a great party. <a href="https://www.youtube.com/playlist?list=PLmyCQ1p5hbAgWITKwFW6HEYAGk21b7OPQ">Here are a couple videos I took from the party</a></li> <li>VMfest was, in my opinion, a fun time. Several people I talked to skipped the party altogether. I’ve read comments from many people that thought it was terrible. <a href="http://www.royalmachinesmusic.com/home/">Royal Machines</a> was an unpopular choice for a band - I was disappointed when I saw the announcement. If this hadn’t been my first VMworld I might have skipped the party as well, but I decided to go into it with an open mind. When I walked in, there were <em>long</em> lines for food trucks scattered around the entrance area. 
I have no idea why people were waiting, as there was food available in several other places. I never had to wait in line for a drink all night. The theme was four different environments: tropical, desert, jungle, and aquatic. Maybe this turned people off - I thought it was an original idea and the decorations were well done. Royal Machines were a pleasant surprise. I’m a sucker for a good cover band, and it was a fun show. They completely embraced the ridiculousness of who they are. Dave Navarro is a rock god - it was a pleasure to watch him play. Mark McGrath understands that everyone thinks he’s a joke, and he is still willing to get out on the stage and give it his all. He gets my respect for that. Macy Gray covering Radiohead: Awesome. Sebastian Bach covering Ozzy: Awesome. Robin Zander in general: Awesome. Surprise appearance by DMC: Awesome. <a href="https://www.youtube.com/playlist?list=PLmyCQ1p5hbAgazhLfv2Lvu5iwhEPIrKd3">Videos from the show</a> / <a href="http://www.royalmachinesmusic.com/home/latest/events/vmware-las-vegas/">Setlist</a></li> <li>The NSBU threw an “NSX Mindset” party at the <a href="https://1923lv.com">1923 Bourbon Bar</a>. The place was packed, and rockin’. I truly wish I could have stayed longer, but I did not want to experience FUTURE:NET with a hangover. I did the responsible (i.e. boring) thing and slipped out early.</li> </ul> <h3 id="futurenet">FUTURE:NET</h3> <p>Future:net is a one-day “conference within a conference”, described as a “discussion on the future of networking with industry leaders and visionaries”. It is invite-only, and I was lucky to receive an invitation along with my scholarship. I first heard about this event on <a href="https://packetpushers.net">Packet Pushers</a>, and I was immediately intrigued. Of everything I had scheduled at VMworld, I was most excited for this event, and it did not disappoint. The event took place all day Thursday, and there was a welcome reception Wednesday evening. 
I considered skipping the reception, and I’m glad I walked over to The Four Seasons instead of taking a nap. The first person I met leads networking teams at Google. Not long after that, Pat Gelsinger showed up. I was standing right beside him as he and Greg Ferro made a bet about the SD-WAN industry.</p> <blockquote> <p>I just made a SD-WAN bet with @pgelsinger that NSX Velocloud and Cisco Viptela will NOT have 70% market share by this time next year. Tell me I am wrong ? https://twitter.com/etherealmind/status/1035011002855636992</p> </blockquote> <p class="center"><img src="https://networkbrouhaha.com/resources/2018/09/pgelsinger.jpg" alt="" height="50%" width="50%" /></p> <p>Thursday the conference kicked off with breakfast and a live recording of a Packet Pushers podcast, which was a real treat to watch. I have been a loyal listener for many years, but I had never gotten the chance to meet Greg, Ethan and Drew. After breakfast, the presentations began, and the first presenter was a professor from Cornell discussing blockchain. Of every presenter on the agenda I was least excited for this talk - I feel like we’ve all heard more than enough about blockchain already. I was completely wrong, and it may have been my favorite talk of the day. <a href="https://twitter.com/el33th4xor">Emin Gun Sirer</a> delivered a fascinating talk about why blockchain as a technology is much more interesting than cryptocurrencies.</p> <p>I live-tweeted the event and this blog is already long enough, so you can see my thoughts and others here: <a href="https://twitter.com/search?f=tweets&amp;vertical=default&amp;q=%23futurenet18&amp;src=typd">#FutureNET18</a>. You can also find coverage in <a href="https://packetpushers.net/podcast/weekly-show-406-updates-and-introspection/">Packet Pushers Weekly Episode 406</a> and <a href="https://packetpushers.net/podcast/network-break-200-vmware-navigates-multicloud-perils-and-opportunities/">Network Break 200</a>. 
I will try to write some more words about this event later - it really deserves its own blog post. Needless to say, I was honored to attend and it was one of the highlights of my week.</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2018/09/packetpushers.png" alt="" height="50%" width="50%" /></p> <h3 id="closing-thoughts">Closing thoughts</h3> <ul> <li>As I mentioned, I’m not going to regurgitate all of the announcements from VMworld. Here are a few links if you still need to catch up. <ul> <li>https://www.vmware.com/products/whats-new.html?src=so_5a314d05e49f5&amp;cid=70134000001SkJn</li> <li>https://anthonyspiteri.net/vmworld-2018-recap-part-1-major-announcement-breakdown/</li> <li>https://anthonyspiteri.net/vmworld-2018-recap-part-2-community-and-veeam-recap/</li> </ul> </li> <li>VMworld has a little ways to go in terms of organization. Compared to CLUS, registration was a hot mess. Cisco Live is a larger conference, and Cisco clearly throws a <em>lot</em> of resources at it. There are other small things I missed, like tables in the breakout rooms. Would this stuff prevent me from coming again? Probably not. VMware does a very good job with this conference, but they could take a couple pages out of Cisco’s playbook.</li> <li>There was a question thrown out in the Packet Pushers Slack: If you went to VMworld this year, would you go again? My answer is probably. I’m not sure if it’s an event that I would need to hit every year, but I really enjoyed my experience. The only thing that bothered me was the location. I am not a fan of Las Vegas. Everything is too expensive. Everything is over the top. There are times when I’m mildly amused, but they are few and far between. I am <em>not</em> the morality police and I am not interested in judging anyone, but being in Vegas pushes me to the edge. It makes me feel icky. 
I’ve made no decision on whether I’ll request to go to San Francisco in 2019, but I’ll seriously consider it.</li> <li>Some genius at DEF CON was handing out “blockchains” - miniature cinder blocks on a dogtag chain. I found this to be incredibly punny, so I gathered the necessary materials and brought some with me to Vegas. I figured it would be a fun way to break the ice and meet new people, and I was not wrong. Everyone loved them, and I met so many people that I would not have met otherwise. I wish I knew who came up with the original idea so I could give him/her credit. Having something fun to share is an awesome way to meet people, especially if you’re a newcomer. I’m already thinking about ways to expand on this idea if I make it to VMworld in San Francisco.</li> </ul> <p class="center"><img src="https://networkbrouhaha.com/resources/2018/09/blockchain.png" alt="" /></p> <ul> <li>If you’re in Vegas and you don’t get a meal at Hash House a Go Go, you’re losing at life.</li> </ul> <p class="center"><img src="https://networkbrouhaha.com/resources/2018/09/hashhouse.png" alt="" height="40%" width="40%" /></p> Thu, 13 Sep 2018 00:00:00 +0000 http://www.networkbrouhaha.com/2018/09/vmworld-2018-recap/ http://www.networkbrouhaha.com/2018/09/vmworld-2018-recap/ Hybrid Home Lab Pt. 1 <p>Over the last few weeks I’ve been working on standing up my version of a “real” lab. I’ve got enough information together to start writing some blog posts, so let’s dive right in. Previously, my home lab was just a custom-built Linux server with plenty of memory and software RAID. This was enough to do some small-scale network labs and run the few applications I needed, but it really doesn’t qualify as a true home lab. There’s no way for me to work with a vSphere or KVM cluster, let alone NSX-v or NSX-T. 
I’ve laid out a few goals for my “Hybrid Home Lab”:</p> <ul> <li>On Prem Resources <ul> <li>2x UCS C220 M3</li> <li>Re-purpose existing server as a home NAS <ul> <li>Utilize hardware RAID and serve LUNs via iSCSI or NFS</li> </ul> </li> <li><a href="https://www.ubnt.com/edgemax/edgeswitch-16-xg/">Ubiquiti EdgeSwitch 16XG</a></li> <li><a href="https://www.ubnt.com/edgemax/edgerouter-poe/">Ubiquiti EdgeRouter PoE</a></li> <li>Purpose: Compute Virtualization Lab (vSphere or KVM), Network Virtualization Lab (NSX-V, NSX-T, EVE-NG, VIRL, GNS3), Kubernetes backup cluster</li> </ul> </li> <li>Cloud Resources <ul> <li>Hosted in <a href="https://www.vmware.com/products/vcloud-director.html">vCloud Director</a></li> <li><a href="https://rancher.com/blog/2018/2018-05-01-rancher-ga-announcement-sheng-liang/">Rancher 2.0</a> Kubernetes cluster</li> <li><a href="https://opnsense.org">OPNsense</a> firewall <ul> <li>Also provides <a href="https://www.zerotier.com">ZeroTier</a> VPN/SD-WAN and <a href="https://haproxy.org">HAproxy</a> load balancing</li> <li>Replaces NSX edge in vCD</li> </ul> </li> <li><a href="https://www.gluster.org">Gluster</a> for persistent Kubernetes storage</li> <li>Purpose: Learn Kubernetes, deliver applications independent of on prem resources, test OPNsense as a “cloud router” and ZeroTier for hybrid cloud scenarios <ul> <li>Applications I’ll try to run: Gitlab, Netbox, Zabbix, Grafana, MariaDB/Postgres, StackStorm, other automation tools, and custom</li> </ul> </li> </ul> </li> </ul> <p>I will be publishing detailed blog posts on the setup of these components - stay tuned!</p> <h1 id="but-why">But, Why?</h1> <p class="center"><img src="https://networkbrouhaha.com/resources/2018/08/ytho.jpg" alt="" height="35%" width="35%" /></p> <p class="center">(my daughter approves the use of this meme)</p> <p><strong>Why vCD?</strong> I have access to a vCD lab at work. 
I have to keep a small footprint, but this is much more economical than using another cloud provider. We’ve run vCD at my <a href="https://thinksis.com">day job</a> for quite a while, and I’ve become fond of it. It’s come a <em>long</em> way since we initially deployed it, and it continues to improve. <a href="https://twitter.com/search?f=tweets&amp;vertical=default&amp;q=%23LongLiveVCD">#LongLiveVCD</a></p> <p><strong>Why Rancher?</strong> This is another product that we’re using at work, so I have some motivation to learn it. It definitely is “training wheels” for Kubernetes, and I’m already getting the itch to experiment with vanilla Kubernetes or OpenShift. For now it does what I need it to, and it’s not terribly difficult to take all my YAML files and load them in another Kubernetes cluster later.</p> <p><strong>Why are you running stateful applications in Kubernetes?</strong> I understand that Kubernetes is mainly for stateless applications and microservices, but it does support stateful workloads. This is a lab, and sometimes it is fun to push the limits.</p> <p><strong>Why Gluster?</strong> Persistent storage in Kubernetes is a PITA if you’re not using one of the major cloud providers, or leveraging storage that provides a Kubernetes plugin. <a href="https://github.com/heketi/heketi">Heketi</a> provides an API interface for GlusterFS that Kubernetes can leverage. I’ll provide more information in a later blog post, but this was the easiest way to provide redundant persistent storage for my Rancher cluster.</p> <p><strong>Why OPNsense?</strong> Yes, vCD provides an NSX edge. In vCD 9.1, it is full-featured and suitable for most workloads. I’m a network nerd, so this is one of the areas where I want more flexibility than what NSX can provide. 
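As an aside on the Gluster/Heketi piece above: what makes dynamic provisioning work is that Heketi fronts the Gluster cluster with a small REST API, and Kubernetes asks it for a volume of a given size whenever a claim needs one. Here is a rough sketch of that request, with a made-up lab address for the Heketi endpoint and Heketi’s JWT authentication omitted for brevity:

```python
import json
import urllib.request

# Hypothetical lab address -- substitute your own Heketi endpoint.
HEKETI_URL = "http://heketi.lab.example:8080"

def build_volume_request(size_gb, replicas=3):
    """Payload for Heketi's volume-create endpoint: a replicated
    Gluster volume of the requested size (in GB)."""
    return {
        "size": size_gb,
        "durability": {
            "type": "replicate",
            "replicate": {"replica": replicas},
        },
    }

def create_volume(size_gb):
    """POST the request to Heketi, which carves the volume out of
    the Gluster cluster (auth omitted; real setups sign a JWT)."""
    req = urllib.request.Request(
        f"{HEKETI_URL}/volumes",
        data=json.dumps(build_volume_request(size_gb)).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return urllib.request.urlopen(req)
```

Once the glusterfs provisioner is wired up in the cluster, a plain PersistentVolumeClaim triggers this same kind of call automatically, which is why it ends up being the easiest path to redundant storage.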
The <a href="https://opnsense.org/about/features/">feature list</a> for OPNsense is impressive, and most importantly for me, it has built-in support for ZeroTier.</p> <p><strong>Why ZeroTier?</strong> Please see my previous post on <a href="/2018/03/vcd-terraform-example/">cloud automation</a>. Future posts will go into more detail on this as well.</p> <h1 id="show-me-the-diagram">Show me the diagram</h1> <p>IP addresses have been changed to protect the innocent.</p> <p class="center"><a href="https://networkbrouhaha.com/resources/2018/08/hybrid_lab_diagram.png" height="75%" width="75%"><img src="https://networkbrouhaha.com/resources/2018/08/hybrid_lab_diagram.png" alt="hybrid lab diagram" /></a></p> <p class="center">(Click to embiggen)</p> Tue, 21 Aug 2018 00:00:00 +0000 http://www.networkbrouhaha.com/2018/08/hybrid-home-lab-pt1/ http://www.networkbrouhaha.com/2018/08/hybrid-home-lab-pt1/ CLUS 2018 recap <p>For the first time in seven years, I had the opportunity to travel to Cisco Live 2018 in Orlando, FL. In this belated blog post, I’ve got a few thoughts, a few tips, and a bit of geeking out.</p> <p>There’s a thrill to registering for Cisco Live: scheduling sessions, RSVPing to party invites, planning to meet friends, and booking flights. The most important part, by far, is creating a reasonable schedule. CLUS is a marathon, not a sprint, and you have to be careful to not overburden yourself. I was at packed 8:00am sessions every day but Thursday, and up fairly late most nights. There is simply too much to do. 
Below is a list of sessions I attended, to get an idea of my week.</p> <ul> <li>[BRKSDN-2262] <a href="https://www.ciscolive.com/global/on-demand-library/?search=BRKSDN-2262#/session/1516099602451001CqMa">Open Source for Networking: The FD.io/VPP example</a></li> <li>[DEVNET-1293] <a href="https://www.ciscolive.com/global/on-demand-library/?search=DEVNET-1293#/session/1509733975288001YGm4">Cisco UCS Automation and orchestration with Ansible</a></li> <li>[BRKDCN-2035] <a href="https://www.ciscolive.com/global/on-demand-library/?search=BRKDCN-2035#/session/1509501687106001POqy">VXLAN BGP EVPN based Multi-Site</a></li> <li>[DEVNET-2644] <a href="https://www.ciscolive.com/global/on-demand-library/?search=DEVNET-2644#/session/15111940816080019wR4">Open Network Automation Platform</a> (ONAP)</li> <li>[BRKDCN-3040] <a href="https://www.ciscolive.com/global/on-demand-library/?search=BRKDCN-3040#/session/1509501655684001PLO4">Troubleshooting VxLAN BGP EVPN</a></li> <li>[DEVNET-1296] <a href="https://www.ciscolive.com/global/on-demand-library/?search=DEVNET-1296#/session/1510584364275001jLdB">Building a NetDevOps CICD Pipeline with OpenSource</a></li> <li>[BRKSDN-2115] <a href="https://www.ciscolive.com/global/on-demand-library/?search=BRKSDN-2115#/session/1512002243477001x6sa">Introduction to Containers and Container Networking</a></li> <li>[BRKDCN-3001] <a href="https://www.ciscolive.com/global/on-demand-library/?search=BRKDCN-3001#/session/1512769713770001R5Fc">Leveraging Micro Segmentation to Build Comprehensive Data Center Security Architecture</a></li> <li>[BRKRST-3310] <a href="https://www.ciscolive.com/global/on-demand-library/?search=BRKRST-3310#/session/1518011397038001CXX2">Troubleshooting OSPF</a></li> <li>[BRKCLD-3440] <a href="https://www.ciscolive.com/global/on-demand-library/?search=BRKCLD-3440#/session/1511296161600001A5Dh">Multicloud Networking – Design &amp; Deployment</a></li> <li>[BRKDCN-2125] <a 
href="https://www.ciscolive.com/global/on-demand-library/?search=BRKDCN-2125#/session/1509501687216001Pex1">Overlay Management and Visibility with VXLAN</a></li> <li>[DEVNET-1365] <a href="https://www.ciscolive.com/global/on-demand-library/?search=DEVNET-1365#/session/1499457537273001QPDr">DevNet Workshop- Vagrant Up for the Network Engineer (NX-OS, IOS-XE, IOS-XR)</a></li> <li>[DEVNET-2076] <a href="https://www.ciscolive.com/global/on-demand-library/?search=DEVNET-2076#/session/1510880880567001k3i2">Continuous Integration and Testing for Networks with Ansible</a></li> <li>[BRKSEC-2010] <a href="https://www.ciscolive.com/global/on-demand-library/?search=BRKSEC-2010#/session/1509501665659001Pw8M">Talos Insights: The State of Cyber Security</a></li> <li>[KEYGEN-1003] <a href="https://www.ciscolive.com/global/on-demand-library/?search=KEYGEN-1003#/session/1520266383574001fJzL">Closing Keynote: What Science Can Tell Us About Our Future</a></li> </ul> <p>Here is the approach I took to building my schedule:</p> <ul> <li>I went through the course catalog, filtering by technology, and marked every interesting course as a favorite. All favorites are saved, so you can go back and watch recordings for sessions you missed once they’re posted.</li> <li>I noted 5-6 “must-attend” sessions, and registered for them as soon as registration opened.</li> <li>Filtering by time slot and favorite sessions, I filled up the rest of my schedule. I didn’t worry about leaving time for lunch at this stage.</li> <li>After some internal deliberation, I dropped between a quarter and a third of the courses I’d registered for. This provided time to eat, rest, socialize, and attend some of the “meatspace only” opportunities (DevNet, Walk-in Self-Paced Labs, Tweetups, etc.)</li> </ul> <p>I knew I’d made good picks when I walked into my first session and sat down behind Terry Slattery and Wendell Odom. 
My favorite session was <a href="https://www.ciscolive.com/global/on-demand-library/?search=BRKRST-3310#/session/1518011397038001CXX2">Troubleshooting OSPF</a>, by Nick Russo. The room was packed, and Nick put on a master class. If you missed it, do yourself a favor and watch it now. You don’t need to be an OSPF guru to keep up, but I’m willing to bet that even the most seasoned CCIE R&amp;S will gain something from this session. Overall the session content across the board was top notch, with only a couple sessions that I found mildly disappointing at worst.</p> <p>Almost every session recording is <a href="https://www.ciscolive.com/global/on-demand-library/">posted online</a>, so there is no reason to have Cisco Live session FOMO. Most of us go to CLUS to learn the latest and greatest in our chosen technology stacks, but I find far greater value in the human connections I formed. I’m an extrovert, so being surrounded by a throng of people gives me energy. As I walked down the halls I would look around and think to myself, “Yes, these are my people!”</p> <p>I made a concerted effort to connect with as many online friends and personal inspirations as I could. 
Here’s an incomplete list of folks I was either able to meet or learn from: <a href="https://rule11.tech">Russ White</a>, <a href="https://twitter.com/bcjordo">Jordan Martin</a>, <a href="https://twitter.com/SharpNetwork">Eyvonne Sharp</a>, <a href="https://www.netcraftsmen.com/team/terrance-slattery/">Terry Slattery</a> (plus many other NetCraftsmen I sat in sessions with), <a href="https://twitter.com/Wendellodom">Wendell Odom</a>, <a href="https://twitter.com/ScottMorrisCCIE">Scott Morris</a>, <a href="https://twitter.com/CCIE21921">Lukas Krattiger</a>, <a href="https://twitter.com/hfpreston">Hank Preston</a>, <a href="https://twitter.com/jedelman8">Jason Edelman</a>, <a href="https://twitter.com/nickrusso42518">Nick Russo</a>, <a href="https://twitter.com/danieldibswe">Daniel Dibb</a>, <a href="https://twitter.com/dmfigol">Dmitry Figol</a>, <a href="https://twitter.com/kmcnam1">Katherine McNamara</a>, <a href="https://www.networkingwithfish.com">Denise Fishburne</a>, <a href="https://www.linkedin.com/in/humphreycheung/">Humphrey Cheung</a>, <a href="https://twitter.com/theLANtamer">Quentin Demmon</a> and <a href="https://twitter.com/showipintbri">Tony Efantis</a>, not to mention all the fine folks I met from <a href="https://www.meetup.com/routergods/">RouterGods</a>. This is a prolific group of networkers. If you want to improve yourself, what better way is there than learning from people like this? I’m also a believer in spreading gratitude, so I made sure to personally thank folks that had helped me grow technically and professionally. Every single person I thanked seemed genuinely appreciative to hear it. There’s never any harm in spreading the love!</p> <p>My only regret is that I did not hunt down <a href="https://twitter.com/Drew_CM">Drew Conry-Murray</a>, as I am an avid <a href="https://packetpushers.net">Packet Pushers</a> listener and I love <a href="https://packetpushers.net/series/network-break-podcast-post/">The Network Break</a>. 
Hopefully I can remedy this next year!</p> <p>I have to give special attention to the <a href="https://developer.cisco.com">DevNet Zone</a>, and the folks that put it all together. This area was filled with some of the best content of the conference. Network Automation, Programming, APIs and the future of Networking in general were on full display. There were hands-on labs and experts willing to whiteboard anything you wanted to discuss. Watching Wendell Odom geek out like the rest of us as Hank Preston presented on NetDevOps was a particularly cool moment.</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2018/07/networkcicd.png" alt="" height="50%" width="50%" /></p> <p>You’ll notice from the list of sessions above that I only attended one keynote. There were DevNet sessions that I wanted to attend instead, and the keynotes are posted online, so it wasn’t a tough decision. The closing keynote, featuring <a href="https://amywebb.io">Amy Webb</a> and <a href="http://mkaku.org">Dr. Michio Kaku</a>, is a different story. By Thursday I was running on fumes, so I took the day easy. About an hour before the closing keynote, I made my way towards the entrance and saw a huge line had already formed. I had no interest in standing for an hour, so I found an empty seat nearby and waited for the doors to open. For some reason they didn’t open the doors where folks had queued - they opened the doors <em>directly</em> behind the seat I was sitting in. I was surprised and felt bad for the people that had been waiting in line, but I’m no dummy. I grabbed my stuff, walked in, and got seated in the front row almost directly in front of the stage. Talk about good luck! To top it off, as I was sitting there, one of my tweets was flashed up on the uber-displays. It was an amazing and surreal way to end CLUS. 
Both Amy and Michio gave great keynotes to wrap up CLUS.</p> <p class="center"><img src="https://networkbrouhaha.com/resources/2018/07/clustweet.png" alt="" height="75%" width="75%" /></p> <h2 id="closing-thoughts-and-tips">Closing thoughts and tips</h2> <p>I had a great time at Cisco Live 2018. It was so fulfilling to meet and hang out with everyone, learn new things, explore the DevNet Zone/World of Solutions, and attend several great parties. I will admit to feeling somewhat overwhelmed the whole time I was there. There is something bright and shiny to grab your attention at every turn. Keeping up with Twitter is a job in itself, and the Cisco Social Media team really deserves kudos for the great job they do during CLUS. However, I cannot disagree with anything Tom Hollingsworth wrote in his <a href="https://networkingnerd.net/2018/06/22/finding-value-in-cisco-live-2018/">Cisco Live Recap</a>. CLUS is a great event, but there will always be ways to improve and provide better value. In the end, like most things, you will get out of it what you put into it.</p> <p>Here are a few random tips to wrap up this post:</p> <ul> <li>Take breaks - you will need time to decompress.</li> <li>Stay hydrated.</li> <li>Come prepared to learn a lot, and keep a notebook handy. You may find yourself wanting to take notes when least expected.</li> <li>Put yourself out there. Go out of your way to introduce yourself to peers in sessions, during meals, and at parties. Bring business cards.</li> <li>If you’re social, hit the Tweetups. This is a great place to meet people and network.</li> <li>Go easy at the parties. 
You’ll do yourself no favors by trying to make it through the next day hungover.</li> <li>HAVE FUN.</li> </ul> <p class="center"><img src="https://networkbrouhaha.com/resources/2018/02/drink_route_tr.png" alt="" height="25%" width="25%" /></p> Mon, 23 Jul 2018 00:00:00 +0000 http://www.networkbrouhaha.com/2018/07/CLUS-recap/ http://www.networkbrouhaha.com/2018/07/CLUS-recap/