<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Posts on Journey through Cloud &amp; Code</title><link>https://gurupasupathy.com/post/</link><description>Recent content in Posts on Journey through Cloud &amp; Code</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Sat, 06 Jun 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://gurupasupathy.com/post/index.xml" rel="self" type="application/rss+xml"/><item><title>HandsOn — Building Hybrid Cloud Environment — Part 5— Connectivity— Site-to-Site VPN establishing…</title><link>https://gurupasupathy.com/post/2026-06-06_building-hce-part-5-connectivity-site-to-site-vpn-establishing/</link><pubDate>Sat, 06 Jun 2026 00:00:00 +0000</pubDate><guid>https://gurupasupathy.com/post/2026-06-06_building-hce-part-5-connectivity-site-to-site-vpn-establishing/</guid><description>&lt;p&gt;In the &lt;a href="https://pasupathy-guru.medium.com/handson-building-hybrid-cloud-environment-part-4-identity-domain-joining-a-linux-vm-and-59a48a2be7f2?source=friends_link&amp;amp;sk=fba0fcbdf7ae87efd0f4b01ed798c21e"&gt;previous part,&lt;/a&gt; we established an on-premises identity foundation. The on-premises setup consists of a virtual network with Windows and Linux VMs joined to an on-premises Active Directory domain hosted on two domain controllers. In this part, we will create a VPN Gateway in Azure and a StrongSwan IPsec gateway on-premises and establish the Site-to-Site VPN tunnel — the foundation of our hybrid lab.&lt;/p&gt;
&lt;p&gt;Implementing a Site-to-Site (S2S) tunnel is simple — so rather than walking through the steps procedurally, I want to focus on what each component is actually doing.&lt;/p&gt;</description><content:encoded><![CDATA[<p>In the <a href="https://pasupathy-guru.medium.com/handson-building-hybrid-cloud-environment-part-4-identity-domain-joining-a-linux-vm-and-59a48a2be7f2?source=friends_link&amp;sk=fba0fcbdf7ae87efd0f4b01ed798c21e">previous part,</a> we established an on-premises identity foundation. The on-premises setup consists of a virtual network with Windows and Linux VMs joined to an on-premises Active Directory domain hosted on two domain controllers. In this part, we will create a VPN Gateway in Azure and a StrongSwan IPsec gateway on-premises and establish the Site-to-Site VPN tunnel — the foundation of our hybrid lab.</p>
<p>Implementing a Site-to-Site (S2S) tunnel is simple — so rather than walking through the steps procedurally, I want to focus on what each component is actually doing.</p>
<p>A Site-to-Site VPN connects two networks over the public internet using an encrypted IPsec tunnel. Each end has a gateway that authenticates the other using a Pre-Shared Key (PSK). Only traffic destined for the remote subnet goes through the tunnel — everything else uses the normal internet route. The tunnel is managed by IKE (Internet Key Exchange) which negotiates the Security Association (SA) — the agreed encryption parameters — before any traffic flows.</p>
<blockquote>
<p>Before walking through the steps, here are the key addresses we’ll reference throughout</p>
</blockquote>
<blockquote>
<p><code>192.168.122.0/24</code>: on-premises network hosting the KVM VMs (virbr0)<br>
<code>192.168.1.106</code>: <strong>on-premises VPN gateway</strong> (StrongSwan)<br>
<code>192.168.1.0/24</code>: on-premises Wi-Fi LAN<br>
<code>61.69.136.49</code>: <strong>on-premises public IP</strong><br>
<code>20.219.67.227</code>: <strong>Azure VPN Gateway public IP</strong><br>
<code>10.66.0.0/16</code>: Azure VNet<br>
<code>10.66.0.0/24</code>: Azure GatewaySubnet<br>
<code>10.66.5.0/24</code>: Azure WorkloadSubnet</p>
</blockquote>
<blockquote>
<p>Some of these will be created in the steps below; others are already in place from earlier parts.</p>
</blockquote>
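<p>To make the "only traffic destined for the remote subnet goes through the tunnel" idea concrete, here is a minimal Python sketch using the addresses listed above. It assumes the tunnel covers only the on-premises ranges and the Azure WorkloadSubnet; the function name <code>uses_tunnel</code> is made up for illustration and is not part of StrongSwan or Azure.</p>

```python
import ipaddress

# Address ranges taken from the list above.
ONPREM_SUBNETS = [
    ipaddress.ip_network("192.168.122.0/24"),  # libvirt network (virbr0)
    ipaddress.ip_network("192.168.1.0/24"),    # Wi-Fi LAN
]
AZURE_WORKLOAD_SUBNET = ipaddress.ip_network("10.66.5.0/24")


def uses_tunnel(src: str, dst: str) -> bool:
    """Return True when an on-prem packet would match the tunnel's traffic selectors."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    from_onprem = any(src_ip in net for net in ONPREM_SUBNETS)
    return from_onprem and dst_ip in AZURE_WORKLOAD_SUBNET


# An on-prem VM talking to an Azure workload VM is tunnelled;
# the same VM talking to a public address uses the normal internet route.
print(uses_tunnel("192.168.122.10", "10.66.5.4"))
print(uses_tunnel("192.168.122.10", "8.8.8.8"))
```

<p>This is only a model of the selection logic; on the real host the decision is made by the kernel policies that StrongSwan installs.</p>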
<h4 id="azure-configurations">Azure Configurations</h4>
<h4 id="1-create-a-virtual-network-inazure"><strong>1. Create a virtual network in Azure</strong></h4>
<p>When creating the VNet in the Azure portal, decide on an address range and create two subnets named <code>GatewaySubnet</code> and <code>WorkloadSubnet</code>, as shown below. <code>WorkloadSubnet</code> is where we will create the VMs that need to talk to on-premises.</p>
<p><img loading="lazy" src="https://cdn-images-1.medium.com/max/800/1*rcxSAYWJVa8TZ75PgThPYg.png"></p>
<p>Once the VNet is provisioned, we need three components on the Azure side to enable connectivity with on-premises: a VPN gateway, a Local Network Gateway, and a Connection that links the two.</p>
<h4 id="2-azure-vpngateway"><strong>2. Azure VPN Gateway</strong></h4>
<p>This is the component responsible for establishing a secure tunnel with the on-premises network. The VPN Gateway is deployed into the virtual network you want to pair with your on-premises network, and it must be deployed in a subnet named <code>GatewaySubnet</code>. This is a hard requirement: Azure reserves this subnet name for gateway infrastructure and rejects gateway deployments into a subnet with any other name. All traffic between the VNet and on-premises passes through this gateway, and if forced tunneling is enabled, internet-bound traffic from the VNet is sent through it as well. Incoming traffic lands in the GatewaySubnet and is routed from there to its destination within the VNet.</p>
<p>For our purpose, a Basic tier VPN gateway is sufficient. The Azure portal no longer shows a VPN type selector; all new gateways are route-based by default and support IKEv2, which is what we will use.</p>
<blockquote>
<p>Remember that on top of fixed monthly charge, there are costs associated with traffic entering and leaving the network via VPN Gateway — <a href="https://azure.microsoft.com/en-us/pricing/details/vpn-gateway/">https://azure.microsoft.com/en-us/pricing/details/vpn-gateway/</a></p>
</blockquote>
<p>Once the VPN Gateway is created, take a note of the public IP assigned to it. In my case it is — <code>20.219.67.227</code></p>
<h4 id="3-local-networkgateway"><strong>3. Local Network Gateway</strong></h4>
<p>Now that we have a gateway in our Azure VNet, we need a way to identify the on-premises network. For this, an Azure resource called Local Network Gateway is used. <em>This is the representation of the on-premises network</em>. When you create a Local Network Gateway, provide the public IP of your on-premises site as the IP address, and the on-premises network ranges you want to include in the tunnel as the address space:</p>
<p>IP address — <code>61.69.136.49</code> (public IP of your Wi-Fi router), you can confirm this by running the below command</p>
<p>curl -s ifconfig.me # 61.69.136.49</p>
<blockquote>
<p>Note: This address can change when your ISP reassigns it, which typically happens on a router restart or DHCP lease expiry. For a home lab this is fine, but for a production setup you should get a static public IP.</p>
</blockquote>
<p>Address Space(s) —</p>
<p><strong><code>192.168.122.0/24</code></strong> (the libvirt virtual network) and <strong><code>192.168.1.0/24</code></strong> (your Wi-Fi network)</p>
<blockquote>
<p>Note: You can skip the Wi-Fi range if you do not intend to have other devices in your Wi-Fi to participate in the S2S tunnel</p>
</blockquote>
<h4 id="4-connection"><strong>4. Connection</strong></h4>
<p>And the final bit of the Azure end of configuration is a Connection. A connection is the link between the VPN Gateway and the Local Network Gateway.</p>
<p>From VPN Gateway, create a connection of type “Site-to-Site (IPSec)” and choose IKEv2, provide the shared key (PSK), connection mode and leave the rest as it is.</p>
<h4 id="on-premises-configuration">On-premises configuration</h4>
<p>The on-premises side needs a VPN gateway configuration that mirrors the Azure side, and it is simpler by comparison. We will use StrongSwan as the VPN gateway; the following sections walk through the configuration needed to enable the site-to-site tunnel.</p>
<h4 id="5-install--configure-strongswan"><strong>5. Install / Configure</strong> <a href="https://docs.strongswan.org/docs/latest/howtos/introduction.html"><strong>StrongSwan</strong></a></h4>
<p>StrongSwan is an open-source IPsec implementation for Linux. It runs as a daemon on the on-premises host and is responsible for IKE negotiation, SA establishment, and installing the resulting XFRM policies and keys into the Linux kernel.</p>
<p>Install StrongSwan and ensure it is running</p>
<p>sudo apt install strongswan<br>
sudo systemctl enable strongswan-starter<br>
sudo systemctl start strongswan-starter<br>
sudo systemctl status strongswan-starter</p>
<h4 id="6-ipsecconf"><strong>6. ipsec.conf</strong></h4>
<p>This is StrongSwan’s main configuration file — it defines the tunnel connection parameters including peer identities, the subnets to advertise on each side, the encryption proposals, and the connection behaviour on startup and failure.</p>
<p>sudo nano /etc/ipsec.conf</p>
<p>Minimal config:</p>
<p>config setup<br>
charondebug="ike 2, knl 2, cfg 2"</p>
<p>conn azure-s2s<br>
keyexchange=ikev2<br>
left=&lt;onprem_vpn_gateway&gt; #Linux box with Strongswan<br>
leftid=&lt;onprem_vpn_gateway&gt; #Linux box with Strongswan<br>
leftsubnet=&lt;onprem_subnet_1_range&gt;,&lt;onprem_subnet_2_range&gt;<br>
right=&lt;azure_vpn_gateway_public_ip&gt;<br>
rightid=&lt;azure_vpn_gateway_public_ip&gt;<br>
rightsubnet=&lt;azure_workload_subnet_address_range&gt;<br>
authby=secret<br>
auto=start<br>
ike=aes256-sha256-modp1024! # acceptable for lab environments<br>
esp=aes256-sha256!<br>
dpdaction=restart<br>
dpddelay=30s<br>
dpdtimeout=120s</p>
<p>Each attribute controls a specific aspect of how StrongSwan negotiates and maintains the tunnel:</p>
<p><strong><code>keyexchange=ikev2</code></strong>: specifies IKEv2 as the key exchange protocol. IKEv2 is more efficient than IKEv1 (fewer round trips to establish the SA) and handles NAT traversal natively, which matters here since the on-premises side is behind a home router.</p>
<p><strong><code>left</code></strong> <strong>/</strong> <strong><code>leftid</code></strong>: identifies the local end of the tunnel. <code>left</code> is the IP StrongSwan binds to; <code>leftid</code> is how it identifies itself to the remote peer during IKE negotiation. Both are set to the StrongSwan host&rsquo;s LAN IP here.</p>
<p><strong><code>leftsubnet</code></strong>: defines which on-premises ranges StrongSwan advertises through the tunnel. These must match the address spaces configured in the Azure Local Network Gateway, because Azure uses the LNG configuration to inject routes into the VNet.</p>
<p><strong><code>right</code></strong> <strong>/</strong> <strong><code>rightid</code></strong>: the mirror of <code>left</code>, identifying the remote peer, in this case the Azure VPN Gateway&rsquo;s public IP.</p>
<p><strong><code>rightsubnet</code></strong>: the network ranges behind the Azure VPN Gateway that on-premises should route through the tunnel. Traffic destined for these ranges will be intercepted by XFRM and encrypted.</p>
<p><strong><code>authby=secret</code></strong>: use a Pre-Shared Key for authentication, as configured in <code>ipsec.secrets</code>.</p>
<p><strong><code>auto=start</code></strong>: bring the tunnel up automatically when StrongSwan starts. Setting this to <code>add</code> instead would make StrongSwan a passive responder only.</p>
<p><strong><code>ike=aes256-sha256-modp1024!</code></strong>: the Phase 1 (IKE SA) proposal: AES-256 encryption, SHA-256 integrity, and Diffie-Hellman group 2 (modp1024, <strong><em>acceptable for lab environments</em></strong>). The trailing <code>!</code> makes the proposal strict: it is the only one offered, so StrongSwan will not fall back to weaker algorithms. The Azure side must accept this exact combination.</p>
<p><strong><code>esp=aes256-sha256!</code></strong>: the Phase 2 (ESP) proposal governing how actual data packets are encrypted inside the tunnel. Same strict-match semantics as the <code>ike</code> line.</p>
<p><strong><code>dpdaction=restart</code></strong>: Dead Peer Detection behaviour. If the remote peer goes silent, StrongSwan will attempt to re-establish the tunnel rather than leave a stale SA. <code>dpddelay</code> sets the interval between liveness checks; <code>dpdtimeout</code> applies to IKEv1 only, and with IKEv2 the retransmission timeout decides when the peer is declared dead.</p>
<p>Example —</p>
<p>config setup<br>
charondebug="ike 2, knl 2, cfg 2"</p>
<p>conn azure-s2s-manual<br>
keyexchange=ikev2<br>
left=192.168.1.106<br>
leftid=192.168.1.106<br>
leftsubnet=192.168.122.0/24,192.168.1.0/24<br>
right=20.219.67.227 # (hce-d01-vpngw-pip)<br>
rightid=20.219.67.227 # (hce-d01-vpngw-pip)<br>
rightsubnet=10.66.5.0/24<br>
authby=secret<br>
auto=start<br>
ike=aes256-sha256-modp1024!<br>
esp=aes256-sha256!<br>
dpdaction=restart<br>
dpddelay=30s<br>
dpdtimeout=120s</p>
<blockquote>
<p><strong><code>rightsubnet</code> = where your workloads live = what you want to reach.</strong></p>
</blockquote>
<blockquote>
<p><code>GatewaySubnet</code> is infrastructure: it&rsquo;s where Azure&rsquo;s VPN Gateway itself runs. You never deploy VMs there, and you never put it in <code>rightsubnet</code>. It&rsquo;s not a destination, it&rsquo;s a transit point.</p>
</blockquote>
<blockquote>
<p>So the mental model:</p>
</blockquote>
<blockquote>
<p>rightsubnet = subnets behind the remote gateway<br>
 = where the actual VMs/services are<br>
 = NOT the gateway’s own subnet</p>
</blockquote>
<blockquote>
<p>The same logic applies symmetrically to <code>leftsubnet</code> on your side: it&rsquo;s the subnets behind your StrongSwan (your VM network, your LAN), not StrongSwan&rsquo;s own IP.</p>
</blockquote>
<blockquote>
<p><strong>The gateway subnet on both sides is implied</strong> — both ends know the gateways exist because they’re talking to each other. What they need to tell each other is “what’s <em>behind</em> me that you can reach.”</p>
</blockquote>
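<p>The two rules of thumb above (<code>leftsubnet</code> must match the Local Network Gateway address spaces, and <code>rightsubnet</code> should not include the GatewaySubnet) can be checked mechanically. The following is a rough Python sketch, not a StrongSwan or Azure tool; the function names <code>parse_subnets</code> and <code>check_selectors</code> are my own, and the values are the ones used in this lab.</p>

```python
import ipaddress


def parse_subnets(value: str) -> set:
    """Parse a comma-separated leftsubnet/rightsubnet value into a set of networks."""
    return {ipaddress.ip_network(part.strip()) for part in value.split(",") if part.strip()}


def check_selectors(leftsubnet: str, rightsubnet: str,
                    lng_address_spaces: str, gateway_subnet: str) -> list:
    """Return a list of human-readable problems; an empty list means the values agree."""
    problems = []
    gateway = ipaddress.ip_network(gateway_subnet)
    if parse_subnets(leftsubnet) != parse_subnets(lng_address_spaces):
        problems.append("leftsubnet does not match the Local Network Gateway address spaces")
    for net in sorted(parse_subnets(rightsubnet), key=str):
        if net.overlaps(gateway):
            problems.append(f"rightsubnet {net} overlaps GatewaySubnet {gateway}")
    return problems


# Values from the example configuration in this post.
print(check_selectors(
    leftsubnet="192.168.122.0/24,192.168.1.0/24",
    rightsubnet="10.66.5.0/24",
    lng_address_spaces="192.168.1.0/24, 192.168.122.0/24",
    gateway_subnet="10.66.0.0/24",
))
```

<p>The order of the ranges does not matter in this check, which is why both sides are compared as sets.</p>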
<h4 id="7-ipsecsecrets"><strong>7. ipsec.secrets</strong></h4>
<p>The <code>ipsec.secrets</code> file is read by charon, StrongSwan&rsquo;s IKEv2 keying daemon, at startup and whenever you run <code>ipsec rereadsecrets</code>. It holds the Pre-Shared Key used to authenticate both peers during IKE Phase 1. The format is:</p>
<p>&lt;local-id&gt; &lt;remote-id&gt; : PSK "shared-secret"</p>
<p>The two IPs identify the tunnel endpoints — they must match the <code>leftid</code> and <code>rightid</code> values in <code>ipsec.conf</code> exactly, because charon looks up the secret by matching the peer identities presented during IKE negotiation. The PSK itself must match what was configured in the Azure Connection resource, character for character.</p>
<p>This file does not change structure over the lifetime of the tunnel. The only reason to update it is if you rotate the PSK: in Azure you set a new shared key on the Connection, then update the value here and run <code>sudo ipsec rereadsecrets</code> to pick it up without restarting the daemon or dropping the tunnel.</p>
<p>One important operational note: this file contains a plaintext secret and should be owned by root with permissions <code>600</code>. StrongSwan will warn if it is world-readable.</p>
<p>sudo nano /etc/ipsec.secrets</p>
<p>192.168.1.106 20.219.67.227 : PSK "your_shared_key"</p>
<p>We have now set up all the infrastructure needed to bring up the tunnel. Before doing that, let us look at how tunnels are established.</p>
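<p>To illustrate why the IDs in <code>ipsec.secrets</code> must match <code>leftid</code> and <code>rightid</code>, here is a simplified Python model of the lookup. It is only a sketch: charon&rsquo;s real matching logic is more elaborate, the parser below handles only single-line PSK entries with IPv4 or name IDs, and <code>parse_secrets</code> and <code>find_psk</code> are hypothetical names.</p>

```python
import re

# Matches lines of the form:  id1 id2 : PSK "secret"
SECRET_LINE = re.compile(r'^([^:]*?)\s*:\s*PSK\s+"([^"]*)"\s*$')


def parse_secrets(text: str) -> list:
    """Return a list of (ids, psk) tuples from ipsec.secrets-style text."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        match = SECRET_LINE.match(line)
        if match:
            entries.append((match.group(1).split(), match.group(2)))
    return entries


def find_psk(entries: list, local_id: str, remote_id: str):
    """Return the PSK whose ID selectors contain both peer identities, else None."""
    for ids, psk in entries:
        if local_id in ids and remote_id in ids:
            return psk
    return None


secrets = parse_secrets('192.168.1.106 20.219.67.227 : PSK "your_shared_key"')
print(find_psk(secrets, "192.168.1.106", "20.219.67.227"))  # IDs match leftid/rightid
print(find_psk(secrets, "192.168.1.106", "20.219.67.228"))  # wrong remote ID, no secret found
```

<p>If the second lookup is what happens during negotiation, authentication fails even though the PSK itself is correct.</p>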
<h4 id="under-the-hood-the-mechanics-of-tunnel-initiation">Under the Hood: The Mechanics of Tunnel Initiation</h4>
<p><em>The S2S tunnel is initiated from on-premises:</em> <br>
When StrongSwan sends the initial IKE packet outbound (src: <code>192.168.1.106:500</code> → dst: <code>20.219.67.227:500</code>), the Wi-Fi router performs source NAT — replacing <code>192.168.1.106</code> with the public IP <code>61.69.136.49</code> — and records this translation in its conntrack table. When Azure replies, the router matches the inbound packet against that entry and reverses the translation, forwarding the packet to <code>192.168.1.106</code>.</p>
<p><em>The S2S tunnel is initiated from Azure:</em> <br>
In this scenario, the Azure VPN Gateway is the initiator. The tunnel will not even establish unless <strong>port forwarding</strong> is configured on the Wi-Fi router. If you are curious, here are the steps to have Azure initiate the tunnel. <br>
1. <em>Set auto=add</em> in ipsec.conf. This makes StrongSwan a responder only: it won’t initiate the tunnel on startup and won’t re-initiate it if it drops.</p>
<p>2. <em>Set Connection Mode</em> to <code>InitiatorOnly</code> in Azure Local Network Gateway &raquo; Connection &raquo; [Your Connection] &raquo; Configuration blade properties.</p>
<p>3. Enable port forwarding on your Wi-Fi router with these values: <br>
<code>1. InternalIP:192.168.1.106, InternalPort:500, Protocol:UDP, ExternalPort:500</code><br>
<code>2. InternalIP:192.168.1.106, InternalPort:4500, Protocol:UDP, ExternalPort:4500</code><br>
Port 500 is used for the initial IKE handshake. Port 4500 is used for NAT Traversal (NAT-T) — once both sides detect a NAT device on the path, all subsequent IKE and <a href="https://docs.strongswan.org/docs/latest/howtos/ipsecProtocol.html#_encapsulating_security_payload_esp">ESP</a> traffic moves to UDP:4500.</p>
<p>These rules tell the Wi-Fi router to forward any inbound UDP:500 and UDP:4500 traffic arriving on the WAN interface to <code>192.168.1.106</code>, regardless of the source. Without these rules, the router has no NAT mapping for unsolicited inbound IKE packets and drops them.</p>
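<p>The difference between the two initiation scenarios comes down to what the router knows when an inbound packet arrives. Below is a toy Python model of that behaviour. It is a deliberately simplified sketch (the class <code>HomeRouter</code> is hypothetical, and real routers may rewrite source ports), but it shows why a reply to an outbound IKE packet gets through while an unsolicited inbound one needs a port-forwarding rule.</p>

```python
class HomeRouter:
    """Toy model of source NAT with a conntrack table and static port forwards."""

    def __init__(self, public_ip: str):
        self.public_ip = public_ip
        self.conntrack = {}      # (proto, public_port, remote_ip, remote_port) to (lan_ip, lan_port)
        self.port_forwards = {}  # (proto, external_port) to (lan_ip, lan_port)

    def add_port_forward(self, proto, external_port, lan_ip, lan_port):
        self.port_forwards[(proto, external_port)] = (lan_ip, lan_port)

    def outbound(self, proto, lan_ip, lan_port, remote_ip, remote_port):
        """Translate the source address and remember the mapping (port kept as-is for simplicity)."""
        self.conntrack[(proto, lan_port, remote_ip, remote_port)] = (lan_ip, lan_port)
        return (self.public_ip, lan_port)

    def inbound(self, proto, remote_ip, remote_port, public_port):
        """Return the LAN destination for an inbound packet, or None if it is dropped."""
        key = (proto, public_port, remote_ip, remote_port)
        if key in self.conntrack:
            return self.conntrack[key]
        return self.port_forwards.get((proto, public_port))


router = HomeRouter("61.69.136.49")
# Azure initiates and no rule exists: the packet is dropped.
print(router.inbound("udp", "20.219.67.227", 500, 500))
# StrongSwan initiates: the outbound packet creates the mapping, so the reply is delivered.
print(router.outbound("udp", "192.168.1.106", 500, "20.219.67.227", 500))
print(router.inbound("udp", "20.219.67.227", 500, 500))
```

<p>Adding <code>router.add_port_forward("udp", 500, "192.168.1.106", 500)</code> (and the same for 4500) before the first call models the Azure-initiated case working.</p>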
<blockquote>
<p><strong>IMPORTANT — Who initiates the tunnel has no bearing on data packet routing between the sites as long as the tunnel is UP. Once the IPsec SA is established, the tunnel is a symmetric pipe — packets flow freely in both directions regardless of which side initiated IKE. Once the tunnel is up, the success and failure points are identical in both cases.</strong></p>
</blockquote>
<h4 id="bring-the-tunnelup"><strong>Bring the tunnel up</strong></h4>
<p>With both sides configured, follow the steps below on the <em>StrongSwan host</em> to bring the tunnel up.</p>
<p># Check current state<br>
sysctl net.ipv4.ip_forward</p>
<p># Enable if 0<br>
sudo sysctl -w net.ipv4.ip_forward=1</p>
<blockquote>
<p>ip_forward must be enabled for the StrongSwan host to forward packets between interfaces — we will cover exactly why in Part 6.</p>
</blockquote>
<p># Start StrongSwan if it was just installed or is not running;<br>
# otherwise run "sudo ipsec restart" and skip the "Bring connection up" step<br>
sudo ipsec start</p>
<p>#Bring connection up<br>
sudo ipsec up azure-s2s-manual</p>
<p>#Verify<br>
sudo ipsec status</p>
<p>The <code>ipsec status</code> output should show the tunnel as <code>ESTABLISHED</code>:</p>
<p>Security Associations (1 up, 0 connecting):<br>
azure-s2s-manual[1]: ESTABLISHED 13 minutes ago, 192.168.1.106[192.168.1.106]&hellip;20.219.67.227[20.219.67.227]<br>
azure-s2s-manual{1}:  INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: c0a6aaf5_i 90cbbb64_o<br>
azure-s2s-manual{1}:   192.168.1.0/24 192.168.122.0/24 === 10.66.5.0/24</p>
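<p>If you want to check the tunnel state from a script rather than by eye, the status text can be parsed. This is a small Python sketch that assumes the output format shown above; <code>tunnel_is_up</code> is a hypothetical helper, and the format may differ between StrongSwan versions.</p>

```python
import re

SAMPLE_STATUS = """Security Associations (1 up, 0 connecting):
azure-s2s-manual[1]: ESTABLISHED 13 minutes ago, 192.168.1.106[192.168.1.106]...20.219.67.227[20.219.67.227]
azure-s2s-manual{1}:  INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: c0a6aaf5_i 90cbbb64_o
azure-s2s-manual{1}:   192.168.1.0/24 192.168.122.0/24 === 10.66.5.0/24
"""


def tunnel_is_up(status_output: str, conn_name: str) -> bool:
    """True when the IKE SA is ESTABLISHED and a child SA is INSTALLED for conn_name."""
    name = re.escape(conn_name)
    ike_pattern = r"^\s*" + name + r"\[\d+\]:\s+ESTABLISHED\b"
    child_pattern = r"^\s*" + name + r"\{\d+\}:\s+INSTALLED\b"
    ike_up = re.search(ike_pattern, status_output, re.MULTILINE)
    child_up = re.search(child_pattern, status_output, re.MULTILINE)
    return bool(ike_up and child_up)


print(tunnel_is_up(SAMPLE_STATUS, "azure-s2s-manual"))
```

<p>In practice you would feed it the captured output of <code>sudo ipsec status</code> instead of the sample string.</p>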
<h4 id="summary">Summary</h4>
<p>Now that the tunnel is established, it’s worth mapping out the traffic flows it enables — and one it doesn’t, yet. There are 3 scenarios of bi-directional traffic flow in our setup between —</p>
<ol>
<li>Azure Virtual Machines and On-premises Virtual Machines</li>
<li>Azure Virtual Machines and On-premises VPN Gateway</li>
<li>On-premises VPN Gateway and the On-premises Virtual Machines</li>
</ol>
<p>All of the above work without any additional configuration, except packets from an Azure virtual machine destined for the on-premises KVM virtual machines (which sit behind StrongSwan on the libvirt network).</p>
<p>In the next part of the series we will discuss why they work and why one of them doesn’t.</p>
]]></content:encoded></item><item><title>API Specification and Policy Updates in Azure APIM Are Zero Downtime</title><link>https://gurupasupathy.com/post/2026-06-05_apim_policy_update_0downtime/</link><pubDate>Fri, 05 Jun 2026 00:00:00 +0000</pubDate><guid>https://gurupasupathy.com/post/2026-06-05_apim_policy_update_0downtime/</guid><description>&lt;p&gt;Does APIM support zero downtime deployment? — To answer this question, multiple factors need to be ascertained, like, What is the SKU? Have you opted for Availability zones? etc. In fact, the question needs to be qualified further. What do you mean by zero downtime deployment?&lt;/p&gt;
&lt;p&gt;In the case of APIM, there are infrastructure changes and then there are gateway configuration changes like API specifications and policies. So, the answer depends on — SKU, AZ, “what” kind of changes&lt;/p&gt;</description><content:encoded><![CDATA[<p>Does APIM support zero downtime deployment? — To answer this question, multiple factors need to be ascertained, like, What is the SKU? Have you opted for Availability zones? etc. In fact, the question needs to be qualified further. What do you mean by zero downtime deployment?</p>
<p>In the case of APIM, there are infrastructure changes and then there are gateway configuration changes like API specifications and policies. So, the answer depends on — SKU, AZ, “what” kind of changes</p>
<p>From the official documentation:</p>
<p>“When you change availability zone configuration, the changes can take 15 to 45 minutes or more to apply. The API Management gateway can continue to handle API requests during this time.”</p>
<p>Gateway configuration, such as APIs and policy definitions, regularly synchronizes between the availability zones that you select for the instance. Propagation of updates between the availability zones normally takes less than 10 seconds.</p>
<p>Active requests: When an availability zone is unavailable, any requests in progress that are connected to an API Management unit in the faulty availability zone are terminated and need to be retried.</p>
<p>Automatic: You can expect instances that use automatic availability zone support to have no downtime during an availability zone outage. Units in the unaffected zone or zones continue to work.</p>
<p>“You can also expect instances that use automatic availability zone support, but have a single unit, to have no downtime.” In this case, API Management distributes the unit’s underlying compute resources to two zones. The resource in the unaffected zone continues to work.</p>
<p>Zone-redundant: You can expect zone-redundant instances to have no downtime during an availability zone outage.</p>
<p>My personal view based on this: API specification and policy updates won’t cause any non-recoverable failures for consumers, provided a retry strategy is in place.</p>
<p>Is it zero downtime? Zero downtime need not mean every request succeeds on the first attempt. If the system remains available and failures are recoverable, it meets the zero-downtime requirement. So — Yes.</p>
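<p>Since the argument depends on consumers having a retry strategy, here is a minimal sketch of one in Python: a bounded retry with exponential backoff. The names <code>TransientError</code> and <code>call_with_retry</code> are made up for this example, and the delays are arbitrary; a real client would typically retry only on specific responses and add jitter.</p>

```python
import time


class TransientError(Exception):
    """Stands in for a recoverable failure, such as a terminated in-flight request."""


def call_with_retry(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(); on TransientError, wait and retry up to `attempts` times in total."""
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError as err:
            last_error = err
            if attempt + 1 != attempts:
                sleep(base_delay * (2 ** attempt))  # 0.1s, 0.2s, 0.4s, ...
    raise last_error


# Example: an operation that fails twice and then succeeds.
calls = {"count": 0}


def flaky_api_call():
    calls["count"] += 1
    if calls["count"] != 3:
        raise TransientError("request terminated, please retry")
    return "200 OK"


print(call_with_retry(flaky_api_call, sleep=lambda seconds: None))
```

<p>From the consumer&rsquo;s point of view the call eventually succeeds, which is the sense in which the update is recoverable.</p>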
<p>Confirmation from the Microsoft Q&amp;A forum</p>
<p>To validate my understanding, I reached out to the MS Q&amp;A forum and got a response consistent with the above understanding.</p>
<p>Here is the link to the question in the forum that has the official response.</p>
<p>Bottom line — API specification and Policy updates are zero downtime.</p>
]]></content:encoded></item><item><title>Choosing the Right TokenCredential and How AZURE CLIENT ID Influences Identity Selection — A…</title><link>https://gurupasupathy.com/post/2026-05-20_choosing-the-right-tokencredential-and-how-azure-client-id-influences-identity-selection/</link><pubDate>Wed, 20 May 2026 00:00:00 +0000</pubDate><guid>https://gurupasupathy.com/post/2026-05-20_choosing-the-right-tokencredential-and-how-azure-client-id-influences-identity-selection/</guid><description>&lt;p&gt;Photo by Matt Halls on Unsplash&lt;/p&gt;
&lt;p&gt;&lt;img loading="lazy" src="https://cdn-images-1.medium.com/max/800/1*Z-K3Yu80zshHVrnXpLJ0FA.jpeg"&gt;&lt;/p&gt;
&lt;p&gt;Photo by &lt;a href="https://unsplash.com/@matthalls?utm_source=unsplash&amp;amp;utm_medium=referral&amp;amp;utm_content=creditCopyText"&gt;Matt Halls&lt;/a&gt; on &lt;a href="https://unsplash.com/photos/a-very-tall-building-with-lots-of-windows-KeQiUCKNqOc?utm_source=unsplash&amp;amp;utm_medium=referral&amp;amp;utm_content=creditCopyText"&gt;Unsplash&lt;/a&gt;&lt;/p&gt;
&lt;h4 id="introduction"&gt;Introduction&lt;/h4&gt;
&lt;p&gt;I have been using the DefaultAzureCredential class for a long time without understanding how it works. So, I jotted down my notes and learnings in this write-up for future me — and maybe you will find it useful too.&lt;/p&gt;
&lt;h4 id="tokencredential"&gt;TokenCredential&lt;/h4&gt;
&lt;p&gt;&lt;code&gt;TokenCredential&lt;/code&gt; is the abstract base class representing a source of authentication tokens for Azure services. Many classes derive from TokenCredential but the most interesting ones are DefaultAzureCredential and ChainedTokenCredential.&lt;/p&gt;</description><content:encoded><![CDATA[<p>Photo by Matt Halls on Unsplash</p>
<p><img loading="lazy" src="https://cdn-images-1.medium.com/max/800/1*Z-K3Yu80zshHVrnXpLJ0FA.jpeg"></p>
<p>Photo by <a href="https://unsplash.com/@matthalls?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Matt Halls</a> on <a href="https://unsplash.com/photos/a-very-tall-building-with-lots-of-windows-KeQiUCKNqOc?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></p>
<h4 id="introduction">Introduction</h4>
<p>I have been using the DefaultAzureCredential class for a long time without understanding how it works. So, I jotted down my notes and learnings in this write-up for future me — and maybe you will find it useful too.</p>
<h4 id="tokencredential">TokenCredential</h4>
<p><code>TokenCredential</code> is the abstract base class representing a source of authentication tokens for Azure services. Many classes derive from TokenCredential but the most interesting ones are DefaultAzureCredential and ChainedTokenCredential.</p>
<blockquote>
<p>I’m using package version: Azure.Identity v1.20.0</p>
</blockquote>
<h4 id="defaultazurecredential">DefaultAzureCredential</h4>
<p>This class is a pre-built chain covering the most common authentication methods. When using DefaultAzureCredential to acquire a token, the class attempts to acquire a token via each of the below credentials, in the following order, stopping when one provides a token:</p>
<ul>
<li><a href="https://learn.microsoft.com/en-us/dotnet/api/azure.identity.environmentcredential?view=azure-dotnet">EnvironmentCredential</a></li>
<li><a href="https://learn.microsoft.com/en-us/dotnet/api/azure.identity.workloadidentitycredential?view=azure-dotnet">WorkloadIdentityCredential</a></li>
<li><a href="https://learn.microsoft.com/en-us/dotnet/api/azure.identity.managedidentitycredential?view=azure-dotnet">ManagedIdentityCredential</a></li>
<li><a href="https://learn.microsoft.com/en-us/dotnet/api/azure.identity.visualstudiocredential?view=azure-dotnet">VisualStudioCredential</a></li>
<li><a href="https://learn.microsoft.com/en-us/dotnet/api/azure.identity.visualstudiocodecredential?view=azure-dotnet">VisualStudioCodeCredential</a> (enabled by default for SSO with VS Code on supported platforms when Azure.Identity.Broker is installed)</li>
<li><a href="https://learn.microsoft.com/en-us/dotnet/api/azure.identity.azureclicredential?view=azure-dotnet">AzureCliCredential</a></li>
<li><a href="https://learn.microsoft.com/en-us/dotnet/api/azure.identity.azurepowershellcredential?view=azure-dotnet">AzurePowerShellCredential</a></li>
<li><a href="https://learn.microsoft.com/en-us/dotnet/api/azure.identity.azuredeveloperclicredential?view=azure-dotnet">AzureDeveloperCliCredential</a></li>
<li><a href="https://learn.microsoft.com/en-us/dotnet/api/azure.identity.interactivebrowsercredential?view=azure-dotnet">InteractiveBrowserCredential</a> (not included by default; can use brokered authentication if Azure.Identity.Broker is installed)</li>
</ul>
<blockquote>
<p>source: <a href="https://learn.microsoft.com/en-us/dotnet/api/azure.identity.defaultazurecredential?view=azure-dotnet">DefaultAzureCredential Class (Azure.Identity) — Azure for .NET Developers | Microsoft Learn</a></p>
</blockquote>
<h4 id="defaultazurecredential-with-exclusion">DefaultAzureCredential with Exclusion</h4>
<p>DefaultAzureCredential also supports options that allow you to exclude credentials from evaluation. This is useful if you don’t want to use certain credentials, for example, when running my function locally, I don’t want to use the VisualStudio or VisualStudioCode credential as I prefer AzureCliCredential.</p>
<p>new DefaultAzureCredential(<br>
new DefaultAzureCredentialOptions<br>
{<br>
ExcludeVisualStudioCodeCredential = true,<br>
ExcludeVisualStudioCredential = true<br>
});</p>
<h4 id="chainedtokencredential">ChainedTokenCredential</h4>
<p>In some cases, you will know exactly which credentials you want to use. ChainedTokenCredential is very useful in such cases. It evaluates only the credentials you explicitly specify, in the order provided. I use this locally. For example, below I choose to use only CLI and VS credentials when my function is running locally.</p>
<p>new ChainedTokenCredential(<br>
new AzureCliCredential(),<br>
new VisualStudioCredential()<br>
);</p>
<h4 id="why-managed-identity-fails-locally-but-works-inazure">Why Managed Identity fails locally but works in Azure</h4>
<p><code>DefaultAzureCredential</code> attempts <code>ManagedIdentityCredential</code>, which is unavailable locally (the IMDS call times out), and falls through to the developer credentials such as Visual Studio and Azure CLI (see the list above). The first credential in the chain that successfully acquires a token is used.</p>
<blockquote>
<p>Note: DefaultAzureCredential evaluates<br>
EnvironmentCredential, then WorkloadIdentityCredential,<br>
followed by ManagedIdentityCredential.</p>
</blockquote>
<p>There is no native way to emulate or impersonate a managed identity locally. IMDS (<code>169.254.169.254</code>) is a hypervisor-level endpoint that only exists on Azure compute. It is physically not present on your laptop. The common alternative would be to use a service principal with similar privileges as the UAMI to test your function.</p>
<p><strong>In Azure:</strong> The chain gets to <code>ManagedIdentityCredential</code>, IMDS responds, token acquired. Everything below it never runs.</p>
<p><strong>Locally:</strong> IMDS doesn’t exist, so <code>ManagedIdentityCredential</code> times out and falls through.</p>
<p>When DefaultAzureCredential is used, the evaluation looks like this (assuming none of the credentials can provide a token):</p>
<p>//When running locally there is no IMDS to supply managed identity token<br>
//Assuming VS and other credentials don&rsquo;t have access to the resource.<br>
//This is how DefaultAzureCredential evaluates the chain<br>
EnvironmentCredential → skipped (env vars not set)<br>
WorkloadIdentityCredential → skipped (not configured)<br>
ManagedIdentityCredential → unavailable/failure (no IMDS endpoint locally)<br>
VisualStudioCredential → failed <br>
VisualStudioCodeCredential → failed<br>
AzureCliCredential → failed<br>
AzurePowerShellCredential → failed<br>
AzureDeveloperCliCredential → failed<br>
InteractiveBrowserCredential → Not included by default (must be explicitly enabled)</p>
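<p>The "try each credential in order, stop at the first token" behaviour can be modelled in a few lines. The Python sketch below is not the Azure SDK; it is a simplified stand-in where every failure is treated as "credential unavailable", and the names <code>CredentialUnavailable</code>, <code>unavailable</code> and <code>get_token_from_chain</code> are invented for illustration.</p>

```python
class CredentialUnavailable(Exception):
    """Raised by a credential that cannot provide a token in this environment."""


def unavailable(reason):
    """Build a fake credential that always reports itself as unavailable."""
    def acquire():
        raise CredentialUnavailable(reason)
    return acquire


def get_token_from_chain(chain):
    """Try each (name, acquire) pair in order and return (name, token) for the first success."""
    errors = []
    for name, acquire in chain:
        try:
            return name, acquire()
        except CredentialUnavailable as err:
            errors.append(f"{name}: {err}")
    raise CredentialUnavailable("; ".join(errors))


# Local machine where only the Azure CLI login works.
local_chain = [
    ("EnvironmentCredential", unavailable("env vars not set")),
    ("WorkloadIdentityCredential", unavailable("not configured")),
    ("ManagedIdentityCredential", unavailable("no IMDS endpoint locally")),
    ("VisualStudioCredential", unavailable("not signed in")),
    ("AzureCliCredential", lambda: "token-from-cli"),
    ("AzurePowerShellCredential", lambda: "token-from-powershell"),
]

print(get_token_from_chain(local_chain))
```

<p>Note that <code>AzurePowerShellCredential</code> is never consulted here, because the chain stops as soon as the CLI credential returns a token.</p>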
<h4 id="how-to-configure-for-local-debugging">How to configure for local debugging</h4>
<p>One approach is to use a factory that returns different credential implementations depending on the execution environment.</p>
<p>For example, when the environment is local, a factory can return a DefaultAzureCredential where you can exclude Visual Studio and Visual Studio Code credentials if you favour AzureCliCredential. Or, better still, if you want to use only Azure CLI or VS credentials, it can return a ChainedTokenCredential with just those two credentials, as shown below</p>
<p>new ChainedTokenCredential(<br>
new AzureCliCredential(),<br>
new VisualStudioCredential()<br>
)</p>
<p>When the environment is Azure, it can just return a DefaultAzureCredential instance or a ChainedTokenCredential as discussed earlier if you are sure about the credential you want to use. If you want to use a specific credential, it can be used directly without DefaultAzureCredential or ChainedTokenCredential. For example, here I’m using a specific credential class —</p>
<p>new ManagedIdentityCredential(<br>
ManagedIdentityId.FromUserAssignedClientId("&lt;your-uami-ClientId&gt;"))</p>
<p>A sample flow will look as below when you use a ChainedTokenCredential as shown previously —</p>
<p>AzureCliCredential → acquire token - SUCCESS<br>
VisualStudioCredential → skipped</p>
<p>Notice that only the two credentials mentioned in the <code>ChainedTokenCredential</code> chain are evaluated.</p>
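<p>The factory idea and the "first credential that succeeds wins" behaviour can be modelled in a few lines. This is a minimal sketch in plain Python, not the Azure SDK: <code>first_token</code>, <code>build_chain</code> and the token strings are hypothetical stand-ins used only to illustrate the control flow.</p>

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical stand-in for a credential: a name plus a function that
# returns a token string, or None when the credential is unavailable.
Credential = Tuple[str, Callable[[], Optional[str]]]


def first_token(chain: Sequence[Credential]) -> Tuple[str, str]:
    """Try each credential in order and stop at the first one that yields a token."""
    tried = []
    for name, get_token in chain:
        token = get_token()
        if token is not None:
            return name, token  # later credentials are never evaluated
        tried.append(name)
    raise RuntimeError("no credential produced a token; tried: " + ", ".join(tried))


def build_chain(environment: str) -> List[Credential]:
    """Factory: choose the chain from an environment name ('local' or anything else)."""
    if environment == "local":
        return [
            ("AzureCliCredential", lambda: "cli-token"),
            ("VisualStudioCredential", lambda: "vs-token"),
        ]
    return [("ManagedIdentityCredential", lambda: "mi-token")]


winner, _ = first_token(build_chain("local"))
print(winner)  # AzureCliCredential
```

<p>Keeping the environment check inside one factory means the rest of the function code never needs to know which credential it received.</p>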
<h4 id="azure_client_id-influence-in-identity-selection">AZURE_CLIENT_ID influence in identity selection</h4>
<p>Many Azure resources can have both a System Assigned Managed Identity (SAMI) and a User Assigned Managed Identity (UAMI). It is crucial to understand how the AZURE_CLIENT_ID environment variable influences how Azure SDK authentication selects a managed identity. This is not always obvious, and I could not find it clearly documented anywhere.</p>
<p>AZURE_CLIENT_ID set?<br>
│<br>
├── YES<br>
│   │<br>
│   └── Which credential in code?<br>
│       │<br>
│       ├── DefaultAzureCredential()<br>
│       │   └── ✅ UAMI  (via AZURE_CLIENT_ID)<br>
│       │<br>
│       ├── ManagedIdentityCredential(id)<br>
│       │   └── ✅ UAMI  (via explicit id, ignores AZURE_CLIENT_ID)<br>
│       │<br>
│       └── ManagedIdentityCredential()<br>
│           └── ⚠️  SAMI  (ignores AZURE_CLIENT_ID)<br>
│<br>
└── NO<br>
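&nbsp;&nbsp;&nbsp;&nbsp;│<br>
&nbsp;&nbsp;&nbsp;&nbsp;└── Which credential in code?<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;│<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;├── DefaultAzureCredential()<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;│&nbsp;&nbsp;&nbsp;└── ⚠️  SAMI<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;│<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;├── ManagedIdentityCredential(id)<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;│&nbsp;&nbsp;&nbsp;└── ✅ UAMI  (via explicit id)<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;│<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;└── ManagedIdentityCredential()<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;└── ⚠️  SAMI  (no id provided)</p>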
<p>Key rule: ManagedIdentityCredential() does not use<br>
AZURE_CLIENT_ID to select a user-assigned managed identity.<br>
Only DefaultAzureCredential does.</p>
<p>Note: the flowchart assumes a System Assigned Managed Identity is present. In cases where SAMI is absent and no UAMI is explicitly provided, token acquisition will fail.</p>
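<p>The flowchart and its note can be written down as a small lookup function. This is only a sketch of the rules as described in this post, not SDK code: the function name <code>selected_identity</code> and the string labels are assumptions made for illustration.</p>

```python
def selected_identity(azure_client_id_set: bool, credential: str, sami_present: bool = True) -> str:
    """Return 'UAMI', 'SAMI' or 'FAIL' following the flowchart in this post.

    credential is one of the three labels used in the flowchart:
    'DefaultAzureCredential()', 'ManagedIdentityCredential(id)',
    'ManagedIdentityCredential()'.
    """
    if credential == "ManagedIdentityCredential(id)":
        return "UAMI"  # an explicit id always wins
    if credential == "DefaultAzureCredential()" and azure_client_id_set:
        return "UAMI"  # picked up through AZURE_CLIENT_ID
    if credential in ("DefaultAzureCredential()", "ManagedIdentityCredential()"):
        # No usable UAMI hint: fall back to SAMI, or fail when none exists.
        return "SAMI" if sami_present else "FAIL"
    raise ValueError("unknown credential label: " + credential)


print(selected_identity(True, "ManagedIdentityCredential()"))  # SAMI
```

<p>The surprising row is the one printed above: with AZURE_CLIENT_ID set, a parameterless ManagedIdentityCredential still resolves to SAMI.</p>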
<blockquote>
<p>Authentication between a Function App and its AzureWebJobsStorage is an independent flow not covered by the flowchart above. See <a href="https://pasupathy-guru.medium.com/using-managed-identity-for-function-app-authentication-with-its-storage-account-ad352a609abe?source=friends_link&amp;sk=c97ecbf3bbdb14ff8dc7318a70355f3a">Using Managed Identity for Function App Authentication with its Storage Account</a> for a detailed walkthrough.</p>
</blockquote>
<h4 id="summary">Summary</h4>
<p><code>DefaultAzureCredential</code> is environment-aware by design — the same code uses managed identity in Azure and falls through to developer credentials locally. This means local failures don&rsquo;t always predict Azure failures, and the identity that succeeds locally may be in a different tenant than your Azure resources. For local testing against tenant-specific resources, ensure <code>az login --tenant &lt;tenant-id&gt;</code> is used explicitly, not just any <code>az login</code>. To test managed identity behaviour you must deploy, or substitute a service principal with matching roles via <code>EnvironmentCredential</code>.</p>
<p>Reference — <a href="https://learn.microsoft.com/en-us/dotnet/azure/sdk/authentication/best-practices?tabs=aspdotnet">Authentication best practices with the Azure Identity library for .NET — .NET | Microsoft Learn</a></p>
]]></content:encoded></item><item><title>Using Managed Identity for Function App Authentication with its Storage account</title><link>https://gurupasupathy.com/post/2026-05-19_using-mi-for-function-app-authentication-with-its-storage-account/</link><pubDate>Tue, 19 May 2026 00:00:00 +0000</pubDate><guid>https://gurupasupathy.com/post/2026-05-19_using-mi-for-function-app-authentication-with-its-storage-account/</guid><description>&lt;p&gt;Recently, while setting up a Function App to use User Assigned Managed Identity (UAMI) to authenticate to its &lt;strong&gt;AzureWebJobsStorage&lt;/strong&gt; I encountered &lt;code&gt;SyncTrigger&lt;/code&gt;failure.&lt;/p&gt;
&lt;p&gt;I checked whether the UAMI had necessary RBAC roles to work on &lt;strong&gt;AzureWebJobsStorage&lt;/strong&gt; — it had. So, I wasn’t sure what the issue was.&lt;/p&gt;
&lt;p&gt;Analyzing further, I realized I had skipped a few mandatory variable settings to enable UAMI based authentication to &lt;strong&gt;AzureWebJobsStorage&lt;/strong&gt; (setting the environment variable &lt;code&gt;AzureWebJobsStorage__accountName&lt;/code&gt; alone does not suffice)&lt;/p&gt;</description><content:encoded><![CDATA[<p>Recently, while setting up a Function App to use User Assigned Managed Identity (UAMI) to authenticate to its <strong>AzureWebJobsStorage</strong>, I encountered a <code>SyncTrigger</code> failure.</p>
<p>I checked whether the UAMI had necessary RBAC roles to work on <strong>AzureWebJobsStorage</strong> — it had. So, I wasn’t sure what the issue was.</p>
<p>Analyzing further, I realized I had skipped a few mandatory variable settings to enable UAMI-based authentication to <strong>AzureWebJobsStorage</strong> (setting the environment variable <code>AzureWebJobsStorage__accountName</code> alone does not suffice).</p>
<h4 id="steps-to-enable-uami-access-to-azurewebjobsstorage">Steps to enable UAMI access to <strong>AzureWebJobsStorage</strong></h4>
<p>Enabling UAMI access to AzureWebJobsStorage involves changes in Terraform (when the Function App is created), the app settings (environment variables) and finally the role-based access.</p>
<p><strong>Terraform</strong></p>
<p>If for some reason you want to use UAMI to authenticate with <strong>AzureWebJobsStorage</strong>, then the <strong>Terraform</strong> block <code>functionAppConfig.deployment.storage.authentication</code> should look like the one below:</p>
<blockquote>
<p>Note: I am using Flex Consumption tier</p>
</blockquote>
<p>authentication = {<br>
type                           = "userassignedidentity"<br>
userAssignedIdentityResourceId = "&lt;full ARM resource ID of UAMI&gt;"<br>
}</p>
<p>This tells the platform to use UAMI for the deployment package blob container — the part that isn’t controlled by app settings.</p>
<p><strong>App settings</strong></p>
<p>Once the Function App is deployed with <code>userassignedidentity</code> as the authentication type (Terraform), ensure the variables below are set in the Function App’s environment variables:</p>
<p>AzureWebJobsStorage__accountName  = &lt;storage account name&gt;<br>
AzureWebJobsStorage__credential   = managedidentity<br>
AzureWebJobsStorage__clientId     = &lt;UAMI client ID GUID&gt;</p>
<p>All three settings are mandatory.</p>
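<p>Since a single missing setting is enough to break the runtime, a quick pre-deployment check can help. The sketch below is plain Python over a dictionary of app settings; the helper name <code>missing_uami_settings</code> and the sample values are hypothetical, and how you obtain the settings (CLI export, Terraform output, portal) is left open.</p>

```python
from typing import Dict, List

# The three settings this post lists as mandatory for UAMI-based access.
REQUIRED_UAMI_SETTINGS = (
    "AzureWebJobsStorage__accountName",
    "AzureWebJobsStorage__credential",
    "AzureWebJobsStorage__clientId",
)


def missing_uami_settings(app_settings: Dict[str, str]) -> List[str]:
    """Return the required setting names that are absent or empty."""
    return [name for name in REQUIRED_UAMI_SETTINGS if not app_settings.get(name)]


# Sample values are placeholders, not real resources.
settings = {
    "AzureWebJobsStorage__accountName": "examplestorageaccount",
    "AzureWebJobsStorage__credential": "managedidentity",
}
print(missing_uami_settings(settings))  # ['AzureWebJobsStorage__clientId']
```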
<p><strong>RBAC</strong></p>
<p>This is the final bit. The Function App is deployed and the environment variables are set; next, the UAMI needs permission to access the storage account.</p>
<p>Assign the <code>Storage Blob Data Owner</code> role to the UAMI on the storage account.</p>
<p>With these three changes, your Function App will authenticate with its <strong>AzureWebJobsStorage</strong> using UAMI.</p>
<blockquote>
<p><strong><em>Caveat</em></strong>: Although this works, the issue with this approach is all services that are assigned this UAMI will gain access to the function’s storage account. This is not ideal if many services share the same UAMI. The better option will be to use System Assigned Managed Identity (SAMI) for authentication between Function App and its storage account. For the rest of the outbound calls that the functions might make, use UAMI.</p>
</blockquote>
<h4 id="using-system-assigned-managedidentity">Using System Assigned Managed Identity</h4>
<p>To use SAMI, just set <code>AzureWebJobsStorage__accountName</code>. SAMI is the default, so no additional settings are needed. Next, give the SAMI <code>Storage Blob Data Owner</code> on the storage account. If you are using Terraform to deploy, the authentication block of the Function App will look like this:</p>
<p>authentication = {<br>
type = "systemassignedidentity"<br>
}</p>
<p>SAMI is my preferred method for authentication with the <strong>AzureWebJobsStorage</strong> for the reasons already discussed in the caveat section.</p>
<h4 id="summary">Summary</h4>
<p>Configuring a Function App to authenticate with its AzureWebJobsStorage using managed identity requires changes at three levels — Terraform, app settings, and RBAC — and all three must be consistent with each other. For UAMI, all three <code>AzureWebJobsStorage__*</code> settings are mandatory; omitting any one of them will cause the runtime to fail. However, personally I feel UAMI for AzureWebJobsStorage is rarely the right choice — since UAMI is a shared identity, every service assigned to it inherits access to the storage account. SAMI, which requires only <code>AzureWebJobsStorage__accountName</code> and a single role assignment, is the simpler and safer default for this use case.</p>
<blockquote>
<p>Reference — <a href="https://techcommunity.microsoft.com/blog/appsonazureblog/use-user-managed-identity-to-replace-connection-string-inazurewebjobsstorage-for/3891026">Use User managed identity to replace connection string in”AzureWebJobsStorage” for function apps | Microsoft Community Hub</a></p>
</blockquote>
]]></content:encoded></item><item><title>HandsOn — Building Hybrid Cloud Environment — Part 4— Identity — Domain-Joining a Linux VM and…</title><link>https://gurupasupathy.com/post/2026-05-02_building-hce-part-4--identity-domain-joining-a-linux-vm/</link><pubDate>Sat, 02 May 2026 00:00:00 +0000</pubDate><guid>https://gurupasupathy.com/post/2026-05-02_building-hce-part-4--identity-domain-joining-a-linux-vm/</guid><description>&lt;p&gt;In the &lt;a href="https://pasupathy-guru.medium.com/handson-building-hybrid-cloud-environment-part-3-identity-second-dc-and-dc-replication-3f9ae9e5c651?source=friends_link&amp;amp;sk=cde84a39160f76f4d64ef3e842b38e8b"&gt;previous&lt;/a&gt; parts, we created a primary and secondary domain controller and tested the domain join from Windows client VM. In this part, we will domain-join a Linux VM to the domain controllers we created. The main purpose is to introduce a non-Windows system into the domain to test Kerberos authentication against Active Directory. We will —&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Provision a new Linux VM&lt;/li&gt;
&lt;li&gt;Assign the DC IP&lt;/li&gt;
&lt;li&gt;Install Linux Kerberos client tool&lt;/li&gt;
&lt;li&gt;Join the domain&lt;/li&gt;
&lt;li&gt;Validation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img loading="lazy" src="https://cdn-images-1.medium.com/max/1200/1*0xW3AuDQsHnXJR5dfYIdXw.png"&gt;&lt;/p&gt;</description><content:encoded><![CDATA[<p>In the <a href="https://pasupathy-guru.medium.com/handson-building-hybrid-cloud-environment-part-3-identity-second-dc-and-dc-replication-3f9ae9e5c651?source=friends_link&amp;sk=cde84a39160f76f4d64ef3e842b38e8b">previous</a> parts, we created a primary and secondary domain controller and tested the domain join from Windows client VM. In this part, we will domain-join a Linux VM to the domain controllers we created. The main purpose is to introduce a non-Windows system into the domain to test Kerberos authentication against Active Directory. We will —</p>
<ol>
<li>Provision a new Linux VM</li>
<li>Assign the DC IP</li>
<li>Install Linux Kerberos client tool</li>
<li>Join the domain</li>
<li>Validation</li>
</ol>
<p><img loading="lazy" src="https://cdn-images-1.medium.com/max/1200/1*0xW3AuDQsHnXJR5dfYIdXw.png"></p>
<p>The fundamental domain join mechanics are the same for Windows and Linux. The underlying authentication protocol (Kerberos) is identical for both — the difference is integration depth. Windows has native AD support built in, whereas Linux requires explicit configuration via tools like realmd, SSSD, and the Kerberos client utilities. So, let’s get started with a new VM.</p>
<h3 id="provision-a-new-ubuntuvm">Provision a new Ubuntu VM</h3>
<p>We will create a VM based on Ubuntu 22.04 LTS for our virtual network. You need to download the Ubuntu iso and create a VM using virt-manager following the regular VM creation process. Once the VM is up and running, let’s start with some connectivity checks.</p>
<p><strong>Connectivity Checks:</strong></p>
<p><strong>Ping DCs:</strong></p>
<p>ping 192.168.122.10<br>
ping 192.168.122.11</p>
<p>This will work because the Linux VM is created in the same network range as the Windows client and the DCs. If this is not the case, map the VM to the relevant network using virt-manager (this can happen if you have multiple virtual networks running on your system).</p>
<h3 id="add-the-dc-ip-to-the-resolvconf">Add the DC IP to resolved.conf</h3>
<p>This step mirrors what we did for the Windows client, only the mechanism is different. The Linux VM should use the Domain Controller’s IP as its DNS server. A fresh Linux VM has its DNS pointing to itself at <code>127.0.0.53</code>.</p>
<p>If you remember, we used the GUI to change the preferred DNS for the VM in Windows. On Ubuntu, the DNS server details reside in <code>/etc/systemd/resolved.conf</code>.</p>
<blockquote>
<p>On modern Ubuntu systems, the file <code>/etc/resolv.conf</code> is <strong>not meant to be edited directly</strong> because it is automatically generated and managed by the <code>systemd-resolved</code> service. Any manual changes you make will be overwritten. Instead, you should configure DNS settings in the <strong>source of truth</strong>, typically <code>/etc/systemd/resolved.conf</code> (or via Netplan/NetworkManager depending on your setup), and then restart the service.</p>
</blockquote>
<blockquote>
<p>Note — Even if <code>/etc/resolv.conf</code> appears stable after manual edits, it is still managed by the system in most modern Ubuntu setups and may be overwritten on reboot or network changes. Always configure DNS through systemd-resolved or Netplan for reliability</p>
</blockquote>
<p>Edit the <code>resolved.conf</code> to set the DC’s IP as the DNS server for the Linux VM</p>
<p>sudo nano /etc/systemd/resolved.conf</p>
<p>[Resolve]<br>
# Some examples of DNS servers which may be used for DNS= and FallbackDNS=:<br>
# Cloudflare: 1.1.1.1#cloudflare-dns.com 1.0.0.1#cloudflare-dns.com 2606:4700:4&gt;<br>
# Google:     8.8.8.8#dns.google 8.8.4.4#dns.google 2001:4860:4860::8888#dns.go&gt;<br>
# Quad9:      9.9.9.9#dns.quad9.net 149.112.112.112#dns.quad9.net 2620:fe::fe#d&gt;<br>
DNS=192.168.122.10 192.168.122.11<br>
#FallbackDNS=<br>
Domains=hybrid.local<br>
#DNSSEC=no<br>
#DNSOverTLS=no<br>
#MulticastDNS=no<br>
#LLMNR=no<br>
#Cache=no-negative<br>
#CacheFromLocalhost=no<br>
#DNSStubListener=yes<br>
#DNSStubListenerExtra=<br>
#ReadEtcHosts=yes<br>
#ResolveUnicastSingleLabel=no<br>
#StaleRetentionSec=0</p>
<p>Restart the service and check that the value has persisted.</p>
<p>sudo systemctl restart systemd-resolved</p>
<p><strong>Verify DNS resolution:</strong></p>
<p>Now that the DNS has been updated, try <code>nslookup hybrid.local</code>. The expected output is:</p>
<p>Server:         127.0.0.53<br>
Address:        127.0.0.53#53</p>
<p>Name:    hybrid.local<br>
Address: 192.168.122.10<br>
Name:    hybrid.local<br>
Address: 192.168.122.11</p>
<p>The server shown as <code>127.0.0.53</code> is the systemd-resolved stub. This is expected: systemd-resolved receives all DNS queries locally and forwards them to the configured upstream DNS servers (your DCs). The returned addresses confirm that the DCs’ DNS is now answering for <code>hybrid.local</code>.</p>
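<p>If you want to confirm from a script which upstream servers are configured, the <code>[Resolve]</code> section can be read with Python’s standard <code>configparser</code>. This is a minimal sketch that assumes the <code>DNS=</code> line holds plain IP addresses separated by spaces (no <code>#server-name</code> or port suffixes); <code>SAMPLE</code> is a trimmed copy of the file shown earlier and <code>dns_servers</code> is a hypothetical helper.</p>

```python
import configparser
import ipaddress
from typing import List

# Trimmed copy of the resolved.conf shown above.
SAMPLE = """\
[Resolve]
# Some examples of DNS servers which may be used for DNS= and FallbackDNS=:
DNS=192.168.122.10 192.168.122.11
#FallbackDNS=
Domains=hybrid.local
"""


def dns_servers(text: str) -> List[str]:
    """Return the addresses on the DNS= line, validating each as an IP address."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    raw = parser.get("Resolve", "DNS", fallback="")
    # ip_address raises ValueError on anything that is not a plain IP.
    return [str(ipaddress.ip_address(item)) for item in raw.split()]


print(dns_servers(SAMPLE))  # ['192.168.122.10', '192.168.122.11']
```

<p>To run it against the real file, pass the contents of <code>/etc/systemd/resolved.conf</code> instead of <code>SAMPLE</code>.</p>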
<h3 id="install-kerberos-clienttools">Install Kerberos Client Tools</h3>
<p>From this step onwards, the process of domain join is different from Windows. While the Windows client has the necessary Kerberos modules built in, the Linux client must be enabled to communicate with Active Directory using Kerberos. To do that, install the Kerberos client tool, <a href="https://web.mit.edu/kerberos/krb5-1.4/krb5-1.4/doc/krb5-user.html"><code>krb5-user</code></a>.</p>
<p>What does krb5-user really do? It installs client end tools that enable communication with a server using Kerberos. In Kerberos, the password is never sent over the wire. Instead, it is converted into a cryptographic key, which is then used to encrypt a timestamp during pre-authentication, which will subsequently be decrypted by the server. The <em>validation</em> is the ability of the server to “decrypt” the client request using its version of the stored key — and this is precisely why clock skew breaks Kerberos. If the timestamp is outside the allowed window, the DC rejects it regardless of whether decryption succeeded.</p>
<blockquote>
<p>Notes:</p>
</blockquote>
<blockquote>
<p>When the DC was promoted, the admin password you provided was hashed using DC’s preferred ‘method(s)’ (etype) and stored in ntds.dit (e.g. AES256 key stored against account)</p>
</blockquote>
<blockquote>
<p>When a Linux VM is created the krb5.conf file defines supported etypes</p>
</blockquote>
<blockquote>
<p>When the Linux VM wants to authenticate with the Windows AD, you initiate the Kerberos flow by running <code>kinit</code></p>
</blockquote>
<blockquote>
<p>Client VM’s <code>kinit</code> sends → AS-REQ to DC saying, “I am administrator, The Client VM supports AES256, AES128, RC4”</p>
</blockquote>
<blockquote>
<p>DC responds saying “I need pre-auth, and for this account I use AES256”</p>
</blockquote>
<blockquote>
<p>Client then derives a key from the password (using the required encryption type, e.g., AES256) → uses this key to encrypt timestamp in Client VM. This is sent as AS-REQ (with pre-auth): username + AES256-encrypted timestamp</p>
</blockquote>
<blockquote>
<p>The DC decrypts this AS-REQ with the stored AES256 key against this user, validates the timestamp is within the allowed clock skew window (default 5 minutes) → issues TGT</p>
</blockquote>
<blockquote>
<p>The default TTL of a Kerberos TGT is 10 hours</p>
</blockquote>
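<p>The clock skew rule in the notes above is simple enough to express directly. The sketch below is an illustration only, not what the KDC actually runs: <code>within_skew</code> is a hypothetical helper, and the 5-minute window is the default mentioned above.</p>

```python
from datetime import datetime, timedelta, timezone

MAX_SKEW = timedelta(minutes=5)  # default window mentioned above


def within_skew(client_time: datetime, kdc_time: datetime, max_skew: timedelta = MAX_SKEW) -> bool:
    """True when the client timestamp is inside the allowed window around the KDC clock."""
    return max_skew >= abs(kdc_time - client_time)


kdc_now = datetime(2026, 3, 6, 19, 11, 11, tzinfo=timezone.utc)
print(within_skew(kdc_now - timedelta(minutes=3), kdc_now))  # True
print(within_skew(kdc_now + timedelta(minutes=7), kdc_now))  # False
```

<p>The check is symmetric, so a client clock that runs ahead is rejected just like one that runs behind. This is why keeping the Linux VM’s time in sync with the DCs matters before running <code>kinit</code>.</p>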
<p>Let’s proceed with the setup.</p>
<p>sudo apt update<br>
sudo apt install krb5-user -y</p>
<p><strong>Prompt:</strong> Enter default realm → <code>HYBRID.LOCAL</code> (uppercase).</p>
<p><img loading="lazy" src="https://cdn-images-1.medium.com/max/800/1*lfDCH32lS5AHDvt2susNsA.png"></p>
<p>In the next prompt, provide the FQDNs of your DCs separated by spaces.</p>
<p><img loading="lazy" src="https://cdn-images-1.medium.com/max/800/1*3OuE_3pTvlKVEiz_5c_Vuw.png"></p>
<p>And finally when asked for primary DC, provide your primary DC’s FQDN</p>
<p><img loading="lazy" src="https://cdn-images-1.medium.com/max/800/1*ul0qRHCDE7UlM2691DcRrg.png"></p>
<p>Run <code>dpkg -l | grep krb5-user</code>. It should list <code>krb5-user</code> as installed.</p>
<h4 id="configure-etckrb5conf">Configure /etc/krb5.conf</h4>
<p>We have installed the necessary client tools to enable Kerberos-based exchange from the Linux VM. Next, the Linux VM’s Kerberos client must point to the Key Distribution Centers in Active Directory. Open the file with <code>sudo nano /etc/krb5.conf</code> and update it as shown below:</p>
<p>[libdefaults]<br>
default_realm = HYBRID.LOCAL<br>
dns_lookup_realm = false<br>
dns_lookup_kdc = true<br>
[realms]<br>
HYBRID.LOCAL = {<br>
kdc = 192.168.122.10<br>
kdc = 192.168.122.11<br>
admin_server = 192.168.122.10<br>
}<br>
[domain_realm]<br>
.hybrid.local = HYBRID.LOCAL<br>
hybrid.local = HYBRID.LOCAL</p>
<p>Confirm your changes have persisted — <code>cat /etc/krb5.conf</code></p>
<p>Next, we will validate that AD is issuing Kerberos tickets to us by requesting a Ticket Granting Ticket from the AD KDC. Run <code>kinit Administrator@HYBRID.LOCAL</code> and enter the <code>Administrator</code> password.</p>
<p>Next run <code>klist</code></p>
<p>You should be seeing a ticket as below:</p>
<p>Default principal: Administrator@HYBRID.LOCAL<br>
Valid starting       Expires              Service principal<br>
03/06/26 19:11:11   03/07/26 05:11:11   krbtgt/HYBRID.LOCAL@HYBRID.LOCAL</p>
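<p>The two timestamps in the ticket line should be exactly the TGT lifetime apart. A small sketch to verify that from the <code>klist</code> output, assuming the dates are printed as MM/DD/YY (the format depends on locale); <code>ticket_lifetime</code> is a hypothetical helper.</p>

```python
from datetime import datetime, timedelta

KLIST_FORMAT = "%m/%d/%y %H:%M:%S"  # assumption: MM/DD/YY as in the output above


def ticket_lifetime(valid_starting: str, expires: str) -> timedelta:
    """Difference between the 'Expires' and 'Valid starting' columns of klist."""
    start = datetime.strptime(valid_starting, KLIST_FORMAT)
    end = datetime.strptime(expires, KLIST_FORMAT)
    return end - start


lifetime = ticket_lifetime("03/06/26 19:11:11", "03/07/26 05:11:11")
print(lifetime)  # 10:00:00
```

<p>The result matches the 10-hour default TGT lifetime mentioned in the notes earlier.</p>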
<p>If you are curious about the attributes of the ticket, run <code>klist -f</code> (shows flags such as forwardable and renewable) or <code>klist -e</code> (shows encryption types).</p>
<h3 id="join-linux-todomain">Join Linux to Domain</h3>
<p>We have the foundation required for a domain join. Now, we will install the required packages for the domain join</p>
<p>sudo apt install realmd sssd sssd-tools adcli samba-common-bin oddjob oddjob-mkhomedir -y</p>
<blockquote>
<p><strong>realmd</strong> — discovers which domains or realms it can use or configure. It can discover and identify Active Directory domains by looking up the appropriate DNS SRV records.</p>
</blockquote>
<blockquote>
<p><strong>sssd</strong> — System Security Services Daemon. After the join, this is what runs continuously to handle authentication requests — it talks to the DC for login, group membership, sudo rules etc. The long-running engine.</p>
</blockquote>
<blockquote>
<p><strong>sssd-tools</strong> — CLI utilities for sssd (<code>sssctl</code>, <code>sss_override</code> etc.) — useful for cache flushing and diagnostics.</p>
</blockquote>
<blockquote>
<p><strong>adcli</strong> — Active Directory CLI. <strong>realmd</strong> uses this under the hood to perform the low-level AD join operations (creating the computer object in AD, setting up the machine account).</p>
</blockquote>
<blockquote>
<p><strong>samba-common-bin</strong> — provides tools like <code>net</code> that realmd/sssd lean on for certain AD operations.</p>
</blockquote>
<blockquote>
<p><strong>oddjob</strong> — a D-Bus service that runs privileged helper tasks on behalf of other services. sssd uses it to do things it can’t do as its own user.</p>
</blockquote>
<blockquote>
<p><strong>oddjob-mkhomedir</strong> — the specific oddjob helper that <strong>automatically creates a home directory</strong> the first time a domain user logs into the Linux machine. Without this, a domain user authenticates successfully but lands with no home directory.</p>
</blockquote>
<p>realmd + adcli        → join-time (one-off operation)<br>
sssd + sssd-tools     → runtime (ongoing authentication)<br>
oddjob + mkhomedir    → login-time helper (home dir creation)<br>
samba-common-bin      → shared plumbing both layers use</p>
<p>Verify the configuration and connectivity to the domain controller</p>
<p>sudo realm discover hybrid.local</p>
<p>hybrid.local<br>
type: kerberos<br>
realm-name: HYBRID.LOCAL<br>
domain-name: hybrid.local<br>
configured: no<br>
server-software: active-directory<br>
client-software: sssd<br>
required-package: sssd-tools<br>
required-package: sssd<br>
required-package: libnss-sss<br>
required-package: libpam-sss<br>
required-package: adcli<br>
required-package: samba-common-bin</p>
<p>The above proves network connectivity to the DC, correct DNS resolution, and that AD is responding to discovery queries.</p>
<p>Join the domain as Administrator</p>
<p>sudo realm join --user=Administrator hybrid.local</p>
<p>Post domain join, verify the configuration and connectivity to the domain controller</p>
<p>sudo realm list</p>
<p>hybrid.local<br>
type: kerberos<br>
realm-name: HYBRID.LOCAL<br>
domain-name: hybrid.local<br>
configured: kerberos-member  <br>
server-software: active-directory<br>
client-software: sssd<br>
required-package: sssd-tools<br>
required-package: sssd<br>
required-package: libnss-sss<br>
required-package: libpam-sss<br>
required-package: adcli<br>
required-package: samba-common-bin<br>
login-formats: %U@hybrid.local<br>
login-policy: allow-realm-logins</p>
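<p>The only field that changes between the <code>realm discover</code> and <code>realm list</code> outputs above is <code>configured</code>, which makes it a convenient join check in scripts. A minimal sketch that parses the text output; <code>parse_realm_output</code> and <code>is_joined</code> are hypothetical helpers, and the sample strings are shortened copies of the outputs shown above.</p>

```python
from typing import Dict, List


def parse_realm_output(text: str) -> Dict[str, List[str]]:
    """Collect 'key: value' lines; repeated keys such as required-package become lists."""
    fields: Dict[str, List[str]] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # the first line is the bare realm name
        fields.setdefault(key.strip(), []).append(value.strip())
    return fields


def is_joined(text: str) -> bool:
    """Treat any 'configured' value other than 'no' as joined."""
    configured = parse_realm_output(text).get("configured", ["no"])
    return configured[0] != "no"


BEFORE = "hybrid.local\ntype: kerberos\nconfigured: no\nrequired-package: sssd\n"
AFTER = "hybrid.local\ntype: kerberos\nconfigured: kerberos-member\nlogin-formats: %U@hybrid.local\n"
print(is_joined(BEFORE), is_joined(AFTER))  # False True
```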
<p>Run <code>id testuser1@HYBRID.LOCAL</code>. You will see that <code>testuser1</code> is looked up from the Domain Controller by the Linux client.</p>
<p><strong>Enable automatic home directory creation:</strong></p>
<p>sudo pam-auth-update<br>
# Enable &ldquo;Create home directory on login&rdquo;</p>
<p><img loading="lazy" src="https://cdn-images-1.medium.com/max/800/1*wJB4k3Pt0bHAGn7BOqGUJg.png"></p>
<p>Enable “Create home directory on login”</p>
<h3 id="verification">Verification</h3>
<p>As a final test, you should be able to successfully login using one of the test users you had created and used for the Windows client (<code>testuser1@HYBRID.LOCAL</code>). Notice that the home directory for testuser1 is created.</p>
<p><img loading="lazy" src="https://cdn-images-1.medium.com/max/800/1*nhh-1a4NyS0SCfN7UeSggA.png"></p>
<p>Also, log in to the DC and confirm that the new Linux VM now appears under <code>hybrid.local &gt; Computers</code>.</p>
<p><img loading="lazy" src="https://cdn-images-1.medium.com/max/800/1*upo4Y8Mw6SqJrulaqpjMgQ.png"></p>
<h3 id="summary">Summary</h3>
<p>This concludes the part of the series where we established an on-premises identity foundation. In this series, so far, we have established</p>
<ul>
<li>A functioning Active Directory forest (<code>hybrid.local</code>) with two domain controllers</li>
<li>Multi-master replication verified across both DCs — SYSVOL, NETLOGON, and directory objects</li>
<li>FSMO roles identified and accounted for</li>
<li>A Windows client and a Linux VM both domain-joined and authenticated via Kerberos</li>
<li>DNS working end-to-end: internal resolution via the DC, external resolution via the forwarder</li>
</ul>
<p>Up next: an <strong>S2S VPN tunnel with Azure</strong>, which will complete the hybrid connectivity foundation.</p>
]]></content:encoded></item><item><title>HandsOn — Building Hybrid Cloud Environment — Part 3— Identity — Additional DC and Replication</title><link>https://gurupasupathy.com/post/2026-04-24_building-hce-part-3--identity-additional-dc-and-replication/</link><pubDate>Fri, 24 Apr 2026 00:00:00 +0000</pubDate><guid>https://gurupasupathy.com/post/2026-04-24_building-hce-part-3--identity-additional-dc-and-replication/</guid><description>&lt;p&gt;&lt;a href="https://pasupathy-guru.medium.com/handson-building-hybrid-cloud-environment-part-2-identity-on-premises-domain-controller-1152903ea89b?source=friends_link&amp;amp;sk=1e2563a1f7433d16441db694f87af581"&gt;Previously&lt;/a&gt;, we created a domain controller (DC), joined a test virtual machine to the newly created domain and verified the authentication of a test user from client VM. In this part, we will build redundancy into our environment by introducing a second domain controller.&lt;/p&gt;
&lt;p&gt;Active Directory (AD) is designed for &lt;strong&gt;multi-master replication&lt;/strong&gt;, meaning multiple domain controllers hold a copy of the directory database.&lt;/p&gt;
&lt;p&gt;Adding a second DC provides:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;High Availability&lt;/strong&gt; — authentication continues if one DC fails&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Load Distribution&lt;/strong&gt; — clients can authenticate against different DCs&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Replication Redundancy&lt;/strong&gt; — AD database changes replicate automatically&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In this part, we will:&lt;/p&gt;</description><content:encoded><![CDATA[<p><a href="https://pasupathy-guru.medium.com/handson-building-hybrid-cloud-environment-part-2-identity-on-premises-domain-controller-1152903ea89b?source=friends_link&amp;sk=1e2563a1f7433d16441db694f87af581">Previously</a>, we created a domain controller (DC), joined a test virtual machine to the newly created domain and verified the authentication of a test user from client VM. In this part, we will build redundancy into our environment by introducing a second domain controller.</p>
<p>Active Directory (AD) is designed for <strong>multi-master replication</strong>, meaning multiple domain controllers hold a copy of the directory database.</p>
<p>Adding a second DC provides:</p>
<ul>
<li><strong>High Availability</strong> — authentication continues if one DC fails</li>
<li><strong>Load Distribution</strong> — clients can authenticate against different DCs</li>
<li><strong>Replication Redundancy</strong> — AD database changes replicate automatically</li>
</ul>
<p>In this part, we will:</p>
<ul>
<li>Provision a secondary domain controller</li>
<li>Assign a static IP</li>
<li>Join the domain</li>
<li>Install AD DS and promote the server</li>
<li>Verify replication and health</li>
</ul>
<p>Let’s get started and add the second domain controller.</p>
<p><img loading="lazy" src="https://cdn-images-1.medium.com/max/1200/1*l-DaEiCkmW_Ms8p66MlzUA.png"></p>
<h4 id="provision-secondary-domain-controller">Provision Secondary domain controller</h4>
<p>For the second DC, create another Windows Server 2022 VM using the same process outlined in the <a href="https://pasupathy-guru.medium.com/handson-building-hybrid-cloud-environment-part-1-identity-connectivity-foundation-7788dd1eb827?source=friends_link&amp;sk=40547f5f11d2a24cbd3fd0705504bdba">first part</a> of this series.</p>
<blockquote>
<p>Ensure that the newly created VM is attached to the <strong>same libvirt virtual network</strong> so it can reach the primary DC</p>
</blockquote>
<p>Run <code>ipconfig /all</code> to confirm that the network range and gateway IP are the same as the primary DC’s. If you are following along, the gateway should be <code>192.168.122.1</code>.</p>
<h4 id="assign-staticip">Assign Static IP</h4>
<p>Running <code>ipconfig /all</code>, you will notice a preferred IPv4 address for this VM. It is <code>192.168.122.145</code> in my case. This IP was handed out by DHCP (dnsmasq) when the VM was created and attached to the virtual network. It can change the next time you restart the virtual network, and the VMs that have joined the domain will then not be able to reach the DC. To avoid this, we will assign a static IP to this VM.</p>
<p>Open <strong>Server Manager</strong></p>
<ol>
<li>Click <strong>Local Server</strong></li>
<li>Before setting the static IP, let us rename the computer to <code>HCE-DC02</code> and restart.</li>
<li>After restart, go to <strong>Local Server</strong> and Click the link next to <strong>Ethernet</strong></li>
<li>In the window that opens, right-click your Ethernet adapter → <strong>Properties</strong></li>
<li>Double-click <strong>Internet Protocol Version 4 (TCP/IPv4)</strong></li>
<li>Select <strong>Use the following IP address</strong> option</li>
<li>Enter the below values:</li>
</ol>
<p>IP address: 192.168.122.11 - The static IP we have chosen for secondary DC<br>
Subnet mask: 255.255.255.0<br>
Default gateway: 192.168.122.1 - Bridge&rsquo;s IP</p>
<p>8. Select <strong>Use the following DNS server addresses</strong></p>
<blockquote>
<p>Note: Before executing this step, do a small test. Run nslookup and you will notice the DNS query is sent to the gateway <code>192.168.122.1</code>. dnsmasq, which runs on the gateway, cannot answer queries about <code>hybrid.local</code> and forwards them up the chain: to your WiFi router, then to your ISP&rsquo;s DNS. None of them have ever heard of <code>hybrid.local</code> because it is not a public domain; it exists only inside the primary DC&rsquo;s DNS. The query times out somewhere in that chain and returns nothing useful.</p>
</blockquote>
<blockquote>
<p>Now, the secondary DC needs to rely on primary DC for any name resolution for domains managed by primary DC. In our case, <code>hybrid.local</code> is visible only in the context of primary DC and for secondary DC to reach other virtual machines in hybrid.local domain, it must consult primary DC’s DNS. That’s why you must set the Preferred DNS server as 192.168.122.10</p>
</blockquote>
<p>Set <code>Preferred DNS server: 192.168.122.10</code></p>
<p>Once the above steps are completed, check if the static IP is updated by running <code>ipconfig</code></p>
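<p>A static IP only works if it sits in the same subnet as the gateway and the primary DC. The values chosen above can be sanity-checked with Python’s standard <code>ipaddress</code> module; <code>same_subnet</code> is a hypothetical helper written for this illustration.</p>

```python
import ipaddress


def same_subnet(ip: str, other: str, netmask: str) -> bool:
    """True when 'other' falls inside the network defined by ip/netmask."""
    # strict=False lets us pass a host address instead of the network address.
    network = ipaddress.ip_network(f"{ip}/{netmask}", strict=False)
    return ipaddress.ip_address(other) in network


# Secondary DC against the gateway and the primary DC.
print(same_subnet("192.168.122.11", "192.168.122.1", "255.255.255.0"))   # True
print(same_subnet("192.168.122.11", "192.168.122.10", "255.255.255.0"))  # True
```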
<blockquote>
<p>Note around VM rename — Rename the server before promotion — renaming a DC after the fact is not recommended.</p>
</blockquote>
<p>Now, we can verify domain controller discovery. Run <code>nslookup</code> in interactive mode</p>
<p>nslookup<br>
set type=SRV<br>
_ldap._tcp.dc._msdcs.hybrid.local</p>
<p>The SRV record should return the hostname of the primary DC, <code>HCE-DC01</code>.</p>
<p><img loading="lazy" src="https://cdn-images-1.medium.com/max/800/1*0PcVnSWZRVH8b5t12b06uQ.png"></p>
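<p>If you prefer PowerShell over interactive <code>nslookup</code>, the same SRV lookup can be sketched with <code>Resolve-DnsName</code>. The <code>-Server</code> value assumes the primary DC&rsquo;s address used in this lab.</p>
<pre><code class="language-powershell"># Query the DC locator SRV record directly against the primary DC's DNS.
Resolve-DnsName -Name "_ldap._tcp.dc._msdcs.hybrid.local" -Type SRV -Server 192.168.122.10 |
    Format-Table Name, Type, NameTarget, Priority, Weight, Port
</code></pre>
<p>The <code>NameTarget</code> column should list the primary DC&rsquo;s hostname; any extra rows with an empty <code>NameTarget</code> are the accompanying address records.</p>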
<h4 id="join-thedomain">Join the Domain</h4>
<p>At this point, the VM is able to resolve the primary DC, and it has a static IP assigned. We just need one more step before promoting this VM as secondary DC. It must first be a <strong>domain member server</strong>.</p>
<blockquote>
<p>Unlike the primary DC which created the domain during promotion, the secondary DC is joining a domain that already exists. It needs a computer object, and a secure channel established before the promotion wizard can authenticate against the existing domain and begin replication.</p>
</blockquote>
<p>Run PowerShell as Administrator:</p>
<p>Add-Computer -DomainName hybrid.local -Credential HYBRID\Administrator -Restart</p>
<p>After reboot, log in as <code>hybrid\Administrator</code></p>
<p>To verify domain membership, run <code>whoami</code>. It should display the current domain\user as <code>hybrid\administrator</code>.</p>
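<p>Two further checks can confirm that the join created a working machine account. This is a small sketch to run on the newly joined server from an elevated PowerShell prompt.</p>
<pre><code class="language-powershell"># Shows the domain name and whether the machine is joined to a domain.
Get-CimInstance -ClassName Win32_ComputerSystem | Select-Object Name, Domain, PartOfDomain

# Returns True when the secure channel between this machine and the domain is healthy.
Test-ComputerSecureChannel
</code></pre>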
<h4 id="install-active-directory-domain-servicesrole">Install Active Directory Domain Services Role</h4>
<p>Now that the VM (it’s not a DC yet) has joined the domain, the next step is to make it a domain controller.</p>
<p>Open <strong>Server Manager</strong> &gt; click <strong>Manage</strong> (top right) &gt; click <strong>Add Roles and Features</strong> &gt; click <strong>Next</strong> until you reach <strong>Server Roles</strong> &gt; check <code>Active Directory Domain Services</code> &gt; when prompted, click <strong>Add Features</strong> &gt; click <strong>Next</strong> until you reach the confirmation page, then click <strong>Install</strong>.</p>
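<p>The same role installation can be sketched in PowerShell, which is handy if you rebuild the lab often. Run it from an elevated prompt on the server.</p>
<pre><code class="language-powershell"># Install the AD DS role together with its management tools (ADUC, PowerShell module).
Install-WindowsFeature -Name AD-Domain-Services -IncludeManagementTools

# Confirm the role shows as Installed.
Get-WindowsFeature -Name AD-Domain-Services
</code></pre>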
<h4 id="promote-domain-controller">Promote domain controller</h4>
<p>After installation completes, click the notification flag. Select: <code>Promote this server to a domain controller</code> and follow the wizard. The server will reboot after installation automatically.</p>
<p><strong><em>Note on deployment configuration —</em></strong></p>
<p>While running the wizard, pay attention to these settings:</p>
<ol>
<li>On the Deployment Configuration page, select <code>Add a domain controller to an existing domain</code> and provide the domain name <code>hybrid.local</code></li>
<li>Domain controller options — <br>
a. check <code>Domain Name System (DNS) server</code> and <code>Global Catalog</code><br>
b. uncheck <code>Read only domain controller (RODC)</code><br>
c. Set a <strong>Directory Services Restore Mode (DSRM) password</strong></li>
<li>Ignore the DNS delegation warning.</li>
<li>In <code>Additional Options</code>, set <code>Replicate from:</code> to <code>HCE-DC01.hybrid.local</code></li>
</ol>
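<p>The wizard choices above map onto a single PowerShell call. This is a sketch under the lab&rsquo;s assumptions (domain <code>hybrid.local</code>, replication source <code>HCE-DC01.hybrid.local</code>), not a replacement for reading the wizard&rsquo;s review page. By default the cmdlet makes the new DC a Global Catalog, does not create an RODC, and reboots the server when it finishes.</p>
<pre><code class="language-powershell"># Promote this member server to an additional domain controller for hybrid.local.
# You are prompted for the domain admin password and for a new DSRM password.
Install-ADDSDomainController `
    -DomainName "hybrid.local" `
    -Credential (Get-Credential -Credential "HYBRID\Administrator") `
    -InstallDns `
    -ReplicationSourceDC "HCE-DC01.hybrid.local" `
    -SafeModeAdministratorPassword (Read-Host -AsSecureString -Prompt "DSRM password")
</code></pre>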
<h4 id="verify-both-domain-controllers-exist">Verify Both Domain Controllers Exist</h4>
<p>The secondary domain controller is set up now. Run the following validation to confirm the promotion succeeded.</p>
<p>On both DCs, open <code>Active Directory Users and Computers</code> and navigate to <code>Domain Controllers</code>. You should now see <code>HCE-DC01</code> <strong>and</strong> <code>HCE-DC02</code>.</p>
<p>This confirms both DCs are part of the domain. We have successfully set up high availability for the domain controllers.</p>
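<p>The same check can be done from the command line. A minimal sketch, assuming the Active Directory PowerShell module is available (it is installed with the AD DS management tools):</p>
<pre><code class="language-powershell"># List every domain controller in the current domain with a few key properties.
Get-ADDomainController -Filter * |
    Select-Object Name, IPv4Address, IsGlobalCatalog, IsReadOnly
</code></pre>
<p>Both <code>HCE-DC01</code> and <code>HCE-DC02</code> should appear in the list.</p>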
<h4 id="verify-active-directory-replication">Verify Active Directory Replication</h4>
<p>Active Directory replicates directory changes between the two controllers automatically. You can check that by running <code>repadmin /replsummary</code></p>
<p>Example output:</p>
<p>Beginning data collection for replication summary, this may take a while:<br>
&hellip;..</p>
<p>Source DSA          largest delta    fails/total %%   error<br>
HCE-DC01                  14m:44s    0 /   5    0<br>
HCE-DC02                  02m:40s    0 /   5    0</p>
<p>Destination DSA     largest delta    fails/total %%   error<br>
HCE-DC01                  02m:40s    0 /   5    0<br>
HCE-DC02                  14m:44s    0 /   5    0</p>
<p>Interpretation:</p>
<ul>
<li><strong>largest delta</strong> → the longest time since the last replication with any partner</li>
<li><strong>fails/total</strong> → failed replication attempts out of the total</li>
</ul>
<p>A healthy environment shows <code>0</code> failures. This confirms that <strong>multi-master replication is working</strong>.</p>
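<p>For a per-partner view of the same information, the Active Directory module offers replication cmdlets. This is a sketch that assumes the lab&rsquo;s domain name; a <code>LastReplicationResult</code> of 0 means the last attempt succeeded.</p>
<pre><code class="language-powershell"># One row per replication partner and directory partition across the domain.
Get-ADReplicationPartnerMetadata -Target "hybrid.local" -Scope Domain |
    Select-Object Server, Partner, LastReplicationSuccess, LastReplicationResult

# Lists recorded replication failures; no output means none were found.
Get-ADReplicationFailure -Target "hybrid.local" -Scope Domain
</code></pre>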
<blockquote>
<p>After promotion completes and replication stabilizes, update the DNS settings so each DC points to itself as primary and the other DC as secondary.</p>
</blockquote>
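<p>The DNS change described in the note can be sketched as follows, assuming the adapter alias on both DCs is <code>Ethernet</code> and the addresses used in this lab. The first address in the list becomes the preferred DNS server and the second the alternate.</p>
<pre><code class="language-powershell"># On HCE-DC01: itself first, HCE-DC02 second.
Set-DnsClientServerAddress -InterfaceAlias "Ethernet" -ServerAddresses 192.168.122.10, 192.168.122.11

# On HCE-DC02: itself first, HCE-DC01 second.
Set-DnsClientServerAddress -InterfaceAlias "Ethernet" -ServerAddresses 192.168.122.11, 192.168.122.10
</code></pre>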
<h4 id="verify-sysvol-replication">Verify SYSVOL Replication</h4>
<p>Group Policies live in the <a href="https://learn.microsoft.com/en-us/windows-server/identity/ad-ds/plan/site-functions#sysvol-replication"><strong>SYSVOL</strong></a> <strong>folder</strong> and must replicate between DCs. Check that the share exists by running <code>net share</code>. It is a quick sanity check.</p>
<p>SYSVOL shares are only published by Windows when the DC considers itself healthy and SYSVOL replication is complete. Their presence confirms that <a href="https://learn.microsoft.com/en-us/previous-versions/windows/desktop/dfsr/dfsr-overview">DFS-R</a> has done its job and this DC is ready to serve Group Policy to domain members.</p>
<p>Look for the <code>SYSVOL</code> and <code>NETLOGON</code> shares in the output. If they are present, SYSVOL has replicated and this DC can serve Group Policy.</p>
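<p>A PowerShell equivalent of the <code>net share</code> check, shown as a small sketch to run on each DC:</p>
<pre><code class="language-powershell"># Both shares are published by a domain controller once SYSVOL is ready.
# An error for either name means that share is not published yet.
Get-SmbShare -Name "SYSVOL", "NETLOGON" | Select-Object Name, Path
</code></pre>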
<h4 id="identify-fsmo-roleholders">Identify FSMO Role Holders</h4>
<p>Even though AD supports multi-master writes, some operations must be handled by a <strong>single role owner</strong> to avoid conflicts. These are the <strong>FSMO roles</strong> (<strong>F</strong>lexible <strong>S</strong>ingle <strong>M</strong>aster <strong>O</strong>perations).</p>
<p>Check which server holds them by running <code>netdom query fsmo</code>.</p>
<p>Example output:</p>
<p>Schema master               HCE-DC01<br>
Domain naming master        HCE-DC01<br>
PDC                         HCE-DC01<br>
RID pool manager            HCE-DC01<br>
Infrastructure master       HCE-DC01</p>
<p>In small environments like this, all roles may remain on the <strong>first DC</strong>.</p>
<blockquote>
<p>Note: multi-master covers most directory writes; FSMO roles are for the specific operations where a single authority is required to avoid conflicts.</p>
</blockquote>
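<p>The same role holders can be read with the Active Directory module. Two of the roles are forest-wide and three are domain-wide, which is why the sketch below uses two cmdlets.</p>
<pre><code class="language-powershell"># Forest-wide roles.
Get-ADForest | Select-Object SchemaMaster, DomainNamingMaster

# Domain-wide roles.
Get-ADDomain | Select-Object PDCEmulator, RIDMaster, InfrastructureMaster
</code></pre>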
<h4 id="why-multiple-domain-controllers-matter">Why Multiple Domain Controllers Matter</h4>
<p>With two DCs:</p>
<ul>
<li>Both hold a <strong>replicated copy of the Active Directory database</strong></li>
<li>Clients discover them through <strong>DNS SRV records</strong></li>
<li>Clients choose a DC based on <strong>site proximity and priority</strong></li>
<li>If one DC is offline, clients automatically fail over</li>
</ul>
<p>Authentication flow now becomes:</p>
<p>Client<br>
↓<br>
DNS SRV query<br>
↓<br>
List of Domain Controllers<br>
↓<br>
Client selects reachable DC<br>
↓<br>
Kerberos authentication</p>
<p>This is why <strong>clients never hardcode a Domain Controller IP</strong>. DNS provides the <strong>dynamic discovery layer</strong>.</p>
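<p>To watch this discovery from a client, you can ask the DC Locator which controller it would use and list what DNS advertises. This is a sketch assuming a domain-joined machine in the <code>hybrid.local</code> lab.</p>
<pre><code class="language-powershell"># Ask the DC Locator which domain controller this machine would use right now.
nltest /dsgetdc:hybrid.local

# List the DCs advertised in DNS, with the SRV priority and weight used for selection.
Resolve-DnsName -Name "_ldap._tcp.dc._msdcs.hybrid.local" -Type SRV |
    Format-Table NameTarget, Priority, Weight, Port
</code></pre>
<p>Shutting down one DC and repeating the <code>nltest</code> call with the <code>/force</code> switch is a simple way to observe the failover.</p>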
<h4 id="summary">Summary</h4>
<p>In this part, we introduced a second domain controller and validated replication, establishing high availability for Active Directory.</p>
<p>Current lab state:</p>
<p>HCE-DC01 → First domain controller<br>
HCE-DC02 → Additional domain controller<br>
Client VM → Domain joined</p>
<p>This completes the <strong>Active Directory redundancy layer</strong> making it resilient.</p>
<p>In the next part, we will integrate a Linux VM and validate Kerberos-based authentication, extending identity beyond Windows systems.</p>
]]></content:encoded></item><item><title>HandsOn — Building Hybrid Cloud Environment — Part 2— Identity — On-Premises Domain Controller</title><link>https://gurupasupathy.com/post/2026-04-18_building-hce-part-2--identity-on-premises-domain-controller/</link><pubDate>Sat, 18 Apr 2026 00:00:00 +0000</pubDate><guid>https://gurupasupathy.com/post/2026-04-18_building-hce-part-2--identity-on-premises-domain-controller/</guid><description>&lt;p&gt;In the &lt;a href="https://pasupathy-guru.medium.com/handson-building-hybrid-cloud-environment-part-1-identity-connectivity-foundation-7788dd1eb827?source=friends_link&amp;amp;sk=40547f5f11d2a24cbd3fd0705504bdba"&gt;first part&lt;/a&gt;, we laid the foundation for the hybrid cloud environment. Now we have a virtual network with VM running Windows Server 2022 Evaluation. In this part, we will focus on adding the Identity plane to the hybrid cloud environment by introducing a domain controller and creating an Active Directory structure. We will create a client VM, domain join it and make sure a domain user is able to login&lt;/p&gt;</description><content:encoded><![CDATA[<p>In the <a href="https://pasupathy-guru.medium.com/handson-building-hybrid-cloud-environment-part-1-identity-connectivity-foundation-7788dd1eb827?source=friends_link&amp;sk=40547f5f11d2a24cbd3fd0705504bdba">first part</a>, we laid the foundation for the hybrid cloud environment. Now we have a virtual network with VM running Windows Server 2022 Evaluation. In this part, we will focus on adding the Identity plane to the hybrid cloud environment by introducing a domain controller and creating an Active Directory structure. We will create a client VM, domain join it and make sure a domain user is able to login</p>
<p>We will be following the below sequence.</p>
<ul>
<li>Promote the virtual machine <code>hce-dc01</code>, created in <a href="https://pasupathy-guru.medium.com/handson-building-hybrid-cloud-environment-part-1-identity-connectivity-foundation-7788dd1eb827?source=friends_link&amp;sk=40547f5f11d2a24cbd3fd0705504bdba">part 1</a>, to primary domain controller and create the domain, forest, and OU</li>
<li>Create user accounts</li>
<li>Domain join a Windows client</li>
</ul>
<p><img loading="lazy" src="https://cdn-images-1.medium.com/max/1200/1*0dxAdZ9Lr-lZh7pA1UvbhA.png"></p>
<h4 id="primary-domain-controller-configuration"><strong>Primary Domain Controller Configuration</strong></h4>
<p>The <a href="https://learn.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2003/cc786438%28v=ws.10%29">official Microsoft documentation</a> defines a domain controller as — <br>
 “<em>A domain controller is a server that is running a version of the Windows Server® operating system and has Active Directory® Domain Services installed.</em>”</p>
<p>For a simple hybrid cloud environment, a domain controller is not mandatory. So why do we need one? Some hybrid scenarios depend heavily on the on-premises side having an identity plane; AD Connect is one example. To introduce the identity plane in our virtual network, we need a server to manage the domain, forest, OU, users, and policies, and to handle user authentication and policy enforcement. This will be our domain controller, and the VM we created in <a href="https://pasupathy-guru.medium.com/handson-building-hybrid-cloud-environment-part-1-identity-connectivity-foundation-7788dd1eb827?source=friends_link&amp;sk=40547f5f11d2a24cbd3fd0705504bdba">part 1</a> will be used for this purpose.</p>
<p>Before starting this configuration, run <code>ip addr</code> and note down the IP ranges of the virtual bridge and the Wi-Fi network. The command lists all the interfaces on your machine with their addresses. Notice that the virtual bridge, <code>virbr0</code> (<em>created when we set up the virtual network, refer to</em> <a href="https://pasupathy-guru.medium.com/handson-building-hybrid-cloud-environment-part-1-identity-connectivity-foundation-7788dd1eb827?source=friends_link&amp;sk=40547f5f11d2a24cbd3fd0705504bdba"><em>part 1</em></a>), has the address <code>192.168.122.1/24</code> in my case, which puts the virtual network in the <code>192.168.122.0/24</code> range.</p>
<p>virbr0: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1500 qdisc noqueue state UP group default qlen 1000<br>
link/ether 52:54:00:d9:a6:2b brd ff:ff:ff:ff:ff:ff<br>
inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0<br>
valid_lft forever preferred_lft forever</p>
<p>Look for your Wi-Fi interface in the output. <code>wlp58s0</code> is my Wi-Fi interface; it has the address <code>192.168.1.106/24</code>, which puts it in the <code>192.168.1.0/24</code> range.</p>
<p>wlp58s0: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1500 qdisc noqueue state UP group default qlen 1000<br>
link/ether 04:ed:33:e3:9d:f1 brd ff:ff:ff:ff:ff:ff<br>
inet 192.168.1.106/24 brd 192.168.1.255 scope global dynamic noprefixroute wlp58s0<br>
valid_lft 84447sec preferred_lft 84447sec<br>
inet6 fe80::d291:8b3e:4e0:cc87/64 scope link noprefixroute <br>
valid_lft forever preferred_lft forever</p>
<p>virbr0 gateway is 192.168.122.1<br>
wifi router gateway is 192.168.1.1</p>
<p>I will pick an IP from the <code>virbr0</code> range and assign it to the newly created VM, which is going to be our primary domain controller.</p>
<h4 id="static-ip-for-domain-controller"><strong>Static IP for Domain Controller</strong></h4>
<p>The domain controller needs a static IP so that it stays reachable at the same address even if the virtual network restarts. dnsmasq, which we briefly touched on in <a href="https://pasupathy-guru.medium.com/handson-building-hybrid-cloud-environment-part-1-identity-connectivity-foundation-7788dd1eb827?source=friends_link&amp;sk=40547f5f11d2a24cbd3fd0705504bdba">part 1</a>, is responsible for assigning IPs to the virtual machines. When the virtual network restarts, dnsmasq acts as the DHCP server and hands out IPs to all the VMs connected to <code>virbr0</code>. If the domain controller gets a different IP after a restart, every domain-joined VM that still points to the old address loses contact with the domain. To avoid this, we will assign static IPs to the domain controllers.</p>
<p>Now that we know the IP range of virtual network, I will pick an IP, say <code>192.168.122.10</code> as my primary domain controller’s IP.</p>
<p>Login to the Windows Server we created in <a href="https://pasupathy-guru.medium.com/handson-building-hybrid-cloud-environment-part-1-identity-connectivity-foundation-7788dd1eb827?source=friends_link&amp;sk=40547f5f11d2a24cbd3fd0705504bdba">part 1</a> and follow below steps to set the static IP —</p>
<blockquote>
<p>Before setting the static IP, rename the computer to <code>HCE-DC01</code> if you haven’t done it already and restart.</p>
</blockquote>
<p>Open <strong>Server Manager</strong></p>
<ol>
<li>Click <strong>Local Server</strong> → click the link next to <strong>Ethernet</strong></li>
<li>In the Network Connections window that opens, right-click your Ethernet adapter → <strong>Properties</strong></li>
<li>Double-click <strong>Internet Protocol Version 4 (TCP/IPv4)</strong></li>
<li>Select the <strong>Use the following IP address</strong> option</li>
<li>Enter the below values:</li>
</ol>
<p>IP address: 192.168.122.10 - The static IP we chose<br>
Subnet mask: 255.255.255.0<br>
Default gateway: 192.168.122.1 - Bridge&rsquo;s IP</p>
<p>6. Select <strong>Use the following DNS server addresses</strong></p>
<p>7. Set Preferred DNS server: <code>192.168.122.10</code></p>
<p>8. Click OK → Close all windows</p>
<p>Check if the static IP is updated by running <code>ipconfig</code>.</p>
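<p>The GUI steps above have a PowerShell equivalent. This is a minimal sketch, assuming the adapter alias is <code>Ethernet</code> and that the adapter has no static address yet; a prefix length of 24 corresponds to the subnet mask 255.255.255.0.</p>
<pre><code class="language-powershell"># Static IP and default gateway for the primary DC on the virbr0 network.
# "Ethernet" is an assumed alias; list the real ones with Get-NetAdapter.
New-NetIPAddress -InterfaceAlias "Ethernet" -IPAddress 192.168.122.10 -PrefixLength 24 -DefaultGateway 192.168.122.1

# The first DC uses itself as its DNS server.
Set-DnsClientServerAddress -InterfaceAlias "Ethernet" -ServerAddresses 192.168.122.10

# Confirm the address, gateway, and DNS server.
Get-NetIPConfiguration -InterfaceAlias "Ethernet"
</code></pre>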
<blockquote>
<p>When you set up the first domain controller, it must use itself as its DNS server. Once the secondary domain controller is up, the two should ideally reference each other.</p>
</blockquote>
<blockquote>
<p>Note on VM rename: rename the server before promotion. Renaming a domain controller after the fact is painful.</p>
</blockquote>
<h4 id="install-active-directory-domain-servicesrole"><strong>Install Active Directory Domain Services Role</strong></h4>
<p>For a server to perform the role of a domain controller, it needs certain capabilities. These include storage to persist the objects (forest, users, computers, groups, and policies), an authentication layer, and policy enforcement mechanisms. This is not an exhaustive list; I have listed only the capabilities relevant to our hybrid environment right now.</p>
<p>Active Directory Domain Services is a feature that you install on your Windows Server to make it a domain controller. Here is the official definition of AD DS — “<em>A directory is a hierarchical structure that stores information about objects on a network. A directory service, such as</em> <a href="https://learn.microsoft.com/en-us/windows-server/identity/ad-ds/get-started/virtual-dc/active-directory-domain-services-overview"><strong><em>Active Directory Domain Services (AD DS)</em></strong></a><em>, provides methods for storing directory data and making this data available to network users and administrators. For example, AD DS stores information about user accounts, such as names, passwords, phone numbers, and so on. AD DS also provides a way for authorized users on the same network to access this information.</em>”</p>
<p><strong><em>Rough steps to install AD DS</em></strong></p>
<p>Open <strong>Server Manager</strong> &gt; click <strong>Manage</strong> (top right) &gt; click <strong>Add Roles and Features</strong> &gt; click <strong>Next</strong> until you reach <strong>Server Roles</strong> &gt; check Active Directory Domain Services &gt; when prompted, click <strong>Add Features</strong> &gt; click <strong>Next</strong> until you reach the confirmation page, then click <strong>Install</strong>.</p>
<p>I’m not going into the details of the installation as many resources document the process in detail.</p>
<p>With this, the virtual machine has the role it needs to act as a domain controller.</p>
<h4 id="promote-this-server-to-a-domain-controller"><strong>Promote This Server to a Domain Controller</strong></h4>
<p>Installing the AD DS role and promoting the server are two distinct steps, and it’s easy to miss why. Here is the distinction that matters. What makes a server a domain controller is not what’s installed — it’s whether a valid, initialised <a href="https://trustedsec.com/blog/exploring-ntds-dit-part-1-cracking-the-surface-with-dit-explorer">NTDS.dit</a> exists, the NTDS service is running against it, and the network knows where to find it via SRV records. Promotion is the act of going from <em>capable</em> to <em>instantiated</em>.</p>
<p>Open <strong>Server Manager &gt;</strong> You should see a yellow triangle notification at top right. Click it. &gt; Click: Promote this server to a domain controller</p>
<p><strong><em>Note on deployment configuration</em></strong></p>
<p>While installing, pay attention to these attributes</p>
<ol>
<li>On the Deployment Configuration page, select <code>Add a new forest</code> and enter <code>hybrid.local</code> as the root domain name</li>
<li>Domain controller options — <br>
a. Forest functional level: <em>leave default</em> <br>
b. Domain functional level: <em>leave default</em> <br>
c. DNS Server should already be checked <br>
d. Global Catalog should be checked <br>
e. <em>Do NOT</em> check Read-Only DC <br>
f. Set a <strong>Directory Services Restore Mode (DSRM) password</strong> (Write this down somewhere safe.)</li>
<li>Ignore the DNS delegation warning.</li>
<li>When creating a new Active Directory forest, the setup process also asks for a <strong>NetBIOS name</strong> for the domain. NetBIOS is a legacy naming system that predates modern DNS-based Active Directory environments and was widely used in older Windows networks for computer and resource identification. The wizard auto-fills <code>HYBRID</code>; leave it as is.</li>
</ol>
<p>Again, I’m not providing detailed steps for every screen of the wizard; the process is well documented. Apart from the settings mentioned above, everything else can be left at its default value. If you see warnings during the prerequisites check, you can ignore them; they do not affect the environment we are building. We will revisit them later if needed.</p>
<blockquote>
<p>⚠ After the installation, the server will reboot automatically.</p>
</blockquote>
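<p>For reference, the role installation and the promotion can be sketched as two PowerShell commands. This assumes the lab&rsquo;s names (<code>hybrid.local</code>, NetBIOS name <code>HYBRID</code>) and leaves the functional levels at their defaults, as in the wizard. The server reboots automatically when the second command finishes.</p>
<pre><code class="language-powershell"># Step 1: install the AD DS role and its management tools.
Install-WindowsFeature -Name AD-Domain-Services -IncludeManagementTools

# Step 2: create the new forest and promote this server to its first DC.
# You are prompted for the DSRM password; store it somewhere safe.
Install-ADDSForest `
    -DomainName "hybrid.local" `
    -DomainNetbiosName "HYBRID" `
    -InstallDns `
    -SafeModeAdministratorPassword (Read-Host -AsSecureString -Prompt "DSRM password")
</code></pre>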
<p>After the reboot, you will be able to log in to the new domain as Administrator.</p>
<p>Now we have the domain controller up and ready. As a first test, try <code>nslookup google.com</code>. You will likely see that the domain controller fails to resolve this query. Let’s fix this next.</p>
<h4 id="adding-a-dns-forwarder"><strong>Adding a DNS Forwarder</strong></h4>
<p>After installing Active Directory Domain Services, the Domain Controller also becomes the <strong>authoritative DNS server</strong> for the new domain (hybrid.local). At this point the DNS server knows how to resolve <strong>internal Active Directory records</strong> such as domain controllers, LDAP services, and domain-joined machines. However, it has no knowledge of <strong>external internet domains</strong> like <code>google.com</code> or <code>microsoft.com</code>. When a domain-joined machine sends a DNS query for an external address, the request reaches the domain controller but cannot be resolved. Configuring a <strong>DNS forwarder</strong> solves this by instructing the DNS server to pass any unknown queries to an upstream resolver (for example, the home router or a public DNS server such as <code>8.8.8.8</code>). The domain controller therefore resolves internal names itself and forwards everything else, allowing domain clients to access the internet while still using the domain controller as their primary DNS server.</p>
<p>To set up the forwarder, open <strong>Server Manager</strong> &gt; <strong>Tools</strong> → <strong>DNS</strong> &gt; select your server name, then right-click it → <strong>Properties</strong> &gt; go to the <strong>Forwarders</strong> tab &gt; click <strong>Edit</strong> &gt; add the <code>virbr0</code> gateway IP <code>192.168.122.1</code>.</p>
<p>If you try the lookup again with <code>nslookup google.com</code>, it will resolve.</p>
<p>Without a forwarder, the DNS server attempts recursive resolution using root hints, which can introduce delays or timeouts in lab environments behind NAT. Configuring a forwarder provides a faster and more predictable path for resolving external names.</p>
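<p>The forwarder can also be added from PowerShell on the domain controller. This is a short sketch using the DNS Server module, with the <code>virbr0</code> gateway from this lab as the upstream resolver.</p>
<pre><code class="language-powershell"># Forward queries the DC cannot answer itself to the virbr0 gateway (dnsmasq).
Add-DnsServerForwarder -IPAddress 192.168.122.1 -PassThru

# Show the configured forwarders.
Get-DnsServerForwarder

# Re-test external resolution through the DC's own DNS service.
Resolve-DnsName -Name "google.com" -Server 192.168.122.10
</code></pre>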
<h4 id="creating-useraccounts"><strong>Creating User Accounts</strong></h4>
<p>Next, create normal domain users. On the domain controller, launch Active Directory Users and Computers, go to the Users folder, and create two test users: <code>testuser1</code> and <code>testuser2</code>. Set a password for each and enable the accounts.</p>
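<p>If you would rather script the user creation, here is a minimal sketch with the Active Directory module. It assumes both users share one initial password (typed at the prompt) and that the default Users container is acceptable; the UPN suffix follows the lab&rsquo;s domain name.</p>
<pre><code class="language-powershell"># Prompt once for the initial password; it is never stored in plain text.
$password = Read-Host -AsSecureString -Prompt "Initial password for the test users"

# Create and enable both accounts in the default Users container.
foreach ($name in "testuser1", "testuser2") {
    New-ADUser -Name $name -SamAccountName $name -UserPrincipalName "$name@hybrid.local" -AccountPassword $password -Enabled $true
}

# Confirm the accounts exist and are enabled.
Get-ADUser -Filter 'SamAccountName -like "testuser*"' | Select-Object SamAccountName, Enabled
</code></pre>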
<h4 id="create-a-client-vm-to-join-thedomain"><strong>Create a client VM to join the domain</strong></h4>
<p>Now that the domain controller is ready, we can test if client VMs are able to join the new domain we created. To test the domain join, spin up a new Windows VM to act as our client. I created another instance of Windows Server 2022 because I did not want to download another ISO just for testing.</p>
<h4 id="join-client-vm-todomain"><strong><em>Join client VM to Domain</em></strong></h4>
<p>When you spin up a new client VM, its preferred DNS will be the gateway <code>192.168.122.1</code> (in line with the IP range of the virtual network).</p>
<p><strong><em>Configure the Client VM’s DNS</em></strong></p>
<p>Before joining the domain, the client&rsquo;s DNS server must point to the domain controller’s IP, because the domain <code>hybrid.local</code> is not publicly resolvable like <code>google.com</code>; it exists only in the domain controller&rsquo;s DNS. If the client VM uses any other DNS server, it will fail to locate the domain controller and the domain join will not proceed.</p>
<p>Set-DnsClientServerAddress -InterfaceAlias "Ethernet" -ServerAddresses 192.168.122.10</p>
<p>If the adapter is not named <code>Ethernet</code>, use the command below to look up the actual interface alias:</p>
<p>Get-NetAdapter | select Name, InterfaceAlias, Status</p>
<p><strong><em>Testing the Client</em></strong></p>
<p><strong><em>Step 1 — Check the DNS server (from Client VM)</em></strong></p>
<p>On the <strong>client VM</strong>, open PowerShell as Administrator and run:</p>
<p>ipconfig /all</p>
<p><strong>DNS Server</strong> should be <code>192.168.122.10</code></p>
<p><strong><em>Step 2 — Verify if nslookup resolves (from Client VM)</em></strong></p>
<p>On the <strong>client VM</strong>, test domain controller discovery via DNS.</p>
<p>nslookup _ldap._tcp.dc._msdcs.hybrid.local</p>
<p>It should return —</p>
<p><img loading="lazy" src="https://cdn-images-1.medium.com/max/800/1*PXQqlrv5UOx6Pg2h1i0vJg.png"></p>
<p>The snip above shows that the client VM now uses the domain controller as its DNS server and that the DNS server is responding. We are good to join the domain.</p>
<p><strong><em>Step 3 — Check if SRV record is correct (from Client VM)</em></strong></p>
<p>Run <code>nslookup</code> in interactive mode and, inside the prompt, query the SRV record:</p>
<p>set type=SRV<br>
_ldap._tcp.dc._msdcs.hybrid.local</p>
<p>If you see your domain controller&rsquo;s hostname listed under <strong>svr hostname</strong>, DNS and SRV discovery are working.</p>
<p><strong><em>Step 4 — Test LDAP connectivity (from Client VM)</em></strong></p>
<p>From the client, test LDAP connectivity with <code>Test-NetConnection 192.168.122.10 -Port 389</code>. The response should say <code>TcpTestSucceeded : True</code>.</p>
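<p>LDAP is only one of the services a domain join relies on. As a sketch, the loop below repeats the same TCP check for DNS (53), Kerberos (88), LDAP (389), and SMB (445); this port list is my own selection for the lab, not an exhaustive list of what Active Directory uses.</p>
<pre><code class="language-powershell"># Check TCP reachability of the main services on the domain controller.
foreach ($port in 53, 88, 389, 445) {
    Test-NetConnection -ComputerName 192.168.122.10 -Port $port |
        Select-Object RemotePort, TcpTestSucceeded
}
</code></pre>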
<p><strong><em>Join the domain (from Client VM)</em></strong></p>
<p>We are all set to join this VM to the domain. Run the below PowerShell command to join the domain</p>
<p>Add-Computer -DomainName hybrid.local -Credential HYBRID\Administrator -Restart</p>
<p>What happens internally:</p>
<ul>
<li>Client queries DNS for domain controller (SRV record)</li>
<li>Client contacts domain controller via LDAP</li>
<li>Admin credentials authenticated (Kerberos/NTLM) to authorize the join</li>
<li>Domain controller creates a Computer Object in AD</li>
<li>Machine account password established — this becomes the secure channel secret</li>
<li>Client reboots as domain member</li>
</ul>
<h4 id="verification"><strong>Verification</strong></h4>
<p><strong><em>Domain membership verification:</em></strong> Log in as <code>HYBRID\testuser1</code>, the user you created earlier. Run <code>whoami</code>; it should return <code>hybrid\testuser1</code>. Running <code>echo %logonserver%</code> in Command Prompt (or <code>$env:LOGONSERVER</code> in PowerShell) should show your domain controller hostname, <code>\\HCE-DC01</code>.</p>
<p>This proves that Kerberos + secure channel + domain controller communication is working.</p>
<p><strong><em>Validate AD object creation:</em></strong> On the domain controller, open <strong>Active Directory Users and Computers</strong> and go to <strong>Computers</strong>; you should see your client machine listed. This proves that the AD object lifecycle works.</p>
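<p>Both verifications also have command-line equivalents. This is a sketch: the first command runs on the client from an elevated prompt, the second on the domain controller.</p>
<pre><code class="language-powershell"># On the client: returns True when the machine account's secure channel is healthy.
Test-ComputerSecureChannel -Verbose

# On the domain controller: list computer objects, including the newly joined client.
Get-ADComputer -Filter * | Select-Object Name, DNSHostName, Enabled
</code></pre>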
<h4 id="summary"><strong>Summary</strong></h4>
<p>In this part, we set up a domain controller, joined a VM to the domain, and tested the connectivity. Although a single domain controller setup works, it is a <strong>single point of failure</strong>. If the only domain controller fails:</p>
<ol>
<li>Authentication stops</li>
<li>Kerberos tickets cannot be issued</li>
<li>Group Policy stops applying</li>
<li>New logons fail</li>
</ol>
<p>In the next part, we will build redundancy for the domain controller by adding a secondary domain controller.</p>
]]></content:encoded></item><item><title>HandsOn — Building Hybrid Cloud Environment — Part 1 — Identity &amp; Connectivity Foundation</title><link>https://gurupasupathy.com/post/2026-04-12_building-hce-part-1-identity-connectivity-foundation/</link><pubDate>Sun, 12 Apr 2026 00:00:00 +0000</pubDate><guid>https://gurupasupathy.com/post/2026-04-12_building-hce-part-1-identity-connectivity-foundation/</guid><description>&lt;h4 id="introduction"&gt;Introduction&lt;/h4&gt;
&lt;p&gt;In this series, I will take you through building an on-premises / Azure hybrid environment, with the on-premises network running entirely on a single machine. We will set up an on-premises Active Directory forest, create OUs and users, deploy domain controllers, join Windows and Linux VMs to the domain, and establish hybrid connectivity to Azure using an S2S VPN tunnel.&lt;/p&gt;
&lt;p&gt;I want to clarify right at the outset that on-premises identity is not a mandatory starting point for a hybrid cloud environment. But I have chosen to build it from the ground up starting with the identity plane (on-premises Active Directory) .&lt;/p&gt;</description><content:encoded><![CDATA[<h4 id="introduction">Introduction</h4>
<p>In this series, I will take you through building an on-premises / Azure hybrid environment, with the on-premises network running entirely on a single machine. We will set up an on-premises Active Directory forest, create OUs and users, deploy domain controllers, join Windows and Linux VMs to the domain, and establish hybrid connectivity to Azure using an S2S VPN tunnel.</p>
<p>I want to clarify right at the outset that on-premises identity is not a mandatory starting point for a hybrid cloud environment. But I have chosen to build it from the ground up, starting with the identity plane (on-premises Active Directory).</p>
<p>To follow along you do not require deep networking or Linux expertise, but comfort with basic bash and networking concepts will help — and we will build the required knowledge as we go. Where appropriate, I will reference official documentation rather than re-explaining well-documented concepts.</p>
<p>At the end of this series, I will share my GitHub repo with the automation scripts for the infrastructure.</p>
<p>When I started exploring hybrid environments, I assumed it would require multiple machines, dedicated networking, and possibly additional hardware. That made it feel like something I couldn’t easily experiment with on my own setup. As I explored further, I realized those assumptions weren’t entirely true. I was able to build a working hybrid environment on my laptop, keeping everything contained and manageable. This series documents how I put it together. Here’s the setup I’m using: <code>a laptop running Linux Mint 22.1 (Xia) with 16 GB RAM, a regular home Wi-Fi router, and an Azure subscription (PAYG).</code> Let’s get started.</p>
<h4 id="building-an-on-premises-virtualnetwork">Building an On-premises virtual network</h4>
<p>In this first part of the series, I will set up the virtual network on my Linux laptop. This is the foundation for the hybrid environment we are building. By the end of this article, we will have laid the foundation which will include a virtual network — our on-premises representation and a Virtual Machine within the virtual network.</p>
<p><img loading="lazy" src="https://cdn-images-1.medium.com/max/800/1*QPEIpLQrFfvq4hfgw04yag.png"></p>
<p>The above diagram shows what we will have in place by the end of this article.</p>
<h4 id="software-prerequisites-and-why-you-needthem">Software prerequisites and why you need them</h4>
<p>The on-premises network is virtual. We will need libraries and applications that enable the creation and management of virtual networks and virtual machines. Follow the below steps to prepare the environment.</p>
<blockquote>
<p>Installation steps for KVM, QEMU, libvirt and virt-manager vary by Linux distribution and version. Refer to your distribution’s documentation for the correct package names and commands</p>
</blockquote>
<ul>
<li><strong>Verify</strong> virtualization is enabled — Run <code>egrep -c '(vmx|svm)' /proc/cpuinfo</code>. A result greater than 0 means your CPU supports hardware virtualization and you&rsquo;re good to go.</li>
<li>Install <a href="https://ubuntu.com/blog/kvm-hyphervisor"><strong>KVM hypervisor</strong></a> <strong>and</strong> <a href="https://www.qemu.org/docs/master/"><strong>QEMU</strong></a> — These two work as a pair. KVM is the Linux kernel module that provides hardware virtualization, allowing the Guest OS to execute instructions directly on the host CPU at near-native speed. QEMU handles the emulated hardware that the Guest OS interacts with — the virtual disk drives, the network card, and the VGA BIOS that Windows thinks it’s seeing.</li>
<li>Install <a href="https://www.libvirt.org/apps.html"><strong>libvirt</strong></a> — this is the virtualization manager I will be using to manage my virtual network. It is the control layer. It translates GUI actions into XML definitions and complex command-line instructions for the hypervisor. It manages storage pools, virtual networks, and the VM lifecycle. For example, when you add new virtual hardware, say a CD-ROM drive, libvirt generates the XML configuration for it, which is then read and processed by QEMU.</li>
<li><a href="https://virt-manager.org/"><strong>Virtual Machine Manager (virt-manager)</strong></a> — a handy GUI for libvirt. You launch it with <code>virt-manager</code>, and it lets you create and manage VMs but does not run them.</li>
<li>Download the <a href="https://www.microsoft.com/en-us/evalcenter/evaluate-windows-server-2022">Windows Server 2022 Evaluation</a> ISO file. The OS must be a server OS that supports Active Directory Domain Services, which is why I’ve chosen Windows Server 2022.</li>
<li>Download virtIO from <a href="https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/stable-virtio/virtio-win.iso">https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/stable-virtio/virtio-win.iso</a></li>
</ul>
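<p>The CPU check from the first step can also be scripted. The following is a minimal sketch, not part of the original setup: it counts the lines of a <code>/proc/cpuinfo</code> dump that mention <code>vmx</code> (Intel VT-x) or <code>svm</code> (AMD-V), the same way <code>grep -c</code> does. The function names and the sample text are hypothetical.</p>

```python
import re

# Same pattern as the grep command: Intel VT-x (vmx) or AMD-V (svm).
VIRT_FLAG = re.compile(r"(vmx|svm)")


def count_virt_lines(cpuinfo_text):
    """Count lines that mention vmx or svm, mirroring `grep -c`."""
    return sum(1 for line in cpuinfo_text.splitlines() if VIRT_FLAG.search(line))


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the kernel's CPU description (this file only exists on Linux)."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


# Hypothetical sample, trimmed to the relevant lines for two logical CPUs.
SAMPLE = (
    "processor : 0\n"
    "flags : fpu vme vmx sse2\n"
    "processor : 1\n"
    "flags : fpu vme vmx sse2\n"
)

if __name__ == "__main__":
    print(count_virt_lines(SAMPLE))
```

<p>On a real host you would call <code>count_virt_lines(read_cpuinfo())</code>; any value above zero corresponds to the "good to go" result described above.</p>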
<p>Once the above steps are completed, run <code>virsh net-list --all</code>. This should show the default network that libvirt just created.</p>
<p><strong>At this stage, the virtual network is up and all the tools needed to create and manage VMs are in place.</strong></p>
<h4 id="role-of-libvirt-virt-manager-kvm-andqemu">Role of libvirt, virt-manager, KVM and QEMU</h4>
<p>It helps to build a simple mental model of how these components fit together — specifically, how virt-manager interacts with the underlying hypervisor.</p>
<p><strong>The Management Flow (Control Plane)</strong> — when you create or configure a VM:</p>
<p><code>virt-manager → libvirt → QEMU/KVM</code></p>
<ul>
<li>virt-manager provides the GUI</li>
<li>libvirt acts as the control layer, translating actions into configurations</li>
<li>QEMU/KVM executes those configurations</li>
</ul>
<p><strong>The Execution Flow (Data Plane)</strong> — when something runs inside the VM:</p>
<p><code>User → Guest OS (Windows) → QEMU → KVM → Hardware</code></p>
<ul>
<li>QEMU handles device emulation (disk, NIC, etc.)</li>
<li>KVM provides direct access to CPU virtualization features</li>
</ul>
<p>This separation helps explain why VM configuration and VM execution are two different layers.</p>
<h4 id="virtual-networking">Virtual Networking</h4>
<p>Before we create the virtual machine, a few networking concepts around virtual networking and libvirt are worth clarifying.</p>
<p><strong>Virtual network</strong>: the private address space where your VMs live. It exists only inside the host and is not visible to the Wi-Fi network that connects your laptop to the phones and other devices around it.</p>
<p><strong>Bridge (virbr0)</strong>, a <strong>Layer 2</strong> construct: the virtual switch that connects your VMs to each other and to the host, like a physical switch in a rack. When VM1 sends a packet to VM2, the bridge forwards it based on MAC address learning, just as a Layer 2 switch would. Every VM connected to the default network plugs into this switch.</p>
<p><strong>Gateway (192.168.122.1)</strong>, a <strong>Layer 3</strong> construct: the door out of the virtual network. Packets destined for anywhere outside the virtual network go here first and are then routed according to the host&rsquo;s route table.</p>
<blockquote>
<p><strong>Gateway vs bridge:</strong> At first they looked similar to me, and it took me some time to understand the difference. The bridge connects devices at Layer 2 by MAC address. If two VMs in the same virtual network want to talk, the bridge connects them; they bypass the gateway. The gateway is the IP address assigned to the bridge interface, acting as the Layer 3 entry/exit point for the virtual network. It is the address VMs use when sending traffic outside the virtual network.</p>
</blockquote>
<blockquote>
<p>VMs talk to <strong>each other through the bridge</strong>, they talk to the <strong>outside world through the gateway</strong>, and the virtual network is the address space that gives them all a place to live.</p>
</blockquote>
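<p>The bridge-versus-gateway decision is the ordinary "is the destination in my subnet?" test that every IP host performs. Here is a small sketch of that rule using Python&rsquo;s standard <code>ipaddress</code> module. The <code>/24</code> prefix is an assumption that matches libvirt&rsquo;s usual default network, and <code>next_hop</code> is a hypothetical helper name.</p>

```python
import ipaddress

# Assumed to match libvirt's default network (192.168.122.0/24).
VIRTUAL_NETWORK = ipaddress.ip_network("192.168.122.0/24")
GATEWAY = ipaddress.ip_address("192.168.122.1")


def next_hop(destination):
    """Describe where a VM sends traffic for the given destination IP."""
    dest = ipaddress.ip_address(destination)
    if dest in VIRTUAL_NETWORK:
        # Same subnet: the frame is addressed to the peer's MAC and the
        # bridge switches it at Layer 2; the gateway is not involved.
        return "bridge (direct to {})".format(dest)
    # Different subnet: the frame is addressed to the gateway's MAC,
    # and the host then routes the packet at Layer 3.
    return "gateway ({})".format(GATEWAY)


if __name__ == "__main__":
    print(next_hop("192.168.122.50"))
    print(next_hop("8.8.8.8"))
```

<p>A neighbouring VM resolves to the bridge path, while any address outside the virtual network resolves to the gateway.</p>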
<p>When libvirt is first installed, it automatically creates a default virtual network with a virtual switch called <code>virbr0</code> — visible via <code>ip a</code>. Behind this bridge, libvirt configures <a href="https://dnsmasq.org/doc.html"><strong>dnsmasq</strong></a> for DHCP and DNS, and uses <strong>iptables/nftables</strong> on the host to provide NAT routing. Any VM you create is connected to this network unless you specify otherwise.</p>
<h4 id="libvirt-modes">libvirt modes</h4>
<p>libvirt virtual networks can run in several modes; three matter here: NAT, bridged, and internal (isolated). For our hybrid environment, the virtual network uses <strong>NAT mode</strong>, described in <a href="https://wiki.libvirt.org/VirtualNetworking.html">https://wiki.libvirt.org/VirtualNetworking.html</a>, which means the Wi-Fi router sees all packets from the virtual network as originating from the Linux host. The router has no notion of the virtual network. This approach requires no changes to the home network and keeps the virtual environment isolated, while still allowing outbound internet access.</p>
<p>Use bridged networking if you want your virtual machines to obtain an IP address from your LAN, or use an <strong>internal network</strong> if you want a fully isolated lab.</p>
<p>Traffic flow in default (NAT) mode looks like this:</p>
<p>VM →<br>
virtual NIC →<br>
virbr0 (virtual switch) →<br>
NAT (iptables/nftables on host) →<br>
Linux host →<br>
physical network →<br>
internet</p>
<p>If you use <strong>bridged mode</strong>, the VM connects directly to your physical network through a Linux bridge (e.g., <code>br0</code>). In that case:</p>
<p>VM →<br>
virtual NIC →<br>
Linux bridge →<br>
physical NIC →<br>
real LAN</p>
<p>No NAT. The VM behaves like a real machine on your network.</p>
<h4 id="creating-virtualmachines">Creating Virtual Machines</h4>
<p>Now we can create a virtual machine. This VM will be the primary domain controller (covered in Part 2), so let’s name it appropriately: I will call it <code>hce-dc01</code>. Give the VM at least 4 GB RAM, 2 vCPUs, and a 60 GB disk. This sizing is sufficient for a lightweight domain controller while keeping resource usage manageable on a single host machine.</p>
<p>Launch <code>virt-manager</code> from the terminal to open the Virtual Machine Manager GUI. Choose the option to create a new VM and follow the wizard, providing the configuration outlined above. Use the downloaded ISO image to install the Windows Server OS. The installation is quite straightforward.</p>
<blockquote>
<p>Important — Make sure you install Windows Server 2022 Desktop Experience.</p>
</blockquote>
<p>Once the Windows Server 2022 OS is installed, the VM will reboot allowing you to set the Administrator password.</p>
<p>Once the VM is ready, go to the details view and verify the settings below.</p>
<p>For the virtual NIC, choose <code>e1000e</code>. This is an emulated Intel NIC that Windows recognizes out of the box, so it gets you network access during installation, before the VirtIO drivers (discussed in the next section) are in place.</p>
<p><img loading="lazy" src="https://cdn-images-1.medium.com/max/800/1*s9C2l7kujQs9G8zESedAxA.png"></p>
<p>For video, select QXL. This is a paravirtualized display adapter that gives you a responsive desktop with better resolution support than the default VGA.</p>
<p><img loading="lazy" src="https://cdn-images-1.medium.com/max/800/1*cF9_eWtaDxxocolKIsCHyw.png"></p>
<p><strong>Add VirtIO ISO as CD-ROM in virt-manager</strong></p>
<p>What and why? <a href="https://wiki.libvirt.org/Virtio.html">VirtIO</a> is a paravirtualized device interface used between the guest (Windows here) and QEMU. Instead of emulating physical hardware like SATA controllers or Intel NICs, VirtIO provides purpose-built virtual devices, with matching guest drivers, that both the guest and hypervisor understand directly, reducing CPU overhead and improving throughput.</p>
<p>Follow the steps below to attach the VirtIO ISO to the VM:</p>
<ol>
<li>Open <strong>virt-manager</strong> → select your VM → <strong>Open → Show virtual hardware details</strong></li>
<li>Click <strong>Add Hardware → Storage → CD-ROM</strong></li>
<li>Choose <strong>Select or create custom storage</strong> → point to <code>~/ISOs/virtio-win.iso</code></li>
<li>Boot (or reboot) the VM</li>
</ol>
<p>Now inside the VM you will see <strong>two CD-ROMs</strong>:</p>
<ul>
<li>SATA CDROM1 → Windows Server ISO</li>
<li>VirtIO CD-ROM → drivers</li>
</ul>
<p><strong>Installing VirtIO</strong><br>
Now that the VirtIO ISO is mounted in the VM as a CD-ROM drive, navigate to that drive and run the installer, <code>virtio-win-gt-x64.exe</code>. It installs all the necessary VirtIO drivers for disk, network, and optional devices automatically. Once the drivers are installed, shut down the VM, change the NIC type to ‘virtio’ in virt-manager, and then start it back up for better throughput.</p>
<p><strong>This step completes the configuration of the Windows Virtual Machine.</strong> When <code>hce-dc01</code> was created, virt-manager automatically connected it to the default virtual network created by libvirt. Let’s verify that now.</p>
<h4 id="verification">Verification</h4>
<p>To confirm the virtual network is configured correctly, run <code>virsh net-dumpxml default</code>:</p>
<pre><code>&lt;network connections='1'&gt;
  &lt;name&gt;default&lt;/name&gt;
  &lt;uuid&gt;73e80935-c747-4ad7-88a1-5417707abc02&lt;/uuid&gt;
  &lt;forward mode='nat'&gt;
    &lt;nat&gt;
      &lt;port start='1024' end='65535'/&gt;
    &lt;/nat&gt;
  &lt;/forward&gt;
  &lt;bridge name='virbr0' stp='on' delay='0'/&gt;
  &lt;mac address='52:54:00:d9:a6:2b'/&gt;
  &lt;ip address='192.168.122.1' netmask='255.255.255.0'&gt;
    &lt;dhcp&gt;
      &lt;range start='192.168.122.2' end='192.168.122.254'/&gt;
    &lt;/dhcp&gt;
  &lt;/ip&gt;
&lt;/network&gt;</code></pre>
<p>In the above snippet you will notice that bridge <code>virbr0</code> is configured with a DHCP range from <code>192.168.122.2</code> to <code>192.168.122.254</code>.</p>
<p>Log in to the new VM and check its IP address with <code>ipconfig</code>. You will see an address within this range, allocated by dnsmasq.</p>
<p>Also, remember that this range must not overlap with the range your Wi-Fi router hands out. It usually doesn’t, but it is worth checking.</p>
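<p>If you want to check the overlap rule rather than eyeball it, the standard <code>ipaddress</code> module can do it. This is a minimal sketch: the LAN ranges used in the examples are hypothetical, so substitute the subnet your own router hands out.</p>

```python
import ipaddress


def networks_overlap(virtual_cidr, lan_cidr):
    """Return True when the two address ranges share any address."""
    virtual = ipaddress.ip_network(virtual_cidr)
    lan = ipaddress.ip_network(lan_cidr)
    return virtual.overlaps(lan)


def dhcp_pool_size(start, end):
    """Number of addresses in an inclusive DHCP range."""
    first = int(ipaddress.ip_address(start))
    last = int(ipaddress.ip_address(end))
    return last - first + 1


if __name__ == "__main__":
    # libvirt default network against a hypothetical home LAN.
    print(networks_overlap("192.168.122.0/24", "192.168.1.0/24"))
    # Size of the DHCP range configured on the default network.
    print(dhcp_pool_size("192.168.122.2", "192.168.122.254"))
```

<p>A router that uses a wide range such as <code>192.168.0.0/16</code> would overlap with the libvirt default network, and in that case one of the two ranges would need to change.</p>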
<h4 id="summary">Summary</h4>
<p>With these steps, we have a working virtual network and a Windows Server 2022 VM ready to be configured. In the next part, we will turn this VM into a fully functional Active Directory domain controller — laying the groundwork for identity in our hybrid environment.</p>
]]></content:encoded></item><item><title>Adding application roles to Managed Identity</title><link>https://gurupasupathy.com/post/2026-02-27_adding-application-roles-to-managed-identity/</link><pubDate>Fri, 27 Feb 2026 00:00:00 +0000</pubDate><guid>https://gurupasupathy.com/post/2026-02-27_adding-application-roles-to-managed-identity/</guid><description>&lt;p&gt;&lt;img loading="lazy" src="https://gurupasupathy.com/img/1__932kIAgBM7f5RkLN3QcYxw.png"&gt;&lt;/p&gt;
&lt;p&gt;This guide outlines the process for assigning application roles to a &lt;strong&gt;Managed Identity (MI)&lt;/strong&gt; in Entra ID. It covers observed behaviors, inherent limitations, and the necessary steps required when an MI must authenticate with another application (such as an API in APIM) using role-based access control (RBAC).&lt;/p&gt;
&lt;h3 id="scenario"&gt;Scenario&lt;/h3&gt;
&lt;p&gt;In a typical architecture, a &lt;strong&gt;Logic App&lt;/strong&gt; utilizes a &lt;strong&gt;Managed Identity&lt;/strong&gt; (either System-Assigned or User-Assigned) to communicate with downstream resources. When that Logic App needs to call an &lt;strong&gt;API exposed via APIM&lt;/strong&gt;, the following requirements usually apply:&lt;/p&gt;</description><content:encoded><![CDATA[<p><img loading="lazy" src="/img/1__932kIAgBM7f5RkLN3QcYxw.png"></p>
<p>This guide outlines the process for assigning application roles to a <strong>Managed Identity (MI)</strong> in Entra ID. It covers observed behaviors, inherent limitations, and the necessary steps required when an MI must authenticate with another application (such as an API in APIM) using role-based access control (RBAC).</p>
<h3 id="scenario">Scenario</h3>
<p>In a typical architecture, a <strong>Logic App</strong> utilizes a <strong>Managed Identity</strong> (either System-Assigned or User-Assigned) to communicate with downstream resources. When that Logic App needs to call an <strong>API exposed via APIM</strong>, the following requirements usually apply:</p>
<ul>
<li>The API is protected by its own <strong>App Registration</strong> in Entra ID.</li>
<li>The API expects the caller to possess specific <strong>app roles</strong> (e.g., <code>API.Read</code> or <code>API.ReadWrite</code>).</li>
<li>The Logic App must obtain an OAuth token containing these roles to successfully authorize against the API.</li>
</ul>
<h3 id="the-challenge">The Challenge</h3>
<p>Managed Identities are automatically created Service Principals. A common point of confusion is that they <strong>do not appear in the App Registration section</strong> of the Azure portal; they are found exclusively under <strong>Enterprise Applications</strong>.</p>
<p>Because the Azure portal does not currently provide a UI for assigning app roles to Enterprise Applications directly, it is not possible to assign roles like <code>API.Read</code> through the standard &ldquo;API Permissions&rdquo; blade used for traditional App Registrations.</p>
<h3 id="the-workaroundassigning-app-roles-via-powershell--microsoft-graph">The Workaround — Assigning App Roles via PowerShell / Microsoft Graph</h3>
<p>You can use the PowerShell script below to assign roles to your Managed Identity:</p>
<pre><code># Install-Module Microsoft.Graph -Scope CurrentUser (if not done already)

# Your tenant ID (in the Azure portal, under Azure Active Directory &gt; Overview).
$tenantId = '{tenantId}'

# The name of the server app that exposes the app roles.
$serverApplicationName = '{serverApplicationName}'

# The name of the app role that the managed identity should be assigned to.
$appRoleName = '{appRoleName}' # For example, Api.Read

# Look up the Logic App / Function (client application) managed identity's object ID.
$managedIdentityObjectId = '{managedIdentityObjectId}'

# Connect-MgGraph -TenantId $tenantId -Scopes 'Application.ReadWrite.All','Directory.Read.All'
# or a more restricted set of permissions (recommended):
Connect-MgGraph -TenantId $tenantId -Scopes 'Application.Read.All','AppRoleAssignment.ReadWrite.All'

# Look up the details about the server app's service principal and app role.
$serverServicePrincipal = (Get-MgServicePrincipal -Filter "DisplayName eq '$serverApplicationName'")
$serverServicePrincipalObjectId = $serverServicePrincipal.Id
$appRoleId = ($serverServicePrincipal.AppRoles | Where-Object { $_.Value -eq $appRoleName }).Id

Write-Host '$serverServicePrincipal ' $serverServicePrincipal
Write-Host '$managedIdentityObjectId ' $managedIdentityObjectId
Write-Host '$serverServicePrincipalObjectId ' $serverServicePrincipalObjectId
Write-Host 'AppRoleId &gt;' $appRoleId

# Assign the managed identity access to the app role.
New-MgServicePrincipalAppRoleAssignment -ServicePrincipalId $serverServicePrincipalObjectId -PrincipalId $managedIdentityObjectId -ResourceId $serverServicePrincipalObjectId -AppRoleId $appRoleId</code></pre>
<ul>
<li><code>PrincipalId</code> → Managed Identity object ID (Logic App)</li>
<li><code>ResourceId</code> → API service principal object ID</li>
<li><code>AppRoleId</code> → GUID of the role defined in the API registration</li>
</ul>
<p>After this assignment, tokens requested by the Managed Identity will include the required <code>roles</code> claim, allowing successful authorization against the API.</p>
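<p>To confirm that a token really carries the role, you can decode its payload and look at the <code>roles</code> claim. The sketch below uses only the Python standard library and does <strong>not</strong> verify the signature, so it is for inspection only, never for authorization. The helper names and the sample token are hypothetical; a real token would come from the Managed Identity endpoint.</p>

```python
import base64
import json


def _b64url_decode(segment):
    # JWT segments are base64url without padding; restore it before decoding.
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def _b64url_encode(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def roles_from_token(token):
    """Return the `roles` claim of a JWT WITHOUT verifying its signature."""
    _header, payload_b64, _signature = token.split(".")
    payload = json.loads(_b64url_decode(payload_b64))
    return payload.get("roles", [])


def make_sample_token(claims):
    """Build an unsigned, made-up token purely for demonstration."""
    header = _b64url_encode(json.dumps({"alg": "none", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    return "{}.{}.".format(header, payload)


if __name__ == "__main__":
    token = make_sample_token({"aud": "api://example", "roles": ["API.Read"]})
    print(roles_from_token(token))
```

<p>If the list comes back empty for a real token, the role assignment has most likely not been made, or the token was requested for a different audience.</p>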
<h3 id="note">Note</h3>
<ul>
<li>For an application&rsquo;s client ID to be usable as the token audience, that application must define (“own”) the app roles, and the consuming client must have been assigned those roles in Entra ID. I think you can further check these claims in the Authentication section.</li>
</ul>
<h3 id="key-takeaways">Key Takeaways</h3>
<ul>
<li>Managed Identities always appear as <strong>Enterprise Apps</strong> in Entra ID (formerly Azure AD).</li>
<li>App roles cannot be assigned via the portal for Enterprise Apps; <strong>Graph / PowerShell is required</strong>.</li>
<li>Token validation depends on correct <strong>Issuer, Audience, and presence of role claims</strong>.</li>
<li>Explicit role assignment ensures tokens carry the required roles for API authorization.</li>
</ul>
]]></content:encoded></item><item><title>Troubleshooting notes — Azure Table Storage 403 Authentication</title><link>https://gurupasupathy.com/post/2026-02-22_troubleshooting-notes-azure-table-storage-403-authentication/</link><pubDate>Sun, 22 Feb 2026 00:00:00 +0000</pubDate><guid>https://gurupasupathy.com/post/2026-02-22_troubleshooting-notes-azure-table-storage-403-authentication/</guid><description>&lt;p&gt;Symptom&lt;/p&gt;
&lt;p&gt;&lt;img loading="lazy" src="https://gurupasupathy.com/img/1__Uod5Yt3bOWoSV99G6AG0Ug.png"&gt;&lt;/p&gt;
&lt;h3 id="symptom"&gt;Symptom&lt;/h3&gt;
&lt;p&gt;Calling Azure Table Storage REST API returns:&lt;/p&gt;
&lt;p&gt;403 Server failed to authenticate the request.&lt;br&gt;
Make sure the value of Authorization header is formed correctly including the signature.&lt;/p&gt;
&lt;p&gt;Even though Authorization header looks valid&lt;/p&gt;
&lt;h3 id="root-cause"&gt;Root Cause&lt;/h3&gt;
&lt;p&gt;The request is missing &lt;strong&gt;x-ms-version&lt;/strong&gt; header&lt;/p&gt;
&lt;p&gt;Azure Storage requires this header to determine the API version used for request validation. Without it, the service may reject the request with a misleading authentication error.&lt;/p&gt;</description><content:encoded><![CDATA[<p>Symptom</p>
<p><img loading="lazy" src="/img/1__Uod5Yt3bOWoSV99G6AG0Ug.png"></p>
<h3 id="symptom">Symptom</h3>
<p>Calling Azure Table Storage REST API returns:</p>
<p>403 Server failed to authenticate the request.<br>
Make sure the value of Authorization header is formed correctly including the signature.</p>
<p>This happens even though the Authorization header looks valid.</p>
<h3 id="root-cause">Root Cause</h3>
<p>The request is missing the <strong>x-ms-version</strong> header.</p>
<p>Azure Storage requires this header to determine the API version used for request validation. Without it, the service may reject the request with a misleading authentication error.</p>
<h3 id="fix">Fix</h3>
<p>Add the header:</p>
<p><code>x-ms-version: 2020-08-04</code></p>
<p>Example minimal headers:</p>
<p><code>x-ms-version: 2020-08-04</code><br>
<code>Accept: application/json;odata=nometadata</code><br>
<code>Content-Type: application/json</code></p>
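<p>A small pre-flight check can catch this before the request is ever sent. The sketch below builds the minimal headers listed above and flags a missing <code>x-ms-version</code>. The function names are hypothetical, and the <code>Authorization</code> and date headers that a signed request also needs are deliberately left out.</p>

```python
def build_minimal_headers(api_version="2020-08-04"):
    """Return the minimal headers from the example above."""
    return {
        "x-ms-version": api_version,
        "Accept": "application/json;odata=nometadata",
        "Content-Type": "application/json",
    }


def is_missing_version_header(headers):
    """True when x-ms-version is absent (header names are case-insensitive)."""
    names = {name.lower() for name in headers}
    return "x-ms-version" not in names


if __name__ == "__main__":
    print(is_missing_version_header({"Accept": "application/json"}))
    print(is_missing_version_header(build_minimal_headers()))
```

<p>Running this check first means a 403 that still appears afterwards is more likely to be a genuine signature problem.</p>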
<h3 id="lesson-learned">Lesson learned</h3>
<p>If Azure Storage returns a 403 authentication error for a manually signed REST request, check for missing x-ms-version before debugging the signature.</p>
]]></content:encoded></item><item><title>Access AppConfiguration from Function App using Managed Identity</title><link>https://gurupasupathy.com/post/2026-02-21_access-appconfiguration-from-function-app-using-managed-identity/</link><pubDate>Sat, 21 Feb 2026 00:00:00 +0000</pubDate><guid>https://gurupasupathy.com/post/2026-02-21_access-appconfiguration-from-function-app-using-managed-identity/</guid><description>&lt;p&gt;&lt;img loading="lazy" src="https://gurupasupathy.com/img/1__nTaI7u53YGpj1No72IN6Yw.png"&gt;&lt;/p&gt;
&lt;p&gt;Accessing Azure App Configuration using Managed Identity in Azure Functions is slightly different from accessing other Azure services.&lt;/p&gt;
&lt;p&gt;For most Azure services (Storage, Service Bus, Key Vault), you typically:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Enable Managed Identity on the Function&lt;/li&gt;
&lt;li&gt;Grant RBAC access to the resource&lt;/li&gt;
&lt;li&gt;Create the SDK client using DefaultAzureCredential&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;However, App Configuration is usually loaded as part of the application configuration pipeline at startup, so it must be added via the host builder.&lt;/p&gt;</description><content:encoded><![CDATA[<p><img loading="lazy" src="/img/1__nTaI7u53YGpj1No72IN6Yw.png"></p>
<p>Accessing Azure App Configuration using Managed Identity in Azure Functions is slightly different from accessing other Azure services.</p>
<p>For most Azure services (Storage, Service Bus, Key Vault), you typically:</p>
<ul>
<li>Enable Managed Identity on the Function</li>
<li>Grant RBAC access to the resource</li>
<li>Create the SDK client using DefaultAzureCredential</li>
</ul>
<p>However, App Configuration is usually loaded as part of the application configuration pipeline at startup, so it must be added via the host builder.</p>
<h3 id="prerequisites">Prerequisites</h3>
<ul>
<li>Enable Managed Identity on the Function App</li>
<li>Grant the identity: App Configuration Data Reader on the App Configuration resource</li>
</ul>
<p>Sample code:</p>
<pre><code>using Azure.Identity;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Hosting;

var host = new HostBuilder()
    .ConfigureAppConfiguration(builder =&gt;
    {
        // Authenticate to App Configuration with the Function App's managed identity.
        builder.AddAzureAppConfiguration(options =&gt;
            options.Connect(
                new Uri("https://appconfiguri.azconfig.io"),
                new ManagedIdentityCredential()));
    })
    .ConfigureFunctionsWebApplication()
    .Build();

host.Run();</code></pre>
<p>Note: I’m using <code>ManagedIdentityCredential</code>, but the recommended class is <code>DefaultAzureCredential</code>.</p>
<h3 id="key-insight">Key Insight</h3>
<ul>
<li>Other Azure services → authenticated when <strong>creating the client</strong></li>
<li>App Configuration → authenticated when <strong>building the configuration provider.</strong> That’s why it must be configured inside <code>ConfigureAppConfiguration()</code>.</li>
</ul>
]]></content:encoded></item><item><title>Coding with Integrity</title><link>https://gurupasupathy.com/post/2026-02-20_coding-with-integrity/</link><pubDate>Fri, 20 Feb 2026 00:00:00 +0000</pubDate><guid>https://gurupasupathy.com/post/2026-02-20_coding-with-integrity/</guid><description>&lt;h1 id="coding-with-integrity"&gt;Coding with Integrity&lt;/h1&gt;
&lt;p&gt;&lt;img loading="lazy" src="https://gurupasupathy.com/img/1____MXJsGbwgJ3C5gieg298uA.png"&gt;&lt;/p&gt;
&lt;p&gt;The real measure of a software engineer is simple — &lt;strong&gt;&lt;em&gt;how you code when no one is watching.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;We often associate strong engineering with technical brilliance — mastering languages, designing scalable systems, or solving complex problems.&lt;/p&gt;
&lt;p&gt;But beyond skill, the most valuable attribute a software engineer can bring to the table is &lt;strong&gt;integrity&lt;/strong&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;“Coding with Integrity, is how you code when you know that no one is going to review your code”&lt;/strong&gt;&lt;/p&gt;</description><content:encoded><![CDATA[<h1 id="coding-with-integrity">Coding with Integrity</h1>
<p><img loading="lazy" src="/img/1____MXJsGbwgJ3C5gieg298uA.png"></p>
<p>The real measure of a software engineer is simple — <strong><em>how you code when no one is watching.</em></strong></p>
<p>We often associate strong engineering with technical brilliance — mastering languages, designing scalable systems, or solving complex problems.</p>
<p>But beyond skill, the most valuable attribute a software engineer can bring to the table is <strong>integrity</strong>.</p>
<blockquote>
<p><strong>“Coding with integrity is how you code when you know that no one is going to review your code.”</strong></p>
</blockquote>
<p>Coding with integrity is about the choices you make in the quiet moments of development — when there’s no reviewer, no deadline pressure, and no immediate accountability except your own standards.</p>
<p>Many common engineering issues don’t come from lack of knowledge.<br>
They come from small decisions made in those unseen moments.</p>
<h3 id="design-decisions">Design decisions</h3>
<p>When implementing a feature, it’s easy to think only about the immediate ask: Does it work? Does it avoid breaking anything?</p>
<p>Integrity pushes the thinking further: Is this the right approach? Is it maintainable? Should I pause and rethink this before moving forward?</p>
<h3 id="unit-tests">Unit tests</h3>
<p>It’s possible to reach high coverage while knowing the tests don’t really validate behaviour.</p>
<p>Integrity asks: Do these tests genuinely protect the system? Would I trust them if something broke tomorrow?</p>
<h3 id="technical-debt">Technical debt</h3>
<p>Sometimes we clearly see duplication, fragile logic, or missed refactoring opportunities.</p>
<p>Integrity isn’t about always fixing everything immediately. It’s about being honest: acknowledging the debt, documenting it, not pretending the shortcut is a solution, and ensuring the debt is addressed.</p>
<h3 id="documentation-andclarity">Documentation and clarity</h3>
<p>After spending days or weeks on a module, everything feels obvious.</p>
<p>Integrity means writing code and comments for the next reader — even if that reader is your future self, months later.</p>
<p>Maybe integrity in coding isn’t something we formally learn or measure.</p>
<p>Maybe it’s simply the voice that nudges us toward clarity, correctness, and responsibility. Whether we follow that voice or ignore it is what ultimately shows up in our code.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title>Managing Azure APIM Operation Policies in Terraform by Importing OpenAPI Specification</title><link>https://gurupasupathy.com/post/2026-02-20_managing-apim-op-policies-in-terraform-by-importing-openapi-spec/</link><pubDate>Fri, 20 Feb 2026 00:00:00 +0000</pubDate><guid>https://gurupasupathy.com/post/2026-02-20_managing-apim-op-policies-in-terraform-by-importing-openapi-spec/</guid><description>&lt;p&gt;&lt;img loading="lazy" src="https://gurupasupathy.com/img/1__mk3hcBMP7jVKBsxWOa5JDA.png"&gt;&lt;/p&gt;
&lt;p&gt;When using Terraform to import an OpenAPI/Swagger definition into Azure API Management (APIM), the API and its operations are created successfully. However, one subtle behavior can cause confusion when trying to manage operation-level policies declaratively.&lt;/p&gt;
&lt;p&gt;This post explains the issue and a simple workaround.&lt;/p&gt;
&lt;h3 id="the-scenario"&gt;The Scenario&lt;/h3&gt;
&lt;p&gt;I was importing my API using Terraform:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Swagger/OpenAPI definition imported into APIM&lt;br&gt;
API created successfully&lt;br&gt;
All operations appeared correctly in Azure&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Later, I wanted to attach operation-level policies using Terraform using &lt;em&gt;azurerm_api_management_api_operation_policy&lt;/em&gt;&lt;/p&gt;</description><content:encoded><![CDATA[<p><img loading="lazy" src="/img/1__mk3hcBMP7jVKBsxWOa5JDA.png"></p>
<p>When using Terraform to import an OpenAPI/Swagger definition into Azure API Management (APIM), the API and its operations are created successfully. However, one subtle behavior can cause confusion when trying to manage operation-level policies declaratively.</p>
<p>This post explains the issue and a simple workaround.</p>
<h3 id="the-scenario">The Scenario</h3>
<p>I was importing my API using Terraform:</p>
<blockquote>
<p>Swagger/OpenAPI definition imported into APIM<br>
API created successfully<br>
All operations appeared correctly in Azure</p>
</blockquote>
<p>Later, I wanted to attach operation-level policies in Terraform using <em>azurerm_api_management_api_operation_policy</em>.</p>
<p>At this point I ran into a problem: <strong>Terraform had no record of the operations in its state file.</strong></p>
<h3 id="why-thishappens">Why This Happens</h3>
<p>This behavior is expected once you understand how Terraform works. Terraform only tracks resources that are explicitly declared in configuration or manually imported into state.</p>
<p>When Swagger is imported via <em>azurerm_api_management_api</em>, the operations are created inside Azure, but they are not separate Terraform-managed resources unless you explicitly declare them using <em>azurerm_api_management_api_operation</em>.</p>
<p>In effect, the API is created in Azure and tracked in Terraform, while the API operations (created via the Swagger import) exist in Azure but are NOT tracked in Terraform.</p>
<p>This makes it unclear how to attach policies to those operations without declaring each operation explicitly, which is a nightmare if you have hundreds of operations.</p>
<h3 id="the-simple-workaround">The Simple Workaround</h3>
<p>You do not need a Terraform resource for the operation in order to create an operation policy and attach it. Instead, attach the policy directly with the <em>azurerm_api_management_api_operation_policy</em> resource, referencing the Swagger operationId.</p>
<p>Example:</p>
<pre><code>resource "azurerm_api_management_api_operation_policy" "my_op_policy" {
  provider            = &lt;provider&gt;
  api_name            = "&lt;your api name&gt;"
  api_management_name = data.azurerm_api_management.apim.name
  resource_group_name = data.azurerm_api_management.apim.resource_group_name
  operation_id        = "&lt;operationId from swagger&gt;"
  xml_content = templatefile("&lt;policy path&gt;", {
    backend_name = "&lt;backend name&gt;"
    method       = "&lt;operation method&gt;"
  })
}</code></pre>
<p>As long as the API and the operation exist in APIM and <code>operation_id</code> exactly matches the Swagger operationId, Terraform can apply and update the policy successfully. No explicit Terraform operation resource is required.</p>
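<p>Since everything hinges on exact operationId matches, it can help to list the IDs straight from the specification before wiring up policies. The following is a rough sketch that walks an already-parsed OpenAPI document (a plain dictionary). The sample spec and the function name are hypothetical, and loading the JSON or YAML file is left out.</p>

```python
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operation_ids(spec):
    """Map each operationId to its (METHOD, path) in a parsed OpenAPI document."""
    found = {}
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            operation_id = operation.get("operationId")
            if operation_id:
                found[operation_id] = (method.upper(), path)
    return found


# Hypothetical, heavily trimmed specification used only for illustration.
SAMPLE_SPEC = {
    "paths": {
        "/orders": {"post": {"operationId": "createOrder"}},
        "/orders/{id}": {
            "parameters": [{"name": "id", "in": "path"}],
            "get": {"operationId": "getOrder"},
        },
    }
}

if __name__ == "__main__":
    for operation_id, (method, path) in sorted(list_operation_ids(SAMPLE_SPEC).items()):
        print(operation_id, method, path)
```

<p>The resulting keys are the values to put in <code>operation_id</code>. Operations that have no operationId in the spec are skipped here, so they would need to be looked up in APIM separately.</p>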
<h3 id="notes">Notes</h3>
<p>1. Use the Swagger operationId, not the display name. Terraform identifies the operation strictly by operationId.</p>
<p>2. Treat operationId as a stable contract. If you later rename the operationId, remove an endpoint, or restructure the Swagger, Terraform may fail because the referenced operation no longer exists.</p>
<p>3. Importing operations individually is possible but rarely worth it. You can define <em>azurerm_api_management_api_operation</em> and import each operation manually into Terraform state. However, this requires one resource per operation, and manual imports are tedious and scale poorly for large APIs, which defeats the benefit of a Swagger-driven API definition.</p>
<p>For most setups, referencing operationId directly in the policy resource is simpler.</p>
<h3 id="takeaway">Takeaway</h3>
<p>When importing Swagger into APIM using Terraform:</p>
<blockquote>
<p>Operations are created in Azure<br>
Terraform does not automatically track them<br>
Operation policies can still be managed declaratively by simply referencing the Swagger/OpenAPI Spec operationId</p>
</blockquote>
<p>Understanding this distinction can save significant time when automating API Management deployments.</p>
]]></content:encoded></item><item><title>Demo workflow for Minikube</title><link>https://gurupasupathy.com/post/2026-01-31_demo-workflow-for-minikube/</link><pubDate>Sat, 31 Jan 2026 00:00:00 +0000</pubDate><guid>https://gurupasupathy.com/post/2026-01-31_demo-workflow-for-minikube/</guid><description>&lt;p&gt;Photo by Shubham Dhage on Unsplash&lt;/p&gt;
&lt;p&gt;&lt;img loading="lazy" src="img/1__L75eWRx0XZp7bUauqQgzog.jpeg"&gt;&lt;/p&gt;
&lt;p&gt;Photo by &lt;a href="https://unsplash.com/@theshubhamdhage?utm_source=unsplash&amp;amp;utm_medium=referral&amp;amp;utm_content=creditCopyText"&gt;Shubham Dhage&lt;/a&gt; on &lt;a href="https://unsplash.com/photos/a-black-and-white-photo-of-a-bunch-of-cubes-gC_aoAjQl2Q?utm_source=unsplash&amp;amp;utm_medium=referral&amp;amp;utm_content=creditCopyText"&gt;Unsplash&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Here’s a ready-to-run “one-shot” demo workflow for Minikube that sets up a webserver deployment, exposes it, configures HPA, and generates load so you can see autoscaling in action immediately.&lt;/p&gt;
&lt;p&gt;You can copy-paste these commands &lt;strong&gt;one after the other&lt;/strong&gt; in your terminal.&lt;/p&gt;
&lt;h3 id="step-0-optional-clean-up-old-resources"&gt;Step 0: (Optional) Clean up old resources&lt;/h3&gt;
&lt;p&gt;kubectl delete deployment webserver --ignore-not-found&lt;br&gt;
kubectl delete svc webserver --ignore-not-found&lt;br&gt;
kubectl delete hpa webserver --ignore-not-found&lt;br&gt;
kubectl delete pod load-generator --ignore-not-found&lt;/p&gt;</description><content:encoded><![CDATA[<p>Photo by Shubham Dhage on Unsplash</p>
<p><img loading="lazy" src="img/1__L75eWRx0XZp7bUauqQgzog.jpeg"></p>
<p>Photo by <a href="https://unsplash.com/@theshubhamdhage?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Shubham Dhage</a> on <a href="https://unsplash.com/photos/a-black-and-white-photo-of-a-bunch-of-cubes-gC_aoAjQl2Q?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></p>
<p>Here’s a ready-to-run “one-shot” demo workflow for Minikube that sets up a webserver deployment, exposes it, configures HPA, and generates load so you can see autoscaling in action immediately.</p>
<p>You can copy-paste these commands <strong>one after the other</strong> in your terminal.</p>
<h3 id="step-0-optional-clean-up-old-resources">Step 0: (Optional) Clean up old resources</h3>
<p>kubectl delete deployment webserver --ignore-not-found<br>
kubectl delete svc webserver --ignore-not-found<br>
kubectl delete hpa webserver --ignore-not-found<br>
kubectl delete pod load-generator --ignore-not-found</p>
<h3 id="step-1-create-the-webserver-deployment">Step 1: Create the webserver deployment</h3>
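<p>kubectl create deployment webserver --image=gcr.io/google_containers/echoserver:1.10</p>
<p>The HPA in Step 4 targets CPU utilization, which Kubernetes computes as a percentage of each container's CPU request, and a deployment created this way has no request set. Add one (the 100m value is just an example):</p>
<p>kubectl set resources deployment webserver --requests=cpu=100m</p>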
<h3 id="step-2-expose-the-deployment-as-a-nodeportservice">Step 2: Expose the deployment as a NodePort service</h3>
<p>kubectl expose deployment webserver --type=NodePort --port=8080</p>
<p>Check service:</p>
<p>kubectl get svc webserver</p>
<h3 id="step-3-enable-metrics-server-if-notalready">Step 3: Enable metrics-server if not already</h3>
<p>minikube addons enable metrics-server</p>
<h3 id="step-4-create-horizontal-pod-autoscaler">Step 4: Create Horizontal Pod Autoscaler</h3>
<p>kubectl autoscale deployment webserver --cpu=20% --min=1 --max=5</p>
<p>Check HPA:</p>
<p>kubectl get hpa</p>
<h3 id="step-5-launch-load-generator-pod">Step 5: Launch load-generator pod</h3>
<p>kubectl run -i --tty load-generator --image=busybox -- /bin/sh</p>
<p>Inside the pod, generate <strong>heavy load</strong>:</p>
<p>while true; do wget -q -O- http://webserver:8080 &amp; done</p>
<ul>
<li>The <code>&amp;</code> ensures requests run in parallel for higher CPU usage.</li>
<li>This will <strong>trigger the HPA</strong> to scale the webserver pods.</li>
</ul>
<h3 id="step-6-watch-autoscaling-in-anotherterminal">Step 6: Watch autoscaling in another terminal</h3>
<p>kubectl get hpa -w<br>
kubectl get pods -w</p>
<ul>
<li>You will see <strong>replicas increase</strong> as CPU usage rises.</li>
<li>When you stop the load (<code>Ctrl+C</code> in the BusyBox pod), HPA will scale pods back down. Scale-down is not immediate: by default the HPA waits out a 5-minute stabilization window first.</li>
</ul>
<h3 id="step-7-optionaltest-webserver-fromhost">Step 7: Optional — test webserver from host</h3>
<p>minikube service webserver</p>
<ul>
<li>Opens your webserver in the browser.</li>
</ul>
]]></content:encoded></item><item><title>Using Model Overlays using .modelfile</title><link>https://gurupasupathy.com/post/2026-01-31_using-model-overlays-using--modelfile/</link><pubDate>Sat, 31 Jan 2026 00:00:00 +0000</pubDate><guid>https://gurupasupathy.com/post/2026-01-31_using-model-overlays-using--modelfile/</guid><description>&lt;p&gt;&lt;img loading="lazy" src="https://gurupasupathy.com/img/1__yS9nBSlhRLM7xC__WPyek__w.jpeg"&gt;&lt;/p&gt;
&lt;p&gt;Photo by &lt;a href="https://unsplash.com/@steve_j?utm_source=unsplash&amp;amp;utm_medium=referral&amp;amp;utm_content=creditCopyText"&gt;Steve Johnson&lt;/a&gt; on &lt;a href="https://unsplash.com/photos/a-ceiling-with-many-windows-7INz588_4Kw?utm_source=unsplash&amp;amp;utm_medium=referral&amp;amp;utm_content=creditCopyText"&gt;Unsplash&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;When you want to use a model but don’t want to keep initializing it with a specific &lt;strong&gt;persona, temperature,&lt;/strong&gt; and other attributes, you can use the &lt;strong&gt;.modelfile Customization Approach.&lt;/strong&gt;&lt;/p&gt;
&lt;h4 id="step-1-create-amodelfile-as-shown-below-sys_adminmodelfile"&gt;Step 1: Create a .modelfile as shown below (sys_admin.modelfile)&lt;/h4&gt;
&lt;p&gt;# 1. THE BASE (Required)&lt;br&gt;
FROM llama3&lt;/p&gt;
&lt;p&gt;# 2. BRAIN PHYSICS (Parameters)&lt;br&gt;
PARAMETER temperature 0.7 # Creativity (0.0 to 1.0+)&lt;br&gt;
PARAMETER num_ctx 4096 # How many &amp;ldquo;tokens&amp;rdquo; of memory it has&lt;br&gt;
PARAMETER top_k 40 # Limits the &amp;ldquo;vocabulary&amp;rdquo; pool for each word&lt;br&gt;
PARAMETER top_p 0.9 # Probability threshold for word choice&lt;br&gt;
PARAMETER repeat_penalty 1.1 # Prevents the AI from getting stuck in a loop&lt;br&gt;
PARAMETER stop &amp;ldquo;User:&amp;rdquo; # Tells the AI exactly when to stop talking&lt;br&gt;
PARAMETER stop &amp;ldquo;&amp;mdash;&amp;rdquo;&lt;/p&gt;</description><content:encoded><![CDATA[<p><img loading="lazy" src="/img/1__yS9nBSlhRLM7xC__WPyek__w.jpeg"></p>
<p>Photo by <a href="https://unsplash.com/@steve_j?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Steve Johnson</a> on <a href="https://unsplash.com/photos/a-ceiling-with-many-windows-7INz588_4Kw?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></p>
<p>When you want to use a model but don’t want to keep initializing it with a specific <strong>persona, temperature,</strong> and other attributes, you can use the <strong>.modelfile Customization Approach.</strong></p>
<h4 id="step-1-create-amodelfile-as-shown-below-sys_adminmodelfile">Step 1: Create a .modelfile as shown below (sys_admin.modelfile)</h4>
<p># 1. THE BASE (Required)<br>
FROM llama3</p>
<p># 2. BRAIN PHYSICS (Parameters)<br>
PARAMETER temperature 0.7     # Creativity (0.0 to 1.0+)<br>
PARAMETER num_ctx 4096        # How many &ldquo;tokens&rdquo; of memory it has<br>
PARAMETER top_k 40            # Limits the &ldquo;vocabulary&rdquo; pool for each word<br>
PARAMETER top_p 0.9           # Probability threshold for word choice<br>
PARAMETER repeat_penalty 1.1  # Prevents the AI from getting stuck in a loop<br>
PARAMETER stop &quot;User:&quot;        # Tells the AI exactly when to stop talking<br>
PARAMETER stop &quot;---&quot;</p>
<p># 3. THE TEMPLATE (The &ldquo;Skeleton&rdquo; of a conversation)<br>
# This defines how the model sees the Turn-taking between User and AI.<br>
TEMPLATE &quot;&quot;&quot;{{ if .System }}&lt;|start_header_id|&gt;system&lt;|end_header_id|&gt;</p>
<p>{{ .System }}&lt;|eot_id|&gt;{{ end }}{{ if .Prompt }}&lt;|start_header_id|&gt;user&lt;|end_header_id|&gt;</p>
<p>{{ .Prompt }}&lt;|eot_id|&gt;{{ end }}&lt;|start_header_id|&gt;assistant&lt;|end_header_id|&gt;</p>
<p>{{ .Response }}&lt;|eot_id|&gt;&quot;&quot;&quot;</p>
<p># 4. (System Instructions)<br>
SYSTEM &quot;&quot;&quot;<br>
You are a specialized Azure Networking Assistant and System Administrator with plenty of experience.<br>
You provide CLI commands for Linux Mint and PowerShell for Windows.<br>
Constraints:<br>
1. If a config is insecure, call it out immediately.<br>
&quot;&quot;&quot;</p>
<p># 5. PRE-LOADING (The &ldquo;Conversation Starter&rdquo;)<br>
# You can bake in a &ldquo;fake&rdquo; memory so the model thinks it&rsquo;s already talking to you.<br>
# [OPTIONAL] ADAPTER ~/models/my-adapter  # (for actual fine-tuned weights)<br>
MESSAGE user &quot;Check the S2S status.&quot;<br>
MESSAGE assistant &quot;Checking the IPsec tunnels now. One moment.&quot;</p>
<h4 id="step-2-create-an-overlay-on-top-of-existingmodel">Step 2: Create an overlay on top of existing model</h4>
<p>Once the .modelfile is ready, pick one of your existing models and create a new overlay like so:</p>
<p>ollama create my-new-overlay-sysadmin -f ./sys_admin.modelfile</p>
<h4 id="step-3-create-an-alias-for-easyuse">Step 3: Create an alias for easy use</h4>
<p>To make it “instant” so you don’t have to type long commands, you add an alias to your <code>.bashrc</code> file. This is the bridge between your OS and the AI.</p>
<ol>
<li>Open your config: <code>nano ~/.bashrc</code></li>
<li>Add this line at the bottom: <code>alias summon-admin='ollama run my-new-overlay-sysadmin'</code></li>
<li>Save and refresh: <code>source ~/.bashrc</code></li>
</ol>
<h4 id="how-it-works-inpractice">How it works in practice</h4>
<p>Now, whenever you are looking at a messy config file on your machine, you just pipe the text to your new friend:</p>
<p><code>cat /etc/ssh/sshd_config | summon-admin</code></p>
<p>The model will wake up, read the file, and start grumbling about your security choices.</p>
<h3 id="how-is-this-different-from-prompt-engineering">How is this different from prompt engineering</h3>
<h4 id="1-hardware--environment-parameters">1. Hardware &amp; Environment Parameters</h4>
<p>Prompt engineering cannot change how the computer actually runs the model. A <code>.modelfile</code> can.</p>
<ul>
<li><strong>Parameter Tuning:</strong> You set things like <code>PARAMETER temperature 0.2</code> (for consistency) or <code>PARAMETER num_ctx 4096</code> (how much &ldquo;memory&rdquo; it has for your config files).</li>
<li><strong>Stop Sequences:</strong> You can tell the model exactly when to stop talking (e.g., <code>PARAMETER stop &quot;User:&quot;</code>), preventing it from rambling.</li>
</ul>
<h4 id="2-the-persona-vs-theask">2. The “Persona” vs. The “Ask”</h4>
<ul>
<li><strong>Prompt Engineering:</strong> You have to tell the model <em>every time</em>: “Act like a sys admin and check this file…”</li>
<li><strong>Modelfile (The Base-Overlay):</strong> The persona is “baked in” when you launch your “SysAdmin” model, so you only send the ask.</li>
</ul>
<h4 id="3-layered-inheritance-the-fromcommand">3. Layered Inheritance (The “FROM” command)</h4>
<p>This is the part that is impossible with just prompting.</p>
<ul>
<li>In a <code>.modelfile</code>, the first line is usually <code>FROM llama3</code> (or any model that you use). This is <strong>Inheritance</strong>.</li>
</ul>
]]></content:encoded></item><item><title>Extracting Swagger definition for Azure Logic App and importing to Azure APIM</title><link>https://gurupasupathy.com/post/2024-06-10_extracting-logic-app-swagger-def-and-import-to-apim/</link><pubDate>Mon, 10 Jun 2024 00:00:00 +0000</pubDate><guid>https://gurupasupathy.com/post/2024-06-10_extracting-logic-app-swagger-def-and-import-to-apim/</guid><description>&lt;p&gt;&lt;img loading="lazy" src="https://gurupasupathy.com/img/1__Yqhxr__0j4lVw8U9QtA6mKQ.jpeg"&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Use case&lt;/strong&gt; — I want to import a Logic App as an API within my APIM instance.&lt;/p&gt;
&lt;p&gt;There is no direct way to get the swagger file of a logic app using CLI (at least, I could not figure out). So, detailing the steps to extract the swagger definition of a logic app. I use the generated swagger file to import a Logic App as an API within APIM using Azure CLI&lt;/p&gt;</description><content:encoded><![CDATA[<p><img loading="lazy" src="/img/1__Yqhxr__0j4lVw8U9QtA6mKQ.jpeg"></p>
<p><strong>Use case</strong> — I want to import a Logic App as an API within my APIM instance.</p>
<p>There is no direct way to get the swagger file of a logic app using the CLI (at least, none that I could find). So, I am detailing the steps to extract the swagger definition of a logic app. I then use the generated swagger file to import the Logic App as an API within APIM using Azure CLI.</p>
<ol>
<li><strong>Grant the service principal the Contributor role on the logic app</strong></li>
</ol>
<ul>
<li><em>Get the resource id of the logic app —</em></li>
</ul>
<p>$logicAppResourceId = (az logic workflow show --resource-group &quot;{resourcegroup-name}&quot; --name &quot;{logicAppName}&quot; --query id --output tsv)</p>
<ul>
<li><em>Provide contributor role for the service principal —</em></li>
</ul>
<p>az role assignment create --assignee {sp-id} --role Contributor --scope $logicAppResourceId</p>
<p><strong>2. Get the swagger file from the Logic App</strong></p>
<ul>
<li><em>Generate a JWT</em> from <a href="https://login.microsoftonline.com/%7BtenantId%7D/oauth2/token"><em>https://login.microsoftonline.com/{tenantId}/oauth2/token</em></a> for the service principal with the resource set to “<em><a href="https://management.core.windows.net/">https://management.core.windows.net/</a></em>”</li>
</ul>
<p>$tenantId = &quot;11111111-1111-1111-1111-111111111111&quot;<br>
$clientId = &quot;00000000-0000-0000-0000-000000000000&quot;<br>
$clientSecret = &quot;your-client-secret&quot;<br>
$resource = &quot;https://management.core.windows.net/&quot;</p>
<p>$body = @{<br>
grant_type    = &quot;client_credentials&quot;<br>
client_id     = $clientId<br>
client_secret = $clientSecret<br>
resource      = $resource<br>
}</p>
<p>$response = Invoke-RestMethod -Method Post -Uri &quot;https://login.microsoftonline.com/$tenantId/oauth2/token&quot; -ContentType &quot;application/x-www-form-urlencoded&quot; -Body $body</p>
<p>$accessToken = $response.access_token<br>
$accessToken</p>
<ul>
<li><em>construct the swagger URL for the logic app —</em></li>
</ul>
<p>$swaggerUrl = &quot;https://management.azure.com&quot; + (az logic workflow show --resource-group &quot;{resourcegroup-name}&quot; --name &quot;{logicapp-name}&quot; --query id --output tsv) + &quot;/listSwagger?api-version=2016-06-01&quot;</p>
<ul>
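<li><em>Issue a POST request to $swaggerUrl, with the token sent as an <code>Authorization: Bearer</code> header, to get the swagger definition of the Logic App using Postman (or any other option you prefer), and save the response as logicapp.backend.swagger.json</em></li>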
</ul>
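<p>The POST request above can also be scripted rather than sent from Postman. Here is a minimal sketch in Python using only the standard library, assuming the access token from the previous step. The function name <code>build_list_swagger_request</code> and the sample resource ID are made up for illustration, and the request is only built here, not sent:</p>

```python
import urllib.request

ARM_ENDPOINT = "https://management.azure.com"
API_VERSION = "2016-06-01"


def build_list_swagger_request(workflow_resource_id, access_token):
    """Build (but do not send) the POST request for the listSwagger action."""
    url = ARM_ENDPOINT + workflow_resource_id + "/listSwagger?api-version=" + API_VERSION
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the action is invoked with POST
        method="POST",
        headers={"Authorization": "Bearer " + access_token},
    )


# Hypothetical resource ID, shaped like the output of
# `az logic workflow show --query id --output tsv`.
sample_id = (
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/my-rg/providers/Microsoft.Logic/workflows/my-logic-app"
)
request = build_list_swagger_request(sample_id, "token-from-step-2")
print(request.get_method(), request.full_url)
```

<p>To actually fetch the definition, you would pass the request to <code>urllib.request.urlopen</code> and write the response body to a file such as logicapp.backend.swagger.json for the import step.</p>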
<p><strong>3. Import into APIM</strong></p>
<ul>
<li>Run the command below to import the swagger file into APIM</li>
</ul>
<p>az apim api import --resource-group &quot;{resourcegroup-name}&quot; --service-name &quot;{apim-instance-name}&quot; --path &quot;/v1&quot; --api-id myapi --specification-path &quot;.\logicapp.backend.swagger.json&quot; --specification-format Swagger</p>
<p><strong>4. Remove the contributor role for the service principal</strong></p>
<p>az role assignment delete --assignee 00000000-0000-0000-0000-000000000000 --role &quot;Contributor&quot; --scope $logicAppResourceId</p>
]]></content:encoded></item><item><title>Calling a Logic App from APIM</title><link>https://gurupasupathy.com/post/2024-05-22_calling-a-logic-app-from-apim/</link><pubDate>Wed, 22 May 2024 00:00:00 +0000</pubDate><guid>https://gurupasupathy.com/post/2024-05-22_calling-a-logic-app-from-apim/</guid><description>&lt;p&gt;&lt;img loading="lazy" src="https://gurupasupathy.com/img/1__Gepa1jETj8F7cwF8xpVXJg.jpeg"&gt;&lt;/p&gt;
&lt;p&gt;There are a couple of ways to integrate APIM with a Logic App. The most common use case, as far as I know, is exposing the Logic App as an API on APIM. The other scenario is calling a Logic App from APIM.&lt;/p&gt;
&lt;p&gt;I will provide the APIM policy snippet to call a Logic App. If you are using Managed Identity to authenticate to Logic App (will cover in a separate article), you can skip sending the bearer token.&lt;/p&gt;</description><content:encoded><![CDATA[<p><img loading="lazy" src="/img/1__Gepa1jETj8F7cwF8xpVXJg.jpeg"></p>
<p>There are a couple of ways to integrate APIM with a Logic App. The most common use case, as far as I know, is exposing the Logic App as an API on APIM. The other scenario is calling a Logic App from APIM.</p>
<p>I will provide the APIM policy snippet to call a Logic App. If you are using Managed Identity to authenticate to Logic App (will cover in a separate article), you can skip sending the bearer token.</p>
<p><strong>A few steps to complete in the Logic App</strong></p>
<ol>
<li>
<p>enable Authentication at the Logic App end</p>
</li>
<li>
<p>the Logic App URL should not contain the SAS token</p>
</li>
<li>
<p>make sure that the Logic App has the below in the trigger section. The condition ensures that the Logic App expects the Bearer token, and “IncludeAuthorizationHeadersInOutputs” ensures that the Auth token is available for further processing within the Logic App</p>
<p>&quot;triggers&quot;: {<br>
&quot;manual&quot;: {<br>
&quot;conditions&quot;: [<br>
{<br>
&quot;expression&quot;: &quot;@startsWith(triggerOutputs()?['headers']?['Authorization'], 'Bearer')&quot;<br>
}<br>
],<br>
&quot;inputs&quot;: {<br>
&quot;schema&quot;: {}<br>
},<br>
&quot;kind&quot;: &quot;Http&quot;,<br>
&quot;operationOptions&quot;: &quot;IncludeAuthorizationHeadersInOutputs&quot;,<br>
&quot;type&quot;: &quot;Request&quot;<br>
}<br>
}</p>
</li>
</ol>
<p><strong>APIM Policy to call the Logic App</strong></p>
<p>We issue a call to the Logic App from within the policy using &lt;send-request&gt;. The response from the Logic App is captured in <em>response-variable-name=&quot;responsela&quot;.</em></p>
<pre><code>&lt;policies&gt;
    &lt;inbound&gt;
        &lt;base /&gt;

        &lt;send-request mode=&quot;new&quot; response-variable-name=&quot;responsela&quot; timeout=&quot;20&quot; ignore-error=&quot;false&quot;&gt;
            &lt;set-url&gt;https://xxxxxxx.com:443/workflows/xxxxxxxxxxxx/triggers/manual/paths/invoke?api-version=2016-10-01&lt;/set-url&gt;
            &lt;set-method&gt;POST&lt;/set-method&gt;
            &lt;set-header name=&quot;Content-Type&quot; exists-action=&quot;override&quot;&gt;
                &lt;value&gt;application/json&lt;/value&gt;
            &lt;/set-header&gt;
            &lt;set-header name=&quot;Authorization&quot; exists-action=&quot;override&quot;&gt;
                &lt;value&gt;Bearer ****&lt;/value&gt;
            &lt;/set-header&gt;
        &lt;/send-request&gt;

        &lt;return-response&gt;
            &lt;set-status code=&quot;200&quot; reason=&quot;OK&quot; /&gt;
            &lt;set-body&gt;@(((IResponse)context.Variables[&quot;responsela&quot;]).Body.As&lt;JObject&gt;(preserveContent: true).ToString())&lt;/set-body&gt;
        &lt;/return-response&gt;

    &lt;/inbound&gt;
    &lt;outbound&gt;
        &lt;base /&gt;
    &lt;/outbound&gt;
    &lt;on-error&gt;
        &lt;base /&gt;
    &lt;/on-error&gt;
    &lt;backend&gt;
        &lt;base /&gt;
    &lt;/backend&gt;
&lt;/policies&gt;
</code></pre>
<p>All the tags are quite self-explanatory and there is plenty of documentation available about them. &lt;return-response&gt; is a very useful policy: it stops further policy pipeline execution and returns the response to the caller.</p>
<p>Hope this helps.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title>Reducing Data Transfer Objects using Tuples in C#</title><link>https://gurupasupathy.com/post/2021-11-25_reducing-data-transfer-objects-using-tuples/</link><pubDate>Thu, 25 Nov 2021 00:00:00 +0000</pubDate><guid>https://gurupasupathy.com/post/2021-11-25_reducing-data-transfer-objects-using-tuples/</guid><description>&lt;p&gt;&lt;img loading="lazy" src="https://gurupasupathy.com/img/1__o63lCwtjwCbbGw__KrUl__sw.jpeg"&gt;&lt;/p&gt;
&lt;p&gt;I have come across quite a few ASP.NET Core WebAPI solutions where there is a inordinate number of Data Transfer Object (&lt;a href="https://docs.microsoft.com/en-us/aspnet/core/tutorials/first-web-api?view=aspnetcore-5.0&amp;amp;tabs=visual-studio#prevent-over-posting-1"&gt;DTO&lt;/a&gt;) classes. This results in a kind of class explosion which I think can be avoided. Yes, DTOs do have their utility, no doubt. But, many a times as the application evolves and grows, we often end up with numerous DTOs and these DTOs sometimes differ just by a handful of attributes or in some cases they are a simple composition of multiple entities / DTOs.&lt;/p&gt;</description><content:encoded><![CDATA[<p><img loading="lazy" src="/img/1__o63lCwtjwCbbGw__KrUl__sw.jpeg"></p>
<p>I have come across quite a few ASP.NET Core WebAPI solutions where there is an inordinate number of Data Transfer Object (<a href="https://docs.microsoft.com/en-us/aspnet/core/tutorials/first-web-api?view=aspnetcore-5.0&amp;tabs=visual-studio#prevent-over-posting-1">DTO</a>) classes. This results in a kind of class explosion which I think can be avoided. Yes, DTOs do have their utility, no doubt. But many a time, as the application evolves and grows, we end up with numerous DTOs that differ by just a handful of attributes or, in some cases, are a simple composition of multiple entities / DTOs.</p>
<p>One of the reasons we have so many such DTO classes is the need to pass data to and from the repository and service layers (between different layers of the application, for that matter). In order to find a way around creating yet another DTO, I was exploring some options and realized that Tuples can be used to minimize the creation of DTOs.</p>
<p><a href="https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/builtin-types/value-tuples">Tuples</a> have been around in C# for quite some time now. I am not sure if it is common knowledge, but I recently figured out that you could eliminate quite a few of the DTOs that we use to ferry data between the layers by leveraging Tuples.</p>
<p>Let us take the following scenario, for instance.</p>
<p>We have an API, say, GetCustomers which, <em>of course</em>, will return the list of customers. And we have an entity, Customer, as defined below.</p>
<p>class Customer<br>
{<br>
public int customerId {get; set;}<br>
public string firstname {get; set;}<br>
public string lastname {get; set;}<br>
}</p>
<p>The API response for our GetCustomers API is as below</p>
<p>You would have noticed that the attribute <em>count</em> is expected in the response but is not present in our Customer class. The repository layer would just return a List&lt;Customer&gt;, but the service layer needs to pass it along with the <em>count</em> attribute to the controller. This is usually where we tend to create a DTO as below.</p>
<p>class CustomerDTO<br>
{<br>
public int count {get; set;}<br>
public List&lt;Customer&gt; customers {get; set;}<br>
}</p>
<p>The only reason for the above class to exist is to ferry the data from the repository in a format that the controller is expecting. We can eliminate this class altogether by returning a Tuple as below</p>
<p>return new Tuple&lt;int, List&lt;Customer&gt;&gt;(result.Count, result);</p>
<p>Granted, this is a very trivial scenario and you can add the count attribute in the controller and return an anonymous type also.</p>
<p>Now consider the cases where you need a response that is an aggregation of multiple custom types. For instance, if we have two APIs, one to get customer details and another to get order details, we would have created two DTOs, one for Customer and one for Order. If a new API is required that returns the details of a particular Customer along with all related Orders, you might have to create yet another DTO, as below.</p>
<p>public class newDTO {<br>
public int orderCount {get; set;}<br>
public int customerId {get; set;}<br>
public List&lt;Order&gt; orders {get; set;}<br>
}</p>
<p>The expected response is</p>
<p>This is exactly what we can avoid by using Tuples like below in the service and repository layer.</p>
<p>RepositoryLayer.cs</p>
<p>var <strong>repoResponse</strong> = new Tuple&lt;int, Customer, List&lt;Order&gt;&gt;(count, custResult, orderResult);<br>
return <strong>repoResponse</strong>;</p>
<p><em>custResult holds a particular customer’s data and orderResult will be a List&lt;Order&gt;</em></p>
<p>ServiceLayer.cs</p>
<p>.<br>
.<br>
.<br>
<em>//Create a Tuple with three members, count, customer and orders <br>
//repoResponse is the response from your repository(a Tuple)</em><br>
(int count, Customer customer, List&lt;Order&gt; orderList) <strong>result</strong> = (<strong>repoResponse</strong>.Item1, <strong>repoResponse</strong>.Item2, <strong>repoResponse</strong>.Item3);</p>
<p>return <strong>result</strong>;<br>
}</p>
<p>From the above service response, the controller can create an anonymous type as below without ever creating a DTO and return the <a href="https://gist.github.com/gurupasupathy/4d04d8352e9b2be63cee00bcfea6a3fc">response</a> .</p>
<p>return new { ordercount= <strong>serviceResponse</strong>.count, customer = <strong>serviceResponse</strong>.customer.customerId, serviceResponse = <strong>serviceResponse</strong>.orderList };</p>
<p>It should be noted that, although this approach eliminates the need to create DTO classes, it comes at the cost of readability. Your method signatures may not be very elegant or readable. While DTOs will still be the right way to go in some scenarios, for others, Tuples can help.</p>
<p>Hope this helps in reducing a few DTOs at your end.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title>How to get rid of a rouge instance in Azure App Service Plan</title><link>https://gurupasupathy.com/post/2021-04-22_rid-of-a-rouge-instance-in-azure-app-service-plan/</link><pubDate>Thu, 22 Apr 2021 00:00:00 +0000</pubDate><guid>https://gurupasupathy.com/post/2021-04-22_rid-of-a-rouge-instance-in-azure-app-service-plan/</guid><description>&lt;p&gt;&lt;img loading="lazy" src="https://gurupasupathy.com/img/1__Lm11e6NyfH1lBZhwYgObTw.jpeg"&gt;&lt;/p&gt;
&lt;p&gt;If you have been using Azure App Services for a while to host your API, there is a small chance that you would have encountered the issue with a faulty instance. Your API just doesn’t respond or keeps crashing in a particular instance. And, if your ARR Affinity was enabled, your problems will just be exacerbated. Some users will always be routed to the faulty instance.&lt;/p&gt;
&lt;p&gt;AFAIK, there are no straight forward way to release an instance that is allotted to you by Azure for the given App Service Plan. Adding more instances and removing instances (scale out / in) will not guarantee that the rogue instance will be released. I will share the approach I took to get rid of the rogue instance. Note that, the approach below needs your app service to be out of rotation and should not be serving incoming requests.&lt;/p&gt;</description><content:encoded><![CDATA[<p><img loading="lazy" src="/img/1__Lm11e6NyfH1lBZhwYgObTw.jpeg"></p>
<p>If you have been using Azure App Services for a while to host your API, there is a small chance that you would have encountered the issue with a faulty instance. Your API just doesn’t respond or keeps crashing in a particular instance. And, if your ARR Affinity was enabled, your problems will just be exacerbated. Some users will always be routed to the faulty instance.</p>
<p>AFAIK, there is no straightforward way to release an instance that is allotted to you by Azure for the given App Service Plan. Adding more instances and removing instances (scale out / in) will not guarantee that the rogue instance will be released. I will share the approach I took to get rid of the rogue instance. Note that the approach below needs your app service to be out of rotation: it should not be serving incoming requests.</p>
<p>Assume that you suspect that a given instance in your App Service Plan has issues and is crashing frequently and you wish to remove this instance. As of today, there is no way to select an instance and remove it via the Azure Portal (<em>yes, you can stop an instance from Process Explorer, but it would still not get rid of the instance</em>). One way to achieve this would be to use vertical scaling (up/down). When you scale up/down Azure allocates necessary hardware based on the target pricing tier you have chosen. The infrastructure differs significantly across tiers and moving across tiers will almost always guarantee different infrastructure allocation. We will use this to get rid of the rogue instance.</p>
<p>Start by scaling down to a lesser tier (<em>moving laterally within the same tier may not help</em>). For instance, if you are operating on a Premium tier, move to Standard. This action will make Azure allocate new instances in the lesser tier that you have chosen. Now, after scaling down, scale up again to your target pricing tier. When you do this, you are going to be allocated fresh (at least not the old rogue) instances. This is how I got rid of one of the instances that was bothering me.</p>
<p>Hope this helps.</p>
]]></content:encoded></item><item><title>How I ended up writing cleaner PATCH calls using JSON Patch</title><link>https://gurupasupathy.com/post/2020-07-05_patch-calls-using-json-patch/</link><pubDate>Mon, 06 Jul 2020 00:00:00 +0000</pubDate><guid>https://gurupasupathy.com/post/2020-07-05_patch-calls-using-json-patch/</guid><description>&lt;p&gt;&lt;img loading="lazy" src="https://gurupasupathy.com/img/1__a__P3zmoCIshJqSyuAQQGeA.jpeg"&gt;&lt;/p&gt;
&lt;p&gt;I have written my fair share of RESTful API but am no expert by any measure. I had never given enough thought about the &lt;a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods"&gt;HTTP Verbs&lt;/a&gt; I should be using (like, PUT, POST, PATCH) while writing API. If a resource had to be created, I would automatically go for POST (_never considered the idempoten_cy &lt;em&gt;angle at all&lt;/em&gt;) and if a resource had to be modified, I would go for PATCH.&lt;/p&gt;</description><content:encoded><![CDATA[<p><img loading="lazy" src="/img/1__a__P3zmoCIshJqSyuAQQGeA.jpeg"></p>
<p>I have written my fair share of RESTful APIs but am no expert by any measure. I had never given enough thought to the <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods">HTTP Verbs</a> I should be using (like PUT, POST, PATCH) while writing APIs. If a resource had to be created, I would automatically go for POST (<em>never considered the idempotency angle at all</em>) and if a resource had to be modified, I would go for PATCH.</p>
<p>On one of my API assignments, I decided to make a very conscious and deliberate choice of the verbs I would be using, and one API in particular got me thinking.</p>
<p>I had to write an API to modify a resource and this resource happened to have nested resources and numerous attributes. I soon realized that it wasn’t so straightforward to create an elegant PATCH API due to the sheer number of attributes on this resource that can potentially get modified (<em>Why did you design a resource with so many attributes in the first place? you might ask. But that is a topic for another day</em>).</p>
<p>So, coming back to the task at hand, I was aware of only two ways to go about writing a PATCH call. Either send the entire resource to the API <em>after</em> making changes to the necessary attributes at the consumer’s end (<em>Approach 1</em>), or send just the attributes requiring modification to the API (<em>Approach 2</em>). We will examine both approaches in the context of the two classes below (Employee and Address).</p>
<pre><code>    public class Employee  
    {  
        public int EmployeeId {get; set;}  
        public string EmployeeName {get; set;}  
        public Address EmployeeAddress {get; set;}  
        public string WorkLocation {get; set;}  
        public List&lt;string&gt; PreferredWorkLocations {get; set;}

    }

    public class Address  
    {  
        public string HouseNumber {get; set;}  
        public string AddressLine1 {get; set;}  
        public string AddressLine2 {get; set;}  
    }
</code></pre>
<p><strong><em>Approach 1:</em></strong> The consumer of the PATCH API will send the entire resource, after changing the few select attributes that need to be modified. At the API end, the PATCH payload will be handed over to the repository layer, which would then write the entire resource to the database. Other than some basic validations, no additional work is needed at the API end. A sample PATCH call (<em>almost a PUT</em>) would look like below</p>
<pre><code>    api/ModifyEmployee/{empId}  
    {  
        &quot;EmployeeName&quot; : &quot;ename&quot;,  
        &quot;EmployeeAddress&quot; : {  
            &quot;HouseNumber&quot; : &quot;F32&quot;,  
            &quot;AddressLine1&quot; : &quot;Addr1&quot;,  
            &quot;AddressLine2&quot; : &quot;Addr2&quot;  
        },  
        &quot;WorkLocation&quot; : &quot;Brazil&quot;,  
        &quot;PreferredWorkLocations&quot; : [&quot;Brazil&quot;,&quot;France&quot;]  
    }
</code></pre>
<blockquote>
<p>Downside: the consumer has to build the entire object even if only a single attribute requires modification. Also note that there is no way of knowing whether just the “HouseNumber” has changed or some or all of the other attributes of Employee have changed. So, all the attribute values need to be copied into a new object to be persisted in the database.</p>
</blockquote>
<p><strong><em>Approach 2:</em></strong> The consumer of the API sends only the attribute that has to be modified. A sample PATCH for modifying the work location would look like the one below:</p>
<pre><code>    api/ModifyEmployeeWorkLocation/{empId}  
    {  
        &quot;WorkLocation&quot; : &quot;Brazil&quot;  
    }
</code></pre>
<blockquote>
<p>Downside: the onus of constructing an object that can be handed over to the repository layer falls on the API service layer (an object mapper needs to be used here).</p>
</blockquote>
<blockquote>
<p>Further, this approach might necessitate a new API for each combination of possible modifications to the resource attributes. If I have to update the employee’s address, I will need another method like api/ModifyEmployeeAddress/{empId}. If the class has many attributes that could be modified, this can lead to an explosion of PATCH methods.</p>
</blockquote>
<h3 id="json-patch"><strong>JSON Patch</strong></h3>
<p>Neither of the approaches appealed to me. This is when I stumbled upon JSON Patch. Honestly, I had never heard of JSON Patch before and wanted to give it a shot, as I thought it would address the downsides mentioned above.</p>
<p>What I liked most about JSON Patch was that, <strong><em>as a consumer</em></strong>, I don’t have to send the entire object as the payload for the PATCH call; I can just state which operation (<em>add / remove / replace / copy</em>) I want to perform on which resource attribute or subset of attributes. Also, <strong><em>at the API end</em></strong>, there is no need to have a separate method for each type of modification, and no need to manually copy the incoming values over to a new object that the repository will understand and persist.</p>
<p>Using JSON Patch, these calls can be as simple as the one below:</p>
<pre><code>    [  
        {  
            &quot;value&quot;: &quot;address line one&quot;,  
            &quot;path&quot;: &quot;/employeeAddress/addressLine1&quot;,  
            &quot;op&quot;: &quot;replace&quot;  
        }  
    ]
</code></pre>
<p>The advantage of using JSON Patch is that you don’t have to reconstruct the object at your API end. You can use a library such as Microsoft.AspNetCore.JsonPatch (the Newtonsoft.Json-based JSON Patch support for ASP.NET Core) and simply call the <strong><em>ApplyTo</em></strong> method to construct the object for persistence / further processing.</p>
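<pre><code>    public async Task&lt;IActionResult&gt; UpdateEmployee([FromBody] JsonPatchDocument&lt;Employee&gt; patchDoc, int empId)  
    {  
        // A missing or unparsable patch document is a client error.  
        if (patchDoc == null)  
        {  
            return BadRequest(ModelState);  
        }

        // Load the current state: try the cache first, then fall back to the service.  
        var emp = await yourCache.GetAsync&lt;Employee&gt;(&quot;cacheKey&quot;);  
        if (emp == null)  
        {  
            emp = await yourService.GetEmployeeData(empId);  
        }

        // Apply every operation in the patch; failures are recorded in ModelState.  
        patchDoc.ApplyTo(emp, ModelState);  
        if (!ModelState.IsValid)  
        {  
            return BadRequest(ModelState);  
        }

        // Call the repository to persist the patched object.  
        _ = await yourService.UpdateAsync(emp);

        return new ObjectResult(emp);  
    }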
</code></pre>
<p>The ApplyTo method will take care of copying (or performing any operation based on the value supplied in “op” attribute of the PATCH call) the new values to the existing object. This eliminates the need to do this mapping and copying manually using a mapper.</p>
<p>Another plus is that you don’t need a separate PATCH call for each attribute; you can combine multiple modification requests in the same PATCH call, as below. Please note that you have to use the /- notation to append to a list.</p>
<pre><code>    [  
        {  
            &quot;value&quot;: &quot;new work Location&quot;,  
            &quot;path&quot;: &quot;/preferredWorkLocations/-&quot;,  
            &quot;op&quot;: &quot;add&quot;  
        },  
        {  
            &quot;value&quot;: &quot;address line one&quot;,  
            &quot;path&quot;: &quot;/employeeAddress/addressLine1&quot;,  
            &quot;op&quot;: &quot;replace&quot;  
        }  
    ]
</code></pre>
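<p>The same document format also covers the <em>remove</em> and <em>copy</em> operations mentioned earlier. Below is a minimal sketch, assuming the Employee class shown above and the operation semantics defined in RFC 6902; the paths and the list index are illustrative only.</p>
<pre><code>    [  
        {  
            &quot;op&quot;: &quot;remove&quot;,  
            &quot;path&quot;: &quot;/preferredWorkLocations/0&quot;  
        },  
        {  
            &quot;op&quot;: &quot;copy&quot;,  
            &quot;from&quot;: &quot;/employeeAddress/addressLine1&quot;,  
            &quot;path&quot;: &quot;/employeeAddress/addressLine2&quot;  
        }  
    ]
</code></pre>
<p>Note that <em>remove</em> carries no “value”, and <em>copy</em> names its source in “from” rather than supplying a value. The operations are applied in the order in which they appear in the array.</p>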
<p>You may refer to the below link for detailed information on how to use JSON Patch in ASP.NET Core.</p>
<p><a href="https://docs.microsoft.com/en-us/aspnet/core/web-api/jsonpatch?view=aspnetcore-3.1" title="https://docs.microsoft.com/en-us/aspnet/core/web-api/jsonpatch?view=aspnetcore-3.1"><strong>JsonPatch in ASP.NET Core web API</strong></a>, by Tom Dykstra and Kirk Larkin (docs.microsoft.com)</p>
<p>So, that’s how I embraced JSON Patch.</p>
<p>Cheers!</p>
]]></content:encoded></item><item><title>Using Azure function proxies for mocking API</title><link>https://gurupasupathy.com/post/2020-06-14_using-azure-function-proxies-for-mocking/</link><pubDate>Sun, 14 Jun 2020 00:00:00 +0000</pubDate><guid>https://gurupasupathy.com/post/2020-06-14_using-azure-function-proxies-for-mocking/</guid><description>&lt;p&gt;&lt;img loading="lazy" src="../post/img/1__0dxAdZ9Lr__lZh7pA1UvbhA.png"&gt;
&lt;img loading="lazy" src="img/1__K8rbssn1nWY4vAurhAH3QQ.jpeg"&gt;&lt;/p&gt;
&lt;p&gt;There are many options available when it comes to mocking API responses, such as &lt;a href="https://www.npmjs.com/package/json-server"&gt;JSON server&lt;/a&gt; or even a response JSON file added to your solution, to cite a few. In this article we will see how Azure Functions proxies can be used to mock API responses.&lt;/p&gt;
&lt;p&gt;Azure Functions provides an elegant option to mock API responses using proxies. Using an Azure Functions proxy, you can provide a mock endpoint which can be used by your team to continue their work till your actual API is ready for integration.&lt;/p&gt;</description><content:encoded><![CDATA[<p><img loading="lazy" src="../post/img/1__0dxAdZ9Lr__lZh7pA1UvbhA.png">
<img loading="lazy" src="img/1__K8rbssn1nWY4vAurhAH3QQ.jpeg"></p>
<p>There are many options available when it comes to mocking API responses, such as <a href="https://www.npmjs.com/package/json-server">JSON server</a> or even a response JSON file added to your solution, to cite a few. In this article we will see how Azure Functions proxies can be used to mock API responses.</p>
<p>Azure Functions provides an elegant option to mock API responses using proxies. Using an Azure Functions proxy, you can provide a mock endpoint which can be used by your team to continue their work till your actual API is ready for integration.</p>
<p>Let us go ahead, create a simple proxy and see how the mock response is served.</p>
<p>We will be creating a proxy endpoint which will service a GET call, say, getCustomer. Our getCustomer API method is expected to provide a response in the format below. So, till getCustomer is up and ready for consumption, our proxy can be used to get the JSON below as a mock response.</p>
<p><img loading="lazy" src="/img/1__VxL01HJuza__7HKo6y3T8Vw.png"></p>
<p>Below are the steps to create a Function proxy.</p>
<p><strong>Step 1:</strong> We will create an Azure function app which will host the proxy. (If there is already a general purpose / maintenance Function App present we can use that.)</p>
<p><img loading="lazy" src="/img/1__CdC5l5Ay818N906wBgbugw.png">
<img loading="lazy" src="/img/1__R8T9gR2hwvG7u38I8e9ZWA.png"></p>
<p><strong>Step 2:</strong> Now that we have created the function app to host our proxy, let us create our proxy. Choose the “Proxies” item in the Azure function blade as shown below. Click on “Add” to create a new proxy. We will call this MockCustomerAPI</p>
<p><img loading="lazy" src="/img/1__K6PLoLzwHZssslewruuToQ.png"></p>
<p>And we will provide a route <em>/api/getcustomer</em>. In the HTTP Method section, we select “GET”. Please note that we can choose to mock other HTTP methods like POST as well.</p>
<p><strong>Step 3:</strong> This is the step where we provide the response we want the proxy to send back. We will override the response as shown below, by expanding the “Response override” link and pasting our mock response into the space provided in the Body section.</p>
<p><img loading="lazy" src="/img/1__BgsQW0N7msbWbSpjkiIdNg.png"></p>
<p>We can provide the status code and status message as per our use case and click on “Create”. Once the proxy gets created successfully we will be provided with a link to access the proxy as shown below</p>
<p><img loading="lazy" src="/img/1__k7__6TGvg__h2H9nN1KHpO7Q.png"></p>
<p><strong>Step 4:</strong> Now that we are done with creating the proxy, let us test. To test our proxy, copy the generated proxy URL and open in the browser. We will see the response as below</p>
<p><img loading="lazy" src="/img/1__omuFybwa4TdrArkz3r7Gxg.png"></p>
<p>Thus, we have created a proxy for the getCustomer API which can be used by the UX team or other API teams to integrate during the early development cycles, when our API is not ready yet. Please do note that mocks are not just for the GET method; you can mock other HTTP methods as well.</p>
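<p>If you would rather keep the mock in source control than click through the portal, the same proxy can be described in a <em>proxies.json</em> file in the Function App. The sketch below is a minimal example under stated assumptions: the proxy name, route and method come from the steps above, while the fields in the response body (<em>customerId</em>, <em>customerName</em>) are hypothetical placeholders, since the real contract is visible only in the screenshot.</p>
<pre><code>    {  
        &quot;$schema&quot;: &quot;http://json.schemastore.org/proxies&quot;,  
        &quot;proxies&quot;: {  
            &quot;MockCustomerAPI&quot;: {  
                &quot;matchCondition&quot;: {  
                    &quot;methods&quot;: [ &quot;GET&quot; ],  
                    &quot;route&quot;: &quot;/api/getcustomer&quot;  
                },  
                &quot;responseOverrides&quot;: {  
                    &quot;response.statusCode&quot;: &quot;200&quot;,  
                    &quot;response.headers.Content-Type&quot;: &quot;application/json&quot;,  
                    &quot;response.body&quot;: {  
                        &quot;customerId&quot;: 1,  
                        &quot;customerName&quot;: &quot;Placeholder Customer&quot;  
                    }  
                }  
            }  
        }  
    }
</code></pre>
<p>Because no backend URL is given, the proxy answers entirely from the response overrides, which is exactly what we want from a mock.</p>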
<p>Some of the advantages of this approach are</p>
<ol>
<li>Mock API responses unblock collaborating teams, like the UX team, as they can work against the mock endpoint till the actual API is ready.</li>
<li>Testing is easier and more thorough if your API relies on a partner API. Creating a mock endpoint gives you the flexibility to change the response and test your code against all possible, allowed parameter values from the partner API.</li>
<li>If there is a dependency on a partner API which is not available in your lower environments, you can resort to creating a proxy in the lower environment.</li>
<li>As it is hosted at a common URL, the same contract will be used consistently across all crews. Any change made to the contract will be immediately visible to all the consuming developers.</li>
<li>It eliminates the need for a separate JSON response file or JSON server on the local dev box, and thus ensures you are developing against the latest contract.</li>
</ol>
<p><em>Before we conclude, a note about</em> <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS"><em>CORS</em></a><em>:</em> If you are hitting the proxy from your front-end web application, please ensure you adjust the <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS">CORS</a> setting for the mock function app accordingly, as shown below.</p>
<p><img loading="lazy" src="/img/1__vxBs4ndIeg9zEF4r7__BX8w.png"></p>
<p><strong>Conclusion</strong></p>
<p>There are many other useful features of Azure Functions proxies, like redirection and route template parameters. You can read more about Azure Functions proxies in the official <a href="https://docs.microsoft.com/en-us/azure/azure-functions/functions-proxies">Microsoft documentation</a>.</p>
]]></content:encoded></item><item><title>Scheduling vertical scaling using Microsoft Azure Automation Accounts</title><link>https://gurupasupathy.com/post/2020-04-25_scheduling-vertical-scaling/</link><pubDate>Sat, 25 Apr 2020 00:00:00 +0000</pubDate><guid>https://gurupasupathy.com/post/2020-04-25_scheduling-vertical-scaling/</guid><description>&lt;p&gt;&lt;img loading="lazy" src="https://gurupasupathy.com/img/1__6HRU03UvgOt__5ih3ssEDXw.jpeg"&gt;&lt;/p&gt;
&lt;p&gt;Scaling cloud resources dynamically is a fascinating topic. Microsoft Azure provides quite a few ways to dynamically scale resources. This article focuses on creating a scheduled &lt;em&gt;vertical scaling&lt;/em&gt; (scale up/down) of App Services. The approach outlined here can be used for other Azure resources like SQL Databases, Redis Cache or, in fact, pretty much any Azure resource that supports scaling. Just to clarify right at the outset, we are talking about vertical scaling (between pricing tiers) and not horizontal scaling (scale in/out), wherein we deal with the number of instances at our disposal.&lt;/p&gt;</description><content:encoded><![CDATA[<p><img loading="lazy" src="/img/1__6HRU03UvgOt__5ih3ssEDXw.jpeg"></p>
<p>Scaling cloud resources dynamically is a fascinating topic. Microsoft Azure provides quite a few ways to dynamically scale resources. This article focuses on creating a scheduled <em>vertical scaling</em> (scale up/down) of App Services. The approach outlined here can be used for other Azure resources like SQL Databases, Redis Cache or, in fact, pretty much any Azure resource that supports scaling. Just to clarify right at the outset, we are talking about vertical scaling (between pricing tiers) and not horizontal scaling (scale in/out), wherein we deal with the number of instances at our disposal.</p>
<p>It is common knowledge that Azure provides out-of-the-box options to <strong><em>scale out / scale in</em></strong> an App Service plan based on scaling rules, but there is no built-in way to scale up / scale down on a schedule.</p>
<p>For instance, there is no direct way to say that between 10 AM and 12 noon my App Service plan should run on P2V2 and return to P1V2 thereafter, or that my SQL database should move up to P6 for a few hours before coming back to S2. In other words, there is no option to <strong><em>scale up / scale down</em></strong> based on a schedule.</p>
<blockquote>
<p>Note on Serverless Azure SQL Database: Serverless Azure SQL Databases have two key capabilities which make them attractive in terms of cost. 1. Auto-scaling up / down between the configured minimum and maximum thresholds. 2. Auto-pause, wherein the database is paused after a predefined period of inactivity until some activity is detected again. You don’t get charged for compute during the period of inactivity. The downside is that it takes some time for the database to warm up and be available for the next use after the period of inactivity. Serverless is best suited for test / dev environments where you can tolerate the brief period of connection unavailability during the warm-up. There could also be slight performance degradation for some time as the cache memory is gradually reclaimed. Serverless is not for your use case if these limitations are not acceptable. Furthermore, there are cases where, although your usage will be limited for an interval, you cannot afford to pause the database using auto-pause. Without auto-pause, you will be charged for the minimum number of vCores and minimum memory configured. For more details, refer to the Microsoft documentation on <a href="https://docs.microsoft.com/en-us/azure/azure-sql/database/serverless-tier-overview">Azure Serverless SQL Database</a>.</p>
</blockquote>
<blockquote>
<p>So, if you want to use DTU-based provisioning and still want scaling based on a schedule because your utilisation is predictable, you can use the approach outlined in this article. One example that comes to my mind is bumping up your DTUs for a few hours when you are doing a performance test, or scaling down during a seasonal / weekend period of low utilisation to save costs.</p>
</blockquote>
<p>Alright, now that we have the context set, let us move on to see how we can achieve this scheduled vertical scaling for Azure resources.</p>
<p>To start with, create an Automation account. Details on how to create an Automation account can be found <a href="https://docs.microsoft.com/en-us/azure/automation/automation-create-standalone-account">here</a>. Azure Automation allows us to invoke runbooks as per a schedule. We will leverage this capability for our purpose. Please remember to select the option to create a Run As account while creating the Automation account, as shown below. This is the security principal under which the runbooks execute.</p>
<p><img loading="lazy" src="/img/1__XoHZ7Nof1sdJkKRcwhXCaA.png"></p>
<p>After the Automation account is created, create two runbooks (one for scale up and another for scale down) which will be invoked by a scheduler to perform the scaling operation automatically, without our intervention. These runbooks will contain PowerShell scripts to perform the scaling operation based on your need. The fact that we use PowerShell to perform the scaling gives us the option to scale pretty much any resource for which you can get hold of a PowerShell script to scale it; and the best source for PowerShell reference is <a href="https://docs.microsoft.com/en-us/powershell/scripting/overview?view=powershell-7">Microsoft’s official documentation</a>.</p>
<p>Now that you have the automation account and the runbooks that you need, create a schedule and link these runbooks as per your need. (<em>I won’t go into details on creating a schedule as it is very well documented and simple. Refer to Microsoft’s documentation on how to create a</em> <a href="https://docs.microsoft.com/en-us/azure/automation/shared-resources/schedules"><em>Schedule</em></a> )</p>
<p>So, for instance, the below schedule will automatically call the “ScaleDown” runbook at 5:10 AM on 7th Feb</p>
<p><img loading="lazy" src="/img/0__qk76rY21a2xOWrxz.jpg"></p>
<p>The below PowerShell script can be used to scale up / down an App Service Plan</p>
<pre><code>    Write-Output &quot;API Scale&quot;

    $connectionName = &quot;AzureRunAsConnection&quot;
    try {
        # Fetch the Run As connection and sign in as its service principal.
        $servicePrincipalConnection = Get-AutomationConnection -Name $connectionName

        Add-AzureRmAccount -ServicePrincipal -TenantId $servicePrincipalConnection.TenantId -ApplicationId $servicePrincipalConnection.ApplicationId -CertificateThumbprint $servicePrincipalConnection.CertificateThumbprint

        # Move the App Service plan to the target tier and worker size.
        Set-AzureRmAppServicePlan -ResourceGroupName &quot;yourresourcegroup&quot; -Name &quot;yourappserviceplanname&quot; -Tier PremiumV2 -NumberofWorkers 2 -WorkerSize &quot;Medium&quot;
    }
    catch {
        if (!$servicePrincipalConnection) {
            $ErrorMessage = &quot;Connection $connectionName not found.&quot;
            throw $ErrorMessage
        } else {
            Write-Error -Message $_.Exception
            throw $_.Exception
        }
    }
</code></pre>
<blockquote>
<p>In case you receive an error as below, go and update the PowerShell modules in your automation account. That should fix the issue</p>
</blockquote>
<blockquote>
<p><strong>The term ‘Set-AzureRmAppServicePlan’ is not recognized as the name of a cmdlet, function, script file, or operable program.</strong></p>
</blockquote>
<p>Below, I have provided a sample PowerShell script to scale a SQL database; it scales the database to the P1 tier. This script uses a credential to perform the DB scaling, as opposed to the Run As connection (AzureRunAsConnection) used in the previous script.</p>
<p>To create a new credential, navigate to your automation account and select the “Credentials” option in the “Shared Resources” section. Refer to the below screen shot showing the credential creation</p>
<p><img loading="lazy" src="/img/1__tvODqy4v1CAZN__sbLLDL7g.png"></p>
<pre><code>    param([parameter(Mandatory=$true)] [PSCredential] $Credential)

    # Name of the Azure SQL Database server
    [string] $SqlServerName = &quot;yourserver.database.windows.net&quot;

    $ServerCredential = New-Object System.Management.Automation.PSCredential($Credential.UserName, (($Credential).GetNetworkCredential().Password | ConvertTo-SecureString -AsPlainText -Force))

    $CTX = New-AzureSqlDatabaseServerContext -ServerName $SqlServerName -Credential $ServerCredential

    [string] $DatabaseName = &quot;yourdb&quot;
    [string] $Edition = &quot;Premium&quot;
    [string] $PerfLevel = &quot;P1&quot;
    $Db = Get-AzureSqlDatabase $CTX -DatabaseName $DatabaseName

    Write-Output &quot;Database scale state - $($Db.ServiceObjectiveAssignementStateDescription)&quot;

    # Scale only if the DB is not already at the target level and no scale operation is pending.
    if ($Db.ServiceObjectiveName -ne $PerfLevel -and $Db.ServiceObjectiveAssignementStateDescription -ne &quot;Pending&quot;) {
        $ServiceObjective = Get-AzureSqlDatabaseServiceObjective $CTX -ServiceObjectiveName $PerfLevel

        # Set the new edition/performance level
        # Editions: None, Business, Web, Premium, Basic, Standard
        Write-Output &quot;Trigger the scale operation&quot;
        Set-AzureSqlDatabase $CTX -Database $Db -ServiceObjective $ServiceObjective -Edition $Edition -Force

        Write-Output &quot;Completed vertical scale&quot;
    } else {
        Write-Output &quot;The DB is already in the target pricing tier or is currently being scaled up / down&quot;
    }
</code></pre>
<p>This same approach can be applied for other Azure resources also. Go on and try it out!</p>
<p>Happy Scaling!</p>
]]></content:encoded></item></channel></rss>