[{"content":"In the previous part, we established an on-premises identity foundation. The on-premises setup consists of a virtual network with Windows and Linux VMs joined to an on-premises Active Directory domain hosted on two domain controllers. In this part, we will create a VPN Gateway in Azure and a StrongSwan IPsec gateway on-premises and establish the Site-to-Site VPN tunnel — the foundation of our hybrid lab.\nImplementing a Site-to-Site (S2S) tunnel is simple — so rather than walking through the steps procedurally, I want to focus on what each component is actually doing.\nA Site-to-Site VPN connects two networks over the public internet using an encrypted IPsec tunnel. Each end has a gateway that authenticates the other using a Pre-Shared Key (PSK). Only traffic destined for the remote subnet goes through the tunnel — everything else uses the normal internet route. The tunnel is managed by IKE (Internet Key Exchange), which negotiates the Security Association (SA) — the agreed encryption parameters — before any traffic flows.\nBefore walking through the steps, here are the key addresses we’ll reference throughout:\n192.168.122.0/24 — on-premises network hosting KVM VMs (virbr0)\n192.168.1.106 — on-premises VPN gateway (StrongSwan)\n192.168.1.0/24 — on-premises Wi-Fi LAN\n61.69.136.49 — on-premises public IP\n20.219.67.227 — Azure VPN Gateway public IP\n10.66.0.0/16 — Azure VNet\n10.66.0.0/24 — Azure GatewaySubnet\n10.66.5.0/24 — Azure WorkloadSubnet\nSome of these will be created in the steps below; others are already in place from earlier parts.\nAzure Configurations 1. Create a virtual network in Azure While creating a VNet using the Azure portal, decide on an address range and create two subnets named GatewaySubnet and WorkloadSubnet as shown below. 
In WorkloadSubnet we will create VMs that want to talk to on-premises.\nOnce the VNet is provisioned, we need three components at Azure end that enable the connectivity with on-premises — a VPN gateway, a Local Network Gateway and a link between these two components\n2. Azure VPN Gateway This is the component that is responsible for establishing a secure tunnel with the on-premises. VPN Gateway is deployed in the virtual network you have chosen to pair with your on-premises network and has to be deployed in GatewaySubnet only — this is a hard requirement. Azure reserves this subnet name specifically for gateway infrastructure and rejects deployment attempts to any other subnet name. All traffic comes in and goes out via this VPN Gateway if force tunneling is enabled. Incoming traffic lands in the GatewaySubnet and from there it will be routed to the destination within the VNet.\nFor our purpose, a Basic tier VPN gateway would suffice. The Azure portal no longer shows a VPN type selector — all new gateways are route-based by default which support IKEv2. This is what we will use.\nRemember that on top of fixed monthly charge, there are costs associated with traffic entering and leaving the network via VPN Gateway — https://azure.microsoft.com/en-us/pricing/details/vpn-gateway/\nOnce the VPN Gateway is created, take a note of the public IP assigned to it. In my case it is — 20.219.67.227\n3. Local Network Gateway Now that we have a gateway in our Azure VNet, we need a way to identify the on-premises network. For this, an Azure service called Local Network Gateway is used. This is the representation of on-premises network. 
When you create a Local Network Gateway, provide the public IP of your on-premises network as the IP address and the network ranges that you want to include in the tunnel as address ranges —\nIP address — 61.69.136.49 (public IP of your Wi-Fi router). You can confirm this by running the command below\ncurl -s ifconfig.me # 61.69.136.49\nNote: This address can change when your ISP reassigns it, which typically happens on router restart or DHCP lease expiry. For a home lab this is fine, but for a serious setup you should consider getting a static IP.\nAddress Space(s) —\n**192.168.122.0/24** (the libvirt virtual network) and **192.168.1.0/24** (your Wi-Fi network)\nNote: You can skip the Wi-Fi range if you do not intend to have other devices on your Wi-Fi network participate in the S2S tunnel.\n4. Connection The final bit of the Azure end of the configuration is a Connection. A connection is the link between the VPN Gateway and the Local Network Gateway.\nFrom the VPN Gateway, create a connection of type “Site-to-Site (IPSec)”, choose IKEv2, provide the shared key (PSK) and the connection mode, and leave the rest as it is.\nOn-premises configuration On-premises needs VPN Gateway configurations similar to the Azure side. The on-premises configuration is simpler by comparison. We will use StrongSwan as the VPN Gateway, and the following sections walk through the necessary configurations to enable a site-to-site tunnel.\n5. Install / Configure StrongSwan StrongSwan is an open-source IPsec implementation for Linux. It runs as a daemon on the on-premises host and is responsible for IKE negotiation, SA establishment, and installing the resulting XFRM policies and keys into the Linux kernel.\nInstall StrongSwan and ensure it is running\nsudo apt install strongswan\nsudo systemctl enable strongswan-starter\nsudo systemctl start strongswan-starter\nsudo systemctl status strongswan-starter\n6. 
ipsec.conf This is StrongSwan’s main configuration file — it defines the tunnel connection parameters including peer identities, the subnets to advertise on each side, the encryption proposals, and the connection behaviour on startup and failure.\nsudo nano /etc/ipsec.conf\nMinimal config:\nconfig setup\ncharondebug=\"ike 2, knl 2, cfg 2\"\nconn azure-s2s\nkeyexchange=ikev2\nleft=\u003conprem_vpn_gateway\u003e # Linux box with StrongSwan\nleftid=\u003conprem_vpn_gateway\u003e # Linux box with StrongSwan\nleftsubnet=\u003conprem_subnet_1_range\u003e,\u003conprem_subnet_2_range\u003e\nright=\u003cazure_vpn_gateway_public_ip\u003e\nrightid=\u003cazure_vpn_gateway_public_ip\u003e\nrightsubnet=\u003cazure_workload_subnet_address_range\u003e\nauthby=secret\nauto=start\nike=aes256-sha256-modp1024! # acceptable for lab environments\nesp=aes256-sha256!\ndpdaction=restart\ndpddelay=30s\ndpdtimeout=120s\nEach attribute controls a specific aspect of how StrongSwan negotiates and maintains the tunnel:\n**keyexchange=ikev2** — specifies IKEv2 as the key exchange protocol. IKEv2 is more efficient than IKEv1 (fewer round trips to establish the SA) and handles NAT traversal natively, which matters here since the on-premises side is behind a home router.\n**left** / **leftid** — identifies the local end of the tunnel. left is the IP StrongSwan binds to; leftid is how it identifies itself to the remote peer during IKE negotiation. Both are set to the StrongSwan host’s LAN IP here.\n**leftsubnet** — defines what on-premises ranges StrongSwan advertises through the tunnel. 
These must match the address spaces configured in the Azure Local Network Gateway — Azure uses the LNG configuration to inject routes into the VNet.\n**right** / **rightid** — the mirror of left, identifying the remote peer — in this case the Azure VPN Gateway’s public IP.\n**rightsubnet** — the network ranges behind the Azure VPN Gateway that on-premises should route through the tunnel. Traffic destined for these ranges will be intercepted by XFRM and encrypted.\n**authby=secret** — use a Pre-Shared Key for authentication, as configured in ipsec.secrets.\n**auto=start** — bring the tunnel up automatically when StrongSwan starts. Setting this to add instead would make StrongSwan a passive responder only.\n**ike=aes256-sha256-modp1024!** — the Phase 1 (IKE SA) proposal: AES-256 encryption, SHA-256 integrity, and Diffie-Hellman group 2 (modp1024 — acceptable for lab environments). The trailing ! means this is the only proposal offered — StrongSwan will not fall back to weaker algorithms. Azure must match this exactly.\n**esp=aes256-sha256!** — the Phase 2 (ESP) proposal governing how actual data packets are encrypted inside the tunnel. Same strict-match semantics as the ike line.\n**dpdaction=restart** — Dead Peer Detection behaviour. If the remote peer goes silent, StrongSwan will attempt to re-establish the tunnel rather than leave a stale SA. 
dpddelay and dpdtimeout control how long it waits before declaring the peer dead.\nExample —\nconfig setup\ncharondebug=\"ike 2, knl 2, cfg 2\"\nconn azure-s2s-manual\nkeyexchange=ikev2\nleft=192.168.1.106\nleftid=192.168.1.106\nleftsubnet=192.168.122.0/24,192.168.1.0/24\nright=20.219.67.227 # (hce-d01-vpngw-pip)\nrightid=20.219.67.227 # (hce-d01-vpngw-pip)\nrightsubnet=10.66.5.0/24\nauthby=secret\nauto=start\nike=aes256-sha256-modp1024!\nesp=aes256-sha256!\ndpdaction=restart\ndpddelay=30s\ndpdtimeout=120s\n**_rightsubnet_** = where your workloads live = what you want to reach.\n_GatewaySubnet_ is infrastructure — it’s where Azure’s VPN Gateway itself runs. You never deploy VMs there, and you never put it in _rightsubnet_. It’s not a destination, it’s a transit point.\nSo the mental model:\nrightsubnet = subnets behind the remote gateway\n= where the actual VMs/services are\n= NOT the gateway’s own subnet\nSame logic applies symmetrically to _leftsubnet_ on your side — it’s the subnets behind your StrongSwan (your VM network, your LAN), not StrongSwan’s own IP.\nThe gateway subnet on both sides is implied — both ends know the gateways exist because they’re talking to each other. What they need to tell each other is “what’s behind me that you can reach.”\n7. ipsec.secrets The ipsec.secrets file is read by Charon — StrongSwan’s IKEv2 keying daemon — at startup and on ipsec rereadsecrets. It holds the Pre-Shared Key used to authenticate both peers during IKE Phase 1. The format is:\n\u003clocal-id\u003e \u003cremote-id\u003e : PSK \"shared-secret\"\nThe two IPs identify the tunnel endpoints — they must match the leftid and rightid values in ipsec.conf exactly, because charon looks up the secret by matching the peer identities presented during IKE negotiation. 
The PSK itself must match what was configured in the Azure Connection resource, character for character.\nThis file does not change structure over the lifetime of the tunnel. The only reason to update it is if you rotate the PSK — in Azure you set a new shared key on the Connection, then update the value here and run sudo ipsec rereadsecrets to pick it up without restarting the daemon or dropping the tunnel.\nOne important operational note: this file contains a plaintext secret and should be owned by root with permissions 600. StrongSwan will warn if it is world-readable.\nsudo nano /etc/ipsec.secrets\n192.168.1.106 20.219.67.227 : PSK \"your_shared_key\"\nWe have now set up all the infrastructure needed to bring up the tunnel. But before that, let us understand a bit about how tunnels are established.\nUnder the Hood: The Mechanics of Tunnel Initiation The S2S tunnel is initiated from on-premises — when StrongSwan sends the initial IKE packet outbound (src: 192.168.1.106:500 → dst: 20.219.67.227:500), the Wi-Fi router performs source NAT — replacing 192.168.1.106 with the public IP 61.69.136.49 — and records this translation in its conntrack table. When Azure replies, the router matches the inbound packet against that entry and reverses the translation, forwarding the packet to 192.168.1.106.\nThe S2S tunnel is initiated from Azure — in this scenario, the Azure VPN Gateway is the initiator of the tunnel. The tunnel will not even be established unless port forwarding is configured on the Wi-Fi router. If you are curious, here are the steps to have Azure initiate the tunnel.\n1. Set auto=add in ipsec.conf — this tells StrongSwan to act only as a responder: it won’t initiate the tunnel on startup and won’t re-initiate if it drops.\n2. Set Connection Mode to _InitiatorOnly_ in Azure Local Network Gateway » Connection » [Your Connection] » Configuration blade properties.\n3. 
Enable port forwarding in your Wi-Fi router with these values _1. InternalIP:192.168.1.106 — InternalPort:500 — Protocol:UDP — ExternalPort:500 2. InternalIP:192.168.1.106 — InternalPort:4500 — Protocol:UDP — ExternalPort:4500_\nPort 500 is used for the initial IKE handshake. Port 4500 is used for NAT Traversal (NAT-T) — once both sides detect a NAT device on the path, all subsequent IKE and ESP traffic moves to UDP:4500.\nThese rules tell the Wi-Fi router to forward any inbound UDP:500 and UDP:4500 traffic arriving on the WAN interface to _192.168.1.106_, regardless of the source. Without these rules, the router has no NAT mapping for unsolicited inbound IKE packets and drops them.\nIMPORTANT — Who initiates the tunnel has no bearing on data packet routing between the sites as long as the tunnel is UP. Once the IPsec SA is established, the tunnel is a symmetric pipe — packets flow freely in both directions regardless of which side initiated IKE. Once the tunnel is up, the success and failure points are identical in both cases.\nBring the tunnel up With both sides configured, bring the tunnel up from the StrongSwan host. 
Follow the steps below.\n# Check current state\nsysctl net.ipv4.ip_forward\n# Enable if 0\nsudo sysctl -w net.ipv4.ip_forward=1\nip_forward must be enabled for the StrongSwan host to forward packets between interfaces — we will cover exactly why in Part 6.\n# Start StrongSwan if it is newly installed or not running; otherwise run ipsec restart and skip the second step\nsudo ipsec start\n# Bring connection up\nsudo ipsec up azure-s2s-manual\n# Verify\nsudo ipsec status\nipsec status output should show the tunnel as ESTABLISHED\nSecurity Associations (1 up, 0 connecting):\nazure-s2s-manual[1]: ESTABLISHED 13 minutes ago, 192.168.1.106[192.168.1.106]...20.219.67.227[20.219.67.227]\nazure-s2s-manual{1}: INSTALLED, TUNNEL, reqid 1, ESP in UDP SPIs: c0a6aaf5_i 90cbbb64_o\nazure-s2s-manual{1}: 192.168.1.0/24 192.168.122.0/24 === 10.66.5.0/24\nSummary Now that the tunnel is established, it’s worth mapping out the traffic flows it enables — and one it doesn’t, yet. There are 3 scenarios of bi-directional traffic flow in our setup between —\nAzure Virtual Machines and On-premises Virtual Machines\nAzure Virtual Machines and On-premises VPN Gateway\nOn-premises VPN Gateway and the On-premises Virtual Machines\nExcept for the packet flow from an Azure virtual machine destined for the on-premises KVM virtual machines (which sit behind StrongSwan on the libvirt network), all of the above will work without any additional configuration.\nIn the next part of the series we will discuss why they work and why one of them doesn’t.\n","permalink":"https://gurupasupathy.com/post/2026-06-06_building-hce-part-5-connectivity-site-to-site-vpn-establishing/","summary":"\u003cp\u003eIn the \u003ca href=\"https://pasupathy-guru.medium.com/handson-building-hybrid-cloud-environment-part-4-identity-domain-joining-a-linux-vm-and-59a48a2be7f2?source=friends_link\u0026amp;sk=fba0fcbdf7ae87efd0f4b01ed798c21e\"\u003eprevious part,\u003c/a\u003e we established an on-premises identity foundation. 
The on-premises setup consists of a virtual network with Windows and Linux VMs joined to an on-premises Active Directory domain hosted on two domain controllers. In this part, we will create a VPN Gateway in Azure and a StrongSwan IPsec gateway on-premises and establish the Site-to-Site VPN tunnel — the foundation of our hybrid lab.\u003c/p\u003e\n\u003cp\u003eImplementing a Site-to-Site (S2S) tunnel is simple — so rather than walking through the steps procedurally, I want to focus on what each component is actually doing.\u003c/p\u003e","title":"HandsOn — Building Hybrid Cloud Environment — Part 5— Connectivity— Site-to-Site VPN establishing…"},{"content":"Does APIM support zero downtime deployment? — To answer this question, multiple factors need to be ascertained, like, What is the SKU? Have you opted for Availability zones? etc. In fact, the question needs to be qualified further. What do you mean by zero downtime deployment?\nIn the case of APIM, there are infrastructure changes and then there are gateway configuration changes like API specifications and policies. So, the answer depends on — SKU, AZ, “what” kind of changes\nFrom the official documentation:\n“When you change availability zone configuration, the changes can take 15 to 45 minutes or more to apply. The API Management gateway can continue to handle API requests during this time.”\nGateway configuration, such as APIs and policy definitions, regularly synchronizes between the availability zones that you select for the instance. Propagation of updates between the availability zones normally takes less than 10 seconds.\nActive requests: When an availability zone is unavailable, any requests in progress that are connected to an API Management unit in the faulty availability zone are terminated and need to be retried.\nAutomatic: You can expect instances that use automatic availability zone support to have no downtime during an availability zone outage. 
Units in the unaffected zone or zones continue to work.\n“You can also expect instances that use automatic availability zone support, but have a single unit, to have no downtime.” In this case, API Management distributes the unit’s underlying compute resources to two zones. The resource in the unaffected zone continues to work.\nZone-redundant: You can expect zone-redundant instances to have no downtime during an availability zone outage.\nMy personal view based on this is —API specifications and Policy updates won’t cause any non-recoverable failures to the consumers; provided retry strategy is in place.\nIs it zero downtime? Zero downtime need not mean every request succeeds on the first attempt. If the system remains available and failures are recoverable, it meets the zero-downtime requirement. So — Yes.\nConfirmation from Microsoft Question and Answer Forum To validate my understanding, I reached out to the MS Q\u0026amp;A forum and got a response consistent to the above understanding.\nHere is the link to the question in the forum that has the official response.\nBottom line — API specification and Policy updates are zero downtime.\n","permalink":"https://gurupasupathy.com/post/2026-06-05_apim_policy_update_0downtime/","summary":"\u003cp\u003eDoes APIM support zero downtime deployment? — To answer this question, multiple factors need to be ascertained, like, What is the SKU? Have you opted for Availability zones? etc. In fact, the question needs to be qualified further. What do you mean by zero downtime deployment?\u003c/p\u003e\n\u003cp\u003eIn the case of APIM, there are infrastructure changes and then there are gateway configuration changes like API specifications and policies. 
So, the answer depends on — SKU, AZ, “what” kind of changes\u003c/p\u003e","title":"API Specification and Policy Updates in Azure APIM Are Zero Downtime"},{"content":"Photo by Matt Halls on Unsplash\nIntroduction I have been using the DefaultAzureCredential class for a long time without understanding how it works. So, I jotted down my notes and learnings in this write-up for future me — and maybe you will find it useful too.\nTokenCredential TokenCredential is the abstract base class representing a source of authentication tokens for Azure services. Many classes derive from TokenCredential but the most interesting ones are DefaultAzureCredential and ChainedTokenCredential.\nI’m using package version: Azure.Identity v1.20.0\nDefaultAzureCredential This class is a pre-built chain covering the most common authentication methods. When using DefaultAzureCredential to acquire a token, the class attempts to acquire a token via each of the below credentials, in the following order, stopping when one provides a token:\nEnvironmentCredential\nWorkloadIdentityCredential\nManagedIdentityCredential\nVisualStudioCredential\nVisualStudioCodeCredential (enabled by default for SSO with VS Code on supported platforms when Azure.Identity.Broker is installed)\nAzureCliCredential\nAzurePowerShellCredential\nAzureDeveloperCliCredential\nInteractiveBrowserCredential (not included by default; can use brokered authentication if Azure.Identity.Broker is installed)\nsource: DefaultAzureCredential Class (Azure.Identity) — Azure for .NET Developers | Microsoft Learn\nDefaultAzureCredential with Exclusion DefaultAzureCredential also supports options that allow you to exclude credentials from evaluation. 
This is useful if you don’t want to use certain credentials. For example, when running my function locally, I don’t want to use the Visual Studio or Visual Studio Code credentials because I prefer AzureCliCredential.\nnew DefaultAzureCredential(\nnew DefaultAzureCredentialOptions\n{\nExcludeVisualStudioCodeCredential = true,\nExcludeVisualStudioCredential = true\n});\nChainedTokenCredential In some cases, you will know exactly which credentials you want to use. ChainedTokenCredential is very useful in such cases. It evaluates only the credentials you explicitly specify, in the order provided. I use this locally. For example, below I choose to use only CLI and VS credentials when my function is running locally.\nnew ChainedTokenCredential(\nnew AzureCliCredential(),\nnew VisualStudioCredential()\n);\nWhy Managed Identity fails locally but works in Azure DefaultAzureCredential attempts ManagedIdentityCredential (which is unavailable locally — IMDS timeout) and falls through to developer credentials — VS, CLI, etc. (refer to the list above). The first credential in the chain that can successfully acquire a token is used.\nNote: DefaultAzureCredential evaluates EnvironmentCredential, then WorkloadIdentityCredential, followed by ManagedIdentityCredential.\nThere is no native way to emulate or impersonate a managed identity locally. IMDS (169.254.169.254) is a hypervisor-level endpoint that only exists on Azure compute. It is physically not present on your laptop. The common alternative would be to use a service principal with similar privileges as the UAMI to test your function.\nIn Azure: The chain gets to ManagedIdentityCredential, IMDS responds, token acquired. 
Everything below it never runs.\nLocally: IMDS doesn’t exist, so ManagedIdentityCredential times out and falls through.\nWhen DefaultAzureCredential is used, the evaluation would look like this (assuming none of the credentials are able to provide a token) —\n//When running locally there is no IMDS to supply a managed identity token\n//Assuming VS and other credentials don’t have access to the resource.\n//This is how DefaultAzureCredential evaluates the chain\nEnvironmentCredential → skipped (env vars not set)\nWorkloadIdentityCredential → skipped (not configured)\nManagedIdentityCredential → unavailable/failure (no IMDS endpoint locally)\nVisualStudioCredential → failed\nVisualStudioCodeCredential → failed\nAzureCliCredential → failed\nAzurePowerShellCredential → failed\nAzureDeveloperCliCredential → failed\nInteractiveBrowserCredential → Not included by default (must be explicitly enabled)\nHow to configure for local debugging One approach is to use a factory that returns different credential implementations depending on the execution environment.\nFor example, when the environment is local, a factory can return a DefaultAzureCredential where you can exclude Visual Studio and Visual Studio Code credentials if you favour AzureCliCredential. Or, better still, if you want to use only Azure CLI or VS credentials, it can return a ChainedTokenCredential with just those two credentials, as shown below\nnew ChainedTokenCredential(\nnew AzureCliCredential(),\nnew VisualStudioCredential()\n)\nWhen the environment is Azure, it can just return a DefaultAzureCredential instance or a ChainedTokenCredential as discussed earlier if you are sure about the credential you want to use. If you want to use a specific credential, it can be used directly without DefaultAzureCredential or ChainedTokenCredential. 
For example, here I’m using a specific credential class —\nnew ManagedIdentityCredential(\nManagedIdentityId.FromUserAssignedClientId(\"\u003cyour-uami-ClientId\u003e\"))\nA sample flow will look as below when you use a ChainedTokenCredential as shown previously —\nAzureCliCredential → acquire token - SUCCESS\nVisualStudioCredential → skipped\nNotice that only the two credentials mentioned in the ChainedTokenCredential chain are evaluated.\nAZURE_CLIENT_ID influence in identity selection Many Azure resources can have both System Assigned Managed Identity (SAMI) and User Assigned Managed Identity (UAMI). It is crucial to understand how the AZURE_CLIENT_ID environment variable influences how Azure SDK authentication selects a managed identity. This is not always obvious, and I could not find it clearly documented anywhere.\nAZURE_CLIENT_ID set?\n│\n├── YES\n│ │\n│ └── Which credential in code?\n│ │\n│ ├── DefaultAzureCredential()\n│ │ └── ✅ UAMI (via AZURE_CLIENT_ID)\n│ │\n│ ├── ManagedIdentityCredential(id)\n│ │ └── ✅ UAMI (via explicit id, ignores AZURE_CLIENT_ID)\n│ │\n│ └── ManagedIdentityCredential()\n│ └── ⚠️ SAMI (ignores AZURE_CLIENT_ID)\n│\n└── NO\n│\n└── Which credential in code?\n│\n├── DefaultAzureCredential()\n│ └── ⚠️ SAMI\n│\n├── ManagedIdentityCredential(id)\n│ └── ✅ UAMI (via explicit id)\n│\n└── ManagedIdentityCredential()\n└── ⚠️ SAMI (no id provided)\nKey rule: ManagedIdentityCredential() does not use AZURE_CLIENT_ID to select a user-assigned managed identity. Only DefaultAzureCredential does.\nNote: the flowchart assumes a System Assigned Managed Identity is present. In cases where SAMI is absent and no UAMI is explicitly provided, token acquisition will fail.\nAuthentication between a Function App and its AzureWebJobsStorage is an independent flow not covered by the flowchart above. 
See [Using Managed Identity for Function App Authentication with its Storage Account] for a detailed walkthrough.\nSummary DefaultAzureCredential is environment-aware by design — the same code uses managed identity in Azure and falls through to developer credentials locally. This means local failures don\u0026rsquo;t always predict Azure failures, and the identity that succeeds locally may be in a different tenant than your Azure resources. For local testing against tenant-specific resources, ensure az login --tenant \u0026lt;tenant-id\u0026gt; is used explicitly, not just any az login. To test managed identity behaviour you must deploy, or substitute a service principal with matching roles via EnvironmentCredential.\nReference — Authentication best practices with the Azure Identity library for .NET — .NET | Microsoft Learn\n","permalink":"https://gurupasupathy.com/post/2026-05-20_choosing-the-right-tokencredential-and-how-azure-client-id-influences-identity-selection/","summary":"\u003cp\u003ePhoto by Matt Halls on Unsplash\u003c/p\u003e\n\u003cp\u003e\u003cimg loading=\"lazy\" src=\"https://cdn-images-1.medium.com/max/800/1*Z-K3Yu80zshHVrnXpLJ0FA.jpeg\"\u003e\u003c/p\u003e\n\u003cp\u003ePhoto by \u003ca href=\"https://unsplash.com/@matthalls?utm_source=unsplash\u0026amp;utm_medium=referral\u0026amp;utm_content=creditCopyText\"\u003eMatt Halls\u003c/a\u003e on \u003ca href=\"https://unsplash.com/photos/a-very-tall-building-with-lots-of-windows-KeQiUCKNqOc?utm_source=unsplash\u0026amp;utm_medium=referral\u0026amp;utm_content=creditCopyText\"\u003eUnsplash\u003c/a\u003e\u003c/p\u003e\n\u003ch4 id=\"introduction\"\u003eIntroduction\u003c/h4\u003e\n\u003cp\u003eI have been using the DefaultAzureCredential class for a long time without understanding how it works. 
So, I jotted down my notes and learnings in this write-up for future me — and maybe you will find it useful too.\u003c/p\u003e\n\u003ch4 id=\"tokencredential\"\u003eTokenCredential\u003c/h4\u003e\n\u003cp\u003e\u003ccode\u003eTokenCredential\u003c/code\u003e is the abstract base class representing a source of authentication tokens for Azure services. Many classes derive from TokenCredential but the most interesting ones are DefaultAzureCredential and ChainedTokenCredential.\u003c/p\u003e","title":"Choosing the Right TokenCredential and How AZURE CLIENT ID Influences Identity Selection — A…"},{"content":"Recently, while setting up a Function App to use User Assigned Managed Identity (UAMI) to authenticate to its AzureWebJobsStorage, I encountered a SyncTrigger failure.\nI checked whether the UAMI had the necessary RBAC roles to work on AzureWebJobsStorage — it had. So, I wasn’t sure what the issue was.\nAnalyzing further, I realized I had skipped a few mandatory variable settings to enable UAMI based authentication to AzureWebJobsStorage (setting the environment variable AzureWebJobsStorage__accountName alone does not suffice).\nSteps to enable UAMI access to AzureWebJobsStorage Enabling UAMI access to AzureWebJobsStorage involves changes in Terraform (when the Function App is created), the App Settings (Environment variables) and finally the Role Based Access.\nTerraform\nIf for some reason you want to use UAMI to authenticate with AzureWebJobsStorage, then the Terraform block **functionAppConfig.deployment.storage.authentication** should look like below\nNote: I am using Flex Consumption tier\nauthentication = {\ntype = \"userassignedidentity\"\nuserAssignedIdentityResourceId = \"\"\n}\nThis tells the platform to use UAMI for the deployment package blob container — the part that isn’t controlled by app settings.\nApp settings\nOnce the Function App is deployed with userassignedidentity as the authentication type (Terraform), ensure the below 
variables are set in the Function App’s Environment variables\nAzureWebJobsStorage__accountName = \nAzureWebJobsStorage__credential = managedidentity\nAzureWebJobsStorage__clientId = \nAll three settings are mandatory.\nRBAC\nThis is the final bit. We have the Function App deployed and the environment variables set; next, the UAMI needs privilege to access the storage account.\nAssign the Storage Blob Data Owner role to the UAMI on the storage account\nWith these three changes, your Function App will authenticate with its AzureWebJobsStorage using UAMI.\nCaveat: Although this works, the issue with this approach is that all services that are assigned this UAMI will gain access to the function’s storage account. This is not ideal if many services share the same UAMI. The better option is to use System Assigned Managed Identity (SAMI) for authentication between the Function App and its storage account. For the rest of the outbound calls that the functions might make, use UAMI.\nUsing System Assigned Managed Identity To use SAMI, just set AzureWebJobsStorage__accountName — SAMI is the default, no additional settings needed. Next, give the SAMI the Storage Blob Data Owner role on the storage account. If you are using Terraform to deploy, the authentication block of the Function App will look like this —\nauthentication = {\ntype = \"systemassignedidentity\"\n}\nSAMI is my preferred method for authentication with the AzureWebJobsStorage for the reasons already discussed in the caveat section.\nSummary Configuring a Function App to authenticate with its AzureWebJobsStorage using managed identity requires changes at three levels — Terraform, app settings, and RBAC — and all three must be consistent with each other. For UAMI, all three AzureWebJobsStorage__* settings are mandatory; omitting any one of them will cause the runtime to fail. 
However, personally I feel UAMI for AzureWebJobsStorage is rarely the right choice — since UAMI is a shared identity, every service assigned to it inherits access to the storage account. SAMI, which requires only AzureWebJobsStorage__accountName and a single role assignment, is the simpler and safer default for this use case.\nReference — Use User managed identity to replace connection string in”AzureWebJobsStorage” for function apps | Microsoft Community Hub\n","permalink":"https://gurupasupathy.com/post/2026-05-19_using-mi-for-function-app-authentication-with-its-storage-account/","summary":"\u003cp\u003eRecently, while setting up a Function App to use User Assigned Managed Identity (UAMI) to authenticate to its \u003cstrong\u003eAzureWebJobsStorage\u003c/strong\u003e I encountered \u003ccode\u003eSyncTrigger\u003c/code\u003efailure.\u003c/p\u003e\n\u003cp\u003eI checked whether the UAMI had necessary RBAC roles to work on \u003cstrong\u003eAzureWebJobsStorage\u003c/strong\u003e — it had. So, I wasn’t sure what the issue was.\u003c/p\u003e\n\u003cp\u003eAnalyzing further, I realized I had skipped a few mandatory variable settings to enable UAMI based authentication to \u003cstrong\u003eAzureWebJobsStorage\u003c/strong\u003e (setting the environment variable \u003ccode\u003eAzureWebJobsStorage__accountName\u003c/code\u003e alone does not suffice)\u003c/p\u003e","title":"Using Managed Identity for Function App Authentication with its Storage account"},{"content":"In the previous parts, we created a primary and secondary domain controller and tested the domain join from Windows client VM. In this part, we will domain-join a Linux VM to the domain controllers we created. The main purpose is to introduce a non-Windows system into the domain to test Kerberos authentication against Active Directory. 
We will —\nProvision a new Linux VM\nAssign the DC IP\nInstall Linux Kerberos client tool\nJoin the domain\nValidation\nThe fundamental domain join mechanics are the same for Windows and Linux. The underlying authentication protocol (Kerberos) is identical for both — the difference is integration depth. Windows has native AD support built in, whereas Linux requires explicit configuration via tools like realmd, SSSD, and the Kerberos client utilities. So, let’s get started with a new VM.\nProvision a new Ubuntu VM\nWe will create a VM based on Ubuntu 22.04 LTS for our virtual network. You need to download the Ubuntu iso and create a VM using virt-manager following the regular VM creation process. Once the VM is up and running, let’s start with some connectivity checks.\nConnectivity Checks:\nPing DCs:\nping 192.168.122.10\nping 192.168.122.11\nThis will work as long as the Linux VM is created in the same network range as the Windows client and the DCs. If this is not the case, ensure you attach the VM to the relevant network using virt-manager (this can happen if you have multiple virtual networks running on your system).\nAdd the DC IP to resolved.conf\nThis step is similar to what we did for the Windows client; only the mechanism differs. The Linux VM should use the Domain Controller’s IP as its DNS server. A fresh Linux VM will have its DNS pointing to the local stub resolver at 127.0.0.53\nIf you remember, we used the GUI to change the preferred DNS for the VM in Windows. In Linux, the DNS server details reside in /etc/systemd/resolved.conf\nOn modern Ubuntu systems, the file /etc/resolv.conf is not meant to be edited directly because it is automatically generated and managed by the systemd-resolved service. Any manual changes you make will be overwritten. 
Instead, you should configure DNS settings in the source of truth, typically /etc/systemd/resolved.conf (or via Netplan/NetworkManager depending on your setup), and then restart the service.\nNote — Even if /etc/resolv.conf appears stable after manual edits, it is still managed by the system in most modern Ubuntu setups and may be overwritten on reboot or network changes. Always configure DNS through systemd-resolved or Netplan for reliability\nEdit the resolved.conf to set the DC’s IP as the DNS server for the Linux VM\nsudo nano /etc/systemd/resolved.conf\n[Resolve]\n# Some examples of DNS servers which may be used for DNS= and FallbackDNS=:\n# Cloudflare: 1.1.1.1#cloudflare-dns.com 1.0.0.1#cloudflare-dns.com 2606:4700:4\u0026gt;\n# Google: 8.8.8.8#dns.google 8.8.4.4#dns.google 2001:4860:4860::8888#dns.go\u0026gt;\n# Quad9: 9.9.9.9#dns.quad9.net 149.112.112.112#dns.quad9.net 2620:fe::fe#d\u0026gt;\nDNS=192.168.122.10 192.168.122.11\n#FallbackDNS=\nDomains=hybrid.local\n#DNSSEC=no\n#DNSOverTLS=no\n#MulticastDNS=no\n#LLMNR=no\n#Cache=no-negative\n#CacheFromLocalhost=no\n#DNSStubListener=yes\n#DNSStubListenerExtra=\n#ReadEtcHosts=yes\n#ResolveUnicastSingleLabel=no\n#StaleRetentionSec=0\nRestart and check that value has persisted.\nsudo systemctl restart systemd-resolved\nVerify DNS resolution:\nNow that the DNS has been updated try nslookup hybrid.local The expected output is\nServer: 127.0.0.53\nAddress: 127.0.0.53#53\nName: hybrid.local\nAddress: 192.168.122.10\nName: hybrid.local\nAddress: 192.168.122.11\nThe server shown as 127.0.0.53 is the systemd-resolved stub — this is expected, as systemd-resolved intercepts all DNS queries locally before forwarding them upstream. Queries are forwarded to the configured upstream DNS servers (your DCs). The returned addresses confirm that DC DNS is now authoritative for hybrid.local\nInstall Kerberos Client Tools From this step onwards, the process of domain join is different from Windows. 
While the Windows client has necessary Kerberos modules, Linux client should be enabled to communicate with Active Directory using Kerberos. To do that install the Kerberos client tool, [krb5-user](https://web.mit.edu/kerberos/krb5-1.4/krb5-1.4/doc/krb5-user.html)\nWhat does krb5-user really do? It installs client end tools that enable communication with a server using Kerberos. In Kerberos, the password is never sent over the wire. Instead, it is converted into a cryptographic key, which is then used to encrypt a timestamp during pre-authentication, which will subsequently be decrypted by the server. The validation is the ability of the server to “decrypt” the client request using its version of the stored key — and this is precisely why clock skew breaks Kerberos. If the timestamp is outside the allowed window, the DC rejects it regardless of whether decryption succeeded.\nNotes:\nWhen the DC was promoted, the admin password you provided was hashed using DC’s preferred ‘method(s)’ (etype) and stored in ntds.dit (e.g. AES256 key stored against account)\nWhen a Linux VM is created the krb5.conf file defines supported etypes\nWhen the Linux VM wants to authenticate with the Windows AD, you initiate the Kerberos flow by running kinit\nClient VM’s kinit sends → AS-REQ to DC saying, “I am administrator, The Client VM supports AES256, AES128, RC4”\nDC responds saying “I need pre-auth, and for this account I use AES256”\nClient then derives a key from the password (using the required encryption type, e.g., AES256) → uses this key to encrypt timestamp in Client VM. 
This is sent as AS-REQ (with pre-auth): username + AES256-encrypted timestamp\nThe DC decrypts the timestamp in this AS-REQ with the AES256 key stored against this user, validates that the timestamp is within the allowed clock skew window (default 5 minutes) → issues TGT\nThe default lifetime of a Kerberos TGT is 10 hours\nLet’s proceed with the setup.\nsudo apt update\nsudo apt install krb5-user -y\nPrompt: Enter default realm → HYBRID.LOCAL (uppercase).\nIn the next prompt, provide the FQDNs of your DCs separated by a space\nAnd finally, when asked for the primary DC, provide your primary DC’s FQDN\nRun dpkg -l | grep krb5-user\nIt should list krb5-user as installed.\nConfigure /etc/krb5.conf\nWe have installed the necessary client tools to enable Kerberos based exchange from the Linux VM. Next, the Linux VM’s Kerberos client must point to the Key Distribution Centers in the Active Directory. Open the file with sudo nano /etc/krb5.conf and update it as described below\n[libdefaults]\ndefault_realm = HYBRID.LOCAL\ndns_lookup_realm = false\ndns_lookup_kdc = true\n[realms]\nHYBRID.LOCAL = {\nkdc = 192.168.122.10\nkdc = 192.168.122.11\nadmin_server = 192.168.122.10\n}\n[domain_realm]\n.hybrid.local = HYBRID.LOCAL\nhybrid.local = HYBRID.LOCAL\nConfirm your changes have persisted — cat /etc/krb5.conf\nNext, we will validate that AD is issuing Kerberos tickets to us by requesting a Ticket Granting Ticket from the AD KDC. Run kinit Administrator@HYBRID.LOCAL and enter the Administrator password.\nNext run klist\nYou should see a ticket as below:\nDefault principal: Administrator@HYBRID.LOCAL\nValid starting Expires Service principal\n03/06/26 19:11:11 03/07/26 05:11:11 krbtgt/HYBRID.LOCAL@HYBRID.LOCAL\nIf you are curious about the attributes of the ticket, run\nklist -f # shows flags like forwardable, renewable\nklist -e # shows encryption types\nJoin Linux to Domain\nWe have the foundation required for a domain join. 
Now, we will install the required packages for the domain join:\nsudo apt install realmd sssd sssd-tools adcli samba-common-bin oddjob oddjob-mkhomedir -y\nrealmd — discovers which domains or realms it can use or configure. It can discover and identify Active Directory domains by looking up the appropriate DNS SRV records.\nsssd — System Security Services Daemon. After the join, this is what runs continuously to handle authentication requests — it talks to the DC for login, group membership, sudo rules etc. It is the long-running engine.\nsssd-tools — CLI utilities for sssd (sssctl, sss_override etc.) — useful for cache flushing and diagnostics.\nadcli — Active Directory CLI. realmd uses this under the hood to perform the low-level AD join operations (creating the computer object in AD, setting up the machine account).\nsamba-common-bin — provides Samba utilities such as net that realmd can lean on for certain AD operations (wbinfo ships separately, in the winbind package).\noddjob — a D-Bus service that runs privileged helper tasks on behalf of other services. sssd uses it to do things it can’t do as its own user.\noddjob-mkhomedir — the specific oddjob helper that automatically creates a home directory the first time a domain user logs into the Linux machine. 
Without this, a domain user authenticates successfully but lands with no home directory.\nrealmd + adcli → join-time (one-off operation)\nsssd + sssd-tools → runtime (ongoing authentication)\noddjob + mkhomedir → login-time helper (home dir creation)\nsamba-common-bin → shared plumbing both layers use\nVerify the configuration and connectivity to the domain controller\nsudo realm discover hybrid.local\nhybrid.local\ntype: kerberos\nrealm-name: HYBRID.LOCAL\ndomain-name: hybrid.local\nconfigured: no\nserver-software: active-directory\nclient-software: sssd\nrequired-package: sssd-tools\nrequired-package: sssd\nrequired-package: libnss-sss\nrequired-package: libpam-sss\nrequired-package: adcli\nrequired-package: samba-common-bin\nThe above proves network connectivity to the DC, correct DNS resolution, and that AD is responding to discovery queries.\nJoin the domain as Administrator\nsudo realm join --user=Administrator hybrid.local\nAfter the domain join, verify the configuration and connectivity to the domain controller\nsudo realm list\nhybrid.local\ntype: kerberos\nrealm-name: HYBRID.LOCAL\ndomain-name: hybrid.local\nconfigured: kerberos-member\nserver-software: active-directory\nclient-software: sssd\nrequired-package: sssd-tools\nrequired-package: sssd\nrequired-package: libnss-sss\nrequired-package: libpam-sss\nrequired-package: adcli\nrequired-package: samba-common-bin\nlogin-formats: %U@hybrid.local\nlogin-policy: allow-realm-logins\nRun id testuser1@HYBRID.LOCAL and you will see that testuser1 is looked up from the Domain Controller by the Linux client.\nEnable automatic home directory creation:\nsudo pam-auth-update\nEnable “Create home directory on login”\nVerification\nAs a final test, you should be able to log in successfully using one of the test users you created and used for the Windows client (testuser1@HYBRID.LOCAL). 
Notice that the home directory for testuser1 is created.\nAlso, login to the DC and see a new Linux VM getting added there under hybrid.local \u0026gt; Computers\nSummary This concludes the part of the series where we established an on-premises identity foundation. In this series, so far, we have established\nA functioning Active Directory forest (hybrid.local) with two domain controllers Multi-master replication verified across both DCs — SYSVOL, NETLOGON, and directory objects FSMO roles identified and accounted for A Windows client and a Linux VM both domain-joined and authenticated via Kerberos DNS working end-to-end: internal resolution via the DC, external resolution via the forwarder Up next, a S2S VPN tunnel with Azure which would complete the hybrid connectivity foundation\n","permalink":"https://gurupasupathy.com/post/2026-05-02_building-hce-part-4--identity-domain-joining-a-linux-vm/","summary":"\u003cp\u003eIn the \u003ca href=\"https://pasupathy-guru.medium.com/handson-building-hybrid-cloud-environment-part-3-identity-second-dc-and-dc-replication-3f9ae9e5c651?source=friends_link\u0026amp;sk=cde84a39160f76f4d64ef3e842b38e8b\"\u003eprevious\u003c/a\u003e parts, we created a primary and secondary domain controller and tested the domain join from Windows client VM. In this part, we will domain-join a Linux VM to the domain controllers we created. The main purpose is to introduce a non-Windows system into the domain to test Kerberos authentication against Active Directory. 
We will —\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003eProvision a new Linux VM\u003c/li\u003e\n\u003cli\u003eAssign the DC IP\u003c/li\u003e\n\u003cli\u003eInstall Linux Kerberos client tool\u003c/li\u003e\n\u003cli\u003eJoin the domain\u003c/li\u003e\n\u003cli\u003eValidation\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003e\u003cimg loading=\"lazy\" src=\"https://cdn-images-1.medium.com/max/1200/1*0xW3AuDQsHnXJR5dfYIdXw.png\"\u003e\u003c/p\u003e","title":"HandsOn — Building Hybrid Cloud Environment — Part 4— Identity — Domain-Joining a Linux VM and…"},{"content":"Previously, we created a domain controller (DC), joined a test virtual machine to the newly created domain and verified the authentication of a test user from client VM. In this part, we will build redundancy into our environment by introducing a second domain controller.\nActive Directory (AD) is designed for multi-master replication, meaning multiple domain controllers hold a copy of the directory database.\nAdding a second DC provides:\nHigh Availability — authentication continues if one DC fails Load Distribution — clients can authenticate against different DCs Replication Redundancy — AD database changes replicate automatically In this part, we will:\nProvision a secondary domain controller Assign a static IP Join the domain Install AD DS and promote the server Verify replication and health Let’s get started and add the second domain controller.\nProvision Secondary domain controller For the second DC, create another Windows Server 2022 VM using the same process outlined in the first part of this series.\nEnsure that the newly created VM is attached to the same libvirt virtual network so it can reach the primary DC\nRun ipconfig /all to confirm the network range, and gateway IP are the same as the primary DC. If you are following along then the gateway should be 192.168.122.1\nAssign Static IP Running ipconfig /all, you will notice a preferred IPv4 address for this VM. It is 192.168.122.145in my case. 
This IP was handed out by DHCP (dnsmasq) when the VM was created and attached to the virtual network. It can change the next time you restart the virtual network, and the VMs that have joined the domain will then be unable to reach the DC. To avoid this, we will assign a static IP to this VM.\n1. Open Server Manager\n2. Click Local Server\n3. Before setting the static IP, rename the computer to HCE-DC02 and restart.\n4. After the restart, go to Local Server and click the link next to Ethernet\n5. In the pop-up, right-click your Ethernet adapter → Properties\n6. Double-click Internet Protocol Version 4 (TCP/IPv4)\n7. Select the Use the following IP address option and enter the below values:\nIP address: 192.168.122.11 - The static IP we have chosen for the secondary DC\nSubnet mask: 255.255.255.0\nDefault gateway: 192.168.122.1 - Bridge’s IP\n8. Select Use the following DNS server addresses\nNote: Before executing this step, do a small test. Run nslookup hybrid.local and you will notice that the DNS query is sent to the gateway 192.168.122.1. dnsmasq, which runs on the gateway, cannot answer a query about hybrid.local and forwards it up the chain — to your WiFi router, then to your ISP’s DNS. None of them have ever heard of hybrid.local because it is not a public domain — it exists only inside the primary DC’s DNS. The query times out somewhere in that chain and returns nothing useful.\nNow, the secondary DC needs to rely on the primary DC for name resolution of the domains managed by the primary DC. In our case, hybrid.local is visible only through the primary DC, so for the secondary DC to reach other virtual machines in the hybrid.local domain, it must consult the primary DC’s DNS. 
That’s why you must set the Preferred DNS server as 192.168.122.10\nSet Preferred DNS server: 192.168.122.10\nOnce the above steps are completed, check if the static IP is updated by running ipconfig\nNote on VM rename — Rename the server before promotion — renaming a DC after the fact is not recommended.\nNow, we can verify domain controller discovery. Run nslookup in interactive mode\nnslookup\nset type=SRV\n_ldap._tcp.dc._msdcs.hybrid.local\nThe SRV record should return the hostname of DC1.\nJoin the Domain\nAt this point, the VM is able to resolve the primary DC, and it has a static IP assigned. We just need one more step before promoting this VM to secondary DC. It must first be a domain member server.\nUnlike the primary DC, which created the domain during promotion, the secondary DC is joining a domain that already exists. It needs a computer object and an established secure channel before the promotion wizard can authenticate against the existing domain and begin replication.\nRun PowerShell as Administrator:\nAdd-Computer -DomainName hybrid.local -Credential HYBRID\\Administrator -Restart\nAfter reboot, log in as hybrid\\Administrator\nTo verify domain membership, run whoami\nThe current domain\\user should be displayed as hybrid\\administrator\nInstall Active Directory Domain Services Role\nNow that the VM (it’s not a DC yet) has joined the domain, the next step is to make it a domain controller.\nOpen Server Manager \u0026gt; Click Manage (top right) \u0026gt; Click Add Roles and Features \u0026gt; Click Next until you reach Server Roles \u0026gt; Check: Active Directory Domain Services \u0026gt; When prompted: Click Add Features \u0026gt; Click Next until Install \u0026amp; Click Install\nPromote domain controller\nAfter installation completes, click the notification flag. Select: Promote this server to a domain controller and follow the wizard. 
The server will reboot automatically after installation.\nNote on deployment configuration —\nWhile installing, pay attention to these attributes:\nWhen the wizard prompts for the deployment operation, select: Add a domain controller to an existing domain\nProvide the domain name as hybrid.local\nDomain controller options —\na. check Domain Name System (DNS) server and Global Catalog\nb. uncheck Read only domain controller (RODC)\nc. Set a Directory Services Restore Mode (DSRM) password\nIgnore the DNS delegation warning.\nIn Additional Options, set Replicate from: to HCE-DC01.hybrid.local\nVerify Both Domain Controllers Exist\nThe secondary domain controller is set up now. Run the following validation to confirm the promotion succeeded.\nOn both DCs, open Active Directory Users and Computers and navigate to Domain Controllers. You should now see HCE-DC01 and HCE-DC02\nThis confirms both DCs are part of the domain. We have successfully set up high availability for the domain controllers.\nVerify Active Directory Replication\nActive Directory replicates directory changes between the two controllers automatically. You can check that by running repadmin /replsummary\nExample output:\nBeginning data collection for replication summary, this may take a while:\n.....\nSource DSA largest delta fails/total %% error\nHCE-DC01 14m:44s 0 / 5 0\nHCE-DC02 02m:40s 0 / 5 0\nDestination DSA largest delta fails/total %% error\nHCE-DC01 02m:40s 0 / 5 0\nHCE-DC02 14m:44s 0 / 5 0\nInterpretation:\nlargest delta → how long since last replication\nfails/total → replication failures\nA healthy environment shows: 0 failures\nThis proves multi-master replication is working.\nAfter promotion completes and replication stabilizes, update DNS settings so each DC points to itself as primary and the other DC as secondary.\nVerify SYSVOL Replication\nGroup Policies live in the SYSVOL folder and must replicate between DCs. 
Check that the share exists by running net share\nIt is a quick sanity check.\nThe SYSVOL and NETLOGON shares are only published by Windows when the DC considers itself healthy and SYSVOL replication is complete. Their presence confirms that DFS-R has done its job and this DC is ready to serve Group Policy to domain members.\nLook for SYSVOL and NETLOGON; if they are present, GPO replication is working fine.\nIdentify FSMO Role Holders\nEven though AD supports multi-master writes, some operations must be handled by a single role owner to avoid conflicts. These are the FSMO roles (Flexible Single Master Operations).\nCheck which server holds them by running netdom query fsmo\nExample output:\nSchema master HCE-DC01\nDomain naming master HCE-DC01\nPDC HCE-DC01\nRID pool manager HCE-DC01\nInfrastructure master HCE-DC01\nIn small environments like this, all roles may remain on the first DC.\nNote: multi-master covers most directory writes; FSMO roles are for the specific operations where a single authority is required to avoid conflicts.\nWhy Multiple Domain Controllers Matter\nWith two DCs:\nBoth hold a replicated copy of the Active Directory database\nClients discover them through DNS SRV records\nClients choose a DC based on site proximity and priority\nIf one DC is offline, clients automatically fail over\nAuthentication flow now becomes:\nClient\n↓\nDNS SRV query\n↓\nList of Domain Controllers\n↓\nClient selects reachable DC\n↓\nKerberos authentication\nThis is why clients never hardcode a Domain Controller IP. 
DNS provides the dynamic discovery layer.\nSummary In this part, we introduced a second domain controller and validated replication, establishing high availability for Active Directory.\nCurrent lab state:\nHCE-DC01 → First domain controller\nHCE-DC02 → Additional domain controller\nClient VM → Domain joined\nThis completes the Active Directory redundancy layer making it resilient.\nIn the next part, we will integrate a Linux VM and validate Kerberos-based authentication, extending identity beyond Windows systems.\n","permalink":"https://gurupasupathy.com/post/2026-04-24_building-hce-part-3--identity-additional-dc-and-replication/","summary":"\u003cp\u003e\u003ca href=\"https://pasupathy-guru.medium.com/handson-building-hybrid-cloud-environment-part-2-identity-on-premises-domain-controller-1152903ea89b?source=friends_link\u0026amp;sk=1e2563a1f7433d16441db694f87af581\"\u003ePreviously\u003c/a\u003e, we created a domain controller (DC), joined a test virtual machine to the newly created domain and verified the authentication of a test user from client VM. 
In this part, we will build redundancy into our environment by introducing a second domain controller.\u003c/p\u003e\n\u003cp\u003eActive Directory (AD) is designed for \u003cstrong\u003emulti-master replication\u003c/strong\u003e, meaning multiple domain controllers hold a copy of the directory database.\u003c/p\u003e\n\u003cp\u003eAdding a second DC provides:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eHigh Availability\u003c/strong\u003e — authentication continues if one DC fails\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eLoad Distribution\u003c/strong\u003e — clients can authenticate against different DCs\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eReplication Redundancy\u003c/strong\u003e — AD database changes replicate automatically\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eIn this part, we will:\u003c/p\u003e","title":"HandsOn — Building Hybrid Cloud Environment — Part 3— Identity — Additional DC and Replication"},{"content":"In the first part, we laid the foundation for the hybrid cloud environment. Now we have a virtual network with VM running Windows Server 2022 Evaluation. In this part, we will focus on adding the Identity plane to the hybrid cloud environment by introducing a domain controller and creating an Active Directory structure. We will create a client VM, domain join it and make sure a domain user is able to login\nWe will be following the below sequence.\nPromote the Virtual Machine hce-dc01 , created in part 1 as primary domain controller and create domain, forest, and OU Create user accounts Domain join a Windows client Primary Domain Controller Configuration The official Microsoft documentation defines a domain controller as — “A domain controller is a server that is running a version of the Windows Server® operating system and has Active Directory® Domain Services installed.”\nFor a simple hybrid cloud environment, a domain controller is not mandatory. So, why do we need this? 
Some hybrid scenarios depend heavily on the on-premises side having an identity plane, for example AD Connect. To introduce the identity plane in our virtual network, we need a server to manage the domain, forest, OU, users, policies, user authentication, and policy enforcement. This will be our domain controller, and the VM we created in part 1 will be used for this purpose.\nBefore starting this configuration, we will run ip addr and note down the IP ranges of the virtual bridge and the Wi-Fi network. The output of this command is a list of all the interfaces on your box with their IP addresses. Notice that the virtual bridge, virbr0 (created when we set up the virtual network, refer to part 1), has the address 192.168.122.1/24 in my case, so the virtual network range is 192.168.122.0/24\nvirbr0: \u0026lt;BROADCAST,MULTICAST,UP,LOWER_UP\u0026gt; mtu 1500 qdisc noqueue state UP group default qlen 1000\nlink/ether 52:54:00:d9:a6:2b brd ff:ff:ff:ff:ff:ff\ninet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0\nvalid_lft forever preferred_lft forever\nLook for your Wi-Fi network range in the output. wlp58s0 is my Wi-Fi interface; its address is 192.168.1.106/24, so the Wi-Fi network range is 192.168.1.0/24\nwlp58s0: \u0026lt;BROADCAST,MULTICAST,UP,LOWER_UP\u0026gt; mtu 1500 qdisc noqueue state UP group default qlen 1000\nlink/ether 04:ed:33:e3:9d:f1 brd ff:ff:ff:ff:ff:ff\ninet 192.168.1.106/24 brd 192.168.1.255 scope global dynamic noprefixroute wlp58s0\nvalid_lft 84447sec preferred_lft 84447sec\ninet6 fe80::d291:8b3e:4e0:cc87/64 scope link noprefixroute\nvalid_lft forever preferred_lft forever\nvirbr0 gateway is 192.168.122.1\nwifi router gateway is 192.168.1.1\nI will pick an IP from the virbr0 range and assign it to the newly created VM, which is going to be our primary domain controller.\nStatic IP for Domain Controller\nThe reason we need a static IP for the domain controller is to ensure that the domain controller is reachable even if the virtual network restarts. 
dnsmasq, which we briefly touched on in part 1, is responsible for assigning IPs to the virtual machines. When the virtual network restarts, it acts as the DHCP server and starts handing out IPs to all the VMs connected to virbr0. If the domain controller gets a different IP when the virtual network restarts, the VMs that have joined the domain will no longer be able to reach it. To avoid this, we will assign a static IP to the domain controllers.\nNow that we know the IP range of the virtual network, I will pick an IP, say 192.168.122.10, as my primary domain controller’s IP.\nLog in to the Windows Server we created in part 1 and follow the steps below to set the static IP —\nBefore setting the static IP, rename the computer to HCE-DC01 if you haven’t done it already and restart.\n1. Open Server Manager\n2. Click Local Server\n3. Click the link next to Ethernet\n4. In the pop-up, right-click your Ethernet adapter → Properties\n5. Double-click Internet Protocol Version 4 (TCP/IPv4)\n6. Select the Use the following IP address option\n7. Enter the below values:\nIP address: 192.168.122.10 - The static IP we chose\nSubnet mask: 255.255.255.0\nDefault gateway: 192.168.122.1 - Bridge’s IP\n8. Select Use the following DNS server addresses\n9. Set Preferred DNS server: 192.168.122.10\n10. Click OK → Close all windows\nCheck if the static IP is updated by running ipconfig\nWhen you set up the first Domain Controller, it must use itself as its DNS server. Once the secondary domain controller is up, they should ideally cross-reference each other.\nNote on VM rename — Rename the server before promotion — renaming a domain controller after the fact is painful.\nInstall Active Directory Domain Services Role\nFor a server to perform the role of a domain controller, it needs certain capabilities. These capabilities include storage to persist the objects (forest, users, computers, groups, and policies), an authentication layer, and policy enforcement mechanisms. 
This is not an exhaustive list of capabilities. I have listed only those relevant to our hybrid environment right now.\nActive Directory Domain Services is a role that you install on your Windows Server to make it a domain controller. Here is the official definition of AD DS — “A directory is a hierarchical structure that stores information about objects on a network. A directory service, such as Active Directory Domain Services (AD DS), provides methods for storing directory data and making this data available to network users and administrators. For example, AD DS stores information about user accounts, such as names, passwords, phone numbers, and so on. AD DS also provides a way for authorized users on the same network to access this information.”\nRough steps to install AD DS\nOpen Server Manager \u0026gt; Click Manage (top right) \u0026gt; Click Add Roles and Features \u0026gt; Click Next until you reach Server Roles \u0026gt; Check: Active Directory Domain Services \u0026gt; When prompted: Click Add Features \u0026gt; Click Next until Install \u0026amp; Click Install\nI’m not going into the details of the installation as many resources document the process in detail.\nWith this, the virtual machine has the role it needs to act as a domain controller.\nPromote This Server to a Domain Controller\nInstalling the AD DS role and promoting the server are two distinct steps, and it’s easy to miss why. Here is the distinction that matters. What makes a server a domain controller is not what’s installed — it’s whether a valid, initialised NTDS.dit exists, the NTDS service is running against it, and the network knows where to find it via SRV records. Promotion is the act of going from capable to instantiated.\nOpen Server Manager \u0026gt; You should see a yellow triangle notification at top right. Click it. 
\u0026gt; Click: Promote this server to a domain controller\nNote on deployment configuration\nWhile installing, pay attention to these attributes:\nWhen the wizard prompts for the deployment operation, select: Add a new forest\nRoot domain name can be given as hybrid.local\nDomain controller options —\na. Forest functional level: leave default\nb. Domain functional level: leave default\nc. DNS Server should already be checked\nd. Global Catalog should be checked\ne. Do NOT check Read-Only DC\nf. Set a Directory Services Restore Mode (DSRM) password (Write this down somewhere safe.)\nIgnore the DNS delegation warning.\nWhen creating a new Active Directory forest, the setup process also asks for a NetBIOS name for the domain. NetBIOS is a legacy naming system that predates modern DNS-based Active Directory environments and was widely used in older Windows networks for computer and resource identification. The wizard will auto-fill HYBRID; leave it.\nAgain, I’m not providing detailed steps on every screen of the wizard; this process is well documented. Apart from the attributes I have mentioned above, the rest can be left at their default values. If you see any warnings when checking the prerequisites, you can ignore them. They will not have any effect on the environment we are building. We will revisit them in future if needed.\n⚠ After the installation, the server will reboot automatically.\nAfter the reboot, you will be able to log in as Administrator to the new domain\nNow, we have the domain controller up and ready. As a first test, try nslookup google.com. You will see that the domain controller fails to resolve this query. Let’s fix this next.\nAdding a DNS Forwarder\nAfter installing Active Directory Domain Services, the Domain Controller also becomes the authoritative DNS server for the new domain (hybrid.local). At this point the DNS server knows how to resolve internal Active Directory records such as domain controllers, LDAP services, and domain-joined machines. 
However, it has no knowledge of external internet domains like google.com or microsoft.com. When a domain-joined machine sends a DNS query for an external address, the request reaches the domain controller but cannot be resolved. Configuring a DNS forwarder solves this by instructing the DNS server to pass any unknown queries to an upstream resolver (for example, the home router or a public DNS server such as 8.8.8.8). The domain controller therefore resolves internal names itself and forwards everything else, allowing domain clients to access the internet while still using the domain controller as their primary DNS server.\nTo set up the forwarder, Open Server Manager \u0026gt; Tools → DNS \u0026gt; Double click on your server name \u0026gt; Right-click → Properties \u0026gt; Go to Forwarders tab \u0026gt; Click Edit \u0026gt; Add the virbr0 gateway IP 192.168.122.1\nIf you try the lookup again with nslookup google.com, it will resolve.\nWithout a forwarder, the DNS server attempts recursive resolution using root hints, which can introduce delays or timeouts in lab environments behind NAT. Configuring a forwarder provides a faster and more predictable path for resolving external names.\nCreating User Accounts\nNext, create normal domain users. On the domain controller, launch Active Directory Users and Computers, go to the Users folder and create two test users, testuser1 and testuser2. Set a password for each and enable the accounts.\nCreate a client VM to join the domain\nNow that the domain controller is ready, we can test if client VMs are able to join the new domain we created. To test the domain join, spin up a new Windows VM, our client VM. 
I created another instance of Windows Server 2022 as I did not want to download another ISO just for the testing.\nJoin client VM to Domain When you spin up a new client VM, its preferred DNS will be the gateway 192.168.122.1 (in line with the IP range of the virtual network).\nConfigure the Client VM’s DNS\nBefore joining the domain, its DNS server must point to the domain controller’s IP. This is because the domain hybrid.local is not publicly resolvable like google.com; it exists only in the domain controller\u0026rsquo;s DNS. If the client VM uses any other DNS server, it will fail to locate the domain controller and the domain join will not proceed.\nSet-DnsClientServerAddress -InterfaceAlias \u0026ldquo;Ethernet\u0026rdquo; -ServerAddresses 192.168.122.10\nIf the interface alias “Ethernet” does not exist, use the command below to look up the actual interface alias\nGet-NetAdapter | select Name, InterfaceAlias, Status\nTesting the Client\nStep 1 — Check the DNS server (from Client VM)\nOn the client VM, open PowerShell as Administrator and run:\nipconfig /all\nDNS Server should be 192.168.122.10\nStep 2 — Verify if nslookup resolves (from Client VM)\nOn the client VM, test domain controller discovery via DNS.\nnslookup _ldap._tcp.dc._msdcs.hybrid.local\nThe response should show that the test VM now uses the domain controller as its DNS server and that the DNS server is responding. We are good to join the domain.\nStep 3 — Check if SRV record is correct (from Client VM)\nRun nslookup in interactive mode. Inside the prompt, query the SRV record:\nset type=SRV\n_ldap._tcp.dc._msdcs.hybrid.local\nIf you see your domain controller hostname listed under svr hostname, DNS and SRV discovery are working.\nStep 4 — Test LDAP connectivity (from Client VM)\nFrom the client, test LDAP connectivity:\nTest-NetConnection 192.168.122.10 -Port 389\nThe response should say TcpTestSucceeded : True\nJoin the domain (from Client VM)\nWe are all set to join this VM to the domain. 
Run the below PowerShell command to join the domain\nAdd-Computer -DomainName hybrid.local -Credential HYBRID\\Administrator -Restart\nWhat happens internally:\nClient queries DNS for domain controller (SRV record)\nClient contacts domain controller via LDAP\nAdmin credentials authenticated (Kerberos/NTLM) to authorize the join\nDomain controller creates a Computer Object in AD\nMachine account password established — this becomes the secure channel secret\nClient reboots as domain member\nVerification Domain membership verification — Log in as HYBRID\\testuser1, the user you created earlier. Run whoami; it should return hybrid\\testuser1. In a Command Prompt, echo %logonserver% should show your domain controller hostname, \\\\HCE-DC01.\nThis proves that Kerberos + secure channel + domain controller communication is working.\nValidate AD Object Creation — On the domain controller, if you open **Active Directory Users and Computers** and go to **Computers** you should see your client machine listed. This proves that the AD object lifecycle works.\nSummary In this part we set up a domain controller, domain-joined a VM and tested the connectivity. Although a single domain controller setup works, it is a single point of failure. If the only domain controller fails:\nAuthentication stops\nKerberos tickets cannot be issued\nGroup Policy stops applying\nNew logons fail\nIn the next part, we will build redundancy for the domain controller by adding a secondary domain controller.\n","permalink":"https://gurupasupathy.com/post/2026-04-18_building-hce-part-2--identity-on-premises-domain-controller/","summary":"\u003cp\u003eIn the \u003ca href=\"https://pasupathy-guru.medium.com/handson-building-hybrid-cloud-environment-part-1-identity-connectivity-foundation-7788dd1eb827?source=friends_link\u0026amp;sk=40547f5f11d2a24cbd3fd0705504bdba\"\u003efirst part\u003c/a\u003e, we laid the foundation for the hybrid cloud environment. Now we have a virtual network with VM running Windows Server 2022 Evaluation. 
In this part, we will focus on adding the Identity plane to the hybrid cloud environment by introducing a domain controller and creating an Active Directory structure. We will create a client VM, domain join it and make sure a domain user is able to login\u003c/p\u003e","title":"HandsOn — Building Hybrid Cloud Environment — Part 2— Identity — On-Premises Domain Controller"},{"content":"Introduction In this series, I will take you through building an on-premises / Azure hybrid environment, with the on-premises network running entirely on a single machine. We will set up an on-premises Active Directory forest, create OUs and users, deploy domain controllers, join Windows and Linux VMs to the domain, and establish hybrid connectivity to Azure using an S2S VPN tunnel.\nI want to clarify right at the outset that on-premises identity is not a mandatory starting point for a hybrid cloud environment. But I have chosen to build it from the ground up starting with the identity plane (on-premises Active Directory) .\nTo follow along you do not require deep networking or Linux expertise, but comfort with basic bash and networking concepts will help — and we will build the required knowledge as we go. Where appropriate, I will reference official documentation rather than re-explaining well-documented concepts.\nAt the end of this series, I will share my Github repo with the automation scripts for the infrastructure.\nWhen I started exploring hybrid environments, I assumed it would require multiple machines, dedicated networking, and possibly additional hardware. That made it feel like something I couldn’t easily experiment with on my own setup. As I explored further, I realized those assumptions weren’t entirely true. I was able to build a working hybrid environment on my laptop, keeping everything contained and manageable. This series documents how I put it together. 
Here’s the setup I’m using: a laptop running Linux Mint 22.1 (Xia) with 16 GB RAM, a regular home Wi-Fi router, and an Azure subscription (PAYG).Let’s get started.\nBuilding an On-premises virtual network In this first part of the series, I will set up the virtual network on my Linux laptop. This is the foundation for the hybrid environment we are building. By the end of this article, we will have laid the foundation which will include a virtual network — our on-premises representation and a Virtual Machine within the virtual network.\nThe above diagram shows what we will have in place by the end of this article.\nSoftware prerequisites and why you need them The on-premises network is virtual. We will need libraries and applications that enable the creation and management of virtual networks and virtual machines. Follow the below steps to prepare the environment.\nInstallation steps for KVM, QEMU, libvirt and virt-manager vary by Linux distribution and version. Refer to your distribution’s documentation for the correct package names and commands\nVerify virtualization is enabled — Run egrep -c '(vmx|svm)' /proc/cpuinfo A result greater than 0 means your CPU supports hardware virtualization and you\u0026rsquo;re good to go. Install KVM hypervisor and QEMU — These two work as a pair. KVM is the Linux kernel module that provides hardware virtualization, allowing the Guest OS to execute instructions directly on the host CPU at near-native speed. QEMU handles the emulated hardware that the Guest OS interacts with — the virtual disk drives, the network card, and the VGA BIOS that Windows thinks it’s seeing. Install libvirt — this is the virtualization manager I will be using to manage my virtual network. It is the control layer. It translates GUI actions into XML definitions and complex command-line instructions for the hypervisor. It manages storage pools, virtual networks, and VM lifecycle. 
For example, you add a new virtual hardware, say, a CDROM, libvirt will generate the XML configuration to support CDROM virtualization which will then be read and processed by QEMU. Virtual Machine Manager (virt-manager) — a handy GUI for libvirt. You launch it with virt-manager, and it lets you create and manage VMs but does not run them. Download Windows Server 2022 Evaluation version ISO file. The OS must be a server OS that supports Active Directory Domain Services and hence I’ve chosen Windows Server 2022 Download virtIO from https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/stable-virtio/virtio-win.iso Once the above steps are completed, run virsh net-list --all This should show the default network that libvirt just created.\nAt this stage, the virtual network is up and all the tools needed to create and manage VMs are in place.\nRole of libvirt, virt-manager, KVM and QEMU It helps to build a simple mental model of how these components fit together — specifically, how virt-manager interacts with the underlying hypervisor.\nThe Management Flow (Control Plane) — when you create or configure a VM:\nvirt-manager → libvirt → QEMU/KVM\nvirt-manager provides the GUI libvirt acts as the control layer, translating actions into configurations QEMU/KVM executes those configurations The Execution Flow (Data Plane) — when something runs inside the VM:\nUser → Guest OS (Windows) → QEMU → KVM → Hardware\nQEMU handles device emulation (disk, NIC, etc.) KVM provides direct access to CPU virtualization features This separation helps explain why VM configuration and VM execution are two different layers.\nVirtual Networking Before we create the virtual machine, a few networking concepts around virtual networking and libvirt are worth clarifying.\nVirtual network — the private address space where your VMs live. 
They exist only inside the host and are not visible to the WiFi network that is connecting your laptop with other laptops, phones and other devices in your WiFi network.\nBridge (virbr0)— Layer 2 construct — the virtual switch that connects your VMs to each other and to the host, like a physical switch in a rack. Imagine VM1 wants to send a packet to VM2, it forwards traffic based on MAC address learning (similar to a Layer 2 switch) and will forward the packet to VM2. It acts like a Layer 2 switch inside your Linux host. Every VM connected to the default network, plugs into this switch.\nGateway (192.168.122.1) — Layer 3 construct — the door out of the virtual network; packets destined for anywhere outside the virtual network go here first and then get routed based on the defined route table.\nGateway vs bridge — At first, they looked similar to me. It took me some time to understand the difference. The bridge connects devices at Layer 2 by MAC address. If two VMs in the same virtual network want to talk, the bridge connects them; they bypass the gateway. The gateway is the IP address assigned to the bridge interface, acting as the Layer 3 entry/exit point for the virtual network. It is the address VMs use when sending traffic outside the virtual network\nVMs talk to each other through the bridge, they talk to the outside world through the gateway, and the virtual network is the address space that gives them all a place to live.\nWhen libvirt is first installed, it automatically creates a default virtual network with a virtual switch called virbr0 — visible via ip a. Behind this bridge, libvirt configures dnsmasq for DHCP and DNS, and uses iptables/nftables on the host to provide NAT routing. Any VM you create is connected to this network unless you specify otherwise.\nlibvirt modes libvirt supports three modes, namely, NAT, Bridge and Internal. 
For our hybrid environment, the virtual network uses NAT mode described in https://wiki.libvirt.org/VirtualNetworking.html meaning that the WiFi router sees all packets from the virtual network as originating from the Linux host. It has no notion of the virtual network. This approach requires no changes to the home network and keeps the virtual environment isolated, while still allowing outbound internet access.\nUse bridged networking if you want your virtual machines to obtain an IP address from your LAN. Or use Internal Network if you want a fully isolated lab.\nTraffic flow in default (NAT) mode looks like this:\nVM →\nvirtual NIC →\nvirbr0 (virtual switch) →\nNAT (iptables/nftables on host) →\nLinux host →\nphysical network →\ninternet\nIf you use bridged mode, the VM connects directly to your physical network through a Linux bridge (e.g., br0). In that case:\nVM →\nvirtual NIC →\nLinux bridge →\nphysical NIC →\nreal LAN\nNo NAT. The VM behaves like a real machine on your network.\nCreating Virtual Machines Now, we can start creating a Virtual Machine. This virtual machine will be the primary domain controller (will be covered in Part 2), so, let’s name it appropriately. I will call it hce-dc01. Give at least: 4 GB RAM, 2 vCPU, 60 GB disk for the virtual machine. This sizing is sufficient for a lightweight domain controller while keeping resource usage manageable on a single host machine.\nLaunch virt-manager from the terminal. It opens up the GUI of virtual manager. Select option to create a new VM, follow the wizard by providing configurations as outlined above. Use the downloaded ISO image and install Windows Server OS. The installation is quite straightforward.\nImportant — Make sure you install Windows Server 2022 Desktop Experience.\nOnce the Windows Server 2022 OS is installed, the VM will reboot allowing you to set the Administrator password.\nOnce the VM is ready, go to the details view and verify the below settings.\nFor Virtual NIC choose — e1000e. 
This is an emulated Intel NIC that Windows recognizes out of the box — it gets you network access during installation before VirtIO (discussed in the next section) drivers are in place.\nFor video select QXL — this is a paravirtualized display adapter that gives you a responsive desktop with better resolution support compared to the default VGA\nAdd VirtIO ISO as CD-ROM in virt-manager\nWhat and Why? — [VirtIO](https://wiki.libvirt.org/Virtio.html) is a paravirtualized device interface used between Windows and QEMU. Instead of emulating physical hardware like SATA or Intel NICs, VirtIO provides purpose-built virtual drivers that both the guest and hypervisor understand directly — reducing CPU overhead and improving throughput.\nFollow the below steps to install VirtIO\nOpen virt-manager → select your VM → Open → Show virtual hardware details Click Add Hardware → Storage → CD-ROM Choose Select or create custom storage → point to ~/ISOs/virtio-win.iso Boot (or reboot) the VM Now inside the VM you will see two CD-ROMs:\nSATA CDROM1 → Windows Server ISO VirtIO CD-ROM → drivers Installing VirtIO\nNow that we have the VirtIO ISO mounted to VM as a CDROM drive, navigate to the CDROM drive and run the exe installer (**virtio-win-gt-x64.exe**) It installs all the necessary VirtIO drivers for disk, network, and optional devices automatically. Once the drivers are installed, shut down the VM, change the NIC type to ‘virtio’ in virt-manager, and then start it back up for better throughput.\nThis step completes the configuration of the Windows Virtual Machine. 
When hce-dc01 was created, virt-manager automatically connected it to the default virtual network created by libvirt. Let’s verify that now.\nVerification To confirm the virtual network is configured correctly, run virsh net-dumpxml default\n\u0026lt;network connections='1'\u0026gt;\n\u0026lt;name\u0026gt;default\u0026lt;/name\u0026gt;\n\u0026lt;uuid\u0026gt;73e80935-c747-4ad7-88a1-5417707abc02\u0026lt;/uuid\u0026gt;\n\u0026lt;forward mode='nat'\u0026gt;\n\u0026lt;nat\u0026gt;\n\u0026lt;port start='1024' end='65535'/\u0026gt;\n\u0026lt;/nat\u0026gt;\n\u0026lt;/forward\u0026gt;\n\u0026lt;bridge name='virbr0' stp='on' delay='0'/\u0026gt;\n\u0026lt;mac address='52:54:00:d9:a6:2b'/\u0026gt;\n\u0026lt;ip address='192.168.122.1' netmask='255.255.255.0'\u0026gt;\n\u0026lt;dhcp\u0026gt;\n\u0026lt;range start='192.168.122.2' end='192.168.122.254'/\u0026gt;\n\u0026lt;/dhcp\u0026gt;\n\u0026lt;/ip\u0026gt;\n\u0026lt;/network\u0026gt;\nIn the above snippet you will notice that bridge virbr0 is configured with a DHCP range from 192.168.122.2 to 192.168.122.254.\nLog in to the new VM and check its IP (ipconfig). You will see an IP within this range allocated by dnsmasq.\nAlso, remember this range must not overlap with the range your Wi-Fi provides (it usually doesn’t, but it is worth noting).\nSummary With these steps, we have a working virtual network and a Windows Server 2022 VM ready to be configured. 
In the next part, we will turn this VM into a fully functional Active Directory domain controller — laying the groundwork for identity in our hybrid environment.\n","permalink":"https://gurupasupathy.com/post/2026-04-12_building-hce-part-1-identity-connectivity-foundation/","summary":"\u003ch4 id=\"introduction\"\u003eIntroduction\u003c/h4\u003e\n\u003cp\u003eIn this series, I will take you through building an on-premises / Azure hybrid environment, with the on-premises network running entirely on a single machine. We will set up an on-premises Active Directory forest, create OUs and users, deploy domain controllers, join Windows and Linux VMs to the domain, and establish hybrid connectivity to Azure using an S2S VPN tunnel.\u003c/p\u003e\n\u003cp\u003eI want to clarify right at the outset that on-premises identity is not a mandatory starting point for a hybrid cloud environment. But I have chosen to build it from the ground up starting with the identity plane (on-premises Active Directory) .\u003c/p\u003e","title":"HandsOn — Building Hybrid Cloud Environment — Part 1 — Identity \u0026 Connectivity Foundation"},{"content":"\nThis guide outlines the process for assigning application roles to a Managed Identity (MI) in Entra ID. It covers observed behaviors, inherent limitations, and the necessary steps required when an MI must authenticate with another application (such as an API in APIM) using role-based access control (RBAC).\nScenario In a typical architecture, a Logic App utilizes a Managed Identity (either System-Assigned or User-Assigned) to communicate with downstream resources. When that Logic App needs to call an API exposed via APIM, the following requirements usually apply:\nThe API is protected by its own App Registration in Entra ID. The API expects the caller to possess specific app roles (e.g., API.Read or API.ReadWrite). The Logic App must obtain an OAuth token containing these roles to successfully authorize against the API. 
The Challenge Managed Identities are automatically created Service Principals. A common point of confusion is that they do not appear in the App Registration section of the Azure portal; they are found exclusively under Enterprise Applications.\nBecause the Azure portal does not currently provide a UI for assigning app roles to Enterprise Applications directly, it is not possible to assign roles like API.Read through the standard \u0026ldquo;API Permissions\u0026rdquo; blade used for traditional App Registrations.\nThe Workaround — Assigning App Roles via PowerShell / Microsoft Graph You can use the below Powershell to assign roles to your Managed Identity\n# Install-Module Microsoft.Graph -Scope CurrentUser (If not done already)\n# Your tenant ID (in the Azure portal, under Azure Active Directory \u0026gt; Overview).\n$tenantID = \u0026lsquo;{tenantId}\u0026rsquo;\n# The name of the server app that exposes the app roles.\n$serverApplicationName = \u0026lsquo;{serverApplicationName}\u0026rsquo;\n# The name of the app role that the managed identity should be assigned to.\n$appRoleName = \u0026lsquo;{appRoleName}\u0026rsquo; # For example, Api.Read\n# Look up the Logic App / Function (Client application) managed identity\u0026rsquo;s object ID.\n$managedIdentityObjectId = \u0026lsquo;{managedIdentityObjectId}\u0026rsquo;\n# Connect-MgGraph -TenantId $tenantId -Scopes \u0026lsquo;Application.ReadWrite.All\u0026rsquo;,\u0026lsquo;Directory.Read.All\u0026rsquo;\n# or a more restricted set of permissions (recommended):\nConnect-MgGraph -TenantId $tenantId -Scopes \u0026lsquo;Application.Read.All\u0026rsquo;,\u0026lsquo;AppRoleAssignment.ReadWrite.All\u0026rsquo;\n# Look up the details about the server app\u0026rsquo;s service principal and app role.\n$serverServicePrincipal = (Get-MgServicePrincipal -Filter \u0026ldquo;DisplayName eq \u0026lsquo;$serverApplicationName\u0026rsquo;\u0026rdquo;)\n$serverServicePrincipalObjectId = $serverServicePrincipal.Id\n$appRoleId = 
($serverServicePrincipal.AppRoles | Where-Object {$_.Value -eq $appRoleName }).Id\nWrite-Host \u0026lsquo;$serverServicePrincipal \u0026rsquo; $serverServicePrincipal\nWrite-Host \u0026lsquo;$managedIdentityObjectId \u0026rsquo; $managedIdentityObjectId\nWrite-Host \u0026lsquo;$serverServicePrincipalObjectId \u0026rsquo; $serverServicePrincipalObjectId\nWrite-Host \u0026lsquo;AppRoleId \u0026gt;\u0026rsquo; $appRoleId\n# Assign the managed identity access to the app role.\nNew-MgServicePrincipalAppRoleAssignment -ServicePrincipalId $serverServicePrincipalObjectId -PrincipalId $managedIdentityObjectId -ResourceId $serverServicePrincipalObjectId -AppRoleId $appRoleId\nPrincipalId → Managed Identity object ID (Logic App) ResourceId → API service principal object ID AppRoleId → GUID of the role defined in the API registration After this assignment, tokens requested by the Managed Identity will include the required roles claim, allowing successful authorization against the API.\nNote For a clientId to be usable as a token audience, the application it identifies must define (“own”) the app roles, and the consuming client must have been assigned those roles in Entra ID (AAD). I think you can further check these claims in the Authentication section.\nKey Takeaways Managed Identities always appear as Enterprise Apps in Azure AD. App roles cannot be assigned via the portal for Enterprise Apps; Graph / PowerShell is required. Token validation depends on correct Issuer, Audience, and presence of role claims. Explicit role assignment ensures tokens carry the required roles for API authorization. ","permalink":"https://gurupasupathy.com/post/2026-02-27_adding-application-roles-to-managed-identity/","summary":"\u003cp\u003e\u003cimg loading=\"lazy\" src=\"/img/1__932kIAgBM7f5RkLN3QcYxw.png\"\u003e\u003c/p\u003e\n\u003cp\u003eThis guide outlines the process for assigning application roles to a \u003cstrong\u003eManaged Identity (MI)\u003c/strong\u003e in Entra ID. 
It covers observed behaviors, inherent limitations, and the necessary steps required when an MI must authenticate with another application (such as an API in APIM) using role-based access control (RBAC).\u003c/p\u003e\n\u003ch3 id=\"scenario\"\u003eScenario\u003c/h3\u003e\n\u003cp\u003eIn a typical architecture, a \u003cstrong\u003eLogic App\u003c/strong\u003e utilizes a \u003cstrong\u003eManaged Identity\u003c/strong\u003e (either System-Assigned or User-Assigned) to communicate with downstream resources. When that Logic App needs to call an \u003cstrong\u003eAPI exposed via APIM\u003c/strong\u003e, the following requirements usually apply:\u003c/p\u003e","title":"Adding application roles to Managed Identity"},{"content":"Symptom\nSymptom Calling Azure Table Storage REST API returns:\n403 Server failed to authenticate the request.\nMake sure the value of Authorization header is formed correctly including the signature.\nEven though Authorization header looks valid\nRoot Cause The request is missing x-ms-version header\nAzure Storage requires this header to determine the API version used for request validation. 
Without it, the service may reject the request with a misleading authentication error.\nFix Add the header\nx-ms-version: 2020-08-04\nExample minimal headers:\nx-ms-version: 2020-08-04\nAccept: application/json;odata=nometadata\nContent-Type: application/json\nLesson learned If Azure Storage returns a 403 authentication error for a manually signed REST request, check for a missing x-ms-version header before debugging the signature.\n","permalink":"https://gurupasupathy.com/post/2026-02-22_troubleshooting-notes-azure-table-storage-403-authentication/","summary":"\u003cp\u003eSymptom\u003c/p\u003e\n\u003cp\u003e\u003cimg loading=\"lazy\" src=\"/img/1__Uod5Yt3bOWoSV99G6AG0Ug.png\"\u003e\u003c/p\u003e\n\u003ch3 id=\"symptom\"\u003eSymptom\u003c/h3\u003e\n\u003cp\u003eCalling Azure Table Storage REST API returns:\u003c/p\u003e\n\u003cp\u003e403 Server failed to authenticate the request.\u003cbr\u003e\nMake sure the value of Authorization header is formed correctly including the signature.\u003c/p\u003e\n\u003cp\u003eEven though Authorization header looks valid\u003c/p\u003e\n\u003ch3 id=\"root-cause\"\u003eRoot Cause\u003c/h3\u003e\n\u003cp\u003eThe request is missing \u003cstrong\u003ex-ms-version\u003c/strong\u003e header\u003c/p\u003e\n\u003cp\u003eAzure Storage requires this header to determine the API version used for request validation. 
Without it, the service may reject the request with a misleading authentication error.\u003c/p\u003e","title":"Troubleshooting notes — Azure Table Storage 403 Authentication"},{"content":"\nAccessing Azure App Configuration using Managed Identity in Azure Functions is slightly different from accessing other Azure services.\nFor most Azure services (Storage, Service Bus, Key Vault), you typically:\nEnable Managed Identity on the Function Grant RBAC access to the resource Create the SDK client using DefaultAzureCredential However, App Configuration is usually loaded as part of the application configuration pipeline at startup, so it must be added via the host builder.\nPrerequisites Enable Managed Identity on the Function App Grant the identity: App Configuration Data Reader on the App Configuration resource Sample code is shown below\nvar host = new HostBuilder() .ConfigureAppConfiguration(builder =\u0026gt;\n{\nstring cs = Environment.GetEnvironmentVariable(\u0026ldquo;ConnectionString\u0026rdquo;);\nbuilder.AddAzureAppConfiguration(options =\u0026gt;\noptions.Connect(new Uri(@\u0026ldquo;https://appconfiguri.azconfig.io\u0026rdquo;), new ManagedIdentityCredential()));\n})\n.ConfigureFunctionsWebApplication()\n.Build();\nhost.Run();\nNote: I’m using ManagedIdentityCredential, but the recommended class is DefaultAzureCredential.\nKey Insight Other Azure services → authenticated when creating the client App Configuration → authenticated when building the configuration provider. That’s why it must be configured inside ConfigureAppConfiguration(). 
","permalink":"https://gurupasupathy.com/post/2026-02-21_access-appconfiguration-from-function-app-using-managed-identity/","summary":"\u003cp\u003e\u003cimg loading=\"lazy\" src=\"/img/1__nTaI7u53YGpj1No72IN6Yw.png\"\u003e\u003c/p\u003e\n\u003cp\u003eAccessing Azure App Configuration using Managed Identity in Azure Functions is slightly different from accessing other Azure services.\u003c/p\u003e\n\u003cp\u003eFor most Azure services (Storage, Service Bus, Key Vault), you typically:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003eEnable Managed Identity on the Function\u003c/li\u003e\n\u003cli\u003eGrant RBAC access to the resource\u003c/li\u003e\n\u003cli\u003eCreate the SDK client using DefaultAzureCredential\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eHowever, App Configuration is usually loaded as part of the application configuration pipeline at startup, so it must be added via the host builder.\u003c/p\u003e","title":"Access AppConfiguration from Function App using Managed Identity"},{"content":"Coding with Integrity The real measure of a software engineer is simple — how you code when no one is watching.\nWe often associate strong engineering with technical brilliance — mastering languages, designing scalable systems, or solving complex problems.\nBut beyond skill, the most valuable attribute a software engineer can bring to the table is integrity.\n“Coding with Integrity, is how you code when you know that no one is going to review your code”\nCoding with integrity is about the choices you make in the quiet moments of development — when there’s no reviewer, no deadline pressure, and no immediate accountability except your own standards.\nMany common engineering issues don’t come from lack of knowledge.\nThey come from small decisions made in those unseen moments.\nDesign decisions When implementing a feature, it’s easy to think only about the immediate ask: Does it work? 
Does it avoid breaking anything?\nIntegrity pushes the thinking further: Is this the right approach? Is it maintainable? Should I pause and rethink this before moving forward?\nUnit tests It’s possible to reach high coverage while knowing the tests don’t really validate behaviour.\nIntegrity asks: Do these tests genuinely protect the system? Would I trust them if something broke tomorrow?\nTechnical debt Sometimes we clearly see duplication, fragile logic, or missed refactoring opportunities.\nIntegrity isn’t about always fixing everything immediately. It’s about being honest: acknowledging the debt, documenting it, not pretending the shortcut is a solution, and ensuring the debt is addressed.\nDocumentation and clarity After spending days or weeks on a module, everything feels obvious.\nIntegrity means writing code and comments for the next reader — even if that reader is your future self, months later.\nMaybe integrity in coding isn’t something we formally learn or measure.\nMaybe it’s simply the voice that nudges us toward clarity, correctness, and responsibility. 
Whether we follow that voice or ignore it is what ultimately shows up in our code.\nCheers!\n","permalink":"https://gurupasupathy.com/post/2026-02-20_coding-with-integrity/","summary":"\u003ch1 id=\"coding-with-integrity\"\u003eCoding with Integrity\u003c/h1\u003e\n\u003cp\u003e\u003cimg loading=\"lazy\" src=\"/img/1____MXJsGbwgJ3C5gieg298uA.png\"\u003e\u003c/p\u003e\n\u003cp\u003eThe real measure of a software engineer is simple — \u003cstrong\u003e\u003cem\u003ehow you code when no one is watching.\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe often associate strong engineering with technical brilliance — mastering languages, designing scalable systems, or solving complex problems.\u003c/p\u003e\n\u003cp\u003eBut beyond skill, the most valuable attribute a software engineer can bring to the table is \u003cstrong\u003eintegrity\u003c/strong\u003e.\u003c/p\u003e\n\u003cblockquote\u003e\n\u003cp\u003e\u003cstrong\u003e“Coding with Integrity, is how you code when you know that no one is going to review your code”\u003c/strong\u003e\u003c/p\u003e","title":"Coding with Integrity"},{"content":"\nWhen using Terraform to import an OpenAPI/Swagger definition into Azure API Management (APIM), the API and its operations are created successfully. However, one subtle behavior can cause confusion when trying to manage operation-level policies declaratively.\nThis post explains the issue and a simple workaround.\nThe Scenario I was importing my API using Terraform:\nSwagger/OpenAPI definition imported into APIM\nAPI created successfully\nAll operations appeared correctly in Azure\nLater, I wanted to attach operation-level policies using Terraform using azurerm_api_management_api_operation_policy\nAt this point I ran into a problem: Terraform had no record of the operations in its state file.\nWhy This Happens This behavior is expected once you understand how Terraform works. 
Terraform only tracks resources explicitly declared in configuration, or\nresources manually imported into state\nWhen Swagger is imported via azurerm_api_management_api the operations are created inside Azure, but they are not separate Terraform-managed resources unless you explicitly declare using azurerm_api_management_api_operation\nEffectively — API is created in Azure and tracked in Terraform while\nAPI Operations (via Swagger import) are created in Azure but NOT tracked in Terraform\nThis makes it unclear how to attach policies to those operations without creating the operations explicitly — a nightmare if you have hundreds of operations\nThe Simple Workaround You do not need a Terraform resource reference to the operation for you to create an operation policy and attach it. Instead, you can attach the policy directly using azurerm_api_management_api_operation_policy resource and referencing the Swagger operationId.\nExample:\nresource \u0026ldquo;azurerm_api_management_api_operation_policy\u0026rdquo; \u0026ldquo;my_op_policy\u0026rdquo; {\nprovider = \u0026laquo;provider\u0026raquo;\napi_name = \u0026ldquo;\u0026rdquo;\napi_management_name = data.azurerm_api_management.apim.name\nresource_group_name = data.azurerm_api_management.apim.resource_group_name\noperation_id = \u0026ldquo;\u0026rdquo;\nxml_content = templatefile(\u0026quot;\u0026quot;, {\nbackend_name = \u0026ldquo;\u0026rdquo;\nmethod = \u0026ldquo;\u0026rdquo;\n})\n}\nAs long as the API exists in APIM and the operation exists and operation_id exactly matches the Swagger operationId — Terraform can apply and update the policy successfully. No explicit Terraform operation resource is required.\nNotes 1. Use the Swagger operationId, not the display name. Terraform identifies the operation strictly by operationId.\n2. Treat operationId as a stable contract. 
If you later rename the operationId, remove an endpoint, or restructure the Swagger definition, Terraform may fail because the referenced operation no longer exists.\n3. Importing operations individually is possible but rarely worth it. You can define azurerm_api_management_api_operation and import each operation manually into Terraform state. However, this requires one resource per operation. Manual imports are also tedious and scale poorly for large APIs, which defeats the benefit of a Swagger-driven API definition.\nFor most setups, referencing operationId directly in the policy resource is simpler.\nTakeaway When importing Swagger into APIM using Terraform:\nOperations are created in Azure\nTerraform does not automatically track them\nOperation policies can still be managed declaratively by simply referencing the Swagger/OpenAPI Spec operationId\nUnderstanding this distinction can save significant time when automating API Management deployments.\n","permalink":"https://gurupasupathy.com/post/2026-02-20_managing-apim-op-policies-in-terraform-by-importing-openapi-spec/","summary":"\u003cp\u003e\u003cimg loading=\"lazy\" src=\"/img/1__mk3hcBMP7jVKBsxWOa5JDA.png\"\u003e\u003c/p\u003e\n\u003cp\u003eWhen using Terraform to import an OpenAPI/Swagger definition into Azure API Management (APIM), the API and its operations are created successfully. 
However, one subtle behavior can cause confusion when trying to manage operation-level policies declaratively.\u003c/p\u003e\n\u003cp\u003eThis post explains the issue and a simple workaround.\u003c/p\u003e\n\u003ch3 id=\"the-scenario\"\u003eThe Scenario\u003c/h3\u003e\n\u003cp\u003eI was importing my API using Terraform:\u003c/p\u003e\n\u003cblockquote\u003e\n\u003cp\u003eSwagger/OpenAPI definition imported into APIM\u003cbr\u003e\nAPI created successfully\u003cbr\u003e\nAll operations appeared correctly in Azure\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cp\u003eLater, I wanted to attach operation-level policies using Terraform using \u003cem\u003eazurerm_api_management_api_operation_policy\u003c/em\u003e\u003c/p\u003e","title":"Managing Azure APIM Operation Policies in Terraform by Importing OpenAPI Specification"},{"content":"Photo by Shubham Dhage on Unsplash\nPhoto by Shubham Dhage on Unsplash\nHere’s a ready-to-run “one-shot” demo workflow for Minikube that sets up a webserver deployment, exposes it, configures HPA, and generates load so you can see autoscaling in action immediately.\nYou can copy-paste these commands one after the other in your terminal.\nStep 0: (Optional) Clean up old resources kubectl delete deployment webserver --ignore-not-found\nkubectl delete svc webserver --ignore-not-found\nkubectl delete hpa webserver --ignore-not-found\nkubectl delete pod load-generator --ignore-not-found\nStep 1: Create the webserver deployment kubectl create deployment webserver --image=gcr.io/google_containers/echoserver:1.10\nStep 2: Expose the deployment as a NodePort service kubectl expose deployment webserver \u0026ndash;type=NodePort \u0026ndash;port=8080\nCheck service:\nkubectl get svc webserver\nStep 3: Enable metrics-server if not already minikube addons enable metrics-server\nStep 4: Create Horizontal Pod Autoscaler kubectl autoscale deployment webserver \u0026ndash;cpu=20% \u0026ndash;min=1 \u0026ndash;max=5\nCheck HPA:\nkubectl get hpa\nStep 5: 
Launch load-generator pod kubectl run -i \u0026ndash;tty load-generator \u0026ndash;image=busybox \u0026ndash; /bin/sh\nInside the pod, generate heavy load:\nwhile true; do wget -q -O- http://webserver:8080 \u0026amp; done\nThe \u0026amp; ensures requests run in parallel for higher CPU usage. This will trigger the HPA to scale the webserver pods. Step 6: Watch autoscaling in another terminal kubectl get hpa -w\nkubectl get pods -w\nYou will see replicas increase as CPU usage rises. When you stop the load (Ctrl+C in the BusyBox pod), HPA will scale pods back down. Step 7: Optional — test webserver from host minikube service webserver\nOpens your webserver in the browser. ","permalink":"https://gurupasupathy.com/post/2026-01-31_demo-workflow-for-minikube/","summary":"\u003cp\u003ePhoto by Shubham Dhage on Unsplash\u003c/p\u003e\n\u003cp\u003e\u003cimg loading=\"lazy\" src=\"img/1__L75eWRx0XZp7bUauqQgzog.jpeg\"\u003e\u003c/p\u003e\n\u003cp\u003ePhoto by \u003ca href=\"https://unsplash.com/@theshubhamdhage?utm_source=unsplash\u0026amp;utm_medium=referral\u0026amp;utm_content=creditCopyText\"\u003eShubham Dhage\u003c/a\u003e on \u003ca href=\"https://unsplash.com/photos/a-black-and-white-photo-of-a-bunch-of-cubes-gC_aoAjQl2Q?utm_source=unsplash\u0026amp;utm_medium=referral\u0026amp;utm_content=creditCopyText\"\u003eUnsplash\u003c/a\u003e\u003c/p\u003e\n\u003cp\u003eHere’s a ready-to-run “one-shot” demo workflow for Minikube that sets up a webserver deployment, exposes it, configures HPA, and generates load so you can see autoscaling in action immediately.\u003c/p\u003e\n\u003cp\u003eYou can copy-paste these commands \u003cstrong\u003eone after the other\u003c/strong\u003e in your terminal.\u003c/p\u003e\n\u003ch3 id=\"step-0-optional-clean-up-old-resources\"\u003eStep 0: (Optional) Clean up old resources\u003c/h3\u003e\n\u003cp\u003ekubectl delete deployment webserver --ignore-not-found\u003cbr\u003e\nkubectl delete svc webserver 
--ignore-not-found\u003cbr\u003e\nkubectl delete hpa webserver --ignore-not-found\u003cbr\u003e\nkubectl delete pod load-generator --ignore-not-found\u003c/p\u003e","title":"Demo workflow for Minikube"},{"content":"\nPhoto by Steve Johnson on Unsplash\nWhen you want to use a model but don’t want to keep initializing it with a specific persona, temperature, and other attributes, you can use the .modelfile Customization Approach.\nStep 1: Create a .modelfile as shown below (sys_admin.modelfile) # 1. THE BASE (Required)\nFROM llama3\n# 2. BRAIN PHYSICS (Parameters)\nPARAMETER temperature 0.7 # Creativity (0.0 to 1.0+)\nPARAMETER num_ctx 4096 # How many \u0026ldquo;tokens\u0026rdquo; of memory it has\nPARAMETER top_k 40 # Limits the \u0026ldquo;vocabulary\u0026rdquo; pool for each word\nPARAMETER top_p 0.9 # Probability threshold for word choice\nPARAMETER repeat_penalty 1.1 # Prevents the AI from getting stuck in a loop\nPARAMETER stop \u0026ldquo;User:\u0026rdquo; # Tells the AI exactly when to stop talking\nPARAMETER stop \u0026ldquo;\u0026mdash;\u0026rdquo;\n# 3. THE TEMPLATE (The \u0026ldquo;Skeleton\u0026rdquo; of a conversation)\n# This defines how the model sees the Turn-taking between User and AI.\nTEMPLATE \u0026ldquo;\u0026rdquo;\u0026quot;{{ if .System }}\u0026lt;|start_header_id|\u0026gt;system\u0026lt;|end_header_id|\u0026gt;\n{{ .System }}\u0026lt;|eot_id|\u0026gt;{{ end }}{{ if .Prompt }}\u0026lt;|start_header_id|\u0026gt;user\u0026lt;|end_header_id|\u0026gt;\n{{ .Prompt }}\u0026lt;|eot_id|\u0026gt;{{ end }}\u0026lt;|start_header_id|\u0026gt;assistant\u0026lt;|end_header_id|\u0026gt;\n{{ .Response }}\u0026lt;|eot_id|\u0026gt;\u0026quot;\u0026quot;\u0026quot;\n# 4. (System Instructions)\nSYSTEM \u0026quot;\u0026quot;\u0026quot;\nYou are a specialized Azure Networking Assistant and System Administrator with plenty of experience.\nYou provide CLI commands for Linux Mint and PowerShell for Windows.\nConstraints:\n1. 
If a config is insecure, call it out immediately.\n\u0026quot;\u0026quot;\u0026quot;\n# 5. PRE-LOADING (The \u0026ldquo;Conversation Starter\u0026rdquo;)\n# You can bake in a \u0026ldquo;fake\u0026rdquo; memory so the model thinks it\u0026rsquo;s already talking to you.\n# [OPTIONAL] ADAPTER ~/models/my-adapter # (for actual fine-tuned weights)\nMESSAGE user \u0026ldquo;Check the S2S status.\u0026rdquo;\nMESSAGE assistant \u0026ldquo;checking the IPsec tunnels now. One moment.\u0026rdquo;\nStep 2: Create an overlay on top of an existing model Once the .modelfile is ready, pick one of your existing models and create a new overlay like so:\nollama create my-new-overlay-sysadmin -f ./sys_admin.modelfile\nStep 3: Create an alias for easy use To make it “instant” so you don’t have to type long commands, you add an alias to your .bashrc file. This is the bridge between your OS and the AI.\nOpen your config: nano ~/.bashrc Add this line at the bottom: alias summon-admin='ollama run my-new-overlay-sysadmin' Save and refresh: source ~/.bashrc How it works in practice Now, whenever you are looking at a messy config file on your machine, you just pipe the text to your new friend:\ncat /etc/ssh/sshd_config | summon-admin\nThe model will wake up, read the file, and start grumbling about your security choices.\nHow is this different from prompt engineering 1. Hardware \u0026amp; Environment Parameters Prompt engineering cannot change how the computer actually runs the model. A .modelfile can.\nParameter Tuning: You set things like PARAMETER temperature 0.2 (for consistency) or PARAMETER num_ctx 4096 (how much \u0026ldquo;memory\u0026rdquo; it has for your config files). Stop Sequences: You can tell the model exactly when to stop talking (e.g., PARAMETER stop \u0026quot;User:\u0026quot;), preventing it from rambling. 2. The “Persona” vs. 
The “Ask” Prompt Engineering: You have to tell the model every time: “Act like a sys admin and check this file…” Modelfile (The Base-Overlay): The persona is “baked in.” when you launch your “SysAdmin” model. 3. Layered Inheritance (The “FROM” command) This is the part that is impossible with just prompting.\nIn a .modelfile, the first line is usually FROM llama3(or any model that you use). This is Inheritance. ","permalink":"https://gurupasupathy.com/post/2026-01-31_using-model-overlays-using--modelfile/","summary":"\u003cp\u003e\u003cimg loading=\"lazy\" src=\"/img/1__yS9nBSlhRLM7xC__WPyek__w.jpeg\"\u003e\u003c/p\u003e\n\u003cp\u003ePhoto by \u003ca href=\"https://unsplash.com/@steve_j?utm_source=unsplash\u0026amp;utm_medium=referral\u0026amp;utm_content=creditCopyText\"\u003eSteve Johnson\u003c/a\u003e on \u003ca href=\"https://unsplash.com/photos/a-ceiling-with-many-windows-7INz588_4Kw?utm_source=unsplash\u0026amp;utm_medium=referral\u0026amp;utm_content=creditCopyText\"\u003eUnsplash\u003c/a\u003e\u003c/p\u003e\n\u003cp\u003eWhen you want to use a model but don’t want to keep initializing it with a specific \u003cstrong\u003epersona, temperature,\u003c/strong\u003e and other attributes, you can use the \u003cstrong\u003e.modelfile Customization Approach.\u003c/strong\u003e\u003c/p\u003e\n\u003ch4 id=\"step-1-create-amodelfile-as-shown-below-sys_adminmodelfile\"\u003eStep 1: Create a .modelfile as shown below (sys_admin.modelfile)\u003c/h4\u003e\n\u003cp\u003e# 1. THE BASE (Required)\u003cbr\u003e\nFROM llama3\u003c/p\u003e\n\u003cp\u003e# 2. 
BRAIN PHYSICS (Parameters)\u003cbr\u003e\nPARAMETER temperature 0.7     # Creativity (0.0 to 1.0+)\u003cbr\u003e\nPARAMETER num_ctx 4096        # How many \u0026ldquo;tokens\u0026rdquo; of memory it has\u003cbr\u003e\nPARAMETER top_k 40            # Limits the \u0026ldquo;vocabulary\u0026rdquo; pool for each word\u003cbr\u003e\nPARAMETER top_p 0.9           # Probability threshold for word choice\u003cbr\u003e\nPARAMETER repeat_penalty 1.1  # Prevents the AI from getting stuck in a loop\u003cbr\u003e\nPARAMETER stop \u0026ldquo;User:\u0026rdquo;        # Tells the AI exactly when to stop talking\u003cbr\u003e\nPARAMETER stop \u0026ldquo;\u0026mdash;\u0026rdquo;\u003c/p\u003e","title":"Using Model Overlays using .modelfile"},{"content":"\nUse case — I want to import a Logic App as an API within my APIM instance.\nThere is no direct way to get the swagger file of a logic app using the CLI (at least, none that I could find). So, here are the steps to extract the swagger definition of a logic app. I use the generated swagger file to import a Logic App as an API within APIM using Azure CLI.\n1. Grant the service principal the Contributor role on the logic app\nGet the resource id of the logic app: $logicAppResourceId = (az logic workflow show --resource-group \u0026quot;{resourcegroup-name}\u0026quot; --name \u0026quot;{logicAppName}\u0026quot; --query id --output tsv)\nGrant the Contributor role to the service principal: az role assignment create --assignee {sp-id} --role Contributor --scope $logicAppResourceId\n2. 
Get the swagger file from the Logic App\nGenerate a JWT from https://login.microsoftonline.com/{tenantId}/oauth2/token for the service principal with the resource set to “https://management.core.windows.net/”\n$tenantId = \u0026quot;11111111-1111-1111-1111-111111111111\u0026quot;\n$clientId = \u0026quot;00000000-0000-0000-0000-000000000000\u0026quot;\n$clientSecret = \u0026quot;your-client-secret\u0026quot;\n$resource = \u0026quot;https://management.core.windows.net/\u0026quot;\n$body = @{\ngrant_type = \u0026quot;client_credentials\u0026quot;\nclient_id = $clientId\nclient_secret = $clientSecret\nresource = $resource\n}\n$response = Invoke-RestMethod -Method Post -Uri \u0026quot;https://login.microsoftonline.com/$tenantId/oauth2/token\u0026quot; -ContentType \u0026quot;application/x-www-form-urlencoded\u0026quot; -Body $body\n$accessToken = $response.access_token\n$accessToken\nConstruct the swagger URL for the logic app: $swaggerUrl = \u0026quot;https://management.azure.com\u0026quot; + (az logic workflow show --resource-group \u0026quot;{resourcegroup-name}\u0026quot; --name \u0026quot;{logicapp-name}\u0026quot; --query id --output tsv) + \u0026quot;/listSwagger?api-version=2016-06-01\u0026quot;\nIssue a POST request to $swaggerUrl, passing $accessToken as the Bearer token in the Authorization header, to get the swagger definition of the Logic App using Postman (or any other option you prefer)\n3. Import into APIM\nRun the below command to import the above swagger file into APIM: az apim api import --resource-group \u0026quot;{resourcegroup-name}\u0026quot; --service-name \u0026quot;{apim-instance-name}\u0026quot; --path \u0026quot;/v1\u0026quot; --api-id myapi --specification-path \u0026quot;.\\logicapp.backend.swagger.json\u0026quot; --specification-format Swagger\n4. 
Remove the contributor role for the service principal\naz role assignment delete --assignee 00000000-0000-0000-0000-000000000000 --role \u0026quot;Contributor\u0026quot; --scope $logicAppResourceId\n","permalink":"https://gurupasupathy.com/post/2024-06-10_extracting-logic-app-swagger-def-and-import-to-apim/","summary":"\u003cp\u003e\u003cimg loading=\"lazy\" src=\"/img/1__Yqhxr__0j4lVw8U9QtA6mKQ.jpeg\"\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eUse case\u003c/strong\u003e — I want to import a Logic App as an API within my APIM instance.\u003c/p\u003e\n\u003cp\u003eThere is no direct way to get the swagger file of a logic app using CLI (at least, I could not figure out). So, detailing the steps to extract the swagger definition of a logic app. I use the generated swagger file to import a Logic App as an API within APIM using Azure CLI\u003c/p\u003e","title":"Extracting Swagger definition for Azure Logic App and importing to Azure APIM"},{"content":"\nThere are a couple of ways to integrate APIM with a Logic App. The most common use case, as far as I know, is exposing the Logic App as an API on APIM. The other scenario is calling a Logic App from APIM.\nI will provide the APIM policy snippet to call a Logic App. If you are using Managed Identity to authenticate to the Logic App (I will cover this in a separate article), you can skip sending the bearer token.\nA few steps need to be done in the Logic App:\nenable Authentication at the Logic App end\nthe Logic App URL should not contain the SAS token\nmake sure that the Logic App has the below in the trigger section. 
Basically, this is to ensure that the Logic App expects the Bearer token, and “IncludeAuthorizationHeadersInOutputs” ensures that the Auth token is available for further processing within the Logic App\n\u0026quot;triggers\u0026quot;: {\n\u0026quot;manual\u0026quot;: {\n\u0026quot;conditions\u0026quot;: [\n{\n\u0026quot;expression\u0026quot;: \u0026quot;@startsWith(triggerOutputs()?['headers']?['Authorization'], 'Bearer')\u0026quot;\n}\n],\n\u0026quot;inputs\u0026quot;: {\n\u0026quot;schema\u0026quot;: {}\n},\n\u0026quot;kind\u0026quot;: \u0026quot;Http\u0026quot;,\n\u0026quot;operationOptions\u0026quot;: \u0026quot;IncludeAuthorizationHeadersInOutputs\u0026quot;,\n\u0026quot;type\u0026quot;: \u0026quot;Request\u0026quot;\n}\n}\nAPIM Policy to call the Logic App\nWe issue a call to the Logic App from within the \u0026lt;send-request\u0026gt; policy. The response from the Logic App is captured in response-variable-name=\u0026quot;responsela\u0026quot;.\n\u0026lt;policies\u0026gt;\n\u0026lt;inbound\u0026gt;\n\u0026lt;send-request mode=\u0026quot;new\u0026quot; response-variable-name=\u0026quot;responsela\u0026quot; timeout=\u0026quot;20\u0026quot; ignore-error=\u0026quot;false\u0026quot;\u0026gt;\n\u0026lt;set-url\u0026gt;https://xxxxxxx.com:443/workflows/xxxxxxxxxxxx/triggers/manual/paths/invoke?api-version=2016-10-01\u0026lt;/set-url\u0026gt;\n\u0026lt;set-method\u0026gt;POST\u0026lt;/set-method\u0026gt;\n\u0026lt;set-header name=\u0026quot;Content-Type\u0026quot; exists-action=\u0026quot;override\u0026quot;\u0026gt;\n\u0026lt;value\u0026gt;application/json\u0026lt;/value\u0026gt;\n\u0026lt;/set-header\u0026gt;\n\u0026lt;set-header name=\u0026quot;Authorization\u0026quot; exists-action=\u0026quot;override\u0026quot;\u0026gt;\n\u0026lt;value\u0026gt;Bearer ****\u0026lt;/value\u0026gt;\n\u0026lt;/set-header\u0026gt;\n\u0026lt;/send-request\u0026gt;\n\u0026lt;return-response\u0026gt;\n
\u0026lt;set-status code=\u0026quot;200\u0026quot; reason=\u0026quot;OK\u0026quot; /\u0026gt;\n\u0026lt;set-body\u0026gt;@(((IResponse)context.Variables[\u0026quot;responsela\u0026quot;]).Body.As\u0026lt;JObject\u0026gt;(preserveContent: true).ToString())\u0026lt;/set-body\u0026gt;\n\u0026lt;/return-response\u0026gt;\n\u0026lt;/inbound\u0026gt;\n\u0026lt;outbound\u0026gt;\n\u0026lt;base /\u0026gt;\n\u0026lt;/outbound\u0026gt;\n\u0026lt;on-error\u0026gt;\n\u0026lt;base /\u0026gt;\n\u0026lt;/on-error\u0026gt;\n\u0026lt;backend\u0026gt;\n\u0026lt;base /\u0026gt;\n\u0026lt;/backend\u0026gt;\n\u0026lt;/policies\u0026gt;\nAll the tags are quite self-explanatory and there is plenty of documentation available about them. \u0026lt;return-response\u0026gt; is a very useful policy: it stops further policy pipeline execution and returns the response to the caller.\nHope this helps.\nCheers!\n","permalink":"https://gurupasupathy.com/post/2024-05-22_calling-a-logic-app-from-apim/","summary":"\u003cp\u003e\u003cimg loading=\"lazy\" src=\"/img/1__Gepa1jETj8F7cwF8xpVXJg.jpeg\"\u003e\u003c/p\u003e\n\u003cp\u003eThere are couple of ways to integrate an APIM with Logic App. The most common use case as far as I know is exposing the Logic App as an API on the APIM. The other scenario is calling a Logic App from APIM.\u003c/p\u003e\n\u003cp\u003eI will provide the APIM policy snippet to call a Logic App. If you are using Managed Identity to authenticate to Logic App (will cover in a separate article), you can skip sending the bearer token.\u003c/p\u003e","title":"Calling a Logic App from APIM"},{"content":"\nI have come across quite a few ASP.NET Core WebAPI solutions where there is an inordinate number of Data Transfer Object (DTO) classes. This results in a kind of class explosion which I think can be avoided. Yes, DTOs do have their utility, no doubt. 
But, many a time, as the application evolves and grows, we end up with numerous DTOs that differ by just a handful of attributes or, in some cases, are a simple composition of multiple entities / DTOs.\nOne of the reasons we have so many such DTO classes is the need to pass data to and from the repository and service layers (or between any layers of the application, for that matter). Looking for a way around creating yet another DTO, I explored some options and realized that Tuples can be used to minimize the creation of DTOs.\nTuples have been around in C# for quite some time now. I am not sure if it is common knowledge, but I recently figured out that you can eliminate quite a few of the DTOs we use to ferry data between the layers by leveraging Tuples.\nLet us take the following scenario, for instance.\nWe have an API, say, GetCustomers which, of course, returns the list of customers. And we have an entity, Customer, as defined below.\nclass Customer\n{\npublic int customerId {get; set;}\npublic string firstname {get; set;}\npublic string lastname {get; set;}\n}\nThe API response for our GetCustomers API is as below\nYou would have noticed that the attribute count is expected in the response and this is not present in our Customer class. The repository layer would just return a List\u0026lt;Customer\u0026gt; but the service layer needs to pass it along with the count attribute to the controller. This is usually where we tend to create a DTO as below.\nclass CustomerDTO\n{\nint count;\nList\u0026lt;Customer\u0026gt; customers;\n}\nThe only reason for the above class to exist is to ferry the data from the repository in a format that the controller is expecting. 
We can eliminate this class altogether by returning a Tuple as below\nreturn new Tuple\u0026lt;int, List\u0026lt;Customer\u0026gt;\u0026gt;(result.Count, result);\nGranted, this is a very trivial scenario and you could also add the count attribute in the controller and return an anonymous type.\nNow consider the cases when you need a response that is an aggregation of multiple custom types. For instance, if we have two APIs, one to get customer details and another to get order details, we would have created two DTOs, for Customer and Order. If a new API is required that returns the details of a particular Customer and all related Orders, you might have to create a new DTO again, as below.\npublic class newDTO {\npublic int orderCount {get; set;}\npublic int customerId {get; set;}\npublic List\u0026lt;Order\u0026gt; orders {get; set;}\n}\nThe expected response is\nThis is exactly what we can avoid by using Tuples like below in the service and repository layer.\nRepositoryLayer.cs\nvar repoResponse = new Tuple\u0026lt;int, Customer, List\u0026lt;Order\u0026gt;\u0026gt;(count, custResult, orderResult);\nreturn repoResponse;\ncustResult holds a particular customer’s data and orderResult will be a List\u0026lt;Order\u0026gt;\nServiceLayer.cs\n.\n.\n.\n//Create a Tuple with three members: count, customer and orders\n//repoResponse is the response from your repository (a Tuple)\n(int count, Customer customer, List\u0026lt;Order\u0026gt; orderList) result = (repoResponse.Item1, repoResponse.Item2, repoResponse.Item3);\nreturn result;\n}\nFrom the above service response, the controller can create an anonymous type as below and return the response without ever creating a DTO.\nreturn new { orderCount = serviceResponse.count, customerId = serviceResponse.customer.customerId, orders = serviceResponse.orderList };\nIt should be noted that, although this approach eliminates the need to create DTO classes, it comes at the cost of readability. Your method signatures may not be very elegant and readable. 
While DTOs will still be the right way to go in some scenarios, for others, Tuples can help.\nHope this helps in reducing a few DTOs at your end.\nCheers!\n","permalink":"https://gurupasupathy.com/post/2021-11-25_reducing-data-transfer-objects-using-tuples/","summary":"\u003cp\u003e\u003cimg loading=\"lazy\" src=\"/img/1__o63lCwtjwCbbGw__KrUl__sw.jpeg\"\u003e\u003c/p\u003e\n\u003cp\u003eI have come across quite a few ASP.NET Core WebAPI solutions where there is a inordinate number of Data Transfer Object (\u003ca href=\"https://docs.microsoft.com/en-us/aspnet/core/tutorials/first-web-api?view=aspnetcore-5.0\u0026amp;tabs=visual-studio#prevent-over-posting-1\"\u003eDTO\u003c/a\u003e) classes. This results in a kind of class explosion which I think can be avoided. Yes, DTOs do have their utility, no doubt. But, many a times as the application evolves and grows, we often end up with numerous DTOs and these DTOs sometimes differ just by a handful of attributes or in some cases they are a simple composition of multiple entities / DTOs.\u003c/p\u003e","title":"Reducing Data Transfer Objects using Tuples in C#"},{"content":"\nIf you have been using Azure App Services for a while to host your API, there is a small chance that you would have encountered the issue with a faulty instance. Your API just doesn’t respond or keeps crashing in a particular instance. And, if your ARR Affinity was enabled, your problems will just be exacerbated. Some users will always be routed to the faulty instance.\nAFAIK, there are no straight forward way to release an instance that is allotted to you by Azure for the given App Service Plan. Adding more instances and removing instances (scale out / in) will not guarantee that the rogue instance will be released. I will share the approach I took to get rid of the rogue instance. 
Note that, the approach below needs your app service to be out of rotation and should not be serving incoming requests.\nAssume that you suspect that a given instance in your App Service Plan has issues and is crashing frequently and you wish to remove this instance. As of today, there is no way to select an instance and remove it via the Azure Portal (yes, you can stop an instance from Process Explorer, but it would still not get rid of the instance). One way to achieve this would be to use vertical scaling (up/down). When you scale up/down Azure allocates necessary hardware based on the target pricing tier you have chosen. The infrastructure differs significantly across tiers and moving across tiers will almost always guarantee different infrastructure allocation. We will use this to get rid of the rogue instance.\nStart by scaling down to a lesser tier (moving laterally within the same tier may not help) For instance, if you are operating on a Premium tier, move to Standard. This action will make Azure allocate new instances in the lesser tier that you have chosen. Now, after scaling down, scale up again to your target pricing tier. When you do this, you are going to be allocated fresh (at least not the old rogue) instances. This is how I got rid of one of the instances that was bothering me.\nHope this helps.\n","permalink":"https://gurupasupathy.com/post/2021-04-22_rid-of-a-rouge-instance-in-azure-app-service-plan/","summary":"\u003cp\u003e\u003cimg loading=\"lazy\" src=\"/img/1__Lm11e6NyfH1lBZhwYgObTw.jpeg\"\u003e\u003c/p\u003e\n\u003cp\u003eIf you have been using Azure App Services for a while to host your API, there is a small chance that you would have encountered the issue with a faulty instance. Your API just doesn’t respond or keeps crashing in a particular instance. And, if your ARR Affinity was enabled, your problems will just be exacerbated. 
Some users will always be routed to the faulty instance.\u003c/p\u003e\n\u003cp\u003eAFAIK, there is no straightforward way to release an instance that is allotted to you by Azure for the given App Service Plan. Adding more instances and removing instances (scale out / in) will not guarantee that the rogue instance will be released. I will share the approach I took to get rid of the rogue instance. Note that the approach below needs your app service to be out of rotation and not serving incoming requests.\u003c/p\u003e","title":"How to get rid of a rouge instance in Azure App Service Plan"},{"content":"\nI have written my fair share of RESTful APIs but am no expert by any measure. I had never given enough thought to the HTTP verbs I should be using (like PUT, POST, PATCH) while writing an API. If a resource had to be created, I would automatically go for POST (never considered the idempotency angle at all) and if a resource had to be modified, I would go for PATCH.\nOn one of my API assignments, I decided to make a very conscious and deliberate choice of the verbs I would be using, and one API in particular got me thinking.\nI had to write an API to modify a resource and this resource happened to have nested resources and numerous attributes. I soon realized that it wasn’t so straightforward to create an elegant PATCH API due to the sheer number of attributes on this resource that could potentially get modified (Why did you design a resource with so many attributes in the first place? you might ask. But that is a topic for another day).\nSo, coming back to the task at hand, I was aware of only two choices for writing a PATCH call: either send the entire resource to the API after making changes to the necessary attributes at the consumer’s end (Approach 1), or send just the attributes requiring modification (Approach 2). 
We will examine both approaches in the context of the two classes below (Employee and Address).\npublic class Employee\n{\npublic int EmployeeId {get; set;}\npublic string EmployeeName {get; set;}\npublic Address EmployeeAddress {get; set;}\npublic string WorkLocation {get; set;}\npublic List\u0026lt;string\u0026gt; PreferredWorkLocations {get; set;}\n}\npublic class Address\n{\npublic string HouseNumber {get; set;}\npublic string AddressLine1 {get; set;}\npublic string AddressLine2 {get; set;}\n}\nApproach 1: The consumer of the PATCH API will send the entire resource, after changing the few select attributes that need to be modified. At the API end, the PATCH payload will be handed over to the repository layer, which would then write the entire resource to the database. Other than some basic validations, no additional work is needed at the API end. A sample PATCH call (almost a PUT) would look like below\napi/ModifyEmployee/{empId}\n{ \u0026quot;EmployeeName\u0026quot; : \u0026quot;ename\u0026quot;, \u0026quot;EmployeeAddress\u0026quot; : { \u0026quot;HouseNumber\u0026quot; : \u0026quot;F32\u0026quot;, \u0026quot;AddressLine1\u0026quot; : \u0026quot;Addr1\u0026quot;, \u0026quot;AddressLine2\u0026quot; : \u0026quot;Addr2\u0026quot; }, \u0026quot;WorkLocation\u0026quot; : \u0026quot;Brazil\u0026quot;, \u0026quot;PreferredWorkLocations\u0026quot; : [\u0026quot;Brazil\u0026quot;,\u0026quot;France\u0026quot;] }\nDownside: the consumer has to build the entire object even if only a single attribute requires modification. Also note that there is no way of knowing whether just the “houseNumber” has changed or any other (or all) of the attributes of Employee have changed. So, all the attributes’ values need to be copied to a new object to be persisted in the database.\nApproach 2: The consumer of the API will send only the attribute that has to be modified. 
A sample PATCH for modifying the work location will look like below:\napi/ModifyEmployeeWorkLocation/{empId}\n{ \u0026quot;WorkLocation\u0026quot; : \u0026quot;Brazil\u0026quot; }\nDownside: the onus of constructing an object that can be handed over to the repository layer falls on the API service layer (an object mapper needs to be used here).\nFurther, this approach might necessitate a new API for each combination of possible modifications to the resource attributes. For example, to update the Employee Address I would need another method like api/ModifyEmployeeAddress/{empId}. If the class has many attributes that could be modified, this can lead to an explosion of PATCH methods.\nJSON Patch Neither of the approaches appealed to me. This is when I stumbled upon JSON Patch. Honestly, I had never heard of JSON Patch before and wanted to give it a shot as I thought it would address the downsides mentioned above.\nWhat I like the most about JSON Patch is that, as a consumer, I don’t have to send the entire object as the payload for the PATCH call; I can just state which operation (add / remove / replace / copy) I want to perform on which resource attribute or subset of attributes. Also, at the API end, there is no need to have multiple methods for each type of modification, and there is no need to manually copy the incoming values to a new object that the repository will understand and persist.\nUsing JSON Patch, these calls can be as simple as below\n[ { \u0026quot;value\u0026quot;: \u0026quot;address line one\u0026quot;, \u0026quot;path\u0026quot;: \u0026quot;/employeeAddress/addressLine1\u0026quot;, \u0026quot;op\u0026quot;: \u0026quot;replace\u0026quot; } ]\nThe advantage of using JSON Patch is that you don’t have to reconstruct the object at your API end. 
You can use middleware like NewtonsoftJsonPatch and simply call the ApplyTo method to construct the object for persistence or further processing.\npublic async Task\u0026lt;IActionResult\u0026gt; UpdateEmployee([FromBody] JsonPatchDocument\u0026lt;Employee\u0026gt; patchDoc, int empId) { if (patchDoc != null) { var emp = await \u0026lt;yourCache\u0026gt;.GetAsync\u0026lt;Employee\u0026gt;(\u0026quot;cacheKey\u0026quot;); if (emp == null) { emp = await yourService.GetEmployeeData(empId); } patchDoc.ApplyTo(emp, ModelState); /* reject the request if any patch operation failed */ if (!ModelState.IsValid) { return BadRequest(ModelState); } /* call the repository to update */ _ = await yourService.UpdateAsync(emp); return new ObjectResult(emp); } else { return BadRequest(ModelState); } } The ApplyTo method takes care of copying the new values to the existing object (or performing whichever operation is named in the “op” attribute of the PATCH call). This eliminates the need to do the mapping and copying manually using a mapper.\nAnother plus is that you don’t need a separate PATCH call for each attribute; you can combine multiple modification requests in the same PATCH call, as below. Please note that you have to use the /- notation to append to a list.\n[ { \u0026quot;value\u0026quot;: \u0026quot;new work Location\u0026quot;, \u0026quot;path\u0026quot;: \u0026quot;/preferredWorkLocations/-\u0026quot;, \u0026quot;op\u0026quot;: \u0026quot;add\u0026quot; }, { \u0026quot;value\u0026quot;: \u0026quot;address line one\u0026quot;, \u0026quot;path\u0026quot;: \u0026quot;/employeeAddress/addressLine1\u0026quot;, \u0026quot;op\u0026quot;: \u0026quot;replace\u0026quot; } ] You may refer to the link below for detailed information on how to use JSON Patch in ASP.NET Core.\nJsonPatch in ASP.NET Core web API\n_By Tom Dykstra and Kirk Larkin This article explains how to handle JSON Patch requests in an ASP.NET Core web API. 
To…_docs.microsoft.com\nSo, that’s how I embraced JSON Patch.\nCheers!\n","permalink":"https://gurupasupathy.com/post/2020-07-05_patch-calls-using-json-patch/","summary":"\u003cp\u003e\u003cimg loading=\"lazy\" src=\"/img/1__a__P3zmoCIshJqSyuAQQGeA.jpeg\"\u003e\u003c/p\u003e\n\u003cp\u003eI have written my fair share of RESTful APIs but am no expert by any measure. I had never given enough thought to the \u003ca href=\"https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods\"\u003eHTTP verbs\u003c/a\u003e I should be using (like PUT, POST, PATCH) while writing an API. If a resource had to be created, I would automatically go for POST (\u003cem\u003enever considered the idempotency angle at all\u003c/em\u003e) and if a resource had to be modified, I would go for PATCH.\u003c/p\u003e","title":"How I ended up writing cleaner PATCH calls using JSON Patch"},{"content":" There are many options available when it comes to mocking API responses, such as JSON server or simply adding a response JSON file to your solution, to cite a few. In this article we will see how Azure function proxies can be used to mock API responses.\nAzure Functions provides an elegant option to mock API responses using proxies. Using an Azure function proxy, you can provide a mock endpoint which your team can use to continue their work till your actual API is ready for integration.\nLet us go ahead, create a simple proxy and see how the mock response is served.\nWe will create a proxy endpoint which will service a GET call, say, getCustomer. Our getCustomer API method is expected to provide a response in the below format. So, till getCustomer is up and ready for consumption, our proxy can be used to get the below JSON as a mock response.\nBelow are the steps to create a Function proxy.\nStep 1: We will create an Azure function app which will host the proxy. 
(If a general-purpose / maintenance Function App already exists, we can use that.)\nStep 2: Now that we have created the function app to host our proxy, let us create the proxy itself. Choose the “Proxies” item in the Azure function blade as shown below. Click on “Add” to create a new proxy. We will call this MockCustomerAPI.\nWe will provide the route /api/getcustomer. In the HTTP Method section, we select “GET”. Please note that we can choose to mock other HTTP methods, like POST, as well.\nStep 3: This is the step where we provide the response we want the proxy to send back. We override the response as shown below by expanding the “Response override” link and pasting our mock response in the space provided in the Body section.\nWe can provide the status code and status message as per our use case and click on “Create”. Once the proxy is created successfully, we will be given a link to access it, as shown below.\nStep 4: Now that we are done creating the proxy, let us test it. To test our proxy, copy the generated proxy URL and open it in the browser. We will see the response as below.\nThus, we have created a proxy for the getCustomer API which can be used by the UX team or other API teams to integrate during the early development cycles when our API is not ready yet. Please note that mocks are not just for the GET method; you can mock other HTTP methods as well.\nSome of the advantages of this approach are:\nMock API responses unblock collaborating teams, like the UX team, as they can work against the mock endpoint till the actual API is ready.\nTesting is easier and more thorough if your API relies on a partner API. Creating a mock endpoint gives you the flexibility to change the response and test your code for all possible, allowed parameter values from the partner API.\nIf there is a dependency on a partner API which is not available in your lower environments, you can resort to creating a proxy in the lower environment. 
As it is hosted at a common URL, the same contract will be used consistently across all crews. Any change made to the contract will be immediately visible to all the consuming developers.\nIt eliminates the need for a separate JSON response file or a JSON server on the local dev box, and thus ensures you are developing against the latest contract.\nBefore we conclude, a note about CORS: If you are hitting the proxy from your front-end web application, please ensure you adjust the CORS settings for the mock function app accordingly, as shown below.\nConclusion\nAzure Function proxies have many other useful features, like redirection and route template parameters. You can read more about Azure Function proxies in the official Microsoft documentation.\n","permalink":"https://gurupasupathy.com/post/2020-06-14_using-azure-function-proxies-for-mocking/","summary":"\u003cp\u003e\u003cimg loading=\"lazy\" src=\"../post/img/1__0dxAdZ9Lr__lZh7pA1UvbhA.png\"\u003e\n\u003cimg loading=\"lazy\" src=\"img/1__K8rbssn1nWY4vAurhAH3QQ.jpeg\"\u003e\u003c/p\u003e\n\u003cp\u003eThere are many options available when it comes to mocking API responses, such as \u003ca href=\"https://www.npmjs.com/package/json-server\"\u003eJSON server\u003c/a\u003e or simply adding a response JSON file to your solution, to cite a few. In this article we will see how Azure function proxies can be used to mock API responses.\u003c/p\u003e\n\u003cp\u003eAzure Functions provides an elegant option to mock API responses using proxies. Using an Azure function proxy, you can provide a mock endpoint which your team can use to continue their work till your actual API is ready for integration.\u003c/p\u003e","title":"Using Azure function proxies for mocking API"},{"content":"\nScaling cloud resources dynamically is a fascinating topic. Microsoft Azure provides quite a few ways to dynamically scale resources. This article focuses on scheduling vertical scaling (scale up/down) of App Services. 
The approach outlined here can be used for other Azure resources like SQL Databases, Redis Cache, or in fact most Azure resources that support scaling. Just to clarify right at the outset, we are talking about vertical scaling (between pricing tiers) and not horizontal scaling (scale in/out), wherein we deal with the number of instances at our disposal.\nIt is common knowledge that Azure provides out-of-the-box options to scale out / scale in an App Service plan based on scaling rules, but there is no built-in way to scale up / scale down on a schedule.\nFor instance, there is no direct way to say that between 10 AM and 12 noon my App Service plan should run on P2V2 and come back to P1V2 thereafter, or to have my SQL Server move up to P6 for a few hours before coming back to S2. In other words, there is no option to scale up / scale down based on a schedule.\nNote on Serverless Azure SQL Database: Serverless Azure SQL Databases have two key capabilities which make them attractive in terms of cost. 1. The option to auto scale up / down between the minimum and maximum thresholds. 2. Auto-pause, wherein the SQL Server is stopped after a predefined period of inactivity until activity is detected again. You don’t get charged for the period of inactivity. The downside is that it will take some time for the SQL Server to warm up and be available for the next use after the period of inactivity. Serverless is best suited for test / dev environments where you can tolerate the brief period of connection unavailability during the warm-up. There could also be slight performance degradation for some time as the cache memory is gradually reclaimed. Serverless is not for your use case if these limitations are not acceptable. Furthermore, there are cases where, although your usage will be limited for an interval, you cannot afford to shut down the server using auto pause. 
Without auto pause you will be charged for the minimum number of vCores and minimum memory configured. For more details, refer to the Microsoft documentation on Azure Serverless SQL Database.\nSo, if you want to use DTU-based provisioning and still want scaling based on a schedule because your utilisation is predictable, you can use the approach outlined in this article. One example that comes to mind is bumping up your DTUs for a few hours while you run a performance test, or scaling down during a seasonal / weekend period of low utilisation to save costs.\nAlright, now that we have the context set, let us see how we can achieve this scheduled vertical scaling for Azure resources.\nTo start with, create an automation account. Details on how to create an automation account can be found here. Azure Automation allows us to invoke runbooks on a schedule. We will leverage this capability for our purpose. Please remember to select the option to create a RunAs account while creating the Automation account, as shown below. This is the principal under which the runbooks execute.\nAfter the automation account is created, create two runbooks (one for scale up and another for scale down) which will be invoked by a scheduler to perform the scaling operation automatically, without our intervention. These runbooks will contain PowerShell scripts to perform the scaling operation based on your need. The fact that we use PowerShell gives us the option to scale pretty much any resource for which a PowerShell scaling command exists; the best source for PowerShell reference is Microsoft’s official documentation.\nNow that you have the automation account and the runbooks that you need, create a schedule and link these runbooks to it as per your need. (I won’t go into details on creating a schedule as it is very well documented and simple. 
Refer to Microsoft’s documentation on how to create a schedule.)\nSo, for instance, the schedule below will automatically call the “ScaleDown” runbook at 5:10 AM on 7th Feb.\nThe PowerShell script below can be used to scale an App Service Plan up or down:\nWrite-Output \u0026quot;API Scale\u0026quot;\n$connectionName = \u0026quot;AzureRunAsConnection\u0026quot;\ntry {\n$servicePrincipalConnection = Get-AutomationConnection -Name $connectionName\nAdd-AzureRmAccount -ServicePrincipal -TenantId $servicePrincipalConnection.TenantId -ApplicationId $servicePrincipalConnection.ApplicationId -CertificateThumbprint $servicePrincipalConnection.CertificateThumbprint\nSet-AzureRmAppServicePlan -ResourceGroupName \u0026quot;\u0026lt;yourresourcegroup\u0026gt;\u0026quot; -Name \u0026quot;\u0026lt;yourappserviceplanname\u0026gt;\u0026quot; -Tier PremiumV2 -NumberofWorkers 2 -WorkerSize \u0026quot;Medium\u0026quot;\n}\ncatch {\nif (!$servicePrincipalConnection) {\n$ErrorMessage = \u0026quot;Connection $connectionName not found.\u0026quot;\nthrow $ErrorMessage\n} else {\nWrite-Error -Message $_.Exception\nthrow $_.Exception\n}\n}\nIn case you receive an error like the one below, update the PowerShell modules in your automation account. That should fix the issue.\nThe term ‘Set-AzureRmAppServicePlan’ is not recognized as the name of a cmdlet, function, script file, or operable program.\nBelow is a sample PowerShell script to scale a SQL database; it scales the database to the P1 tier. This script uses a credential to perform the DB scaling, as opposed to the Run As connection (AzureRunAsConnection) used in the previous script.\nTo create a new credential, navigate to your automation account and select the “Credentials” option in the “Shared Resources” section. 
Refer to the screenshot below showing the credential creation.\nparam([parameter(Mandatory=$true)] [PSCredential] $Credential )\n# Name of the Azure SQL Database server\n[string] $SqlServerName = \u0026quot;yourserver.database.windows.net\u0026quot;\n$ServerCredential = New-Object System.Management.Automation.PSCredential($Credential.UserName, (($Credential).GetNetworkCredential().Password | ConvertTo-SecureString -AsPlainText -Force))\n$CTX = New-AzureSqlDatabaseServerContext -ServerName $SqlServerName -Credential $ServerCredential\n[string] $DatabaseName = \u0026quot;yourdb\u0026quot;\n[string] $Edition = \u0026quot;Premium\u0026quot;\n[string] $PerfLevel = \u0026quot;P1\u0026quot;\n$Db = Get-AzureSqlDatabase $CTX -DatabaseName $DatabaseName\nWrite-Output \u0026quot;Database scale state - $($Db.ServiceObjectiveAssignementStateDescription)\u0026quot;\nif ($Db.ServiceObjectiveName -ne $PerfLevel -and $Db.ServiceObjectiveAssignementStateDescription -ne \u0026quot;Pending\u0026quot;) {\n$ServiceObjective = Get-AzureSqlDatabaseServiceObjective $CTX -ServiceObjectiveName $PerfLevel\n# Set the new edition/performance level\n# Valid editions: None, Business, Web, Premium, Basic, Standard\nWrite-Output \u0026quot;Trigger the scale operation\u0026quot;\nSet-AzureSqlDatabase $CTX -Database $Db -ServiceObjective $ServiceObjective -Edition $Edition -Force\nWrite-Output \u0026quot;Completed vertical scale\u0026quot;\n} else {\nWrite-Output \u0026quot;The DB is already in the target pricing tier or is currently being scaled up / down\u0026quot;\n}\nThis same approach can be applied to other Azure resources as well. Go on and try it out!\nHappy Scaling!\n","permalink":"https://gurupasupathy.com/post/2020-04-25_scheduling-vertical-scaling/","summary":"\u003cp\u003e\u003cimg loading=\"lazy\" src=\"/img/1__6HRU03UvgOt__5ih3ssEDXw.jpeg\"\u003e\u003c/p\u003e\n\u003cp\u003eScaling cloud resources dynamically is a fascinating topic. 
Microsoft Azure provides quite a few ways to dynamically scale resources. This article focuses on scheduling \u003cem\u003evertical scaling\u003c/em\u003e (scale up/down) of App Services. The approach outlined here can be used for other Azure resources like SQL Databases, Redis Cache, or in fact most Azure resources that support scaling. Just to clarify right at the outset, we are talking about vertical scaling (between pricing tiers) and not horizontal scaling (scale in/out), wherein we deal with the number of instances at our disposal.\u003c/p\u003e","title":"Scheduling vertical scaling using Microsoft Azure Automation Accounts"}]