Guozhen AI: Global AI field notes and model intelligence

English translation

Resource Diagnostics and Troubleshooting in Azure

Category: Azure Cloud

Lesson #26

In the previous article, we introduced how to use Azure Monitor and Log Analytics to monitor Azure resources. This article continues our exploration, focusing on resource diagnostics and troubleshooting.

Understanding Resource Diagnostics

The primary goal of resource diagnostics is to quickly identify root causes when issues arise. Azure provides a suite of tools and services for collecting diagnostic data and narrowing down problems. These include, but are not limited to:

  • Azure Diagnostics: Collects and analyzes diagnostic data from various Azure resources.
  • Azure Service Health: Reports the health of the Azure services and regions that your resources depend on.
  • Diagnostic Logs: Record operations and events performed on resources.

Azure Diagnostics

With Azure Diagnostics, you can configure your Azure resources to collect different types of telemetry, such as performance counters, event logs, Windows logs, and custom logs. These settings can be applied through the Azure Portal, Azure CLI, or PowerShell.

Example: Configuring Diagnostics for an Azure Virtual Machine

Suppose you have an Azure virtual machine named VM1, and you want to enable diagnostics to monitor its performance.

  1. Sign in to the Azure Portal.
  2. Navigate to your virtual machine VM1.
  3. In the left-hand menu, select Diagnostic settings.
  4. Click + Add diagnostic setting, then select the metrics you wish to collect.
  5. Click Save to apply the configuration.

After completing these steps, you’ll be able to monitor metrics such as CPU utilization and memory usage for VM1.
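The same configuration can in principle be scripted rather than clicked through. The Python sketch below only assembles an `az monitor diagnostic-settings create` command and prints it; it does not run anything. The setting name `vm1-metrics`, the `{workspace-name}` placeholder, and the choice to send the `AllMetrics` category to a Log Analytics workspace are assumptions made for illustration, not details from the portal steps above.

```python
import json
import shlex


def diagnostic_setting_command(resource_id, workspace_id, setting_name="vm1-metrics"):
    """Build (but do not run) an Azure CLI call that routes platform metrics
    for one resource to a Log Analytics workspace."""
    # Assumption: the resource exposes the "AllMetrics" category.
    metrics = [{"category": "AllMetrics", "enabled": True}]
    return [
        "az", "monitor", "diagnostic-settings", "create",
        "--name", setting_name,
        "--resource", resource_id,
        "--workspace", workspace_id,
        "--metrics", json.dumps(metrics),
    ]


# Placeholders in braces must be replaced with real values before use.
resource_id = (
    "/subscriptions/{subscription-id}/resourceGroups/{resource-group-name}"
    "/providers/Microsoft.Compute/virtualMachines/VM1"
)
workspace_id = (
    "/subscriptions/{subscription-id}/resourceGroups/{resource-group-name}"
    "/providers/Microsoft.OperationalInsights/workspaces/{workspace-name}"
)

command = diagnostic_setting_command(resource_id, workspace_id)
# Print a shell-safe version for review; pass `command` to subprocess.run
# only after checking it against your own environment.
print(shlex.join(command))
```

Building the command as a list rather than a single string keeps the JSON argument intact and avoids shell-quoting mistakes if you later hand it to `subprocess.run`.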

Troubleshooting with Azure Monitor

Once diagnostic data collection is enabled, you can use Azure Monitor to investigate issues. If you observe abnormal performance from a resource, you can examine its logs and near-real-time metrics directly in Azure Monitor.

Example: Troubleshooting in Azure Monitor

Suppose you notice that VM1’s response time has become unusually slow. You can follow these steps to troubleshoot:

  1. Sign in to the Azure Portal, then select Monitor.
  2. In the left-hand menu, choose Activity Log to review recent events for anomalies.
  3. Select Metrics to view CPU and memory usage trends for VM1.
  4. If CPU utilization stays at or near 100% for a sustained period, consider scaling up the VM to a larger size or optimizing the application code.

The following Azure CLI command retrieves CPU metrics for VM1 at one-minute granularity. Note that the platform metric for virtual machines is named "Percentage CPU":

az monitor metrics list --resource /subscriptions/{subscription-id}/resourceGroups/{resource-group-name}/providers/Microsoft.Compute/virtualMachines/VM1 --metric "Percentage CPU" --interval PT1M
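To turn the raw metric output into the "sustained high CPU" check from step 4, a small script can help. The Python sketch below parses JSON shaped like the output of `az monitor metrics list` and finds the longest run of consecutive data points at or above a threshold. The sample payload is made up, and the 95% threshold and three-point minimum are arbitrary assumptions; the field names (`value`, `timeseries`, `data`, `average`) reflect the usual shape of the CLI output and should be checked against what your CLI version returns.

```python
import json

# Hypothetical sample resembling `az monitor metrics list` JSON output.
SAMPLE_OUTPUT = """
{
  "interval": "PT1M",
  "value": [
    {
      "name": {"value": "Percentage CPU"},
      "timeseries": [
        {
          "data": [
            {"timeStamp": "2023-10-01T00:00:00Z", "average": 42.0},
            {"timeStamp": "2023-10-01T00:01:00Z", "average": 97.5},
            {"timeStamp": "2023-10-01T00:02:00Z", "average": 99.1},
            {"timeStamp": "2023-10-01T00:03:00Z", "average": 98.7},
            {"timeStamp": "2023-10-01T00:04:00Z", "average": null},
            {"timeStamp": "2023-10-01T00:05:00Z", "average": 35.0}
          ]
        }
      ]
    }
  ]
}
"""


def cpu_averages(metrics_json):
    """Return the average values in order, skipping points with no data."""
    doc = json.loads(metrics_json)
    values = []
    for metric in doc.get("value", []):
        for series in metric.get("timeseries", []):
            for point in series.get("data", []):
                if point.get("average") is not None:
                    values.append(point["average"])
    return values


def longest_run_at_or_above(values, threshold):
    """Length of the longest streak of consecutive values >= threshold."""
    longest = current = 0
    for value in values:
        current = current + 1 if value >= threshold else 0
        longest = max(longest, current)
    return longest


averages = cpu_averages(SAMPLE_OUTPUT)
run_length = longest_run_at_or_above(averages, threshold=95.0)
# Assumption: three or more consecutive one-minute points counts as sustained.
sustained = run_length >= 3
print(f"longest high-CPU streak: {run_length} points, sustained: {sustained}")
```

Looking at streaks rather than a single peak avoids reacting to a brief spike, which is usually harmless.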

Using Diagnostic Logs

Certain operations on Azure resources generate diagnostic logs, which provide rich contextual detail for troubleshooting. For instance, an Azure Storage account can emit operation logs tracking all incoming requests.

Reading and Analyzing Diagnostic Logs

Once diagnostic logs are routed to a Log Analytics workspace, you can query and analyze them there. Below is a simple Kusto Query Language (KQL) example that counts blob storage requests by status code within a specified time window, which separates successful requests from failed ones:

StorageBlobLogs
| where TimeGenerated >= datetime(2023-10-01) and TimeGenerated < datetime(2023-10-02)
| summarize Count = count() by StatusCode

This query returns the number of requests grouped by StatusCode for the specified date range, so you can quickly spot problematic request patterns such as a spike in 4xx or 5xx responses.
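If you export the query result, the per-code counts can be folded into success and failure totals. The Python sketch below does this for rows shaped like the query output (one `StatusCode` and one `Count` per row). The sample rows are invented, and treating 2xx and 3xx as success is an assumption you may want to adjust.

```python
from collections import Counter

# Hypothetical rows, shaped like the result of the KQL query above.
rows = [
    {"StatusCode": 200, "Count": 1520},
    {"StatusCode": 206, "Count": 85},
    {"StatusCode": 404, "Count": 12},
    {"StatusCode": 503, "Count": 3},
    {"StatusCode": "Unknown", "Count": 1},
]


def bucket_for(status_code):
    """Map an HTTP status code to a coarse outcome bucket."""
    try:
        code = int(status_code)
    except (TypeError, ValueError):
        return "other"  # non-numeric or missing status values
    if 200 <= code < 400:
        return "success"
    if 400 <= code < 500:
        return "client_error"
    if 500 <= code < 600:
        return "server_error"
    return "other"


def summarize_outcomes(rows):
    """Sum request counts per outcome bucket."""
    totals = Counter()
    for row in rows:
        totals[bucket_for(row["StatusCode"])] += row["Count"]
    return dict(totals)


outcomes = summarize_outcomes(rows)
total_requests = sum(outcomes.values())
failed = outcomes.get("client_error", 0) + outcomes.get("server_error", 0)
print(outcomes)
print(f"failed share: {failed / total_requests:.2%}")
```

Keeping client and server errors in separate buckets is deliberate: 4xx responses usually point to the caller (bad paths, missing permissions), while 5xx responses point to the service side.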

Azure Service Health

Azure Service Health delivers proactive alerts about the health of Azure services. Sometimes, performance issues stem not from your application, but from underlying Azure service disruptions.

Checking Service Health Status

  1. Sign in to the Azure Portal, and type Service Health into the search bar.
  2. Review the dashboard to check current service availability and any reported incidents.
  3. Subscribe to Service Health notifications to receive timely alerts when issues occur.

Summary

In this article, we explored how to diagnose and troubleshoot Azure resources with Azure Diagnostics, Azure Monitor, diagnostic logs, and Azure Service Health. Used together, these tools help you detect, isolate, and resolve issues quickly and keep your applications running reliably.

In upcoming articles, we’ll focus on generating scheduled reports and performing resource optimization to further improve the efficiency and cost-effectiveness of your Azure deployments. We hope your journey toward robust resource monitoring and management continues to gain momentum!
