### Azure Monitor: Observability Hub Azure Monitor is a comprehensive solution for collecting, analyzing, and acting on telemetry from your cloud and on-premises environments. It helps you understand how your applications and infrastructure are performing and proactively identifies issues. - **Collect:** Gathers metrics, logs, and traces from various sources. - **Analyze:** Uses Log Analytics and Metrics Explorer for data visualization and querying. - **Respond:** Alerts, auto-scaling, and integration with ITSM tools. - **Visualize:** Workbooks, Dashboards, and Power BI integration. ### Key Data Types in Azure Monitor Azure Monitor primarily deals with two fundamental types of observational data: - **Metrics:** Numerical values that describe some aspect of a system at a particular point in time. They are lightweight and support near real-time scenarios. - *Use cases:* CPU utilization, network I/O, request rates, error counts. - *Storage:* Time-series database. - **Logs:** Event data that is structured or unstructured, recorded at specific times. They are rich and detailed, providing context for issues. - *Use cases:* Application traces, system events, access logs, performance counters. - *Storage:* Log Analytics Workspace (Kusto Query Language - KQL). ### Monitoring Components Azure Monitor integrates with various Azure services to provide end-to-end monitoring. #### 1. Application Insights - **Purpose:** Application Performance Management (APM) for live web apps. - **Collects:** Request rates, response times, failure rates, dependencies, traces, exceptions. - **Supports:** .NET, Java, Node.js, Python, JavaScript, and more. - **Key Features:** Smart Detection, Live Metrics Stream, Profiler, Snapshot Debugger. #### 2. Log Analytics - **Purpose:** Centralized log collection, storage, and querying. - **Language:** Kusto Query Language (KQL) for powerful data analysis. - **Collects:** Logs from VMs, containers, Azure resources, custom logs, security events. #### 3. Metrics Explorer - **Purpose:** Visualize and analyze numerical metric data. - **Features:** Create charts, apply filters, split by dimensions, apply aggregations. - **Data Sources:** Platform metrics (Azure resources), custom metrics (Application Insights, custom). #### 4. Alerts - **Purpose:** Proactively notify you of critical conditions. - **Types:** Metric alerts, log alerts, activity log alerts, smart detection alerts. - **Actions:** Email, SMS, push notification, webhook, Azure Function, Logic App, ITSM. ### Azure Workbooks: Interactive Reports Azure Workbooks provide a flexible canvas for data analysis and the creation of rich visual reports within the Azure portal. - **Interactive:** Combine text, analytics queries, metrics, and parameters into rich interactive reports. - **Data Sources:** Azure Monitor Logs, Metrics, Azure Resource Graph, Azure Data Explorer, and custom JSON. - **Templates:** Use pre-built templates or create custom workbooks. - **Parameters:** Allow users to interact with the workbook by selecting subscriptions, resources, time ranges, etc. - **Use Cases:** Troubleshooting guides, operational playbooks, post-mortem analysis, performance dashboards. ### Azure Dashboards: Quick Overviews Azure Dashboards provide a consolidated view of your Azure resources, allowing you to monitor multiple resources and services in a single pane. - **Customizable:** Pin metrics charts, log query results, resource health, and more. - **Sharing:** Share dashboards with other users in your organization. - **Focus:** Designed for quick operational overviews and status checks. ### Monitoring Specific Services #### 1. Applications (Web Apps / APIs) - **Tool:** Application Insights (APM). - **Metrics:** Request duration, failure rate, server response time, dependency calls. - **Logs:** Traces, exceptions, custom events, dependency calls. - **Key KQL Queries:** ```kusto requests | summarize Requests = count(), AvgDuration = avg(duration) by bin(timestamp, 5m), appName | render timechart exceptions | summarize Exceptions = count() by outerMessage | top 10 by Exceptions ``` #### 2. Storage Accounts - **Tool:** Azure Monitor Metrics & Logs. - **Metrics:** Transaction count (success/failure), latency (end-to-end/server), ingress/egress. - **Logs:** Storage Analytics logs (detailed request information). - **Key KQL Queries:** ```kusto AzureMetrics | where ResourceProvider == "MICROSOFT.STORAGE" and MetricName == "Transactions" | summarize TotalTransactions = sum(Total) by bin(TimeGenerated, 1h) | render timechart StorageBlobLogs | where OperationName == "GetBlob" and StatusCode != 200 | summarize count() by ClientIp, Uri, StatusCode ``` #### 3. SQL Database (Azure SQL DB / SQL MI) - **Tool:** Azure Monitor Metrics & Logs, SQL Analytics (pre-built solution). - **Metrics:** DTU/CPU/IO/Log IO utilization, deadlocks, successful/failed connections. - **Logs:** SQL Audit Logs, Diagnostic Logs (query store runtime stats, wait statistics). - **Key KQL Queries:** ```kusto AzureMetrics | where ResourceProvider == "MICROSOFT.SQL" and MetricName == "cpu_percent" | summarize MaxCPU = max(Average) by bin(TimeGenerated, 1h) | render timechart AzureDiagnostics | where ResourceProvider == "MICROSOFT.SQL" and Category == "QueryStoreRuntimeStatistics" | project TimeGenerated, query_text_s, avg_duration_s, avg_cpu_time_s | order by TimeGenerated desc ``` ### Hard Observability Concepts #### 1. Distributed Tracing (Epic) - **Concept:** Tracks a single request as it flows through multiple services in a distributed system. - **Tools:** Application Insights (using OpenTelemetry or custom instrumentation). - **Benefits:** Pinpoint performance bottlenecks, identify points of failure across microservices. - **Correlation:** Automatically correlates requests, dependencies, and logs across services. #### 2. Custom Metrics & Logs - **Concept:** Instrument your application code to emit custom metrics and logs tailored to your business logic. - **Metrics:** Use `TelemetryClient.TrackMetric()` in Application Insights or send to Azure Monitor directly. - **Logs:** Use `TelemetryClient.TrackTrace()`, `TrackException()`, or integrate with logging frameworks (e.g., Serilog, NLog) that send to Application Insights/Log Analytics. - **Benefits:** Deeper insights into application-specific behavior and business KPIs. #### 3. Proactive Monitoring & Smart Detection - **Concept:** Azure Monitor uses machine learning to automatically detect performance anomalies and failures. - **Application Insights Smart Detection:** Automatically warns you of unusual increases in failure rates, performance degradation, or memory leaks. - **Alerts:** Configure alerts based on dynamic thresholds or baselines to catch subtle shifts in behavior. #### 4. Cost Management for Monitoring - **Concept:** Monitoring data ingestion and retention can incur significant costs. - **Strategies:** - **Sampling:** Reduce telemetry volume for Application Insights. - **Data Retention:** Configure appropriate retention periods for Log Analytics workspaces. - **Data Volume Optimization:** Filter out noisy logs, aggregate metrics before ingestion. - **Pricing Tiers:** Choose suitable pricing tiers for Log Analytics and Application Insights.