diff --git a/content/en/profiler/connect_traces_and_profiles.md b/content/en/profiler/connect_traces_and_profiles.md
index e62c64c73da34..b11a51387c4cc 100644
--- a/content/en/profiler/connect_traces_and_profiles.md
+++ b/content/en/profiler/connect_traces_and_profiles.md
@@ -21,7 +21,7 @@ You can move directly from span information to profiling data on the Code Hotspo
 ## Identify code hotspots in slow traces
-{{< img src="profiler/code_hotspots_tab-2.mp4" alt="Code Hotspots tab shows profiling information for a APM trace span" video=true >}}
+{{< img src="profiler/code_hotspots_tab.png" alt="Code Hotspots tab shows profiling information for a APM trace span" >}}
 ### Prerequisites
@@ -179,7 +179,7 @@ Click the plus icon `+` to expand the stack trace to that method **in reverse or
 ### Span execution timeline view
-{{< img src="profiler/code_hotspots_tab-timeline.mp4" alt="Code Hotspots tab has a timeline view that breakdown execution over time and threads" video=true >}}
+{{< img src="profiler/code_hotspots_tab-timeline.png" alt="Code Hotspots tab has a timeline view that breakdown execution over time and threads" >}}
 The **Timeline** view surfaces time-based patterns and work distribution over the period of the span.
@@ -238,9 +238,9 @@ Lanes on the top are runtime activities that may add extra latency to your reque
 ### Viewing a profile from a trace
-{{< img src="profiler/flamegraph_view-1.mp4" alt="Opening a view of the profile in a flame graph" video=true >}}
+{{< img src="profiler/view_profile_from_trace.png" alt="Opening a view of the profile in a flame graph" >}}
-For each type from the breakdown, click **View In Full Page** to see the same data opened up in a in a new page . From there you can change visualization to the flame graph.
+For each type from the breakdown, click **Open in Profiling** to see the same data opened up in a new page. From there, you can change the visualization to a flame graph.
 Click the **Focus On** selector to define the scope of the data:
 - **Span & Children** scopes the profiling data to the selected span and all descendant spans in the same service.
@@ -320,7 +320,7 @@ With endpoint profiling you can:
 - Isolate the top endpoints responsible for the consumption of valuable resources such as CPU, memory, or exceptions. This is particularly helpful when you are generally trying to optimize your service for performance gains.
 - Understand if third-party code or runtime libraries are the reason for your endpoints being slow or resource-consumption heavy.
-{{< img src="profiler/endpoint_agg.mp4" alt="Troubleshooting a slow endpoint by using endpoint aggregation" video=true >}}
+{{< img src="profiler/endpoint_agg.png" alt="Troubleshooting a slow endpoint by using endpoint aggregation" >}}
 ### Surface code that impacted your production latency
@@ -346,9 +346,9 @@ The following image shows that `GET /store_history` is periodically impacting th
 Select `Per endpoint call` to see behavior changes even as traffic shifts over time. This is useful for progressive rollout sanity checks or analyzing daily traffic patterns.
-The following video shows that CPU per request doubled for `/GET train`:
+The following example shows that CPU per request increased for `/GET train`:
-{{< img src="profiler/endpoint_per_request.mp4" alt="Troubleshooting a endpoint that started using more resource per request" video=true >}}
+{{< img src="profiler/endpoint_per_request2.mp4" alt="Troubleshooting a endpoint that started using more resource per request" video="true" >}}
 ## Further reading
diff --git a/content/en/profiler/guide/isolate-outliers-in-monolithic-services.md b/content/en/profiler/guide/isolate-outliers-in-monolithic-services.md
index 87c29350c1ff3..45cd442bbfc68 100644
--- a/content/en/profiler/guide/isolate-outliers-in-monolithic-services.md
+++ b/content/en/profiler/guide/isolate-outliers-in-monolithic-services.md
@@ -22,18 +22,18 @@ This guide describes how to use the Datadog Continuous Profiler to investigate t
 The first step in a performance investigation is to identify anomalies in resource usage over time. Consider the following graph of CPU utilization over the past hour for the service `product-recommendation`:
-{{< img src="profiler/guide-monolithic-outliers/1-outliers-monolith-cpu-usage.png" alt="" style="width:100%;" >}}
+{{< img src="profiler/guide-monolithic-outliers/1-outliers-monolith-cpu-usage-2.png" alt="" style="width:100%;" >}}
 This doesn't provide the exact root cause, but you can see anomalous peaks in CPU usage. Select the **Show - Avg of** dropdown (highlighted in the previous image) and change the graph to show `CPU Cores for Top Endpoints` instead.
 This graph shows how different parts of the application contribute to the overall CPU utilization:
-{{< img src="profiler/guide-monolithic-outliers/2-outliers-monolith-cpu-top-endpoints.png" alt="" style="width:100%;" >}}
+{{< img src="profiler/guide-monolithic-outliers/2-outliers-monolith-cpu-top-endpoints-2.png" alt="" style="width:100%;" >}}
 The yellow peaks indicate that the `GET /store_history` endpoint has some intermittent usage corresponding to the anomalies identified earlier. However, the peaks might be due to differences in traffic to that endpoint. To understand if profiles can provide further insights, change the metric to `CPU - Average Time Per Call for Top Endpoints`:
-{{< img src="profiler/guide-monolithic-outliers/3-outliers-monolith-cpu-avg-time-per-call.png" alt="" style="width:100%;" >}}
+{{< img src="profiler/guide-monolithic-outliers/3-outliers-monolith-cpu-avg-time-per-call-2.png" alt="" style="width:100%;" >}}
 The updated graph reveals that there is an intermittent spike in CPU utilization where each call to `GET /store_history` takes on average three seconds of CPU time. This suggests the spikes aren't due to an increase in traffic, but instead an increase in CPU usage per request.
@@ -42,7 +42,7 @@ The updated graph reveals that there is an intermittent spike in CPU utilization
 To determine the cause of increased CPU usage each time `GET /store_history` is called, examine the profiling flame graph for this endpoint during one of the spikes. Select a time range where `GET /store_history` is showing more CPU utilization and scope the profiling page to that time range.
 Then switch to the **Flame Graph** visualization to see the methods using the CPU at this time:
-{{< img src="profiler/guide-monolithic-outliers/4-outliers-monolith-flame-graph.png" alt="Your image description" style="width:100%;" >}}
+{{< img src="profiler/guide-monolithic-outliers/4-outliers-monolith-flame-graph-2.png" alt="Your image description" style="width:100%;" >}}
 To better understand why the `GET /store_history` endpoint is using more CPU, refer to the table highlighted in the previous image, where the endpoint is second from the top. Select that row to focus the flame graph on the CPU utilization caused by the `GET /store_history` endpoint.
@@ -56,7 +56,7 @@ To see if there are differences in which methods are using a lot of CPU time bet
 The view shows two graphs, labeled **A** and **B**, each representing a time range for CPU utilization per `GET /store_history` call. Adjust the time selector for **A** so that it is scoped to a period with low CPU utilization per call:
-{{< img src="profiler/guide-monolithic-outliers/5-outliers-monolith-compare-flame-graphs.png" alt="Your image description" style="width:100%;" >}}
+{{< img src="profiler/guide-monolithic-outliers/5-outliers-monolith-compare-flame-graphs-2.png" alt="Your image description" style="width:100%;" >}}
 The comparison reveals the different methods causing CPU utilization during the spike (timeframe **B**) that are not used during normal CPU usage (timeframe **A**). As shown in the previous image,`Product.loadAssets(int)`, is causing the spikes.
@@ -70,7 +70,7 @@ There are other attributes available in the profiler. For example, you can filte
 The APM `Trace operation` attribute lets you filter and group a flame graph with the same granularity as the traces for the selected endpoints. This is a good balance between the high granularity of threads or methods, and the low granularity of entire endpoints.
 To isolate operations, select `Trace Operation` from the **CPU time by** dropdown:
-{{< img src="profiler/guide-monolithic-outliers/7-outliers-monolith-trace-operation.png" alt="Your image description" style="width:100%;" >}}
+{{< img src="profiler/guide-monolithic-outliers/7-outliers-monolith-trace-operation-2.png" alt="Your image description" style="width:100%;" >}}
 In the previous image, notice that the `ModelTraining` operation is taking more CPU time than its primary use in the `GET /train` endpoint, so it must be used elsewhere. Click the operation name to determine where else it is used. In this case, `ModelTraining` is also use by `POST /update_model`.
diff --git a/content/en/profiler/profile_visualizations.md b/content/en/profiler/profile_visualizations.md
index 8eb11d4a9d8de..1f68f816fb3b9 100644
--- a/content/en/profiler/profile_visualizations.md
+++ b/content/en/profiler/profile_visualizations.md
@@ -19,7 +19,7 @@ further_reading:
 ## Search profiles
-{{< img src="profiler/search_profiles2.mp4" alt="Search profiles by tags" video=true >}}
+{{< img src="profiler/search_profiles3.mp4" alt="Search profiles by tags" video=true >}}
 Go to **APM -> Profiles** and select a service to view its profiles. Select a profile type to view different resources (for example, CPU, Memory, Exception, and I/O).
diff --git a/static/images/profiler/code_hotspots_tab-timeline.png b/static/images/profiler/code_hotspots_tab-timeline.png
new file mode 100644
index 0000000000000..4425c8971fc62
Binary files /dev/null and b/static/images/profiler/code_hotspots_tab-timeline.png differ
diff --git a/static/images/profiler/code_hotspots_tab.png b/static/images/profiler/code_hotspots_tab.png
new file mode 100644
index 0000000000000..07a8855dbc414
Binary files /dev/null and b/static/images/profiler/code_hotspots_tab.png differ
diff --git a/static/images/profiler/endpoint_agg.png b/static/images/profiler/endpoint_agg.png
new file mode 100644
index 0000000000000..6472b99fb327d
Binary files /dev/null and b/static/images/profiler/endpoint_agg.png differ
diff --git a/static/images/profiler/endpoint_per_request2.mp4 b/static/images/profiler/endpoint_per_request2.mp4
new file mode 100644
index 0000000000000..9df7fc189cdb3
Binary files /dev/null and b/static/images/profiler/endpoint_per_request2.mp4 differ
diff --git a/static/images/profiler/guide-monolithic-outliers/1-outliers-monolith-cpu-usage-2.png b/static/images/profiler/guide-monolithic-outliers/1-outliers-monolith-cpu-usage-2.png
new file mode 100644
index 0000000000000..8bfff8fcdd758
Binary files /dev/null and b/static/images/profiler/guide-monolithic-outliers/1-outliers-monolith-cpu-usage-2.png differ
diff --git a/static/images/profiler/guide-monolithic-outliers/2-outliers-monolith-cpu-top-endpoints-2.png b/static/images/profiler/guide-monolithic-outliers/2-outliers-monolith-cpu-top-endpoints-2.png
new file mode 100644
index 0000000000000..6c631884d29c0
Binary files /dev/null and b/static/images/profiler/guide-monolithic-outliers/2-outliers-monolith-cpu-top-endpoints-2.png differ
diff --git a/static/images/profiler/guide-monolithic-outliers/2-outliers-monolith-cpu-top-endpoints.png b/static/images/profiler/guide-monolithic-outliers/2-outliers-monolith-cpu-top-endpoints.png
index dd1f42ca5fae4..b1bafa890e58e 100644
Binary files a/static/images/profiler/guide-monolithic-outliers/2-outliers-monolith-cpu-top-endpoints.png and b/static/images/profiler/guide-monolithic-outliers/2-outliers-monolith-cpu-top-endpoints.png differ
diff --git a/static/images/profiler/guide-monolithic-outliers/3-outliers-monolith-cpu-avg-time-per-call-2.png b/static/images/profiler/guide-monolithic-outliers/3-outliers-monolith-cpu-avg-time-per-call-2.png
new file mode 100644
index 0000000000000..b620b49fd12d3
Binary files /dev/null and b/static/images/profiler/guide-monolithic-outliers/3-outliers-monolith-cpu-avg-time-per-call-2.png differ
diff --git a/static/images/profiler/guide-monolithic-outliers/4-outliers-monolith-flame-graph-2.png b/static/images/profiler/guide-monolithic-outliers/4-outliers-monolith-flame-graph-2.png
new file mode 100644
index 0000000000000..8a16923f03588
Binary files /dev/null and b/static/images/profiler/guide-monolithic-outliers/4-outliers-monolith-flame-graph-2.png differ
diff --git a/static/images/profiler/guide-monolithic-outliers/5-outliers-monolith-compare-flame-graphs-2.png b/static/images/profiler/guide-monolithic-outliers/5-outliers-monolith-compare-flame-graphs-2.png
new file mode 100644
index 0000000000000..1854d739e7e6a
Binary files /dev/null and b/static/images/profiler/guide-monolithic-outliers/5-outliers-monolith-compare-flame-graphs-2.png differ
diff --git a/static/images/profiler/guide-monolithic-outliers/7-outliers-monolith-trace-operation-2.png b/static/images/profiler/guide-monolithic-outliers/7-outliers-monolith-trace-operation-2.png
new file mode 100644
index 0000000000000..f4b97a6aa6501
Binary files /dev/null and b/static/images/profiler/guide-monolithic-outliers/7-outliers-monolith-trace-operation-2.png differ
diff --git a/static/images/profiler/search_profiles3.mp4 b/static/images/profiler/search_profiles3.mp4
new file mode 100644
index 0000000000000..24a923eafd15b
Binary files /dev/null and b/static/images/profiler/search_profiles3.mp4 differ
diff --git a/static/images/profiler/view_profile_from_trace.png b/static/images/profiler/view_profile_from_trace.png
new file mode 100644
index 0000000000000..68bfb481ff172
Binary files /dev/null and b/static/images/profiler/view_profile_from_trace.png differ