How do I monitor network communication?

Network monitoring with Dynatrace

Dynatrace infrastructure monitoring offers more than visibility into hosts and processes. With network communication monitoring, Dynatrace also gives you insight into the quality of the communication between your hosts and the processes that run on them. It isn't enough to know that a process has sufficient server resources and responds in a timely manner—you also need assurance that your processes are clearly communicating their responses to calling parties and have uninterrupted access to all required resources. You also need to know which processes are consuming your network resources. Such network communication insight can be gained by monitoring the data packets that are exchanged between processes and the hosts they run on.

Enabling network monitoring

Network monitoring is enabled by default for all hosts in your environment. You can, however, disable and re-enable it for individual hosts by going to Settings > Monitoring overview > Host settings and clicking the Network quality metrics switch.

Analyzing network health

By default, your homepage includes a tile that shows you three key overall network health metrics: Traffic, Retransmissions and Connectivity.

Network tile on homepage

Click the Network tile to go to your Hosts page.

Hosts page

Click a host listed on the Hosts page to view performance details of that host.

Host performance details

Click the Consuming processes button to open the selected host's Processes page, which lists the processes running on that host. With network monitoring enabled, you'll see three additional columns: Traffic, Retransmissions and Connectivity.

Select an individual process to highlight that process' contribution to the overall value of the metric displayed in the chart above.

Network measurements for individual processes

Note that Dynatrace monitors only selected processes, so on some hosts the metric breakdowns are not expected to add up to 100%.

What are data retransmissions?

When a network link or segment is overloaded or underperforming, it drops data packets. This is because overloaded network equipment queues are purged during periods of excessive traffic or limited hardware resources. In response, TCP mechanisms attempt to fix the situation by retransmitting the dropped packets. Dynatrace detects such retransmissions and displays them on all relevant Host and Process pages and Quality tabs.

Unnaturally high Retransmissions value for a process

Ideally, retransmission rates should not exceed 0.5% on local area networks and 2% on Internet or cloud-based networks. Retransmission rates above 3% will negatively affect user experience in most modern applications. Retransmission issues are especially noticeable to customers using mobile devices in areas with poor network coverage.
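The thresholds above can be turned into a simple check. The following is a minimal sketch, not Dynatrace's own calculation: it assumes the retransmission rate is the share of retransmitted packets among all packets sent, and the function names and the "ok" / "elevated" / "user-impacting" labels are hypothetical.

```python
# Illustrative only: the names, labels, and the rate formula are assumptions.
# The 0.5%, 2%, and 3% limits come from the text above.
RECOMMENDED_LIMITS = {"lan": 0.5, "internet": 2.0}  # percent
USER_IMPACT_LIMIT = 3.0  # percent


def retransmission_rate(retransmitted_packets: int, total_packets: int) -> float:
    """Return retransmitted packets as a percentage of all packets sent."""
    if total_packets <= 0:
        return 0.0  # no traffic observed, so nothing was retransmitted
    return 100.0 * retransmitted_packets / total_packets


def classify_retransmissions(rate_percent: float, network: str = "lan") -> str:
    """Compare a retransmission rate against the limits for the network type."""
    if rate_percent > USER_IMPACT_LIMIT:
        return "user-impacting"
    if rate_percent > RECOMMENDED_LIMITS[network]:
        return "elevated"
    return "ok"


rate = retransmission_rate(retransmitted_packets=12, total_packets=1000)
print(rate, classify_retransmissions(rate, "lan"))
```

A rate of 1% would count as elevated on a local area network but as acceptable on an Internet or cloud-based network, which is why the network type is passed in explicitly.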

TCP connection time-out errors

Overloaded or poorly configured processes can have trouble accepting new network connections. This results in timeouts or resets of TCP handshakes. Such issues are tracked as TCP connection refused and TCP connection timeout errors.

Dynatrace also compares the number of such errors with the total number of connection attempts to calculate the Connectivity metric, which is the percentage of connections that have been successfully established. Ideally, Connectivity is always 100%. Anything less suggests failed user actions that will be obvious to your customers.

Connectivity analysis
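As a worked illustration of the Connectivity metric described above, the sketch below computes the percentage of successfully established connections from the number of attempts and the two error counts. The function and parameter names are hypothetical, and returning 100% when there were no connection attempts is an assumption rather than documented Dynatrace behavior.

```python
# Illustrative only: names and the zero-attempt behavior are assumptions.
def connectivity_percent(attempts: int, refused: int, timed_out: int) -> float:
    """Percentage of connection attempts that were successfully established."""
    if attempts <= 0:
        return 100.0  # assumption: no attempts means no failed connections
    failed = refused + timed_out
    return 100.0 * (attempts - failed) / attempts


# 200 attempts with 3 refused and 1 timed out leaves 196 established.
print(connectivity_percent(attempts=200, refused=3, timed_out=1))
```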

Network monitoring overhead

Overhead generated by network monitoring is negligible and varies with the volume of analyzed traffic. Dynatrace tracks this overhead itself: if it rises above 5% of available CPU, Dynatrace automatically disables network monitoring until traffic decreases.
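The safeguard can be pictured as a simple threshold rule. This is a hypothetical model, not Dynatrace's implementation: it assumes each sample is the estimated CPU overhead (in percent) that network monitoring would cause at the current traffic volume, and that monitoring resumes as soon as a sample is back at or below the limit.

```python
# Illustrative only: the sampling model and the function name are assumptions.
CPU_OVERHEAD_LIMIT_PERCENT = 5.0  # limit stated in the text above


def monitoring_states(overhead_samples):
    """Return, for each overhead sample, whether monitoring stays enabled."""
    return [sample <= CPU_OVERHEAD_LIMIT_PERCENT for sample in overhead_samples]


# Monitoring is switched off only while estimated overhead exceeds 5%.
print(monitoring_states([1.2, 5.0, 6.3, 2.0]))
```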