Key Network Metrics
Bandwidth utilization: the percentage of link capacity currently in use. High utilization (>70–80%) is a leading indicator of congestion and performance degradation. Measured via SNMP interface counters or NetFlow. Sustained high utilization indicates the need for a link upgrade or traffic optimization.
Latency and RTT: round-trip time measured by ICMP ping. Baseline latency for a local LAN: <1ms. LAN to internet: typically 10–100ms. Satellite: 500–600ms. Increasing latency indicates congestion or routing changes. Jitter (latency variation) is measured separately and impacts VoIP/video.
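Average latency and jitter come from the same set of ping samples. A sketch, assuming RTT samples in milliseconds have already been collected (the values here are made up); jitter is computed as the mean absolute difference between consecutive samples, the same idea RFC 3550 uses without its smoothing factor:

```python
def latency_stats(rtts_ms):
    """Return (average RTT, jitter) from a series of RTT samples in ms."""
    avg = sum(rtts_ms) / len(rtts_ms)
    # Jitter: mean absolute difference between consecutive samples
    diffs = [abs(b - a) for a, b in zip(rtts_ms, rtts_ms[1:])]
    jitter = sum(diffs) / len(diffs)
    return avg, jitter

avg, jitter = latency_stats([20.1, 22.4, 19.8, 25.0, 20.3])
print(f"avg {avg:.2f} ms, jitter {jitter:.2f} ms")
```

A link can show a healthy average RTT and still ruin VoIP if the jitter is high, which is why the two are tracked separately.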
Packet loss: the percentage of packets sent that do not reach the destination. Any packet loss on a LAN indicates a problem (physical fault, duplex mismatch, congestion). For internet connections, under 1% is acceptable; above 1% impacts TCP performance (triggers retransmissions), and above 3% severely impacts VoIP. Measured with extended ping or specialized tools.
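The loss percentage and the thresholds above translate directly into a small classifier. A sketch, assuming sent/received counts from an extended ping; the function names and sample counts are illustrative:

```python
def loss_pct(sent, received):
    """Packet loss as a percentage of packets sent."""
    return 100.0 * (sent - received) / sent

def classify(loss):
    # Thresholds from the text: <1% acceptable, >1% hurts TCP, >3% hurts VoIP
    if loss > 3.0:
        return "severe (VoIP impacted)"
    if loss > 1.0:
        return "degraded (TCP retransmissions)"
    return "acceptable"

print(classify(loss_pct(1000, 985)))  # 1.5% loss -> degraded
```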
Error counters: switch and router interfaces report error statistics — CRC errors (signal corruption / bad cable), runts (frames shorter than 64 bytes — collision artifact in half-duplex), giants (frames larger than 1518 bytes — misconfigured MTU), input/output errors. Increasing error counters indicate physical or configuration problems.
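Since absolute counter values accumulate forever, what matters is whether they are still incrementing between polls. A sketch comparing two counter snapshots; the counter names mirror common interface statistics, and the sample values are invented:

```python
def rising_errors(prev, curr):
    """Return only the error counters that increased between two snapshots."""
    return {k: curr[k] - prev[k] for k in prev if curr.get(k, 0) > prev[k]}

prev = {"crc": 10, "runts": 2, "giants": 0}
curr = {"crc": 42, "runts": 2, "giants": 0}
print(rising_errors(prev, curr))  # {'crc': 32} -> suspect cabling/signal
```

An old CRC count of 10 on an interface up for a year is noise; 32 new CRC errors in one polling interval points at a current cabling or signal problem.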
Device Performance Metrics
CPU utilization: high CPU on a router or switch can indicate a routing protocol convergence event, a DDoS attack, excessive ACL processing, or failing hardware. Sustained CPU > 80% requires investigation. Memory utilization: insufficient memory causes device instability, route table truncation, or crashes. Monitor free memory trends.
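The word "sustained" is the key qualifier: a single spike during convergence is normal, so an alert should fire only when every sample in a window exceeds the threshold. A minimal sketch, assuming CPU samples in percent have already been polled (the 80% default comes from the text; the sample values are made up):

```python
def sustained_high_cpu(samples_pct, threshold=80.0):
    """True only if every sample in the window exceeds the threshold."""
    return len(samples_pct) > 0 and all(s > threshold for s in samples_pct)

print(sustained_high_cpu([95, 91, 88, 97]))  # True  -> investigate
print(sustained_high_cpu([95, 40, 88, 30]))  # False -> transient spike
```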
Interface statistics: packets per second, bits per second, error rates, drops. Drops indicate congestion (output queue drops) or policy drops (ACL drops, QoS policing). Interface error counters that increment while the link is up indicate physical problems. Temperature sensors on managed devices alert to thermal issues that cause hardware degradation.
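Output queue drops can be put in proportion to traffic forwarded to decide whether congestion is significant. A sketch over a single stats snapshot; the field names are illustrative (real names vary by platform), and the sample numbers are invented:

```python
def drop_rate_pct(stats):
    """Output-queue drops as a percentage of packets offered to the queue."""
    offered = stats["out_packets"] + stats["out_queue_drops"]
    return 100.0 * stats["out_queue_drops"] / offered if offered else 0.0

stats = {"out_packets": 99_000, "out_queue_drops": 1_000}
print(f"{drop_rate_pct(stats):.1f}% output drops")  # 1.0% -> congestion
```

Drops attributed to ACLs or QoS policers should be tracked under separate counters where the platform exposes them, since they reflect policy rather than congestion.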