Overview
Distributed applications need information about the network in order to use it effectively. This includes the current and maximum bandwidth, the current and minimum latency, the location of bottlenecks, burst frequency, and the extent of congestion. Such information lets applications determine parameters such as the optimal TCP buffer size. TCP buffer size, however, is not the only thing that affects network performance. Incorrectly configured network elements (e.g., routers, switches) can degrade performance significantly. Congestion at a router can set off a vicious cycle in which retransmissions add yet more traffic. Limitations of the end system are also a very common source of bottlenecks. Identifying the real cause of poor network utilization requires not only troubleshooting experience and expertise but also a suitable network analysis tool. Several network measurement tools are publicly available, but all of them are limited in one or more of the following ways:
- targeted at a specific problem
- targeted at a specific platform
- measure neither static nor dynamic bandwidth
- cannot locate and distinguish bottleneck links
- cannot measure the high-speed link bandwidth accurately beyond a bottleneck link
- cannot measure high-speed links accurately at all, because the algorithm, the implementation, or both ignore the effect of host capability on the measurement
- do not provide a network analysis mechanism
How best to characterize networks and make correct measurements remains an open question. Common issues are:
- Which bandwidth do we need to measure: static or dynamic?
- Does the measurement tool need to use information from network elements (e.g., routers) via SNMP or MRTG, or require special privileges?
- Should the measurement be done hop-by-hop or end-to-end?
- Does the measurement require access to both ends of a path? If not, is it sender-only or receiver-only?
- How does an asymmetric path affect the measurement?
- Does the measurement method saturate the path during the measurement? What about parallel links (multi-channel)?
- Does a measurement taken at user level differ from one taken at kernel level? (timestamp issue)
- How is the measurement affected by cross traffic?
- Is the measurement affected by non-responsive network elements (hidden switches, etc.)?
- Can ICMP measure networks accurately? Can any tool measure high-speed networks accurately?
To meet application demands, we developed a cooperative information-gathering tool called the Network Characterization Service (NCS), which addresses the issues above and provides the network information users need along with an application programming interface (API). NCS is implemented under the Auto-Configuration System (ACS) and runs in user space, so it can run on any platform without router access privileges. Its protocol is designed for scalable, distributed deployment, similar to DNS. Its algorithms detect bottlenecks, especially dynamic bottlenecks, efficiently, quickly, and accurately. On current and future networks, dynamic bottlenecks do and will affect network performance dramatically.
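As an illustration of how an application might use bandwidth and latency figures to choose a TCP buffer size, the sketch below applies the common bandwidth-delay product rule of thumb (buffer size is roughly bandwidth times round-trip time). It is not part of NCS and does not use its API. The function names `bdp_bytes` and `request_buffers` and the example figures are assumptions made for this example only.

```python
import math
import socket


def bdp_bytes(bandwidth_bps: int, rtt_ms: float) -> int:
    """Bandwidth-delay product in bytes, rounded up.

    bits/s * ms gives millibits, so divide by 1000 (ms -> s)
    and by 8 (bits -> bytes), i.e. by 8000 in total.
    """
    if bandwidth_bps < 0 or rtt_ms < 0:
        raise ValueError("bandwidth and RTT must be non-negative")
    return math.ceil(bandwidth_bps * rtt_ms / 8000)


def request_buffers(sock: socket.socket, size: int) -> None:
    """Ask the OS for send and receive buffers of `size` bytes.

    The operating system may clamp or adjust the requested value.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)


if __name__ == "__main__":
    # Illustrative figures only: a 1 Gbit/s path with a 50 ms round-trip time.
    print(bdp_bytes(1_000_000_000, 50))  # prints 6250000
```

With these example figures the rule suggests a buffer of about 6.25 MB. In practice, the size actually granted depends on operating system limits, which is one reason host capability matters as much as path characteristics.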
Network measurement related background and issues for average users
Credits: The research and development of the Distributed Systems Department is funded by the U.S. Dept. of Energy, Office of Science, Office of Advanced Scientific Computing Research, Mathematical, Information, and Computational Sciences Division.
This page last modified: Friday, 15-Apr-2005 13:54:06 PDT