IBM Shared University Research Program

(SUR)


University of Florida 1997 Proposal:

Interoperability Performance Management and Analysis of a Network Core Upgrade



SUR Major Research Area:  Communications Technology

Research Focus Areas:  Service Quality & Internet 2



By:


Richard Newman,  Ph.D.,  Principal Investigator
Assistant Professor of Computer and Information Science and Engineering
University of Florida

Daniel Miller
Network Coordinator
Northeast Regional Data Center
University of Florida

Randy Chow,  Ph.D.
Professor of Computer and Information Science and Engineering
University of Florida

Richard A. Elnicki,   D.B.A.
Professor of Decision and Information Sciences
University of Florida



Last Revised: February 27, 1998




Interoperability Performance Management and Analysis of a Network Core Upgrade

PROJECT DESCRIPTION



     Network demands at the University of Florida (UF) have grown at or near exponential rates from their inception on the UF campus.   The UF's network reaches all corners of this large campus area.   Measured in land area, the campus is the second largest of any university in the United States.   The UF ranks third in the United States in the number of degree majors available to undergraduate and graduate students.   Fall Semester 1997 was the first time the UF exceeded the 42,000 student mark -- 42,029 students were enrolled at the end of the drop/add period.

     Student use of the UF network has increased significantly over the last year.   This was primarily due to a decision by President John Lombardi that each UF student's electronic "rights" will include a permanent userID.   Students can register through the Web, have access to individual records on line via the Web, and have fifteen "free" hours of dial-up time per month to access the UF network and the Web.   As of the third week in February, 1998, just over 38,000 students had registered for these electronic rights by getting a GatorLink userID.  One measure of their increased use of the media was that about 9,500 of these students dialed into and through the UF network an average of 8.9 hours in January, 1998.   Two years earlier, 1,701 students used the UF's dial-up service.   This was a 5.6-fold increase in two years!  

     While no formal survey has been taken on this question, observation indicates that almost all of the 4,038 UF faculty, 1,085 UF administrative and professional, and 5,562 "USPS" full-time-equivalent employees in office or information processing positions (counts as of February, 1997) have microcomputers on LANs that they use in their ongoing daily activities.   The UF will eventually have all its administrative and office procedures on 2-, 3-, or N-tier client/server applications.   The structure of a number of the current applications can be seen here.   These 10,685 employees are increasing their demands on the UF's network, both on the job during the day and from home at night.   The "at home nights" demand has also grown significantly.   These users averaged 15.0 hours of connect time per month in January, 1996.   By January, 1998 they averaged 20.2 hours of connect time per month.

     Demands on the UF network from users located elsewhere in the State of Florida have also increased significantly.   The State University System's (SUS) Florida Center for Library Automation (FCLA) holds the "card catalogs" and an ever-growing inventory of on-line documents for all libraries in the SUS.   Users' terminal input/output on this system are growing at 96 million 2K bytes per month.   The UF network is a major site in the Florida Information Resource Network (FIRN).   FIRN "...is an extensive telecommunications network accessible to all of Florida's public educators.   The fundamental goal of FIRN is to provide these educators with access to the computing resources which serve public education." (Department of Education, FIRN Yellow Pages, January 14, 1997, Page 2).   FIRN currently provides free dial access via 50 local calling areas throughout Florida and toll-free 800 access.

     The UF has long supported national and international networking.   It was the 74th member of the original BITNET.   It is a member of the Internet 2 project.   The UF recently received a $350,000 grant from the National Science Foundation (NSF) to support the UF's connection to the NSF's very high performance Backbone Network Service (vBNS); the connection site will be Georgia Tech.

     Many faculty and other users are awaiting the enhanced capacity Internet 2 will bring that will enable full-function real-time on-line audio and video services.   The College of Business at the UF will offer a "Flex" MBA program on the Web.   It will use multi-point on-line real-time audio and video connections to "hold class" when adequate network capacity is available to support this innovation in education.   The vBNS connection will enable UF scientists and engineers to collaborate with others across the country and share powerful computing and information resources via Internet 2.

     We believe the capacity of the existing FDDI ring-topology core has severely limited demand that would otherwise have taken place in the recent past.   This core is limited to a maximum of 100 Mbps.   Utilization of the central FDDI ring is shown here, as is the utilization of the UF's 45 Mbps full-duplex link to the Internet via Jacksonville, FL.   We know of a number of cases where faculty and researchers did not attempt applications such as real-time audio and video because they knew, a priori, that performance would be too poor to give real-time response.

     Our estimate is that demand rates of up to 600 Mbps will occur with an alternative ATM 2-tier network hierarchy.   We are replacing the existing FDDI core with an ATM core using, initially, six IBM 8265 Nways ATM Switches.   The initial configuration of this core replacement, a partial mesh topology, is shown here.   It will give us OC12 throughput rates (622 Mbps) on the main paths that form the logical core replacement; they are shown in red. Some other major links on campus will be increased to give us OC3 rates (155 Mbps) where in the past we were limited to 100 Mbps.   The Internet 2 connection will give OC3 throughput.   The initial connection to vBNS will be at the OC12 level (NSF Advanced Networking Infrastructure and Research Division, NSF Approved 29 Connections to High-Performance Computer Network, Press Release, February 25, 1998).

     The upcoming core upgrade provides an excellent opportunity for research on

    • communications technology in general and

    • specific FDDI versus ATM core performance differences in a real production environment.

     The management of the Northeast Regional Data Center (NERDC), the organization that will install, operate and maintain the ATM core, has agreed to the following plan to do controlled systematic research on the effects of the core change on the network core performance.   A set of performance tests will be performed with the current FDDI core in place and then repeated with the proposed ATM core in production mode.

A. FDDI Tests:

  1. A number (N > 30) of times during the normal weekday work periods will be randomly chosen by Dr. Newman, Dr. Elnicki, and Mr. Miller.   Total system traffic and performance measures will be taken at each of these time marks. The measures will be structured to permit analysis in performance models structured by Dr. Newman.  A typical work week will be chosen for these FDDI tests based on historical utilization data maintained by Dr. Elnicki.  Dr. Newman, Dr. Elnicki, and Mr. Miller will take the FDDI measures.

    Here (A.1), we will use standard volumetric models (bytes, packets) over various measurement interval sizes, and then determine the Hurst parameter of the observed traffic, which characterizes its degree of self-similarity.  In addition, if connection-based information is available, we will determine the connection interarrival distribution, as well as the distributions of volume (number of bytes, number of packets), duration, packet size, and packet interarrival time for connections.

    We have found in the past that characterization of traffic at this higher level better represents behavior and explains what is seen at the strict volumetric level.  These characterizations will assist us in creating traffic generators that can simulate heavy loads, and loads with different application mixes.  In addition, the measures described for A.2 will also be taken during normal traffic to establish a baseline for them.

  2. Dr. Newman and Mr. Miller will create a specific set of test traffic conditions to be run when the system is otherwise almost completely idle.   This has historically been early Sunday mornings.   This part of the research is referred to as A.2 below.

    The test traffic will permit performance measures that include delay and response time, as well as jitter.  Jitter is important for time-sensitive traffic such as real-time audio and video.  These measures will be taken between a wide sample of different points in the network, and will allow the characterization of delay attributable to the switches/routers and the intervening networks themselves.

  3. Dr. Newman and Mr. Miller will create a set of traffic conditions designed to severely stress the system throughput.   These stress tests (A.3) will first use sheer volumetric overloading (traffic synthesis at peak rates for some number of hosts), using varying packet sizes (this allows us to distinguish byte volume from packet volume effects).   In addition, traffic adhering to more realistic models, such as self-similar traffic using the Hurst parameters ascertained in A.1 and connection-based traffic parameters also determined in A.1, will be generated to estimate the amount of traffic (bytes, packets, connections of various types) needed to cause the network to perform at various levels.

    Rather than tuning the traffic synthesis to produce a "desired level of inoperability," several levels will be used to generate graphs depicting the response of the network to the varying loads and load types.  In this manner, "knees" and other sensitivities of the network to load may be discovered.
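
As an illustration of the Hurst parameter estimation mentioned for A.1 and reused in A.3, the following is a minimal sketch of one common estimator, the aggregated variance method. The proposal does not say which estimator will be used, so the choice of method, the function name, the block sizes, and the synthetic input are all assumptions made for illustration. The method averages the per-interval counts over non-overlapping blocks of m intervals and fits the slope of log(variance of the block means) against log(m); for self-similar traffic that slope is approximately 2H - 2.

```python
import math
import random


def aggregated_variance_hurst(counts, block_sizes):
    """Estimate the Hurst parameter H from per-interval traffic counts.

    counts      -- bytes or packets seen in consecutive equal-length intervals
    block_sizes -- aggregation levels m (number of intervals per block)
    Both names are hypothetical; they are not taken from the proposal.
    """
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = len(counts) // m
        if n_blocks < 2:
            continue  # too few blocks to estimate a variance
        # Average the series over non-overlapping blocks of m intervals.
        means = [sum(counts[i * m:(i + 1) * m]) / m for i in range(n_blocks)]
        grand = sum(means) / n_blocks
        var = sum((x - grand) ** 2 for x in means) / n_blocks
        if var > 0:
            log_m.append(math.log(m))
            log_var.append(math.log(var))
    if len(log_m) < 2:
        raise ValueError("need at least two block sizes with nonzero variance")
    # Least-squares slope of log(variance) against log(m).
    mx = sum(log_m) / len(log_m)
    my = sum(log_var) / len(log_var)
    slope = (sum((x - mx) * (y - my) for x, y in zip(log_m, log_var))
             / sum((x - mx) ** 2 for x in log_m))
    # Var(X^(m)) ~ m^(2H - 2), so H = 1 + slope / 2.
    return 1.0 + slope / 2.0


if __name__ == "__main__":
    # Synthetic, independent counts stand in for measured traffic here.
    random.seed(1)
    synthetic = [random.gauss(1000.0, 50.0) for _ in range(20000)]
    h = aggregated_variance_hurst(synthetic, [1, 2, 4, 8, 16, 32, 64, 128])
    print(f"estimated H = {h:.2f}")
```

Independent counts such as the synthetic series above should give an estimate near 0.5, while values closer to 1 indicate stronger long-range dependence. Repeating the estimate over several measurement interval sizes, as A.1 proposes, is one way to check that it is stable.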

B. ATM Tests:

     Mr. Miller and other members of his staff will determine when the ATM core and the UF network in general are in a stable production mode.  After production-level operations are achieved, the ATM tests will be taken.  Thus, both sets of tests will be made when production-level operations exist to help assure maximum comparability of our results.

  1. The same types of monitoring and parameter extraction will be done here (B.1) as were done for A.1, described above.  By using a variety of measurement times, the samples may be compared statistically to determine if they are essentially the same.   We expect them to be the same, since we do not expect the change in core routing facilities to significantly change traffic in the short run.

    Measures taken for A.2 as described above will again be taken during normal traffic after ATM installation for B.1.  The A.1 versus B.1 results will be compared with compensation for differences in traffic levels as necessary (in case we are wrong and do find significant differences in traffic by the time the network with the ATM core is brought to production mode).

  2. The idle time performance measures taken with the network on the ATM core (B.2) will be the same as for A.2, as described above.  The two sets of results will then be compared.

  3. The same set of stress tests will be run with the network on the ATM core (B.3).  However, we expect to have to extend the peak loads to a higher level in order to characterize fully the network's performance.   The performance measures will be graphed against load in order to determine critical load levels and sensitivities of the network under various types of traffic.   These will then be compared to those found for the network with the FDDI core, as described above in A.3.
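
The "knees" and critical load levels sought in A.3 and B.3 could be read off the graphs by eye, but a simple numerical heuristic may also help. The sketch below is one such heuristic, not a method stated in this proposal: after scaling both axes to the range 0 to 1, it picks the measured point farthest from the straight line joining the lightest-load and heaviest-load points. The function name and the sample figures are hypothetical.

```python
import math


def find_knee(loads, delays):
    """Return the load at which the delay curve bends most sharply.

    loads  -- offered load levels, sorted in increasing order
    delays -- measured delay (or response time) at each load level
    Heuristic (an assumption): the point farthest from the chord joining
    the first and last points, after scaling both axes to [0, 1].
    """
    if len(loads) != len(delays) or len(loads) < 3:
        raise ValueError("need at least three matching (load, delay) points")
    x_span = loads[-1] - loads[0]
    y_min, y_max = min(delays), max(delays)
    if x_span == 0 or y_max == y_min:
        raise ValueError("loads and delays must each vary")
    xs = [(x - loads[0]) / x_span for x in loads]
    ys = [(y - y_min) / (y_max - y_min) for y in delays]
    # Direction of the chord from the first point to the last point.
    dx, dy = xs[-1] - xs[0], ys[-1] - ys[0]
    chord_len = math.hypot(dx, dy)
    best_index, best_dist = 0, -1.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        # Perpendicular distance from (x, y) to the chord.
        dist = abs(dy * (x - xs[0]) - dx * (y - ys[0])) / chord_len
        if dist > best_dist:
            best_index, best_dist = i, dist
    return loads[best_index]


if __name__ == "__main__":
    # Hypothetical figures: delay stays flat, then climbs steeply.
    sample_loads = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    sample_delays = [1.0] * 8 + [11.0, 21.0, 31.0]
    print("knee at load", find_knee(sample_loads, sample_delays))
```

Running the same function on the FDDI and ATM curves would give one comparable number per core and traffic type, which could supplement, rather than replace, inspection of the graphs.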

     Statistical analysis will be used by Dr. Newman, Dr. Chow, and Dr. Elnicki to determine the differences in the performance of the network system with the FDDI core versus the ATM core.  We anticipate that one or more papers will be written using the results of the research.  They will be submitted to appropriate academic, professional, and/or application journals.

    • Performance measures taken on the production systems at the times during the typical work week days will be analyzed to determine whether significant differences exist and the extent to which those differences were due to varying traffic as compared to the change in the core technology.

    • The performance differences with the specifically structured fixed set of traffic conditions run on the otherwise idle systems will provide a metric, albeit artificial, on relative performance.

    • The relative differences in stress traffic levels of the FDDI system as compared to the ATM system will give a metric of the increase in the realistic capacity achieved by upgrading from a 100 Mbps FDDI core to a 622 Mbps ATM core.
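
The proposal does not name the statistical tests that will be used for these comparisons. As one possible sketch, the code below computes Welch's two-sample t statistic and its approximate degrees of freedom for a performance measure sampled under the FDDI core and again under the ATM core. The choice of Welch's test, the function name, and the sample values are assumptions made for illustration.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test.

    sample_a, sample_b -- e.g. delays measured with the FDDI core and
                          with the ATM core (hypothetical usage)
    """
    na, nb = len(sample_a), len(sample_b)
    if na < 2 or nb < 2:
        raise ValueError("each sample needs at least two observations")
    # Squared standard error contributed by each sample.
    se_a = variance(sample_a) / na
    se_b = variance(sample_b) / nb
    if se_a + se_b == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (na - 1) + se_b ** 2 / (nb - 1))
    return t, df


if __name__ == "__main__":
    # Hypothetical delay samples in milliseconds, not measured data.
    fddi_delays = [1.0, 2.0, 3.0, 4.0, 5.0]
    atm_delays = [2.0, 3.0, 4.0, 5.0, 6.0]
    t, df = welch_t(fddi_delays, atm_delays)
    print(f"t = {t:.3f}, df = {df:.1f}")
```

With more than 30 observations per condition, as planned for A.1 and B.1, an absolute t value above roughly 2 would suggest a difference at about the 5% level. Heavy-tailed or self-similar traffic measures may violate the test's assumptions, so a rank-based test could be a safer alternative.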

PROJECT WEB SITE

     The location where this is being accessed,

                  http://nersp.nerdc.ufl.edu/~dicke/sur/

will be used to post work in progress on this research.  It is a site provided by the management of the NERDC and will remain available via the Web until the research is completed.   It will be removed only after results of the research are otherwise available in some published form.

PROJECT VALUE & SIGNIFICANCE

     The upgrade of the UF's network core provides a relatively rare opportunity for both the network's manager and researchers of network performance.

     It provides a setting where a true production environment can be used to do controlled, systematic testing.   The manager of the network is available to participate in the research.   It is research that will help him maintain and operate the network in the future.   Specifically, the research should enable him to determine what levels of traffic will stress the new capacity.   When users approach him with some new application that will add demonstrably to total traffic, he will be able to respond to the request with more certainty about potential results.   Without this systematic study of the differential in service potential, he would have to respond to such requests with the only alternatives available: best guesses and pure intuition.  

     The researchers also benefit from the opportunity to do this research in a true production setting.   All too often, research must take place in the artificial confines of a laboratory setting.   While this provides the ideal setting for the control aspects of research, it is still research in an artificial setting.   It is often all too true that the artificial setting is too removed from the realities of the real-world production setting where the test results are intended to apply.

     This proposed research is a merger of two desirable results.   The tests provide real metrics on the operation of a network system, its capabilities and its limits; these metrics will be useful in the operation and management of the network.   The tests also permit the creation of a sufficiently controlled research process where meaningful inferences can be drawn about the general impact of this change in communications technology.


Last Revision, February 27, 1998, by Dick Elnicki.