Near Critical Path Analysis of Program Activity Graphs

Cedell Alexander, Donna Reese, and James Harden

NSF Engineering Research Center for Computational Field Simulation
Mississippi State University

Abstract
Program activity graphs can be constructed from time-stamped
traces of appropriate execution events.  Information
 about the activities on the k longest execution paths is
useful in the analysis of parallel program performance.
this paper, four algorithms for finding the nearcritical
paths of program activity graphs are presented and
compared, including an efficient new algorithm that utilizes
slack values calculated by the critical path method to
perform a bestfirst search in linear space.  The worst
case time and memory requirements of the new algorithm are in
O(ke) and O(k+e), where e is the number of edges in the
graph.  Results confirming the efficiency of the algorithm
are presented for five application programs.  A framework
for utilizing the nearcritical path information is also described.
The framework includes both statistical summaries
and visualization capabilities.

[3]  B. P. Miller et al., IPS2:  The second generation of a parallel
program measurement system, IEEE Trans. Parallel Distrib.
Syst., vol. 1, no. 2, pp. 206217, Apr. 1990.
[4]  E. W. Dijkstra and C. S. Scholten, Termination detection for
diffusing computations, Inform. Processing Lett., vol. 11,
no. 1, pp. 14, Aug. 1980.
[5]  C. Q. Yang and B. P. Miller, Performance measurement of
parallel and distributed programs:  A structural and automatic
approach, IEEE Trans. Software Eng., vol. 15, no. 12, pp.
16151629, Dec. 1989.
[6]  J. D. Wiest and F. K. Levy, A Management Guide to PERT/
CPM.  Englewood Cliffs, NJ:  PrenticeHall, 1977.
[7]  J. E. Hopcroft and R. E. Tarjan, Efficient algorithms for
graph manipulation, Commun. ACM, vol. 16, no. 6, pp.
372378, Jan. 1973.
[8]  E. Horowitz and S. Sahni, Fundamentals of Data Structures,
Rockville, MD:  Computer Science Press, 1983.
[9]  E. Rich, Artificial Intelligence.  New York:  McGrawHill,
1983.
[10] W. Zhang and R. E. Korf, An averagecase analysis of
branchandbound with applications:  Summary of results,
in Proc. 10th Nat. Conf. AI, AAAI Press, July 1216, 1992,
pp. 545550.
[11] C.Q. Yang and B. P. Miller, Critical path analysis for the
execution of parallel and distributed programs, in Proc. 8th
Int. Conf. Distrib. Computing Syst., IEEE Comput. Soc., June
1988, pp. 366375.  
[12] R. E. Korf, Linearspace bestfirst search:  Summary of results,
in Proc. 10th Nat. Conf. AI, AAAI Press, July 1216,
1992, pp. 533538.
[13] G. Brassard and P. Bratley, Algorithmics Theory and Practice.
Englewood Cliffs, NJ:  PrenticeHall, 1988.
[14] M.D. Atkinson, J.R. Sack, N. Santoro, and T. Strothotte,
Minmax heaps and generalized priority queues, Commun.
ACM, vol. 29, no. 10, pp. 9961000, Oct. 1986.
[15] S. Carlsson, Deap  A doubleended heap to implement
doubleended priority queues,  Inform. Processing Lett., vol.
26, no. 1, pp. 3336, Sept. 15, 1987.
[16] S.H. Yen, D. H. Du, and S. Ghanta, Efficient algorithms for
extracting the k most critical paths in timing analysis, in
Proc. 26th ACM/IEEE Design Automation Conf., June
2529, 1989, pp. 649654.
[17] Y.C. Ju and R. A. Saleh, Incremental techniques for the
identification of statically sensitizable critical paths, in Proc.
28th ACM/IEEE Design Automation Conf., June 1721,
1991, pp. 541546.
[18] C. A. Alexander, Nearcritical path algorithms for program
activity graphs, Ph.D. dissertation, Dept. of Comput. Engr.,
Mississippi State Univ., May 1994.
[19] J. C. Harden et al., A performance monitor for the MSPARC
multicomputer, in Proc. IEEE Southeastcon'92, Apr. 1215,
1992, pp. 724729.
[20] D. Reese and E. Luke, Object oriented fortran for development
 of portable parallel programs, in Proc. 3rd IEEE Symp.
Parallel and Distrib. Syst., Dec. 25, 1991, pp. 608615.
[21] A. Mink, R. Carpenter, G. Nacht, and J. Roberts, Multiprocessor
 performance measurement instrumentation, IEEE
Computer, pp. 6375, Sept. 1990.
[22] D. Bailey, J. Barton, T. Lasinski, and H. Simon, ed., The
NAS parallel benchmarks, Tech. Rep. RNR91002, NASA
Ames Research Center, Aug. 1991.
[23] D. L. Whitfield and J. M. Janus, Three dimensional unsteady
euler equations solutions using flux vector splitting, 17th     
paper no. 841552, June 2527, 1984.
[1]  B. P. Miller and C.Q. Yang, IPS:  An interactive and auto
programs, in Proc. 7th Int. Conf. Distrib. Computing       
Syst., IEEE Comput. Soc., Sept. 2125, 1987, pp. 482489.   
[24] G. J. Henley and J. M. Janus, Parallelization and convergence
of a 3D implicit unsteady turbomachinery flow code, in
Proc. 5th SIAM Conf. Parallel Processing for Scientific Computing, 
March 2527, 1991, pp. 238245.
[2]  D. A. Reed et al., The Pablo performance analysis environment,
Tech. Rep., Dep. of Comput. Sci., Univ. of Illinois, Nov. 1992.                                                       