David Tarditi Design and Implementation of Code Optimizations for a Type-Directed Compiler for Standard ML Degree Type: Ph.D. in Computer Science Advisor(s): Peter Lee Graduated: May 1997 Abstract: The trends in software development are towards larger programs, more complex programs, and more use of programs as "component software." These trends mean that the features of modern programming languages are becoming more important than ever before. Programming languages need to have features such as strong typing, a module system, polymorphism, automatic storage management, and higher-order functions. In short, modern programming languages are becoming more important than ever before. Even though modern programming lanuages are becoming more important than ever before, programmers have traditionally faced a dilemma: programs written in these languages traditionally have had lower performance than programs written in more conventional, but error-prone languages. In this thesis, I study this problem in the context of one particular modern programming language, Standard ML. Standard ML contains all the language features mentioned previously and more. I use an empirical approach to understand where Standard ML programs spend their time and how to improve the performance of Standard ML programs through better optimization. The thesis contains two main results. First, I find that a "pay-as-you-go" compilation strategy, where programmers pay for advanced language features only when they use them, is a practical strategy for compiling Standard ML. In fact, this strategy produces better code overall than a strategy that makes advanced language features run fast at the expense of slowing down programs that do not use those language features. Second, I find that compilers for Standard ML should focus on generating good code for the frequently-executed parts of programs. Specifically, just as compilers for conventional languages such as C focus on generating good code for loops, compilers for lanuages such as Standard ML should focus on generating good code for recursive functions. These results suggest that compilation of modern programming languages such as Standard ML should have a great deal in common with compilation of more conventional languages such as C. First, Standard ML programs that do not use higher-order functions or polymorphism should run just as fast as comparable C programs. Second, Standard ML compilers should apply the same sets of optimizations to recursive functions that more conventional compilers apply to loops. These results also suggest that programmers should be able to avoid the dilemma mentioned earlier: they should be able to write their programs in modern languages such as Standard ML, confident that they can rewrite parts of the programs in a subset of Standard ML if necessary for efficiency. Thesis Committee: Peter Lee (Chair) Steven Lucco Nevin Heintze Simon Peyton-Jones (Glasgow University) James Morris, Head, Computer Science Department Raj Reddy Dean, School of Computer Science Keywords: Standard ML, optimization, compilation, functional programming, garbage collection, automatic storage management CMU-CS-97-108.pdf (1.16 MB) ( 288 pages) Copyright Notice