Curriculum Vitae

Shih-wei Liao

Internet: sliao AT cs.stanford.edu -- Permanent address
http://suif.stanford.edu/~sliao


RESEARCH INTERESTS

Mobile computing, data-center computing, programming and computing systems.


EDUCATION

Ph.D. and M.S., Stanford University, Stanford, CA 2000 and 1995 (GPA: 3.9)
Advisor: Prof. Monica Lam, Computer Systems Lab, Stanford
B.S. in Computer Science, National Taiwan University 1991 (GPA: 3.9)


AWARDS AND HONORS

Stanford Engineering Fellowship, 1992-93.

Top-5% Award, National Taiwan University, 1988, 1990, 1991.

2nd in class, Computer Science Commencement, National Taiwan University, 1991.


WORK EXPERIENCE

Tech Lead, Google, USA

Before 2007:
Senior Member of Technical Staff, Microprocessor Research Lab, Intel, USA
Architect, Managed Runtime, SSG, Intel, USA

Before 2000:
Research Assistant for Dr. Lam, Stanford University
Performed research in memory optimizations, parallelizing compilers, and parallel architectures: Developed SUIF Explorer, an interactive interprocedural parallelizer (Co-authors: Amer Diwan, Robert Bosch, Anwar Ghuloum, Monica Lam). Developed array liveness analysis and applied it to data decomposition and array contraction. Developed a reduction recognizer and a parallel reduction optimizer. Studied and evaluated on real machines the ingredients of an effective parallelizer for multiprocessors (Co-authors: Mary Hall, Jennifer Anderson, Saman Amarasinghe, Brian Murphy, Ed Bugnion, Monica Lam).

Senior Application Programmer, Mascot Co., Ltd.
Proposed, implemented and maintained an inventory management database.

Before 1991:
Research Assistant for Dr. Juang, National Taiwan University
Surveyed various distributed systems and traced Mach internals. Developed a fault-tolerant server on Mach. Designed and implemented an imprecise computation server on Mach.

Software Programmer, Shenyen Technology, Inc.
Implemented a speech compression program using vector quantization techniques.

Research Assistant for Dr. Oyang, National Taiwan University
Developed an architectural simulator for a multi-branching VLIW processor.


PROFESSIONAL SERVICE

Editor for IJES; keynote speaker at CTHC'04; tutorial speaker at PACT'03; program chair for IEEE PDES'05; vice chair for HPCC'05 and EUC'04; PC member for ICPP'04, WISA'06, etc.

Reviewer, ACM Supercomputing Conference, International Parallel Architectures and Compilation Techniques (PACT), International Journal of Parallel Programming (IJPP), IEEE Parallel Processing Symposium (IPPS), IEEE Micro, ACM International Conference on Supercomputing (ICS), IEEE High Performance Computer Architectures.


PATENTS

A total of 24 patent applications


PAPERS (in reverse chronological order)

  1. Machine Learning-Based Prefetch Optimization for Data Center Applications
    S. Liao, T. Hung, D. Nguyen, C. Chou, C. Tu, H. Zhou,
    Proceedings of Supercomputing 2009.

  2. Prefetch Optimization on Large-scale Applications via Parameter Value Prediction
    S. Liao, T. Hung, H. Zhou, D. Nguyen, C. Chou, C. Tu,
    ACM International Conference on Supercomputing, Pages: 519-520, June 2009.

  3. Scalable Lossless High Definition Image Coding on Multicore Platforms
    S. Liao, S. Hung, C. Tu, J. Chen,
    IEEE/IFIP International Conference On Embedded and Ubiquitous Computing, December 2007.

  4. Parallel XML Transformations on Multi-Core Processors
    Y. Sun, T. Li, Q. Zhang, J. Yang, S. Liao,
    IEEE International Conference on e-Business Engineering (IEEE ICEBE 2007), October 2007.

  5. Multi-Disciplinary Simulation and Analysis of Complex Products in Service Oriented Environment
    H. Wang, S. Liao, H. Zhang,
    IEEE International Conference on Service-Oriented Computing and Applications (IEEE SOCA 2007), Newport Beach, California, June 2007.

  6. A Service Oriented Paradigm to Support Collaborative Product Development
    H. Wang, H. Zhang, S. Liao,
    International Conference on Computer Supported Cooperative Work in Design (CSCWD 2007), Melbourne, Australia, May 2007.

  7. Service Monitoring and Management on Virtualized and Multicore Platforms
    K. Lin, S. Liao,
    Service Oriented Architecture, Integration and Collaboration. Part of IEEE International Conference on e-Business Engineering (IEEE ICEBE 2006), Shanghai, China, October 2006.

  8. Parallelizing User-Defined and Implicit Reductions Globally on Multiprocessors
    S. Liao,
    Lecture Notes in Computer Science, Springer Verlag. Also in Proceedings of Annual Asia-Pacific Computer Architecture Conference (ACSAC06), Shanghai, PRC, September 2006.

  9. Data and Computation Transformations for Brook Streaming Applications on Multiprocessors
    S. Liao, Z. Du, G. Wu, G. Lueh,
    Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO'06), New York, NY, March 2006.

  10. Interprocedural Parallelization Analysis in SUIF
    M. Hall, S. Amarasinghe, B. Murphy, S. Liao, M. Lam,
    ACM Transactions on Programming Languages and Systems (TOPLAS) Volume 27, Issue 4, Pages: 662 - 731, July 2005.

  11. A Code Generation Algorithm for Affine Partitioning Framework
    S. Liao, Z. Du, G. Wu, G. Lueh,
    Proceedings of IEEE/IFIP International Workshop on Parallel and Distributed Embedded Systems, Fukuoka, Japan, July 2005.

  12. Parallel Processing of a Raytracer for GPU vs. for CPU
    S. Liao, Z. Du, G. Wu, G. Lueh,
    Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, Las Vegas, NV, June 2005.

  13. Physical Experimentation with Prefetching Helper Threads on Intel's Hyper-Threaded Processors
    D. Kim, S. Liao, P. Wang, J. Cuvillo, X. Tian, X. Zou, H. Wang, D. Yeung, M. Girkar, J. Shen,
    Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO'04), Palo Alto, CA, March 2004 -- Best Paper Award Nominee.

  14. AutoHelper: Profile-Guided Generation of Helper Threads
    S. Liao, X. Tian, P. Wang, D. Kim, J. Cuvillo, H. Wang, J. Shen,
    Intel Programming Technology Conference (IPTC'03), Hillsboro, OR, November 2003.

  15. EmonLite: User-Level Library Routines for Dynamic Performance Monitoring with Low Profiling Overhead
    D. Kim, J. Cuvillo, S. Liao, P. Wang, X. Tian, H. Wang, J. Shen,
    Intel Programming Technology Conference (IPTC'03), Hillsboro, OR, November 2003.

  16. Post-Pass Binary Adaptation Tool for Software-Based Speculative Precomputation
    S. Liao, P. Wang, H. Wang, G. Hoflehner, D. Lavery, J. Shen,
    Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'02), Berlin, Germany, June 2002.

  17. Speculative Precomputation: Exploring the Use of Multithreading for Latency
    H. Wang, P. Wang, R. Weldon, S. Ettinger, H. Saito, M. Girkar, S. Liao, J. Shen,
    Intel Technology Journal Vol. 6 Issue 1, February 2002.

  18. Blocking and Array Contraction Across Arbitrarily Nested Loops Using Affine Partitioning
    A. Lim, S.-W. Liao, M. S. Lam,
    Proceedings of the 8th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP'01), Snowbird, Utah, June 2001.

  19. SUIF Explorer: An Interprocedural and Interactive Parallelizer
    S.-W. Liao,
    Technical Report CSL-TR-00-807, Dept. of Computer Science, Stanford University, August 2000, 146 pages, PhD thesis.

  20. Interprocedural Array Liveness Analysis and Its Application to Parallelization and Memory Optimizations
    S.-W. Liao, M. S. Lam,
    Technical Report, December 1999.

  21. SUIF Explorer: An Interactive and Interprocedural Parallelizer
    S.-W. Liao, A. Diwan, R. P. Bosch, Jr., A. Ghuloum, M. S. Lam,
    Proceedings of the 7th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP'99), Atlanta, Georgia, May 1999, pages 37-48.

  22. Interprocedural Parallelization Analysis in SUIF
    M. Hall, S. Amarasinghe, B. Murphy, S. Liao, M. Lam,
    To appear in ACM Transactions on Programming Languages and Systems.

  23. Achieving High Performance on Digital AlphaServers with the SUIF Compiler
    J. Anderson, M. Hall, S. Amarasinghe, B. Murphy, S. Liao, E. Bugnion, M. Lam,
    Digital Technical Journal, Vol. 10 No. 1, 1998, pages 71-80.

  24. SUIF Explorer: A Programming Assistant for Parallel Machines
    S.-W. Liao, R. P. Bosch, Jr., A. Ghuloum, and M. S. Lam,
    Proceedings of the Second SUIF Compiler Workshop, August 1997.

  25. Software and Hardware for Exploiting Speculative Parallelism with a Multiprocessor
    J. Oplinger, D. Heine, S.-W. Liao, B. Nayfeh, K. Olukotun, and M. S. Lam,
    Stanford Technical Report CSL-97-715, March 1997.

  26. Maximizing Multiprocessor Performance with the SUIF Compiler
    M. Hall, J. Anderson, S. Amarasinghe, B. Murphy, S. Liao, E. Bugnion, M. Lam,
    IEEE Computer, 29(12), December 1996.

  27. The Multiprocessor as a General-Purpose Processor: A Software Perspective.
    S. Amarasinghe, J. Anderson, C. Wilson, S. Liao, M. Hall, B. Murphy, M. Lam.
    IEEE Micro, June 1996, pages 52-61.

  28. Hot Compilers for Future Hot Chips.
    S. Amarasinghe, J. Anderson, R. French, M. Hall, M. Lam, S. Liao, B. Murphy, C. Tseng, C. Wilson, R. Wilson.
    Hot Chips VII, August 1995, Stanford, CA.

  29. Detecting Coarse-Grain Parallelism Using an Interprocedural Parallelizing Compiler
    M. W. Hall, S. P. Amarasinghe, B. R. Murphy, S. Liao, and M. S. Lam,
    Proceedings of Supercomputing '95, December, 1995.

  30. Interprocedural Analysis for Parallelization
    M. W. Hall, B. R. Murphy, S. P. Amarasinghe, S. Liao, and M. S. Lam,
    Proceedings of the 8th International Workshop on Languages and Compilers for Parallel Computing (LCPC95), August, 1995.

  31. Overview of Interprocedural Parallelization Analysis
    M.W. Hall, S. Amarasinghe, B. Murphy, S. Liao and M. Lam.
    Fifth International Workshop on Compilers for Parallel Computers, June 1995.

  32. Interprocedural parallelization analysis: Preliminary results.
    M. Hall, S. Amarasinghe, B. Murphy, S.-W. Liao, and M. Lam.
    Technical Report CSL-TR-95-665, Dept. of Computer Science, Stanford University, March 1995.

  33. SUIF: A Parallelizing and Optimizing Research Compiler
    R. Wilson, R. French, C. Wilson, S. Amarasinghe, J. Anderson, S. Tjiang, S.-W. Liao, C.-W. Tseng, M. Hall, M. Lam, and J. Hennessy.
    ACM SIGPLAN Notices, 29(12):31-37, December 1994.

  34. Design and Implementation of a Fault-Tolerant Imprecise Computation Server on Mach
    S.-W. Liao, C. Wu, and K. Lin,
    IEEE Workshop on Imprecise Computation, December, 1992.


DISSERTATION

SUIF Explorer: an interactive and interprocedural parallelizer

Developing parallel software is a major obstacle in using shared-memory multiprocessors to solve a single task. To increase the productivity of parallel programming and to exploit the multiprocessors effectively, we developed an interactive interprocedural parallelizer called SUIF Explorer. Furthermore, our experience with SUIF Explorer drives the design of next-generation parallelizers.

As a parallel programming tool, the Explorer actively guides programmers through the parallelization process using a set of advanced static and dynamic analyses and visualization techniques. Our automatic analyses are sophisticated enough to provide high-quality information. The Explorer is the first tool to apply slicing analysis to aid the programmer in uncovering program properties for interactive parallelization. SUIF Explorer successfully minimizes the number of lines of code requiring programmer assistance, and produces fast parallel codes on a suite of real-world applications.

As a tool for finding missing compiler techniques, SUIF Explorer helps compiler researchers design next-generation parallelizers. We developed and evaluated two key interprocedural analyses, interprocedural array reduction analysis and array liveness analysis, and integrated them into the parallelizer. First, our reduction algorithm extends beyond previous approaches in its ability to locate interprocedural sparse reductions. Second, we show that interprocedural array liveness analysis is an enabler of several important optimizations and should be included in modern parallelizers. Our efficient context-sensitive and flow-sensitive array liveness algorithm is more precise than simpler schemes reported in the literature. We use the liveness information to enable contraction of arrays that are not live at loop exits, which results in a smaller memory footprint and better cache utilization. The resulting codes run faster on both uniprocessors and multiprocessors.

Advisor: Professor Monica Lam


REFERENCES

Professor Monica Lam
Gates Building 307, M/C 9030
Computer Science Department
Stanford University
Stanford, CA 94305

Professor John Hennessy
Gates Building 308, M/C 9030
Computer Science Department
Stanford University
Stanford, CA 94305

Professor Kunle Olukotun
Gates Building 302, M/C 9030
Computer Science Department
Stanford University
Stanford, CA 94305

Professor Mary Hall
Project Leader/Research Assistant Professor
Advanced System Division
Information Sciences Institute/University of Southern California
Marina del Rey, CA 90292-6695