The 15th International Symposium on High-Performance Computer Architecture (HPCA-15)

Welcome to HPCA-15

Tutorials & Workshops



General Workshops & Tutorials Schedule:













The general schedule applies to both workshop and tutorial days: Saturday, Feb. 14, and Sunday, Feb. 15.  Please see the individual workshop or tutorial listing and its linked web page for more detail.  Note that morning-only and all-day workshops and tutorials start at 8:30am, while afternoon-only workshops and tutorials start at 1:00pm.






2nd Workshop on Emerging Applications and Many-Core Architecture (EAMA)


12th Workshop on Computer Architecture Evaluation using Commercial Workloads (CAECW)



Integrating Parallelism Throughout the Undergraduate Computing Curriculum

PPoPP 2009 Workshop, for more information see:



SUNDAY, Feb. 15


1st JILP Data Prefetching Championship (DPC-1) (Afternoon only)


Interact-13 Workshop on Interaction between Compilers and Computer Architecture (Afternoon only)


Workshop on 3D Integration and Interconnection-Centric Architectures


1st International Workshop on the Influence of I/O on Microprocessor Architecture


4th ACM SIGPLAN Workshop on Transactional Computing (TRANSACT 2009)

PPoPP 2009 Workshop, for more information see:





The PARSEC Benchmark Suite Tutorial

Christian Bienia, Princeton University


SUNDAY, Feb. 15 (Morning only)


The Princeton Application Repository for Shared-Memory Computers (PARSEC) is a benchmark suite for studies of Chip-Multiprocessors (CMPs). Previously available benchmarks for multiprocessors focused on high-performance computing applications and used a limited number of synchronization methods. PARSEC includes emerging applications in recognition, mining and synthesis (RMS) as well as systems applications that mimic large-scale multi-threaded commercial programs. The benchmark suite is publicly available. This tutorial covers the practical aspects of using PARSEC for computer architecture research.



1) Understanding PARSEC

2) Working with PARSEC

3) Adapting PARSEC

4) Concluding Remarks



InfiniBand and 10-Gigabit Ethernet Architectures for Emerging HPC Clusters and Enterprise Datacenters

D.K. Panda, The Ohio State University

Pavan Balaji, Argonne National Laboratory


SATURDAY, Feb. 14 (Afternoon only)


InfiniBand Architecture (IB) and 10-Gigabit Ethernet (10GE) are generating considerable interest as building blocks for next-generation High Performance Computing (HPC) systems and enterprise datacenters. This tutorial will provide an overview of these emerging interconnect architectures, the features they offer, their current market standing, and their suitability for prime-time HPC. It will start with a brief overview of IB, 10GE and their architectural features. Next, it will present an overview of the emerging OpenFabrics stack, which encapsulates both IB and 10GE in a unified manner. IB and 10GE hardware/software solutions and market trends will then be highlighted. Finally, sample performance numbers will show what these technologies can achieve in different environments such as MPI, Sockets, Parallel File Systems, Multi-tier Datacenters, and Virtual Machines.



1) Making the attendees familiar with the IB and 10GE architectures and the associated benefits

2) Demonstrating how the OpenFabrics stack is trying to provide a convergence between these two standards

3) Providing an overview of available IB and 10GE hardware/software solutions

4) Outlining case studies of designing next-generation systems (HPC with MPI, Sockets, File systems, Storage systems, Multi-tier Datacenters and Virtual Machines) while taking advantage of IB and 10GE features and multi-core computing platforms



Fast Simulation Without Bogus Results

Paul Bryan, Georgia Institute of Technology


SATURDAY, Feb. 14 (Afternoon only)


Contemporary hardware design is driven by simulation. Although simulation is an invaluable tool for evaluating design tradeoffs, growing simulator complexity and workload size have made it increasingly time consuming. Since exhaustive simulation of workloads is prohibitively expensive, some researchers have attempted to lower this cost at their peril. It is still common for researchers to simulate an arbitrarily chosen number of instructions when evaluating a technique and, as a result, to obtain inaccurate or misleading results. In contrast, applying statistical sampling to hardware simulation can significantly reduce simulation cost while still achieving high levels of accuracy.


This tutorial will provide a thorough background in the statistical concepts and techniques commonly used with sampling, explained in the context of hardware simulation environments. To obtain accurate measurements, two types of bias must first be removed: non-sampling bias and sampling bias. Each type and its importance will be discussed in detail. Non-sampling bias removal techniques (warm-up methods) will be covered, including the Reverse State Reconstruction algorithm. Sampling bias removal will be covered in the context of developing a sampling regimen, including the Single-Pass Sampling Regimen Design algorithm.


Finally, this tutorial will provide a detailed checklist for researchers to use to more easily incorporate sampling into their own simulation environments. Since complexity is often cited as a reason that researchers do not use sampling in their own studies, a goal of this tutorial will be to help researchers overcome this hurdle.
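To make the sampling idea in this abstract concrete, the following is a minimal Python sketch, not the tutorial's own material. It assumes a hypothetical list of per-interval CPI values (synthetic data generated here), draws a systematic sample from it, and reports the sample mean with a normal-approximation 95% confidence interval. The function names, the sampling interval of 97, and the synthetic workload are all illustrative assumptions; the sketch does not implement Reverse State Reconstruction or the Single-Pass Sampling Regimen Design algorithm, and it ignores warm-up (non-sampling bias) entirely.

```python
import math
import random
import statistics


def systematic_sample(values, interval, offset=0):
    """Return every `interval`-th element of `values`, starting at `offset`."""
    return values[offset::interval]


def mean_with_confidence(sample, z=1.96):
    """Return (mean, half_width) of a normal-approximation confidence interval.

    z=1.96 corresponds to roughly 95% confidence when the sample is large
    and its elements are approximately independent.
    """
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean, z * std_err


# Synthetic per-interval CPI values standing in for a full detailed simulation.
# (Hypothetical data: a slow phase pattern plus Gaussian noise.)
rng = random.Random(0)
full_run_cpi = [
    1.0 + 0.5 * math.sin(i / 500.0) + rng.gauss(0.0, 0.1)
    for i in range(100_000)
]

# Measure only about 1% of the intervals instead of all of them.
sample = systematic_sample(full_run_cpi, interval=97)
estimate, half_width = mean_with_confidence(sample)
true_mean = statistics.fmean(full_run_cpi)

print(f"sampled {len(sample)} of {len(full_run_cpi)} intervals")
print(f"estimated CPI = {estimate:.4f} +/- {half_width:.4f}")
print(f"full-run CPI  = {true_mean:.4f}")
```

In a real simulation environment the full-run values would not be available, which is the point of sampling: only the sampled intervals are simulated in detail, and the confidence interval indicates how far the estimate may be from the true value. Systematic sampling can interact badly with periodic program behavior, so the interval choice here should be read as an assumption, not a recommendation.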



1) Background of statistical concepts and techniques used with sampling

2) Discussion of the two types of bias (non-sampling and sampling) and their importance

3) Non-sampling bias removal, including Reverse State Reconstruction

4) Development of a sampling regimen, including the Single-Pass Sampling Regimen Design algorithm

5) Detailed checklist for researchers to incorporate sampling in simulation environments



Cetus: A Source-to-Source Compiler Infrastructure for Multi-Cores

PPoPP 2009 Tutorial by Rudi Eigenmann and Sam Midkiff (Purdue University)


Morning Only

See the PPoPP 2009 Workshops and Tutorials page for more information



Programming Models and Compiler Optimizations for GPUs and Multi-Core Processors

PPoPP 2009 Tutorial by J. (Ram) Ramanujam (Louisiana State University) and P. (Saday) Sadayappan (The Ohio State University)


Afternoon Only

See the PPoPP 2009 Workshops and Tutorials page for more information



Practical Formal Verification of MPI and Thread Programs

PPoPP 2009 Tutorial by Ganesh Gopalakrishnan, Sarvani Vakkalanka, and Yu Yang (University of Utah)


Full Day

See the PPoPP 2009 Workshops and Tutorials page for more information


Brian Rogers, NC State University

High-Performance Computer Architecture

Raleigh, North Carolina - February 14-18, 2009

8:00am - 8:30am
8:30am - 10:00am     First Morning Session
10:00am - 10:30am
10:30am - Noon       Second Morning Session
Noon - 1:00pm
1:00pm - 2:30pm      First Afternoon Session
2:30pm - 3:00pm
3:00pm - 5:00pm      Second Afternoon Session