ACM SIGMICRO

MICRO Test of Time Award: Eligible Papers


List of Eligible Papers for the 2024 Award

View the 2024 call for nominations.

MICRO 2003

Paper Title | Authors
VSV: L2-Miss-Driven Variable Supply-Voltage Scaling for Low Power | Hai Li, Chen-Yong Cher, T. N. Vijaykumar, Kaushik Roy
TLC: Transmission Line Caches | Bradford M. Beckmann, David A. Wood
Distance Associativity for High-Performance Energy-Efficient Non-Uniform Cache Architectures | Zeshan Chishti, Michael D. Powell, T. N. Vijaykumar
Near-Optimal Precharging in High-Performance Nanoscale CMOS Caches | Se-Hyun Yang, Babak Falsafi
Power-Driven Design of Router Microarchitectures in On-Chip Networks | Hangsheng Wang, Li-Shiuan Peh, Sharad Malik
Optimum Power/Performance Pipeline Depth | Allan Hartstein, Thomas R. Puzak
Processor Acceleration Through Automated Instruction Set Customization | Nathan Clark, Hongtao Zhong, Scott A. Mahlke
The Reconfigurable Streaming Vector Processor (RSVP™) | Silviu M. S. A. Chiricescu, Ray Essick, Brian Lucas, Phil May, Kent Moat, Jim Norris, Michael A. Schuette, Ali Saidi
Scaling and Characterizing Database Workloads: Bridging the Gap Between Research and Practice | Richard A. Hankins, Trung A. Diep, Murali Annavaram, Brian Hirano, Harald Eri, Hubert Nueckel, John Paul Shen
Generational Cache Management of Code Traces in Dynamic Optimization Systems | Kim M. Hazelwood, Michael D. Smith
The Performance of Runtime Data Cache Prefetching in a Dynamic Optimization System | Jiwei Lu, Howard Chen, Rao Fu, Wei-Chung Hsu, Bobbie Othmer, Pen-Chung Yew, Dong-yuan Chen
IA-32 Execution Layer: A Two-Phase Dynamic Translator Designed to Support IA-32 Applications on Itanium-Based Systems | Leonid Baraz, Tevi Devor, Orna Etzion, Shalom Goldenberg, Alex Skaletsky, Yun Wang, Yigel Zemach
LLVA: A Low-level Virtual Instruction Set Architecture | Vikram S. Adve, Chris Lattner, Michael Brukman, Anand Shukla, Brian Gaeke
Comparing Program Phase Detection Techniques | Ashutosh S. Dhodapkar, James E. Smith
Using Interaction Costs for Microarchitectural Bottleneck Analysis | Brian A. Fields, Rastislav Bodík, Mark D. Hill, Chris J. Newburn
Fast Path-Based Neural Branch Prediction | Daniel A. Jiménez
Hardware Support for Control Transfers in Code Caches | Ho-Seop Kim, James E. Smith
Exploiting Value Locality in Physical Register Files | Saisanthosh Balakrishnan, Gurindar S. Sohi
Macro-Op Scheduling: Relaxing Scheduling Loop Constraints | Ilhyun Kim, Mikko H. Lipasti
WaveScalar | Steven Swanson, Ken Michelson, Andrew Schwerin, Mark Oskin
Universal Mechanisms for Data-Parallel Architectures | Karthikeyan Sankaralingam, Stephen W. Keckler, William R. Mark, Doug Burger
Flexible Compiler-Managed L0 Buffers for Clustered VLIW Processors | Enric Gibert, F. Jesús Sánchez, Antonio González
Instruction Replication for Clustered Microarchitectures | Alex Aletà, Josep M. Codina, Antonio González, David R. Kaeli
Efficient Memory Integrity Verification and Encryption for Secure Processors | G. Edward Suh, Dwaine E. Clarke, Blaise Gassend, Marten van Dijk, Srinivas Devadas
Fast Secure Processor for Inhibiting Software Piracy and Tampering | Jun Yang, Youtao Zhang, Lan Gao
IPStash: A Power-Efficient Memory Architecture for IP-Lookup | Stefanos Kaxiras, Georgios Keramidas
Design and Implementation of High-Performance Memory Systems for Future Packet Buffers | Jorge García-Vidal, Jesús Corbal, Llorenç Cerdà, Mateo Valero
Beating In-Order Stalls With "Flea-Flicker" Two-Pass Pipelining | Ronald D. Barnes, Erik M. Nystrom, John W. Sias, Sanjay J. Patel, Nacho Navarro, Wen-mei W. Hwu
Scalable Hardware Memory Disambiguation for High ILP Processors | Simha Sethumadhavan, Rajagopalan Desikan, Doug Burger, Charles R. Moore, Stephen W. Keckler
Reducing Design Complexity of the Load/Store Queue | Il Park, Chong-liang Ooi, T. N. Vijaykumar
Checkpoint Processing and Recovery: Towards Scalable Large Instruction Window Processors | Haitham Akkary, Ravi Rajwar, Srikanth T. Srinivasan

MICRO 2004

Paper Title | Authors
The Fuzzy Correlation Between Code and Performance Predictability | Murali Annavaram, Ryan N. Rakvic, Marzia Polito, Jean-Yves Bouguet, Richard A. Hankins, Bob Davies
Wrong Path Events: Exploiting Unusual and Illegal Program Behavior for Early Misprediction Detection and Recovery | David N. Armstrong, Hyesoon Kim, Onur Mutlu, Yale N. Patt
Cache Refill/Access Decoupling for Vector Machines | Christopher Batten, Ronny Krashinsky, Steve Gerding, Krste Asanovic
Managing Wire Delay in Large Chip-Multiprocessor Caches | Bradford M. Beckmann, David A. Wood
Dataflow Mini-Graphs: Amplifying Superscalar Capacity and Bandwidth | Anne Bracy, Prashant Prahlad, Amir Roth
Automatic Synthesis of High-Speed Processor Simulators | Martin Burtscher, Ilya Ganusov
Dynamically Controlled Resource Allocation in SMT Processors | Francisco J. Cazorla, Alex Ramírez, Mateo Valero, Enrique Fernández
Application-Specific Processing on a General-Purpose Core via Transparent Instruction Set Customization | Nathan Clark, Manjunath Kudlur, Hyunchul Park, Scott A. Mahlke, Krisztián Flautner
Control Flow Optimization Via Dynamic Reconvergence Prediction | Jamison D. Collins, Dean M. Tullsen, Hong Wang
Minos: Control Data Attack Prevention Orthogonal to Memory Model | Jedidiah R. Crandall, Frederic T. Chong
A Hardware-Software Platform for Intrusion Prevention | Milenko Drinic, Darko Kirovski
Dynamically Trading Frequency for Complexity in a GALS Microprocessor | Steven G. Dropsho, Greg Semeraro, David H. Albonesi, Grigorios Magklis, Michael L. Scott
Register Packing: Exploiting Narrow-Width Operands for Reducing Register File Pressure | Oguz Ergin, Deniz Balkan, Kanad Ghose, Dmitry V. Ponomarev
Compiler Optimizations for Transaction Processing Workloads on Itanium Linux Systems | Gerolf Hoflehner, Knud Kirkegaard, Rod Skinner, Daniel M. Lavery, Yong-Fong Lee, Wei Li
Adaptive History-Based Memory Schedulers | Ibrahim Hur, Calvin Lin
Conjoined-Core Chip Multiprocessing | Rakesh Kumar, Norman P. Jouppi, Dean M. Tullsen
A Case for Clumsy Packet Processors | Arindam Mallik, Gokhan Memik
Pinpointing Representative Portions of Large Intel Itanium Programs With Dynamic Instrumentation | Harish Patil, Robert S. Cohn, Mark Charney, Rajiv Kapoor, Andrew Sun, Anand Karunanidhi
MicroLib: A Case for the Quantitative Comparison of Micro-Architecture Mechanisms | Daniel Gracia Pérez, Gilles Mouchard, Olivier Temam
Memory Controller Optimizations for Web Servers | Scott Rixner
Dynamic Strands: Collapsing Speculative Dependence Chains for Reducing Pipeline Communication | Peter G. Sassone, D. Scott Wills
Thermal Modeling, Characterization and Management of On-Chip Networks | Li Shang, Li-Shiuan Peh, Amit Kumar, Niraj K. Jha
Optimal Superblock Scheduling Using Enumeration | Ghassan Shobaki, Kent D. Wilken
Efficient Resource Sharing in Concurrent Error Detecting Superscalar Microarchitectures | Jared C. Smolens, Jangwoo Kim, James C. Hoe, Babak Falsafi
Hardware and Binary Modification Support for Code Pointer Protection From Buffer Overflow | Nathan Tuck, Brad Calder, George Varghese
Balanced Multithreading: Increasing Throughput via a Low Cost Multithreading Hierarchy | Eric Tune, Rakesh Kumar, Dean M. Tullsen, Brad Calder
RIFLE: An Architectural Framework for User-Centric Information-Flow Security | Neil Vachharajani, Matthew J. Bridges, Jonathan Chang, Ram Rangan, Guilherme Ottoni, Jason A. Blome, George A. Reis, Manish Vachharajani, David I. August
Whole Execution Traces | Xiangyu Zhang, Rajiv Gupta
AccMon: Automatically Detecting Memory-Related Bugs via Program Counter-Based Invariants | Pin Zhou, Wei Liu, Long Fei, Shan Lu, Feng Qin, Yuanyuan Zhou, Samuel P. Midkiff, Josep Torrellas

MICRO 2005

Paper Title | Authors
How to Fake 1000 Registers | David W. Oehmke, Nathan L. Binkert, Trevor Mudge, Steven K. Reinhardt
Reducing Instruction Fetch Cost by Packing Instructions Into Register Windows | Stephen Hines, Gary Tyson, David Whalley
Efficient Use of Invisible Registers in Thumb Code | Arvind Krishnaswamy, Rajiv Gupta
Wish Branches: Combining Conditional Branching and Predication for Adaptive Predicated Execution | Hyesoon Kim, Onur Mutlu, Jared Stark, Yale N. Patt
A Criticality Analysis of Clustering in Superscalar Processors | Pierre Salverda, Craig Zilles
Incremental Commit Groups for Non-Atomic Trace Processing | Matt T. Yourst, Kanad Ghose
Pinot: Speculative Multi-Threading Processor Architecture Exploiting Parallelism Over a Wide Range of Granularities | Taku Ohsawa, Masamichi Takagi, Shoji Kawahara, Satoshi Matsushita
Dynamic Helper Threaded Prefetching on the Sun UltraSPARC CMP Processor | Jiwei Lu, Abhinav Das, Wei-Chung Hsu, Khoa Nguyen, Santosh G. Abraham
Automatic Thread Extraction With Decoupled Software Pipelining | Guilherme Ottoni, Ram Rangan, Adam Stoler, David I. August
Exploiting Vector Parallelism in Software Pipelined Loops | Samuel Larsen, Rodric Rabbah, Saman Amarasinghe
Continuous Path and Edge Profiling | Michael D. Bond, Kathryn S. McKinley
Improving Region Selection in Dynamic Optimization Systems | David Hiniker, Kim Hazelwood, Michael D. Smith
Scalable Store-Load Forwarding via Store Queue Index Prediction | Tingting Sha, Milo M. K. Martin, Amir Roth
Address-Indexed Memory Disambiguation and Store-to-Load Forwarding | Sam S. Stone, Kevin M. Woley, Matthew I. Frank
Store Memory-Level Parallelism Optimizations for Commercial Applications | Yuan Chou, Lawrence Spracklen, Santosh G. Abraham
A Mechanism for Online Diagnosis of Hard Faults in Microprocessors | Fred A. Bower, Daniel J. Sorin, Sule Ozev
μComplexity: Estimating Processor Design Effort | Cyrus Bazeghi, Francisco J. Mesa-Martinez, Jose Renau
Cost Sensitive Modulo Scheduling in a Loop Accelerator Synthesis System | Kevin Fan, Manjunath Kudlur, Hyunchul Park, Scott Mahlke
Address-Value Delta (AVD) Prediction: Increasing the Effectiveness of Runahead Execution by Exploiting Regular Memory Allocation Patterns | Onur Mutlu, Hyesoon Kim, Yale N. Patt
Cherry-MP: Correctly Integrating Checkpointed Early Resource Recycling in Chip Multiprocessors | Meyrem Kirman, Nevin Kirman, José F. Martínez
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing | Smruti R. Sarangi, Wei Liu, Josep Torrellas, Yuanyuan Zhou
A Dynamic Compilation Framework for Controlling Microprocessor Energy and Performance | Qiang Wu, Margaret Martonosi, Douglas W. Clark, V. J. Reddi, Dan Connors, Youfeng Wu, Jin Lee, David Brooks
Thermal Management of On-Chip Caches Through Power Density Minimization | Ja Chun Ku, Serkan Ozdemir, Gokhan Memik, Yehea Ismail
Balancing Resource Utilization to Mitigate Power Density in Processor Pipelines | Michael D. Powell, Ethan Schuchman, T. N. Vijaykumar
A Quantum Logic Array Microarchitecture: Scalable Quantum Data Movement and Computation | Tzvetan S. Metodi, Darshan D. Thaker, Andrew W. Cross, Frederic T. Chong, Isaac L. Chuang
"Flea-flicker" Multipass Pipelining: An Alternative to the High-Power Out-of-Order Offense | Ronald D. Barnes, Shane Ryoo, Wen-mei W. Hwu
The TM3270 Media-Processor | Jan-Willem van de Waerdt, Stamatis Vassiliadis, Sanjeev Das, Sebastian Mirolo, Chris Yen, Bill Zhong, Carlos Basto, Jean-Paul van Itegem, Dinesh Amirtharaj, Kulbhushan Kalra, Pedro Rodriguez, Hans van Antwerpen
Stream Programming on General-Purpose Processors | Jayanth Gummaraju, Mendel Rosenblum
Shader Performance Analysis on a Modern GPU Architecture | Victor Moya, Carlos Gonzalez, Jordi Roca, Agustin Fernandez, Roger Espasa

MICRO 2006

Paper Title | Authors
Virtually Pipelined Network Memory | Banit Agrawal, Timothy Sherwood
ASR: Adaptive Selective Replication for CMP Caches | Bradford M. Beckmann, Michael R. Marty, David A. Wood
Die Stacking (3D) Microarchitecture | Bryan Black, Murali Annavaram, Ned Brekelbaum, John DeVale, Lei Jiang, Gabriel H. Loh, Don McCauley, Patrick Morrow, Donald W. Nelson, Daniel Pantuso, Paul Reed, Jeff Rupley, Sadasivan Shankar, John Paul Shen, Clair Webb
Serialization-Aware Mini-Graphs: Performance With Fewer Resources | Anne Bracy, Amir Roth
DMDC: Delayed Memory Dependence Checking Through Age-Based Filtering | Fernando Castro, Luis Piñuel, Daniel Chaver, Manuel Prieto, Michael C. Huang, Francisco Tirado
Managing Distributed, Shared L2 Caches Through OS-Level Page Allocation | Sangyeun Cho, Lei Jin
In-Network Cache Coherence | Noel Eisley, Li-Shiuan Peh, Li Shang
Fairness and Throughput in Switch on Event Multithreading | Ron Gabor, Shlomo Weiss, Avi Mendelson
Data-Dependency Graph Transformations for Superblock Scheduling | Mark Heffernan, Kent D. Wilken, Ghassan Shobaki
Memory Prefetching Using Adaptive Stream Detection | Ibrahim Hur, Calvin Lin
An Analysis of Efficient Multi-Core Global Power Management Policies: Maximizing Performance for a Given Power Budget | Canturk Isci, Alper Buyuktosunoglu, Chen-Yong Cher, Pradip Bose, Margaret Martonosi
Live, Runtime Phase Monitoring and Prediction on Real Systems With Application to Dynamic Power Management | Canturk Isci, Gilberto Contreras, Margaret Martonosi
A Predictive Performance Model for Superscalar Processors | P. J. Joseph, Kapil Vaswani, Matthew J. Thazhuthaveetil
Diverge-Merge Processor (DMP): Dynamic Predicated Execution of Complex Control-Flow Graphs Based on Frequently Executed Paths | Hyesoon Kim, José A. Joao, Onur Mutlu, Yale N. Patt
Leveraging Optical Technology in Future Bus-Based Chip Multiprocessors | Nevin Kirman, Meyrem Kirman, Rajeev K. Dokania, José F. Martínez, Alyssa B. Apsel, Matthew A. Watkins, David H. Albonesi
Mitigating the Impact of Process Variations on Processor Register Files and Execution Units | Xiaoyao Liang, David M. Brooks
PathExpander: Architectural Support for Increasing the Path Coverage of Dynamic Bug Detection | Shan Lu, Pin Zhou, Wei Liu, Yuanyuan Zhou, Josep Torrellas
Merging Head and Tail Duplication for Convergent Hyperblock Formation | Bertrand A. Maher, Aaron Smith, Doug Burger, Kathryn S. McKinley
Coherence Ordering for Ring-Based Chip Multiprocessors | Michael R. Marty, Mark D. Hill
A Floorplan-Aware Dynamic Inductive Noise Controller for Reliable Processor Design | Fayez Mohamood, Michael B. Healy, Sung Kyu Lim, Hsien-Hsin S. Lee
Fair Queuing Memory Systems | Kyle J. Nesbit, Nidhi Aggarwal, James Laudon, James E. Smith
ViChaR: A Dynamic Virtual Channel Regulator for Network-on-Chip Routers | Chrysostomos Nicopoulos, Dongkook Park, Jongman Kim, Narayanan Vijaykrishnan, Mazin S. Yousif, Chita R. Das
Yield-Aware Cache Architectures | Serkan Ozdemir, Debjit Sinha, Gokhan Memik, Jonathan Adams, Hai Zhou
CAPSULE: Hardware-Assisted Parallel Execution of Component-Based Programs | Pierre Palatin, Yves Lhuillier, Olivier Temam
LIFT: A Low-Overhead Practical Information Flow Tracking System for Detecting Security Attacks | Feng Qin, Cheng Wang, Zhenmin Li, Ho-Seop Kim, Yuanyuan Zhou, Youfeng Wu
Support for High-Frequency Streaming in CMPs | Ram Rangan, Neil Vachharajani, Adam Stoler, Guilherme Ottoni, David I. August, George Z. N. Cai
Architectural Support for Software Transactional Memory | Bratin Saha, Ali-Reza Adl-Tabatabai, Quinn Jacobson
Exploiting Fine-Grained Data Parallelism With Chip Multiprocessors and Fast Barriers | Jack Sampson, Rubén González, Jean-Francois Collard, Norman P. Jouppi, Michael S. Schlansker, Brad Calder
Distributed Microarchitectural Protocols in the TRIPS Prototype Processor | Karthikeyan Sankaralingam, Ramadass Nagarajan, Robert G. McDonald, Rajagopalan Desikan, Saurabh Drolia, M. S. Govindan, Paul Gratz, Divya Gulati, Heather Hanson, Changkyu Kim, Haiming Liu, Nitya Ranganathan, Simha Sethumadhavan, Sadia Sharif, Premkishore Shivakumar, Stephen W. Keckler, Doug Burger
Phoenix: Detecting and Recovering From Permanent Processor Design Bugs With Programmable Hardware | Smruti R. Sarangi, Abhishek Tiwari, Josep Torrellas
NoSQ: Store-Load Communication Without a Store Queue | Tingting Sha, Milo M. K. Martin, Amir Roth
Authentication Control Point and Its Implications for Secure Processor Design | Weidong Shi, Hsien-Hsin S. Lee
Dataflow Predication | Aaron Smith, Ramadass Nagarajan, Karthikeyan Sankaralingam, Robert G. McDonald, Doug Burger, Stephen W. Keckler, Kathryn S. McKinley
Reunion: Complexity-Effective Multicore Redundancy | Jared C. Smolens, Brian T. Gold, Babak Falsafi, James C. Hoe
Fire-and-Forget: Load/Store Scheduling With No Store Queue at All | Samantika Subramaniam, Gabriel H. Loh
Adaptive Caches: Effective Shaping of Cache Behavior to Workloads | Ranjith Subramanian, Yannis Smaragdakis, Gabriel H. Loh
Scalable Cache Miss Handling for High Memory-Level Parallelism | James Tuck, Luis Ceze, Josep Torrellas
Molecular Caches: A Caching Structure for Dynamic Creation of Application-Specific Heterogeneous Cache Regions | Keshavan Varadarajan, S. K. Nandy, Vishal Sharda, Bharadwaj Amrutur, Ravi R. Iyer, Srihari Makineni, Donald Newell
Dynamic Standby Prediction for Leakage Tolerant Microprocessor Functional Units | Ahmed Youssef, Mohab Anis, Mohamed I. Elmasry
Memory Protection Through Dynamic Access Control | Kun Zhang, Tao Zhang, Santosh Pande
Using Branch Correlation to Identify Infeasible Paths for Anomaly Detection | Xiaotong Zhuang, Tao Zhang, Santosh Pande

MICRO 2007

Paper Title | Authors
Penelope: The NBTI-Aware Processor | Jaume Abella, Xavier Vera, Antonio González
Scavenger: A New Last Level Cache Architecture With Global Block Priority | Arkaprava Basu, Nevin Kirman, Meyrem Kirman, Mainak Chaudhuri, José F. Martínez
Self-Calibrating Online Wearout Detection | Jason A. Blome, Shuguang Feng, Shantanu Gupta, Scott A. Mahlke
Revisiting the Sequential Programming Model for Multi-Core | Matthew J. Bridges, Neil Vachharajani, Yun Zhang, Thomas B. Jablin, David I. August
FPGA-Accelerated Simulation Technologies (FAST): Fast, Full-System, Cycle-Accurate Simulators | Derek Chiou, Dam Sunwoo, Joonsoo Kim, Nikhil A. Patil, William H. Reinhart, Darrel Eric Johnson, Jebediah Keefe, Hari Angepat
Informed Microarchitecture Design Space Exploration Using Workload Dynamics | Chang-Burm Cho, Wangyuan Zhang, Tao Li
Low-Cost Epoch-Based Correlation Prefetching for Commercial Applications | Yuan Chou
Data Access Partitioning for Fine-Grain Parallelism on Multicore Architectures | Michael L. Chu, Rajiv A. Ravindran, Scott A. Mahlke
Software-Based Online Detection of Hardware Defects: Mechanisms, Architectural Support, and Evaluation | Kypros Constantinides, Onur Mutlu, Todd M. Austin, Valeria Bertacco
Microarchitectural Design Space Exploration Using an Architecture-Centric Approach | Christophe Dubach, Timothy M. Jones, Michael F. P. O'Boyle
Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow | Wilson W. L. Fung, Ivan Sham, George L. Yuan, Tor M. Aamodt
Smart Refresh: An Enhanced Memory Controller Design for Reducing Energy in Conventional and 3D Die-Stacked DRAMs | Mrinmoy Ghosh, Hsien-Hsin S. Lee
A Framework for Providing Quality of Service in Chip Multi-Processors | Fei Guo, Yan Solihin, Li Zhao, Ravishankar R. Iyer
Guaranteeing Hits to Improve the Efficiency of a Small Instruction Cache | Stephen Hines, David B. Whalley, Gary S. Tyson
Flattened Butterfly Topology for On-Chip Networks | John Kim, James D. Balfour, William J. Dally
Multi-Bit Error Tolerant Caches Using Two-Dimensional Error Coding | Jangwoo Kim, Nikos Hardavellas, Ken Mai, Babak Falsafi, James C. Hoe
Composable Lightweight Processors | Changkyu Kim, Simha Sethumadhavan, M. S. Govindan, Nitya Ranganathan, Divya Gulati, Doug Burger, Stephen W. Keckler
Impact of Cache Coherence Protocols on the Processing of Network Traffic | Amit Kumar, Ram Huggahalli
Process Variation Tolerant 3T1D-Based Cache Architectures | Xiaoyao Liang, Ramon Canal, Gu-Yeon Wei, David M. Brooks
Leveraging 3D Technology for Improved Reliability | Niti Madan, Rajeev Balasubramonian
Argus: Low-Cost, Comprehensive Error Detection in Simple Cores | Albert Meixner, Michael E. Bauer, Daniel J. Sorin
Effective Optimistic-Checker Tandem Core Design Through Architectural Pruning | Francisco J. Mesa-Martinez, Jose Renau
Optimizing NUCA Organizations and Wiring Alternatives for Large Caches With CACTI 6.0 | Naveen Muralimanohar, Rajeev Balasubramonian, Norman P. Jouppi
Stall-Time Fair Memory Access Scheduling for Chip Multiprocessors | Onur Mutlu, Thomas Moscibroda
Time Interpolation: So Many Metrics, So Few Registers | Todd Mytkowicz, Peter F. Sweeney, Matthias Hauswirth, Amer Diwan
Global Multi-Threaded Instruction Scheduling | Guilherme Ottoni, David I. August
Emulating Optimal Replacement With a Shepherd Cache | Kaushik Rajan, Ramaswamy Govindarajan
Using Address Independent Seed Encryption and Bonsai Merkle Trees to Make Secure Processors OS- and Performance-Friendly | Brian Rogers, Siddhartha Chhabra, Milos Prvulovic, Yan Solihin
Implementing Signatures for Transactional Memory | Daniel Sánchez, Luke Yen, Mark D. Hill, Karthikeyan Sankaralingam
Uncorq: Unconstrained Snoop Request Delivery in Embedded-Ring Multiprocessors | Karin Strauss, Xiaowei Shen, Josep Torrellas
Mitigating Parameter Variation With Dynamic Fine-Grain Body Biasing | Radu Teodorescu, Jun Nakano, Abhishek Tiwari, Josep Torrellas
A Practical Approach to Exploiting Coarse-Grained Pipeline Parallelism in C Programs | William Thies, Vikram Chandrasekhar, Saman P. Amarasinghe
Optimal Versus Heuristic Global Code Scheduling | Sebastian Winkel
The Art of Deception: Adaptive Precision Reduction for Area Efficient Physics Acceleration | Thomas Y. Yeh, Petros Faloutsos, Milos D. Ercegovac, Sanjay J. Patel, Glenn Reinman
A Framework for Coarse-Grain Optimizations in the On-Chip Memory Hierarchy | Jason Zebchuk, Elham Safi, Andreas Moshovos