Ph.D. Dissertation Proposal Defense: On the Characterization and Optimization of On-Chip Memory Structure Reliability
Shuai Wang
Date: December 5, 2008 (Friday)
Time: 1:00pm-3:00pm
Location: ECEC 202

Abstract:

Soft errors induced by energetic particle strikes in on-chip memory structures, such as L1 data/instruction caches and register files, have become an increasing challenge in designing new-generation reliable microprocessors. Because soft errors are transient and random, and are unrelated to the correctness of the logic, they cannot be captured by traditional verification and testing processes. At the same time, techniques such as hardware triple modular redundancy (TMR) or N-modular redundancy (NMR) might not be acceptable for commercial computer systems in most market segments. This dissertation therefore focuses on reliability characterization and cost-effective reliable design of on-chip memories against soft errors.

Given the performance, area/size, and energy constraints of different target systems, many existing unoptimized protection schemes for cache memories may prove inadequate and ineffective. This work develops new lifetime models for the data and tag arrays in both the data and instruction caches. These models facilitate the characterization of the vulnerability of cached items at various phases of their lifetime. The design methodology is further exemplified by proposed reliability schemes that target specific vulnerable phases. Furthermore, to design reliable systems for changing operating environments (error rates), this work explores a self-adaptive reliable data cache that dynamically adapts the reliability schemes it employs in order to maintain a target reliability. Besides the data/instruction caches, protecting the register file and its data buses is crucial to reliable computing in high-performance microprocessors. This work proposes to exploit narrow-width register values, which represent the majority of generated values, by duplicating the value within the same data item. This in-register duplication (IRD) eliminates the need for additional copy registers. By integrating the proposed reliable designs into data/instruction caches and register files, the vulnerability of the entire microprocessor will be dramatically reduced. The new lifetime models, the self-adaptive design, and the narrow-width value duplication scheme proposed in this work can also guide architects in designing future highly efficient reliable systems.
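
As a rough illustration of the in-register duplication idea described above, the following Python sketch models a register in software. It is not the hardware design proposed in the dissertation. The 32-bit register width, the 16-bit narrow-width threshold, the separately stored "duplicated" flag, and the function names are all assumptions made for this example.

```python
# Software sketch of in-register duplication (IRD).
# Assumptions (not from the abstract): 32-bit registers, a value counts as
# "narrow" when it fits in 16 bits after sign extension, and one extra flag
# per register records whether the register holds a duplicated value.

HALF_BITS = 16
HALF_MASK = 0xFFFF
FULL_MASK = 0xFFFFFFFF


def is_narrow(value):
    """Return True if the upper half is only the sign extension of the lower half."""
    value &= FULL_MASK
    low = value & HALF_MASK
    high = value >> HALF_BITS
    sign_ext = HALF_MASK if low & 0x8000 else 0
    return high == sign_ext


def write_register(value):
    """Return (stored_bits, duplicated_flag) for a value written to a register."""
    value &= FULL_MASK
    if is_narrow(value):
        low = value & HALF_MASK
        # The upper half carries no information, so reuse it for a copy.
        return (low << HALF_BITS) | low, True
    return value, False


def read_register(stored, duplicated):
    """Return (value, error_detected) for the stored register contents."""
    if not duplicated:
        # Wide values have no in-register copy, so nothing can be checked here.
        return stored & FULL_MASK, False
    low = stored & HALF_MASK
    copy = (stored >> HALF_BITS) & HALF_MASK
    error = low != copy
    # Rebuild the original value by sign-extending the lower half.
    sign_ext = HALF_MASK if low & 0x8000 else 0
    return (sign_ext << HALF_BITS) | low, error
```

In this sketch, a mismatch between the two halves only signals that an error occurred. Deciding which half is correct would need an additional mechanism, such as parity per half, which is outside what the abstract states.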

Committee Members:

Dr. Jie Hu, Assistant Professor, ECE Dept., NJIT (Advisor)
Dr. Sotirios G. Ziavras, Professor, ECE Dept., NJIT
Dr. Edwin Hou, Associate Professor, ECE Dept., NJIT
Dr. Roberto Rojas-Cessa, Associate Professor, ECE Dept., NJIT
Dr. Xinping Zhu, Assistant Professor, ECE Dept., Northeastern University

Note: All MS thesis and PhD dissertation (proposal) defenses count toward ECE791.