Using Pattern Information to Improve Memory Performance


Rui Min
Dept. of Electrical & Computer Engineering & Computer Science
University of Cincinnati
February 26, 11:00AM-12:00PM, 202 ECEC

Memory references sometimes bear clear pattern information. This information can be effectively detected and used to optimize cache/memory systems, virtual memory systems, and storage systems.

The presentation consists of two parts. First, we present a comprehensive study of the general memory behavior of the NPB (NAS Parallel Benchmarks) applications. The study introduces a new mechanism, called Plot Cache, for collecting runtime memory reference information (the timespace plot) and using it to improve paging performance. The Plot Cache system consists of three components. The plot-collecting component resides in the CPU and records TLB misses; the information it collects is stored in timespace plots in memory. The pattern-recognition component, which can be implemented as part of the OS, recognizes patterns in the plot information and uses them to guide paging. The Paging Optimizer, which is implemented as part of the page fault handler, replaces memory pages following the hints provided by the pattern-recognition component. This technique provides a better platform for runtime swap optimization than OS-based strategies, as it exploits information that is unavailable to the OS. Additionally, these hardware/software components introduce little system overhead. Results from trace-driven simulations of the NPB benchmarks show very promising performance improvements. Our replacement algorithm can drastically reduce the number of page faults. When combined with pre-fetching and page clustering, our strategy can reduce the swapping overhead of conventional demand paging systems up to a magnitude of two.

Second, modern CPUs often use large physically-indexed caches that are direct-mapped or have low associativity. Such caches do not interact well with virtual memory systems. An improperly placed physical page ends up in the wrong place in the cache, causing excessive conflicts with other cached pages. Page coloring has been proposed to reduce these conflict misses by carefully placing pages in physical memory. While page coloring works well for some applications, many factors limit its performance. In particular, page coloring limits the freedom of the page placement system and may increase swapping traffic. In this work, we propose a novel and simple architecture called color-indexed, physically-tagged caches that can significantly reduce conflict misses. With some simple modifications to the TLB (Translation Look-aside Buffer), the new architecture decouples the addresses of the cache from the addresses of the main memory. Since cache addresses no longer depend on physical memory addresses, the system can freely place data in any cache page to minimize conflict misses, without affecting the paging system. Extensive trace-driven simulation results show that our design performs much better than traditional page coloring techniques. The new scheme enables a direct-mapped cache to achieve hit ratios very close to or better than those of a two-way set-associative cache. Moreover, the architecture does not increase cache access latency, which is a drawback of set-associative caches. The hardware overhead is minimal. We show that our scheme can reduce the cache size by 50% without sacrificing performance. A two-way set-associative cache that uses this strategy can perform very close to a fully-associative cache.
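
As background for the second part, the conventional notion of a page "color" in a physically-indexed cache can be sketched as follows. This is only an illustration of standard page coloring, not the color-indexed architecture presented in the talk; the cache size, page size, and function names are assumed for the example.

```python
# Illustrative sketch of conventional page coloring.
# All sizes and names below are assumptions chosen for the example.

def num_colors(cache_size: int, associativity: int, page_size: int) -> int:
    """Number of distinct page colors in a physically-indexed cache.

    One way of the cache holds cache_size / associativity bytes, so it is
    divided into that many page-sized regions ("colors").
    """
    return cache_size // (associativity * page_size)


def page_color(frame_number: int, colors: int) -> int:
    """Color of a physical page frame: the cache region it maps to."""
    return frame_number % colors


PAGE_SIZE = 4096            # assumed 4 KiB pages
CACHE_SIZE = 1024 * 1024    # assumed 1 MiB direct-mapped cache
COLORS = num_colors(CACHE_SIZE, associativity=1, page_size=PAGE_SIZE)

# Two frames whose numbers differ by a multiple of COLORS share a color,
# so in a direct-mapped cache their lines compete for the same cache region.
frame_a = 3
frame_b = 3 + COLORS
conflict = page_color(frame_a, COLORS) == page_color(frame_b, COLORS)

# Adjacent frames get different colors and do not conflict with each other.
no_conflict = page_color(frame_a, COLORS) != page_color(frame_a + 1, COLORS)
```

Under these assumed sizes, the operating system can avoid such conflicts only by constraining which physical frame it hands out, which is the placement restriction the abstract refers to.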

Biography

Rui Min received his B.S. and M.S. degrees in Computer Science from Huazhong University, China, in July 1996 and 1999, respectively. He received the Ph.D. degree from the University of Cincinnati in December 2003. Rui Min conducts research in the general areas of Computer Architecture and Operating Systems. He has published papers in journals such as IEEE Transactions on Computers (TOC) and in major conferences such as the ACM Joint International Conference on Measurement & Modeling of Computer Systems (SIGMETRICS), the International Conference on Parallel Processing (ICPP), and the International Conference on VLSI Design.