Abstract:
An apparatus and method are described for implementing an exclusive lower level cache (LLC) policy within a computer processor. For example, one embodiment of a computer processor comprises: a mid-level cache circuit (MLC) for storing a first set of cache lines containing instructions and/or data; a lower level cache circuit (LLC) for storing a second set of cache lines of instructions and/or data; and an insertion circuit for implementing a policy for inserting or replacing cache lines within the LLC based on values of use recency and use frequency associated with the lines.
Abstract:
An apparatus and method are described for implementing an exclusive lower level cache (LLC) policy within a computer processor. For example, one embodiment of a computer processor comprises: a mid-level cache circuit (MLC) for storing a first set of cache lines containing instructions and/or data; a lower level cache circuit (LLC) for storing a second set of cache lines of instructions and/or data; and an insertion circuit for implementing a policy for inserting or replacing cache lines within the LLC based on values of use recency and use frequency associated with the lines.
Abstract:
Some implementations disclosed herein provide techniques and arrangements for a hierarchy-aware replacement policy for a last-level cache. A detector may be used to provide the last-level cache with information about blocks in a lower-level cache. For example, the detector may receive a notification identifying a block evicted from the lower-level cache. The notification may include a category associated with the block. The detector may identify a request that caused the block to be filled into the lower-level cache. The detector may determine whether one or more statistics associated with the category satisfy a threshold. In response to determining that the one or more statistics associated with the category satisfy the threshold, the detector may send an indication to the last-level cache that the block is a candidate for eviction from the last-level cache.
Abstract:
Some implementations disclosed herein provide techniques for caching memory data and for managing cache retention. Different cache retention policies may be applied to different cached data streams such as those of a graphics processing unit. Actual performance of the cache with respect to the data streams may be observed, and the cache retention policies may be varied based on the observed actual performance.
Abstract:
Some implementations disclosed herein provide techniques for caching memory data and for managing cache retention. Different cache retention policies may be applied to different cached data streams such as those of a graphics processing unit. Actual performance of the cache with respect to the data streams may be observed, and the cache retention policies may be varied based on the observed actual performance.
Abstract:
Techniques for software defined super core usage are described. In some examples, a first and a second processor core are to operate as a single virtual core as configured by the operating system to execute the first set of instruction segments of the single threaded program and the second set of instruction segments of the single threaded program concurrently using a shared memory space, wherein the instruction segments are to include one or more of a store instruction to store live register data to be shared with another core and a load instruction to load live register data shared by another core.
Abstract:
A processor includes an execution unit, a memory subsystem, and a memory management unit (MMU). The MMU includes logic to evaluate a first bandwidth usage of the memory subsystem and logic to evaluate a second bandwidth usage between the processor and a memory. The memory is communicatively coupled to the memory subsystem. The memory subsystem is to implement a cache for the memory. The MMU further includes logic to evaluate a request of the memory subsystem, and, based upon the first bandwidth usage and the second bandwidth usage, fulfill the request by bypassing the memory subsystem.
Abstract:
Methods and apparatus relating to adaptive admission control for on die interconnect are described. In one embodiment, admission control logic determines whether to cause a change in an admission rate of requests from one or more sources of data based at least in part on comparison of a threshold value and resource utilization information. The resource utilization information is received from a plurality of resources that are shared amongst the one or more sources of data. The threshold value is determined based at least in part on a number of the plurality of resources that are determined to be in a congested condition. Other embodiments are also disclosed.
Abstract:
A cache memory data compression and decompression technique is described. A processor device includes a memory controller unit (MCU) coupled to a main memory and a cache memory. The MCU includes a cache memory data compression and decompression module that compresses data received from the main memory. The compressed data may then be stored in the cache memory. The cache memory data compression and decompression module may also decompress data that is stored in the cache memory. For example, in response to a cache hit for data requested by a processor, the compressed data in the cache memory may be decompressed and subsequently read or operated upon by the processor.
Abstract:
A cache memory data compression and decompression technique is described. A processor device includes a memory controller unit (MCU) coupled to a main memory and a cache memory. The MCU includes a cache memory data compression and decompression module that compresses data received from the main memory. The compressed data may then be stored in the cache memory. The cache memory data compression and decompression module may also decompress data that is stored in the cache memory. For example, in response to a cache hit for data requested by a processor, the compressed data in the cache memory may be decompressed and subsequently read or operated upon by the processor.