DSpace Repository

A Hierarchical Cache Coherent Protocol


dc.creator Wallach, Deborah A.
dc.date 2004-10-20T20:29:26Z
dc.date 1992-09-01
dc.date.accessioned 2013-10-09T02:48:12Z
dc.date.available 2013-10-09T02:48:12Z
dc.date.issued 2013-10-09
dc.identifier AITR-1645
dc.identifier http://hdl.handle.net/1721.1/7088
dc.identifier.uri http://koha.mediu.edu.my:8181/xmlui/handle/1721
dc.description As the number of processors in distributed-memory multiprocessors grows, efficiently supporting a shared-memory programming model becomes difficult. We have designed the Protocol for Hierarchical Directories (PHD) to allow shared-memory support for systems containing massive numbers of processors. PHD eliminates bandwidth problems by using a scalable network, decreases hot-spots by not relying on a single point to distribute blocks, and uses a scalable amount of space for its directories. PHD provides a shared-memory model by synthesizing a global shared memory from the local memories of processors. PHD supports sequentially consistent read, write, and test-and-set operations. This thesis also introduces a method of describing locality for hierarchical protocols and employs this method in the derivation of an abstract model of the protocol behavior. An embedded model, based on the work of Johnson [ISCA19], describes the protocol behavior when mapped to a k-ary n-cube. The thesis uses these two models to study the average height in the hierarchy that operations reach, the longest path messages travel, the number of messages that operations generate, the inter-transaction issue time, and the protocol overhead for different locality parameters, degrees of multithreading, and machine sizes. We determine that multithreading is only useful for approximately two to four threads; any additional interleaving does not decrease the overall latency. For small machines and high locality applications, this limitation is due mainly to the length of the running threads. For large machines with medium to low locality, this limitation is due mainly to the protocol overhead being too large. Our study using the embedded model shows that in situations where the run length between references to shared memory is at least an order of magnitude longer than the time to process a single state transition in the protocol, applications exhibit good performance. If separate controllers for processing protocol requests are included, the protocol scales to 32k processor machines as long as the application exhibits hierarchical locality: at least 22% of the global references must be able to be satisfied locally; at most 35% of the global references are allowed to reach the top level of the hierarchy.
dc.format 3979950 bytes
dc.format 3395110 bytes
dc.format application/postscript
dc.format application/pdf
dc.language en_US
dc.relation AITR-1645
dc.title A Hierarchical Cache Coherent Protocol


