A Hierarchical Cache Coherent Protocol

dc.creator	Wallach, Deborah A.
dc.date	2004-10-20T20:29:26Z
dc.date	1992-09-01
dc.date.accessioned	2013-10-09T02:48:12Z
dc.date.available	2013-10-09T02:48:12Z
dc.date.issued	2013-10-09
dc.identifier	AITR-1645
dc.identifier	http://hdl.handle.net/1721.1/7088
dc.identifier.uri	http://koha.mediu.edu.my:8181/xmlui/handle/1721
dc.description	As the number of processors in distributed-memory multiprocessors grows, efficiently supporting a shared-memory programming model becomes difficult. We have designed the Protocol for Hierarchical Directories (PHD) to allow shared-memory support for systems containing massive numbers of processors. PHD eliminates bandwidth problems by using a scalable network, decreases hot-spots by not relying on a single point to distribute blocks, and uses a scalable amount of space for its directories. PHD provides a shared-memory model by synthesizing a global shared memory from the local memories of processors. PHD supports sequentially consistent read, write, and test-and-set operations. This thesis also introduces a method of describing locality for hierarchical protocols and employs this method in the derivation of an abstract model of the protocol behavior. An embedded model, based on the work of Johnson [ISCA19], describes the protocol behavior when mapped to a k-ary n-cube. The thesis uses these two models to study the average height in the hierarchy that operations reach, the longest path messages travel, the number of messages that operations generate, the inter-transaction issue time, and the protocol overhead for different locality parameters, degrees of multithreading, and machine sizes. We determine that multithreading is only useful for approximately two to four threads; any additional interleaving does not decrease the overall latency. For small machines and high locality applications, this limitation is due mainly to the length of the running threads. For large machines with medium to low locality, this limitation is due mainly to the protocol overhead being too large. Our study using the embedded model shows that in situations where the run length between references to shared memory is at least an order of magnitude longer than the time to process a single state transition in the protocol, applications exhibit good performance. If separate controllers for processing protocol requests are included, the protocol scales to 32k processor machines as long as the application exhibits hierarchical locality: at least 22% of the global references must be able to be satisfied locally; at most 35% of the global references are allowed to reach the top level of the hierarchy.
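The hierarchical-locality condition stated at the end of the abstract can be illustrated with a small check. This is only a sketch under stated assumptions, not code from the thesis: the function names, the list-of-fractions representation of a reference profile, and the reading of "32k" as 32768 nodes are all hypothetical. Only the 22% and 35% thresholds and the k-ary n-cube topology come from the abstract.

```python
# Illustrative sketch only; names and data layout are assumptions, not from the thesis.

def kary_ncube_nodes(k: int, n: int) -> int:
    """Number of nodes in a k-ary n-cube (k nodes along each of n dimensions)."""
    return k ** n


def meets_locality_condition(level_fractions, min_local=0.22, max_top=0.35):
    """Check the locality bounds quoted in the abstract.

    level_fractions[i] is the fraction of global references satisfied at
    hierarchy level i, with index 0 meaning "satisfied locally" and the last
    index meaning "reaches the top level" (an assumed representation).
    """
    if abs(sum(level_fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    local_ok = level_fractions[0] >= min_local
    top_ok = level_fractions[-1] <= max_top
    return local_ok and top_ok


# Assuming "32k" means 32768 processors, a 32-ary 3-cube has that many nodes.
machine_size = kary_ncube_nodes(32, 3)

# A hypothetical three-level profile: 50% local, 30% mid-level, 20% top-level.
profile_ok = meets_locality_condition([0.50, 0.30, 0.20])
```

A profile such as 10% local and 40% top-level would fail both bounds; the abstract does not say how performance degrades outside them, so the check is a simple pass or fail.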
dc.format	3979950 bytes
dc.format	3395110 bytes
dc.format	application/postscript
dc.format	application/pdf
dc.language	en_US
dc.relation	AITR-1645
dc.title	A Hierarchical Cache Coherent Protocol

This item appears in the following Collection(s)

MIT Items