Archive for January, 2010

By Kin-Yip Liu, Director, Customer Solutions Architecture at Cavium Networks

Multicore performance, and how well it scales with additional cores, depends to a very large extent on how execution tasks are scheduled among the cores and how efficiently synchronization among the execution threads is performed. With poor scheduling and synchronization, cores can be left idle due to dependencies, head-of-line blocking, and resource conflicts.

Software locking is the traditional method for synchronizing threads and protecting shared data structures and critical sections in the code. It is also the major roadblock to multicore performance scaling. Specifically, execution threads that need a lock spin in a loop and compete for it until it is granted. In the meantime, those threads make no forward progress, yet they continuously consume significant interconnect bandwidth by requesting the lock.
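To make the cost of spinning concrete, here is a minimal sketch of a test-and-set style spin lock in Python. It is an illustration only, not code from any processor SDK: the names `SpinLock`, `run_workers`, and `failed_attempts` are hypothetical, and a non-blocking `threading.Lock.acquire` call stands in for an atomic test-and-set instruction.

```python
import threading


class SpinLock:
    """Test-and-set style spin lock (illustrative sketch, hypothetical names)."""

    def __init__(self):
        # Assumption: a non-blocking acquire stands in for an atomic
        # test-and-set on a shared memory word.
        self._flag = threading.Lock()
        # Approximate count of failed attempts; updated without protection,
        # so treat it as a rough indicator only.
        self.failed_attempts = 0

    def acquire(self):
        # Busy-wait: keep retrying until the attempt succeeds.
        # The thread makes no forward progress while it loops here.
        while not self._flag.acquire(blocking=False):
            self.failed_attempts += 1

    def release(self):
        self._flag.release()


def run_workers(num_threads=4, increments=2000):
    """Have several threads update one shared counter under the spin lock."""
    lock = SpinLock()
    counter = {"value": 0}

    def worker():
        for _ in range(increments):
            lock.acquire()
            try:
                counter["value"] += 1  # the critical section
            finally:
                lock.release()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"], lock.failed_attempts


if __name__ == "__main__":
    total, failed = run_workers()
    print("counter:", total, "failed lock attempts (approximate):", failed)
```

The counter always ends at the expected value, so the lock is correct. The point is the retry loop: every failed attempt is work that produces nothing, and on real hardware each one would also be a request travelling over the interconnect.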

When architecting the first OCTEON processor more than five years ago, Cavium Networks already understood these issues. Since the first implementation, which shipped in 2005, every OCTEON processor has included hardware features to classify packets into flows, to schedule execution of packets on cores while accounting for flow dependencies, and to schedule execution on cores while preserving atomic sequences without requiring the use of software locks. As a result, OCTEON processors offer near-linear performance scaling even in workloads where packets have dependencies and ordering requirements.
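The idea of flow-aware scheduling can be sketched in software. The following is a simplified model under stated assumptions, not a description of how OCTEON hardware is implemented: `classify`, `FlowScheduler`, and `simulate` are hypothetical names, a flow is assumed to be identified by its 5-tuple, and time is modelled as rounds in which each core handles at most one packet.

```python
from collections import deque


def classify(packet):
    """Map a packet to a flow key. Assumption: the 5-tuple identifies the flow."""
    return (packet["src"], packet["dst"], packet["sport"],
            packet["dport"], packet["proto"])


class FlowScheduler:
    """Hands out packets so that each flow is held by at most one core at a time."""

    def __init__(self):
        self._queues = {}   # flow key -> deque of packets in arrival order
        self._busy = set()  # flows currently being processed by some core

    def submit(self, packet):
        self._queues.setdefault(classify(packet), deque()).append(packet)

    def get_work(self):
        """Return (flow, packet) for the first idle flow with pending work, else None."""
        for flow, queue in self._queues.items():
            if queue and flow not in self._busy:
                self._busy.add(flow)
                return flow, queue.popleft()
        return None

    def done(self, flow):
        self._busy.discard(flow)


def simulate(packets, num_cores):
    """Run rounds until no work remains; return (round, core, seq) per packet."""
    sched = FlowScheduler()
    for packet in packets:
        sched.submit(packet)
    processed = []
    round_no = 0
    while True:
        assigned = []
        for core in range(num_cores):
            work = sched.get_work()
            if work is None:
                break  # remaining cores idle this round
            assigned.append((core, work))
        if not assigned:
            return processed
        for core, (flow, packet) in assigned:
            processed.append((round_no, core, packet["seq"]))
            sched.done(flow)  # flow becomes available again next round
        round_no += 1


if __name__ == "__main__":
    def make_packet(seq, sport):
        return {"seq": seq, "src": "10.0.0.1", "dst": "10.0.0.2",
                "sport": sport, "dport": 80, "proto": "tcp"}

    # Flow with sport 1000 has packets 0, 1, 2; flow with sport 2000 has 3, 4.
    packets = [make_packet(0, 1000), make_packet(1, 1000), make_packet(2, 1000),
               make_packet(3, 2000), make_packet(4, 2000)]
    for entry in simulate(packets, num_cores=2):
        print(entry)
```

Because the scheduler never gives two cores packets from the same flow at once, the per-flow code needs no lock, and packets within a flow are handled in arrival order. The model also shows the limit of the approach: with only two flows pending, any cores beyond the second have nothing to do.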

When evaluating multicore performance, it is important to verify that the benchmark software includes proper synchronization mechanisms and that the workload includes cases where those mechanisms are actually exercised. Multicore processors without such hardware features take a significant performance hit on benchmarks with dependency and synchronization requirements.

A few other multicore processor vendors are adding similar hardware features in their next-generation offerings. In these cases, the evaluation should also consider the extent of the code changes required to use these new features.

For OCTEON processors, hardware scheduling and synchronization features have been part of the architecture since the beginning. So, software compatibility is a given.
