YetiSim Blog

Blogs about simulation and developing YetiSim.

Implementation of Parallel C++ Lists: Part 2

Last time I began to outline how I am implementing parallel lists in C++. In this blog, I will show some of the features of the completed implementation, and will discuss some refinements that could be made.

My parallel lists are of fixed size, and a bitmask is used to store information about each element position within the cache line.  In particular, whether or not that position contains an element.  Another container could easily be constructed in which two bits are stored for each element position: one bit to indicate that a position contains an element, and a second bit to indicate that element is locked.  However, that is a discussion for another time.  According to valgrind (and VTune) there are very few cache misses when my data structure is operated upon in parallel (0.001% actually).

This means that I have succeeded in my goals of creating an optimized fixed-size list data structure.  When operating upon a sublist of a fixed-size list, there are two operations that are guaranteed to be thread-safe and cache-optimized:

  1. An element may be deleted to create an empty space.
  2. An empty space may be filled with a new element.

From these rules, it follows that you can swap elements (empty or not) within a fixed-size list in a cache-friendly way.  It also follows that you can delete or add elements within a sublist with minimal cost, because another cache-line does not have to be loaded for this.  My goal is to use these guarantees to construct a better concurrent_vector than what TBB currently provides, and with more functionality (e.g. the ability to delete elements).  One issue that will arise is how to iterate over empty elements quickly.  There are a few ways to do this, one of which is to check the bitmask to see that it’s not equal to (bitmask_size -1), that is all ones.  That’s something that is not yet implemented yet however.

I’ve implemented the sublist to meet the TBB Range concept.  You can thus call list.explode() to get a sublist range, and then use TBB algorithms to operate on sublists.  This uses the Range concept elegantly, since it makes sense to a user that a slice of a list is a sublist.


Posted by AJ Guillon  (November 14, 2008)

Leave a Reply