Video Introduction to the STL, Part 5

Video Introduction to the STL, Part 5

  • Comments 11

In the fifth part of my video lecture series introducing the Standard Template Library, I explain how the advanced analysis in my Nurikabe solver works.  One of its steps involves using a breadth-first search to detect any unreachable cells.

 

Mandatory prerequisites of Part 5 are reading Wikipedia's Nurikabe page and watching Part 4 (the introduction to my Nurikabe solver).  It also assumes that you've watched Part 1 (sequence containers), Part 2 (associative containers), and Part 3 (smart pointers), or that you're already familiar with those topics.

 

I highly encourage you to read this extended example and see how basically every line uses the STL to do clever things.  My original source code and output are available.  I've been working on improving them, and versions 1.0 through 1.3 are available here.

 

Stephan T. Lavavej

Visual C++ Libraries Developer

  • I pray that Microsoft does not *actually* write code like this. Endless functions, endless if/else constructs, and unreadable lambda fetishes reminiscent of jQuery.

  • You there!  Stop what you are doing and leave the premises immediately!

    Examples are stupid by design.  You should know.  Now off with you!

  • I split up the code in separate headers and sources for main program, Grid class and helper functions and of course, performance suffered a little (~5% slower).

    However, by introducing enabling OpenMP and adding some #pragma omp parallel for lines, I was able to speed up the program from ~850 seconds for Nicoli #9 to 223 seconds on a Core i7-920 with 8 threads, CPU usage bumps to 50-85% from 14% . On request I will refactor the original 1.3 sources by Mr Lavavej (maximum kudos to you, I really enjoy your lecture) and post them back. I'd also like to see an attempt at the algorithm using PPL...

    The main issue still seems to be the method Grid::analyze_hypotheticals where the program spends dozens, sometimes hundreds of seconds instead of milliseconds like for the other rules.

    I believe, the random guessing can be improved - right now, many of the guesses go to unknown cells far away from nearly completed numbered regions and fail to contribute to the solution. Humans do it differently --- at least I tend to place guesses close to known regions and seem to have an easier time to find a contradiction.

    Tracking the unknown cells in regions instead of collecting them on demand may also be an idea worth considering.

  • Expanding on my previous comment, I'd like to suggest some modifications on the Grid::analyze_hypotheticals method. In v1.3, it collects the unknown cells in a vector, performs a random_shuffle on it to avoid re-analyzing failed guesses in subsequent calls over and over again, and then sequentially processes guess by guess in search of a contradiction.

    Instead, we could group the guesses into classes.

    Idea #1: * create classes correspondent to the Hamming distance to the closest "known" cell. [Hamming distance: |x' - x| + |y' -y|  for a pair of coordinates (x,y ), (x', y')]

    * shuffle the classes individually, starting with shortest-distance class first, and append them to vector v.

    Idea #2: * in modification to #1. consider distance to numbered cells only, apply a weight to the distance function in proportion to the number of missing cells in the corresponding numbered region distance, so proximity to nearly completed numbered  regions is preferred (thus giving a higher probability to hit a contradiction).

    Idea #3: * in modification to #2, classify unknown cells not for the weighted distance to the closest numbered cell, but for the sum of all weighted distances to all numbered cells (thus putting preference to unknown cells close to many unfinished numbered regions).

    Idea #4: Track the unknown cells in regions (this requires a split() operation symmetrical to fuse()) and process random guesses in the smallest unknown region first. (Following the observation that the hypothetical guessing speeds up towards the end of processing when unknown patches become smaller).

  • [blah]

    > I pray that Microsoft does not *actually* write code like this.

    > Endless functions, endless if/else constructs, and unreadable lambda fetishes reminiscent of jQuery.

    Please look at version 1.3, which was reorganized to be more readable. Additionally, I've removed the one lambda that was being overly cute instead of adding actual value (in the "are we done" test, which now calls known()). All of the remaining lambdas are significantly IMPROVING the clarity of their surrounding code.

    [Michael B.]

    > I split up the code in separate headers and sources for main program, Grid class and helper functions and of course, performance suffered a little (~5% slower).

    Link-Time Code Generation should actually reclaim that performance.

    > maximum kudos to you, I really enjoy your lecture

    Thanks - that means a lot to me.

    > I'd also like to see an attempt at the algorithm using PPL...

    That's what I plan to use, as soon as I get some free time again.

    > I believe, the random guessing can be improved - right now, many of the guesses

    > go to unknown cells far away from nearly completed numbered regions and fail

    > to contribute to the solution. Humans do it differently --- at least I tend

    > to place guesses close to known regions and seem to have an easier time to find a contradiction.

    Agreed. Version 1.3's pseudorandom guessing avoided the problem of repeatedly sweeping over boring cells, but doesn't prioritize interesting cells.

    I suspect that tracking both how interesting a cell is (i.e. how many cells in its neighborhood are known), and how recently we've tried to guess it, would produce the best results.

    > Tracking the unknown cells in regions instead of collecting them on demand may also be an idea worth considering.

    This is definitely implementable, but I'm not sure whether it would speed up anything other than "isolated unknown region" analysis. (It's actually convenient to not track unknown cells with regions, because the known regions always grow/fuse and never split. If unknown regions were tracked, then marking cells as known could split unknown regions.)

  • [Michael B. from coffe break in office]:

    I tried out a few variations on improving the pseudo-random guesses and found that prioritizing cells adjacent or near to "WHITE" cells (i.e. cells known to be white but not yet connected to a numbered region)  results in a considerable performance boost. It seems, these are more likely to create detectable contradictions compared to a random selection..

    My already OpenMP-enhanced modification of v1.3 went down from 223 seconds to ~100 seconds for the Nicoli #9 problem on an Core i7-920, all the other multi-second example problems profit in proportion.

    Instead of putting all unknown cells into vector<pair<int,int>> v indiscriminately, I collected them in a map<int, vector<pair<int, int>>> classes first, where the key of the map refers to the Hamming-distance to the closest WHITE cell. Each vector in the map is shuffled separately, following the established benefit of this step in the original version.

    Since map<> is sorted, I just append each shuffled vector to v traversing the map front to end and thus, the cells closest to a known WHITE cell are the first ones to be considered.

    I also tried to give preference to cells in the proximity of black cells, numbered cells of incomplete regions and equal preference to proximity of any known cell, but this did not significantly improve the performance on any of the example problems.

    The code change for the implementation is very contained and I tried to keep the spirit of the original code. I'll post the snippet in a follow-up when I get home tonight.

    [STL]

    > I suspect that tracking both how interesting a cell is (i.e. how many cells in its neighborhood are

    > known), and how recently we've tried to guess it, would produce the best results.

    Smart - this should be an improvement on top of the random_shuffle() , where trying a cell that failed to contribute just before may still happen by accident. random_shuffle is O(n) on sequences of length n, which might well be noticeable on large grids, I think.

  • Add the following method to the Grid class:

    /// <summary>

    /// Determine the distance of cell at (x,y) from the closest WHITE cell. The method starts at

    /// searching low distances first to avoid checking the whole grid.

    /// </summary>

    /// <param name="x">The x coordinate of the cell.</param>

    /// <param name="y">The y coordinate of the cell.</param>

    /// <returns>

    /// The distance of cell at (x,y) from the closest WHITE cell. If there is no WHITE cell

    /// on the grid, the maximum possible distance on the grid is returned.

    /// </returns>

    int Grid::closest_cell_distance( const int x, const int y ) const {

    for ( int distance = 1; distance < m_width + m_height - 2; ++distance ) {

    for ( int dx = 0; dx <= distance; ++dx ) {

    const int dy = distance - dx;

    if (

    ( valid( x - dx, y - dy ) && ( cell( x - dx, y - dy ) == State::WHITE ) ) ||

    ( valid( x + dx, y - dy ) && ( cell( x + dx, y - dy ) == State::WHITE ) ) ||

    ( valid( x - dx, y + dy ) && ( cell( x - dx, y + dy ) == State::WHITE ) ) ||

    ( valid( x + dx, y + dy ) && ( cell( x + dx, y + dy ) == State::WHITE ) )

    ) {

    return distance;

    }

    }

    }

    return m_width + m_height - 2;

    }

    And replace the code block in method Grid::analyze_hypotheticals (including the given start and end line):

       for (int x = 0; x < m_width; ++x) {

       ....

       random_shuffle(v.begin(), v.end(), dist);

    by this:

       // groups cells according to their closest hamming distance to known cell.

       map<int, vector<pair<int, int>>> classes;

       #pragma omp parallel for

       for ( int x = 0; x < m_width; ++x ) {

           for ( int y = 0; y < m_height; ++y ) {

               if ( cell( x, y ) == UNKNOWN ) {

                   auto p = make_pair( x, y );

                   int d = closest_cell_distance( x, y );

                   #pragma omp critical( v )

                   {

                       classes[d].push_back( p );

                   }

               }

           }

       }

       for ( auto c = classes.begin(); c != classes.end(); ++c ) {

           auto dist = [this]( const ptrdiff_t n ) {

               // random_shuffle() provides n > 0. It wants [0, n).

               // uniform_int_distribution's ctor takes a and b with a <= b. It produces [a, b].

               return uniform_int_distribution<ptrdiff_t>( 0, n - 1 )( m_prng );

            };

            random_shuffle( c->second.begin(), c->second.end(), dist );

            v.insert( v.end(), c->second.begin(), c->second.end() );

       }

    The #pragma lines are ignored without /openmp compiler option, I left them in as an example on the parallel optimizations I put in.

    Actually, I just noticed I had compiler and parallelism settings for my notebook active by accident; the performance boost due to the modified candidate selection is even better than I claimed before. I get:

    wikipedia_hard: 24.6028 milliseconds, 90/90 (100%) solved

    wikipedia_easy: 37.6328 milliseconds, 100/100 (100%) solved

    nikoli_1: 22.4631 milliseconds, 100/100 (100%) solved

    nikoli_2: 16.0642 milliseconds, 100/100 (100%) solved

    nikoli_3: 61.4219 milliseconds, 100/100 (100%) solved

    nikoli_4: 118.548 milliseconds, 180/180 (100%) solved

    nikoli_5: 94.8334 milliseconds, 180/180 (100%) solved

    nikoli_6: 96.3448 milliseconds, 180/180 (100%) solved

    nikoli_7: 2.17682 seconds, 336/336 (100%) solved

    nikoli_8: 1.52526 seconds, 336/336 (100%) solved

    nikoli_9: 44.6843 seconds, 720/720 (100%) solved

    nikoli_10: 48.0533 seconds, 720/720 (100%) solved

    beat me if you can... :-)

    Michael.

  • Stephan, your original code was perfectly understandable, don't mind the people with nothing better to do than critique others.

    Thanks for these videos!

  • Heh, I just noticed your name also happens to be STL.

  • Thanks, Michael! I've uploaded version 1.4 to my SkyDrive (linked in the post above).

    I took your heuristic, and modified it slightly. I look for the nearest white cell, not counting numbered cells, but white cells attached to numbered cells are okay. (That is, I look for white cells and not white regions, in my terminology.) This produced better results in my experimentation, and it intuitively makes sense - looking at numbered cells themselves isn't interesting because many will be scattered throughout the grid with nothing attached to them. However, partially completed islands are very interesting, as are white regions not yet attached to a number.

    Manhattan distance (note, not Hamming distance, that's different) seemed quite reasonable, especially because Nurikabe adjacency is orthogonal, so I used that. I wrote my implementation independently from yours, though (I use a random_shuffle() followed by a stable_sort() instead of introducing a map of vectors).

    I also enhanced the output to display the number and location of failed guesses. This heuristic doesn't fail very often, which is excellent.

    My timings compared to v1.3 (this is still on my Q9450, and has not been parallelized):

    wikipedia_hard: 35.4799 milliseconds (19.7% faster, was 42.4608 milliseconds)

    wikipedia_easy: 36.6243 milliseconds (was 37.4268 milliseconds)

    nikoli_1: 18.9214 milliseconds (was 19.2294 milliseconds)

    nikoli_2: 13.297 milliseconds (was 13.6004 milliseconds)

    nikoli_3: 16.8868 milliseconds (was 17.3527 milliseconds)

    nikoli_4: 109.048 milliseconds (was 112.626 milliseconds)

    nikoli_5: 90.2797 milliseconds (was 93.0231 milliseconds)

    nikoli_6: 87.9317 milliseconds (was 91.1031 milliseconds)

    nikoli_7: 4.53781 seconds (64.3% faster, was 7.45676 seconds)

    nikoli_8: 2.80949 seconds (was 2.96366 seconds)

    nikoli_9: 172.772 seconds (6.1x faster, was 1053.08 seconds)

    nikoli_10: 56.1901 seconds (2.8x faster, was 155.029 seconds)

    I've given you credit in the source code's changelog.

    [blah]

    > Stephan, your original code was perfectly understandable, don't

    > mind the people with nothing better to do than critique others.

    Thanks. I thought so too, but I'm always willing to listen to reasonable criticism, and v1.3 was improved as a result.

    > Heh, I just noticed your name also happens to be STL.

    As foretold by the dark prophecy!

  • >> Stephan, your original code was perfectly understandable, don't

    >> mind the people with nothing better to do than critique others.

    >Thanks. I thought so too, but I'm always willing to listen to reasonable criticism, and v1.3 was improved >as a result.

    I appreciate that you put out v1.3 with the criticism applied.  I also appreciate your humility, STL, as you are clearly a wizard.

    To the original comment: It isn't that people have nothing better to do, it's that when new code is put out with the purpose of instructing new impressionable developers it is actually even more important that the code represent clean, readable, and good programming practices to avoid introducing the novice to bad habits.

    To STL: Let me suggest reading Clean Code by Robert C. Martin.  As a young professional C++ developer in the games industry I found the book extremely interesting (even if the examples are mostly in Java and some of the advice is specific to the unit testing frameworks available in that environment, it is still very good.)

    The book challenges people who have already developed software to improve, and on reflection my programming habits have improved the most dramatically from this book than they have from any other.  Much of the advice that makes the most difference is also the most simple to do on a day to day basis.  I do not know how much it will affect your work on the STL, but on projects like the one you feature in this video it will be immensely useful.

    You won't think about programming in the same way again.

    www.amazon.ca/.../0132350882

    If you do read the book, I would love to hear a follow-up on your thoughts.  My personal e-mail address is mike@m2tm.net, or you can just post on the blog some time.

Page 1 of 1 (11 items)