среда, 24 мая 2023 г.

yet another maximal clique algorithm

It seems that most of known algorithms for maximal clique try to add as much vertices as possibly and evolving towards more complex heuristics for vertices ordering. But there is opposite way - we can remove some vertices from neighbors, right?

Lets assume that we sorted all vertices of graph with M vertices and N edges by their degrees in descending order and want to check if some vertex with degree K can contain clique. We can check if all of it`s neighbors mutually connected and find one or several most loosely connected vertices - lets name it L. This checking requires K -1 access to adjacency matrix for first vertex, K -2 for second etc - in average (K^ 2) / 2. If no unconnected vertices was found - all survived neighbors are clique. See sample of implementation in function naive_remove

Now we should decay what we can do with L and there is only 2 variants:

  1. we can remove it from set of neighbors
  2. we can keep it and remove from set of neighbors all vertices not connected with L

Notice that in both cases amount of neighbors decreased by at least 1. Now we can recursively repeat this process with removed and remained L at most K times, so complexity will be O = (K ^ 2) / 2 * (2 ^ K)

We can repeat this process for all vertices with degree bigger than maximal size of previously found clique - in worse case M times, so overall complexity of this algorithm is O = M * (K ^ 2) / 2 * (2 ^ K)

In average K =  N / M

well, not very good result but processing of each vertex can be done in parallel

We can share adjacency matrix (or even make it read-only) between all working threads and this recursive function will require in each step following memory:

  1. bitset of survived neighbors - K / 8 where V[i] is 1 if this vertex belongs to neighbors and 0 if it was removed
  2. array for unconnected vertices counts with size K

given that recursion level does not exceed K overall used space on stack is

S = K * (K / 8 + K * sizeof(index))

now check if we can run this algorithm on

gpu

Disclaimer: I read book about CUDA programming almost 10 years ago so I can be wrong
Let see some modern CPU and find maximal size of their local thread memory
 
NVidia GTX TITAN
6 Gb of memory / 2688 cores = 2.2Mb
if index can fit in short (2 byte) we can process vertexes with degree up to 1028
1000$ / 1028 = ~1$ for each vertex degree
 
40 Gb / 4608 cores = 10.4Mb
vertexes can have degree up to 2237
10000$ / 2237 = 4.5$ for each vertex degree

Комментариев нет:

Отправить комментарий