пятница, 26 мая 2023 г.

ctf-like task based on maximal clique problem

Sources

There is undirected graph with 1024 vertices and 100909 edges (so average degree is 98.5). It is known that the graph contains clique with size 16. You can pass indexes of clique`s vertices in command line like

./ctf 171 345

./ctf 171 346
too short clique 

This vertices of clique then used to derive AES key and decrypt some short string

Can you solve this?

среда, 24 мая 2023 г.

yet another maximal clique algorithm

It seems that most of known algorithms for maximal clique try to add as much vertices as possibly and evolving towards more complex heuristics for vertices ordering. But there is opposite way - we can remove some vertices from neighbors, right?

Lets assume that we sorted all vertices of graph with M vertices and N edges by their degrees in descending order and want to check if some vertex with degree K can contain clique. We can check if all of it`s neighbors mutually connected and find one or several most loosely connected vertices - lets name it L. This checking requires K -1 access to adjacency matrix for first vertex, K -2 for second etc - in average (K^ 2) / 2. If no unconnected vertices was found - all survived neighbors are clique. See sample of implementation in function naive_remove

Now we should decay what we can do with L and there is only 2 variants:

  1. we can remove it from set of neighbors
  2. we can keep it and remove from set of neighbors all vertices not connected with L

Notice that in both cases amount of neighbors decreased by at least 1. Now we can recursively repeat this process with removed and remained L at most K times, so complexity will be O = (K ^ 2) / 2 * (2 ^ K)

We can repeat this process for all vertices with degree bigger than maximal size of previously found clique - in worse case M times, so overall complexity of this algorithm is O = M * (K ^ 2) / 2 * (2 ^ K)

In average K =  N / M

well, not very good result but processing of each vertex can be done in parallel

We can share adjacency matrix (or even make it read-only) between all working threads and this recursive function will require in each step following memory:

  1. bitset of survived neighbors - K / 8 where V[i] is 1 if this vertex belongs to neighbors and 0 if it was removed
  2. array for unconnected vertices counts with size K

given that recursion level does not exceed K overall used space on stack is

S = K * (K / 8 + K * sizeof(index))

now check if we can run this algorithm on

gpu

Disclaimer: I read book about CUDA programming almost 10 years ago so I can be wrong

воскресенье, 21 мая 2023 г.

estimation of maximum clique size

definition 1.1 from really cool book "The Design of Approximation Algorithms":

An α-approximation algorithm for an optimization problem is a polynomial-time algorithm that for all instances of the problem produces a solution whose value is within a factor of α of the value of an optimal solution

so you need first to estimate at least size of possible optimal solution, right?

Surprisingly I was unable to find it for maximal clique. stackexchange offers very simple formula (spoiler: the actual size is a couple of orders of magnitude smaller). python networkX offers method with complexity O(1.4422n) to find maximal clique itself only. cool. Let's invent this algorithm by ourselves

From wikipedia:

A clique, C, in an undirected graph G = (V, E) is a subset of the vertices, CV, such that every two distinct vertices are adjacent

in other words this means that graph with maximal clique of size K should contains at least K vertices with degree K - 1 or bigger. So we can arrange vertices on degrees and find some degree S where amount of vertices with degree S or bigger is >= S. But this is very rough estimation and it could be refined taking into account the following observation - we can remove all edges to vertices not belonging to this subgraph. So algo is:

  1. calculate degrees of all vertices and arrange them in descending order
  2. for each degree S find first where amount of vertices with degree S or bigger is >= S
  3. put all such vertices in sub-graph SD
  4. remove from SD all edges to vertices not belonging to SD
  5. recalculate degrees of all vertices in SD
  6. find another degree S in SD where amount of vertices with degree S or bigger is >= S. this will be result R

next we can repeat steps 2-6 until enumerate all degrees or some degree will be less than the previously found result R

Complexity

Let N - amount of vertices and M - amount of edges. Then cycle can run max N times and in each cycle we can remove less that M edges (actually in average M/2), so in worst case complexity is O(MN/2)

Results