Tuesday, August 09, 2011

Probability of a union of independent events algorithm

Lately I was looking for a copy 'n paste algorithm to calculate the probability of a union of independent events that are not mutually exclusive (aka inclusion-exclusion principle in probability). Unfortunately I couldn't find any algorithm for such a basic problem.
Therefore, I decided to write the following naive algorithm which is fast enough for my purposes (O(n2) in time and space, where n is the number of events), and share with everyone:

You can find the code snippet here, sorry for not embedding it in the blog post but blogger is really boring me with snippets having broken layout.

The idea behind the dynamic programming approach starts from this observation:

Let A, B and C be independent non mutually exclusive events. Then:

P(A or B or C) = P(A) + P(B) + P(C) - P(A)P(B) - P(A)P(C) - P(B)P(C) + P(A)P(B)P(C)

Let me simplify the expression using A instead of P(A):

P(A or B or C) = A+B+C - AB - AC - BC + ABC

Notice that AB+AC+BC = A(B+C) + BC. In general:

AB+AC+AD+...+AZ + BC+BD+....+BZ = A(B+C+D+...+Z) + B(C+D+...+Z)

ABC+...+ABZ + ACD+...+ACZ+...+AYZ + BCD+...+BYZ = A(B(C+D+...+Z)+C(D+...+Z)+...+YZ) + B(C(D+...+Z)+...+YZ)

That's exactly where we exploit the dynamic programming to avoid recalculating the same expressions twice.

Edit: My effort was totally useless given that for independent events this is equivalent to 1 - (1 - pA)(1 - pB)..., which can be calculated in linear time. I even used this formula once and forgot about it :-(

6 comments:

Anonymous said...

I believe you can compute this is linear time since
A + B + C + AB + AC + BC + ABC = A(1 + B(1 + C) + C) + B(1 + C) + C

The algorithm looks like this:
double union(double *a, size_t len) {
double x = 0;
for (size_t i = 0; i < len; ++i) {
x += a[i] * (1 + x);
}
return x;
}

Anonymous said...

it should be A+B+C-AB-AC-BC+ABC (instead of -ABC)... and it's probably easier to think 1-(1-pA)(1-pB)...(1-pN) instead.. the complement of nothing occurring

Anonymous said...

Ah yes, it should be this:

double union(double *a, size_t len) {
    double x = 0;
    for (size_t i = 0; i < len; ++i) {
        x += a[i] * (1 - x);
    }
    return x;
}

Luca Bruno aka Lethalman said...

Thanks! I totally forgot that's equivalent for independent events.

Unknown said...

The code seems to be unavailable. Would you mind re-uploading it? It will be a great help to me.

Luca Bruno aka Lethalman said...

If you read the edit in the post, it's just 1 - (1 - pA)(1 - pB). That's the code itself, if pA and pB are two variables in your program.