Friday, January 25th, 2008

On the sum of power laws

Filed under: Science and Technology — lemire @ 10:37

Many real-life data sets have power laws or Zipfian distributions. An integer-valued random variable X follows a power law with parameter a if P(X=k) is proportional to k-a. Panos asked what the sum of two power laws was. He cites Wilke at al. who claim that the sum of two power laws X and Y with parameters a and b is a power law with parameter min(a,b).

I relate this problem to the sum of exponentials. Any engineer knows that if a>b, then eat + ebt will be approximately eat for t sufficiently large.

Hence, I think that the sum of power law distributions X and Y is a power law distribution with parameter min(a,b) if you are only interested in large values of k in P(X+Y=k).

For extra credit, help me solve this problem. Suppose that I have two power laws with the same parameter. Is their sum a power law with the same parameter? (I predict it does not!)

Egghe showed in The distribution of N-grams that even if the words follow a power law, the n-grams won’t!

Disclaimer: Yes, I am being lazy. I could work it out.

3 Comments »

  1. (1) Mandelbrot has proposed a generalization of Zipf’s Law. (2) Randomly generated strings follow Zipf’s Law, so some people argue that in some cases it is a statistical artifact.

    http://en.wikipedia.org/wiki/Zipf%27s_law

    Comment by Peter Turney — 25/1/2008 @ 13:17

  2. It seems that power law is really the same as Pareto distribution . This paper gives some closed formulas for distributions of sums of Pareto, which are themselves not Pareto

    If two power laws have different parameters, as you go to infinity, odds of encountering the one with higher a becomes vs. one with lower a goes to 0, so I also expect that for large values, heavier tail distribution will dominate

    BTW, I also wondered about distribution of bigrams when unigrams are power-law distributed, David Cantrell in sci.math gave an approximate formula for the cdf involving Lambert’s W function
    http://groups.google.com/group/sci.math/browse_thread/thread/8de7cee65f65ff70/810470b85f36523b?lnk=st&q=group%3Asci.math#810470b85f36523b

    Comment by Yaroslav Bulatov — 28/1/2008 @ 21:25

  3. Yaroslav,

    Thanks, very useful!

    - Panos

    Comment by Panos Ipeirotis — 31/1/2008 @ 21:34

RSS feed for comments on this post.

Leave a comment

Warning: When entering a long comment, please ensure that you make copy of your text prior to submitting it. If the server should fail or if you hit a bug, you might lose your work. I am not responsible for your lost effort.

To spammers: I carefully review every single post and make sure that spam gets deleted. You are wasting your time if you are manually entering spam using this form. Read my terms of use to see what I consider to be abusive.

Example: I + II + IX= XII. Yes, you have to enter a roman numeral. (Answer must be in upper case.)

« Blog's main page

25 queries. 0.319 seconds. Valid XHTML

Powered by WordPress

Subscribe to this blog in a reader or by Email.