Asked  7 Months ago    Answers:  5   Viewed   82 times

[I'm splitting a population number into different matrices and want to test my code using random numbers for now.]

Quick question guys and thanks for your help in advance -

If I use:

 100*rand(9,1)

What is the best way to make these 9 numbers add to 100?

I'd like 9 random numbers between 0 and 100 that add up to 100.

Is there a built-in command that does this? I can't seem to find one.

 Answers

20

I see this mistake so often: the suggestion that, to generate random numbers with a given sum, one just draws a uniform random set and scales it. But is the result truly uniformly random if you do it that way?

Try this simple test in two dimensions. Generate a huge random sample, then scale them to sum to 1. I'll use bsxfun to do the scaling.

xy = rand(10000000,2);
xy = bsxfun(@times,xy,1./sum(xy,2));
hist(xy(:,1),100)

If they were truly uniformly random, then the x coordinate would be uniform, as would the y coordinate. Any value would be equally likely to happen. In effect, for two numbers to sum to 1, the point (x,y) must lie on the line segment that connects the two points (0,1) and (1,0) in the (x,y) plane. For the points to be uniform, any point along that segment must be equally likely.

[Figure: xy histogram]

Clearly uniformity fails when I use the scaling solution. Any point on that line is NOT equally likely. We can see the same thing happening in 3-dimensions. See that in the 3-d figure here, the points in the center of the triangular region are more densely packed. This is a reflection of non-uniformity.

xyz = rand(10000,3);
xyz = bsxfun(@times,xyz,1./sum(xyz,2));
plot3(xyz(:,1),xyz(:,2),xyz(:,3),'.')
view(70,35)
box on
grid on

[Figure: xyz plot]

Again, the simple scaling solution fails. It simply does NOT produce truly uniform results over the domain of interest.
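The same check can be sketched outside MATLAB. The Python snippet below is only an illustration (the function name, seed, and sample size are my own choices, not part of the answer). It scales pairs of uniform numbers so they sum to 1 and measures how often x lands in [0.4, 0.6]. A uniform x would land there about 20% of the time; the scaled version should land there noticeably more often, roughly a third of the time.

```python
import random

def middle_fraction_scaled(n, seed=0):
    """Fraction of scaled samples whose x coordinate falls in [0.4, 0.6]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        u, v = rng.random(), rng.random()
        total = u + v
        if total == 0.0:
            continue  # vanishingly unlikely; avoids division by zero
        x = u / total  # scale the pair so that x + y == 1
        if 0.4 <= x <= 0.6:
            hits += 1
    return hits / n

if __name__ == "__main__":
    print(middle_fraction_scaled(200_000))
```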

Can we do better? Well, yes. A simple solution in 2-d is to generate a single random number that designates the distance along the line connecting the points (0,1) and (1,0).

t = rand(10000000,1);
xy = t*[0 1] + (1-t)*[1 0];
hist(xy(:,1),100)

[Figure: uniform x+y = 1]

It can be shown that ANY point along the line defined by the equation x+y = 1, in the unit square, is now equally likely to have been chosen. This is reflected by the nice, flat histogram.

Does the sort trick suggested by David Schwartz work in n-dimensions? Clearly it does so in 2-d, and the figure below suggests that it does so in 3-dimensions. Without deep thought on the matter, I believe that it will work for this basic case in question, in n-dimensions.

n = 10000;
uv = [zeros(n,1),sort(rand(n,2),2),ones(n,1)];
xyz = diff(uv,[],2);

plot3(xyz(:,1),xyz(:,2),xyz(:,3),'.')
box on
grid on
view(70,35)

[Figure: sort trick]
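Applied to the original question (9 numbers between 0 and 100 that add up to 100), the sort trick might look like the following Python sketch. The function name random_fixed_sum is hypothetical, and the code assumes that floating-point round-off in the sum is acceptable.

```python
import random

def random_fixed_sum(count, total, rng=random):
    """Return `count` non-negative floats that sum to `total` (sort trick)."""
    # Draw count - 1 uniform cut points in [0, total] and sort them.
    cuts = sorted(rng.uniform(0.0, total) for _ in range(count - 1))
    edges = [0.0] + cuts + [total]
    # Successive differences are the lengths of the resulting pieces.
    return [b - a for a, b in zip(edges, edges[1:])]

parts = random_fixed_sum(9, 100.0)
print(parts, sum(parts))
```

Each piece lies between 0 and the total, and the pieces sum to the total up to round-off, because the differences telescope.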

One can also download the function randfixedsum from the file exchange, Roger Stafford's contribution. This is a more general solution to generate truly uniform random sets in the unit hyper-cube, with any given fixed sum. Thus, to generate random sets of points that lie in the unit 3-cube, subject to the constraint they sum to 1.25...

xyz = randfixedsum(3,10000,1.25,0,1)';
plot3(xyz(:,1),xyz(:,2),xyz(:,3),'.')
view(70,35)
box on
grid on

[Figure: randfixedsum]

Tuesday, June 1, 2021
 
Ultimater
answered 7 Months ago
92

In your solution you generate an accumulated probability vector, which is very useful.

I have two suggestions for improvement:

  • if $probs are static, i.e. it's the same vector every time you want to generate a random number, you can preprocess $prob_vector just once and keep it.
  • you can use binary search to find $i (the bisection method)
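Both suggestions can be sketched together in Python, as an illustration rather than a translation of the original code. The helper names build_cumulative and pick_index are hypothetical; the sketch assumes non-negative weights with a positive total.

```python
import bisect
import itertools
import random

def build_cumulative(weights):
    """Preprocess once: running totals of the weights."""
    return list(itertools.accumulate(weights))

def pick_index(cumulative, rng=random):
    """Draw an index with probability proportional to its weight."""
    x = rng.random() * cumulative[-1]  # uniform in [0, total)
    # Binary search for the first running total strictly greater than x.
    return bisect.bisect_right(cumulative, x)

cumulative = build_cumulative([1, 0, 3])  # built once, reused for every draw
print(pick_index(cumulative))
```

Each draw then costs a logarithmic number of comparisons instead of a linear scan.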

EDIT: I now see that you ask for a solution without preprocessing.

Without preprocessing, you will end up with worst case linear runtime (i.e., double the length of the vector, and your running time will double as well).

Here is a method that doesn't require preprocessing. It does, however, require you to know a maximum limit of the elements in $probs:

Rejection method

  • Pick a random index, $i and a random number, X (uniformly) between 0 and max($probs)-1, inclusive.
  • If X is less than $probs[$i], you're done - $i is your random number
  • Otherwise reject $i (hence the name of the method) and restart.
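
The steps above can be sketched in Python as follows. This is a minimal illustration, assuming non-negative integer weights, a known upper bound max_weight that is at least as large as every weight, and at least one positive weight (otherwise the loop never ends). The function name is hypothetical.

```python
import random

def pick_index_rejection(weights, max_weight, rng=random):
    """Draw an index with probability proportional to its integer weight."""
    while True:
        i = rng.randrange(len(weights))  # candidate index, uniform
        x = rng.randrange(max_weight)    # uniform integer in 0 .. max_weight - 1
        if x < weights[i]:
            return i                     # accept the candidate
        # Otherwise reject the candidate and try again.

print(pick_index_rejection([1, 0, 3], max_weight=3))
```

The looser the bound max_weight is relative to the typical weight, the more candidates get rejected before one is accepted.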
Sunday, August 15, 2021
 
David
answered 4 Months ago
28

The numbers generated by each Random instance will be uniformly distributed, so if you combine the sequences of random numbers generated by both Random instances, they should be uniformly distributed too.

Note that even if the resulting distribution is uniform, you might want to pay attention to the seeds to avoid correlation between the output of the two generators. If you use the default no-arg constructor, the seeds should already be different. From the source code of java.util.Random:

private static volatile long seedUniquifier = 8682522807148012L;

public Random() { this(++seedUniquifier + System.nanoTime()); }

If you are setting the seed explicitly (by using the Random(long seed) constructor, or calling setSeed(long seed)), you'll need to take care of this yourself. One possible approach is to use a random number generator to produce the seeds for all other generators.
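
The last idea, seeding several generators from one master generator, can be sketched in Python rather than Java. The function name and the choice of 64-bit seeds are assumptions made for illustration.

```python
import random

def make_generators(count, master_seed):
    """Create `count` generators whose seeds come from one master generator."""
    master = random.Random(master_seed)
    # Draw one 64-bit seed per generator; collisions are very unlikely.
    seeds = [master.getrandbits(64) for _ in range(count)]
    return [random.Random(s) for s in seeds]

gen_a, gen_b = make_generators(2, master_seed=12345)
print(gen_a.random(), gen_b.random())
```

Only the master seed has to be recorded to reproduce every stream.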

Sunday, August 15, 2021
 
jdmcbr
answered 4 Months ago
20

This code creates four positive integers (none of them zero) that sum to max:

var max = 36;
var r1 = randombetween(1, max-3);
var r2 = randombetween(1, max-2-r1);
var r3 = randombetween(1, max-1-r1-r2);
var r4 = max - r1 - r2 - r3;


function randombetween(min, max) {
  return Math.floor(Math.random()*(max-min+1)+min);
}

EDIT: And this one will create thecount number of integers that sum up to max and returns them in an array (using the randombetween function above)

function generate(max, thecount) {
  var r = [];
  var currsum = 0;
  for(var i=0; i<thecount-1; i++) {
     r[i] = randombetween(1, max-(thecount-i-1)-currsum);
     currsum += r[i];
  }
  r[thecount-1] = max - currsum;
  return r;
}
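
Note that drawing the values one after another tends to give the earlier entries larger values than the later ones. If every valid combination should be equally likely, a different technique can be used: choose distinct cut positions and take the gaps between them. The Python sketch below illustrates that alternative (it is not the method above, and the function name is hypothetical).

```python
import random

def random_composition(total, count, rng=random):
    """Return `count` positive integers that sum to `total`."""
    if not 1 <= count <= total:
        raise ValueError("need 1 <= count <= total")
    # Choose count - 1 distinct cut positions among the total - 1 gaps.
    cuts = sorted(rng.sample(range(1, total), count - 1))
    edges = [0] + cuts + [total]
    # Distinct cuts guarantee every difference is at least 1.
    return [b - a for a, b in zip(edges, edges[1:])]

print(random_composition(36, 4))
```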
Monday, October 18, 2021
 
Shobit
answered 2 Months ago
35
// Initialize rand()'s sequence. A typical seed value is the return value of time()
srand(someSeedValue);

//...

long range = 150000; // 100000 + range is the maximum value you allow
long number = 100000 + (rand() * range) / RAND_MAX;

If rand() * range or 100000 + range can exceed the maximum value of a long, use a wider type (for example long long) for range and number.

Monday, November 8, 2021
 
Muhamed Huseinbašić
answered 1 Month ago