Asked  7 Months ago    Answers:  5   Viewed   41 times

I'm looking to generate a random number and issue it to a table in a database for a particular user_id. The catch is, the same number can't be used twice. There are a million ways to do this, but I'm hoping someone keen on algorithms has an elegant way of solving the problem so that the following criteria are met:

1) The fewest possible queries are made to the database.
2) The least amount of crawling through an in-memory data structure is done.

Essentially, the idea is to do one of the following:

1) Create a random number from 0 to 9999999
2) Check the database to see if the number already exists
OR
2) Query the database for all existing numbers
3) See if the generated number matches any of the returned results
4) If it matches, repeat step 1; if not, the problem is solved (see the sketch below).
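
A rough sketch of the first variant (one existence check per attempt) might look like this in Python with sqlite3; the user_codes table and its user_id/code columns are hypothetical names used only for illustration:

    import random
    import sqlite3

    conn = sqlite3.connect("app.db")

    def assign_unique_code(user_id: int) -> int:
        cur = conn.cursor()
        while True:
            candidate = random.randint(0, 9999999)
            # One query per attempt: does this number already exist?
            cur.execute("SELECT 1 FROM user_codes WHERE code = ?", (candidate,))
            if cur.fetchone() is None:
                cur.execute("INSERT INTO user_codes (user_id, code) VALUES (?, ?)",
                            (user_id, candidate))
                conn.commit()
                return candidate
            # Collision: loop around and try another number.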

Thanks.

 Answers

53

No, your algorithm is not scalable. What I've done before is to issue numbers serially (+1 each time) and then pass them through an XOR operation to jumble the bits, giving me seemingly random numbers. Of course they aren't really random, but they look so to users' eyes.


[Edit] Additional information

This algorithm's logic goes like this: you use a known sequence to generate unique numbers, and then you deterministically manipulate them so they don't look serial anymore. The general solution is to use some form of encryption, which in my case was an XOR flip-flop, because it's as fast as it gets and it fulfills the guarantee that numbers will never collide.

However, you can use other forms of encryption if you prefer even more random-looking numbers over speed (say you don't need to generate many ids at a time). The important point in choosing an encryption algorithm is "the guarantee that numbers will never collide". A way to prove that an encryption algorithm fulfills this guarantee is to check that both the original number and the result of the encryption have the same number of bits, and that the algorithm is reversible (a bijection).
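
To make the idea concrete, here is a minimal sketch in Python (my own illustration, not the answerer's code). It uses a 24-bit range (0 to 16777215), because XOR with a fixed mask is only a bijection over a power-of-two range; the question's 0 to 9999999 range would need a different reversible mapping:

    MASK = 0xA53C7B  # arbitrary fixed 24-bit constant; any value works

    def obfuscate(counter: int) -> int:
        """Map a serial counter (0..2**24-1) to a random-looking unique id."""
        return (counter ^ MASK) & 0xFFFFFF

    def deobfuscate(value: int) -> int:
        """XOR is its own inverse, so the mapping is reversible (a bijection)."""
        return (value ^ MASK) & 0xFFFFFF

    # Distinct counters always give distinct ids, so collisions are impossible.
    for counter in range(5):
        print(counter, "->", obfuscate(counter))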

[Thanks to Adam Liss & CesarB for expanding on the solution]

Wednesday, March 31, 2021
 
nasty
answered 7 Months ago
33

MySQL: Using a giant number of connections

What are the dangers of frequent connects?
It works well, with the exception of some extreme cases. If you get hundreds of connects per second from the same box, you may run out of local port numbers. Ways to fix it: decrease "/proc/sys/net/ipv4/tcp_fin_timeout" on Linux (this breaks the TCP/IP standard, but you might not care on your local network), or increase "/proc/sys/net/ipv4/ip_local_port_range" on the client. Other OSes have similar settings. You can also use more web boxes, or multiple IPs for the same database host, to work around this problem. I've really seen this in production.

Some background on this problem:
A TCP/IP connection is identified by localip:localport remoteip:remoteport. The MySQL IP and port, as well as the client IP, are fixed in this case, so we can only vary the local port, which has a finite range. Note that even after you close a connection, the TCP/IP stack has to keep the port reserved for some time; this is where tcp_fin_timeout comes in.
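
To put rough numbers on it (assuming common Linux defaults, not figures from the original answer): the ephemeral port range is about 28,000 ports, and a closed connection's port is held for roughly 60 seconds, so a single client talking to one database ip:port can sustain only on the order of 28000 / 60 ≈ 470 new connections per second before it runs out of local ports.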

Wednesday, March 31, 2021
 
hnkk
answered 7 Months ago
100

I am assuming that you store the frequencies as floating point numbers between 0 and 1 that sum to 1.

First you should prepare a table of cumulative frequencies, i.e. the sum of the frequency of that letter and all letters before it.

To simplify, if you start with this frequency distribution:

A  0.1
B  0.3
C  0.4
D  0.2

Your cumulative frequency table would be:

A  0.1
B  0.4 (= 0.1 + 0.3)
C  0.8 (= 0.1 + 0.3 + 0.4)
D  1.0 (= 0.1 + 0.3 + 0.4 + 0.2)

Now generate a random number between 0 and 1 and see where in this list that number lies. Choose the letter that has the smallest cumulative frequency larger than your random number. Some examples:

Say you randomly pick 0.612. This lies between 0.4 and 0.8, i.e. between B and C, so you'd choose C.

If your random number was 0.039, that comes before 0.1, i.e. before A, so choose A.
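
As an illustration (my own sketch, not part of the original answer), the cumulative table plus a binary search can be written in a few lines of Python:

    import bisect
    import random
    from itertools import accumulate

    letters = ["A", "B", "C", "D"]
    freqs = [0.1, 0.3, 0.4, 0.2]          # must sum to 1
    cumulative = list(accumulate(freqs))   # [0.1, 0.4, 0.8, 1.0]

    def pick_letter() -> str:
        r = random.random()  # uniform in [0, 1)
        # Index of the smallest cumulative value greater than r.
        idx = bisect.bisect_right(cumulative, r)
        return letters[min(idx, len(letters) - 1)]  # clamp guards against float rounding

    print([pick_letter() for _ in range(10)])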

I hope that makes sense, otherwise feel free to ask for clarifications!

Friday, June 18, 2021
 
TheCarver
answered 5 Months ago
73

Take a look at Jonathan Shewchuk's work, especially his (and his colleagues') famous papers and implementations of:

  • Streaming Computation of Delaunay Triangulations

  • A Two-Dimensional Quality Mesh Generator and Delaunay Triangulator

There is also a fast implementation for unsorted point clouds in the Point Cloud Library (PCL). Check their presentation on Fast triangulation of unordered point clouds.

Sunday, July 11, 2021
 
MDDY
answered 4 Months ago
90

This trick works only if you need a single random number, or if the interval between the random numbers includes pauses for human input. In all other cases, the numbers will not be random at all.

If you need many random numbers, then there are different pseudo-random number algorithms available.

Another note: there is an easier way to get a number in the needed interval:

    mov  ax, dx   ; the raw value (from whatever source the trick reads) is assumed to be in DX
    xor  dx, dx   ; clear DX so DX:AX forms the 32-bit dividend
    mov  cx, 10
    div  cx       ; unsigned divide DX:AX by 10; the remainder (0 to 9) ends up in DX

    add  dl, '0'  ; convert the remainder to ASCII, '0' to '9'

You can use this method with any random number generator, of course.

Wednesday, August 11, 2021
 
Kwadz
answered 3 Months ago