1. Introduction
In year 1978, Ronald Rivest, Adi Shamir and Leonard Adleman released one of the first public-key cryptosystem (1) known as RSA Cryptosystem. It has been over 40 years down the line and it still one of the strongest cryptosystems in the world. It takes one of the greatest problems in Number Theory (integer factorization) to be one of the most suitable solutions to computer security. There are still no trivial solutions to the factorization of semi primes. The RSA Cryptosystem works on the basis of multiplying two relatively large prime numbers k and l of at least 50 digits each (2, 3). The semi prime (y) obtained can be as big as 100 to 600 digits (500 to 2048 bits). Attempts have been made to crack these RSA numbers with modern factorization methods such as the Elliptic Curve Method, Quadratic Field Sieve and the Number Field Sieve (4). Still, the factorization of these large semi primes might take several months even for a supercomputer. It takes an estimated 2000 years’ work for a 2.2 GHz Opteron computer to factorize a 232 digit RSA number (5).
Recently, there have been advancements in the field of quantum cryptography which might lead to the possibility of factorizing these large semi primes in just a few days if not hours (6). The Shor Algorithm is one of the most commonly used algorithms in quantum computing developed by Peter Shor. Though quantum cryptography seems promising, they are still not yet widely applicable (7). This might remain the case for the next few years to come. In comparison, classical computers still cannot factorize very large semi primes in polynomial time (8, 9). At least not until there are significant technical developments in the processing power of the classical computers or a breakthrough is found for integer factorization. If a trivial solution is found for integer factorization then the RSA Cryptosystem would be rendered useless (10).
We are going to analyze a version of Fermat’s Last Theorem and Arnold’s Theorem for right angled triangles. From the Fermat’s Last Theorem, we obtain the shortened Diophantine equation
This equation can be further expanded to(x + z) (x−z) = y. From the first part of the equation you get the equations x + z = k and x − z = l where the value of y is the semi prime; k and l are the two prime factors for y.
That equation is used to form the Fermat’s method for factorization (11–15). We also look at Arnold’s Theorem which is derived from the Pythagorean equation a2+b2 = c2 which can be written as a2 = c2 − b2 (16). This can be expanded to the equation a2 = (c+b) (c−b). So if c−b = 1 it will leave us with the equation a2 = b+c. Therefore, the two equations which form Arnold’s Theorem for right angled triangles are a2 = b+c and c-b=1. They are very critical in setting up a factorization limit which will in turn create a different angle for finding the prime factors.
The technique employed after getting the factorization limit is referred to as the top-to-bottom, bottom-to-top approach. It involves several independent operations which will be running simultaneously from two points A to B, B being the factorization limit. There are two operations; one will run from point A heading toward point B and the other running from point B heading toward point A. there will also be two other operations starting from the central point O. One will run from point O toward point A and the other will run from point O toward point B. Therefore, if the prime difference between the two prime factors is too big, too small of too centralized you will find it in just a matter of seconds. This was a problem with Fermat method for factorization (17–22).
Techniques such as Arnold’s Digitized Summation Technique (A.D.S.T.) helps identify hidden properties of large semi primes used in RSA Cryptography. It simply involves adding the digits of a number until one is left with only one single digit (23). Since RSA numbers have hundreds of digits, A.D.S.T. is going to bring them down to just one digit. That is combined with the operation y = 1, which will always give you a multiple of 4. These techniques are solely meant to drastically reduce the time taken to factorize large semi primes. This aligns with the objective of the paper which is to be able to factorize and factorize RSA numbers in polynomial time using the processing power of the average classical computer.
2. Research methodology
Here, we give the definitions of some terms which are helpful as the research methodology.
Definition 2.1
Arnold’s Digitized Summation Technique (A.D.S.T.): This refers to the subsequent addition of the digits of a number until you are left with only one single digit.
Definition 2.2
Digitized number form (D.N.F.): This refers to the digit you obtain after applying A.D.S.T. to a number.
Definition 2.3
Semi primes: These refer to composite numbers which only have two prime factors.
Definition 2.4
Prime difference: This refers to the difference between the two prime factors of the semi prime.
Definition 2.5
Factorization limit: This refers to the point where you have the largest possible prime difference between the two prime factors of the semi prime.
3. Results and discussion
Here, we give the results of the study and further discussion. We start with the analysis of a version of Fermat Last Theorem which involves squares and then Arnold’s Theorem on right angled triangles.
Theorem 3.1
If y is a semi prime in x2 − z2 = y which can be expanded to (x + z) (x − z) = y, then (x + z) = k and (x − z) = l where k and l are the two prime factors of the semi prime y such that k > l.
Proof
We have seen that k = l = y from the statement above. Also k − l = d, where (d) is the prime difference and (y) is the semi prime. From the Gold Bach Conjecture we observe that when you add two prime numbers you will eventually have an even number. This is true for prime numbers greater than 2. From that we get that
The above equation means that x is half the value of the two prime numbers k and l. It can also be written as k = 2x − l. We now replace k from the two equations k = l = y and k + l = 2x. The equation k = l = y will be (2x − l)l = ywhich is 2xl − l2 = y and k + l = 2x will be 2x − l + l = 2x. The equation 2xl − l2 = y can be rewritten as the quadratic equation
This will give us the two equations and . So let us say that .We get that l = x − z and k = x + z. Therefore, the equation k = l = y is written as (x + z)(x − z) = y. This equation can be written as
We have seen that x + z = k and x − z = l. From the two equations we get that z can be expressed as z = k − x and z = x − l. Combining the two equations we get k − x = x − l and k + l = 2x. Dividing both sides by 2, one gets that . Replacing the x in the equation z = k − z we get. This can be factored into the equation .
Replacing the x and z in the equation x2 − z2 = y with and .
We get
The equation is expanded to
And the equation is expanded to
Therefore,
becomes
Solving the equation, we get,
Hence
We have observed that z can be expressed as the equation and k-l = d. So by replacing k - l with d in the first equation, we get that . This can also be written as
also since (x + z)(x − z) = y and d = (x + z) − (x − z). This gives us d = x + z − x = z which can be solved back tod = 2z. The value of d and z are directly proportional, so if the prime difference is big then the z is also be big. Also, we know that x > z, so during computation it takes a longer time to find the value of x if d is very large. This is because using the Fermat method for factorizing semi primes; we first start by the searching for the value ofx.
From the equations above we can tell that just by having the semi prime and the prime difference we obtain the two other prime numbers k and l. This is majorly due to the fact that d = 2z.
Problem
Given the semi prime 55, one finds the two prime numbers given that the prime difference is 6?
Solution
Since
We also know that x2 − z2 = y and we have the value for z and y. After replacing these values, we get the equation x2 − 32 = 55 which is x2 = 55+9 = 64. Therefore, x will be 8. We also saw that x + z = k and x − z = l where k and l are the two prime numbers.
So 8 + 3 = k and 8−3 = l
The two prime numbers will be k = 11 and l = 5.
k, l, x, y, z, d ∈ N
Theorem 3.2
Arnold’s Theorem for right angled triangle states that a2 = b+c. If a is the smallest side of a right-angled triangle where ab and a = b thenc-b = 1
Proof
The Pythagorean equation a2 + b2 = c2 can also be written as a2 = c2 − b2. This can be expanded to the equation a2 = (c + b)(c − b). So if c − b = 1 it leaves us with the equation
Therefore, the two equations which form Arnold’s Theorem for right angled triangles are
and
Using the two equations we can formulate the simultaneous equation
So when a right angled triangle satisfies both equation and we are only given the value of a, we can easily obtain the two other sides. This is possible through the simultaneous equation above. This brings us back to the equation x2 − z2 = y in theorem 3.1. In this case, we replace a2 withy. So when you are only given the semi prime y, one solves it using the simultaneous equation
Problem
Given the semi prime 55 can you obtain the two prime numbers k and l. (Note that this time we are not given the prime difference.)
Solution
This will give you z = 27. To get the value of x we substitute z from (equation a) with 27. This will give us x + 27 = 55x = 28. Therefore, the equation x2 − z2 = y will be 282 − 272 = 55.
To find the two prime numbers we used the two equations x + z = k and x − z = l where k and l are the two prime numbers. In this case we notice something very interesting with our values for k and l
Therefore, k and l will be 55 and 1. This means that using Arnold’s Theorem of right-angled triangles; it treats the semi prime as an actual prime number. So how is this important to us? The Fermat method for factorization searches for the prime difference of a semi prime y from 1 till an undefined value. Using Arnold’s Theorem, we set a limit to the largest prime difference a semi prime can have. By treating a semi prime as a prime number, it means that the prime difference will be the largest since it is k-1. That is, we are only subtracting one from the largest prime number k. When we subtract any value larger than 1 then the prime difference have a much smaller value. The largest possible prime difference is what is referred to as the factorization limit. So during the search for the prime difference through the search of z, we run two operations simultaneously. One starts from where the prime difference is smallest as it runs toward the largest possible value for the prime difference. This is actually what happens with the Fermat method for factorization. The other operation runs from the largest possible value for the prime difference as it heads down toward the smallest value. When we say that it searches through the prime difference, what we mean is that we modify the search for z since the prime difference is directly proportional to z. The search uses the equation x2 − z2 = y and we saw that d = 2z.
3.1. RSA cryptosystem
RSA Cryptosystem relies on the fact that finding the prime difference of two large prime numbers of at least 50 digits each cannot actually be done in polynomial time using a single 2.2 GHz Opteron computer. However, adding this new property into the equation it increases our chances of finding the prime difference even if it is too big. Furthermore, if the prime difference is too big or too small, we find it very fast using the two operations which runs simultaneously. So the most ideal prime difference for RSA numbers are not be too large or too small. It should be centralized. However, since we have the highest and lowest point where the values of the prime difference are, we can actually modify the computer operations to also accommodate the prime difference being centralized. This is demonstrated on Figure 1 where point A is where the prime difference is smallest, point O is where the prime difference is centralized and point B is where the prime difference is largest. Actually, we run a total of 4 operations simultaneously to find the prime difference much faster. The first operation starts at point A heading toward point O. The second operation starts at point O heading toward point A. The third operation also starts at point O but this time heading toward point B. The final operation starts at point B heading toward point O. This is what is referred to as the top-to-bottom, bottom-to-top approach.
Note that this is just the primary application of that concept. One can actually split point A to point O twice and point O to point B also twice such that we have a total of 8 operations running simultaneously. One can have as many operations running simultaneously depending on the size of the semi prime. For semi primes up to about 50 digits, having four operations running simultaneously is enough. But when the semi prime is about 600 digits as the case with RSA numbers, we can even have 32 operations running simultaneously. This, however, depends on the processing power of the computer. The operations are not exactly tied to be done on one computer. One can also assign as many computers as he/she wish to run some of the operations. This depends on the availability and practicality of using many computers in the first place. There are other properties of semi primes that we will see ahead that also help in reducing the number of tests the operations entail.
Going back to the equation x2 − z2 = y, we realize that the expression x-z does not have to be 1 as stated in Arnold’s Theorem for right angled triangles. We can try it with other integers 1, 2, 3, 4 … and subject it to the simultaneous equation. If we let x − z = 2 we get that the equation x2 − z2 = y is expanded to (x + z)(x − z) = y. So if x − z = 2 the two equations will be
Solving for the semi prime 55 we will get
We multiply equation Eqn.(b) by 2 and add the two equations which give us x = 14.75. Substituting the x in equation b we get that z = 14.75 − 2. Therefore, x = 14.75 and z = 12.75.
From the Table 1 above we observe that only when x-z is 1, 5 and 11 is when the values of x, y and z are whole numbers. That isx,y,z ∈ Z. We also observe that the integers of x-z are the same when x − z = l and x + z = kwhich in our case is 5 and 11. But for x-z = 11 the 3 in z is negative. The real value for x-z is actually 6 but in our case, it gives us different values for x and z. However, when we use 6 as the prime difference in Fermat’s Last Theorem equation x2 − z2 = y we obtained the values for x and z to be 8 and 3, respectively. Furthermore, one also notice that for (x − z) = 6 which is the actual prime difference between the two prime numbers the value of z are less than one. That is z < 1.
The y-axis represents x2 while the x- axis represents z2. The use of such curves is used in another complex method of factorizing numbers known as the Elliptic Curve Method which is not discussed in depth in this paper.
3.2. Arnold’s digitized summation technique (A.D.S.T.)
We are now going to bring in Arnold’s Digitized Summation Technique which simply involves adding the digits of a number until one is left with only one digit. When one adds the digit of a number like 1234 one gets 1 + 2 + 3 + 4 = 10, which one keeps adding till he is left with only one digit, so if one gets 10 one adds 1 + 0 = 1. The process of adding is what is referred to as A.D.S.T. The resultant digit 1 becomes the Digitized Number Form (D.N.F.) of the number 1234. Since RSA Cryptosystem is based on using very large numbers even up to 600 digits, we are going to utilize A.D.S.T to look at some properties of such numbers which would otherwise be hard to observe. The Python code below is supposed to sum down a 1000 digits number into a single digit in just a few seconds.
Considering that we are only given the semi prime (y), we first need to find the digitized number form through A.D.S.T. In other words, we need to add the digits of that semi prime until we are left with only one single digit. Since the semi primes in the case of RSA numbers have over 100 digits, the Python code above is of great help since it can find the D.N.F. of a-thousand-digit semi prime in just a second. If the Digitized Number Form (D.N.F.) is 2, 5 or 8, it means that the semi prime is of the form 6n-1. If it is 1, 4 or 7, it means that the semi prime is of the form 6n+1. A semi prime cannot have a D.N.F. of 3, 6 or 9. This is because such numbers are all multiples of 3. This holds true if the semi prime is an odd number, which is true for almost all cases apart from when another prime number is multiplied by 2 which is also a prime number.
Examples
55 = 5 + 5 = 10,10 = 1 + 0 = 1 D.N.F. of 55 is 1 and is of the form (6n + 1)
65 = 6 + 5 = 11, 11 = 1 + 1 = 2 D.N.F. of 65 is 2 and is of the form (6n − 1)
99 = 9 + 9 = 18, 18 = 1 + 8 = 9 D.N.F. of 99 is 9 and it will be a multiple of 3
For even numbers, if a number has a D.N.F. of 1, 4 or 7, then it is of the form 3n+1 where n is an odd number. If a number has a D.N.F. of 2, 5 or 8, then it is of the form 3n-1. If it has a D.N.F. of 3, 6 or 9 it still remains to be a multiple of 3. So how does A.D.S.T. come in to the actual equation used in the factorization process?
3.3. Arnold’s digitized summation technique and the x2 − z2=y equation
When we look at the D.N.F. of numbers forming the equation x2 − z2 = y, we notice something very interesting. Below, we have highlighted two tables which form the basic identities for semi primes having either a D.N.F. of 1, 4 and 7 or 2, 5 and 8.
From Table 2 we notice that all the values of x2 have a D.N.F. of 9 and from Table 3, we notice that all the values of z2 have a D.N.F. of 9. This means that those values are actually multiples of 9. This property is seen with the multiples of 9 shown below.
Example
9 = 9 D.N.F. is 9
18 = 1 + 8 = 9 D.N.F. is 9
27 = 2 + 7 = 9 D.N.F. is 9
36 = 3 + 6 = 9 D.N.F. is 9
45 = 4 + 5 = 9 D.N.F. is 9
54 = 5 + 4 = 9 D.N.F. is 9
63 = 6 + 3 = 9 D.N.F. is 9
72 = 7 + 2 = 9 D.N.F. is 9
81 = 8 + 1 = 9 D.N.F. is 9
90 = 9 + 0 = 9 D.N.F. is 9
Examples of real values of x2 − z2 = y and their D.N.F. are shown on the tables below. The values of Tables 4, 6 are obtained from Fermat Last Theorem as shown in theorem 3.1.
Table 5. D.N.F. of x, y and z for the values in Table 4.
Table 7. D.N.F of x, y and z for the values in Table 6.
Now we are look at examples of real values x2−z2=y and their D.N.F. This time the values of Tables 8, 10 are obtained from Arnold’s Theorem for right angled triangles as shown in theorem 3.2.
Table 9. D.N.F of x, y and z for the values in Table 8.
Table 11. D.N.F of x, y and z for the values in Table 10.
It is important to note that the values for the semi prime y are the same for Tables 4, 8. The same is seen with Tables 6, 10 which have the same values for y. But as we saw earlier, the two theorems give different values for x and z. However, the D.N.F. identity actually remains the same when we use either of the two theorems.
From Tables 4, 6, 8, 10 we also notice that if x2 or z2 is an even number in x2−z2 = y (Fermat’s Last Theorem) then it also be an even number in x2−z2 = y (Arnold’s Theorem)
3.4. Arnold’s digitized summation technique and they = 1 operation
Here we look at the final step in an attempt to factorize large semi primes in polynomial time. We have a total of 4 scenarios for the y = 1 operation.
i. If y+1 is divisible by 4 and y has a D.N.F. of 2, 5 or 8, then x2 is divisible by 36.
ii. If y-1 is divisible by 4 and y has a D.N.F. of 2, 5 or 8, then z2 is divisible by 4.
iii. If y+1 is divisible by 4 and y has a D.N.F. of 1, 4 or 7, then x2 is divisible by 4.
iv. If y-1 is divisible by 4 and y has a D.N.F. of 1, 4 or 7, then z2 is divisible by 36.
We obtained these properties of the semi primes while trying to find the trivial solution for integer factorization. Eventually, we did not get any trivial solution after the analysis of several semi primes and composite numbers in general. However, by studying the sequence of semi primes containing the same D.N.F., we were able to come up with the four definite scenarios shown above. They are incorporated into the programs which were running simultaneously in search for the value of x2 and z2. They are able to drastically reduce the time taken to factor large semi primes once combined with the top-to-bottom, bottom-to-top approach. We are still carrying out further research in search for a trivial solution for integer factorization.
4. Conclusion
We can be able to conclude that it is technically possible to factorize relatively large semi primes in polynomial time using the processing power of the average classical computer. Using techniques such as Arnold’s Digitized Summation Technique (A.D.S.T.) and the top-to-bottom, bottom-to-top approach, the time taken to factorize large semi primes will be drastically reduced. This, we find that has greater implications on RSA which is about large prime numbers. Our paper provides an insight into reviewing RSA and excites further research interests in co-primes and primes, as a breakthrough to improving cryptographic chances in computer and network security in regards to data.
References
3. Finotti L. A gentle introduction to number theory and cryptography. (2009). 60 p. Available online at: https://www.math.utk.edu/~finotti/papers/grad.pdf
4. Bhatt M, Aneja A, Tripathi S. Classical cryptography v/s quantum cryptography: a comparative study. Int J Electr Comput Sci Eng. (1956) 1:121–9.
5. Barak B. An intensive introduction to cryptography. (2018). 366 p. Available online at: https://files.boazbarak.org/crypto/lnotes_book.pdf
6. Hayward M. Quantum computing and shor’s algorithm. (2015). 59 p. Available online at: https://pdfs.semanticscholar.org/8072/dc7247460849b18abbb463429a09cfb2e3e6.pdf
7. Duplij SA, Shapoval II. Quantum computations: fundamentals and algorithms. Problems Atomic Sci Technol. (2007) 2007:230–5.
8. Devlin K. The millennium problems: the seven greatest unsolved mathematical puzzles of our time. Basic Books (2002) 2002:288.
9. Budd S. The distribution of prime numbers and its applications. (2015). Available online at: https://www.lakeheadu.ca/sites/default/files/uploads/77/images/Budd%20Samuel.pdf
10. Schneier B. Applied cryptography: protocols, algorithms, and source code in C. New York: John Wiley & Sons (1996). 784 p.
11. Barakat M, Eder C, Hanke T. An introduction to cryptography. (2018). 145 p. Available online at: https://www.mathematik.uni-kl.de/~ederc/download/Cryptography.pdf
12. Crandall R, Pomerance C. Prime numbers: a computational perspective. New York: Springer-Verlag (2005). 597 p.
14. Janacek GJ, Close ML. Mathematics for computer scientists. Norwalk, CT: Ventus Publishing (2011). 153 p.
15. Stallings W. Cryptography and network security: principles and practice. Prentice Hall, NJ: Prentice Hall (2011). 900 p.
16. Barnett JH. Generating pythagorean triples: a gnomonic exploration. (2017). Available online at: https://digitalcommons.ursinus.edu/cgi/viewcontent.cgi?article=1008&context=triumphs_number
17. Stein W. Elementary number theory: primes, congruence and secrets. (2017). 172 p. Available online at: https://wstein.org/ent/ent.pdf
18. Lehman E, Leighton FT, Meyer AR. Mathematics for computer science. (2017). 988 p. Available online at: https://courses.csail.mit.edu/6.042/spring17/mcs.pdf
19. Saul A, Mario C. Hidden structures in the randomness of prime number sequence? (2005). Available online at: https://arxiv.org/abs/cond-mat/0310148
20. Tao T. Structure and randomness in the prime numbers. (2007). Available online at: https://www.math.ucla.edu/~tao/preprints/Slides/primes.pdf
22. Baibekov S, Altynbek S. Development of New Methods for Generating Prime Numbers. Nat Sci Sci Res Publis. (2015) 7:416–23.