# A Totally New, Very Old Method for Finding Square Roots

A new iterative method to approximate square roots — actually shown to be centuries old!

Isaac Asimov once wrote [1] that all his discoveries in math were either “*completely correct findings that are quite old* [or] *completely new findings that are quite wrong*.” I just had exactly the same experience: I developed a totally new, greatly enhanced, surely unequaled method for finding square roots iteratively… only to realize, after lots of work, that I had merely rediscovered a well-known method!

I will show here where I started from, what ideas I developed, and the final sad conclusion about my totally new method being equivalent to a very old one… but since the road to that final result was interesting by itself, let’s begin!

# The Irrationality of √2

My road started with Stanley Tennenbaum’s geometric proof [2] of the irrationality of √2. Let’s assume that there exists a fraction *N*/*D* that equals √2 (so *N*²=2*D*²) with integer *N* and *D*, and that no smaller values of *N* and *D* also have that property. We can draw the following diagram, with a white square of side *N* including two gray squares of side *D*.

The dark gray part is covered twice, and the two white parts aren’t covered at all. Since the sum of the areas of the two *D*-sided squares (2*D*²) must equal the area of the original *N*-sided square (*N*²), the area of the twice-covered darker gray center square must equal the sum of the areas of the two uncovered white corner squares. Since the sides of the original squares were integers, so are the sides of the smaller squares: the doubly covered square has sides of length 2*D*-*N*, and the uncovered ones have sides of length *N*-*D*, so (2*D*-*N*)² must equal 2(*N*-*D*)²… but that’s impossible! We just found two *smaller* numbers with the same property as *N* and *D*, which were supposed to be the *smallest* ones with that property. The final conclusion is that *N* and *D* don’t exist, and √2 cannot be a rational number; elegant!
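The central algebraic step can be checked with one line of algebra: assuming *N*²=2*D*², the smaller squares inherit the same relation (and, since 1<*N*/*D*<2, both 2*D*-*N* and *N*-*D* are positive and smaller than *N* and *D* respectively):

```latex
(2D-N)^2 - 2(N-D)^2 = \left(4D^2 - 4ND + N^2\right) - \left(2N^2 - 4ND + 2D^2\right)
                    = 2D^2 - N^2 = 0 \quad \text{since } N^2 = 2D^2 .
```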

Let’s extract something from the previous proof. If *N*/*D*=√2, we then also have:

√2 = N/D = (2D-N)/(N-D)

If we were to systematically apply the iteration

Nₖ₊₁ = 2Dₖ - Nₖ
Dₖ₊₁ = Nₖ - Dₖ

starting from, say, *N*₀=2 and *D*₀=1, we could expect (wish? hope?) that this would produce better and better rational approximations to √2. Fortunately, this proves to be so, even though the iteration actually tends to -√2; not a problem!

Check: doing the numbers, we eventually get N₉=-816 and D₉=577, approximating -√2 (-1.414213562373…) as -1.41421… with 6 digits’ precision; later, N₁₃=-27720 and D₁₃=19601 produce -1.41421356… with 9 digits’ precision, and eventually N₁₇=-941664 and D₁₇=665857 give -1.41421356237… with 12 digits’ precision.
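Those convergents are easy to reproduce; here is a minimal Python sketch of the iteration from the proof (the function name is mine):

```python
# Iteration extracted from the irrationality proof: N' = 2D - N, D' = N - D.
# Starting from N0 = 2, D0 = 1, the fractions N/D approach -sqrt(2).

def sqrt2_convergents(steps):
    """Return the list of (N_k, D_k) pairs for k = 0..steps."""
    n, d = 2, 1
    pairs = [(n, d)]
    for _ in range(steps):
        n, d = 2 * d - n, n - d
        pairs.append((n, d))
    return pairs

pairs = sqrt2_convergents(17)
print(pairs[9])   # (-816, 577): -816/577 = -1.41421...
print(pairs[13])  # (-27720, 19601)
```

Note that all intermediate values stay integers, so the fractions are exact.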

# Extending the Method for √p

It seems like the idea works (at least for finding √2) and we could generalize it to get a new iterative method to find √*p* for any positive number *p*: start with a pair of values *N*₀ and *D*₀ and iterate using a formula like the following:

Nₖ₊₁ = aNₖ + bDₖ
Dₖ₊₁ = cNₖ + dDₖ (1)

Our formula involves four constants (*a*, *b*, *c*, *d*) that we don’t yet know — we had *a*=-1, *b*=2, *c*=1, and *d*=-1 to find √2, but surely the constants would be different when finding √*p*. To determine them, since the fraction (at the limit) should equal √*p*, let’s square both sides of Nₖ₊₁/Dₖ₊₁=√p. Dropping subindices for clarity we get

(aN + bD)² = p(cN + dD)²

that leads to

a²N² + 2abND + b²D² = pc²N² + 2pcdND + pd²D² (2)

In order for this to work for all *N* and *D*, we can start by making the terms in *ND* equal, setting 2*ab*=2*pcd* so *a*=*pcd*/*b*. Note that we cannot pick *c*=0 because then we’d have *a*=0 and the iteration from equation (1) would become

Nₖ₊₁ = bDₖ
Dₖ₊₁ = dDₖ

which is useless: the fraction would be stuck at the constant value *b*/*d*. Thus, we can arbitrarily set *c*=1/*p* and then *a*=*d*/*b*.

Let’s go back to equation (2); discarding the *ND* terms (which we already dealt with) and noting that at the limit of equation (1) we would also have *N*²=*pD*², we can write

a²pD² + b²D² = p(c²pD² + d²D²)

so now it must be

a²p + b² = p²c² + pd²

As *c*=1/*p* and *a*=*d*/*b*, this leads to

pd²/b² + b² = 1 + pd²

and, multiplying everything by *b*², we arrive at a quartic equation in *b*

b⁴ - (1 + pd²)b² + pd² = 0 (3)

which conveniently factors as (b² - 1)(b² - pd²) = 0.

What are its roots? Two roots come from *b*²=*pd*², so we’d have *b*=±*d*√*p* and thus *a*=*d*/*b*=±1/√*p*, but these values (together with *c*=1/*p* and any value for *d*) transform the iteration from equation (1) into another useless version:

Nₖ₊₁ = ±√p(Nₖ/p + dDₖ)
Dₖ₊₁ = Nₖ/p + dDₖ

in which every fraction Nₖ₊₁/Dₖ₊₁ equals ±√p exactly… but computing it would require already knowing √p!

The other two roots come from *b*²=1, so we can pick *b*=1 and *a*=*d*. Furthermore, if *d*>0 and we start with positive values for *N*₀ and *D*₀, the iteration will converge to the positive value of √*p* since all constants are positive.

Picking b=-1 doesn’t work: instead of converging to a value, the successive fractions produce an alternating cycle of two values. Thus, out of the four roots of equation (3), only one (b=1) can be used.
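The two-value cycle for b=-1 is easy to see numerically; a small sketch (the sample values p=2, d=1 are mine), remembering that b=-1 forces a=d/b=-d:

```python
# With b = -1 we get a = -d; try p = 2, d = 1: N' = -N - D, D' = N/2 + D.
p, d = 2, 1.0
n, den = 1.0, 1.0
ratios = []
for _ in range(6):
    n, den = -n - den, n / p + d * den
    ratios.append(n / den)
print(ratios)  # alternates between -1.333... and 1.0 instead of converging
```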

Let’s recap: we have found an iterative method, depending on a yet undecided value *d*, that will (supposedly, hopefully!) converge to √*p*:

Nₖ₊₁ = dNₖ + Dₖ
Dₖ₊₁ = Nₖ/p + dDₖ (4)

A couple of questions remain: is this actually convergent? And what value of *d* should we select? Let’s get into that.
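Before moving on, the recipe so far can be sketched in code (the function name and sample parameters are mine): iterate N' = dN + D, D' = N/p + dD and read off the fraction.

```python
def sqrt_approx(p, d, steps, n=1.0, den=1.0):
    """Approximate sqrt(p) by iterating N' = d*N + D, D' = N/p + d*D."""
    for _ in range(steps):
        n, den = d * n + den, n / p + d * den
    return n / den

print(sqrt_approx(2, 0.7, 20))  # close to 1.41421356...
```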

# Convergence of the Iteration

We can rewrite equation (4) in vectorial form as follows (writing matrices row by row, with rows separated by semicolons):

[Nₖ₊₁; Dₖ₊₁] = A·[Nₖ; Dₖ], with A = [d, 1; 1/p, d] (5)

From equation (5) it follows that, for all *k*≥0,

[Nₖ; Dₖ] = Aᵏ·[N₀; D₀] (6)

To find a closed expression for the *k*-th power of matrix **A**, we have to find its eigenvalues and eigenvectors. Solving det(**A**-λ**I**)=0 finds the eigenvalues

λ₁ = d + 1/√p, λ₂ = d - 1/√p

with corresponding eigenvectors

v₁ = [√p; 1], v₂ = [-√p; 1]

With this result, we can write **A**=**CBC**⁻¹, where **C** is the matrix formed by the eigenvectors of **A**, and **B** is a diagonal matrix with the eigenvalues of **A**:

C = [√p, -√p; 1, 1], B = [λ₁, 0; 0, λ₂]

It follows that

Aᵏ = CBᵏC⁻¹

and calculating

C⁻¹ = (1/2√p)·[1, √p; -1, √p]

we finally get the *k*-th power of **A** that we wanted:

Aᵏ = (1/2)·[λ₁ᵏ+λ₂ᵏ, √p(λ₁ᵏ-λ₂ᵏ); (λ₁ᵏ-λ₂ᵏ)/√p, λ₁ᵏ+λ₂ᵏ]

From equation (6) we get the result:

Nₖ/Dₖ = [(λ₁ᵏ+λ₂ᵏ)N₀ + √p(λ₁ᵏ-λ₂ᵏ)D₀] / [(λ₁ᵏ-λ₂ᵏ)N₀/√p + (λ₁ᵏ+λ₂ᵏ)D₀] (7)

For *d*>0 we have ∣λ₁∣>∣λ₂∣, so (λ₂/λ₁)ᵏ vanishes as *k* grows and our fraction tends to √*p*, as we wanted. Even better: convergence is assured for any pair of nonzero initial values *N*₀ and *D*₀ (the only bad start would be N₀/D₀ = -√p exactly, which cannot happen with rational values when √p is irrational).
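As a numerical sanity check of this analysis (a sketch; the sample values and variable names are mine), we can compare the directly iterated fraction with the closed form built from λ₁ = d + 1/√p and λ₂ = d - 1/√p:

```python
from math import sqrt

p, d, k = 5.0, 0.3, 12
n0, d0 = 1.0, 1.0

# Direct iteration: N' = d*N + D, D' = N/p + d*D, applied k times.
n, den = n0, d0
for _ in range(k):
    n, den = d * n + den, n / p + d * den

# Closed form from the eigen-decomposition of A (common factor 1/2 cancels).
l1, l2 = d + 1 / sqrt(p), d - 1 / sqrt(p)
num = (l1 ** k + l2 ** k) * n0 + sqrt(p) * (l1 ** k - l2 ** k) * d0
dnm = (l1 ** k - l2 ** k) * n0 / sqrt(p) + (l1 ** k + l2 ** k) * d0
print(n / den, num / dnm)  # the two agree, and both are close to sqrt(5)
```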

We can now also find the best possible value for *d*. Any *d*>0 would work, but convergence accelerates when ∣λ₂∣=∣d-1/√p∣ is very close to zero. Then, the value for *d* that we should pick should be as close to 1/√p as possible.

Check: for p=7 (so √p=2.6457513111…) and d=0.4, starting from N₀=1 and D₀=0.4 and multiplying both values by p at each step (to keep nicer numbers; this doesn’t change the fraction), our starting fraction (1/0.4=2.5) is just a rough guess; two iterations produce 30.52/11.536=2.6456… with 4 digits’ precision, and after two more iterations we get to 905.128/342.10624=2.6457512… with 7 digits’ precision.
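Those numbers can be reproduced with a short sketch of the p-scaled iteration (variable names mine):

```python
# p-scaled version of the iteration: N' = d*p*N + p*D, D' = N + d*p*D.
# Scaling both values by p leaves the fraction N/D unchanged.
p, d = 7, 0.4
n, den = 1.0, 0.4
history = []
for _ in range(4):
    n, den = d * p * n + p * den, n + d * p * den
    history.append((n, den, n / den))
print(history[1])  # (30.52, 11.536, 2.6456...)
print(history[3])  # (905.128, 342.10624, 2.6457512...)
```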

In any case, we can conclude that our iterative method always converges; how should we apply it?

# Speeding Things Up

We’re getting to the actual method now. It’s clear that we’ll attain better precision by getting to higher values of *k*, so how can we quickly get there? We can do this quite fast by calculating **AA**=**A**², then **A**²**A**²=**A**⁴, **A**⁴**A**⁴=**A**⁸, etc. Noting the special form of **A** (conveniently scaled by *p*, which doesn’t change the *N*/*D* fraction):

p·A = [dp, p; 1, dp] = [r, s; t, r]

we then find that **A**² shares the same form, since

[r, s; t, r]² = [r²+st, 2rs; 2rt, r²+st]

and therefore all the powers of **A** that we will calculate will also share this form. We can do calculations without any actual matrix multiplication, by writing an iteration like

rᵢ₊₁ = rᵢ² + sᵢtᵢ
sᵢ₊₁ = 2rᵢsᵢ (8)
tᵢ₊₁ = 2rᵢtᵢ

with initial values *r*₀=*dp*, *s*₀=*p*, and *t*₀=1, from equation (5).
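The repeated squaring of equation (8) can be sketched as follows (sample values mine); each pass squares the underlying matrix, so the effective power of **A** doubles every iteration:

```python
# r, s, t iteration: squaring [r, s; t, r] gives [r*r + s*t, 2*r*s; 2*r*t, r*r + s*t].
p, d = 7.0, 0.4
r, s, t = d * p, p, 1.0   # this is p*A written in [r, s; t, r] form
for _ in range(4):        # 4 squarings represent A to the 16th power
    r, s, t = r * r + s * t, 2 * r * s, 2 * r * t
print(r / t)  # approximates sqrt(7) = 2.6457513...
```

In the limit the matrix becomes proportional to [1, √p; 1/√p, 1], so both r/t and s/r approximate √p.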

But we can observe something! If we scale **A** or its powers by any number, the result of equation (7) won’t change, since we’ll be scaling both the numerator and denominator by the same value. In particular, let’s divide all formulas in equation (8) by 2rᵢ:

rᵢ₊₁ = (rᵢ + sᵢtᵢ/rᵢ)/2
sᵢ₊₁ = sᵢ (9)
tᵢ₊₁ = tᵢ

Now the *s* and *t* values never change, and as *st*=*p* the whole iteration of equation (9) reduces to just

rᵢ₊₁ = (rᵢ + p/rᵢ)/2
which is just the well-known “Babylonian method” or “Heron’s method” [3], a special case of Newton’s formula for approximating √*p*, known for more than 2,000 years!
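A minimal sketch of that final, reduced iteration (the function name is mine):

```python
def heron_sqrt(p, r=1.0, steps=8):
    """Heron's (Babylonian) method: repeatedly average r with p/r."""
    for _ in range(steps):
        r = (r + p / r) / 2
    return r

print(heron_sqrt(2))  # 1.41421356...
```

Convergence is quadratic, so a handful of steps already reaches machine precision.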

Instead of trying to calculate complete matrices, we make do with a single element — which converges to the square root that we were trying to approximate; a good reduction in calculations. Our totally new method ended up being equivalent to a very old formula; an interesting result, but not an innovation!

# Conclusion

This article is fully aligned with Asimov’s comment on “completely correct findings that are quite old”. We went from a geometrical proof of the irrationality of √*p*, to an iteration producing fractions that approximated the wanted square root, to a matrix version of the same iteration (which we proved to converge), to an accelerated way of doing calculations by using powers of 2, to finally deduce that our newly invented method was equivalent to a several millennia old (and quite well known) formula for approximating square roots…

As we said: a totally new way to produce a very old method!

# Further reading

Check out my *A modern look at square roots in the Babylonian way* article, in which I review the original method and prove how and why it works, in a simple way, using just a bit of algebra and no calculus.

# References

[1] Isaac Asimov, ASIMOV ON NUMBERS, 1977, Pocket Books, page 36

[2] John Conway, THE POWER OF MATHEMATICS, in Alan Blackwell & David MacKay (eds), POWER, Cambridge University Press, pp 16–36, available online at http://thewe.net/math/conway.pdf

[3] Babylonian Method, available online at https://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Babylonian_method