And almost every one of the comments pointed out how the algorithm could be made better. I hadn't really thought there would be better solutions than the ones given out by Project Euler, but it turns out there are!
The first improvement was one I might have found on my own if I had tried writing an iterative version. In the recursive version, when I find the number is divisible by (say) 2, I divide out the 2 and call the function recursively. Now, if there's another 2 in there (that is, if the number was divisible by 4), the number would again get divided by 2 and another recursive call would occur. When converting this algorithm to an iterative version, the most natural way would be:
my $max_factor = 1;

# Divide out all the 2's first
while($n % 2 == 0)
{
    $max_factor = 2;
    $n /= 2;
}

# Then try only the odd candidates
for(my $i = 3; $i < $n; $i += 2)
{
    while($n % $i == 0)
    {
        $max_factor = $i;
        $n /= $i;
    }
}
# (If $n is still greater than 1 at this point, $n itself is the largest prime factor.)
There's a subtle optimization here (compared to the recursive version) that you might have missed. It's actually a simple one: in the recursive version, when you've divided n by, say, 2, you pass the result back to the same function, which again starts by trying to divide it by 2. That's fine as long as n is still divisible by 2, but once you've divided out all the 2's, it becomes just a waste of CPU cycles. In the iterative version, this is avoided: once we've divided out all the 2's, we never try dividing by 2 again, and instead move on to greater things. That's a Good Thing, as for big numbers this optimization might save quite a few cycles.
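To make the comparison concrete, here's a rough sketch of the kind of recursive version being described. This is only an illustration of the re-testing problem, not the actual largestprimefRecurs from the previous post, which may differ in its details:
sub largest_prime_recursive_sketch
{
    my ($n) = @_;
    # Each call restarts trial division from 2, even if all the 2's
    # (or 3's, or 5's...) were already divided out in earlier calls.
    for (my $i = 2; $i < $n; $i++)
    {
        return largest_prime_recursive_sketch($n / $i) if $n % $i == 0;
    }
    return $n;    # no divisor found, so $n itself is prime
}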
But even this iterative version turns out to be a naive implementation. As someone in the math subreddit, as well as one of the commenters here, pointed out, there's another optimization that can be done: after 2 and 3, what are the prime numbers you have? 5, 7, 11, 13, 17, 19, 23, 29, etc. Is there some common property these numbers have? Why, yes, they are all of the form 6*n+1 or 6*n-1. Turns out it has been proved that all primes after 2 and 3 are of one of these two forms - which makes sense, since 6n, 6n+2 and 6n+4 are even, and 6n+3 is divisible by 3. So, instead of trying out all the odd numbers, we can try only the numbers of these two forms. The first code I wrote for this purpose was:
# (assumes the 2's and 3's have already been divided out)
$i = 6;
while($n > ($factor = $i - 1))
{
    $n /= $factor while $n % $factor == 0;    # divide out the 6n - 1 candidate
    $factor = $i + 1;
    last if($n == $factor);
    $n /= $factor while $n % $factor == 0;    # divide out the 6n + 1 candidate
    $i += 6;
}
That basically just implements the concept of trying out 6n +/- 1 for each value of n. After that, I remembered a comment on the previous post which had made me realize that this is the same as just adding 2 and 4 in alternate iterations of the check. Just for the heck of it, I tried that too:
$factor = 5;
while($n > $factor)
{
    $n /= $factor while $n % $factor == 0;
    $factor += 2;                             # 6n - 1  ->  6n + 1
    last if($factor == $n);
    $n /= $factor while $n % $factor == 0;
    $factor += 4;                             # 6n + 1  ->  6(n+1) - 1
}
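For anyone who wants to see the fragments put together, here's roughly how that last one would look as a complete subroutine. This is just a sketch - the actual largestprimef6nadding2and4 in the linked program may differ in variable names and edge-case handling:
sub largest_prime_add24_sketch
{
    my ($n) = @_;
    my $max_factor = 1;

    # Divide out the 2's and 3's first, so only 6n +/- 1 candidates remain.
    for my $p (2, 3)
    {
        while ($n % $p == 0)
        {
            $max_factor = $p;
            $n /= $p;
        }
    }

    # Walk the candidates 5, 7, 11, 13, 17, 19, ... by adding 2, then 4.
    my $factor = 5;
    while ($n > $factor)
    {
        while ($n % $factor == 0) { $max_factor = $factor; $n /= $factor; }
        $factor += 2;
        last if $factor == $n;
        while ($n % $factor == 0) { $max_factor = $factor; $n /= $factor; }
        $factor += 4;
    }

    # Whatever remains above 1 is itself the largest prime factor.
    return $n > 1 ? $n : $max_factor;
}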
Finally, I put all of these in different subroutines in a Perl program, and benchmarked them using Perl's Benchmark.pm module. It's a very easy-to-use module - my entire benchmarking code consists of:
# Upper limit for random number generation
my $rangemax = 1_000_000;
cmpthese(-10, {
    # Recursive         => "largestprimefRecurs(int(rand($rangemax)))",
    Naive_Iteration     => "largestprimefNaiveIter(int(rand($rangemax)))",
    Arcane6nplusminus1  => "largestprimef6n(int(rand($rangemax)))",
    Arcane6nadding2and4 => "largestprimef6nadding2and4(int(rand($rangemax)))",
    Arcane6nusingCount  => "largestprimef6nusingCount(int(rand($rangemax)))"
});
The -10 there asks Benchmark to run each code segment for at least 10 CPU seconds, so that random fluctuations due to irrelevant factors are smoothed out and we have reliable results.
You might notice that the call to the recursive version (the original one) is commented out. Due to (what I believe to be) a problem with the Benchmark.pm module, I was not able to benchmark it using this method. Benchmark.pm would report that the recursive function ran thousands of times faster than the newer methods (probably counting the recursive calls too), but other testing revealed it was actually much slower:
sundar@System:~$ time for (( i=0; i<500; i++ )); do perl Recursive.pl; done;
real 0m5.631s
user 0m3.848s
sys 0m1.556s
sundar@System:~$ time for (( i=0; i<500; i++ )); do perl 6nMethod.pl; done;
real 0m3.219s
user 0m1.692s
sys 0m1.344s
We see the recursive version is slower by more than 2 seconds over the 500 runs! I'm planning to investigate this behaviour of Benchmark.pm further (a quick googling didn't turn up much), but for now, suffice it to say that the old algorithm is much slower.
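(For what it's worth, one crude way to cross-check Benchmark.pm's numbers without leaving Perl would be a wall-clock loop using the core Time::HiRes module, something like the sketch below. It assumes the subroutines from the benchmark script are available, and it measures wall-clock time rather than the CPU time Benchmark reports.)
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: call a sub on random inputs for roughly $seconds of
# wall-clock time and report an approximate calls-per-second rate.
sub rough_rate
{
    my ($name, $sub, $seconds) = @_;
    my ($calls, $start) = (0, time());
    while (time() - $start < $seconds)
    {
        $sub->(int(rand(1_000_000)));
        $calls++;
    }
    printf "%-20s ~%.0f calls/sec\n", $name, $calls / (time() - $start);
}

rough_rate('Recursive',          \&largestprimefRecurs, 10);
rough_rate('Arcane6nplusminus1', \&largestprimef6n,     10);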
There's another entry in the code there that I haven't talked about: Arcane6nusingCount. [By the way, the 'arcane' naming convention is a tongue-in-cheek reference to this comment where the commenter said "I get the feeling that [the programmer who wrote this post] would consider that working with 6n +/- 1 as too arcane a method..." :) I couldn't start the function name with the digit 6, so I added 'Arcane' before each one. ;)]
Anyway, that Arcane6nusingCount was a version where I tried to implement the 6n +/- 1 algorithm by keeping a counter of the iterations, and incrementing by 2 or 4 based on whether count%2 was 0 or 1. Turns out this extra mod operation kills performance, as you can see in the benchmark results below.
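The exact code is in the linked program; the idea was along these lines (just a sketch):
# Same 6n +/- 1 walk as before, but the increment is chosen with a counter
# and a modulus, which adds an extra % operation to every iteration.
my $factor = 5;
my $count  = 0;
while ($n > $factor)
{
    $n /= $factor while $n % $factor == 0;
    $factor += ($count % 2 == 0) ? 2 : 4;    # +2 on even counts, +4 on odd
    $count++;
}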
Benchmark Results:
Legend:
Due to Blogger's @&%*&%# formatting, I had to change the names to shorter versions as follows:
Naive_Iteration : Iter
Arcane6nplusminus1 : 6n+/-1
Arcane6nadding2and4 : Add2,4
Arcane6nusingCount : Count
The percentages are the amounts by which the algorithm in the horizontal (the row) is better than the one in the vertical (the column). The /s figures are the number of times the subroutine could have been run in one second, averaged over the given inputs. For this test, the inputs were random numbers from 0 to 1,000,000, as seen in the code above.
Trial 1
       |   Rate | Count | Iter | 6n+/-1 | Add2,4
Count  | 45.9/s |    -- | -24% |   -35% |   -39%
Iter   | 60.1/s |   31% |   -- |   -15% |   -20%
6n+/-1 | 70.5/s |   54% |  17% |     -- |    -6%
Add2,4 | 75.2/s |   64% |  25% |     7% |     --
Trial 2
       |   Rate | Count | Iter | 6n+/-1 | Add2,4
Count  | 43.1/s |    -- | -25% |   -41% |   -42%
Iter   | 57.4/s |   33% |   -- |   -22% |   -23%
6n+/-1 | 73.6/s |   71% |  28% |     -- |    -2%
Add2,4 | 74.9/s |   74% |  30% |     2% |     --
We see that the method of adding 2 and 4 directly (Arcane6nadding2and4) works best, but the one using multiples of 6 and dividing by $i-1 and $i+1 (Arcane6nplusminus1) is only slightly worse. And, in my opinion, it is clearer and more elegant. They're both much better than the naive iterative method, and the original recursive solution is not even comparable to these!
So, I guess the moral of the story is: look for ways to optimize your code even if an 'authoritative source' seems to tell you you've reached the optimum. Also, where performance is critical, profile and benchmark your code - in practice, they are much more reliable than theoretical algorithmic analyses (of course, only when they're done right).
The program along with benchmarking code is available at: http://sundaryourfriend.googlepages.com/largestprimefactorBenchmark.pl
To run it, you need to have the Benchmark.pm Perl module installed (which would probably be available by default).
Have a nice day...