Monday 27 April, 2009

Finding the Largest prime factor efficiently

Long long ago, I wrote an article on finding the largest prime factor of a number (if you haven't read it, I'd recommend reading it now as much of this article won't make sense without that). It was just a fun little thing I did for solving Project Euler, so I didn't expect it would invite such excellent comments and get posted to Reddit!

And almost every one of the comments pointed out how the algorithm could be made better. I hadn't really thought there would be more optimal solutions than the ones given out by Project Euler, but turns out there are!

The first improvement was one I might have done myself if I had tried to do an iterative version myself. In the recursive version, when I find the number is divisible by (say) 2, I divide out the 2 and call the function recursively. Now, if there's another 2 in there (that is, if the number was divisible by 4), the number would again get divided by 2 and another recursive call would occur. When I'm converting this algorithm to an iterative version too, the most natural way would be:
my $max_factor = 1;
while($n % 2 == 0)
{
$max_factor = 2;
$n /= 2;
}

for(my $i = 3; $i < $n; $i += 2)
{
while($n % $i == 0)
{
$max_factor = $i;
$n /= $i;
}
}

There's a subtle optimization (compared to the recursive version) that has happened here that you probably missed. It's actually a simple one - in the recursive version, when you've divided n by say 2, you pass it back to the same function - which again tries to divide it by 2. This is good as long as n is divisible by 2, but once you've divided out all the 2's, this becomes just a waste of CPU cycles. In the iterative version, this is avoided because, once we know we've divided out all the 2's, we no longer try dividing by that, and instead move on to greater things. That's a Good Thing, as for big numbers, this optimization might save several cycles.

But even this turns out to be a naive implementation. As someone in the math subreddit as well as one of the commenters here pointed out, there's another optimization that can be done here - after 2 and 3, what are the prime numbers you have? 5, 7, 11, 13, 17, 19, 23, 29, etc. Is there some common property these numbers have? Why, yes, they are all either of the form 6*n+1 or 6*n-1. Turns out that it has been proved that all primes after 2 and 3 are of either of these forms (second half of second paragraph). So, instead of trying out all the odd
numbers, we can try only the numbers of these forms. The first code I wrote for this purpose was,

$i = 6;
while($n > ($factor = $i - 1))
{
$n /= $factor while $n % $factor == 0;

$factor = $i + 1;
last if($n == $factor);
$n /= $factor while $n % $factor == 0;

$i += 6;
}
That basically just implements the concept of trying out 6n +/- 1 for each value of n. After that, I remembered a comment on the previous post which had made me realize that this is the same as just adding 2 and 4 in alternate iterations of the check. Just for the heck of it, I tried that too:

$factor = 5;
while($n > $factor)
{
$n /= $factor while $n % $factor == 0;
$factor += 2;
last if($factor == $n);

$n /= $factor while $n % $factor == 0;
$factor += 4;
}

Finally, I put these all in different subroutines in a Perl program, and benchmarked those using Perl's Benchmark.pm module. It's a very easy to use module - my entire benchmarking code consists of

# Upper limit for random number generation
my $rangemax = 1000_000;
cmpthese(-10, {
# Recursive => " largestprimefRecurs((int(rand($rangemax))))",
Naive_Iteration => " largestprimefNaiveIter(int(rand($rangemax)))",
Arcane6nplusminus1 => " largestprimef6n(int(rand($rangemax)))",
Arcane6nadding2and4 => " largestprimef6nadding2and4(int(rand($rangemax)))",
Arcane6nusingCount => " largestprimef6nusingCount(int(rand($rangemax)))"
});

The -10 there asks Benchmark to run each code segment for at least 10 CPU seconds, so that random fluctuations due to irrelevant factors are smoothed out, and we have reliable results.

You might notice that the call to the recursive version (the original one) is commented out. Due to (what I believe to be) a problem with the Benchmark.pm module, I was not able to benchmark it using this method. Benchmark.pm would report that the recursive function ran thousands of times faster than the newer methods (probably counting the recursive calls too), but other testing revealed it was actually much slower:

sundar@System:~$ time for (( i=0; i<500; i++ )); do perl Recursive.pl; done;
real 0m5.631s
user 0m3.848s
sys 0m1.556s

sundar@System:~$ time for (( i=0; i<500; i++ )); do perl 6nMethod.pl; done;
real 0m3.219s
user 0m1.692s
sys 0m1.344s

We see the recursive version is slower by more than 2 seconds! I'm planning to investigate more about this behaviour of Benchmark.pm (a quick googling didn't turn up much), but for now suffice to know the old algorithm is much slower.

There's another entry in the code there that I haven't talked about: Arcane6nusingCount. [By the way, the 'arcane' naming convention is a tongue-in-cheek reference to this comment where the commenter said "I get the feeling that [the programmer who wrote this post] would consider that working with 6n +/- 1 as too arcane a method..." :) I couldn't start the function name with the digit 6, so I added 'Arcane' before each one. ;)]
Anyway, that Arcane6nusingCount was a version where I tried to implement the 6n +/- 1 algorithm by keeping a counter of the iterations, and incrementing by 2 or 4 based on whether count%2 was 0 or 1. Turns out this extra mod operation kills performance, as you can see in the benchmark results below.

Benchmark Results:
Legend:
Due to Blogger's @&%*&%# formatting, I had to change the names to shorter versions as follows:
Naive_Iteration     : Iter
Arcane6nplusminus1 : 6n+/-1
Arcane6nadding2and4 : Add2,4
Arcane6nusingCount : Count
The percentages are the amounts by which the algorithm in the horizontal (the row) is better than the one in the vertical (the column). The /s amounts are the number of times this subroutine could have been run in one second, averaged over the given inputs. For this test, the inputs were random numbers from 0 to 1000,000, as seen in the code above.


Trial 1








Rate Count Iter 6n+/-1 Add2,4
Count 45.9/s -- -24% -35% -39%
Iter 60.1/s 31% -- -15% -20%
6n+/-1 70.5/s 54% 17% -- -6%
Add2,4 75.2/s 64% 25% 7% --

Trial 2






Rate Count Iter 6n+/-1 Add2,4
Count 43.1/s -- -25% -41% -42%
Iter 57.4/s 33% -- -22% -23%
6n+/-1 73.6/s 71% 28% -- -2%
Add2,4 74.9/s 74% 30% 2% --

We see that the method of adding 2 and 4 directly (Arcane6nadding2and4) works best, but the one using multiples of 6 and dividing by $i-1 and $i+1 (Arcane6nplusminus1) is only slightly worse. And, in my opinion, it is clearer and more elegant. They're both much better than the naive iterative method, and the original recursive solution is not even comparable to these!

So, I guess the moral of the story is, look at ways you can optimize your code even if an 'authoritative source' seems to tell you've reached the optimum. Also, where performance is critical, profile and benchmark your code - they are much more reliable practically than theoretical algorithmic analyses (of course, this is true only when they are done right).

The program along with benchmarking code is available at: http://sundaryourfriend.googlepages.com/largestprimefactorBenchmark.pl
To run it, you need to have the Benchmark.pm Perl module installed (which would probably be available by default).

Have a nice day...

Wednesday 22 April, 2009

Top ten Internet resources for programmers

1. http://www.stackoverflow.com

A year ago, this website didn't exist. Now, it's at the top of this list of Internet resources for programmers! Stack Overflow is a place where programmers like you and me ask questions and answer them. Some things that make it great are: each question and answer can be 'voted' on (either +1 or -1), and the answers with the greatest number of votes appear at the top near the question; only programming related questions are allowed, and everything else is 'closed' down; any discussions on the answers go into the 'comments' section which is hidden by default - this means that flamewars and such nonsense are much less here, and mostly hidden from our view too.
But I can go all day about what makes it tick, but you won't really see how good a resource it is until you actually go there and try to search for some programming question you've been having. Chances are, someone else had it before you, and asked it there; in that case, you'd probably find very good answers to your question. If you are so unique that no other questioner thought of what you thought, you can login and post the question yourself - in which case you'd probably start getting answers within minutes, and the answer you wanted would be there within hours.
All that talk from me isn't probably going to convince you, so just go try it.

2. http://reader.google.com (actually, RSS feeds from programming blogs)

To be honest, this isn't a resource in itself, but can act as an aggregator of resources. If you're not familiar with the concept of 'feeds' or 'subscription' or 'RSS', here's a simplified view of it: when you add feeds from websites to Google Reader, you're asking it to check the sites for new content now and then, fetch the content automatically whenever it arrives and keep it stored, so that you can read it all in Reader without having to check each site yourself. If this does not seem like a big advantage to you (it didn't to me at first), start using it, and you'll see how much of a convenience it is.
The most important and common use of Google Reader (for us) is to subscribe to and read programming blogs such as Lambda the Ultimate, Joel on Software, etc., (in addition to webcomics such as Dilbert or xkcd :) ). Whenever you come across a blog that you consider interesting, click on the orange icon with the three curved lines that appears near the address bar (you are using Firefox, right?), and choose to subscribe using Google.
And after all this, don't forget to check out Google Reader every day or at least once a week. :)
While Google Reader might be the simplest for newbies (opinions differ even in this), there are many other feed readers and each one has its own fan base. ProBlogger seems to recommend FeedDemon, while other readers like the Sage extension for Firefox, Bloglines, etc. are also very popular.

3. IRC
Internet Relay Chat, simply put, is a set of 'channels' (what you might call chat rooms) each one devoted to a particular subject of interest. For example, there is a channel devoted to Wikipedia (#wikipedia I think), while another one devoted to discussions of mathematics (#math). You probably might find more use for ##c (for C programming), ##c++ (obviously, for C++ programming), etc. Like Google groups, here too, world experts on the field gather, so it's a very good resource if you know how to use it properly. If you're a beginner to IRC, I'd recommend the free IceChat client which is very newbie friendly and also powerful. The channels are hosted in things called 'IRC servers'. For programmers, almost the only server we'd need is irc.freenode.net. Connect to it, be polite, and share your knowledge too.

4. http://www.reddit.com/r/programming
Proggit, as it is fondly called, is a rich place where intelligent discussions, incandescent flamewars and 4chan memes happen side by side. Like Hacker News (#5 in this list), this too is a place where people share and vote upon links (obviously, links relating to programming). If you have the attitude of a gold miner, ready to dig out gold from among so much dirt and rock, you can learn a real lot from here. Just remember not to get trapped into the memes, for it is said there is no salvation after that.

5. http://news.ycombinator.com/
Known as 'Hacker News', it's a site where people post links to things they find interesting, and others vote upon them. This site is for 'hackers', and so though most of the content is about programming, other brain stimulating things that would interest a hacker would also be present. The ambience here is one of extreme decency and politeness, in contrast to most of the Internet. Some like the UI of the site, some don't, see if it fits you.
This site would have been much higher in the list if I had written this post a few weeks ago. But in that space, I've come to see that the discussions on Reddit (site #4 in this list) are usually much richer and informative than the ones here. Reddit does have other noise (people attempting at humour and nasty flamewars), but in trying to suppress that noise, HN (Hacker News) has seems to have suppressed some of the useful discussion too.

6. http://en.wikibooks.org/wiki/Category:Programming
Even if you knew about all the other resources here, I'd bet you didn't know this one till now. From Regular Expressions to Functional Programming, there's a wealth of information about many programming topics. This along with Wikipedia can take your programming knowledge to a higher plane.


7. comp. heirarchy in Google groups
This is a whole list of 'Usenet' groups about computers. Within these, you'll most probably find the groups under comp.lang part most useful, but the others cater to special needs - for eg., if you're into programming for graphics, you might find the comp.graphics groups useful. These groups have lots of experts in the field of the group, which means you'll be interacting with top notch programmers in your area of interest.

8. http://www.topcoder.com
A website where you get to 'compete' with other programmers from throughout the world in various 'tournaments', 'marathons', etc. Frankly though, I don't care a bit for the competition or ranking part of it. It contains so many good programming problems that, when you solve them, give you insights into so many things. Also, the Algorithm Tutorials here are said to be very good (they are on my 'to read' list :) ).
Some practical tips (since the UI of TopCoder is less than ideal and you might find it difficult to do things): To register, click on Login in the top right corner (you might have to scroll right a little to see it), then click on Register in the page that comes up. To participate in some tournament or just practice using questions from old tournaments, login to TC (that's the short form of TopCoder), click on Algorithm in the left panel, then on 'Launch Arena'. It would probably ask to install Java Web Start, go ahead and install it. After that, when the arena appears, login again in that window. Then, if you wanna participate in some tournament going on now (they'll send you mail about the schedules from time to time), choose the Active Contests menu. If you just want to try out old questions, choose the Practice Rooms menu. Either way, choose one of the SRMs from the menu. When you've been 'moved' to that room, click on the box that says 'Select one' and choose one of the scores listed there (like 250, 500 and 1000). A programming problem of that score would be presented to you. After this, I guess you'd be able to take off on your own.
There are other sites like Project Euler and Sphere Online Judge (SPOJ) which are somewhat similar to TC. Project Euler is heavily inclined towards mathematics, so I find its scope to be quite narrow. SPOJ is similar to TopCoder in that it too provides several algorithmic problems and allows us to submit solutions to be evaluated and validated. A small disadvantage is that the collection of problems is less than that in TC. But a great advantage in SPOJ is that it allows a wide variety of programming languages - TopCoder is limited to a few mainstream languages: Java, C++, C#, and VB .Net.

9. http://www.thedailywtf.com
Thought the Internet is serious business, at times we need to take some time off to laugh out and relax. What if you can learn something while you do that? Well, the Daily WTF (they claim that it means Worse Than Failure) is a place for that. Working programmers tell tales about horrid code and other maladies from their co-workers, which are mostly because of a lack of programming knowledge. The stories are hilarious, but at the end of most of them, you'd be a bit wiser knowing a little more as to how to be a good programmer. Of special mention are the 'Featured Articles' and 'Tales from the Interview' sections.

10. http://www.google.com
Ok, I can hear your 'Duh!' even from here, but give me a few seconds. To be honest, no list ever can contain all of the wonderful programming resources out there. There are just too many forums, too many personal web pages that contain precious wealth of information. With some power googling (I'm planning on making a post on this soon), you can dig all that out. So, this makes Google probably the most useful resource for programmers. The only reason this is on number 10 is that you very probably know about it already (if you didn't, what forest were you living in? :) )

So, if I were to limit myself to 10, those would be it...

I know, I know, I probably missed your favourite website out of this list!
If so, do leave a comment here so that others (including me) might know and experience its glory. Thanks... :)

Wednesday 15 April, 2009

Lisp by and for the C programmer

Finally, it's done. That program for finding the next palindrome given a number. I had to jump through several hoops and get help from the www and IRC to get it done (thanks to the helpful guys at #lisp - the newbie bashing notwithstanding).

Lisp is supposed to be a functional language. To be honest, I myself don't clearly understand what that means; I'm trying to learn little by little. As of now, it seems to me to be a conglomorate of various concepts and language facilities, some of which are present in some languages, and almost no real language seems have all of them.

Anyway, because of that, I tried not to make the program entirely C-like, and tried to do it the 'functional' way. Practically to me, it involved using recursion where I would have used iteration in C, and using variables as if they were constants - trying not to modify anything. In functional programming (FP) circles, this is called treating symbols as immutable (I think), and is glorified as a Good Thing. I haven't adhered to it fully (one place that comes to mind is the changing the value of ndigits in increm-left-from-here function), but I feel this is good enough for a C/C++/Perl programmer who understands very little of FP. :)

Some descriptions that might help understand the program are here. You can try out things with the Lisp in a Box IDE for Windows (you'll need to install the base installer and the CLISP module). Some clarifications: I'm not an expert in Lisp and don't claim to be one, I've described things as I understand them. Also, when making analogies to C, I've tried to take the thing that comes closest to it, but don't take the analogies literally. Concepts in Lisp are quite different from those in C, these are here only to give you a rough idea about them.
() syntax:
        The parenthesis syntax here is something that might be very new to you. But it's quite simple, and very consistent.
        When you see (A B C) in Lisp, what it means is call the function A with arguments B and C (A can also be a macro, which I've explained under incf). In C syntax, that might be A(B, C). One good thing about Lisp is, this notation is followed very consistently throughout the language. Even operations like addition are done with this syntax: 1+2 in C will be (+ 1 2) in Lisp. Whenever you see a list (x y z a b c), you can say it's a function call calling function x with arguments y, z, a, b, and c. The only exception to this are literal lists - you know, the lists you use to store data, like C arrays. When you want a list of number 1 to 5, you might try (1 2 3 4 5). But... Lisp sees such lists as function calls, and so tries to 'call' the function 1 with arguments 2, 3, 4, and 5. Surely not what we wanted. So, to tell Lisp that we want a literal list, and not a function call, we have to put a single quote ' before the list. So, '(1 2 3 4 5) will achieve what we wanted. This actually works by changing itself it (quote (1 2 3 4 5)) where quote is a macro. But you don't need to understand that to use the single quote. :)

defun:
        This defines a function, like the sub keyword in Perl. For the C-ers among us,
(defun funName (arg1 arg2)
(body ...
....))
is very much like
void funName(type1 arg1, type2 arg2)
{ body ...
...}
Immediately, we can notice a few things: In Lisp, we don't specify the type of the arguments that the function takes - it can take whatever it wants. arg1 itself can be an entire list itself, or a number, or a string, or an object, or any type you can imagine (and Lisp can handle :) ). Also, there's no declaration of what the function is going to return - this too can be anything.

incf:
        This is a simple macro that increments whatever you give to it as argument. If you're asking 'hello, what is a macro?', congratulations for the careful reading.
        A macro is a thing that modifies the code at the place where you put it. They are somewhat like the preprocessor macros in C - and before the Lisp fans start grumbling, I'll add - but they're supposed to be much much more powerful. There's an entire book that is almost fully dedicated to doing magic with Lisp macros: On Lisp.
        In case you didn't get any clear idea about what macros are from the previous paragraph (yes, I guessed it), I'll explain with this example itself. In my code, in one place, I had
(incf i)
While debugging the code, I saw how the incf was expanded: the macro just replaced itself with
(setq i (+ i 1))
I myself wrote a macro in this code using the defmacro command.
defmacro:
        I described what a macro is above. In Lisp, the syntax for macros is very much like the one for functions, with defun replaced by the keyword defmacro. For example, I have defined a macro in this code as:
(defmacro get-mid-index (list)
`(ceiling (1- (/ (length ,list) 2))))
        A macro too gets arguments like a function (the argument is 'list' here). What follows after the argument is the expansion of the macro, which means the macro is replaced with this thing wherever we call the macro.
        There are some curious things to note here. First off, the expansion of the macro is not in an ordinary
(some code here ...
...)
format here. There is a ` and a , in that code, which makes it special. In () syntax, we said that '() makes a literal list - that is, a list that is not evaluated, but is treated as data. The ` (that is the backtick - the thing which is to the left of 1 in the keyboard) is similar to ' but with a difference: while nothing within the '() is evaluated, in a `() you can ask Lisp to evaluate certain things alone by putting a comma (,) before it. For example, in this expansion, the 'list' has a comma before it, so it will be evaluated. Since here 'list' is the argument we pass to the macro, it gets evaluated to the actual list which we pass. For example,
(get-mid-index digits-of-n)
becomes
(ceiling (1- (/ (length digits-of-n) 2)))
That's nifty, right? :)
nth:
        nth is a function that returns the n'th element of a list. For example, (nth 2 '(a b c d e)) will return c. Here, it's important to note that Lisp lists are 0-based - the counting starts from 0. So a is the 0th element, b is the 1st, etc.
cond:
        cond is a 'special form' (which is something like a function or a macro, but they say it's not either of those. I'm yet to understand special forms) which is somewhat like the if() construct of C. Lisp too has an (if ...) syntax, but it is limited to single statements. You can say (if (condition) (do-this) (else-do-that)), but (if (condition) ((do-this) (and-this)) (else-do-that)) is not allowed. In C syntax, it's as if
if(condition)
    do-this;
else
    do-that;
is allowed, but
if(condition) {
    do-this;
    and-do-this;
}
else
    do-that;
is not allowed. In most practical cases, I find that I need to put multiple statements in the if's body, so I rarely use if in Lisp.
        The cond function is similar to if, but much more flexible. It's syntax is,
(cond ((condition1) (do-this) (and-this) (also-this) (and-much-more) ...)
          ((condition2) (do-that) (and-that) (also-that) (and-much-more) ...)
            ...
            (t (do-something) (do-something-else) (and-much-more) ...))
        In this syntax, cond is followed by many lists. Each list contains one condition and a set of statements. Lisp will look through each of those conditions, and when any of the conditions succeeds (that is, returns true), it will execute the set of statements that are with that conditions. One weird thing about cond is that it requires that at least one of the conditions be true. If not, Lisp will go mad and bite you (or so they say). So, just for safety, it is good practice to put a (t ..) as the last condition, which stands for 'true', so it is a condition that will always be true. If any of the conditions succeeds, Lisp will execute its statements and exit the cond. If none of them succeed, the statements that are with the (t ...) will run.
        If you want, you can think of the above cond as equivalent to:
if(condition1) {
do-this;
and-this;
also-this;
and-much-more; ...
}
else if(condition2) {
do-that;
and-that;
also-that;
and-much-more; ...
}
...
else {
do-something;
do-something-else;
and-much-more; ...
}

Note that the final (t ...) in Lisp is like the final else clause of C.
setq:
        setq is like the = assignment of C. i = 1; in C would be (setq i 1) in Lisp. Only today (two days after I finished this code), I read here that it is better to use setf for this purpose in Lisp, but so far everyone has told me to use setq, so I'm not sure which one is really better. The author of this FAQ that I linked to seems to know things, so I'm gonna trust him and use setf hereafter. It's really a trivial change, here it would be (setf i 1) instead of (setq i 1).
car and cdr:
        These strange names stand for simple concepts - they give you the first element of the list and the rest of the elements of the list, respectively. (car '(1 2 3)) would be 1. (cdr '(1 2 3)) would be (2 3). As you can see, car returns a single value (which the Lispers call an atom), but cdr returns a list.
format:
        The format command is the printf of Lisp. It has many fancy options - it can even iterate through a list and print the elements one by one. Peter Seibel gives a very good explanation of the format function here. One simple example is:
(format t "~D~%" n)
is similar to
printf("%d\n", n);
        The 't' argument to format tells it to print the output to the screen (and not to a file or something like that).
With this intro, and some help from the Net, I think you'd be able to get a grasp of this program and learn a lot of Lisp in the meantime. Learning Lisp is said to be an enlightening experience, so I hereby invite you to the path to Nirvana, with this piece of code (many thanks to this online syntax highlighter for producing this HTML):
(defun inc-nth (n list)
(incf (nth n list))
list)

(defun get-mid-elem (list)
(nth (get-mid-index list) list))

(defun mid-inc (list)
(inc-nth (get-mid-index list) list)
list)

(defun get-digits (n)
(cond ((and (< n 10) (>= n 0))
(list n))
((< n 0)
(get-digits (- n)))
(t
(append (get-digits (floor (/ n 10))) (list (mod n 10))))))

(defun num-from-digits (digits)
(cond ((= 1 (length digits))
(car digits))
(t
(+ (car (last digits)) (* 10 (num-from-digits (butlast digits)))))))

(defmacro get-mid-index (list)
`(ceiling (1- (/ (length ,list) 2))))

(defun get-mirrored-num (ndigits)
(num-from-digits
(let ((firsthalf (butlast ndigits (floor (/ (length ndigits) 2)))))
(append firsthalf
(if (evenp (length ndigits))
(reverse firsthalf)
(cdr (reverse firsthalf)))))))

(defun increm-left-from-here (index ndigits)
(cond ((= (nth index ndigits) 9)
(progn
(setf (nth index ndigits) 0)
(if (zerop index)
(setq ndigits (cons 1 ndigits))
(setq ndigits (increm-left-from-here (1- index) ndigits)))))
(t
(inc-nth index ndigits))))

(defun get-next-palin (n)
(let ((ndigits (get-digits n)))
(cond ((> (get-mirrored-num ndigits) n)
(get-mirrored-num ndigits))
(t (get-mirrored-num (increm-left-from-here (get-mid-index ndigits) ndigits))))))

(let ((num-cases (parse-integer (read-line))))
(format t "~{~D~%~}"
(loop
for i from 1 to num-cases
collect (get-next-palin (parse-integer (read-line))))))

Tuesday 14 April, 2009

I want a "good" editor to create blogs

It's been happening for quite a while.

Blogger's post editor has been playing games with me. Especially on my GATE solutions blog, where I like to keep things in a uniform format (dunno why, but it feels good to have similar formatting for similar posts - and all the posts there are similar to one another :) )

I do a 'blockquote' (that's the double quotes icon in the Compose mode), type something, want to get out of the blockquote mode for the next line, it un-blockquotes all the previous lines too. What the heck?

I select some text to delete it, Blogger decides I want to delete some surrounding text also and 'helpfully' deletes. Just for testing, I tried deleting 'Blogger' in the previous line, it deleted 'decides' along with it. What on earth can be the logic behind such a thing?

And the biggest fail here is that, when I try to undo that stupid deletion to restore order, it undoes more than I asked for and deletes some things I typed! Seriously, what did they think they were doing when they added this 'feature'? Now, just because I wanted to delete something, I have two options - either not do an undo and retype the things near the place I deleted, or do the undo and retype the things I typed recently. What if what I did not remember what I had typed there? I have to juggle back and forth between undo and redo to see what I typed myself, and retype it myself! That's really... Worse Than Failure on Blogger's part.

With all that frustration out of me, what I wanted to say was, I need a new editor to create my blog posts or I think I'll stop blogging from the mere frustration of dealing with this. Don't worry, I don't think the situation is so bad I won't be able to get any editor to do the job. :)

First, I remembered Microsoft Word itself has a 'Post to Blogger' option, so I tried it. Well, what can I say? The HTML it created was, to put it bluntly, horrible! I'd have been somewhat ok with that if not for the next fact: when I clicked publish to blogger (or something similar), it told me my username and password would be sent in plaintext and anyone can read them. What? This huge program doesn't have a small place for incuding some SSL agent to do encrypted transmissions? I sure ain't giving my Google username and password in the wild like that.

There's also a Blogger for Word add-in to Word that probably handles things more securely, but it's development has been discontinued by Google, and what they give for download doesn't work for me in Word 2007.

Thinking there should be some solution for it, I searched throughout the Internets, only ending up reading things like this which told me some things about Blogger's WYSIWIG editor worked. Not really what I was looking for, but I learnt something. :)

So, the option I'm looking at currently is, creating the posts in the Kompozer or some similar editor, copying the resulting HTML here and posting. Quite a lot of work for such a simple task, but it seems there's no other option. Also, Kompozer doesn't seem to have easy buttons for things such as blockquotes, etc. like Blogger's interface, which makes things a bit more difficult. But in a few days' time, I think I'll be ok with anything to get off this annoying editor.

So, kind reader, if thou knowest of any editor which can my life more pleasant, kindly bestow thy blessings on poor me and reveal its name. I shall be indebted a lot to thou. Thanks. :)

Friday 10 April, 2009

Finding the next palindrome given a number

I recently was looking for websites similar to Project Euler, and came across SPOJ - the Sphere Online Judge. It gives us some problems, then asks us to submit a solution, which it will evaluate and tell if we are worth our salt as programmers ;)
More details here.

There, I came across this problem which looked quite interesting. Basically, the challenge is, given a number n, we've got to find the palindromic number which comes after this.
Some definitions:
Palindromic number: A number that reads the same either way, like 12421, 57375, 8448, 146641, etc.
"comes after this": the palindrome number must be greater than the given number - given 12321, we must find the next palindrome and not return 12321 itself as the answer.

Ok, with those made clear, let's dive into the problem.

On first sight, it seems simple enough: iterate from n+1 checking if each number is a palindrome. Stop when you find a palindrome.
Unfortunately, that strategy is too slow for our purposes - SPOJ clearly states the inputs can be very huge, in which case this brute force would probably not be able to complete it within the given 9 seconds.

Anyway, we are programmers, and must seek more efficient ways when they exist. Here, it does seem there must be a way to find the next palindrome given a number. What would it be?

My first thought was, simply mirror the number around its centre. Then, we'd surely end up with a palindromic number.
What I mean is, given a number 16310, mirror the first half values across the central '3' to give 16361. If the number has an even number of digits, just mirror it about the centre of the number, like 3628 to 3663.

That strategy works as long as the resulting 'mirror-imaged' value is greater than the original number. What if it results in a number less than (or equal to) the original? Say, for 41258, it would give 41214, which is less than the original number. Or 12321, which would give 12321 itself, which is equal to the original number.

So my next strategy was, if simple mirror-imaging fails, increment the middle digit and then do a mirror image. So, 41258 would become 41314. 12321 would be 12421.

But again, there's a problem with this approach. What if the number had even number of digits? There is no middle digit to increment. Let's say the number was 4265. It's simple to see that, instead of the middle digit, if we increment the digit to the left of the middle (2 here), and do the mirror image, we'll have the palindrome we want: 4334 in this case.

So far, so good.

Now, onto a big issue: 9. Yes, the digit nine. We saw that when the simple mirroring fails, we increment the middle (or left to middle) digit. Now, what do we get when we increment 9? 10. Oops! We can't put a 10 where 9 was. Want proof? Try 41968. Simple mirroring would give 41914, no luck there. Incrementing the 9 to 10 would give 411014, not a palindrome even. And also, it's surely false that there's no palindrome between 41,968 and 411,014 (that's between forty one thousand and four hundred and eleven thousand).

So, what do we do? Luckily, something we learnt in elementary school - carrying over - seems to help here too. Wherever the 10 occurred, keep the 0 there itself, carry over the 1 to the previous digit. So here, 41968 would become 42068 which on applying mirroring would give 42024.

Here too, there's a caveat: consider the extreme cases: 99999 would give (with repeated carryover) 1000001, and 9999 would give 100001. In both cases there's an extra 0 hanging in there. So far I've not been able to generalize it, and have been keeping the case of all digits being 9 a special case only. In this special case, I do the usual procedure, then remove one of the 0's from the resulting number. Ugly, but works. :)

EDIT: Turns out if I finish the increment and carrying over parts fully, then mirror image the number according to how many digits it now has, the 9999 cases take care of themselves with the original algo itself - no 'special cases' needed. :)

So, I believe this is the complete algorithm to find the next palindrome given a number. I have not thought through the steps very carefully, so if you see any flaws, kindly point them out...

I'm trying to code this thing in the Lisp programming language, so I'm taking baby steps towards the solution. As of now, I'm yet to implement the carrying over for 9 part alone. And I don't have any idea how complex that is going to be.

Update: I finished coding this in Lisp, but lost the code somewhere. I coded it again in Python now, which is available here. I've put in a lot of comments so as to make it somewhat easily understandable. Hope this helps the doubting Thomases believe. :)

By the way, my GATE solutions blog now has full feed access, which means you can read it fully without leaving the comfort of your RSS reader.

Happy coding :)

Thursday 2 April, 2009

Back to blogging

Yeow... It's been quite a while since I blogged here. I've been doing somewhat better in my other blog about GATE question papers' solutions.
The reason I've not been so active here is, unlike there, here I don't have anything specific to say.. I was just bloggin whenever I felt I had free time, and also had something technically interesting to write about.
These recent days, I had convinced myself that I was busy, couldn't afford time for blogging, didn't have anything worthy enough to blog about, etc. It's crazy to what extent you can deceive yourself, given this is possibly the most vetti (translated: being without any work) period I have ever had... :)
Meanwhile, Vijay had a look at my blog and was upset that it was fully technical. He recommended I write something 'not-so-technical' and more like a blog. And that seemed like a good idea too, as that would (hopefully) be much easier to do than to analyze and write a technical one...
But what non-technical topic do I know anything about?
I'm a closet nerd, and hate small talk. I might do it in RL for 'social' reasons (gee, do I hate that word!), but surely not in my blog. This is my space, and shall remain so.
Vijay himself came to the rescue again - 'you think a lot about spiritual and philosophical issues right, why not write articles about that?' he asked. For then, I pointed him to my pseudo home page: http://sundaryourfriend.googlepages.com, but I knew it was heavily outdated and incomplete, and the content wasn't all that mature either. My views and ideas have changed a lot in the mean time (hopefully in a positive gradient), and I've (again hopefully) matured a lot.
So, the conclusion of all this rant is, this is probably going to be the platform for my spiritual or philosophical posts too (unless I decide to start a new blog for it - in which case, I'll surely tell you).
Also, don't get daunted by those two words - By spiritual, I mean things that can improve your life, and by philosophical, I just mean things that I think through deeply.. I'll keep the articles as interesting as Sundar-ly possible. ;)
Finally, I promise you this will be the only nearly-contentless post in my blog (at least for some time), and I'll post something worthwhile as soon as possible... :)