I had a brief play with this earlier. I took the PDF, fed it through some scripts and ended up with a database of word-page pairs (excluding some common stop words). The problem is, of course, that it needs a human to go through and restrict the listing to the important concepts that should be in the index.
Otherwise the index just ends up being half the size of the book again.
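For illustration, the word-page step might look roughly like the sketch below. This is not the actual scripts; it assumes the PDF text has already been extracted into one string per page, and `STOP_WORDS` and `build_word_page_index` are made-up names with a deliberately tiny stop-word list.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that"}


def build_word_page_index(pages):
    """Map each non-stop word to the sorted 1-based page numbers it appears on."""
    index = defaultdict(set)
    for page_number, text in enumerate(pages, start=1):
        # Lowercase the page and split it into runs of letters.
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOP_WORDS:
                index[word].add(page_number)
    return {word: sorted(numbers) for word, numbers in index.items()}


# Assumed input: one string per page, already extracted from the PDF.
pages = [
    "The index of a book lists the important concepts.",
    "An automated index is too large to be useful.",
]
index = build_word_page_index(pages)
```

Even on this two-page toy input, nearly every remaining word gets an entry, which is exactly the size problem described above.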
Are there any good ideas out there for automating this further?