You must have Linux and scripting knowledge for this job, and a knowledge of mailbox formats, especially the one KMail uses.
I have a very large ~/Mail folder which KMail uses to store all the mail I receive in various folders. The problem is that, due to various mistakes, there are several copies of many emails. The reason is that they have sometimes been stored from several sources (after being auto-forwarded) and they have been imported multiple times.
The duplicate messages are identical, so they have the same headers, text, ID and so on. They are often (but not always) immediately after each other too.
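Since the copies are byte-for-byte identical, one possible way to spot them is to hash each message and keep only the first occurrence, which also catches duplicates that are not adjacent. The sketch below is only an illustration: the names `message_fingerprint` and `drop_duplicates` are made up for this example, and it assumes that a leading "From " separator line should be ignored because it may carry a different delivery date on each copy.

```python
import hashlib


def message_fingerprint(raw_message: bytes) -> str:
    """Return a SHA-256 hex digest of a message's headers and body.

    Assumption: a leading mbox "From " separator line is not part of the
    message proper, so it is skipped before hashing.
    """
    if raw_message.startswith(b"From "):
        # Drop the separator line; only headers and body are compared.
        _, _, raw_message = raw_message.partition(b"\n")
    return hashlib.sha256(raw_message).hexdigest()


def drop_duplicates(messages):
    """Yield each distinct message once, keeping first occurrences in order."""
    seen = set()
    for raw in messages:
        digest = message_fingerprint(raw)
        if digest not in seen:
            seen.add(digest)
            yield raw
```

Keeping a set of digests rather than comparing neighbours means the order of the surviving messages is unchanged and memory use stays small even for very large mailboxes.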
I want a script which will take all these mailboxes, create a new directory structure and strip out the duplicate emails, so that my inbox is much smaller.
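A minimal sketch of what such a script might look like, using Python's standard `mailbox` module. Everything here is an assumption rather than a finished solution: the function names are hypothetical, the folders are assumed to be mbox files (anything not starting with "From " is skipped, not copied), duplicates are only removed within each mailbox file rather than across folders, and the destination is assumed to be a fresh directory outside the source tree. The original files are only read, never rewritten.

```python
import hashlib
import mailbox
import os


def looks_like_mbox(path):
    """Heuristic: an mbox file starts with a "From " separator line."""
    with open(path, "rb") as fh:
        return fh.read(5) == b"From "


def dedupe_mbox(src_path, dst_path):
    """Copy src_path to dst_path, skipping messages already seen in this file."""
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path, create=True)
    seen = set()
    kept = dropped = 0
    try:
        for key in src.iterkeys():
            # Hash headers and body only (the "From " line is excluded).
            digest = hashlib.sha256(src.get_bytes(key)).hexdigest()
            if digest in seen:
                dropped += 1
                continue
            seen.add(digest)
            # Copy the raw bytes, separator line included, without re-parsing.
            dst.add(src.get_bytes(key, from_=True))
            kept += 1
        dst.flush()
    finally:
        dst.close()
        src.close()
    return kept, dropped


def dedupe_tree(src_root, dst_root):
    """Mirror the directory layout of src_root under dst_root, deduplicated.

    dst_root must not be inside src_root, or the walk would visit it too.
    Returns {relative mailbox path: (kept, dropped)}.
    """
    totals = {}
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        out_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(out_dir, exist_ok=True)
        for name in filenames:
            src_path = os.path.join(dirpath, name)
            if looks_like_mbox(src_path):
                key = os.path.normpath(os.path.join(rel, name))
                totals[key] = dedupe_mbox(src_path, os.path.join(out_dir, name))
    return totals
```

The returned counts make it possible to check, per folder, that kept plus dropped equals the original message count before the old tree is retired.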
This should be relatively easy, but it needs to be absolutely bullet-proof. The mailbox files appear to be plain text, one message after another, with a line beginning "From " leading each message. You should know more about this than me if you want to attempt the project.
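To illustrate the layout described above: in the common mbox format each message starts with a separator line beginning "From " (with a space, distinct from the "From:" header), and body lines that would otherwise look like a separator are usually stored quoted as ">From ". The following is a rough sketch of splitting such a file by hand, assuming that convention holds for these files; `split_mbox` is a hypothetical name, and a real script would probably be safer relying on a tested library instead.

```python
import io


def split_mbox(data: bytes) -> list:
    """Split raw mbox data into messages, each starting at its "From " line.

    Assumption: every line beginning with "From " is a separator, i.e. such
    lines inside message bodies were quoted as ">From " when stored.
    """
    messages = []
    current = []
    for line in io.BytesIO(data):  # iterates line by line, keeping b"\n"
        if line.startswith(b"From "):
            if current:
                messages.append(b"".join(current))
            current = [line]
        elif current:
            # Lines before the first separator (if any) are ignored.
            current.append(line)
    if current:
        messages.append(b"".join(current))
    return messages
```

Because each piece keeps its separator line and trailing blank line, joining the pieces back together reproduces a well-formed file exactly, which is a cheap sanity check before anything is deleted.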