I need a Java program that reads a training set of spam and legitimate emails from a directory. It will read all the emails, break them into words, and put all the words into a one-dimensional array together with their frequencies. Each email must then have its own array (vector), the same length as the previous array, built from the most frequent words, and each vector has to end by declaring whether the email is spam or not. The program will use a stop list containing all the unnecessary words and symbols, such as "and", "on", and punctuation like ( , . " ), and it will remove them from the vectors. We need to decide which of the words we will [...], we have to create the training and test sets, and then we have to transform them into an ARFF file.
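
To make the request more concrete, here is a minimal sketch of the pipeline as I understand it: read the emails, tokenize them, drop stop words, count frequencies, keep the most frequent words, build one count vector per email, and write the result as ARFF text. Everything named here is an assumption for illustration only: the class name SpamArffSketch, the tiny hard-coded stop list, the "w_" prefix on attribute names (used so that a word such as "class" cannot collide with the label attribute), the ISO-8859-1 decoding, and the choice of raw counts as vector values. It also assumes the caller already knows which emails are spam, for example because spam and legitimate emails sit in separate folders.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

final class SpamArffSketch {

    // Hypothetical stop list; a real one would be much longer and loaded from a file.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("and", "on", "the", "a", "to", "of"));

    /** Reads every regular file in the directory as one email (ISO-8859-1 is an assumption). */
    static List<String> readEmails(Path directory) throws IOException {
        List<String> emails = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(directory)) {
            for (Path file : files) {
                if (Files.isRegularFile(file)) {
                    emails.add(new String(Files.readAllBytes(file), StandardCharsets.ISO_8859_1));
                }
            }
        }
        return emails;
    }

    /** Lower-cases the text, splits on anything that is not a-z, and drops stop words. */
    static List<String> tokenize(String text) {
        List<String> words = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                words.add(token);
            }
        }
        return words;
    }

    /** Counts how often each word occurs across all emails, in first-seen order. */
    static Map<String, Integer> countWords(List<String> emails) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String email : emails) {
            for (String word : tokenize(email)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Returns the k most frequent words; ties keep first-seen order (stable sort). */
    static List<String> topWords(Map<String, Integer> counts, int k) {
        return counts.entrySet().stream()
                .sorted((a, b) -> Integer.compare(b.getValue(), a.getValue()))
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    /** Builds the count vector of one email, aligned with the vocabulary order. */
    static int[] vectorize(String email, List<String> vocabulary) {
        int[] vector = new int[vocabulary.size()];
        for (String word : tokenize(email)) {
            int index = vocabulary.indexOf(word);
            if (index >= 0) {
                vector[index]++;
            }
        }
        return vector;
    }

    /** Renders the vectors as ARFF text, with the spam/legitimate label as the last attribute. */
    static String toArff(List<String> vocabulary, List<int[]> vectors, List<Boolean> isSpam) {
        StringBuilder arff = new StringBuilder("@relation emails\n\n");
        for (String word : vocabulary) {
            arff.append("@attribute w_").append(word).append(" numeric\n");
        }
        arff.append("@attribute class {spam,legitimate}\n\n@data\n");
        for (int i = 0; i < vectors.size(); i++) {
            for (int count : vectors.get(i)) {
                arff.append(count).append(',');
            }
            arff.append(isSpam.get(i) ? "spam" : "legitimate").append('\n');
        }
        return arff.toString();
    }
}
```

A caller would typically run readEmails once per folder, pass the combined list to countWords and topWords to fix the vocabulary, call vectorize on each email, and write the string from toArff to a file. The same vocabulary should be reused for the test set so that both ARFF files have identical attributes.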