In Progress

487366 Data Mining

I need a Java program that reads the training set from a directory; the set contains spam and legitimate [url removed, login to view]. The program will read all the emails, break them into words, and put all the words into a one-dimensional array together with their frequencies. Each email must then have its own array (vector), of the same length as that array, built from the most frequent words, and each vector must declare whether the email is spam or not. The program will use a stop list containing all the unnecessary words and symbols, such as "and", "on", and punctuation like ( , . " ), and will remove them from the vectors. We need to decide which of the words to keep.
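To make the request concrete, here is a minimal sketch of the word-counting and vector-building steps described above. It is only one possible reading of the requirements, not a finished solution: the class name SpamFeatureSketch, the method names, the regex used to split words, and the tie-breaking rule for equally frequent words are all assumptions made for illustration. Reading the files from the directory and attaching the spam/legitimate label to each vector are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class; all names here are illustrative assumptions.
class SpamFeatureSketch {

    // Split text into lowercase words, dropping symbols and stop-list entries.
    static List<String> tokenize(String text, Set<String> stopWords) {
        List<String> words = new ArrayList<>();
        // Assumption: any run of characters that is not a letter or digit separates words.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                words.add(token);
            }
        }
        return words;
    }

    // Count how often each remaining word occurs across all training emails.
    static Map<String, Integer> countWords(List<String> emails, Set<String> stopWords) {
        Map<String, Integer> counts = new HashMap<>();
        for (String email : emails) {
            for (String word : tokenize(email, stopWords)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Keep the k most frequent words; ties are broken alphabetically (an assumption).
    static List<String> topWords(Map<String, Integer> counts, int k) {
        List<String> vocabulary = new ArrayList<>(counts.keySet());
        vocabulary.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(vocabulary.subList(0, Math.min(k, vocabulary.size())));
    }

    // Build one email's vector: position i holds the count of vocabulary word i.
    static int[] vectorize(String email, List<String> vocabulary, Set<String> stopWords) {
        int[] vector = new int[vocabulary.size()];
        for (String word : tokenize(email, stopWords)) {
            int index = vocabulary.indexOf(word);
            if (index >= 0) {
                vector[index]++;
            }
        }
        return vector;
    }
}
```

In this sketch the vocabulary returned by topWords fixes the length and order of every email's vector, so all vectors are directly comparable. How many words to keep (the value of k) is the open decision mentioned above.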

Output file format:

The left column will have the names of the .txt testing files

The right column will have the predictions

's' for spam

'l' for legitimate

The two columns must be separated by a tab (\t)
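The output format above could be produced along these lines. This is a sketch under assumptions: the class name PredictionWriterSketch, its method names, the use of a map from file name to prediction, and the UTF-8 encoding are all hypothetical choices, since the posting does not specify them.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; names are illustrative assumptions.
class PredictionWriterSketch {

    // One output line: file name, a tab, then 's' (spam) or 'l' (legitimate).
    static String formatLine(String fileName, boolean isSpam) {
        return fileName + "\t" + (isSpam ? "s" : "l");
    }

    // Write one line per testing file, in the iteration order of the map.
    static void write(Path outputFile, Map<String, Boolean> predictions) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Boolean> entry : predictions.entrySet()) {
            lines.add(formatLine(entry.getKey(), entry.getValue()));
        }
        // Encoding is an assumption; the posting does not name one.
        Files.write(outputFile, lines, StandardCharsets.UTF_8);
    }
}
```

Passing a TreeMap or LinkedHashMap to write keeps the rows in a predictable order.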

I will give you the training set

Skills: Anything Goes, Data Entry, Data Mining, Java

About the Employer:
(4 reviews)

Project ID: #2233276