In Java, write a tokenizer class that splits a string into words, phrases (greedy style), and other tokens according to the conventions used by BreakIterator (i.e. subclass BreakIterator). The return type is List. The reference dictionary is [url removed, login to view]; initialize/cache the phrases (2 words or more) in memory for better performance. Speed is extremely important. Please find the most efficient phrase search algorithm.
I like coffee table!
list("I", " ", "like", " ", "coffee table", "!")
I have attached code written to parse Chinese and found its greedy search algorithm to be usable. However, the code is buggy and does a lot of unwanted processing specific to Chinese characters. Please recommend a better algorithm in your bid message.
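For reference, here is a minimal sketch of one way the greedy phrase matching could work. It is not the attached code and not a definitive answer to the request. It assumes the phrases are already loaded as plain strings, it wraps BreakIterator.getWordInstance rather than subclassing BreakIterator, and it matches phrases case-insensitively. The names PhraseTokenizer, Node, split and key are hypothetical, made up for this illustration. Phrases are stored in a trie keyed by word-level tokens, so each lookup step is a single hash probe and the longest matching phrase wins.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: greedy (longest-match) phrase tokenizer built on top of
// the word boundaries reported by java.text.BreakIterator.
class PhraseTokenizer {
    // Trie node keyed by lower-cased BreakIterator tokens (words, spaces, punctuation).
    private static final class Node {
        final Map<String, Node> next = new HashMap<>();
        boolean terminal; // true if the path from the root spells a complete phrase
    }

    private final Node root = new Node();

    // Assumption: phrases arrive as plain strings, e.g. read once from the dictionary.
    PhraseTokenizer(Iterable<String> phrases) {
        for (String phrase : phrases) {
            Node node = root;
            for (String tok : split(phrase)) {
                node = node.next.computeIfAbsent(key(tok), k -> new Node());
            }
            node.terminal = true;
        }
    }

    // Assumption: matching is case-insensitive.
    private static String key(String tok) {
        return tok.toLowerCase(Locale.ROOT);
    }

    // Splits text at every boundary reported by the word BreakIterator,
    // keeping whitespace and punctuation as their own tokens.
    private static List<String> split(String text) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.ENGLISH);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            out.add(text.substring(start, end));
        }
        return out;
    }

    // At each position, walks the trie as far as possible and emits the longest
    // phrase that ends on a terminal node; otherwise emits the single base token.
    List<String> tokenize(String text) {
        List<String> base = split(text);
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < base.size()) {
            Node node = root;
            int best = -1; // exclusive end index of the longest phrase starting at i
            for (int j = i; j < base.size(); j++) {
                node = node.next.get(key(base.get(j)));
                if (node == null) {
                    break;
                }
                if (node.terminal) {
                    best = j + 1;
                }
            }
            if (best > i) {
                out.add(String.join("", base.subList(i, best)));
                i = best;
            } else {
                out.add(base.get(i));
                i++;
            }
        }
        return out;
    }
}
```

With the single phrase "coffee table" loaded, new PhraseTokenizer(List.of("coffee table")).tokenize("I like coffee table!") should give the six tokens from the example above. The trie keeps the work at each position proportional to the length of the longest candidate phrase, not to the size of the dictionary.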
9 freelancers are bidding on average $69 for this job
The tokenizer shouldn't be a problem. I am assuming you are looking at some sort of translator program that breaks the text up into words and then returns a Chinese equivalent for each.
Will code for you and explain the code to you over the phone if you reside in the USA or Canada, or over Skype anywhere in the world. Let me know if you would like this to be done fast. Thank you.