C++ class to take a string and efficiently determine whether it contains profanity or personal information.
Critical: it must be your own work. It must not leak memory or crash. It must be well-formatted, well-commented in English, and maintainable.
Needs to have:
* An initialisation function that reads in a text file of bad words - one word (or phrase) per line - and a text file of good words - again, one word (or phrase) per line.
* A validation function that checks whether a string contains any of the bad words and returns 0 or 1. A bad word does not count when it occurs only as a substring of one of the good words, e.g. "ass" may be bad, but "glass" is okay.
* It should recognise character/number substitutions (cr4p)
* It should deal with spaces (c r a p)
* It should deal with non-alphanumeric wildcards (cr*p)
* It should deal with substrings (blahblahcrapblahblah)
* ... and obviously combinations of the above
* A function to log filtered strings that fail validation to a file
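To make the matching rules above concrete, here is a rough sketch of the behaviour I have in mind. It is an illustration under my own assumptions, not a required design: the names (`foldChar`, `normalise`, `matchesAt`, `coveredByGoodWord`, `containsBadWord`) and the substitution table are hypothetical, the word lists are assumed to be stored in lower case without spaces, and the naive loop over every bad word is used only for clarity and would not meet the timing requirement.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical substitution table; the real mapping would need agreeing.
char foldChar(char c) {
    switch (c) {
        case '4': case '@': return 'a';
        case '3': return 'e';
        case '1': return 'i';
        case '0': return 'o';
        case '5': case '$': return 's';
        case '7': return 't';
        default:
            return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
}

// Lower-case, undo substitutions, drop whitespace, and turn every other
// non-alphanumeric character into the wildcard marker '?'.
std::string normalise(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (char raw : in) {
        if (std::isspace(static_cast<unsigned char>(raw))) continue;
        const char c = foldChar(raw);
        out.push_back(std::isalnum(static_cast<unsigned char>(c)) ? c : '?');
    }
    return out;
}

// True if `word` occurs at `pos`, letting '?' in the text stand for any
// character. At least one literal character must match, so a run of
// wildcards on its own is never a hit.
bool matchesAt(const std::string& text, std::size_t pos, const std::string& word) {
    if (word.empty() || pos + word.size() > text.size()) return false;
    bool sawLiteral = false;
    for (std::size_t i = 0; i < word.size(); ++i) {
        const char t = text[pos + i];
        if (t == '?') continue;
        if (t != word[i]) return false;
        sawLiteral = true;
    }
    return sawLiteral;
}

// True if some good word occupies a span that fully covers [pos, pos + len).
bool coveredByGoodWord(const std::string& text, std::size_t pos, std::size_t len,
                       const std::vector<std::string>& good) {
    for (const std::string& g : good) {
        if (g.size() < len) continue;
        const std::size_t slack = g.size() - len;
        const std::size_t first = pos > slack ? pos - slack : 0;
        for (std::size_t s = first; s <= pos; ++s) {
            if (matchesAt(text, s, g)) return true;
        }
    }
    return false;
}

// Naive validation: a bad word counts unless a good word covers it.
bool containsBadWord(const std::string& input,
                     const std::vector<std::string>& bad,
                     const std::vector<std::string>& good) {
    const std::string text = normalise(input);
    for (std::size_t pos = 0; pos < text.size(); ++pos) {
        for (const std::string& b : bad) {
            if (matchesAt(text, pos, b) &&
                !coveredByGoodWord(text, pos, b.size(), good)) {
                return true;
            }
        }
    }
    return false;
}
```

With bad words "crap" and "ass" and the good word "glass", this sketch flags "cr4p", "c r a p", "cr*p" and "blahblahcrapblahblah" but lets "glass" through. Requiring at least one literal character is my own choice to stop strings such as "***" from matching every word; the final rule is open to discussion.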
It must never crash or leak memory, irrespective of the data that it is given.
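As an example of the robustness I expect around file handling, the following is a minimal sketch that relies on standard stream objects (which close themselves when they go out of scope) and reports failure through a return value instead of letting an exception escape. The function names `loadWordList` and `logRejected` are hypothetical, and skipping blank lines is my assumption.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical loader: appends one entry per non-empty line to `out`.
// Returns false if the file cannot be opened or memory runs out.
bool loadWordList(const std::string& path, std::vector<std::string>& out) noexcept {
    try {
        std::ifstream in(path);
        if (!in) return false;
        std::string line;
        while (std::getline(in, line)) {
            // Strip a trailing '\r' so files with Windows line endings load cleanly.
            if (!line.empty() && line.back() == '\r') line.pop_back();
            if (!line.empty()) out.push_back(line);
        }
        return true;
    } catch (...) {
        return false;
    }
}

// Hypothetical logger: appends one rejected string per line to the log file.
// Returns false if the file cannot be opened or the write fails.
bool logRejected(const std::string& logPath, const std::string& text) noexcept {
    try {
        std::ofstream log(logPath, std::ios::app);
        if (!log) return false;
        log << text << '\n';
        log.flush();
        return static_cast<bool>(log);
    } catch (...) {
        return false;
    }
}
```

No raw `new` or `delete` appears anywhere, so there is nothing to leak, and a missing or unreadable file simply produces `false`.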
Execution time is critical - more important than memory use.
Execution time must not increase proportionally with the number of bad words, so I propose using something like a B-tree to store and process the words.
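One structure with this property is a trie (prefix tree), which I mention only as an example of the kind of method I would accept. In the sketch below, the class name `WordTrie` and its members are hypothetical; it handles plain substring search only, without the substitution, wildcard, or good-word rules. The scan cost is bounded by the length of the input times the length of the longest stored word, regardless of how many words are stored.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical trie. All nodes live in one std::vector, so there is no
// manual new/delete and nothing to leak.
class WordTrie {
public:
    WordTrie() : nodes_(1) {}  // node 0 is the root

    // Add one word; empty words are ignored.
    void insert(const std::string& word) {
        if (word.empty()) return;
        std::size_t cur = 0;
        for (char ch : word) {
            const unsigned char c = static_cast<unsigned char>(ch);
            if (nodes_[cur].next[c] < 0) {
                nodes_[cur].next[c] = static_cast<int>(nodes_.size());
                nodes_.emplace_back();
            }
            cur = static_cast<std::size_t>(nodes_[cur].next[c]);
        }
        nodes_[cur].terminal = true;
    }

    // True if any stored word occurs anywhere inside `text`.
    bool containsAny(const std::string& text) const {
        for (std::size_t start = 0; start < text.size(); ++start) {
            std::size_t cur = 0;
            for (std::size_t i = start; i < text.size(); ++i) {
                const unsigned char c = static_cast<unsigned char>(text[i]);
                const int nxt = nodes_[cur].next[c];
                if (nxt < 0) break;  // no stored word continues this way
                cur = static_cast<std::size_t>(nxt);
                if (nodes_[cur].terminal) return true;
            }
        }
        return false;
    }

private:
    struct Node {
        std::array<int, 256> next;  // one slot per byte value, -1 = no child
        bool terminal = false;      // true if a stored word ends here
        Node() { next.fill(-1); }
    };
    std::vector<Node> nodes_;
};
```

The 256-entry child table trades memory for speed, in line with the priority stated above. An Aho-Corasick automaton built on the same trie would reduce the scan to a single pass over the input, at the cost of a more involved build step.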
I will want to know in advance which method will be used.