Search engine database query optimization.
The search engine has mainly followin tables:
page[link_id_uniq, site_id, content]
words#[word_id, link_id, weight ] where #=(0..32), there are words0 .. words32. the index tbl name is a simple hasshing as SUM(word_each_char) % 32
Need 6 sql statements with, limit, count_per_page, order by weight with/without distinct by site_id, to retrieve [url removed, login to view] by
a)1 word search (distinct and without distinct by site_id)
b)any words (distinct and without distinct by site_id)
c)all the words (distinct and without distinct by site_id)
The flow would be:
a) 'word' from dict get word_id ->from words[hash(word)] get link_ids ordered descending by weight, then show the links, limit, grouped, count. If word is not in
dict, use the suondex to pull closes soundex word
b) get the ids from dict, if a word from the provided set is not there use soundex. pull out link ids ordered by words weights, number of matches (how many of the words are in doc, descending by poderate weight of off the provided words)
c) as above but all the words should be in the link
The sphyder search php engine has similar database. I can provide sample database with sample data.