In Progress

488009 Screen scraping exercise

Given a list of up to several hundred thousand Chinese terms, write a C# program that fetches the search result count and the first 10 headlines (plain-text headline, newspaper link, article link and date) for each term from [url removed, login to view] and stores them in a SQLite or SQL Server database. Write out all SQL inserts explicitly. Headlines should be exact matches. An API for Baidu may exist; if so, use it. Searching RSS feeds may be more straightforward; if so, do that. Use UTF-8 encoding. The expected tables are described below.

Terms:
  termid
  term (text, unique)

Search Results:
  searchid
  termid
  count (integer)
  searched (datetime)

Headlines:
  headlineid
  headline (text, unique) - make sure this is long enough

TermHeadlines:
  termid
  headlineid
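
The table list above could be written out roughly as follows. This is only a sketch in the SQLite dialect, not the employer's own schema: the table name SearchResults (without the space), the primary and foreign keys, and the three extra Headlines columns (newspaper_link, article_link, published) are assumptions added to hold the fields the description asks for. The inserted values are made-up placeholders that show what an explicit insert might look like.

```sql
-- Sketch only (SQLite dialect). Keys and the columns marked "assumed"
-- are not part of the listing.
PRAGMA encoding = 'UTF-8';   -- takes effect only before the first table is created
PRAGMA foreign_keys = ON;

CREATE TABLE IF NOT EXISTS Terms (
    termid INTEGER PRIMARY KEY,
    term   TEXT NOT NULL UNIQUE
);

CREATE TABLE IF NOT EXISTS SearchResults (
    searchid INTEGER PRIMARY KEY,
    termid   INTEGER NOT NULL REFERENCES Terms(termid),
    "count"  INTEGER NOT NULL,
    searched TEXT    NOT NULL          -- ISO 8601 datetime string
);

CREATE TABLE IF NOT EXISTS Headlines (
    headlineid     INTEGER PRIMARY KEY,
    headline       TEXT NOT NULL UNIQUE, -- SQLite TEXT has no fixed length
    newspaper_link TEXT,                 -- assumed column
    article_link   TEXT,                 -- assumed column
    published      TEXT                  -- assumed column, ISO 8601 date
);

CREATE TABLE IF NOT EXISTS TermHeadlines (
    termid     INTEGER NOT NULL REFERENCES Terms(termid),
    headlineid INTEGER NOT NULL REFERENCES Headlines(headlineid),
    PRIMARY KEY (termid, headlineid)
);

-- Explicit inserts with placeholder values (not real search data).
INSERT OR IGNORE INTO Terms (term) VALUES ('北京');

INSERT INTO SearchResults (termid, "count", searched)
SELECT termid, 12345, datetime('now') FROM Terms WHERE term = '北京';

INSERT OR IGNORE INTO Headlines (headline) VALUES ('北京示例标题');

INSERT OR IGNORE INTO TermHeadlines (termid, headlineid)
SELECT t.termid, h.headlineid
FROM Terms AS t, Headlines AS h
WHERE t.term = '北京' AND h.headline = '北京示例标题';
```

INSERT OR IGNORE relies on the unique constraints to skip terms and headlines that are already stored, so a rerun over the same word list should not create duplicates. On SQL Server the same idea would need different syntax.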

Given a term, I should be able to get the hit count and the latest 10 sample headlines. The term MUST be present in the headline.
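
One way to express that lookup, as a sketch in the SQLite dialect: it assumes the tables are named Terms, SearchResults, Headlines and TermHeadlines with the columns listed above, and that headlineid values grow in insertion order so they can stand in for "latest". The term '北京' is only a placeholder.

```sql
-- Most recent hit count recorded for the term.
SELECT sr."count", sr.searched
FROM SearchResults AS sr
JOIN Terms AS t ON t.termid = sr.termid
WHERE t.term = '北京'
ORDER BY sr.searched DESC
LIMIT 1;

-- Up to 10 most recently stored headlines that contain the term.
SELECT h.headline
FROM Headlines AS h
JOIN TermHeadlines AS th ON th.headlineid = h.headlineid
JOIN Terms AS t ON t.termid = th.termid
WHERE t.term = '北京'
  AND instr(h.headline, t.term) > 0   -- the term must appear in the headline
ORDER BY h.headlineid DESC            -- assumption: higher id means stored later
LIMIT 10;
```

If the article date is stored alongside each headline, ordering by that date would match "latest" more faithfully than ordering by id.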

Please successfully complete one run (we will review partial results) and provide the source code. For a large sample word list, you may use the simplified or traditional (or both) column from [url removed, login to view]

Skills: .NET, Anything Goes, SQL, Web Scraping


About the Employer:
(3 reviews) Bellingham, United States

Project ID: #2233920