I represent a company that produces reports for the hospitality industry (restaurants, hotels, etc.). We would like to develop a platform that populates a database with information collected from the web. Say we are focused on Miami: we would like to crawl Twitter and Instagram for posts containing certain hashtags (#miami, #florida) and analyse their content. We would also like to collect data from TripAdvisor (please read the attached PDF file).
This information would then be used to generate some basic insights, such as the most common words used in reviews and tweets, or a time series of comments/tweets per week. A basic dashboard would be fine.
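To make the kind of insight I have in mind more concrete, here is a rough sketch in Python using only the standard library. The function names, the small stopword list, and the sample posts are placeholders I made up for illustration; the real implementation is up to you.

```python
import re
from collections import Counter
from datetime import datetime

# Illustrative stopword list only; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "and", "in", "of", "to", "is", "was", "for", "at", "on"}


def common_words(texts, top_n=10):
    """Return the top_n most frequent non-stopword words across all texts."""
    counts = Counter()
    for text in texts:
        # Lowercase and keep alphabetic tokens, so "#miami" counts as "miami".
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)


def weekly_counts(timestamps):
    """Count items per ISO (year, week), sorted chronologically."""
    counts = Counter()
    for ts in timestamps:
        iso = ts.isocalendar()
        counts[(iso[0], iso[1])] += 1
    return sorted(counts.items())


# Hypothetical sample data standing in for crawled posts.
posts = [
    "Great beach in #miami",
    "The beach hotel was great",
    "Miami food is great",
]
posted_at = [datetime(2024, 1, 1), datetime(2024, 1, 3), datetime(2024, 1, 8)]

top_word = common_words(posts, top_n=1)
per_week = weekly_counts(posted_at)
```

With the sample data, the most frequent word is "great" (three occurrences), and the weekly series has two posts in the first ISO week of 2024 and one in the second. The dashboard would only need to chart results like these.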
Ideally, this information would be stored in a MySQL database. The web crawler would have to be configurable so that it can collect information for different destinations. I am planning to run this on an AWS server (Linux) to collect data continuously, but I am open to suggestions.
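As an illustration of what I mean by "configurable", the sketch below reads a list of destinations and their hashtags from a JSON configuration and tags each post with the destinations it matches. The JSON layout, the `Destination` class, and the function names are assumptions of mine and not a required design; the "Orlando" entry is only an example of adding a second destination.

```python
import json
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Destination:
    name: str
    hashtags: tuple  # lowercase hashtags, including the leading "#"


def load_destinations(config_text):
    """Parse a JSON config string into a list of Destination objects."""
    raw = json.loads(config_text)
    return [
        Destination(d["name"], tuple(h.lower() for h in d["hashtags"]))
        for d in raw["destinations"]
    ]


def extract_hashtags(text):
    """Return the set of lowercase hashtags found in a post."""
    return {tag.lower() for tag in re.findall(r"#\w+", text)}


def destinations_for(text, destinations):
    """Return the names of all destinations whose hashtags appear in the post."""
    tags = extract_hashtags(text)
    return [d.name for d in destinations if tags & set(d.hashtags)]


# Hypothetical configuration; adding a destination should only require editing this.
CONFIG = """
{
  "destinations": [
    {"name": "Miami", "hashtags": ["#miami", "#florida"]},
    {"name": "Orlando", "hashtags": ["#orlando", "#florida"]}
  ]
}
"""

destinations = load_destinations(CONFIG)
matched = destinations_for("Sunset at the beach #Miami", destinations)
```

Here `matched` contains only "Miami", while a post tagged #florida would be assigned to both destinations. Whether the configuration lives in a file or in a MySQL table is something I would leave to your judgement.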
I am also open to your suggestions on how to implement the solution. My budget for this stage is $500 USD, but we can negotiate the scope or even the price.
The front-end I have in mind is a basic dashboard (no design required whatsoever). The focus of this stage is the back-end.