Spider web crawler that feeds from URLs on the web
How the current system works (manual process):
We have a program written in Visual Basic (we have the source code). We load an XLS file of URLs (maximum 50,000 records) and it scrapes them: for each domain it discovers the URLs, crawls them, extracts information with regex patterns, and saves the results to an XLS file.
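For illustration, here is a minimal sketch of that crawl-and-extract step in Python (assuming the requests and beautifulsoup4 libraries); the email regex and the 50-page cap per domain are illustrative assumptions, not requirements of our current program:

```python
# Minimal sketch of the crawl-and-extract step. The regex pattern and the
# per-domain page cap are placeholders for whatever patterns we actually use.
import re
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")  # example pattern only

def crawl_domain(start_url, max_pages=50):
    """Crawl one domain breadth-first and return regex matches per page."""
    domain = urlparse(start_url).netloc
    queue, seen, results = [start_url], set(), {}
    while queue and len(seen) < max_pages:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            html = requests.get(url, timeout=10).text
        except requests.RequestException:
            continue  # skip unreachable pages
        results[url] = EMAIL_RE.findall(html)
        # Follow only links that stay on the same domain
        for link in BeautifulSoup(html, "html.parser").find_all("a", href=True):
            absolute = urljoin(url, link["href"])
            if urlparse(absolute).netloc == domain:
                queue.append(absolute)
    return results
```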
We would like to replace this with a better system.
We would like to upload URLs and have this spider run continuously in the cloud, with no need to keep a computer or a virtual machine running. It could be Amazon Web Services, or MySQL with a script; we don't know the best system, so we welcome your recommendation. The idea is that we focus only on collecting URLs. I think MySQL would be the best storage for the results; we would also need to run some UPDATE queries from time to time to produce the final results.
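As a sketch of the MySQL side (assuming the mysql-connector-python package; the table and column names urls, results, match_text and status are placeholders, not an existing schema):

```python
# Sketch of storing results in MySQL and running an occasional clean-up
# UPDATE. Connection details and schema are hypothetical placeholders.
import mysql.connector

conn = mysql.connector.connect(
    host="db.example.com", user="crawler", password="...", database="spider"
)
cur = conn.cursor()

# Store one extracted match against its source URL.
cur.execute(
    "INSERT INTO results (url, match_text) VALUES (%s, %s)",
    ("https://example.com/contact", "info@example.com"),
)

# Example of the occasional UPDATE query mentioned above: mark every URL
# that already produced at least one match as done.
cur.execute(
    "UPDATE urls SET status = 'done' "
    "WHERE url IN (SELECT DISTINCT url FROM results)"
)
conn.commit()
```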
We would like everything centralized in a single MySQL database, with the crawler running automatically (without a virtual machine, if possible).
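One possible way to get "always running, without a VM" on AWS, offered as a suggestion rather than the required design, is a scheduled serverless function that takes a batch of pending URLs from MySQL, crawls them, and writes the matches back. A sketch of such a handler, reusing crawl_domain, conn and cur from the sketches above; the batch size of 20 and the schedule are assumptions:

```python
# Hypothetical AWS Lambda handler, triggered on a schedule (for example an
# EventBridge rule every few minutes). It drains a small batch of pending
# URLs from MySQL and stores whatever the crawler extracts.
def handler(event, context):
    cur.execute("SELECT url FROM urls WHERE status = 'pending' LIMIT 20")
    for (url,) in cur.fetchall():
        for page, matches in crawl_domain(url).items():
            for m in matches:
                cur.execute(
                    "INSERT INTO results (url, match_text) VALUES (%s, %s)",
                    (page, m),
                )
        cur.execute("UPDATE urls SET status = 'done' WHERE url = %s", (url,))
    conn.commit()
```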