Cancelled

(Sub)Reddit Image Crawling Script - Python

I am looking for a Python script that parses a configurable number of posts from any of the listing categories (hot, new, rising, controversial, top, gilded, promoted) of a given subreddit. Once it has retrieved the list of posts, it should follow the links in each post and its comments and download the images from the linked sites while ignoring ads. For example, if a post links to an Imgur album, the script should download every image in the album; if a single picture is posted, it should download that picture. The directory that the image files are downloaded to should also be configurable so that the user can store all images in a desired location.
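To make the link-handling requirement concrete, here is a minimal sketch of how a link could be classified before anything is downloaded. It is only an illustration: the function names, the extension list, and the rule that Imgur album paths start with `/a/` are assumptions for this example, not part of the requirements, and no network access is involved.

```python
import os
import posixpath

try:                                   # Python 3
    from urllib.parse import urlparse
except ImportError:                    # Python 2.7
    from urlparse import urlparse

# Assumed list of extensions that count as a "single picture" link.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp"}


def is_direct_image(url):
    """Return True if the URL path ends in a known image extension."""
    path = urlparse(url).path          # the query string is not part of the path
    ext = posixpath.splitext(path)[1].lower()
    return ext in IMAGE_EXTENSIONS


def is_imgur_album(url):
    """Return True for Imgur links whose path starts with /a/ (assumed album form)."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    on_imgur = host == "imgur.com" or host.endswith(".imgur.com")
    return on_imgur and parsed.path.startswith("/a/")


def destination_path(url, dest_dir):
    """Build the local file path for an image URL inside the configured directory."""
    filename = posixpath.basename(urlparse(url).path)
    return os.path.join(dest_dir, filename)
```

A direct image link would go straight to the downloader, while an album link would first need to be expanded into its individual image URLs.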

Acceptance Criteria:

1. The script will utilize a file (or global variables) to allow for simple configuration of the subreddit, number of posts, category to be parsed, and image destination directory path.

2. The script will be modular. For example, different aspects such as the parsing of the subreddit, post/comments, and parsing of the target page will each be different functions.

3. The script must be able to download all images on the target page as well as on any pages linked in comments posted by the post's original author.

4. The script will utilize popular, well-established libraries to ensure reliability and correctness. If you have questions regarding a library, please ask.

5. The script can parse 10 posts and download all associated images in less than 15 seconds on a connection with a sustained download bandwidth of at least 15 MB/s.

6. The script runs on Python 2.7.X.

7. The script runs correctly and downloads all desired images when tested on any number of subreddits.
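As a rough illustration of criteria 1 to 3, the sketch below shows one possible shape for the configuration and for two of the separate functions. Everything in it is an assumption made for the example: the `CONFIG` keys, the function names, and the simplified comment records (dictionaries with `author` and `body` keys). It uses only the standard library and works on in-memory data, so fetching posts and downloading files are left out.

```python
import re

# Criterion 1: simple configuration through a global dictionary (hypothetical keys).
CONFIG = {
    "subreddit": "pics",
    "category": "hot",
    "post_limit": 10,
    "image_dir": "./images",
}

VALID_CATEGORIES = ("hot", "new", "rising", "controversial",
                    "top", "gilded", "promoted")

# Naive URL matcher: stops at whitespace, closing brackets, and quotes.
# Trailing punctuation such as a full stop would still be captured.
URL_PATTERN = re.compile(r"https?://[^\s)\]>\"']+")


def validate_config(config):
    """Reject an unknown category or a non-positive post limit."""
    if config["category"] not in VALID_CATEGORIES:
        raise ValueError("unknown category: %r" % (config["category"],))
    if not isinstance(config["post_limit"], int) or config["post_limit"] < 1:
        raise ValueError("post_limit must be a positive integer")
    return config


def extract_links(text):
    """Return every http(s) URL found in a block of text."""
    return URL_PATTERN.findall(text)


def links_from_author_comments(post_author, comments):
    """Criterion 3: collect links only from comments written by the post's author."""
    links = []
    for comment in comments:
        if comment["author"] == post_author:
            links.extend(extract_links(comment["body"]))
    return links
```

Keeping validation, link extraction, and author filtering in separate functions lets each one be tested without touching the network, which is the kind of modularity criterion 2 asks for.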

Please note that the Acceptance Criteria may change to remove or include additional requirements or may alleviate some constraints of the requirements.

Skills: node.js, PHP, Python, Web Scraping


About the Employer:
(0 reviews) United States

Project ID: #6548913

2 freelancers are bidding on average $149 for this job


A proposal has not yet been provided

Bid: 210
(8 reviews)

A proposal has not yet been provided

Bid: 88
(12 reviews)

Hi, we can do it. I have a lot of experience in software development, Linux, Java, Python, advanced numerical computations, data analysis, crawler development, etc. Add me on Skype, s o l v e r i o, so we can discuss

Bid: 25
(0 reviews)