Cancelled

(Sub)Reddit Image Crawling Script - Python

I am looking for a Python script that parses a configurable number of posts from any of the listing categories (hot, new, rising, controversial, top, gilded, promoted) of a given subreddit. Once it has retrieved the posts, it should follow the links in each post and its comments and download the images from the linked pages while ignoring ads. For example, if a post links to an Imgur album, the script should download every image in the album; if a post links to a single picture, it should download that picture. The directory the images are downloaded to should also be configurable, so that the user can store all images in a desired location.
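The first two steps described above (choosing a listing and collecting the links to follow) could be sketched roughly as follows. This is only an illustration, not part of the requirements: it assumes Reddit's public JSON listing endpoint (`/r/<subreddit>/<category>.json`), which may not cover every category named above, and the function names and the extension list are hypothetical. Imgur albums and comment links would need additional handling that is not shown.

```python
try:                                    # Python 3
    from urllib.parse import urlparse
except ImportError:                     # Python 2.7
    from urlparse import urlparse

# Assumption: only these extensions are treated as direct image links.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")


def listing_url(subreddit, category="hot", limit=10):
    """Build the public JSON listing URL for a subreddit category."""
    return "https://www.reddit.com/r/{0}/{1}.json?limit={2}".format(
        subreddit, category, limit)


def extract_post_links(listing):
    """Return the outbound URL of every link post in a parsed listing."""
    links = []
    for child in listing.get("data", {}).get("children", []):
        post = child.get("data", {})
        # Self (text-only) posts have no external target page to crawl.
        if post.get("url") and not post.get("is_self"):
            links.append(post["url"])
    return links


def is_direct_image(url):
    """True when the URL path ends in a known image file extension."""
    return urlparse(url).path.lower().endswith(IMAGE_EXTENSIONS)
```

`extract_post_links` takes the already-parsed JSON (a dict), so it can be exercised without network access; links for which `is_direct_image` returns False would be handed to a separate page or album parser.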

Acceptance Criteria:

1. The script will utilize a file (or global variables) to allow for simple configuration of the subreddit, number of posts, category to be parsed, and image destination directory path.

2. The script will be modular. For example, different aspects such as the parsing of the subreddit, post/comments, and parsing of the target page will each be different functions.

3. The script must be able to download all images on the target page, as well as on any pages linked in comments made by the post's original author.

4. The script will utilize popular, well-established libraries to ensure reliability and correctness. If you have questions regarding a library, please ask.

5. The script can parse 10 posts and download all associated images in less than 15 seconds on a connection with a sustained download bandwidth of at least 15 MB/s.

6. The script runs on Python 2.7.X.

7. The script runs correctly and downloads all desired images when tested on any number of subreddits.
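To make criteria 1 and 2 more concrete, here is a minimal sketch of what file-based configuration and one small, single-purpose function might look like. The key names, the default values, and the choice of a JSON file are assumptions made for illustration; global variables would satisfy criterion 1 equally well.

```python
import json
import os

try:                                    # Python 3
    from urllib.parse import urlparse
except ImportError:                     # Python 2.7
    from urlparse import urlparse

# Hypothetical defaults; every key can be overridden from a JSON file.
DEFAULT_CONFIG = {
    "subreddit": "pics",
    "category": "hot",
    "post_limit": 10,
    "image_dir": "images",
}


def load_config(path=None):
    """Return the defaults, updated with values from a JSON file if given."""
    config = dict(DEFAULT_CONFIG)
    if path is not None:
        with open(path) as handle:
            config.update(json.load(handle))
    return config


def destination_path(image_url, image_dir):
    """Map an image URL to a file path inside the configured directory."""
    # Use the last path segment as the file name, ignoring any query string.
    name = os.path.basename(urlparse(image_url).path) or "unnamed"
    return os.path.join(image_dir, name)
```

Keeping configuration loading and path handling in their own functions leaves the subreddit, post/comment, and target-page parsers free to be written and tested separately.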

Please note that the Acceptance Criteria may change: requirements may be added or removed, and some constraints may be relaxed.
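Regarding the timing target in criterion 5: image downloads are mostly network-bound, so one possible approach is to run them on a small pool of threads. The sketch below uses `multiprocessing.pool.ThreadPool` from the standard library, which is available on Python 2.7; `download_all`, the worker count, and the `fetch` callable are hypothetical names, and nothing here guarantees the 15-second target.

```python
from multiprocessing.pool import ThreadPool


def download_all(urls, fetch, workers=8):
    """Apply fetch(url) to every URL on a pool of worker threads.

    Returns a list of (url, result) pairs in the same order as urls.
    """
    urls = list(urls)
    if not urls:
        return []
    pool = ThreadPool(min(workers, len(urls)))
    try:
        # map() blocks until every call has finished and preserves order.
        results = pool.map(fetch, urls)
    finally:
        pool.close()
        pool.join()
    return list(zip(urls, results))
```

Here `fetch` is whatever function downloads one URL and saves it to disk; passing it in as a parameter keeps the concurrency logic separate from the download logic and lets it be tested with a stand-in function.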

Skills: node.js, PHP, Python, Web Scraping

About the Employer:
(0 reviews) United States

Project ID: #6548913

2 freelancers are bidding an average of $149 for this job

shankarmorwal

A proposal has not yet been provided

$210 USD in 10 days
(8 reviews)
3.4
alexgospodarets

A proposal has not yet been provided

$88 USD in 3 days
(12 reviews)
2.9
fTwlYPJYUsyE

Hi, we can do it. I have a lot of experience in software development, Linux, Java, Python, advanced numerical computations, data analysis, crawler development, etc. Add me on Skype, s o l v e r i o, so we can discuss...

$25 USD in 30 days
(0 reviews)
0.0