web data extraction/web scraping

$10-30 USD

Closed
Posted: about 10 years ago

Paid on delivery
Hi, I need to extract some information from a few different websites and put it into a 'nice', easy-to-read format for myself. It's a relatively easy job, and provided it is done well, I'll have a number of additional sites for you to scrape.

I have some sample code that I can send you from a previously completed project, which shows a sample of the data extraction. It works for the first two sites I will be asking you to do; however, I am going to be asking for a bit more information to be extracted as well.

Criteria:
----------
1. Please only bid if you are American, Canadian, European or Filipino. All other bids will most likely be ignored (simply a matter of coding/quality issue).
2. It should be done in PHP/MySQL. You may need to use cURL (as some sites will have basic password/login authentication).
3. Preferably a good English speaking/reading/writing level.
4. If you have some code samples, that would help. (I am looking for someone who implements 'good' coding standards.)
5. This bid is for two scrapes/websites (the ones in my existing code). However, should you do a good job, I'll be extending this to another 10-15 sites. Some sites will have data scraped from 'WordPress' type websites, while others will simply be 'directory' style websites.

I'm estimating probably 3-5 hours to complete this, especially because I am sending you some sample code and it mainly requires tweaking, making it look good, etc.

Other technical details:
Ideally someone will have experience in the following. (I had a previous fellow working on it, but he cancelled due to other commitments.)
--------------------------------------
- PHP version 5.4 or newer
- Framework: Yii
- Scraping library: Goutte
- Database: MySQL
--------------------------------------
I have existing code that you can work off of if you wish.

Actual project:
------------------
1. Please see the attached MS Word document for "complete" details, but basically you will be scraping data from websites via PHP. I'll start off with one site, and provided you do a good job, this will most likely be a job of about 10-15 sites, and maybe more.
2. You'll go to the webpage, download all applicable pages, and scrape the data. You will then 'reformat' this data and insert it into a MySQL table. While it is extracting, it would be nice to have some kind of counter (e.g., processing page 1/50), and the script must not time out (some scripts may take, say, 5-10 minutes to process).
3. I'd like a separate link included (PHP) that simply does a 'database' dump in HTML format.
4. In the future there will most likely be a 'maintenance' job, arranged as a separate project from this one: probably 1-2x per month I'd want you to go through the code to ensure everything is working a-ok.
5. Bonus: if you know how to use online PDF-to-text pages (and/or can do that via cURL, etc.), that is a plus. I'll have a separate project for you for that.

Thanks!
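To make items 2 and 3 under "Actual project" a bit more concrete, here is a rough sketch of the flow: fetch each page, pull out a field, store it, print a "processing page i/N" counter, and dump the table as HTML. It is written in Python purely for illustration; the job itself is PHP/MySQL. Everything named here is an assumption and not taken from the existing code: the `scrape`, `dump_html` and `TitleParser` names, the `items` table, the choice of `<h2>` text as the field to extract, SQLite standing in for MySQL, and the `fetch` callable standing in for the real cURL/Goutte download.

```python
import sqlite3
from html import escape
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text of every <h2> element (hypothetical target field)."""

    def __init__(self):
        super().__init__()
        self.in_h2 = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_h2 = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_h2 = False

    def handle_data(self, data):
        if self.in_h2:
            self.titles[-1] += data


def scrape(urls, fetch, db, report=print):
    """Fetch each URL, extract <h2> text, store it, and report progress."""
    db.execute("CREATE TABLE IF NOT EXISTS items (url TEXT, title TEXT)")
    for i, url in enumerate(urls, start=1):
        report("processing page %d/%d" % (i, len(urls)))
        parser = TitleParser()
        parser.feed(fetch(url))
        rows = [(url, t.strip()) for t in parser.titles if t.strip()]
        db.executemany("INSERT INTO items VALUES (?, ?)", rows)
        db.commit()  # commit after every page so finished pages are kept


def dump_html(db):
    """Return the whole table as a plain HTML table (the 'database dump')."""
    rows = db.execute("SELECT url, title FROM items ORDER BY rowid").fetchall()
    body = "".join(
        "<tr><td>%s</td><td>%s</td></tr>" % (escape(u), escape(t))
        for u, t in rows
    )
    return "<table><tr><th>url</th><th>title</th></tr>%s</table>" % body


if __name__ == "__main__":
    # Canned pages stand in for real downloads so the sketch runs offline.
    pages = {
        "page-1": "<h2>First entry</h2><p>text</p>",
        "page-2": "<h2>Second entry</h2>",
    }
    db = sqlite3.connect(":memory:")
    scrape(list(pages), pages.get, db)
    print(dump_html(db))
```

Committing after every page means a long run that stops partway still keeps the pages already processed. In the PHP version, the time-out concern from item 2 would usually also involve raising the script's execution limit, for example with `set_time_limit`.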
Project ID: 5712392

About the project

3 bids
Remote project
Last activity: 10 years ago

About the client

Dhaka, Bangladesh
0.0
0
Payment method verified
Member since Aug 16, 2013

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)