Python script for crawling API stops for some reason - make suggestions for improvement

  • Status: Closed
  • Prize: $20
  • Entries Received: 4
  • Winner: marioada

Contest Brief

Dear all,

We use the script below to make requests through the crawling provider [login to view URL] (the documentation is available after creating a free account: [login to view URL]).

The script works well in general, but one problem remains: it simply stops from time to time, sometimes after successfully crawling a couple of hundred URLs, sometimes only after a couple of thousand. We have not been able to get it stable enough to crawl a few tens of thousands of URLs.

Please make your suggestions directly in the code, with a comment describing why you made each change. We will then test it and award the prize if the change brings the desired result.

Looking forward to your contributions!
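
The script itself is not reproduced on this page, so the cause of the stalls cannot be read off the code. One common reason a long-running crawl stops silently is a request made without a timeout, or an uncaught network exception that ends the loop. The following is a minimal sketch under those assumptions, using only the Python standard library; `fetch`, `crawl`, and all parameter names are hypothetical and are not taken from the original script or from the provider's API.

```python
import http.client
import time
import urllib.request


def fetch(url, timeout=30):
    """Fetch one URL. The explicit timeout keeps a stalled
    connection from blocking the whole crawl indefinitely."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def crawl(urls, fetch_fn=fetch, max_attempts=3, backoff=1.0, sleep=time.sleep):
    """Crawl every URL in order.

    Network errors are retried with exponential backoff. A URL that
    still fails after max_attempts is recorded in `failures` and the
    loop moves on, so one bad URL cannot stop the run.
    """
    results, failures = {}, {}
    for url in urls:
        for attempt in range(1, max_attempts + 1):
            try:
                results[url] = fetch_fn(url)
                break
            # OSError covers URLError, HTTPError, timeouts and connection
            # resets; HTTPException covers truncated or malformed responses.
            except (OSError, http.client.HTTPException) as exc:
                if attempt == max_attempts:
                    failures[url] = repr(exc)
                else:
                    sleep(backoff * 2 ** (attempt - 1))
    return results, failures
```

Returning `failures` separately makes it possible to log or re-queue the URLs that never succeeded instead of losing them when the run ends.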


Employer Feedback

“Mario is a great guy and a pleasure to work with!”

thomasjohn6, Germany.


Public Clarification Board

  • imo581
    • 4 weeks ago

    I tried your script with some links. The API responds with status code 403 Forbidden. When I called the API from a browser, it returned this message: "Token is invalid or account is temporarily blocked! please login to your dashboard for more details". Is something wrong with your subscription?

    1. thomasjohn6
      Contest Holder
      • 3 weeks ago

      Hello Islam, thanks for your interest in the contest! For somewhat obvious reasons, I removed the real token from the script before posting it in public :-)
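
The 403 in this exchange came from the removed token, but the same distinction matters during a long run: an authentication error will not fix itself, while a rate limit or server error often will. The sketch below shows one way to separate the two. It assumes the provider signals these conditions through standard HTTP status codes, which this page confirms only for 403; `classify_status` and its return labels are hypothetical.

```python
def classify_status(status_code):
    """Map an HTTP status code to a crawl decision.

    "ok"    -> keep the response
    "auth"  -> stop the run; the token or account needs attention
    "retry" -> transient; try again after a pause
    "skip"  -> record the URL as failed and move on
    """
    if 200 <= status_code < 300:
        return "ok"
    if status_code in (401, 403):
        return "auth"
    if status_code == 429 or 500 <= status_code < 600:
        return "retry"
    return "skip"
```

Stopping on "auth" rather than retrying avoids working through thousands of URLs with a token that the API has already rejected.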

  • busygayan
    • 4 weeks ago

    It literally makes no sense for you to pay a third-party service, and their prices are pretty high.

    Why don't you build your own small system to get this done?
    It's nothing complicated.

    1. busygayan
      • 4 weeks ago

      So 40 bucks, plus you need a server that can handle 50K plain requests per hour? So, to answer the question:

      Proxy crawl cost: 2,500 USD (basic, not JavaScript)
      Custom approach cost: less than 400 USD (with a 64 GB / 16 vCPU server)

      JavaScript-based crawl on Proxy crawl: $5,054.90
      Custom approach cost: less than 1,000 USD (192 GB of RAM, 32 vCPU server)

      Besides all that, the code is custom, it's transparent, and debugging is much easier.
      Your data is private.

    2. busygayan
      • 4 weeks ago

      I have a bot that crawls Facebook daily with over 1,000 concurrent accounts. It is custom-coded using Selenium with Python, and I make over 100 requests each second (each request has its own unique IP / proxy). Still, I spend only around 2,000 a month.

      This makes no sense, and the customer is technically being ripped off, paying almost 5x the amount. The customer is still stuck debugging his own code; I'm not even going to go into why the code fails. You could pay a couple of engineers a salary and have your own servers maintained with zero issues for the amount that you spend on this company. Even if you're doing this on a small scale, it makes no sense.

      High cohesion is not bad at all; that's basically my point.
      Good luck
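
For reference, the "almost 5x" figure can be checked against the commenter's own numbers. The short calculation below uses only the four amounts quoted in the thread; they are the commenter's estimates, not verified prices, and because the custom costs were given as "less than" values, the resulting ratios are lower bounds. The names `quotes` and `cost_ratio` are illustrative.

```python
# (provider cost, custom cost upper bound) in USD, as quoted in the thread
quotes = {
    "plain": (2500.00, 400.00),
    "javascript": (5054.90, 1000.00),
}


def cost_ratio(provider_cost, custom_cost):
    """How many times more the provider costs than the custom setup."""
    return provider_cost / custom_cost


ratios = {
    name: round(cost_ratio(provider, custom), 2)
    for name, (provider, custom) in quotes.items()
}
print(ratios)
```

On these figures the provider comes out at roughly 6x for plain requests and roughly 5x for JavaScript rendering.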

  • thomasjohn6
    Contest Holder
    • 4 weeks ago

    Thanks for your comment! For now, however, we would like to use the convenience of such a provider. Maybe we will do it on our own later. So do you have any idea what the problem in the script could be? Thanks in advance!
