Scrapy Jobs
Freelance Software Developer Wanted Work Type: Remote, Project-Based Budget: To Be Agreed Required Qualifications: Advanced knowledge of Python or JavaScript Experienced with web crawling and automation tools (Selenium, Scrapy, BeautifulSoup, etc.) Proficient with cybersecurity tools (OWASP ZAP, Burp Suite, Nikto, etc.) Knowledgeable in vulnerability detection and web application security Able to build APIs and email integrations Basic knowledge of AI-based anomaly detection (a plus) Good written communication skills Team-oriented and mindful of confidentiality Preferred: Prior experience on similar projects...
...writing of a bot Processing the scraped data and integrating it into our system Designing the website to serve in 10 languages Publishing hotel prices automatically reduced by 10% A user-friendly, mobile-responsive interface An SEO-friendly, fast, and reliable infrastructure An automated system that keeps hotel information regularly updated Technology Preferences: Backend: Python (web scraping libraries such as Scrapy, Selenium, BeautifulSoup), Node.js, or PHP Frontend: React.js, Vue.js, or Angular Database: PostgreSQL, MySQL, or MongoDB Language support: gettext, i18n, or a similar internationalization method for the multilingual setup Hosting & Deployment: AWS, Google Cloud, or Digital...
...receive: Images from Singapore card vendors showing buyback prices Use AI tools to: Identify the card from the image Search eBay: For recent SOLD listings of the same PSA card Extract: Actual accepted Best Offer prices Calculate: Total landed cost in SGD Compare: Vendor Buyback Price vs Total Cost Flag: Arbitrage opportunity if profit ≥ 15% You’re free to use Python (BeautifulSoup, Scrapy, Selenium, or the official eBay API) or any stack you prefer, so long as the process runs unattended on my VPS and avoids eBay’s anti-bot triggers. Deliverables: 1. Well-documented source code with setup instructions 2. Automated scheduler (cron, Windows Task, or similar) set to run daily 3. An Excel file generated on each run, overwriting or appending as we...
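For illustration, a minimal sketch of the profit check this posting describes; the fee rate, exchange rate, and figures below are placeholder assumptions, not values from the brief:

```python
# Sketch of the buyback-vs-landed-cost comparison described above.
# All numbers are hypothetical; real inputs would come from the
# image-identification and eBay sold-listing steps.

def flag_arbitrage(vendor_buyback_sgd: float,
                   ebay_sold_price_usd: float,
                   usd_to_sgd: float,
                   shipping_sgd: float,
                   fee_rate: float = 0.10) -> dict:
    """Compare the vendor buyback price against total landed cost in SGD."""
    # Landed cost: converted sale price plus assumed fees and shipping
    landed_cost = ebay_sold_price_usd * usd_to_sgd * (1 + fee_rate) + shipping_sgd
    profit = vendor_buyback_sgd - landed_cost
    margin = profit / landed_cost if landed_cost else 0.0
    return {
        "landed_cost_sgd": round(landed_cost, 2),
        "profit_sgd": round(profit, 2),
        # "profit >= 15%" is interpreted here as margin over landed cost
        "opportunity": margin >= 0.15,
    }

print(flag_arbitrage(vendor_buyback_sgd=320.0, ebay_sold_price_usd=180.0,
                     usd_to_sgd=1.35, shipping_sgd=25.0))
```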
...system (Python 3.11+, aiohttp/httpx/Scrapy/Playwright). • Build deduplication & change-detection logic using hash comparison and timestamps. • Design and connect central database (PostgreSQL + SQLite) to store unique company records. • Integrate proxy rotation and throttling (BrightData/Luminati or similar). • Implement data normalization using ftfy, unidecode, python-phonenumbers, regex, and pandas. • Crawl Impressum pages to auto-fill missing fields (phone, fax, website). • Automate daily/weekly export to Excel / CSV using openpyxl. • Add basic monitoring dashboard (Streamlit) showing live progress, proxy health, and logs. • Deliver well-structured, documented, production-ready code. ⸻ Required Skills • Expert in Pytho...
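The hash-comparison deduplication and change detection named in this brief could start from something like the sketch below; the record fields and in-memory store are illustrative assumptions (the real system would persist to PostgreSQL/SQLite):

```python
# Hash-based dedup / change detection sketch: identical records hash to
# the same fingerprint; any field change produces a new fingerprint.
import hashlib
import json
import time

seen: dict[str, dict] = {}  # fingerprint -> first_seen / last_seen timestamps

def fingerprint(record: dict) -> str:
    """Stable SHA-256 over a canonical JSON form of the record."""
    canonical = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def upsert(record: dict) -> str:
    fp = fingerprint(record)
    now = time.time()
    if fp in seen:
        seen[fp]["last_seen"] = now   # unchanged duplicate: just touch timestamp
        return "duplicate"
    seen[fp] = {"first_seen": now, "last_seen": now}
    return "new_or_changed"           # would trigger an insert/update in the DB

print(upsert({"name": "Example GmbH", "phone": "+49 30 1234567"}))  # new_or_changed
print(upsert({"name": "Example GmbH", "phone": "+49 30 1234567"}))  # duplicate
```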
...harvested content in a clean CSV or Excel file with clear column headings; if you prefer a database export, let me know and we can adjust. • Include the finished script or notebook so I can rerun the extraction later. Accuracy and formatting matter more to me than sheer speed, so please allow time for basic validation before handing over the files. If you normally work with Python (BeautifulSoup, Scrapy, Selenium) or similar tooling, that’s perfect, but I’m open to alternative stacks as long as the output meets the same standard. When you reply, briefly outline: 1. The scraping approach and libraries you’d use 2. Any anti-blocking measures you apply for public sites 3. A realistic timeframe to capture, clean, and hand back the data I’m ready ...
...data pulled from publicly available sites. The focus is simple yet crucial: for every company you find, capture the homepage URL and a working email address. (Ask for details in the sheet.) A completely ethical approach is non-negotiable—no gated content, no third-party lists, and no automated harvesting that violates site terms. I’m happy for you to use tools you’re comfortable with (Python, Scrapy, BeautifulSoup, Selenium, Google Apps Script, etc.) as long as you respect robots.txt and rate limits. Email addresses must appear in plain text within the sheet; please avoid hyperlinks or HTML encoding. Deliverables • A Google Sheet populated with data • A short note on your collection method (manual, scripted, hybrid) so I can replicate or update the data ...
...boats. To achieve this, I need an automated workflow relying exclusively on Google Maps (the chosen source) that can collect, deduplicate, and clean the data before formatting it into a CSV directly importable into the Wix CMS. Coordinates must be supplied in decimal degrees. Expected deliverables • A reusable script (Python + Selenium, Scrapy, or equivalent) that queries Google Maps, handles rate limiting, and documents each processing step. • The final CSV file containing, for each boating centre, the following fields: Name, Full address, City, Department, Region, Latitude,...
Project Title Custom Lead Generation & Email Scraper Tool (Google, Yellow Pages, & B2B Directories) Project Description I am looking for an experienced developer to build a robust, high-perfo...Address (Must have a validation check to avoid "dead" emails) Phone Number (Optional but preferred) LinkedIn Profile URL (Optional but preferred) Export Functionality: Capability to export data into CSV or Excel format. Anti-Blocking Measures: Use of rotating proxies or delays to ensure the scraper isn't blocked by Google or directories. Technical Requirements: Preference for Python (Selenium, Scrapy, or BeautifulSoup) or a dedicated desktop application. User-friendly interface (even a simple CLI is fine, but a GUI is a plus). Fast processing speed with the abil...
...downward with every run; newest rows appear at the bottom. Each run should happen on a schedule I can adjust (cron, task scheduler, or similar). • Resilience: graceful handling of captchas or temporary blocks (rotating proxies or headless browsing accepted), clear logging of any skipped items, and an alert if a site layout changes. • Maintainable code: well-commented Python (BeautifulSoup, Scrapy, or Playwright are all fine) or an equivalent language you prefer, plus a short README explaining setup and how to add new sites later. Once delivered I will validate that: 1. Data from all ten sites lands in the workbook with a proper timestamp. 2. Prices from consecutive runs appear on separate rows, preserving history. 3. The script can be launched by a single com...
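A minimal sketch of the append-only, timestamped workbook behaviour requested above, using openpyxl; the file name and columns are assumptions:

```python
# Appends one timestamped row per (site, price) pair; newest rows always
# land at the bottom, preserving price history across runs.
from datetime import datetime
from pathlib import Path
from openpyxl import Workbook, load_workbook

XLSX = Path("prices.xlsx")  # placeholder output path

def append_rows(rows: list[tuple[str, float]]) -> None:
    if XLSX.exists():
        wb = load_workbook(XLSX)
        ws = wb.active
    else:
        wb = Workbook()
        ws = wb.active
        ws.append(["timestamp", "site", "price"])  # header on first run only
    stamp = datetime.now().isoformat(timespec="seconds")
    for site, price in rows:
        ws.append([stamp, site, price])  # append() always writes below existing rows
    wb.save(XLSX)

append_rows([("site-a.example", 19.99), ("site-b.example", 21.50)])
```

A cron entry or Task Scheduler job pointing at a script like this would then cover the adjustable schedule.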
...and scrape the relevant pages in real time or on a frequent schedule. • Apply NLP or other classification techniques to decide whether a posting is truly AI-related, then tag it by sub-domain (e.g. vision, NLP, MLOps, prompt-engineering). • Deliver concise, deduplicated listings to me through an in-app notification feed—no email or SMS required. For the deployment side I’m open to Python (Scrapy, BeautifulSoup, Selenium), Node, or any stack you are comfortable with so long as it is containerised and can run unattended on a small cloud instance. A lightweight web interface or Electron desktop app for the notification feed is ideal; you can suggest an alternative if it achieves the same user experience. Acceptance criteria 1. Agent successfully scrapes...
...phase—so the job is focused on clean data capture and a flawless import workflow. Descriptions must remain in plain text; no extra HTML markup. Images should arrive attached to the right variation, including separate gallery shots where available, and the colour options need to show as clickable swatches in WooCommerce, not just text labels. I’m comfortable if you use Python (BeautifulSoup, Scrapy) or another scraper, and either the WooCommerce REST API or a CSV/XML tool like WP All Import for the upload, as long as the end result feels native inside my store. Deliverables: • Complete product dataset (titles, plain-text descriptions, all images). • Variations set up so size and colour swatches behave exactly like on Furnx. • Products import...
...and email—nothing more. The final deliverable is a clean, well-structured Excel file ready for me to review. Speed is the priority here: please be able to start right away and turn the file around as fast as possible while still double-checking that every row is accurate and complete. If this timeline works for you and you have solid scraping experience with tools like Python, BeautifulSoup, or Scrapy, let’s move forward now. The budget is small, as this is a simple task, so low-budget bidders get first priority. But start now. Start your bid with "Urgent". Thanks....
I have two source spreadsheets that I need merged and enriched through automated scraping: • “File 1” – 170 k Spanish local businesses with emails • “File 2” – 65 k additional businesses with websites only Phase 1 – Email extraction Using a Python script and well-known libraries (requests, BeautifulSoup, Scrapy or similar), scan every site listed in File 2, capture all working email addresses you can locate, then append them to the corresponding rows so I can produce a unified “File 3”. Phase 2 – Offer harvesting Next, visit each live site in File 3. Where an offer, deal or promotion is publicly displayed, record the details in a fresh Excel sheet with these exact columns: Business ID | Business Name ...
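Phase 1 as described could start from a sketch like this, assuming plain public pages; a production run would add retries, politeness delays, and per-site error logging:

```python
# Pulls every email address visible on a page, including mailto: links.
import re
import requests
from bs4 import BeautifulSoup

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def extract_emails(url: str) -> set[str]:
    resp = requests.get(url, timeout=15,
                        headers={"User-Agent": "Mozilla/5.0 (compatible; email-finder)"})
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    found = set(EMAIL_RE.findall(soup.get_text(" ")))
    # mailto: links often carry addresses that never appear in visible text
    for a in soup.select('a[href^="mailto:"]'):
        found.update(EMAIL_RE.findall(a["href"]))
    return found

print(extract_emails("https://example.com/contact"))  # placeholder URL
```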
I have a public-facing website that I need scraped end-to-end. The site is open (no login), but the content is split across multiple pages, so your script will have to detect and follow pagination automatically. Here is exactly what I expect: • A clean, well-commented Python script (requests/BeautifulSoup, Scrapy, or Selenium—your choice) that visits every page, captures the required fields, and writes them to a neatly structured CSV. • The final CSV containing all rows pulled from the site. • A short README that tells me how to run the script and change the target URL or output path if needed. Code quality matters to me: no hard-coded absolute paths, clear variable names, and graceful error handling so the run doesn’t stop if a single page fa...
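A bare-bones version of the requested pagination-following CSV scraper might look like the sketch below; the CSS selectors and column names are placeholders to adapt to the real site:

```python
# Follows rel="next" links until pagination ends, writing one CSV row
# per item; a failing page is logged rather than crashing the run.
import csv
from urllib.parse import urljoin
import requests
from bs4 import BeautifulSoup

def scrape(start_url: str, out_path: str = "output.csv") -> None:
    url = start_url
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["title", "url"])  # placeholder columns
        while url:
            try:
                page = requests.get(url, timeout=15)
                page.raise_for_status()
            except requests.RequestException as exc:
                print(f"warning, stopping at {url}: {exc}")
                break
            soup = BeautifulSoup(page.text, "html.parser")
            for item in soup.select(".item"):            # placeholder selector
                link = item.select_one("a")
                if link and link.get("href"):
                    writer.writerow([link.get_text(strip=True),
                                     urljoin(url, link["href"])])
            nxt = soup.select_one('a[rel="next"]')        # detect-and-follow pagination
            url = urljoin(url, nxt["href"]) if nxt and nxt.get("href") else None

scrape("https://example.com/listings")  # placeholder target
```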
...just needs to collect every page’s copy accurately and store each page URL, headline, sub-headline, paragraph body, and any inline text in separate columns. Please make the scraper resilient to common roadblocks such as pagination, lazy-loaded sections, and basic anti-bot measures, and keep the code modular so I can rerun it myself if the site layout changes slightly. Python with BeautifulSoup, Scrapy, or Playwright is fine as long as the final CSV is UTF-8 encoded and free of HTML tags. Quantities: - we expect somewhere between 10,000 and 70,000 records - we want to pay in milestones of 5,000 records - the first milestone covers the research work plus the first 5,000 rows; the remaining amount is paid in the following milestones (in case you get blocked or problems arise) Deliverables • Scrap...
Project Description: Find school districts and charter schools who use a specific vendo...Vendor Found"`. - If no website could be loaded, the script should log any failed connections or timeouts. Output Format (CSV) The final deliverable file should be structured with the same columns as the ones provided with the additional column to include your results. Skills Required - Expert proficiency in Python. - Deep experience with web scraping libraries (e.g., Requests, BeautifulSoup, Scrapy, and especially Selenium/Puppeteer for dynamic content). - Experience handling common web scraping challenges (redirects, user-agents, proxy usage (if necessary)). To bid, please confirm your familiarity with scraping dynamic content and provide a brief description of the scraping approach...
...comment text, number of comments, likes, reposts/shares, the post date and any other readily available metadata (author handle, follower count, post URL, media links, etc.). Accuracy is critical because the data will feed a trend-analysis dashboard later. Please build the workflow in a way that respects rate limits and login requirements: if you intend to use official APIs, private APIs, Selenium, Scrapy, Playwright, or headless browsers, spell that out so I know how sustainable the solution will be. The final hand-off should include: • A clean, well-commented reusable script (Python preferred) • A short README explaining environment setup, keyword input format and how to extend to new regions • The full export in CSV so I can validate before sign-off If an...
...reliable associated sources. Specific sources: Euromillones: (since Feb 13, 2004) La Primitiva: (since Oct 17, 1985 – modern version) El Gordo de la Primitiva: (since Oct 31, 1993) Updates run automatically at exactly 00:02 the day after each draw, using ethical scraping (BeautifulSoup/Scrapy) with proper user-agent headers that mimic human behavior. Store data in PostgreSQL (structured) or MongoDB (flexible), including all prize categories to enable ROI calculations and backtesting. 2.2. Number Prediction Generate predictions for Euromillones, La Primitiva and/or El Gordo simultaneously using advanced AI models: Machine Learning ensembles (Random Forests) for frequency/statistical
...scraped, the information should be organised into a clean CSV file—one row per page—with columns for page URL, full body text, image file names, and link destinations. Please download the images themselves as well and bundle them in a separate folder (a simple ZIP is fine); the CSV should reference the exact filenames so everything lines up. I’m happy for you to use Python with BeautifulSoup, Scrapy, Selenium or whichever stack you prefer, as long as the final output meets these acceptance criteria: • Complete CSV containing text, image names, and link URLs for each page • All images successfully downloaded and accessible via the filenames listed in the CSV • No duplicates or missing pages from the target site * Images need to be sort...
I have a data-analysis pipeline that relies on a ...after award). • Payload: high-resolution image files plus a CSV/JSON map linking each file to product ID, title, price, and category text that you extract during the same run. • Scale: thousands of products per crawl; a resumable approach is essential so partial failures don’t force a full restart. • Frequency: I’ll trigger the crawl weekly, so reusable code is a must. I’m happy with Python—Scrapy, Selenium, Playwright, or a headless solution of your choice—as long as it respects the site’s anti-bot measures and keeps requests polite. Please include a brief outline of how you’ll handle pagination, lazy-loaded images, and rate limiting. Let me know your proposed stac...
...Excel workbook. Please crawl the entire site, not just a few sections, and return each number alongside the key profile details that make the data usable at a glance—name, profile URL, and any other easily captured identifiers shown next to the number. A clean .xlsx with one row per profile, no duplicates, and clearly labelled columns is the only deliverable I’m expecting. If you prefer Python, Scrapy, Selenium, Beautiful Soup or a comparable stack, go ahead; I’m interested in results, not the specific toolset, as long as the script can be rerun later should the site content change. Before delivery, double-check that: • every row contains a valid phone number and URL • no pages on the site were skipped • the sheet opens flawlessly in the late...
...issue and validate JWT tokens for every request beyond the public health-check route. Token refresh, revocation, and a simple role model (“user” vs. “admin”) should be built in from the start. Flight data extraction I do not have official Iberia developer access, so we will need to pull the data ourselves. I’m open to whichever tooling you are most comfortable with — BeautifulSoup, Selenium, Scrapy, or a hybrid approach — as long as the final solution is headless, resilient to minor layout changes, and respectful of Iberia’s rate limits. Only flights that are bookable with Avios need to be captured; no hotel or car-rental data is required. Deliverables • Clean, modular Python code (FastAPI or Flask preferred, but I’...
I need a senior-level specialist to harvest product data from several e-commerce sites and deliver it in a single, well-structured CSV file. The task demands production-ready techniques—think Scrapy spiders hardened with rotating proxies, Selenium or Playwright for dynamic content, and solid anti-bot countermeasures. The information I’m after is very specific: product names, prices, pictures, and SKU. Nothing less, nothing more. Your solution must run reliably at scale, cope with frequent layout changes, and leave no trace that could trigger blocks. Python is the preferred stack, but if you have a proven alternative that meets the same bar, I’m open to hearing it. To be considered, include in your proposal: • At least one example of a comparable e-commerce...
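For reference, a skeleton of the kind of hardened Scrapy spider this brief implies; the domain, selectors, and throttling values are assumptions, and rotating proxies would plug in as a downloader middleware (for example the scrapy-rotating-proxies package):

```python
# Minimal Scrapy spider yielding exactly the four requested fields.
import scrapy

class ProductSpider(scrapy.Spider):
    name = "products"
    start_urls = ["https://shop.example.com/catalogue"]  # placeholder
    custom_settings = {
        "AUTOTHROTTLE_ENABLED": True,  # adaptive, polite request rate
        "DOWNLOAD_DELAY": 1.0,
        "RETRY_TIMES": 3,
    }

    def parse(self, response):
        for card in response.css(".product-card"):       # placeholder selector
            yield {
                "name": card.css(".name::text").get(),
                "price": card.css(".price::text").get(),
                "image": card.css("img::attr(src)").get(),
                "sku": card.css("::attr(data-sku)").get(),
            }
        next_page = response.css('a[rel="next"]::attr(href)').get()
        if next_page:
            yield response.follow(next_page, callback=self.parse)
```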
I’m expanding our Florida outreach list and need a reliable web-scraped data set of school, college, and university administrators who oversee Nursing or other Healthc...address • State (always Florida) Format & delivery – Send the file in Excel (.xlsx). – First progress drop: within 5 days so I can spot-check. – Final, fully cleaned file: no later than 10 calendar days from project start. Quality matters because this list feeds straight into our marketing campaigns. I’ll spot-verify a sample for accuracy. Feel free to leverage Python, BeautifulSoup, Scrapy, or similar tooling—whatever lets you move quickly while respecting each site’s robots.txt. Let me know if anything needs clarifying before you begin, otherwise I...
...need a seasoned Python developer to build a robust scraper that collects the required data and writes it straight to JSON—no additional cleaning or processing necessary. Once we begin I’ll provide the target URL(s) and any access details; for now, assume a standard public site with pagination and occasional anti-bot checks. Core expectations • Written in Python 3 using requests/BeautifulSoup or Scrapy; resort to Selenium only if there’s no lighter workaround. • Handles pagination, retries, and polite delays gracefully so the run can complete unattended. • Config file or clear constants for headers, cookies, and start URLs, letting me tweak targets without editing core logic. • Produces a single JSON file (or one file per page if that...
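One plausible shape for that polite, unattended run, with config constants up top so targets can be tweaked without touching core logic; the URLs, selector, and end-of-pagination rule below are assumptions:

```python
# Retries with backoff, a fixed delay between pages, one JSON file out.
import json
import time
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
from bs4 import BeautifulSoup

START_URL = "https://example.com/listings?page={page}"  # placeholder target
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; data-collector)"}
DELAY_SECONDS = 2.0
MAX_PAGES = 50

session = requests.Session()
session.mount("https://", HTTPAdapter(max_retries=Retry(
    total=3, backoff_factor=1.0, status_forcelist=[429, 500, 502, 503])))

records = []
for page in range(1, MAX_PAGES + 1):
    resp = session.get(START_URL.format(page=page), headers=HEADERS, timeout=15)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    items = soup.select(".listing")     # placeholder selector
    if not items:
        break                           # assume an empty page ends pagination
    records.extend({"title": i.get_text(strip=True)} for i in items)
    time.sleep(DELAY_SECONDS)           # polite delay so the run can go unattended

with open("output.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)
```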
...build a reliable, well-structured lead list and I already know exactly what it should contain. The task is to extract contact information—email addresses, phone numbers and full mailing addresses—from three sources: company and organisation websites, their public social-media profiles, and well-known online directories. I expect the data to be gathered with a solid scraping workflow (Python, Scrapy, BeautifulSoup, Selenium or an equivalent stack is fine) and then verified so that bounced emails and dead numbers are kept to an absolute minimum. Deliverables • One CSV or Excel file with separate columns for name, company, job title, email, phone, street address, city, state, ZIP/postcode, country, source URL and date collected. • No duplicates; every...
I need a web scraping specialist to collect...specific information by querying CPF numbers on a website. Required fields: - Full name - Date of birth - Address - Emails - Phone numbers - Vehicle (make/model) - Year of manufacture - Occupation - Salary range - Likely employer Ideal Skills and Experience: - Proven web scraping experience - Proficiency with tools such as Python, Beautiful Soup, Scrapy, or similar - Ability to work with complex data structures - Attention to detail and accuracy in data extraction - Familiarity with the legal and ethical issues of ...
...precise location coordinates directly from Google Maps. The second will crawl a set of websites I will supply and pull out product information, on-page contact details, and any user-generated content that appears alongside those products. Please structure every field into one tidy CSV per source so I can plug the results straight into my BI dashboards. I am comfortable if you lean on Python, Scrapy, BeautifulSoup, Selenium, or similar tools, provided the script is well-commented and can run headless behind rotating proxies without tripping rate limits. Deliverables: • 4 working scripts (Maps + websites) with clear setup instructions • Sample output files proving all requested fields are captured correctly • Output data must have City Name > (Excel fi...
I need clean, structured prod...stay lean and purpose-built. I already have a clear idea of the attributes I want captured (title, price, SKU, description, availability, image URL). Once we agree on the target sites, you can build a scraper, run it, and hand back the CSV along with the script or notebook so I can reproduce the results later if needed. Please let me know: • Which language or framework you plan to use (Python, Scrapy, BeautifulSoup, Selenium, Playwright, etc.). • How you’ll handle pagination, anti-bot measures, and site structure changes. • An estimated turnaround and any milestones you suggest. Accuracy, deduplication, and clarity in the final CSV will be the acceptance criteria. If this sounds like your bread-and-butter, I’m ready...
I need a developer to collect data from multip...repeatable solution (script or small app) that I can run on demand Basic documentation: how to run it, how to adjust settings, where outputs go Quality requirements Reliable scraping with error handling and retries Respectful request rate / throttling to avoid overloading sites Clear logging (success/fail, pages processed) Ability to adapt if page structure changes Experience with Python (Scrapy/BeautifulSoup/Selenium/Playwright) or Node.js Proxy / rotating user-agents experience (only if needed) Scheduling/automation (cron, Docker, or cloud run) Deliverables Working scraper + instructions Sample output file(s) Final dataset from agreed sources (initial run) To apply, please include Examples of similar scraping work you...
I have an urgent need for a clean, well-structured dataset containing the listing agent’s first name, last name, mailing address, and phone number for well over 500 active Zillow listings. Speed is critical, but accuracy matters just as much; the final file should be ready for immediate import into my CRM. You are free to use whichever stack you prefer—Python with BeautifulSoup or Scrapy, Selenium, residential proxies, even the unofficial Zillow API—so long as rate-limits are respected and the data is complete. I don’t need property details or price history; the focus is strictly on the agent contact fields. Deliverables • CSV or XLSX with a separate column for each required field • A short read-me explaining the script or method so I can reru...
...visible textual content I specify, and returning it in a machine-readable format. I’m flexible on the final file type; CSV, Excel, or JSON all work as long as the fields are clearly labeled and easy for me to manipulate later. A small sample first will help confirm we’re on the same page before you run the full extraction. Please use whatever stack you prefer—Python with BeautifulSoup or Scrapy, JavaScript with Puppeteer, or a tool that suits the task best—just be sure to respect robots.txt and provide the code so I can rerun the process when the site updates. Deliverables: • Re-usable script or notebook with clear comments • Complete dataset containing all extracted text, delivered in my chosen format • Brief read-me explaining setup, ...
...public websites * Parse HTML, JSON, CSV, and PDF files * Clean and normalize messy real-world data * Write clear, maintainable utility scripts * Deliver working code (not just prototypes) --- ### Required Skills * Strong Python fundamentals * Real experience with web scraping * Data parsing and data cleaning * Comfortable working independently and async --- ### Nice to Have * BeautifulSoup, Scrapy, Playwright, or Selenium * pandas / numpy * Experience scraping government or legacy websites * Experience handling PDFs (text extraction, OCR) --- ### How We Evaluate * This role includes a **paid trial task (1–3 days)** * We care about **output and correctness**, not resumes * Clean, working code matters more than clever abstractions --- ### Important * Please includ...
I need a reliable scraping solution that collects every open position from ten job-board and company-career sites in one specific country. I already have the full URL list and will share it right after kickoff. Scope • Write and schedule a separate scr...postings, basic keyword search in the frontend, and an export button for CSV or Excel, but these are optional. Deliverables 1. Source code for all scrapers and the data pipeline. 2. Database schema or JSON structure. 3. Front-end webview ready to run locally. 4. README covering installation, configuration, and update routine. I’m happy to discuss your preferred stack—Python with BeautifulSoup/Scrapy or Node with Cheerio/Puppeteer are both fine—as long as the final result is stable and well documented....
...LinkedIn, Indeed and HelloWork. • Captures, at minimum, the job title, full description, company name and location. • Stores everything in a structured database I can easily query or export. • Retrieves complete CVs from LinkedIn and, when possible, other social platforms, then links each profile to the same database scheme. Feel free to choose the most stable stack you trust—Python with Scrapy or Selenium, Node with Puppeteer, direct GraphQL or REST endpoints, etc.—as long as it runs unattended, copes gracefully with rate limits / captchas, and offers a simple way for me to schedule or trigger updates. Acceptance will be based on: 1. A repeatable script or service I can host (Docker image or cloud function are fine). 2. A concise setup guid...
...following fields: • Job title and full description • Company name plus location (city, state/region, country) • Employment type and any salary or rate information available Your scraper should store results in a clean, normalized CSV (or optionally a relational DB if you prefer) and be easy for me to rerun on demand. I’m comfortable with Python, so a script leveraging requests/BeautifulSoup, Scrapy, or Playwright makes sense, but if another stack delivers better reliability feel free to suggest it. Key expectations • Site recommendations presented first for my approval before you start coding • Respect robots.txt, add configurable request delays, and build basic anti-block measures (user-agent rotation, retries) • Clear documentation ex...
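One simple way to meet those anti-block expectations (user-agent rotation, retries, configurable delays); the UA strings and limits are example values only:

```python
# Rotates user-agents per request and retries with a growing delay.
import random
import time
import requests

USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
    "Mozilla/5.0 (X11; Linux x86_64)",
]

def polite_get(url: str, retries: int = 3, delay: float = 2.0) -> requests.Response:
    last_error: Exception = RuntimeError("no attempts made")
    for attempt in range(retries):
        try:
            resp = requests.get(url, timeout=15,
                                headers={"User-Agent": random.choice(USER_AGENTS)})
            resp.raise_for_status()
            return resp
        except requests.RequestException as exc:
            last_error = exc
            time.sleep(delay * (attempt + 1))  # configurable, growing backoff
    raise last_error
```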
I ...first line of address, state, city, postcode • Format: every column saved as plain text (no numeric or date formatting) Delivery schedule • First 5,000 fully cleaned rows required within the first 6 hours • Remainder on a rolling basis until the full 15,000 are complete I will supply a surname list to guide the searches. A straightforward Python (requests / BeautifulSoup or Selenium) or Scrapy workflow is fine as long as the final output arrives in a single Excel file (.xlsx) that opens error-free in Microsoft Excel. Accuracy matters more than speed—random spot checks will be run. Any duplicates, blanks, or malformed addresses will be sent back for correction. Once the first 5,000 pass review, I’ll green-light the rest of the scrape so we ca...
Description: - We are looking for an experienced Data Scraping / Web Scraping expert. - We will share the industry name, and the...- Suggest suitable websites/sources to scrape - Suggest countries/regions that can be covered - Share estimated data volume & approach - After approval, the freelancer will scrape and deliver clean, structured data. Data Required (example): - Company name - Location - Contact details (email/phone/website – if available) Requirements: - Proven experience in data scraping - Knowledge of Python, Scrapy, Selenium, APIs, etc. - Ability to scrape multi-country data (based on feasibility) Deliverables: - Data in Excel / CSV / Google Sheets - Basic info of sources used To Apply, share: - Similar scraping work - Tools you use - Your approach after ...
I need all publicly available customer-facing email addresses extracted from a list of e-commerce websites that I will supply once the project begins. Please crawl only the domains I provide, respect robots.txt where possible, and avoid triggering any rate limits or security blocks—rotating proxies or headless browsing with tools such as Python, Scrapy, BeautifulSoup, Selenium, or similar is fine as long as the result is reliable. Deliverable • One clean, de-duplicated CSV file containing the harvested email addresses, ready for direct import into my CRM. Acceptance criteria • Every email must originate from the target e-commerce domains. • No duplicates, placeholders, or obviously invalid addresses. • File is encoded as UTF-8 and opens without warnings in Exc...
...associated images—then converts and calculates the raw values exactly as we define before pushing them straight into WooCommerce. My customers must only ever see the WooCommerce front end, so the sync has to feel native and instant. The portal changes frequently, so please code the extractor so that selectors and credentials can be updated without touching the core logic. I am open to Python (Scrapy, BeautifulSoup, Selenium), PHP or Node as long as the finished solution talks cleanly to the WooCommerce REST API and leaves no manual steps. Deliverables • Scraper that logs in and captures product details, stock, prices and images in real time or on a schedule we agree on • Conversion layer that performs the unit/price calculations before data enters WooCommerce ...
...directory • → Record Stores tab • → search term “record shops” For each shop, capture these fields exactly: – Business name – Email address – Phone number All three data sets should be merged into one unified file; no source labels or separate sheets are required. Please scrape or crawl the sites directly—automated methods such as Python, BeautifulSoup, Scrapy, or similar tools are fine so long as the final output arrives de-duplicated and ready to open in any spreadsheet application that supports CSV. Accuracy matters more than speed, so feel free to build in basic checks (e.g., email format validation, obvious duplicate removal). Once complete, send the single CSV plus a brief note on how you gather...
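Those basic checks could be as small as the sketch below: format-validate each email, drop obvious duplicates, and write one unified CSV (field names are assumptions):

```python
# De-duplicates on (name, email) and rejects malformed addresses before writing.
import csv
import re

EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

def write_clean_csv(rows: list[dict], out_path: str = "record_shops.csv") -> None:
    seen = set()
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["name", "email", "phone"])
        writer.writeheader()
        for row in rows:
            email = (row.get("email") or "").strip().lower()
            key = (row.get("name", "").strip().lower(), email)
            if key in seen or (email and not EMAIL_RE.match(email)):
                continue  # skip duplicates and malformed addresses
            seen.add(key)
            writer.writerow({"name": row.get("name", ""),
                             "email": email,
                             "phone": row.get("phone", "")})

write_clean_csv([{"name": "Spin City Records", "email": "info@spincity.example",
                  "phone": "+1 555 0100"}])  # placeholder data
```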
I need an automated scraping solution that reliably collects product data from targeted websites and delivers it in a clean, structured file I can plug straight into my workflow. You’re free to use Python (BeautifulSoup, Scrapy, Selenium, Playwright), hosted on a simple cloud instance if needed, as long as the output lands in CSV or JSON.
...directories you can legally access. For each product, capture at minimum the product name, its full description or spec blurb, and the page URL so I can verify the entry later. If additional details such as model numbers or imagery links are readily available while you scrape, feel free to include them as extra columns, but the name and description are non-negotiable. A Python-based workflow using Scrapy, BeautifulSoup, or a comparable toolset is fine by me so long as the end result arrives as a single Excel workbook, neatly separated by sheet or filterable fields. Please ensure your methods comply with site terms of service and New Zealand data-privacy requirements. I will consider the job complete when: • Every known NZ supplier type (storefront, manufacturer, d...
I need the brochure catalogue of a JavaScript-heavy e-commerce site captured and delivered as a clean CSV. My focus is on accurate prices and every available variant, pulled from each category the site offers. Python is the language of choice and I’m flexible on tooling—Scrapy + Playwright, straight Playwright, Selenium, or another robust approach—provided the code is modular, well-documented, and easy for me to rerun when the store layout shifts. If you already have proxy rotation or rate-limit handling baked into your pipeline, that will be an advantage. What has to happen • Crawl through every category filter so no product slips through the cracks. • Render dynamic content fully to capture price and variant data, along with URL, SKU, net price an...
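Rendering the dynamic content before parsing, as the brief requires, might use Playwright's sync API along these lines; the URL and selector are placeholders:

```python
# Loads a JavaScript-heavy category page headlessly and returns the
# fully rendered HTML, ready for price/variant extraction.
from playwright.sync_api import sync_playwright

def fetch_rendered(url: str) -> str:
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")             # let dynamic prices load
        page.wait_for_selector(".product", timeout=15_000)   # placeholder selector
        html = page.content()
        browser.close()
    return html

html = fetch_rendered("https://store.example.com/category/chairs")  # placeholder
```

The rendered HTML could then feed the same Scrapy or BeautifulSoup parsing layer used elsewhere in the pipeline.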