
Closed
My immediate goal is to develop robust predictive models that can meaningfully inform cardiovascular research and clinical decision-making. I have aggregated multiple datasets, ranging from structured EHR extracts to imaging-derived variables and device telemetry, and now need a data scientist who can turn these raw inputs into clinically relevant risk scores and outcome forecasts.

Scope of work
• Clean, integrate, and document the disparate datasets I will share (CSV, SQL dump, and optional imaging features in HDF5).
• Engineer features, test several algorithms (e.g., gradient boosting, random forests, neural nets), and iterate toward an interpretable solution.
• Provide model performance metrics (AUROC, calibration plots, and decision-curve analysis) so clinicians can easily gauge utility.
• Package the final model as a reproducible Python notebook / script with clear inline comments, environment file, and concise README.

Acceptance criteria
1. AUROC ≥0.80 on the held-out test set.
2. Code executes end-to-end with `conda env create -f [login to view URL]`.
3. All steps, from preprocessing through validation, are traceable in a single notebook or Markdown report.

When you respond, attach a detailed project proposal outlining: your planned workflow, preferred libraries (scikit-learn, XGBoost, PyTorch, etc.), anticipated timeline with milestones, and any relevant prior cardiovascular or biomedical work you can publicly reference. I am fully open to alternative techniques or supplementary data sources you may recommend, provided they enhance predictive power and remain explainable to a clinical audience.
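The three evaluation metrics named in the scope can be sketched in plain Python as below. This is only an illustration of what each metric measures, not the client's or any bidder's method: the function names (`auroc`, `calibration_bins`, `net_benefit`) and the toy labels and probabilities are assumptions made for this example, and a real project would more likely rely on an established library.

```python
# Illustrative sketch only: function names and toy data are assumptions,
# not part of the posting.

def auroc(y_true, y_prob):
    """Probability that a random positive scores above a random negative.

    Ties count as half a win (Mann-Whitney formulation). Cost is O(n_pos * n_neg).
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def calibration_bins(y_true, y_prob, n_bins=10):
    """Per-bin (mean predicted risk, observed event rate, count) for a calibration plot."""
    bins = {}
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins.setdefault(idx, []).append((p, y))
    rows = []
    for idx in sorted(bins):
        pairs = bins[idx]
        mean_pred = sum(p for p, _ in pairs) / len(pairs)
        observed = sum(y for _, y in pairs) / len(pairs)
        rows.append((mean_pred, observed, len(pairs)))
    return rows


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


# Toy held-out set (assumed values, for illustration only).
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

score = auroc(y_true, y_prob)
meets_criterion = score >= 0.80  # acceptance criterion 1 from the posting
```

Plotting `net_benefit` over a range of thresholds gives the decision curve, and plotting the first two columns of `calibration_bins` against each other gives the calibration plot.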
Project ID: 40427834
20 proposals
Remote project
Active 7 days ago
20 freelancers are bidding on average ₹1,020 INR/hour for this job

Your cardiovascular dataset will fail validation if we don't address class imbalance and temporal leakage upfront. Most EHR-based models I've audited show inflated AUROC scores because they leak future diagnoses into training features or ignore the fact that adverse events occur in only 3-8% of patients.

Before architecting the pipeline, I need clarity on two things: What's the prevalence rate of your primary outcome in the dataset, and are your imaging features time-stamped relative to the index event? If we're predicting 30-day readmission but the model sees discharge summaries written after readmission, we'll hit 0.95 AUROC in dev and 0.62 in production.

Here's the technical approach:
- PYTHON + SCIKIT-LEARN + XGBOOST: Build an ensemble pipeline with SMOTE for minority class upsampling, stratified k-fold cross-validation, and SHAP values for clinical interpretability so cardiologists can see which biomarkers drive each prediction.
- HADOOP + SPARK: If your EHR dumps exceed 50GB, I'll parallelize feature engineering using PySpark to handle joins across patient timelines without memory crashes.
- HDF5 + PYTORCH: Extract imaging embeddings using a pretrained ResNet backbone, then fuse them with tabular features in a multi-input neural network to capture non-linear interactions between echo measurements and lab values.
- CALIBRATION + DECISION CURVES: Deliver Brier scores and net benefit plots at multiple risk thresholds so you can justify clinical deployment to an IRB or hospital committee.

I've built 4 FDA-submission-ready models for cardiology startups, including a heart failure readmission predictor that outperformed the LACE index by 18 points AUROC. I don't take on projects where the ground truth labels are ambiguous. Let's schedule a 20-minute call to walk through your data dictionary and confirm we can hit 0.80 AUROC without overfitting.
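The two safeguards raised in this bid, removing records that post-date the index event and keeping the outcome rate stable across folds, could look roughly like the sketch below. Everything here is hypothetical: the record layout (`patient_id`, `feature`, `observed_at`), the helper names, and the toy dates are assumptions for illustration, and the round-robin fold assignment is a simplified stand-in for a library implementation of stratified k-fold.

```python
from datetime import datetime

# Hypothetical record layout and helper names, assumed for illustration only.

def drop_leaky_records(records, index_times):
    """Keep only records observed at or before each patient's index time.

    Records for patients without a known index time are dropped, because
    their timing relative to the prediction point cannot be verified.
    """
    kept = []
    for rec in records:
        index_time = index_times.get(rec["patient_id"])
        if index_time is not None and rec["observed_at"] <= index_time:
            kept.append(rec)
    return kept


def stratified_folds(labels, k):
    """Assign a fold id to each sample, round-robin within each class.

    Each fold then receives a near-equal share of every class, which matters
    when the outcome is rare.
    """
    seen = {}
    folds = []
    for y in labels:
        folds.append(seen.get(y, 0) % k)
        seen[y] = seen.get(y, 0) + 1
    return folds


# Toy example: the index event (e.g. discharge) is on 10 January.
index_times = {"p1": datetime(2024, 1, 10)}
records = [
    {"patient_id": "p1", "feature": "troponin", "observed_at": datetime(2024, 1, 9)},
    {"patient_id": "p1", "feature": "readmission_note", "observed_at": datetime(2024, 2, 1)},
]
clean = drop_leaky_records(records, index_times)   # the February note is removed
folds = stratified_folds([0, 0, 0, 0, 1, 1], k=2)  # each fold gets one positive
```

Any resampling step such as SMOTE would normally be fitted inside each training fold only, so that synthetic minority samples never influence the held-out fold.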
₹900 INR in 30 days
5.4

With my decade-long experience in the field of data science, I am more than equipped to handle your ambitious project. I specialize in **data-engineering & analytics**, **machine learning**, and **statistical analysis**: everything you need for a project of this magnitude! My skills in **Python** and with libraries like **Scikit-learn, XGBoost, PyTorch** will allow me to efficiently clean and integrate the disparate datasets you share and engineer the crucial features from them.

My forte also lies in providing clear insights into model performance, and I can definitely deliver on that front too. With my experience producing ROC curves, calibration plots, and decision-curve analyses, I vow to give your clinicians easy-to-understand gauges of utility. Additionally, my proficiency in creating reproducible models aligns perfectly with your requirement of packaging the final model as a Python notebook/script with clear comments.

When it comes to timeline management, I pride myself on my efficiency. Given the complexity and potential challenges posed by datasets from various sources such as EHR extracts and imaging features, I have come up with a detailed project plan. My experience working with healthcare clients would certainly come in handy here.
₹1,000 INR in 40 days
3.8

Hi there,

I have read your project requirements carefully. You need a complete data science pipeline to clean and integrate cardiovascular datasets, build predictive models, evaluate them with clinical metrics (AUROC, calibration, decision curves), and deliver a reproducible, well-documented Python solution.

We can develop an end-to-end workflow using Python (scikit-learn, XGBoost, PyTorch if needed), focusing on data quality, feature engineering, model performance, and clinical interpretability (SHAP, calibration analysis). The final output will be a clean notebook/script with environment setup, ensuring reproducibility and easy extension for future research.

Questions:
=========
What is the primary prediction target (e.g., mortality, readmission, specific condition)?
What is the approximate dataset size and number of features?
Are the datasets already cleaned/anonymised, or do we handle preprocessing fully?
Do you prefer interpretable models only, or are you open to more complex models if performance improves?

Best Regards,
Srashtasoft Team
₹750 INR in 40 days
3.0

Good morning/evening, Sir. I am ready to build your analysis and machine learning model. This is a one-day job, two days at most, and easy to complete with perfect and satisfying results. Just give me the order to start.
₹1,000 INR in 20 days
1.8

I am an expert in math, statistics, and machine learning. I will develop predictive models that meet your acceptance criteria.
₹750 INR in 40 days
0.0

I’m Gurpreet Singh, a professional freelance developer based in New Delhi, with 10+ years of experience in delivering secure, scalable, and high-performance digital solutions. I help startups and businesses turn their ideas into powerful, market-ready products.

What I Can Do for You
Mobile App Development (Android & iOS)
Desktop Software Development (C#, Java, .NET)
Custom Software & Web Application Development
Website Design & Development (WordPress, Joomla, Drupal)
Laravel, React JS & Node JS Development
Game Design & Development
Blockchain Solutions
AI Automation & Custom Tools
Meta Trading Tools, Bot Scripting & Web Scraping
SEO, Digital Marketing & Branding
Video Editing & Multimedia Production

⚙️ Technologies I Work With
React JS, Node JS, MongoDB
Python (Django)
Android (Java/Kotlin), iOS (Swift)
Flutter & React Native

✨ Why Work With Me?
✔ 10+ years of proven industry experience
✔ Modern, scalable & cost-effective solutions
✔ Creative and experienced development approach
✔ Transparent communication & smooth workflow
✔ Secure, optimized & future-ready technology
✔ On-time delivery with dedicated support
✔ Flexible pricing (open to discussion)

Let’s Work Together
If you’re looking for a reliable freelancer who can bring your ideas to life and deliver high-quality results, I’m here to help. Let’s build something amazing together.
₹750 INR in 40 days
0.0

Hello, I have experience with Python, machine learning, predictive modeling, statistical analysis, biomedical datasets, and data integration. I can help clean and integrate your datasets, engineer features, test multiple models, and build an interpretable cardiovascular prediction pipeline with clear validation metrics and documented code. I am comfortable working with tools such as scikit-learn, XGBoost, PyTorch, pandas, and Jupyter notebooks, and I always focus on reproducibility, model performance, and clean documentation. I am new on this platform and currently a student, so I am offering a discounted rate to build my profile and gain trusted reviews. I will complete the project carefully, professionally, and with high attention to detail. Please give me a chance to prove my work. Thank you very much.
₹1,000 INR in 40 days
0.0

Hi there,

As a Computer Engineering student specializing in Medical AI, I don't just build models; I build clinically interpretable solutions. Having previously developed a high-accuracy VGG16 model for brain tumor detection, I am deeply familiar with the nuances of processing complex biomedical datasets and multi-modal features (EHR & Imaging).

To hit your ≥0.80 AUROC target, here is my dedicated workflow:
1. Robust Data Integration: Using Pandas and SQLAlchemy, I will consolidate your CSV/SQL sources, while utilizing h5py to extract high-dimensional features from the HDF5 imaging variables.
2. Multi-Algorithm Iteration: I will benchmark Gradient Boosting (XGBoost/LightGBM) against Deep Neural Networks (PyTorch) to identify the optimal balance between predictive power and clinical utility.
3. Explainability-First Approach: Beyond AUROC, I will implement SHAP values and Decision Curve Analysis. This ensures clinicians don’t just see a score, but understand the exact physiological features driving the forecast.
4. Reproducible Delivery: You will receive a clean, modular Python pipeline with a Conda environment file, ensuring a "single-click" execution for your team.

I have one technical question regarding the telemetry data: Are the time-series logs already windowed, or would you like me to handle the temporal feature engineering?

Looking forward to contributing to your research!
Rahma
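For readers unfamiliar with what "windowing" the telemetry would involve, here is a minimal sketch, assuming the logs arrive as `(seconds, value)` pairs. The function name `window_features`, the 60-second window, and the sample heart-rate values are all assumptions for illustration; the posting does not describe the telemetry format.

```python
from statistics import mean

# Hypothetical helper and data layout, assumed for illustration only.

def window_features(samples, window_seconds):
    """Summarize (t_seconds, value) samples into fixed-length, non-overlapping windows.

    Returns one dict per non-empty window, ordered by window start time.
    """
    buckets = {}
    for t, value in samples:
        buckets.setdefault(int(t // window_seconds), []).append(value)
    return [
        {
            "window_start": idx * window_seconds,
            "n": len(values),
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
        }
        for idx, values in sorted(buckets.items())
    ]


# Toy heart-rate trace: (seconds since start, beats per minute).
trace = [(0, 60), (30, 62), (60, 80), (90, 84), (150, 70)]
features = window_features(trace, window_seconds=60)
```

Each resulting row can then be joined to the tabular EHR features by patient and window start, provided the window ends before the prediction point.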
₹1,000 INR in 40 days
0.0

I’m a Data Scientist and Python Developer with experience in predictive analytics, machine learning, data preprocessing, and model evaluation. I can build an end-to-end cardiovascular risk prediction pipeline using Python, Scikit-learn, XGBoost, and PyTorch with interpretable and clinically explainable outputs. My workflow will include data cleaning, feature engineering, model comparison, AUROC evaluation, calibration analysis, and reproducible deployment-ready notebooks with proper documentation. I focus on clean code, traceable workflows, and reliable predictive performance for real-world clinical applications.
₹750 INR in 40 days
0.0

Hi, I am interested in this job and would like to work with you. Please visit my website, www.vanecus.com. Thanks and regards, Md. Abdul Latif, Dhaka, Bangladesh.
₹1,000 INR in 40 days
0.0

Bengaluru, India
Member since Dec 26, 2021
₹750-1250 INR / hour