This is a student project for implementing a noisy version of random decision trees, using either Spark (MLlib) or Hadoop MapReduce (with the Mahout machine-learning library).
In the MapReduce version: a Hadoop cluster needs to be installed on virtual machines and MapReduce configured. A large sample file is split into smaller chunks, one sent to each mapper, and each mapper builds a decision tree from the data it receives. In the reduce phase, the results of these decision trees are averaged and a noisy version of the averaged results is published.
In the Spark version the idea is the same: install Spark on the Hadoop cluster, take a large sample file, draw random datasets from it, build a decision tree on each, average the trees (as in the random-forest technique), and publish a noisy version of the results.
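The pipeline is the same in both versions: fit one tree per random chunk of the data, average the trees' predictions, and add noise before publishing. A minimal local sketch of that logic in plain Python (no Spark or Hadoop here; the stump fitter, the `noisy_forest_predict` function, and the Laplace-noise scale `1/epsilon` are all illustrative assumptions, not part of the project spec):

```python
import random
import statistics

def fit_stump(rows):
    """Fit a one-level regression tree (a stump) on (x, y) pairs by
    picking the threshold on x that minimises squared error."""
    xs = sorted({x for x, _ in rows})
    best = None
    for t in xs[1:]:
        left = [y for x, y in rows if x < t]
        right = [y for x, y in rows if x >= t]
        lm, rm = statistics.mean(left), statistics.mean(right)
        err = (sum((y - lm) ** 2 for y in left)
               + sum((y - rm) ** 2 for y in right))
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    if best is None:                       # all x values identical: no split
        m = statistics.mean(y for _, y in rows)
        return lambda x: m
    _, t, lm, rm = best
    return lambda x: lm if x < t else rm

def noisy_forest_predict(data, x, n_trees=25, epsilon=1.0, seed=0):
    """'Map' phase: fit one stump per bootstrap sample of the data.
    'Reduce' phase: average the trees' predictions at x and add
    Laplace noise with scale 1/epsilon (sensitivity taken as 1 here)."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_trees):
        chunk = [rng.choice(data) for _ in data]   # one mapper's random chunk
        preds.append(fit_stump(chunk)(x))
    avg = sum(preds) / len(preds)
    # Laplace(0, 1/epsilon) noise, built as the difference of two Exp(1) draws
    noise = (rng.expovariate(1.0) - rng.expovariate(1.0)) / epsilon
    return avg + noise
```

On a real cluster each bootstrap chunk would be a partition handled by a mapper (or a Spark task), and the averaging plus the noise step would happen in the reducer (or in a `reduce` on the driver); MLlib's `RandomForest` already does the tree-building and averaging, leaving only the noise step to add.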
I prefer to use Spark, so I am particularly looking for someone who can implement this noisy random decision trees algorithm in Spark on Hadoop.
A very good understanding of cluster-based Hadoop, Spark, and machine-learning algorithms, especially decision trees and random forests, is essential for this project.
7 freelancers are bidding on average $796 for this job
Very interesting problem! I have used Hadoop, but with HBase and normal MapReduce. I have been reading up on Spark lately and notice that operating in memory is when the maximum performance gains can be …
Greetings! We are glad to bid on your project and would be happy to get a chance to work on it. Site links: [url removed, login to view] (e-commerce site), [url removed, login to view] (hospital site), …
I have an MS in Information Security with a research interest in cryptography. I have implemented cryptography algorithms in Java and C, written security solutions in J2EE and Java, and have implemented data structures in Ja…
Experience in machine-learning algorithms. Experience in Map/Reduce. Strong Python skills. More than 10 years of experience.
I am familiar with Java, though I prefer Scala, and have already made use of Hadoop and MapReduce. Online I have participated in multiple Kaggle competitions, which allows me to keep up to date with the latest to…
Machine-learning scholar; my research areas include data mining, machine learning, and big data. My past experience in machine learning includes 8+ years of research in the ML domain.