EmpNo EmpName Salary
E123 Tom 2000
E124 RAM 2000
E125 TAM 2000
E126 SAM 2000
E124 RAM 4000
E126 SAM 6000
E125 TAM 9000
E123 Tom1 4000
Transform this DataFrame to
EmpNo EmpName Salary rownum
E123 Tom 2000 1
E124 RAM 2000 1
E125 TAM 2000 1
E126 SAM 2000 1
E126 SAM1 5000 2
E124 RAM 4000 2
E126 SAM 6000 3
E125 TAM 9000 2
E123 Tom1 4000 2
Here is the summary:
-- Duplicate EMpnos should be indexed(as shown in the rownum column)
-- The order of index should be based on salary.
-- Need All of the below approaches
-- Should be optimized and should be runnable on a cluster.
1. Using RDD(a. using SparkContext, [login to view URL])
2. Using DataFrame(a. using SparkContext, [login to view URL])
3. Using DataSet(a. using SparkContext, [login to view URL])
DataFrame Operations should contain both . notation and sql notation.
4) Any corrections/small enhancements(if required)
Bu iş için 14 freelancer ortalamada $25 teklif veriyor
Hi my skills: Data Processing, Data Science, Hadoop, Scala, Spark let's discuss more over the chat.................................
Hi I am interested in this job. I believe that the skills and experiences I have gained at this position make me an ideal candidate for this position. I have extensive experience in implementing Spark-core, Spark-s Daha Fazla
Hi Dear, I'm available now for urgent working for you 40/hr for data entry work with 100% correction I will do Fast and accurate Data entry and Data processing form according to the requirement. The First prio Daha Fazla
hello, senior data engineer, I have worked from more than 6 years on software development, I know well spark, scala etc, I believe I can resolve your problem.
Scala and spark expert. Have written numerous algorithms in spark/scala. Most recently wrote cure algorithm in scala/spark. The problem presented is interesting. Can complete this project with specs2 mocking. Daha Fazla