Closed

Problem solution for Spark use case

1) I have 60 million USA and Canada postal codes.

Created a DataFrame customer_df => good and bad => 60M

2) Downloaded some good postal codes from the internet to filter the customer_df data.

Created one DataFrame good_df => the good postal codes => 1M

3) Performed a join between customer_df and good_df on zipcode to separate the good values.

filter_df = good zip [url removed, login to view](cus_df,zipcode)

4) Then separated the bad data with the logic below.

bad_df = [url removed, login to view](filter_df)

The bad_df can still be filtered further by city name.

city_df = [url removed, login to view](bad_df,city)

Then did a union of both DataFrames.

total_filter = [url removed, login to view](city_df)
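One reading of the steps above is that a customer record is kept when its zipcode is in the good list, or, failing that, when its city is. A minimal sketch of that logic in plain Python (not Spark) follows. The column names `zipcode` and `city` come from the post; the `id` field, the function names, the sample rows and the trim/upper-case normalisation are assumptions made only for illustration. Building the lookups once and testing membership per record is the same idea as a broadcast hash join on the small 1M-row table, which in Spark would avoid shuffling the 60M-row table for each of the three joins.

```python
from typing import Dict, Iterable, List, Set

Record = Dict[str, str]


def normalise(value: str) -> str:
    """Trim and upper-case a value so that 'toronto ' matches 'Toronto' (assumed rule)."""
    return (value or "").strip().upper()


def build_lookup(good_rows: Iterable[Record], key: str) -> Set[str]:
    """Collect the distinct non-empty values of `key` from the small reference data."""
    values = (normalise(row.get(key, "")) for row in good_rows)
    return {v for v in values if v}


def clean(customers: Iterable[Record], good_zips: Set[str], good_cities: Set[str]) -> List[Record]:
    """Keep a record if its zipcode is known, or else if its city is known.

    This is equivalent to: join on zipcode, subtract to get the bad rows,
    join the bad rows on city, then union the two results.
    """
    kept = []
    for row in customers:
        if normalise(row.get("zipcode", "")) in good_zips:
            kept.append(row)  # corresponds to filter_df
        elif normalise(row.get("city", "")) in good_cities:
            kept.append(row)  # corresponds to city_df
    return kept


# Hypothetical sample data, not taken from the post.
good_rows = [
    {"zipcode": "10001", "city": "New York"},
    {"zipcode": "M5V 2T6", "city": "Toronto"},
]
customers = [
    {"id": "1", "zipcode": "10001", "city": "New York"},
    {"id": "2", "zipcode": "99999", "city": "toronto"},
    {"id": "3", "zipcode": "00000", "city": "Nowhere"},
    {"id": "4", "zipcode": "", "city": ""},
]

good_zips = build_lookup(good_rows, "zipcode")
good_cities = build_lookup(good_rows, "city")
total_filter = clean(customers, good_zips, good_cities)
```

Because the lookups are sets, duplicate zipcodes in the reference data cannot multiply customer rows, which a plain inner join would do.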

It is taking about 1 hour 30 minutes (Spark on an 8-node cluster, 32 GB per node; spark-submit with --driver-memory 8g, --num-executors 8 and --executor-memory 8g).
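As a rough check on those settings, the arithmetic below compares the memory requested with what the cluster has. It is only a sketch: it assumes "32 gb" means RAM per node, ignores operating-system and Spark overhead, and the function name is made up for illustration. The post does not state executor cores or partition counts, so those are left out.

```python
def memory_summary(nodes: int, mem_per_node_gb: int,
                   num_executors: int, executor_mem_gb: int,
                   driver_mem_gb: int) -> dict:
    """Compare requested Spark memory with total cluster memory (overheads ignored)."""
    cluster_gb = nodes * mem_per_node_gb
    requested_gb = num_executors * executor_mem_gb + driver_mem_gb
    return {
        "cluster_gb": cluster_gb,
        "requested_gb": requested_gb,
        "share": requested_gb / cluster_gb,
    }


# Figures from the post: 8 nodes x 32 GB, 8 executors x 8g, driver 8g.
summary = memory_summary(nodes=8, mem_per_node_gb=32,
                         num_executors=8, executor_mem_gb=8,
                         driver_mem_gb=8)
# cluster_gb = 256, requested_gb = 72, share = 0.28125

# 60M customer rows spread evenly over 8 executors.
rows_per_executor = 60_000_000 // 8  # 7,500,000
```

Under these assumptions the job asks for a little over a quarter of the cluster's memory, so resource settings are one place to look before switching tools.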

Is there any other technology or tool that can clean up the data within 15 to 20 minutes? (Again, the customer data is 60M records.)

Skills: Big Data Sales, Hadoop, Hive, Spark


About the Employer:
(0 reviews) United States

Project ID: #14803384

16 freelancers are bidding an average of $1101 for this job

rutvikparmar

I am a data scientist and have experience with Big Data technologies like Spark and Hadoop. I also have experience with NoSQL databases like HBase, Cassandra, etc. Previously I have worked with in Spark related projec...

Bid: 1100
(6 Reviews)
4.4
chrisvwn

I propose first analyzing your current algorithm to find the bottleneck and then either rewriting your algorithm, reconfiguring your environment or finding other technologies, e.g. Impala. Relevant Skills and Experience I h...

Bid: 1250
(3 Reviews)
3.2
punitpuneet2008

Hello, I am a Big Data developer with 7+ years of experience. I understand the job and will provide the desired solution. Please spare some time to discuss further. Relevant Skills and Experience My Key skills are: Java J...

Bid: 1079
(4 Reviews)
3.5
ubfapps

Hi, we are 5 big data enthusiasts with expertise in core technologies like Hadoop, Spark, MongoDB, Hive, Pig, R, etc. All of us have development experience in languages like Scala, Python and Java. Our vision is to delive...

Bid: 750
(6 Reviews)
3.4
ahmsak

Hello Sir, I have very good experience in Spark and Scala. Please contact me for more details when possible. I look forward to working for you. Best Regards. Relevant Skills and Experience I am a computer science t...

Bid: 750
(2 Reviews)
2.8
itsmeccr

I am new to Freelancer, but I have been working in the field of Big Data for more than 3 years. The project description tells me you are technical as well. I think the pseudocode can be optimized. Relevant Skills and Experien...

Bid: 833
(1 Review)
0.4
hpatel0426

I have experience in tuning and debugging Spark jobs for one of the Fortune 6 companies, which processes large amounts of data.

Bid: 750
(0 Reviews)
0.0
abhijitcaps

Hello, with 7 years of experience in Java, 3 years in Hadoop and 1+ year in Spark, an excellent solution is guaranteed. What are your values for "--master" and "--deploy-mode" in the spark-submit command? Relevant Skills...

Bid: 833
(1 Review)
0.0
AzzkaNoor

Hello there. I have seen your job posting and would like to ask some questions. Please come over to the chat so we can discuss things. Relevant Skills and Experience All the skills/experience will be discussed/revealed upo...

Bid: 1250
(0 Reviews)
0.0
pedrom9632

I have experience working with Apache Spark and manipulating DataFrames and RDDs using Python.

Bid: 1111
(0 Reviews)
0.0
Bid: 1250
(0 Reviews)
0.0
tobek90

Hello, I have a lot of experience in the field. Feel free to ask for my work.

Bid: 1500
(0 Reviews)
0.0
Bid: 1500
(0 Reviews)
0.0
Bid: 1250
(0 Reviews)
0.0
imkhatri

I am a Big Data Engineer certified by Simplilearn. Relevant Skills and Experience Big Data Hadoop and Spark Developer Proposed Milestones $750 USD - It will be cleared in 8 days (Only Weekends will be calculated as wo...

Bid: 750
(0 Reviews)
0.0
henrikzilmer

django PHP Arduino hadoop metatrader web design python machine learning HTML, HTML5 graphic design wordpress Android unity3d Relevant Skills and Experience django PHP Arduino hadoop metatrader web design python machi...

Bid: 1666
(0 Reviews)
0.0