In Progress

Python joining files

I need some Python code that will join files. The files I will be joining are large - up to 5 GB. I need code that is fast and efficient.

I can't use the Python csv module for this since I may need this code to read from HDFS in a Hadoop cluster.

I have 3 files: A, B, and C. A is the master file and needs to be joined with files B and C in a way similar to a LEFT OUTER JOIN in SQL.

File A is the master file and has 6 fields:


File B has 3 fields:


File C has 4 fields:


File A joins to file B on column A

File A joins to file C on column A

The final output file will look like


The code has to be fast since the file sizes can be up to 5 GB.
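
One possible shape for the join described above, as a rough sketch rather than a finished solution. It assumes things the posting does not state: the fields are tab-separated, the join key ("column A") is the first field of every file, each key appears at most once in B and in C (a repeated key keeps the last row seen), and B and C fit in memory while A is streamed. The function names `build_index` and `left_outer_join` are made up for illustration. Only plain string splitting is used, and the functions accept any iterable of lines, so an open local file or a stream read from HDFS could be passed in.

```python
def build_index(lines, sep="\t"):
    """Map the first field (assumed join key) to the remaining fields of each row."""
    index = {}
    for line in lines:
        fields = line.rstrip("\r\n").split(sep)
        index[fields[0]] = fields[1:]  # a repeated key keeps the last row seen
    return index


def left_outer_join(a_lines, b_lines, c_lines, b_width=2, c_width=3, sep="\t"):
    """Yield every row of A, extended with matching fields from B and C.

    b_width and c_width are the number of non-key fields in B and C
    (3 and 4 fields minus the key). Rows of A without a match get
    empty strings in those positions, as in a LEFT OUTER JOIN.
    """
    b_index = build_index(b_lines, sep)
    c_index = build_index(c_lines, sep)
    b_missing = [""] * b_width
    c_missing = [""] * c_width
    for line in a_lines:  # A is streamed, never held in memory
        fields = line.rstrip("\r\n").split(sep)
        key = fields[0]
        joined = fields + b_index.get(key, b_missing) + c_index.get(key, c_missing)
        yield sep.join(joined)


if __name__ == "__main__":
    # Tiny in-memory stand-ins for the real files.
    a = ["1\ta\n", "2\tb\n"]
    b = ["1\tx\ty\n"]
    c = ["2\tp\tq\tr\n"]
    for row in left_outer_join(a, b, c):
        print(row)
```

If B or C is too large to index in memory, this approach does not apply as written, and a sort-based or distributed join would be needed instead.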

Skills: Python


About the Employer:
(4 reviews) Chicago, United States

Project ID: #4430436



Hi! I would like to help you with your project.

$79 USD in 2 days
(3 reviews)

7 freelancers are bidding on average $119 for this job


Are all the files that large, or only the master?

$198 USD in 6 days
(21 reviews)

I have been working in Python for more than 3 years. I can do the job for you.

$100 USD in 3 days
(1 review)

I can do this job with pleasure.

$110 USD in 3 days
(4 reviews)

I have done the major portion of my programming in Python and am very familiar with the language and environment.

$150 USD in 5 days
(0 reviews)

Can do this quickly.

$88 USD in 1 day
(0 reviews)

Are the items in files A, B, and C sorted by column A? If yes, the task is trivial and the conversion will be very fast, similar to copying the files. Otherwise the script has to build indexes for files B and C, and use th…

$110 USD in 3 days
(0 reviews)
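
The sorted case raised in the last bid can be sketched as a single-pass merge. This is an illustration under assumptions that nobody in the thread confirms: all three files are sorted ascending by the key using plain string comparison, the key is the first tab-separated field, and keys are unique within B and within C. The names `parse`, `lookup_sorted`, and `merge_join` are hypothetical.

```python
def parse(lines, sep="\t"):
    """Yield (key, remaining_fields) for each line; the key is the first field."""
    for line in lines:
        fields = line.rstrip("\r\n").split(sep)
        yield fields[0], fields[1:]


def lookup_sorted(rows, width):
    """Return a lookup function over a key-sorted stream of (key, fields).

    The lookup must be called with non-decreasing keys. It advances the
    stream past smaller keys and returns the matching fields, or `width`
    empty strings when there is no match.
    """
    current = next(rows, None)

    def lookup(key):
        nonlocal current
        while current is not None and current[0] < key:
            current = next(rows, None)
        if current is not None and current[0] == key:
            return current[1]
        return [""] * width

    return lookup


def merge_join(a_lines, b_lines, c_lines, b_width=2, c_width=3, sep="\t"):
    """Left outer join of A with B and C, reading each file exactly once."""
    find_b = lookup_sorted(parse(b_lines, sep), b_width)
    find_c = lookup_sorted(parse(c_lines, sep), c_width)
    for key, rest in parse(a_lines, sep):
        yield sep.join([key] + rest + find_b(key) + find_c(key))
```

Only one row from each file is held in memory at a time, which is why this case is close to the cost of copying the files. Numeric keys sorted numerically would need a different comparison than the string comparison used here.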