In Progress

Python joining files

I need some Python code that will join files. The files I will be joining are large - up to 5 GB. I need code that is fast and efficient.

I can't use the Python csv module for this since I may need this code to read from HDFS in a Hadoop cluster.

I have 3 files: A, B, and C. A is the master file and needs to be combined with files B and C in something similar to a SQL LEFT OUTER JOIN.

File A is the master file and has 6 fields:

A|B|C|D|E|F

File B has 3 fields:

G|H|A

File C has 4 fields:

I|J|K|A

File A joins to file B on column A

File A joins to file C on column A

The final output file will look like:

A|B|C|D|E|F|G|H|I|J|K

The code has to be fast since the file sizes can be up to 5 GB.
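One possible way to approach the join described above is a hash join: load B and C into dictionaries keyed on column A, then stream A line by line. This is only a minimal sketch under several assumptions that the posting does not confirm: column A is unique in B and C, both lookup files fit in memory, fields never contain a literal `|`, and unmatched rows are padded with empty fields. The function names `load_index` and `left_join` are hypothetical. The functions take any iterable of text lines, so the same logic should apply whether the lines come from a local file or from some other reader.

```python
def load_index(lines, n_fields, sep="|"):
    """Map the join key (last field) to the remaining fields of a lookup file."""
    index = {}
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        if len(fields) != n_fields:
            continue  # assumption: silently skip malformed rows
        # assumption: keys are unique; if not, the first occurrence wins
        index.setdefault(fields[-1], fields[:-1])
    return index


def left_join(a_lines, b_lines, c_lines, sep="|"):
    """Yield A|B|C|D|E|F|G|H|I|J|K rows, keeping every row of file A."""
    b_index = load_index(b_lines, 3, sep)  # G|H|A
    c_index = load_index(c_lines, 4, sep)  # I|J|K|A
    for line in a_lines:
        fields = line.rstrip("\n").split(sep)
        key = fields[0]  # column A is the first field of the master file
        g_h = b_index.get(key, ["", ""])        # empty padding when B has no match
        i_j_k = c_index.get(key, ["", "", ""])  # empty padding when C has no match
        yield sep.join(fields + g_h + i_j_k)
```

A usage example with ordinary local files (the file names here are placeholders) would be to open `a.txt`, `b.txt`, and `c.txt`, pass the three file objects to `left_join`, and write each yielded row plus a newline to the output file. Because A is streamed, memory use depends on the size of B and C rather than on A.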

Skills: Python


About the Employer:
(4 reviews) Chicago, United States

Project ID: #4430436

Awarded to:

Analist

Hi! I would like to help you with your project.

$79 USD in 2 days
(3 reviews)
3.0

7 freelancers are bidding an average of $119 for this job

tlyx

Are all the files that large, or only the master?

$198 USD in 6 days
(21 reviews)
4.6
Etcherator

I have been working in Python for more than 3 years. I can do the job for you.

$100 USD in 3 days
(1 review)
3.2
pabloz1974

I can do this job with pleasure.

$110 USD in 3 days
(4 reviews)
2.6
s2devtech

I have done the major portion of my programming in Python and am very familiar with the language and environment.

$150 USD in 5 days
(0 reviews)
0.0
boosth

Can do this quickly.

$88 USD in 1 day
(0 reviews)
0.0
seboknorbi

Are the items in files A, B, and C sorted by column A? If yes, the task is trivial and the conversion will be very fast, similar to copying the files. Otherwise the script has to build indexes for files B and C, and use th…

$110 USD in 3 days
(0 reviews)
0.0
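
The sorted case raised in the last bid can be illustrated with a merge join, which reads all three files once and keeps only one row of B and one row of C in memory at a time. This is a rough sketch, not the bidder's actual code, and it assumes things the posting does not state: all three files are sorted ascending on column A using plain string comparison (so "10" sorts before "9"), column A is unique in B and C, and fields contain no literal `|`. The name `merge_join_sorted` is hypothetical.

```python
def merge_join_sorted(a_lines, b_lines, c_lines, sep="|"):
    """Left outer join of three inputs already sorted on column A."""

    def keyed(lines):
        # B and C carry the join key in their last field
        for line in lines:
            fields = line.rstrip("\n").split(sep)
            yield fields[-1], fields[:-1]

    def advance(it, current, key):
        # skip lookup rows whose key sorts before the current master key
        while current is not None and current[0] < key:
            current = next(it, None)
        return current

    b_it, c_it = keyed(b_lines), keyed(c_lines)
    b_cur, c_cur = next(b_it, None), next(c_it, None)
    for line in a_lines:
        fields = line.rstrip("\n").split(sep)
        key = fields[0]
        b_cur = advance(b_it, b_cur, key)
        c_cur = advance(c_it, c_cur, key)
        # pad with empty fields when the lookup file has no matching key
        g_h = b_cur[1] if b_cur is not None and b_cur[0] == key else ["", ""]
        i_j_k = c_cur[1] if c_cur is not None and c_cur[0] == key else ["", "", ""]
        yield sep.join(fields + g_h + i_j_k)
```

If the files are not already sorted, this approach would first need an external sort step, at which point building in-memory indexes for B and C may be the simpler option when they fit in RAM.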