Stock Data Analysis - Text Parsing / Excel Spreadsheet

I have a large amount of intraday stock data for over 600 stocks, stored in ASCII format, for which I need summary statistics. The analysis should output the summary statistics to an Excel workbook.

The primary objective of this exercise is to screen a large amount of financial information for missing entries. This is not a difficult project by any means, but the skill involved is in getting the program to run within a reasonable time period. Assume each of the .txt files is around 30 MB.

Currently, the data is stored in around 600 discrete .txt files. I have included one example file, which is around half the size of the usual files. Freelancer won't allow me to upload more due to size restrictions, so please PM me for more examples. The data is stored in the usual format: mm/dd/yyyy hh:mm open,high,low,close,volume.
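As a rough illustration of the record format described above, here is a minimal Python sketch of parsing one line. The names `Bar` and `parse_line` are hypothetical, and the sample values are invented for illustration. The sketch assumes seven fields per record and treats commas and spaces interchangeably as separators, since the exact delimiter between the time and the prices is an assumption rather than something confirmed by the sample file.

```python
from datetime import datetime
from typing import NamedTuple


class Bar(NamedTuple):
    """One intraday record (hypothetical container for illustration)."""
    timestamp: datetime
    open: float
    high: float
    low: float
    close: float
    volume: int


def parse_line(line: str) -> Bar:
    """Parse a record such as '02/19/2010 09:30 10.5,10.8,10.4,10.7,1200'."""
    # Assumption: commas and whitespace are both plain field separators.
    fields = line.replace(",", " ").split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}: {line!r}")
    date_s, time_s, o, h, l, c, v = fields
    ts = datetime.strptime(f"{date_s} {time_s}", "%m/%d/%Y %H:%M")
    # int(float(v)) tolerates volumes written as '1200' or '1200.0'.
    return Bar(ts, float(o), float(h), float(l), float(c), int(float(v)))
```

For the screening task only the timestamp is needed, so a faster variant could skip the price conversion entirely.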

As you will see, these files are large. I need someone who is an expert in very high speed text parsing and data analysis. I'm not at all concerned about which programming language the developer chooses, but I would imagine one of the Microsoft programming languages would be easier, given that the summary statistics output must be in Excel 2007.

I would like the application to perform analysis and develop summary statistics for each stock. The summary statistics should include:

Date of first data point

Date of last data point

Number of data points per day (with a time series Excel graph of the daily counts for each stock)

The dates on which there are no data points (weekends should not be included)
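The statistics listed above could be computed along the following lines. This is a minimal Python sketch, not a tuned implementation: the function name `summarize` and its parameters are hypothetical, the optional `start` and `end` stand in for the user-chosen analysis period, and `holidays` stands in for the list of non-trading days. The timestamps in the usage lines are invented for illustration.

```python
from collections import Counter
from datetime import date, datetime, timedelta


def summarize(timestamps, start=None, end=None, holidays=frozenset()):
    """Return first date, last date, per-day counts and missing weekdays."""
    counts = Counter(ts.date() for ts in timestamps)
    if not counts:
        return None
    first, last = min(counts), max(counts)
    # Assumption: with no period given, scan the span covered by the data.
    day = start or first
    end = end or last
    missing = []
    while day <= end:
        # weekday() is 0-4 for Monday-Friday; skip weekends and holidays.
        if day.weekday() < 5 and day not in holidays and day not in counts:
            missing.append(day)
        day += timedelta(days=1)
    return {"first": first, "last": last,
            "counts": dict(counts), "missing": missing}


# Usage: 19 Feb 2010 is a Friday, so 20-21 Feb are skipped as a weekend.
stamps = [datetime(2010, 2, 19, 9, 30), datetime(2010, 2, 19, 9, 31),
          datetime(2010, 2, 22, 9, 30), datetime(2010, 2, 25, 9, 30)]
report = summarize(stamps, holidays={date(2010, 2, 24)})
```

The per-day counts dictionary is the series that the time series graph for each stock would plot.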

The output workbook will be composed of three main areas:

1) A summary sheet which lists all the stocks in the folder to be analysed, with their start dates, end dates, number of missing days, and the days on which they are missing entries.

2) A second worksheet that contains a list of days that are not to be included in the analysis (i.e. non-trading days, excluding weekends).

3) The full analysis sheets for each stock file, including a time series graph with the number of data points for each day and a list of the days for which there are no entries.

That means that the final output will be an Excel workbook with a large number of worksheets: one for each stock and two additional ones as described in points 1 and 2 above.

The application must also ask me for the period over which I wish to perform the analysis. Ie.

I will be using the application to analyze all of the files sequentially, so here's the challenge: the application MUST run through all 600 of the text files in less than 18 hours on a Dual Core Pentium T2050 laptop with 2 GB of RAM running Windows XP.

To ensure compliance, I will provide the winning bidder(s) with 50 text files, and the run-through must be completed in less than one and a half hours on a system of similar specification to mine. The winning bidders must ensure that the final application can do this before submitting the work for payment.
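To put the two time limits in per-file terms, the arithmetic can be sketched in Python as below. The helper name `per_file_budget_seconds` is hypothetical, and the 30 MB file size is the approximate figure quoted earlier in this posting.

```python
def per_file_budget_seconds(total_hours: float, n_files: int) -> float:
    """Average time available per file for a run of n_files files."""
    return total_hours * 3600 / n_files


full_run = per_file_budget_seconds(18, 600)    # all 600 files in 18 hours
acceptance = per_file_budget_seconds(1.5, 50)  # 50 test files in 1.5 hours

# Sustained throughput needed for a file of roughly 30 MB.
min_mb_per_s = 30 / full_run

print(full_run, acceptance, round(min_mb_per_s, 2))
```

Both limits work out to the same average of 108 seconds per file, so the 50-file acceptance test is a proportional sample of the full run.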

In responses, I would very much appreciate an outline of how you intend to proceed. Some form of evidence that you have experience with high-speed parsing of large files would be a distinct advantage. This project should be easy money for the right person. I would not expect the application to take more than two days to develop following bidder selection.

Please note that the files are much too large to import into Excel (any version); therefore the summary statistics will need to be calculated outside of Excel.


An enormous thanks to everyone for submitting their bids.

There is quite a lot of work for me to get through answering all of the various questions that have arisen.

I intend to do all this on Sunday and make my final selection then.

Kindest Regards,

One other thing I forgot to mention is that the program must be able to remove duplicate entries. As an example of this, in the newly added attachment you will see that from 02/19/2010 onwards there are duplicate lines for each time frame.

The program must output a new 'cleansed' file without these duplicate entries in a folder designated by the user.
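A minimal Python sketch of the cleansing step follows. It assumes that a duplicate is a line whose text exactly repeats an earlier line in the same file; if duplicates instead share a timestamp but differ in values, the key would need to change. The names `dedupe_lines` and `write_cleansed` are hypothetical.

```python
from pathlib import Path


def dedupe_lines(lines):
    """Yield each distinct non-empty line once, in first-seen order."""
    seen = set()
    for line in lines:
        key = line.strip()
        if key and key not in seen:
            seen.add(key)
            yield key


def write_cleansed(src: Path, out_dir: Path) -> Path:
    """Write a de-duplicated copy of src into out_dir; return its path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    dst = out_dir / src.name
    with src.open("r", encoding="ascii") as fin, \
            dst.open("w", encoding="ascii", newline="\n") as fout:
        for line in dedupe_lines(fin):
            fout.write(line + "\n")
    return dst
```

This version keeps every distinct line in memory. If duplicates always sit next to each other, comparing each line only with the previous one would use far less memory on a 2 GB machine.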


Thank you all very much for your bids.

However, I cannot possibly reiterate the following enough:

The data checking cannot be done in Excel (any version). The sample files I have provided are representative of the actual files.

Please assume that the actual files have a minimum of three million entries each and there are 600 such files which must be processed in a batch, and require no user intervention from the moment of pressing the 'GO' button.
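One way to meet the "no user intervention" requirement is a batch driver that records failures and carries on instead of stopping. The following is a minimal Python sketch under that assumption; `run_batch` and its `process` callback are hypothetical names, with `process` standing in for whatever per-file analysis is used.

```python
from pathlib import Path


def run_batch(in_dir, process, pattern="*.txt"):
    """Apply process(path) to every matching file in in_dir.

    Returns (results, errors), both keyed by file name.
    """
    results, errors = {}, {}
    for path in sorted(Path(in_dir).glob(pattern)):
        try:
            results[path.name] = process(path)
        except Exception as exc:
            # Record the failure and continue, so that one bad file
            # does not halt an unattended run.
            errors[path.name] = str(exc)
    return results, errors
```

The collected errors can be reported at the end of the run, for example alongside the summary sheet.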

As a result, I have considerably extended the bidding time by three weeks.

Skills: C# Programming, C++ Programming, Excel, Financial Markets, Visual Basic

About the Employer:
(11 reviews) London, United Kingdom

Project ID: #700061



I have the skills and technical sophistication necessary to do this project. The bulk of the analysis can be performed in a standalone application written in C++, with the user interface implemented in Excel or perhap...

$249 USD in 7 days
(4 reviews)

54 freelancers are bidding on average $226 for this job


I am an expert in text/ASCII file parsing with .NET and can start a detailed discussion with you in inbox/PMB. Send me sample files ASAP. Here is my short introduction: I am a GOLD member of this site, I am a Micro...

$750 USD in 10 days
(154 reviews)

I'm experienced with high performance processing solutions and I'd be glad to help you. Please refer to your PMB for further discussion.

$600 USD in 7 days
(2 reviews)

Greetings, England, from the USA {smile}. Because of the volume of data, as well as to maximize your data analysis abilities, I am proposing that an MS Access backend be used. Hello, my name is Bill. I am a MsA...

$250 USD in 3 days
(24 reviews)

Over 10 years Excel/VBA experience.

$200 USD in 2 days
(26 reviews)

Please check PMB.

$200 USD in 3 days
(39 reviews)

Hi, I can help you with this if you like.

$300 USD in 15 days
(22 reviews)

We have both passion & expertise to do the job well within specified time & budget.

$251 USD in 30 days
(11 reviews)

Hi, this can be done perfectly. Thanks, Al

$250 USD in 5 days
(22 reviews)

Hi, ready for taking it up.

$450 USD in 10 days
(16 reviews)

Hi, I won't bore you with details of my expertise and qualifications. Please refer to PMB so that you can transfer the files and I can have a look.

$250 USD in 2 days
(14 reviews)

Exceptional quality & Quick work. Please refer to Profile & PMB for details. Thanks FAZ

$200 USD in 1 day
(15 reviews)

Hi, Please check PMB. Regards

$250 USD in 10 days
(7 reviews)

Please check your PMB. Thanks

$200 USD in 5 days
(3 reviews)

Please see the PM

$210 USD in 3 days
(17 reviews)

Hello, I would like to do this project. Please see PMB. Best regards, CDumitru

$120 USD in 2 days
(5 reviews)

Ready to do this work for you.

$190 USD in 3 days
(7 reviews)

Please check PMB.

$250 USD in 10 days
(3 reviews)

I have just done a project involving a lot of text manipulation. I can do this project easily. 5 days maximum.

$200 USD in 5 days
(1 review)

I can perform the full analysis for you in MATLAB. However, your $250 budget will not be sufficient.

$250 USD in 15 days
(1 review)

Hi, I am currently working on a similar kind of project at an investment bank, which reads tick data from different markets (like EbsLive, Reuters), both historical and real-time, and parses it and stores t...

$250 USD in 15 days
(0 reviews)