Stock Data Analysis - Text Parsing / Excel Spreadsheet

I have a large amount of intraday stock data for over 600 stocks, stored in ASCII format, that I need to analyze to produce summary statistics. The analysis should output the summary statistics to an Excel workbook.

The primary objective of this exercise is to screen a large amount of financial information for missing entries. This is not a difficult project by any means, but the skill involved is in getting the program to run within a reasonable time period. Assume each of the .txt files is around 30 MB.

Currently, the data is stored in around 600 discrete .txt files. I have included one example file, which is around half the size of the usual files. Freelancer won't allow me to upload more due to size restrictions, so please PM me for more examples. The data is stored in the usual format: mm/dd/yyyy hh:mm open,high,low,close,volume.
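For the missing-entry screening, only the date portion of each line is needed, so the price fields can be skipped entirely. A minimal Python sketch, assuming every record begins with a 10-character mm/dd/yyyy date as in the format above. The function name `count_points_per_day` and the sample prices are illustrative and not taken from the real files:

```python
from collections import Counter
from datetime import date


def count_points_per_day(lines):
    """Count records per calendar day from an iterable of text lines."""
    raw = Counter()
    for line in lines:
        stamp = line[:10]  # assumed layout: "mm/dd/yyyy"
        # Skip anything that does not look like a dated record.
        if len(stamp) == 10 and stamp[2] == "/" and stamp[5] == "/":
            raw[stamp] += 1
    # Build date objects once per distinct day rather than once per line.
    return {date(int(s[6:10]), int(s[0:2]), int(s[3:5])): n
            for s, n in raw.items()}


# Hypothetical sample records in the assumed format.
sample = [
    "02/18/2010 09:30 10.00,10.05,9.98,10.02,1200",
    "02/18/2010 09:31 10.02,10.06,10.01,10.04,800",
    "02/19/2010 09:30 10.10,10.12,10.08,10.09,500",
]
per_day = count_points_per_day(sample)
```

Because the function accepts any iterable of lines, an open file object can be passed directly, which keeps memory use flat regardless of file size.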

As you will see, these files are large. I need someone who is an expert in very high speed text parsing and data analysis. I'm not at all concerned about which programming language the developer chooses, but I would imagine one of the Microsoft programming languages would be easier, given that the summary statistics output must be in Excel 2007.

I would like the application to perform analysis and develop summary statistics for each stock. The summary statistics should include:

Date of First Data point

Date of Last data point

Number of data points per day (shown as a time series Excel graph for each stock)

The dates of the days on which there are no data points (weekends should not be included)
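The missing-day statistic can be sketched as a walk over the calendar between the first and last data point. This is only an illustration in Python: the function name `missing_trading_days`, the sample dates, and the holiday set are assumptions, with the holiday set standing in for the user-supplied list of non-trading days.

```python
from datetime import date, timedelta


def missing_trading_days(days_with_data, start, end, holidays=frozenset()):
    """Return weekdays in [start, end] that have no data and are not holidays."""
    present = set(days_with_data)
    missing = []
    day = start
    while day <= end:
        # weekday() is 0 for Monday through 6 for Sunday; 5 and 6 are the weekend.
        if day.weekday() < 5 and day not in holidays and day not in present:
            missing.append(day)
        day += timedelta(days=1)
    return missing


# Hypothetical input: three days with data and one excluded non-trading day.
present = [date(2010, 2, 18), date(2010, 2, 19), date(2010, 2, 23)]
holidays = {date(2010, 2, 15)}
gaps = missing_trading_days(present, date(2010, 2, 15), date(2010, 2, 23), holidays)
```

In this sample, 20 and 21 February 2010 are skipped as a weekend and 15 February as a listed non-trading day, leaving 16, 17 and 22 February reported as missing.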

The output workbook will be composed of three main areas:

1) A summary sheet which lists all the stocks in the folder to be analysed, with their start dates, end dates, number of missing days, and the days on which they are missing entries.

2) A second worksheet that contains a list of days that are not to be included in the analysis (i.e. non-trading days other than weekends)

3) The full analysis sheets for each stock file, each including a time series graph of the number of data points per day and a list of the days for which there are no entries.

That means that the final output will be an Excel workbook with a large number of worksheets, one for each stock and two additional ones as described in points 1 and 2 above.

The application must also ask me for the period over which I wish to perform the analysis. Ie.

I will be using the application to analyze all of the files sequentially, so here's the challenge: the application MUST run through all 600 of the text files in less than 18 hours on a Dual Core Pentium T2050 laptop with 2 GB of RAM running Windows XP.

To ensure compliance, I will provide the winning bidder(s) with 50 text files, and the run-through must be completed in less than one and a half hours on a system of similar specification to mine. The winning bidders must ensure that the final application can do this before submitting the work for payment.
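Both limits work out to the same per-file budget, which the short Python calculation below makes explicit. The function name is illustrative, and the lines-per-second figure assumes the three million entries per file stated elsewhere in this posting.

```python
def per_file_budget_seconds(total_hours, file_count):
    """Average time allowed per file, in seconds."""
    return total_hours * 3600 / file_count


full_run = per_file_budget_seconds(18, 600)     # full batch of 600 files
acceptance = per_file_budget_seconds(1.5, 50)   # 50-file acceptance test

# Required sustained rate, assuming about three million entries per file.
lines_per_second = 3_000_000 / full_run
```

Each file therefore has to be read, checked and summarised in 108 seconds on average, or roughly 28,000 lines per second.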

In responses, I would very much appreciate an outline of how you intend to proceed. Some form of evidence that you have experience with high-speed parsing of large files would be a distinct advantage. This project should be easy money for the right person. I would not expect the application to take more than two days to develop following bidder selection.

Please note that the files are much too large to import into Excel (any version); therefore the summary statistics will need to be calculated outside of Excel.


An enormous thanks to everyone for submitting their bids.

There is quite a lot of work for me to get through answering all of the various questions that have arisen.

I intend to do all this on Sunday and make my final selection then.

Kindest Regards,

One other thing I forgot to mention is that the program must be able to remove duplicate entries. As an example, in the newly added attachment you will see that from 02/19/2010 onwards there are duplicate lines for each time frame.

The program must output a new 'cleansed' file without these duplicate entries in a folder designated by the user.
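One way to produce the cleansed file without holding a whole file in memory is sketched below in Python. It rests on two assumptions not confirmed by the posting: that records are in chronological order, so duplicates always share a timestamp with their neighbours, and that a duplicate means a byte-for-byte identical line. The function name `write_cleansed`, the file names and the sample records are all hypothetical.

```python
import os
import tempfile


def write_cleansed(src_path, out_dir):
    """Copy src_path into out_dir, dropping repeated lines within a timestamp."""
    os.makedirs(out_dir, exist_ok=True)
    dst_path = os.path.join(out_dir, os.path.basename(src_path))
    current_stamp = None
    seen = set()      # lines already written for the current timestamp only
    dropped = 0
    # Binary mode avoids decoding cost and preserves the original line endings.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:
            stamp = line[:16]  # assumed layout: b"mm/dd/yyyy hh:mm"
            if stamp != current_stamp:
                current_stamp = stamp
                seen.clear()
            key = line.rstrip(b"\r\n")
            if key in seen:
                dropped += 1
                continue
            seen.add(key)
            dst.write(line)
    return dst_path, dropped


# Demonstration on a small made-up file with one duplicated record.
with tempfile.TemporaryDirectory() as tmp:
    src_file = os.path.join(tmp, "SAMPLE.txt")
    with open(src_file, "wb") as f:
        f.write(b"02/18/2010 09:30 10.00,10.05,9.98,10.02,1200\n"
                b"02/19/2010 09:30 10.10,10.12,10.08,10.09,500\n"
                b"02/19/2010 09:30 10.10,10.12,10.08,10.09,500\n"
                b"02/19/2010 09:31 10.09,10.11,10.07,10.10,300\n")
    out_path, dropped_count = write_cleansed(src_file, os.path.join(tmp, "cleansed"))
    with open(out_path, "rb") as f:
        cleaned = f.read()
```

If the real files turn out not to be sorted, the per-timestamp set would have to be replaced by a set covering the whole file, at a much higher memory cost.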


Thank you all very much for your bids.

However, I cannot possibly reiterate the following enough:

The data checking cannot be done in Excel (any version). The sample files I have provided are representative of the actual files.

Please assume that the actual files have a minimum of three million entries each, and that there are 600 such files, which must be processed in a batch with no user intervention from the moment of pressing the 'GO' button.

As a result, I have considerably extended the bidding time by three weeks.

Skills: C# Programming, C++ Programming, Excel, Financial Markets, Visual Basic

About the Employer:
(12 reviews) London, United Kingdom

Project ID: #700061



I have the skills and technical sophistication necessary to do this project. The bulk of the analysis can be performed in a standalone application written in C++, with the user interface implemented in Excel or perhap...

Bid: 249
(4 reviews)

53 freelancers are bidding an average of $226 for this job


I am an expert in text/ASCII file parsing with .NET and can start a detailed discussion with you in inbox/PMB. Send me sample files ASAP. Here is my small introduction: I am a GOLD member of this site, I am a Micro...

Bid: 750
(156 reviews)

I'm experienced with high performance processing solutions and I'd be glad to help you. Please refer to your PMB for further discussion.

Bid: 600
(9 reviews)

Hi, this can be done perfectly. Thanks, Al

Bid: 250
(22 reviews)

Over 10 years Excel/VBA experience.

Bid: 200
(34 reviews)

We have both passion & expertise to do the job well within specified time & budget.

Bid: 251
(11 reviews)

Please check PMB.

Bid: 200
(39 reviews)

Hi, I can help you with this if you like.

Bid: 300
(26 reviews)

Hi, I won't bore you with details of my expertise and qualifications. Please refer to PMB so that you can transfer the files and I can have a look.

Bid: 250
(14 reviews)

Exceptional quality & Quick work. Please refer to Profile & PMB for details. Thanks FAZ

Bid in USD (amount not shown), delivery in 1 day
(17 reviews)

Hi, ready for taking it up.

Bid: 450
(16 reviews)

Hi, Please check PMB. Regards

Bid: 250
(7 reviews)

Please check your PMB. Thanks

Bid: 200
(4 reviews)

Hello, I would like to do this project. Please see PMB. Best regards, CDumitru

Bid: 120
(5 reviews)

Ready to do this work for you.

Bid: 190
(8 reviews)

Please see the PM

Bid: 210
(23 reviews)

Please check PMB.

Bid: 250
(3 reviews)

I have just done a project involving a lot of text manipulation. I can do this project easily. 5 days maximum.

Bid: 200
(1 review)

Hi, as you may see from my profile, I am an expert in the fields of statistics, applied math and finance. I would like to do the project with MATLAB, as a standalone application or as an Excel add-in which is developed in MATLAB b...

Bid: 200
(1 review)

Hi, I am currently working on a similar kind of project at an investment bank, which reads tick data from different markets (like EbsLive, Reuters), both historical and real-time, and parses it and stores t...

Bid: 250
(0 reviews)

I am very familiar with MS Office and statistical software packages such as SPSS, MINITAB, SAS, GAUSS, EVIEWS, R and SPECTRUM, as well as database software and mathematical software such as Maple, MATLAB and Mathematica. Otherwise I am...

Bid: 100
(0 reviews)