I need to run the Levenshtein distance algorithm (or a trigram similarity formula, if that works better) to compare two sets of data within a single Excel workbook. I believe this is doable in VBA, and I have included a link to a site that provides the VBA code.
[url removed, login to view]
Each input would consist of a few data fields (columns) and potentially hundreds or thousands of records (rows). The fields being compared are names and dates, potentially in differing formats. For example, data set 1 might show Michael Jason Fox, 2/6/2017, Smith, 6/22/1988, while data set 2 might show Mike J Fox, 02/06/17, SMITH B, 6/22/88. I think the most efficient way to prep the data sets for the formula would be to standardize the format of the values and concatenate them into one string per record. I think we need to concatenate them because records that are very similar or identical across the two sets likely won't be in the same row.
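To illustrate the prep step described above, here is a minimal sketch in Python (the same logic could be ported to VBA). It assumes all dates are month/day/year with either a 2-digit or 4-digit year, and that each record has the four fields from the example. The helper names `normalize_name`, `normalize_date`, `record_key`, and `levenshtein` are hypothetical and only for illustration.

```python
import re
from datetime import datetime


def normalize_name(text):
    """Upper-case a name, drop punctuation and digits, collapse whitespace."""
    letters_only = re.sub(r"[^A-Za-z ]+", " ", text)
    return " ".join(letters_only.upper().split())


def normalize_date(text):
    """Return an ISO date string (YYYY-MM-DD).

    Assumption: input is month/day/year with a 4-digit or 2-digit year.
    """
    for fmt in ("%m/%d/%Y", "%m/%d/%y"):
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {text!r}")


def record_key(name1, date1, name2, date2):
    """Concatenate the normalized fields into one comparable string."""
    return "|".join([
        normalize_name(name1),
        normalize_date(date1),
        normalize_name(name2),
        normalize_date(date2),
    ])


def levenshtein(a, b):
    """Classic dynamic-programming edit distance, keeping two rows in memory."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution
        prev = cur
    return prev[-1]


key_a = record_key("Michael Jason Fox", "2/6/2017", "Smith", "6/22/1988")
key_b = record_key("Mike J Fox", "02/06/17", "SMITH B", "6/22/88")
print(key_a)  # MICHAEL JASON FOX|2017-02-06|SMITH|1988-06-22
print(levenshtein(key_a, key_b))
```

After normalization the two date formats become identical strings, so the remaining edit distance comes only from the name differences.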
I have a limited understanding of the programming behind this. I need to compare two sets of data with similar fields and identify when a record from set A is missing from set B. Set B might contain similar records, e.g. the name matches but the date is 1, 2, or 5 days off. I need to be able to tell three cases apart: (1) the record in set B is identical or similar for the names and for the second date in the string but differs slightly for the first date; (2) there is no record in set B similar to the one in set A because no name is even close; and (3) there is a similar name match, but the second date isn't similar at all.
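One way to separate these cases is to compare names and dates individually rather than as a single concatenated string. The Python sketch below is only an illustration under several assumptions: names are already normalized and joined into one string, dates are already parsed, "similar name" means a Levenshtein-based similarity of at least 0.5, the second date must match exactly after normalization, and "slightly off" means within 5 days. The `Record` type, the thresholds, and the labels are all hypothetical.

```python
from datetime import date
from typing import NamedTuple


class Record(NamedTuple):
    names: str         # normalized, concatenated name fields
    first_date: date
    second_date: date


def levenshtein(a, b):
    """Edit distance via dynamic programming (two rows)."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def name_similarity(a, b):
    """Scale edit distance to 0..1, where 1.0 means identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def classify(rec, set_b, name_threshold=0.5, max_day_gap=5):
    """Label a set A record by how it relates to the records in set B."""
    candidates = [b for b in set_b
                  if name_similarity(rec.names, b.names) >= name_threshold]
    if not candidates:
        return "no similar name"
    same_second = [b for b in candidates if b.second_date == rec.second_date]
    if not same_second:
        return "name similar, second date differs"
    gap = min(abs((b.first_date - rec.first_date).days) for b in same_second)
    if gap == 0:
        return "match"
    if gap <= max_day_gap:
        return "first date slightly off"
    return "first date far off"


set_b = [
    Record("MIKE J FOX SMITH B", date(2017, 2, 8), date(1988, 6, 22)),
    Record("ANNA MARIA LOPEZ GARCIA", date(2016, 5, 1), date(1990, 1, 15)),
]
set_a = [
    Record("MICHAEL JASON FOX SMITH", date(2017, 2, 6), date(1988, 6, 22)),
    Record("JANE DOE", date(2015, 1, 1), date(1980, 1, 1)),
    Record("MIKE J FOX SMITH B", date(2017, 2, 8), date(1975, 3, 3)),
]
labels = [classify(rec, set_b) for rec in set_a]
print(labels)
```

The thresholds would need tuning against real data; a stricter `name_threshold` reduces false matches but may miss nicknames such as Mike for Michael.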
22 freelancers are bidding on average $355 for this job
Hello, if you hire me, I will finish your project perfectly. Relevant Skills and Experience: Data Processing, Excel, VB.NET, Visual Basic, Visual Basic for Apps. Proposed Milestones: $144 USD - fee $300 USD - 1
I can build a tool in Excel VBA or Python that checks the results against the data in the .xlsx file. Cheers! Stay tuned, I'm still working on this proposal.