synapse pyspark delta lake merge scd type2 without primary key
Paid on delivery
I am looking for a skilled professional who can help me with a project titled "synapse pyspark delta lake merge scd type2 without primary key". The ideal candidate should have experience and expertise in the following areas:
- The desired outcome of the merge process is to update existing records and insert new records.
- The level of data quality required for the outcome is high integrity, with no duplicates and full accuracy.
Handling Historical Data:
- There is a specific requirement to keep track of historical changes to the data.
Skills and Experience:
- Proficiency in Synapse, PySpark, and Delta Lake
- Experience with SCD Type 2 implementation
- Strong understanding of data integrity and accuracy
- Ability to handle historical data changes
**Problem:** I have a set of rows coming from a previous process that has no primary key. The candidate composite-key columns are liable to change, which makes them unsuitable as a composite key; the only thing that makes a row unique is the whole row (all keys and all values). I need to implement SCD Type 2 on this data. The environment is Synapse PySpark, using the Delta Lake MERGE command among other features.
What I tried: using a row hash. Without a primary or composite key, the challenge is identifying which rows have been updated. Any changed value changes the row hash, so an updated row shows up as a brand-new row.
Please suggest how this problem can be solved. If you have any questions, please write back.
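To make the row-hash difficulty concrete, here is a minimal plain-Python sketch of what SCD Type 2 reduces to when the full-row hash is the only identity. It is an illustration under stated assumptions, not a Synapse or Delta Lake implementation: the names `row_hash`, `scd2_merge`, `valid_from`, `valid_to`, and `is_current` are hypothetical, and each load is assumed to be a full snapshot of the source. Under those assumptions, a hash that appears in the source but not among the current target rows is inserted as a new version, and a current hash that no longer appears is closed out. An "update" is therefore recorded as one expiry plus one insert, and the two versions cannot be linked without some stable attribute.

```python
import hashlib
from datetime import date


def row_hash(row: dict) -> str:
    """Hash the whole row; keys are sorted so column order does not matter.

    Simplification: values are stringified, so None and the text "None" collide.
    """
    payload = "\x1f".join(f"{k}={row[k]}" for k in sorted(row))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def scd2_merge(target: list, source_rows: list, load_date: date) -> list:
    """Return a new target list after applying one full-snapshot load.

    Assumption: source_rows is a complete snapshot, so a current target row
    whose hash is missing from the source is treated as changed or deleted.
    """
    # Keying by hash also removes exact duplicate rows from the source.
    incoming = {row_hash(r): r for r in source_rows}
    current_hashes = {t["row_hash"] for t in target if t["is_current"]}

    merged = []
    for t in target:
        t = dict(t)  # copy so the caller's rows are not mutated
        if t["is_current"] and t["row_hash"] not in incoming:
            # Close out a version that no longer exists in the source.
            t["is_current"] = False
            t["valid_to"] = load_date
        merged.append(t)

    for h, r in incoming.items():
        if h not in current_hashes:
            # Unseen hash: open a new current version.
            merged.append({**r, "row_hash": h, "valid_from": load_date,
                           "valid_to": None, "is_current": True})
    return merged


# Two loads: the second changes city "y" to "z" for the row named "b".
day1 = scd2_merge([], [{"name": "a", "city": "x"},
                       {"name": "b", "city": "y"}], date(2024, 1, 1))
day2 = scd2_merge(day1, [{"name": "a", "city": "x"},
                         {"name": "b", "city": "z"}], date(2024, 1, 2))
```

After the second load the target holds three rows: the unchanged row, the expired `city = "y"` version, and a new current `city = "z"` version. In Delta Lake terms this would correspond to a MERGE keyed on the hash column; if the incoming data is incremental rather than a full snapshot, the expiry step would not be safe as written.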
If you have the skills and experience mentioned above, please submit your proposal for this project.
Project ID: #37527252
2 freelancers are bidding an average of ₹27500 for this job
I've meticulously reviewed the job description and am confident in my alignment with the requirements. With a commitment to quality, I assure timely completion within the project deadline.