Search online for a reasonably large dataset and download it. Define your own problem based on the dataset and solve it using your knowledge of Apache Spark (PySpark).
Prepare a final report that includes 1) motivation, 2) design, and 3) relevant source code and screenshots. Also explain any difficulties you encountered and how you resolved them.