PySpark DataFrame problem

  Given restaurant inspection datasets for 3 consecutive years (dataset2016, dataset2017, and dataset2018, as uploaded on Canvas), please do the following. Convert the files into txt or any other file type. Get familiar with your data (create a DataFrame and schema), name the columns, merge the data together, and show descriptive statistics for the numerical columns. Find the pairwise correlations of the numerical columns. Check for duplicates and fill missing observations with the column means.