In this individual project, I applied both classification and regression approaches to predict the processing time of BOS:311 service requests. The analysis was first written in Python with pandas in a Jupyter Notebook, then rewritten in PySpark in a Databricks notebook as practice.
The complete story is described in the PDF file. The code implementation can be found in the three ipynb notebooks. If the notebooks won't load, paste each notebook's link into https://nbviewer.jupyter.org/.
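As a rough illustration of how one processing-time quantity can feed both a regression and a classification model, here is a minimal sketch using only the Python standard library. The timestamp format, the function names, and the class thresholds (one day, one week) are assumptions made for this example and are not taken from the notebooks.

```python
from datetime import datetime

# Assumed timestamp format; the real dataset may use a different one.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def processing_hours(opened: str, closed: str) -> float:
    """Regression target: hours elapsed between opening and closing a request."""
    delta = datetime.strptime(closed, TIME_FORMAT) - datetime.strptime(opened, TIME_FORMAT)
    return delta.total_seconds() / 3600.0


def duration_class(hours: float) -> str:
    """Classification target: bucket the duration using hypothetical thresholds."""
    if hours <= 24:
        return "within_1_day"
    if hours <= 24 * 7:
        return "within_1_week"
    return "over_1_week"


# Example request that was opened and then closed two days later.
hours = processing_hours("2019-01-01 08:00:00", "2019-01-03 08:00:00")
label = duration_class(hours)
print(hours, label)
```

The continuous value would be the label for a regression model, while the bucketed value would be the label for a classifier; the actual target definitions used in the project are described in the PDF and notebooks.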
Data source: