Everybody Wins

Team Members

Project Abstract

Our task was to create a machine learning model for Jacksons Food Stores that could answer a variety of questions and predict sales from data about individual franchises. We were provided transaction data from April 2021 to October 2021 from 54 Jacksons franchises located mainly in the Western region. Using tools from the Python ecosystem, namely the Pandas and Scikit-learn libraries, we cleaned and manipulated the data into a state that allowed us to train various machine learning models. These models included Random Forest, which combines the predictions of many decision trees; Facebook's Prophet algorithm, which uses a decomposable model focusing on seasonality; and a feedforward neural network, a model commonly used for regression and time-series forecasting.

Project Description

What We Built

The largest portion of our product was information. Jacksons Food Stores was looking for insight into how various aspects of the Powerball and Mega Millions lotteries affected store sales. We provided Jacksons with several Jupyter Notebooks, web-based interactive computational environments that allow users to combine code, text, and visuals in a single document. These notebooks give an in-depth look at how our team arrived at its conclusions. Additionally, we provided Jacksons with a Python application specifically tailored to help them predict future sales. This application allows Jacksons to supply any new data set and receive a prediction for the following month of sales. Alternatively, they can provide a second set of data to validate the accuracy of the model trained on the input data. In our testing, the model predicted an entire month's sales for over 50 stores with a margin of error of +/-10%.

How It Works

The application works by reading in CSVs that have been curated by Jacksons' Business Intelligence department. The CSVs are converted into objects called DataFrames from Python's Pandas library. These dataframes allow us to load huge tables of information into memory and provide tools to filter, reshape, and transform the columns as needed. With a dataframe created from the CSVs in the input data folder, we train a Random Forest Regression model. If run in prediction mode, the application generates an appropriately shaped table for the month following the input data. It then makes predictions based on this generated table and outputs a CSV detailing the Total Sales predictions. If run in validation mode, it creates an additional dataframe from the data provided in the test folder and uses it as a validation set. The trained model makes predictions on this validation set and outputs various graphs and evaluation metrics detailing its performance.
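The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the application's actual code: the column names (store_id, date, total_sales), the two calendar features, the model settings, and the helper function names are all assumptions made for the example, and the demo data is synthetic.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_percentage_error

# Hypothetical column names; the real CSV schema is not given in this document.
FEATURES = ["store_id", "month", "day_of_week"]
TARGET = "total_sales"


def load_folder(folder):
    """Read every CSV in a folder into a single DataFrame."""
    paths = sorted(Path(folder).glob("*.csv"))
    frames = [pd.read_csv(p, parse_dates=["date"]) for p in paths]
    return pd.concat(frames, ignore_index=True)


def add_features(df):
    """Derive simple calendar features from the date column."""
    out = df.copy()
    out["month"] = out["date"].dt.month
    out["day_of_week"] = out["date"].dt.dayofweek
    return out


def train(df):
    """Fit a Random Forest regressor on the feature columns."""
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(df[FEATURES], df[TARGET])
    return model


def next_month_frame(df):
    """Prediction mode: one row per store per day of the following month."""
    start = (df["date"].max() + pd.offsets.MonthBegin(1)).normalize()
    days = pd.date_range(start, start + pd.offsets.MonthEnd(1), freq="D")
    grid = pd.MultiIndex.from_product(
        [df["store_id"].unique(), days], names=["store_id", "date"]
    )
    return add_features(grid.to_frame(index=False))


def validate(model, test_df):
    """Validation mode: mean absolute percentage error on held-out data."""
    preds = model.predict(test_df[FEATURES])
    return mean_absolute_percentage_error(test_df[TARGET], preds)


# Demo with synthetic data standing in for the curated input CSVs.
with tempfile.TemporaryDirectory() as tmp:
    dates = pd.date_range("2021-09-01", "2021-10-31", freq="D")
    rng = np.random.default_rng(0)
    for store in (1, 2):
        sales = 1000 * store + rng.normal(0, 50, len(dates))
        frame = pd.DataFrame(
            {"store_id": store, "date": dates, "total_sales": sales}
        )
        frame.to_csv(Path(tmp) / f"store_{store}.csv", index=False)
    history = add_features(load_folder(tmp))

model = train(history)
future = next_month_frame(history)
future["predicted_total_sales"] = model.predict(future[FEATURES])
csv_text = future.to_csv(index=False)  # the application would write this to a file
```

In this sketch, prediction mode corresponds to next_month_frame followed by model.predict, and validation mode corresponds to calling validate on a dataframe loaded from a separate test folder. The real application also produces graphs, which are omitted here.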