Sales Analytics & Statistics using Python: An Introduction

An appreciation of basic statistics, coupled with the capabilities that data science offers businesses, is fundamental to achieving sales growth

Posted by baroude ntsiba on October 02, 2017

As an enthusiastic data scientist, I understand the need for companies to optimise their sales operations. An appreciation of basic statistics, coupled with the capabilities that data science offers businesses, is therefore fundamental to achieving sales growth across industry sectors, including retail and marketing.

In this article, I will give you a basic understanding of how a typical data science problem can be tackled using Python.

The dataset used in this article can be sourced from the Tableau website and is located here.

The software used for this mini project includes Python with Pandas, NumPy and Matplotlib.

Let’s start by looking at the dataset statistics. These statistics give us the shape of the data and a basic understanding of the data at hand.
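As a rough sketch of this step, the snippet below builds a tiny stand-in DataFrame and calls `shape` and `describe()` on it. The column names (`Unit Price`, `Shipping Cost`, `Sales`) and the values are assumptions made for illustration only; they are not the real Superstore records, and with the real data you would load the file with something like `pd.read_excel` or `pd.read_csv` instead.

```python
import pandas as pd

# Toy stand-in for the Superstore data. Column names and values are
# illustrative assumptions, not the real dataset.
df = pd.DataFrame({
    "Unit Price": [0.99, 20.0, 150.0, 35.5],
    "Shipping Cost": [0.49, 5.0, 45.0, 8.5],
    "Sales": [2.24, 180.0, 4200.0, 310.0],
})

# Shape of the data: (number of rows, number of columns)
print(df.shape)

# count, mean, std, min, quartiles and max for every numeric column
summary = df.describe()
print(summary.round(2))
```

The `min`, `max` and `mean` rows of the summary are the ones read off in the interpretation that follows.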

Here is how this result can be interpreted.

      Unit price, shipping cost and sales
    • The average price for a product is $89
    • The most expensive product costs $6783
    • The cheapest item costs $0.99
    • The shipping cost ranges from $0.49 to $164
    • The average shipping cost is $12
    • The sales range between $2.24 and $80961
    • The average value per sale is $1776

Looking at these figures, one can start to form assumptions about which areas of the business are potentially expensive and need to be addressed, particularly with regard to the shipping mode.

Shipping mode

Let's find out how many different types of shipping mode were used.

Using the describe() function, we can see that there are three types of shipping mode and that Regular Air is the company’s preferred shipping mode, as shown above. We can also identify the remaining two shipping modes.
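A minimal sketch of this describe() call on a text column is given below. The column name `Ship Mode` and the five sample rows are assumptions for illustration; on the real dataset the counts will differ.

```python
import pandas as pd

# Hypothetical sample of the shipping mode column (illustrative values only)
ship_mode = pd.Series(
    ["Regular Air", "Delivery Truck", "Regular Air", "Express Air", "Regular Air"],
    name="Ship Mode",
)

# For a text column, describe() reports:
#   count  - number of non-missing rows
#   unique - number of distinct values
#   top    - the most frequent value
#   freq   - how often the most frequent value occurs
mode_summary = ship_mode.describe()
print(mode_summary)
```

Here `unique` answers "how many shipping modes are there?" and `top` names the most frequently used one.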

Delivery Truck and Express Air are the two other shipping modes used by the superstore, as shown below.
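One possible way to list every shipping mode, and how often each one is used, is sketched below with `unique()` and `value_counts()`. As before, the `Ship Mode` name and the sample rows are assumptions for illustration rather than the real data.

```python
import pandas as pd

# Hypothetical sample of the shipping mode column (illustrative values only)
ship_mode = pd.Series(
    ["Regular Air", "Delivery Truck", "Regular Air", "Express Air", "Regular Air"],
    name="Ship Mode",
)

# Distinct shipping modes, sorted alphabetically for a stable display
modes = sorted(ship_mode.unique())
print(modes)

# Number of orders per shipping mode, most frequent first
counts = ship_mode.value_counts()
print(counts)
```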

Let's Look for Trends

Now that the shipping mode details have been identified, some analysis can be undertaken to identify possible trends in support of the shipping department. However, that will be the subject of a future post. In the meantime, I will focus on analysing sales trends.

Sales trends are amongst the most important indicators of business operations. Correctly identifying trends is beneficial to all departments across an organisation.

Let's find out about the superstore's sales trends.

I agree that it looks a bit confusing at times, although one can see the sales patterns from 2009 to 2012.

I am going to clarify this using a metric that statisticians know well and that many machine learning algorithms are based upon, namely the average.
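The averaging idea can be sketched as follows: group the orders by calendar month (or by year) and take the mean of the sales in each group. The column names `Order Date` and `Sales`, and the four sample orders, are assumptions for illustration and not the figures from the real dataset.

```python
import pandas as pd

# Hypothetical orders (illustrative values only)
orders = pd.DataFrame({
    "Order Date": pd.to_datetime(
        ["2009-01-15", "2009-01-20", "2009-02-03", "2010-01-10"]
    ),
    "Sales": [100.0, 300.0, 50.0, 400.0],
})

# Average sale value per calendar month
monthly_avg = orders.groupby(orders["Order Date"].dt.to_period("M"))["Sales"].mean()
print(monthly_avg)

# Average sale value per year, a coarser view of the same trend
yearly_avg = orders.groupby(orders["Order Date"].dt.year)["Sales"].mean()
print(yearly_avg)
```

Averaging smooths out the order-to-order noise, so the resulting series is much easier to read than the raw sales. With Matplotlib installed, `monthly_avg.plot()` would draw it as a line chart.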

Identifying trends like this gives managers the opportunity to react swiftly and efficiently to prevent poor sales.


Pandas, NumPy and Matplotlib are powerful tools that can help small and medium-sized businesses narrow the technological gap with very large companies.

If you are an SME and want to grow your business, I am here to help.

Stay tuned if you want to find out more about how you can solve business or personal problems. Please contact me here.

As usual, I'll be posting the notebook on my GitHub repository.
