Data Analysis

Finding Big Square’s Next Outlet Location

The City of Nairobi has 1,695 restaurants. Compared to major cities in the world, Nairobi is underserved by restaurants (see chart below). The disparity in per capita income, which stands at $1,081 in Nairobi and $54,373 in New York City, explains the difference in the number of restaurants per person. However, an assessment of M-PESA’s role in the economic lives of Kenyans revealed that access to M-PESA had increased consumption levels, enabling up to 2 percent of Kenyan households to move out of poverty.

Economic mobility coupled with aspirational behavior creates the perfect mindset for indulging in conspicuous consumption. Java House Africa has tapped into this economic behavior and set up 55 outlets within the city of Nairobi. Other food businesses are fast learning from Java House’s success and are embracing the same business model. One such company is Big Square, a casual dining restaurant serving burgers, fried chicken, BBQ ribs and accompaniments.

With 9 outlets opened in the last 5 years, the restaurant chain is on a growth path, with two new outlets rolled out in Shell petrol stations. This is an aggressive expansion approach given that Shell petrol stations were a preserve of Java House outlets. As with all brick-and-mortar businesses, location is key. So, where should Big Square open its next outlet to maximize profits?

Finding Home
There is an interesting theory known as the Central Place Theory that explains the conditions necessary to create malls. These are the Threshold, the minimum population and income required to sustain a market, and the Range, the maximum distance consumers are prepared to travel to acquire goods. Given these two factors, Big Square’s assured market was in Karen, where they opened their first outlet in 2012. A similar approach was used to set up the next outlet in Lavington the following year. The diagram below shows the current dispersion of their outlets.

The Magic Sauce
Using the Central Place Theory, we develop a methodology that gauges the interest in and success of a given location. The underlying assumption is that when a new outlet is set up and becomes profitable, the general area becomes of greater interest to the company, which then sets up another outlet close by to serve more customers.

To achieve this objective, we rely on a construction known as a Voronoi diagram, which partitions a region into areas of influence as shown in the diagram below. Initially, we have one region centered on Karen. When the next outlet in Lavington is opened, the Karen region has to be split to give territory to Lavington. The same process is repeated when other outlets are opened, i.e. Gigiri is split from Lavington, and Oval is split from both Gigiri and Lavington.
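The splitting can be sketched in a few lines of Python. A point belongs to the Voronoi region of whichever outlet is nearest to it, so assigning points to their nearest outlet reproduces the regions. The coordinates below are made-up placeholders on an arbitrary grid, not real outlet locations, and `nearest_outlet` is a hypothetical helper name.

```python
import math

# Hypothetical (x, y) positions on an arbitrary grid; NOT real coordinates.
outlets = {
    "Karen": (0.0, 0.0),
    "Lavington": (6.0, 4.0),
    "Gigiri": (9.0, 9.0),
}

def nearest_outlet(point, sites):
    """Return the name of the site closest to `point` (its Voronoi region)."""
    return min(sites, key=lambda name: math.dist(point, sites[name]))

# With only Karen open, every point falls in the Karen region.
first = {"Karen": outlets["Karen"]}
customer = (5.0, 3.0)
region_before = nearest_outlet(customer, first)

# Once more outlets open, the same point is reassigned to the nearest one.
region_after = nearest_outlet(customer, outlets)
print(region_before, "->", region_after)
```

A region "gives up territory" exactly when some of its points become closer to a newly opened outlet than to the original one.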


The interlocking regions form a Steiner Minimal Tree, which visually shows us where areas of interest are emerging. However, we are interested in a probability score for the viability of an area. To that end, we track which region was split to form another and produce the dataset shown below. A ‘1’ indicates a split and a ‘0’ no split.
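As a rough illustration of how such a 0/1 table can be built, the sketch below encodes only the four splits named in the text (Lavington from Karen, Gigiri from Lavington, Oval from Gigiri and Lavington). The row and column layout is our assumption, and the remaining outlets are left out because their split history is not listed here.

```python
# Split history as described in the text: new region -> regions it was carved from.
splits = {
    "Lavington": ["Karen"],
    "Gigiri": ["Lavington"],
    "Oval": ["Gigiri", "Lavington"],
}
regions = ["Karen", "Lavington", "Gigiri", "Oval"]

# matrix[new][parent] is 1 if opening `new` split `parent`, else 0.
matrix = {
    new: {parent: int(parent in splits.get(new, [])) for parent in regions}
    for new in regions
}

# One row per new region, one column per region that could have been split.
for new in regions:
    print(new.ljust(10), [matrix[new][p] for p in regions])

# How many times each region gave up territory.
times_split = {p: sum(matrix[new][p] for new in regions) for p in regions}
print(times_split)
```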

Given that we have a numeric representation of the hierarchical relationships in the Voronoi diagram, we can apply quantitative methods to the data. The method that fits this problem is Bayesian inference, a concept anchored on conditional probability, which evaluates the probability of an event happening given that a previous event has happened. In our scenario, that means evaluating the probability of an outlet being opened given that another one has been opened in the same region. The diagram below shows the prior and posterior probabilities.

We compute the prior probability by calculating the probability of setting up a store in an area. Then we use conditional probability to calculate the posterior probability, which indicates the probability of setting up an outlet given that another has been set up in the same region. An amazing output emerges: the Oval region has a posterior probability of 1, which means it is certain that a new outlet set up here will be profitable. Also worth noting are the high posterior probabilities of the Karen and Lavington regions. But this tells us only half the story.
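The exact computation is not spelled out here, so the following is only a minimal sketch of how a prior and a conditional probability can be estimated from 0/1 records. The records, the field names `opened` and `had_store`, and the idea of one record per region and period are all assumptions made for illustration, not the data behind the diagram.

```python
# Hypothetical 0/1 records, one per (region, period); NOT the article's data.
# opened    = 1 if a new outlet opened in the region during the period
# had_store = 1 if the region already contained an outlet beforehand
records = [
    {"opened": 1, "had_store": 1},
    {"opened": 1, "had_store": 1},
    {"opened": 0, "had_store": 1},
    {"opened": 1, "had_store": 0},
    {"opened": 0, "had_store": 0},
    {"opened": 0, "had_store": 0},
    {"opened": 0, "had_store": 0},
    {"opened": 0, "had_store": 0},
]

n = len(records)
p_opened = sum(r["opened"] for r in records) / n                   # prior P(A)
p_had = sum(r["had_store"] for r in records) / n                   # P(B)
p_both = sum(r["opened"] and r["had_store"] for r in records) / n  # P(A and B)

# Conditional probability P(A | B) = P(A and B) / P(B).
posterior = p_both / p_had
print(p_opened, posterior)
```

In this toy data the posterior is higher than the prior, which is the pattern the method looks for: an opening is more likely where an outlet already exists.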

The Other Sauce
To get a different perspective, we use positional probability, a concept in statistical mechanics that measures the randomness of molecular structures in substances. In our case, positional probability will measure “randomness” by the different types of restaurants that exist in each region. To that end, we take restaurant distribution data and map each restaurant to its respective region.

Positional probability relies on entropy, which is computed from the natural logarithm of the probability of each event: the negative sum, over all events, of each probability times its logarithm. The events in our data are the types of restaurant, e.g. Mexican, Indian, Coffee House, Fine Dining, et cetera. A high entropy means there are many different types of restaurants in an area; a low entropy means the restaurants are mostly of the same type. We seek an area of high entropy because it tells us that people who frequent the area have diverse tastes, hence a new speciality restaurant like Big Square can do well there.
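To make the entropy idea concrete, here is a small sketch using the standard Shannon entropy with natural logarithms. The restaurant lists are invented for illustration, and `shannon_entropy` is a hypothetical helper name.

```python
import math
from collections import Counter

def shannon_entropy(labels):
    """Entropy in nats: -sum(p * ln p) over the share p of each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

# Hypothetical restaurant types per region; NOT real survey data.
mixed_region = ["Mexican", "Indian", "Coffee House", "Fine Dining"]
uniform_region = ["Coffee House"] * 4

h_mixed = shannon_entropy(mixed_region)      # four equally common types
h_uniform = shannon_entropy(uniform_region)  # a single type
print(h_mixed > h_uniform)
```

A region with four equally common types reaches the maximum entropy for four categories (ln 4), while a region with a single type has zero entropy.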

The positional probability is completed by looking at how much space is available for a given configuration. In our scenario, the configuration is captured by entropy. So, we calculate the area of a region and multiply it by the entropy to get our positional probability. A higher value tells us there is high diversity and more space for that diversity to thrive. An area might have high diversity (entropy) but less space (high competition) in which to set up a new entity, e.g. the CBD.
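A minimal sketch of the area-times-entropy step follows. It uses the shoelace formula for the area of a polygon, which is one common choice and not necessarily how the areas were obtained here. The outlines, entropy values, and region names are placeholders.

```python
def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Hypothetical region outlines (arbitrary units) and entropies; NOT real data.
regions = {
    "large_suburb": {"outline": [(0, 0), (6, 0), (6, 5), (0, 5)], "entropy": 1.2},
    "dense_centre": {"outline": [(0, 0), (2, 0), (2, 1), (0, 1)], "entropy": 1.6},
}

# Positional score = area of the region multiplied by its entropy.
scores = {
    name: polygon_area(r["outline"]) * r["entropy"]
    for name, r in regions.items()
}
print(scores)
```

In this toy case the denser region has the higher entropy but the lower score, because it offers far less space.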

The Conclusion

To compute the final probability, we calculate the joint probability of the two probabilities (positional and conditional), in this case by multiplying them. From the diagram above, the Karen area has the highest probability (0.73) of a new store succeeding. This tells us that many diverse restaurants have been set up around Karen and that there is a lot of space (less competition) to set up new ones. The best candidate location would be Galleria Mall.
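The multiplication step might look like the sketch below. Two things are assumptions on our part: the raw area-times-entropy scores are rescaled to sum to 1 so they behave like probabilities, and the two values are treated as independent, which is what multiplying them implies. All numbers are invented and do not reproduce the 0.73 figure.

```python
# Hypothetical inputs; NOT the article's figures.
positional_raw = {"Karen": 36.0, "Lavington": 18.0, "CBD": 6.0}  # area * entropy
conditional = {"Karen": 0.8, "Lavington": 0.7, "CBD": 0.1}       # posterior per region

# Assumption: rescale the raw scores so they sum to 1 before multiplying.
total = sum(positional_raw.values())
positional = {r: v / total for r, v in positional_raw.items()}

# Product of the two values, treating them as independent.
combined = {r: positional[r] * conditional[r] for r in positional}
best = max(combined, key=combined.get)
print(best, combined)
```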

Surprisingly, the other locations get very low probabilities. This is because the methodology assumes that a good location should have an existing outlet. If the methodology is inverted and we consider a good location to be one where Big Square has no existing store and competition is healthy, then we get different results. Using Steiner Minimal Trees for the conclusion, we overlay the regions onto a map as shown below.

You will notice that most of the regions intersect at the CBD. This method has been used by the FBI to find the homes of serial killers. Since serial killers want to avoid detection by neighbors, they normally commit crimes far from where they are known. By mapping those areas using Steiner trees, you can find the area they are avoiding (the intersection of the polygons). In our case, Big Square has neglected/avoided the CBD.

Using the two divergent methodologies, Big Square should set up the next outlet in Karen or the CBD.

Cover image by Brand2D

