During my studies in London (UK) back in 2006, I initially wanted to write my master's dissertation on the Uzbekistan market. The main challenge I faced was the lack of any valid or credible data on the web about Uzbekistan's social, economic and business matters. Today the situation has changed a lot, as the Internet has become widespread and more accessible throughout the country. Local businesses have started exploiting the opportunities of the Internet by creating news and media portals and, more importantly, the government has been taking many Internet-related initiatives.
One such sound initiative was the recent launch of the open data portal of the Republic of Uzbekistan, www.data.gov.uz, which aims to promote data democracy by offering valid social, economic and other data sets to all interested parties in .csv, .json and .xml formats. Although the website is only a few months old, it already contains over 250 data sets provided by over 25 government organisations, and these have already been downloaded nearly 15,000 times. As I live and breathe data in my daily job, I decided to explore it myself and share some of the interesting insights I found. To start with, I picked the data on the number of people who left Uzbekistan between 1991 and 2013, cleaned the data files (more on that at the end), analysed and visualised them, and here is what I came up with:
As you can see, the numbers seem high, and most people left the country during the first four years after Uzbekistan gained its independence. But as this aggregated total doesn't tell us much, I then broke the numbers down by region, and here some interesting things started to show up.
What is clear from this treemap is that the largest share of people who left (over 30%) were from Tashkent. And, surprisingly to me, the fewest people left from my own city, Namangan. However, the ratios in this visual represent percentages of the total number of people who left and don't take into account the size of each region's population. So I downloaded the population data by region from the same website, merged it with my main data set, and got the following result.
What's more interesting in this graph is that although the highest number of people left from Tashkent, the Navoi region lost the most people relative to its population, while the percentage of the population that left Fergana and Samarkand is relatively low. So here is another summarised view for you.
I think the most important insight from this visual is that, on average, 1% of Uzbekistan's population leaves the country every year, with the highest rates in the Djizak (1.2%), Tashkent, Syrdarya, Karakalpakstan and Navoi (2.1%) regions. If you consider total figures, Navoi has lost 42% of its population since 1991 (compared with 19% for Uzbekistan as a whole), even though many large-scale investments have been made in the region in recent years. This should give local governments enough reason to study the causes of such big losses further. The same applies to the rest of the top five regions in the graph.
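For anyone who wants to reproduce the two views compared above (share of all departures versus share of each region's own population), here is a minimal Python sketch of the idea. The region names and figures are hypothetical placeholders, not values from data.gov.uz, and the function names are my own; the real files would need to be loaded and cleaned first.

```python
# Hypothetical placeholder figures, NOT values from data.gov.uz.
departures = {"Region A": 300_000, "Region B": 50_000, "Region C": 150_000}
population = {"Region A": 3_000_000, "Region B": 250_000, "Region C": 3_000_000}


def share_of_total(counts):
    """Percentage of all departures that each region accounts for."""
    total = sum(counts.values())
    return {region: 100 * n / total for region, n in counts.items()}


def share_of_population(counts, pop):
    """Departures as a percentage of each region's own population.

    Only regions present in both data sets are kept (an inner join).
    """
    return {
        region: 100 * counts[region] / pop[region]
        for region in counts
        if region in pop
    }


shares = share_of_total(departures)
rates = share_of_population(departures, population)

# Rank regions by the population-adjusted rate, highest first.
for region in sorted(rates, key=rates.get, reverse=True):
    print(
        f"{region}: {shares[region]:.1f}% of all departures, "
        f"{rates[region]:.1f}% of its own population"
    )
```

With these made-up numbers, Region A accounts for the most departures in absolute terms, but Region B ranks first once population is taken into account, which is the same kind of reversal seen between Tashkent and Navoi.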
However, it is also important to note the limitations of the data used: it doesn't include age group, professional background, gender, reason for travel, the percentage of people who returned, or other attributes that would certainly add even more value to the analysis. In addition, the data available on the website is provided in quite a messy form, as if it had been thrown into a bin, and it requires a lot of deep cleaning, formatting and optimising, which can be very time-consuming. Here is an example of how the downloaded data file looks in .csv format compared with how it is displayed on the web:
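To give a flavour of the kind of cleaning involved, here is a small Python sketch. The raw layout is an assumption made purely for illustration (semicolon separators, stray spaces, spaces used as thousands separators, empty rows); the actual files on data.gov.uz may be messy in different ways, and the column names and values here are invented.

```python
import csv
import io

# Hypothetical raw export: the layout and values are made up for illustration.
RAW = """\
 Region ; Year ; Departures
Region A; 1991 ; 12 345
;;
Region B;1992; 6 789
"""


def clean_rows(text, delimiter=";"):
    """Parse a messy delimited export into a list of tidy dictionaries."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    # Normalise header names: trim whitespace and lower-case them.
    header = [cell.strip().lower() for cell in next(reader)]
    rows = []
    for raw in reader:
        cells = [cell.strip() for cell in raw]
        if not any(cells):
            continue  # skip rows that contain no data at all
        record = dict(zip(header, cells))
        record["year"] = int(record["year"])
        # Remove spaces used as thousands separators before converting.
        record["departures"] = int(record["departures"].replace(" ", ""))
        rows.append(record)
    return rows


rows = clean_rows(RAW)
for row in rows:
    print(row)
```

Keeping the cleaning in one small function makes it easy to rerun whenever the portal publishes an updated file.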
I hope the administration of Data.gov.uz will try to provide better-structured and optimised data sets, so that people and businesses like us can benefit even more from this open data initiative. In conclusion, I will try to publish more interesting data insights here from time to time. Please feel free to leave your comments and suggestions below.