In recent years, it has become commonplace to draw analogies between data and natural resources. One often encounters the phrase “data is the new oil”, while the competition over the acquisition and harvesting of data has been likened to the gold rush.

Officials in Nairobi from the Independent Electoral and Boundaries Commission (IEBC) register a voter during ongoing registration in preparation for the 2022 Kenyan general election. Photo: Simon Maina/AFP

Amidst these global trends, the factors that influence the patterns and dynamics of data in Africa are multi-layered, but some of the most important include historical forces, regional and national disparities, the relevance of data at societal and individual levels, and the inequalities resulting from the prevalence of the urban-rural resource gap and economic divide. Considering the importance of data in answering questions of education, health, conflict, commerce, governance, and development, it is critical to understand the state of data in Africa.

The history of data collection across the African continent is fractured by geography, with some parts of the continent, such as ancient Egypt, providing some of the earliest historical records of sophisticated census-taking of any society, while other parts of the continent – especially in sub-Saharan Africa – scarcely had any form of formal data collection to speak of until the late 20th century.

However, for most modern African countries, the development of official statistics and formalised data began as a series of colonial initiatives, chiefly by the British and French, with contributions from the Belgians, Spanish and Portuguese. As Professor Ben Kiregyera notes in his book, The Emerging Data Revolution in Africa, these early official statistical systems were usually based within the treasury departments and finance ministries of African countries, with their purview limited to collecting economic and basic demographic data.

The development of statistical capacities continued during the wave of independence that swept across the continent during the 1950s, 1960s and 1970s. Initially, there was some disruption in many countries as many of the expatriate staff who worked in these statistical offices returned to Europe. Nonetheless, the continued influence of the former colonial powers was evident in the differences in approach that Anglophone and Francophone countries took in developing their statistical capacities post-independence. For example, whereas Anglophone countries preferred collecting data through population censuses, Francophone countries preferred to use population-related surveys, which were smaller in scale.

Alongside these changes, this period also witnessed the growing formalisation of statistical practices and data collection at both the continental level and the level of national governments. When the United Nations Economic Commission for Africa (UNECA) was established in 1958, statistics was one of the programmes included. The 1960s and 1970s also saw national statistical offices expand beyond their initial economic functions, with government departments such as planning, agriculture, labour, health, education, the treasury, and the office of the president/prime minister frequently incorporating a division devoted to data collection and management within their structures.

Since the 1970s and 1980s, there has been a consistent drive by international organisations, developmental agencies, and domestic bodies to improve statistical capacity and data infrastructure across the continent.

Some of the most prominent programmes have included the African Census Programme (1969), the African Household Survey Capability Programme (1978), the Statistical Training Programme for Africa (1979), and the Statistical Development Programme for Africa (1987). Newer organisations and initiatives such as the Statistical Commission for Africa (2007) and the African Development Bank’s (AfDB) Africa Information Highway (2012) have drawn inspiration from these earlier initiatives.

A homeless man sleeps while a Statistics South Africa fieldworker conducts a survey during the population and housing census at Marabastad in Pretoria on 3 February 2022. Photo: Phill Magakoe/AFP

The most notable continent-wide initiative that exists to realise these aspirations is the African Union Commission’s (AUC) African Charter on Statistics. As of November 2021, 23 African countries had ratified the charter, which “seeks to serve as policy framework for statistics development in Africa, especially the production, management and dissemination of statistical data and information at national, regional and continental levels”. However, several African countries with the best statistical capacities and data infrastructures on the continent, such as Botswana, Egypt, Seychelles, and South Africa, have yet to ratify this treaty, primarily due to administrative slow-walking, something that undermines the treaty’s objective of experience sharing.


There are also several regional bodies devoted to data collection and improving statistical capacity in Africa. Foremost among these are the statistical offices within Africa’s eight regional economic communities (RECs), regional statistical organisations, and specialised training centres.

From an organisational perspective, RECs are considered the most important of these structures, and they have three main functions. Their first responsibility is developing statistical capacity within member states. Secondly, they are charged with harmonising the different national systems. This is a key step in realising the third responsibility, which is accumulating and harmonising data in a way that is conducive to informed policymaking at a regional level.

The Strategy for Harmonisation of Statistics in Africa (SHaSA), which was prepared in 2009 by the AUC, the AfDB and the UNECA, divided the continent’s eight RECs into two categories in terms of their ability to fulfil this mandate. The first category consists of RECs that have comparatively sophisticated statistical capacities, and a greater ability to ensure harmonisation at the regional level. The RECs in this group are the Economic Community of West African States (ECOWAS), the Common Market for Eastern and Southern Africa (COMESA), the Southern African Development Community (SADC), and the East African Community (EAC).

By contrast, the second group of RECs consists of the Economic Community of Central African States (ECCAS), the Community of Sahel-Saharan States (CEN-SAD), the Intergovernmental Authority on Development (IGAD), and the Arab Maghreb Union (AMU). The SHaSA notes that the capacity of these RECs to achieve the mandate set for them is “non-existent”. The most persuasive explanation of these regional disparities posits that national-level inequalities in data collection and statistical capacity are a key driver of these differences.

Currently, the main entities responsible for the state of data in Africa are countries’ national statistical offices. These entities are charged with conducting large-scale data collection drives through processes such as national censuses. They also have some involvement in data collected and published by other ministries, including those devoted to issues of economy, education, health, immigration, labour, and utilities. Yet, in many African countries, national statistical offices lack the capacity and resources to carry out satisfactory data collection.

Among the countries with the weakest capacities in this regard are the Democratic Republic of Congo, Eritrea, Madagascar, and South Sudan. None of these countries has managed to conduct a full census in the 21st century. As we would expect, data collection is weakest in what are commonly referred to as “fragile states”. Some of the main reasons for this include the heightened risk to ground-level data collectors, weak infrastructure such as roads and telecommunications, and the existence of parts of a country where the capacity of the state is especially weak.

Often, data collection is a particular challenge in rural areas with low population densities, which are not easily accessible from administrative hubs such as capital cities and urban centres because of poor road networks and low rates of digital connectivity. Consequently, many governments are formulating policies and making decisions without the information needed to make these critical processes more effective.

However, not all countries in Africa are characterised by weak systems of data collection and statistical capacity. Countries such as Algeria, Botswana, Egypt, Mauritius, Seychelles, South Africa, and Tunisia have some of the most sophisticated data collection systems within the developing world. A notable characteristic of this list is that these are among the most economically developed countries on the continent. This suggests that there is a reinforcing link between statistical capacity, better governance, and economic development.

According to South Africa’s former statistician-general, Dr Pali Lehohla, better developed and resourced countries are more capable of funding adequate systems of data collection and statistical capacity, providing their governments with a greater ability to make informed public policy decisions.

These links suggest that governments that seek to improve their statistical capacity must also ensure that they are working towards economic development through initiatives such as improving infrastructure and the quality of education, and ensuring the depoliticisation of their national statistical offices. Otherwise, countries risk their statistical capacity stagnating at a time when data has never been more central to policymaking.

The level of data sophistication remains low across much of the continent. For example, according to a joint survey conducted in 2020/21 by the UNECA and the British Broadcasting Corporation (BBC), only eight African countries capture a “high” proportion of deaths within their national Civil Registration and Vital Statistics (CRVS) systems. Three capture a “moderate” proportion, 38 capture a “low” proportion, and there is no data for the remaining five countries.

These disparities have a significant effect on the ability of governments to formulate policies in critical fields such as healthcare, social welfare and resource allocation, as many African governments lack information about the rate of change of their populations.

The implication of these national-level inequities is that the impact and functional value of data in Africa at the societal level is scattered and inconsistent. Of even more concern is the fact that even where data does play a significant role at the societal level, individuals and their local communities often do not benefit from the proliferation of data.

Rarely has the importance of aggregate data been clearer in the African context than since the start of the COVID-19 pandemic in December 2019. During this crisis, countries with strong data collection capabilities and statistical capacities have had the advantage of being able to track infection rates, access information about hospital and ventilator capacity, and accurately determine feasible locations for vaccination sites using geolocation data.

At the level of local communities and individuals, there are examples of places where the “data revolution” has had a positive influence. The collaborative Africa Data Revolution Report 2018 provides several examples of these local-level projects where data and data skills development has helped improve people’s lives.

The most prominent examples include: a mobile application in Kenya that allows users to find relevant medical information and nearby hospitals; an app in Burkina Faso that allows users to track power cuts, map public transport, and record air quality; and a training programme in Ghana, Tech Needs Girls Coding Class, which seeks to reduce existing gender-based data inequalities.

However, these examples are the exception rather than the rule. One reason is that although mobile internet coverage is widespread in Africa, many covered areas lack the grid electricity that local communities need to make use of these potentially life-changing technologies. Consequently, fewer than 50% of Africans can use mobile services on a consistent basis, which limits the impact of these services on individuals’ lives.

Frequently, this disjuncture between the societal and the individual is linked to the persistent inequality between urban and rural areas. Whereas Africa’s urban areas are becoming growing and versatile hubs of data and statistical innovation, in both the public and private sectors, the benefits of these processes rarely diffuse into rural parts of the continent. This is a significant concern when one considers that at least half of the continent’s total population of 1.3 billion lives in rural areas.

The roots of this problem are partially historical in that until late into the 20th century, the rough terrain of much of rural Africa was difficult – in both physical and financial terms – for data collecting field workers to access, and there is still substantial inequality in the reliability of data between urban and rural areas.

It is therefore essential that international organisations, RECs, and governments become more proactive in closing this gap by investing in and maintaining physical and digital infrastructure, as well as investing in skills development within Africa’s rural areas.

Rectifying the data gap will be difficult and will require collaboration and investment from a variety of international, regional, national, and private entities. Nevertheless, this work is essential if Africa is to use data to its fullest potential.


Pranish Desai

Pranish is a Senior Data Analyst within the Governance Insights & Analytics programme. He holds a Master of Arts in e-Science, obtained with distinction from the University of the Witwatersrand. This degree formed part of the Department of Science and Innovation's National e-Science Postgraduate Teaching and Training Platform. His research interests include comparative politics, local governance, quantitative social analysis and political geography.