Data should not be the new oil

May 10, 2019|Victor Famubode

In May 2017, The Economist released an eye-catching cover touting data as the new oil. The phrase has now become a cliché, spurring people to vehemently defend or dispute it. For Nigeria, the implicit acceptance of the analogy between data and oil is a concern given the country’s mismanagement of its oil resources and the costs borne by its citizens.

Since Nigeria discovered oil, its previously thriving agriculture and textile industries have collapsed, the oil-producing Niger Delta region has degenerated, and we have embraced a culture in which oil rents are shared amongst the elite, leaving over 80 million Nigerians in poverty.

Imagine if this same culture were exported to the harnessing and use of data; we would likely encounter privacy breaches, algorithmic biases, and exploitation that fosters greater inequality. Thus, in societies like Nigeria, the idea of data as the new oil is a frightening thought, particularly with no credible frameworks for managing data as we move into the fourth industrial revolution.


No Credit Come Tomorrow

Branch International, an African-focused fintech, raised $170 million in Series C funding earlier this year, one of the largest capital raises by an Africa-focused startup. At first glance, Branch is a simple consumer lending platform. Look closer, and you will see a business built on data analytics; specifically, Branch uses data from your mobile phone, such as GPS data, SMS history, and call logs, to determine your creditworthiness. The business is manoeuvring its way around the information gap in Nigerian credit markets by “using machine learning algorithms (to) process thousands of data points to create personalised loan options”.

Branch is upfront about this process, indicating that it is done with the “explicit permission” of their customers. Machine learning is essentially technology that tries to infer patterns in big data and make predictions based on them. Left unchecked, real-world applications show that the technology is open to bias and abuse, a result of the preferences of the designer (known and unknown) and the underlying data that reflects society’s existing biases. In the lending scenario, there is a concern that ML-based risk assessments would lead to privacy infractions and algorithmic bias in Nigeria.
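To make the mechanism concrete, here is a minimal sketch of how a lender might turn phone metadata into a repayment probability with a logistic model. The feature names, weights, and threshold are all illustrative assumptions for this article, not Branch's actual model.

```python
import math

# Hypothetical feature weights for a phone-metadata credit model.
# Names and values are illustrative assumptions, not any real lender's model.
WEIGHTS = {
    "avg_sms_per_day": 0.02,
    "distinct_gps_locations": 0.05,
    "calls_to_saved_contacts_ratio": 1.5,
}
BIAS = -2.0

def credit_score(features: dict) -> float:
    """Map phone-metadata features to a 0-1 repayment probability."""
    z = BIAS + sum(WEIGHTS[k] * features.get(k, 0.0) for k in WEIGHTS)
    return 1 / (1 + math.exp(-z))  # logistic function

def loan_decision(features: dict, threshold: float = 0.5) -> bool:
    """Approve the loan if the predicted repayment probability clears a cutoff."""
    return credit_score(features) >= threshold

applicant = {
    "avg_sms_per_day": 30,
    "distinct_gps_locations": 12,
    "calls_to_saved_contacts_ratio": 0.9,
}
print(credit_score(applicant), loan_decision(applicant))
```

The point of the sketch is how opaque the pipeline is from the borrower's side: a handful of weights, chosen and tuned by the designer, silently decide who gets credit.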


The cost for privacy and algorithmic bias

Branch’s model suggests strict adherence to privacy laws. But the setup of machine learning models leaves room for potential abuse. For example, businesses can use contact lists they lawfully obtained from customers in ways that infringe their privacy rights. In Kenya, a consumer lending app was flagged for privacy breaches through contact lists. The firm had randomly sent text messages to customers’ contacts, trying to shame defaulters into repaying their loans.

A more alarming prospect when it comes to AI ethics is the fact that companies do not need full information on individuals to compromise their privacy, a practice termed inferential analytics. The idea is that if I can put together different pieces of information about you from various sources, however partial, I may learn something new about you that you never intended to let anyone know. For example, by having your GPS data, access to your social media from another source, and information from your bank, I may be able to tell that you are at risk of depression or schizophrenia, making you a less attractive borrower.
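The mechanics of inferential analytics are simple enough to sketch: join several partial datasets on a shared identifier and derive an attribute the person never disclosed. All records, identifiers, and thresholds below are fabricated for illustration.

```python
# Each dataset alone is partial; joining them on a shared identifier
# (here a phone number) lets an analyst infer something the person
# never disclosed. All records are fabricated for illustration.
gps_log = {"+2348000000001": ["home", "home", "clinic", "home"]}
social = {"+2348000000001": {"late_night_posts_per_week": 9}}
bank = {"+2348000000001": {"missed_payments": 0}}

def infer_risk_flags(phone: str) -> list:
    """Combine partial sources into an inferred, unconsented profile."""
    flags = []
    visits = gps_log.get(phone, [])
    posts = social.get(phone, {}).get("late_night_posts_per_week", 0)
    # Neither signal is sensitive on its own; together they support an
    # inference about health that could quietly affect a loan decision.
    if "clinic" in visits and posts > 7:
        flags.append("possible-mental-health-risk")
    if bank.get(phone, {}).get("missed_payments", 0) > 2:
        flags.append("payment-risk")
    return flags

print(infer_risk_flags("+2348000000001"))
```

Notice that the bank data alone raises no flag; it is the combination of innocuous sources that produces the sensitive inference.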

Algorithmic bias is also a primary ethical concern. A famous example of the problem is the bias discovered in the Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) tool used in the United States (U.S.) justice system. An investigation of the machine learning models used to predict recidivism (the likelihood that a convicted criminal will reoffend) found that they consistently overestimated the likelihood that a black convict would reoffend and underestimated it for white convicts.

Another example is in image labelling. A few years ago, Google Photos was discovered to be racist as its algorithm classified black people as gorillas. Again, an algorithm is only as good as the thinking that goes into it; even the best predictive model can absorb the prejudices of its creators and its training data.

To tackle this, Nigeria must develop a more sophisticated data culture. The Universal Declaration of Human Rights offers a template for attaining this; from Article 12, which protects individuals against arbitrary interference with their privacy, to Article 27, which affirms everyone’s right to share in scientific advancement and its benefits, ostensibly a basis for combating bias and exclusion.

Absent this, data is open to the same exploitation and misuse that has characterised the history of oil in Nigeria. Taking the example of consumer lending, it is not difficult to see where problems can arise. Imagine a (likely) scenario where one variable used to determine creditworthiness is credit history. Gender bias could quickly become a problem here because of the historical gender disparity in access to finance in Nigeria. Our model would import existing social inequalities unless we deliberately account for them.
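The credit-history scenario above can be sketched in a few lines. The applicants, weights, and the model itself are invented for illustration: equally reliable borrowers, but the women have shorter credit histories because of a historical access gap, and a naive model that rewards history length scores them lower anyway.

```python
import statistics

# Synthetic applicants: identical repayment behaviour, but the women have
# systematically shorter credit histories, standing in for Nigeria's
# historical gender gap in access to finance (illustrative numbers).
applicants = [
    {"gender": "M", "years_of_credit_history": 8, "on_time_rate": 0.95},
    {"gender": "M", "years_of_credit_history": 7, "on_time_rate": 0.95},
    {"gender": "F", "years_of_credit_history": 2, "on_time_rate": 0.95},
    {"gender": "F", "years_of_credit_history": 1, "on_time_rate": 0.95},
]

def naive_score(a: dict) -> float:
    # A model that leans on history length imports the access gap:
    # borrowers with identical repayment records get different scores.
    return 0.6 * a["on_time_rate"] + 0.05 * a["years_of_credit_history"]

def avg_score(gender: str) -> float:
    """Average model score for one group of applicants."""
    return statistics.mean(
        naive_score(a) for a in applicants if a["gender"] == gender
    )

print(avg_score("M"), avg_score("F"))
```

The model never sees gender, yet its outputs are skewed by gender, which is exactly how a "neutral" variable imports an existing social inequality.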


Putting brakes where it matters

Nigeria and its African peers must adopt an intentional approach to data, lest they risk repeating the mistakes of the commodity era. So far, governments have taken mixed approaches to the issue. While South African President Cyril Ramaphosa appointed a commission on the Fourth Industrial Revolution to explore how the likes of blockchain and artificial intelligence can address issues in education, agriculture, etc., the Zimbabwean government signed a deal with a Chinese firm to execute a mass facial recognition program.

The data revolution has the potential to transform Nigerian society, and even to help us beat poverty, but we need to create adequate frameworks and best practices to make it work. One way would be to require firms that use machine learning algorithms to be transparent about the purpose of the variables used in their prediction models. Another important step would be to educate society, starting with data scientists, on data ethics. For instance, Microsoft’s course in Ethics and Law in Data Analytics, part of its professional program in Data Science and Artificial Intelligence, could be introduced to employees to give them a grasp of the ethical concerns these complex systems raise.
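The transparency proposal above could take a very simple machine-readable form. This is a hypothetical sketch, not an existing standard: each model variable is documented with its stated purpose, and any undocumented input is flagged.

```python
# A minimal sketch of the kind of disclosure proposed above: each model
# variable documented with its purpose, so regulators and customers can
# see why it is collected. The fields and entries are assumptions.
FEATURE_DISCLOSURE = [
    {"variable": "sms_history", "purpose": "estimate income regularity",
     "consent_required": True},
    {"variable": "gps_data", "purpose": "verify stated home address",
     "consent_required": True},
    {"variable": "call_logs", "purpose": "gauge social connectedness",
     "consent_required": True},
]

def undisclosed_variables(model_inputs: list) -> list:
    """Flag any model input that has no documented purpose."""
    documented = {row["variable"] for row in FEATURE_DISCLOSURE}
    return [v for v in model_inputs if v not in documented]

print(undisclosed_variables(["sms_history", "contact_list"]))
```

A regulator auditing a lender could run exactly this check: any variable feeding the model without a declared purpose is a question to answer.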

And of course, regulation needs to be created and enforced. In Nigeria, the latter tends to be the issue. The National Information Technology Development Agency recently implemented the Nigerian Data Protection Regulation. The directive covers data processing, data security, cross-border data transfer, third-party data processing, privacy policy, etc., and is a welcome step in the right direction. The regulation includes fines as high as ₦10 million for firms found in breach of the rules, so it is essential that data handlers are made aware of it. Of course, any data regulation is likely to be incomplete and contestable; the furore over the implementation of the General Data Protection Regulation (GDPR) in the European Union shows the depth of disagreement in the industry.

The costs of Nigeria’s abuse of oil are evident and continue to haunt the nation. In contrast, the consequences of data misuse are often silent, yet arguably more harmful. For this reason, it is important that we ensure that within the Nigerian context, data does not end up being the new oil.

Follow this writer @BishopTopsy on Twitter.