As regulators talk tough, tackling AI bias has never been more urgent

The rise of powerful generative AI tools like ChatGPT has been described as this generation’s “iPhone moment.” In March, the OpenAI website, which lets visitors try ChatGPT, reportedly reached 847 million unique monthly visitors. Amid this explosion of popularity, the level of scrutiny placed on gen AI has skyrocketed, with several countries acting swiftly to protect consumers.

In April, Italy became the first Western country to block ChatGPT on privacy grounds, only to reverse the ban four weeks later. Other G7 countries are considering a coordinated approach to regulation.

The UK will host the first global AI regulation summit in the fall, with Prime Minister Rishi Sunak hoping the country can drive the establishment of “guardrails” on AI. Its stated aim is to ensure AI is “developed and adopted safely and responsibly.”

Regulation is no doubt well-intentioned. Clearly, many countries are aware of the risks posed by gen AI. Yet all this talk of safety is arguably masking a deeper issue: AI bias.

Breaking down bias

Although the term “AI bias” can sound nebulous, it’s easy to define. Also known as “algorithm bias,” AI bias occurs when human biases creep into the data sets on which AI models are trained. This data, and the models built on it, then reflect sampling bias, confirmation bias and human biases (against gender, age, nationality or race, for example), all of which cloud the independence and accuracy of any output from the AI technology.

As gen AI becomes more sophisticated, impacting society in ways it hadn’t before, dealing with AI bias is more urgent than ever. This technology is increasingly used to inform tasks like face recognition, credit scoring and crime risk assessment. Clearly, accuracy is paramount with such sensitive outcomes at play.

Examples of AI bias have already been observed in numerous cases. When OpenAI’s DALL-E 2, a deep learning model used to create artwork, was asked to create an image of a Fortune 500 tech founder, the pictures it supplied were mostly of white men. When asked whether well-known blues singer Bessie Smith influenced gospel singer Mahalia Jackson, ChatGPT could not answer the question without further prompts, raising doubts about its knowledge of people of color in popular culture.

A 2021 study of mortgage lending found that AI models designed to determine approval or rejection did not offer reliable suggestions for loans to minority applicants. These instances show that AI bias can misrepresent race and gender, with potentially serious consequences for users.

Treating data diligently

When AI produces offensive results, the cause can usually be traced to the way the AI learns and the dataset it is built upon. If the data over-represents or under-represents a particular population, the AI will repeat that bias, generating even more biased data.
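As a rough illustration of what checking for over- or under-representation can look like, here is a minimal Python sketch. The function name, the record layout and the reference shares are all hypothetical, chosen only for this example; a real audit would need reference figures appropriate to the population the model serves.

```python
from collections import Counter


def representation_gaps(records, attribute, reference_shares):
    """Compare each group's share of the dataset with a reference share.

    Returns {group: dataset_share - reference_share}. A negative value
    means the group is under-represented relative to the reference.
    """
    counts = Counter(record[attribute] for record in records)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("records must not be empty")
    return {
        group: counts.get(group, 0) / total - expected
        for group, expected in reference_shares.items()
    }


# Hypothetical toy data: 8 records from group "A" and 2 from group "B",
# compared against an assumed 50/50 reference population.
sample = [{"group": "A"}] * 8 + [{"group": "B"}] * 2
gaps = representation_gaps(sample, "group", {"A": 0.5, "B": 0.5})

for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

In this toy data, group "A" is over-represented and group "B" is under-represented by the same margin, which is the kind of imbalance a model trained on the data would be likely to repeat.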

For this reason, it’s important that any regulation enforced by governments doesn’t view AI as inherently dangerous. Rather, any danger it possesses is largely a function of the data it’s trained on. If businesses want to capitalize on AI’s potential, they must ensure the data it is trained on is reliable and inclusive.

To do this, giving all stakeholders, both internal and external, greater access to an organization’s data should be a priority. Modern databases play a huge role here: They can manage vast amounts of user data, both structured and semi-structured, and they can quickly discover, react to, redact and remodel data once any bias is discovered. This greater visibility into, and manageability of, large datasets means biased data is less likely to creep in undetected.

Better data curation

Furthermore, organizations must train data scientists to better curate data while implementing best practices for collecting and scrubbing data. Taking this a step further, the data used to train algorithms must be made “open” and available to as many data scientists as possible, so that more diverse groups of people sample it and can point out inherent biases. In the same way modern software is often “open source,” so too should appropriate data be.

Organizations have to be constantly vigilant and appreciate that this is not a one-time action to complete before going into production with a product or a service. The ongoing challenge of AI bias calls for enterprises to look at incorporating techniques that are used in other industries to ensure general best practices.

“Blind tasting” tests borrowed from the food and drink industry, red team/blue team tactics from the cybersecurity world or the traceability concept used in nuclear power could all provide valuable frameworks for organizations tackling AI bias. This work will help enterprises understand their AI models, evaluate the range of possible future outcomes and build sufficient trust in these complex and evolving systems.
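One possible reading of the “blind tasting” idea, offered here as an assumption rather than a method the article spells out, is to vary only a sensitive attribute and see whether a model’s decision changes. The Python sketch below does this with a deliberately biased stand-in model; the function names, fields and thresholds are hypothetical.

```python
def blind_test(model, applicants, attribute, values):
    """Return applicants whose decision changes when only `attribute` varies."""
    flagged = []
    for applicant in applicants:
        # Re-run the model on copies that differ only in the sensitive attribute.
        decisions = {model({**applicant, attribute: value}) for value in values}
        if len(decisions) > 1:
            flagged.append(applicant)
    return flagged


def biased_model(applicant):
    # Hypothetical stand-in for a real model: it applies a stricter
    # income threshold to anyone outside group "A".
    threshold = 50 if applicant["group"] == "A" else 60
    return applicant["income"] >= threshold


applicants = [
    {"income": 55, "group": "B"},
    {"income": 70, "group": "A"},
    {"income": 40, "group": "A"},
]
flagged = blind_test(biased_model, applicants, "group", ["A", "B"])
print(flagged)
```

Only the applicant with an income of 55 is flagged, because that is the one case where the group label alone flips the outcome. A test like this catches direct dependence on the attribute; it would not catch bias that enters through correlated fields.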

Right time to regulate AI?

In previous decades, talk of ‘regulating AI’ was arguably putting the cart before the horse. How can you regulate something whose impact on society is unclear? A century ago, no one dreamt of regulating smoking because it wasn’t known to be dangerous. AI, by the same token, wasn’t something under serious threat of regulation — any sense of its danger was reduced to sci-fi films with no basis in reality.

But advances in gen AI and ChatGPT, as well as advances toward artificial general intelligence (AGI), have changed all that. Some national governments seem to be working in unison to regulate AI, while paradoxically, others are jockeying for position as AI regulators-in-chief.

Amid this hubbub, it’s crucial that AI bias doesn’t become overly politicized and is instead viewed as a societal issue that transcends political stripes. Across the world, governments — alongside data scientists, businesses and academics — must unite to tackle it.

Ravi Mayuram is CTO of Couchbase.
