
UN report warns AI development is being hijacked by Big Tech and wealthy countries

Artificial intelligence (AI) is often considered the modern equivalent of electricity, powering countless human interactions every day. However, startups and developing countries are at a distinct disadvantage as big tech companies and wealthier nations dominate the field, particularly when it comes to two key areas: training data sets and computing power.

The global regulatory landscape for AI is highly complex and fragmented, shaped by divergent national regulations and varying degrees of stakeholder collaboration across the private and public sectors. This complexity is further exacerbated by the need to harmonize regulatory frameworks and standards across international borders.

The laws governing the fair use of AI training datasets vary by region. For example, the European Union’s AI Act prohibits the use of copyrighted materials to train AI models without the express authorization of the rights holders. Japan’s text and data mining (TDM) provision, by contrast, allows copyrighted data to be used to train AI models without distinguishing between legally and illegally obtained materials. China, for its part, has introduced several rules and regulations governing AI training datasets that are closer to the EU’s approach, in that they require training data to be obtained legally. However, these rules apply only to AI services offered to the general public and exclude systems developed and used internally by companies and research institutions.

The regulatory environment often shapes a startup’s trajectory, significantly impacting its ability to innovate and scale. An AI startup focused on training models, whether in the pre-training or post-training phase, will face different regulatory challenges depending on the region in which it operates, and those challenges can determine its long-term success. For example, a startup in Japan would have an advantage over one in the EU when it comes to indexing copyrighted web data and using it to train high-performance AI models, as it would be protected by Japan’s TDM provision. And because AI technologies transcend national borders, these challenges call for cross-border solutions and collaboration among key stakeholders.

When it comes to computing power, there is a significant gap between the big players, whether state-owned or private, and startups. Larger tech companies and state-owned entities have the resources to buy and stockpile computing power to support their future AI development goals, while smaller players that lack these resources must depend on them for AI training and inference infrastructure. Supply chain issues with computing resources have widened this gap, which is even more pronounced in the Global South. For example, of the world’s top 100 high-performance computing (HPC) clusters capable of training large AI models, none is hosted in a developing country.

In October 2023, the UN High-Level Advisory Body (HLAB) on AI was established under the UN Secretary-General’s Roadmap for Digital Cooperation to provide UN Member States with analysis and recommendations on international AI governance. The group consists of 39 individuals from diverse backgrounds (geographic, gender, age, and discipline), spanning government, civil society, the private sector, and academia, to ensure that its AI governance recommendations are both fair and inclusive.

As part of this process, we interviewed experts from startups and small and medium-sized enterprises (SMEs) to explore the challenges they face with AI training datasets. Their feedback highlighted the importance of a neutral, international body such as the United Nations in overseeing international AI governance.

HLAB’s recommendations for AI training dataset standards, covering both pre- and post-training, are detailed in its new report, Governing AI for Humanity, and include the following elements:

  1. Establish a global marketplace for the exchange of anonymized data, built on standardized data-related definitions, common principles for the global governance and provenance of AI training data, and transparent, rights-based accountability. This includes introducing processes and standards for data governance and exchange.

  2. Promote data commons that encourage the collection of underrepresented or missing data.

  3. Ensure interoperability for international access to data.

  4. Create mechanisms for rewarding data creators in a way that respects their rights.

To address the computational gap, HLAB proposes the following recommendations:

  1. Develop a capacity-building network to ensure that the benefits of AI are shared equitably.

  2. Establish a global fund to support access to computing resources for researchers and developers who want to apply AI to local public use cases.

International governance of AI, especially training datasets and computing power, is crucial for startups and developing countries. It provides a solid framework for accessing necessary resources and fosters international cooperation, enabling startups to innovate and scale responsibly in the global AI landscape.


Opinions expressed in Fortune.com commentaries are solely the views of their authors and do not necessarily reflect the opinions and beliefs of Fortune.
