The black box of AI that resists researchers

When a neural network runs, not even the most specialized researchers know what is happening inside it. And we are not talking about biology but about an artificial intelligence algorithm, the kind based on deep learning, which gets its name because it imitates the connections between neurons. These systems remain an indecipherable black box for data scientists, for the brightest minds in academia and for the engineers at OpenAI and Google, and they have just received the Nobel Prize.

The mathematics underlying these algorithms is well understood. But not the behavior generated by the network. “Although we know what data goes into the model and what the output is, that is, the result or the prediction, we cannot clearly explain how this output was reached,” says Verónica Bolón Canedo, AI researcher at the Research Center in Information and Communication Technologies of the University of A Coruña.

This happens with ChatGPT, Google Gemini, Claude (the model from the startup Anthropic), Llama (Meta's model) or any image generator such as Dall-E. But also with any system based on neural networks, from facial recognition applications to content recommendation engines.

Other artificial intelligence algorithms, such as decision trees or linear regression, used in medicine or economics, are decipherable. “Their decision processes can be easily interpreted and visualized. You can follow the branches of the tree to know exactly how a certain result was reached,” says Bolón.
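To illustrate what “following the branches” means in practice, here is a minimal sketch, assuming the scikit-learn library and its bundled iris dataset (neither is mentioned in the article): a small decision tree is trained and its learned rules are printed as an explicit chain of thresholds that anyone can read.

    # Minimal sketch of an interpretable model: a shallow decision tree whose
    # prediction path can be read branch by branch (scikit-learn assumed).
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier, export_text

    iris = load_iris()
    tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)

    # Prints rules such as "petal width (cm) <= 0.80" followed by the class reached,
    # so every prediction can be traced to explicit, human-readable conditions.
    print(export_text(tree, feature_names=iris.feature_names))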

This is important because it brings transparency to the process and, therefore, offers guarantees to whoever uses the algorithm. Not for nothing does the EU AI Regulation insist on the need for transparent and explainable systems. And that is something the very architecture of neural networks prevents. To understand the black box of these algorithms, you have to picture a network of neurons or nodes connected to one another.

“When you put data into the network, it means that you start calculating. You trigger the calculations with the values that are in the nodes,” says Juan Antonio Rodríguez Aguilar, research professor at the CSIC Artificial Intelligence Research Institute. The data enters the first nodes and from there it spreads, traveling in the form of numbers to other nodes, which in turn bounce it on to the following ones. “Each node calculates a number, which it sends to all its connections, taking into account the weight (the numerical value) of each connection. And the new nodes that receive it calculate another number,” adds the researcher.
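As a hedged illustration of that description (the numbers and layer sizes below are invented, not taken from any real system), a tiny network can be written in a few lines: each node computes a weighted sum of the values it receives, applies a simple function, and passes the result on to the next layer.

    # Illustrative sketch of values propagating through a tiny neural network.
    # All weights here are arbitrary; a real model learns them during training.
    import numpy as np

    def layer(values, weights, biases):
        # each node: weighted sum of its inputs, plus a bias, through a nonlinearity
        return np.tanh(weights @ values + biases)

    x = np.array([0.5, -1.2])                          # data entering the first nodes
    h = layer(x, np.array([[0.1, 0.4],
                           [-0.7, 0.2],
                           [0.3, 0.9]]), np.zeros(3))  # three hidden nodes
    y = layer(h, np.array([[0.6, -0.2, 0.8]]), np.zeros(1))  # output node
    print(y)                                           # the network's numerical answer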

It must be taken into account that today's deep learning models have thousands or millions of parameters. These indicate the number of nodes and connections the network has after being trained and, therefore, all the values that can influence the result of a query. “In deep neural networks there are many elements that multiply and combine. You have to imagine this in millions of elements. It is impossible to get an equation that makes sense from there,” Bolón asserts. The variability is very high.
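A quick back-of-the-envelope sketch shows how fast those values add up. The layer sizes below are arbitrary examples, not taken from any real model, yet even this modest fully connected network already has hundreds of thousands of weights and biases; state-of-the-art models multiply that by several orders of magnitude.

    # Counting the parameters (weights + biases) of a small fully connected network.
    # Layer sizes are arbitrary illustrative choices.
    layer_sizes = [784, 512, 512, 10]
    n_params = sum(n_in * n_out + n_out
                   for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))
    print(n_params)  # 669,706 parameters for this toy network alone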

Some industry sources have estimated that GPT-4 has almost 1.8 trillion parameters. According to this thesis, each of its language models would use about 220 billion parameters. This means that there are 220,000,000,000 variables that can affect the algorithm's response every time it is asked something.

On the hunt for biases and other problems

Due to the opacity of these systems, it is harder to correct their biases. And the lack of transparency generates distrust when using them, especially in sensitive areas such as medical care or justice. “If I understand what the network does, I can analyze it and predict if there are going to be errors or problems. It is a security issue,” warns Rodríguez Aguilar. “I would like to know when it works well and why. And when it doesn’t work well and why.”

The big names in AI are aware of this shortcoming and are working on initiatives to try to better understand how their own models work. OpenAI's approach is to use one neural network to examine the mechanism of another neural network in order to understand it. Anthropic, the other leading startup, whose founders come from the former, studies the connections that form between the nodes and the circuit that is generated when information spreads. Both look for elements smaller than nodes, such as their activation patterns or their connections, to analyze the behavior of a network. They start from the smallest pieces with the intention of scaling the work up, but it is not easy. “Both OpenAI and Anthropic try to explain much smaller networks. OpenAI is trying to understand GPT-2 neurons, because the GPT-4 network is too large. They have to start with something much smaller,” clarifies Rodríguez Aguilar.
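As a rough, hedged illustration of the kind of raw material these analyses start from (this is not OpenAI's or Anthropic's actual tooling, and the model below is invented), one can record the activation pattern of a hidden layer while a small network processes an input, here using PyTorch's standard forward hooks.

    # Recording which hidden units activate, and how strongly, for a given input.
    # The model is a made-up toy; the technique (forward hooks) is standard PyTorch.
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
    activations = {}

    def save_activation(name):
        def hook(module, inputs, output):
            activations[name] = output.detach()
        return hook

    # attach a hook to the hidden ReLU so its activation pattern can be inspected
    model[1].register_forward_hook(save_activation("hidden"))

    _ = model(torch.randn(1, 4))
    print(activations["hidden"])  # the activation pattern an analyst would study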

Deciphering this black box would have benefits. In language models, the most popular algorithms at the moment, flawed reasoning could be avoided and the well-known hallucinations could be limited. “A problem that you could potentially solve is that many times the systems give inconsistent answers. Now it works in a very empirical way. Since we do not know how to interpret the network, the most exhaustive training possible is done and, if the training works well and the test is passed, the product is launched,” explains Rodríguez Aguilar. But this process does not always go well, as became clear with the launch of Google Gemini, which initially generated images of Nazis with Asian features or black Vikings.

Closing this knowledge gap about how the algorithms work would also be in line with legislative aspirations. “The European AI Regulation requires developers to provide clear and understandable explanations about how AI systems work, especially in high-risk applications,” says Bolón, although she clarifies that the systems can be used as long as users receive sufficient explanations about the bases of the decisions made by the system.

Rodríguez Aguilar agrees that there are tools to explain the results of the algorithm, even though it is not known exactly what happens during the process. “But the most worrying thing, more than explainability and transparency, for me is the issue of robustness, that the systems are secure. What we seek is to identify circuits in the network that may not be secure and give rise to unsafe behavior,” he stresses.

The ultimate goal is to keep AI under control, especially when it is used for sensitive matters. “If you are going to place an AI that suggests treatments in a hospital, that directs an autonomous vehicle or that gives financial recommendations, you have to be sure that it works.” Hence the researchers' obsession with understanding what happens in the guts of an algorithm. It goes beyond scientific curiosity.

https://elpais.com/tecnologia/2024-10-15/la-caja-negra-de-la-ia-que-se-resiste-a-los-investigadores.html