EPISODE 1623

[INTRODUCTION]

[0:00:00] ANNOUNCER: Responsible AI is an approach to developing and deploying AI in a safe, trustworthy, and ethical fashion. The concept has gained considerable attention with the rise of generative AI technologies. Ezequiel Lanza is an AI Open Source Evangelist at Intel, and he joins the show today to talk about responsible AI and the practices and tools evolving around it.

This episode of Software Engineering Daily is hosted by Jordi Mon Companys. Check the show notes for more information on Jordi's work and where to find him.

[INTERVIEW]

[0:00:41] JMC: Hi, Ezequiel. Welcome to Software Engineering Daily.

[0:00:44] EL: Hey, Jordi. Thank you. Thank you for having me.

[0:00:46] JMC: You gave a talk at Open Source Summit Europe 2023, which took place in Bilbao, Spain. The two topics you spoke about were responsible AI and explainable AI. Give us a brief overview of the history of these two areas.

[0:01:00] EL: Yeah, sure. As you said, it's a new concept that is actually pretty old. For instance, when we talked about the things we could do with AI, computer vision in the 2000s, and decision trees before that, whenever we tried to use those algorithms or models, sometimes we needed to define guardrails, right? When we need to define guardrails, that's when we start to think, okay, we need another technology, or principles, or something that defines how this model should be implemented. Because you can have regulations and a lot of other things in the middle that start becoming important, or more relevant. It has grown very fast in the last five or six years with large language models, ChatGPT, GPT, and so on, because now the models are generative. We have this generative part of AI where the models create something. It's no longer just a model that detects a face or detects something; it's a model that creates something. Responsible AI and explainable AI have gained a lot of attention in the last few years, because if you want to use them in your work and in your environment, you need some principles. Of course, the principles can be useful for all of it, right? It's a pretty old concept, but in recent years it has gained a lot of traction. Even now the concept is not 100% clear, it's evolving, there are a lot of papers, but at least the idea of having some guardrails is what is mainly growing in popularity.

[0:02:48] JMC: Okay, so let's take it one step at a time then. Responsible AI: you've said that the concept itself, the definition itself, is probably still evolving, since it's new and it's trying to describe a moving target. By definition, it must be dynamic, so that's okay. It's not set in stone, but could you give us your understanding of what the concept means and aims at, and the principles underlying responsible AI?

[0:03:13] EL: Sure. We can mention four main principles, which can be mixed or differentiated, but there are mainly four. The first one is fairness. If we have a model that is biased, or some models, LLMs for instance, can be racist or anything like that, you would like to avoid that using some kind of technology.
There are multiple toolkits that can give you visibility into your data set, whether it is biased, how you can mitigate that, and so on. That's one principle.

The other one is, of course, transparency, which is not a specific tool. You need to be transparent when you are developing your model. You need to say, for instance, okay, this model is built on GPT, that GPT was trained on this data set, and this is how it works. You need to provide some clarity about the model you are using, or how you use the models, not just about one particular model. This is more of a concept, right? You need to provide that when you are providing your solution.

The third concept, which is pretty important, is accountability. Someone has to be responsible for it. When you are developing an application, someone has to be responsible, and that's pretty challenging now with the LLM conversation, or even with computer vision, where you have a model that was trained, or you downloaded the model, and it can create content. Stable Diffusion, or an LLM, can create content. Who is responsible for the content that the model is creating? Is it the company developing it? Is it the person who trained the model? Someone has to be responsible for it. That's the third concept.

The last one is more related to privacy and data protection. How do you ensure, for instance, when you are developing an application, that your model is encrypted and behaving the way you want it to behave? I would like to use this model to give me answers, or to detect faces. Okay, I need to be sure that the model is performing as it was designed. There are tools for that; this is more from a security perspective, but it's another principle.

If you put these four principles together, you can apply what we call responsible AI when you are developing. What is important is that it's not enforced. I mean, it could be enforced if something like GDPR applies, but these are principles. You can use them, or not. If you don't want to use them, you'll probably say, "Okay, I don't care about fairness," for instance, which is a really bad thing, right? No one is forcing you to do that. Using these four principles, you can create a model, or an application, that people can trust. It's more like a concept for you.

[0:06:06] JMC: Are there already real-world examples of the application of any of these four principles, or all four of them? How can you demonstrate that you've applied them? It's probably not a check mark, I know, or a green tick saying, I did it, but how can you prove it in a way?

[0:06:25] EL: That's a very good question, and it's very challenging. For instance, suppose that you have a model to give loans to people. You are working at a bank and you would like a system that says, "Okay, this is the person in front of me, and I would like a model to tell me whether I can give a loan to this person." The reality is that the model, based on the data it was trained on in the past, will probably perform better on some particular data. You know that most AI models are trained to perform better on the majority of the data you have. How you can prove that, or how you can test that, is a very challenging question, because I can have my model to provide loans, for instance, and this model could be biased.
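A biased loan model is often easy to spot once you look at a data set group by group. Here is a minimal sketch of the kind of check those fairness toolkits automate, assuming made-up loan data with hypothetical column names and using plain pandas rather than any particular toolkit:

```python
# Illustrative loan data; the column names and values are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "group":    ["A", "A", "A", "A", "B", "B", "B", "B"],
    "approved": [1,    1,   0,   1,   0,   0,   1,   0],
})

# Approval rate per demographic group.
rates = df.groupby("group")["approved"].mean()
print(rates)

# Demographic parity difference: a large gap between groups is the kind
# of signal a fairness toolkit would flag and help mitigate.
print("parity gap:", rates.max() - rates.min())
```

A large parity gap like this is exactly the sort of signal the toolkits mentioned above surface, either in the training data or in a trained model's predictions.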
This model could only approve loans for some particular people, from some particular region. Is that okay for me? For the bank, probably it's okay. But as a customer, you could start a lawsuit saying, "Okay, why are you not giving me this loan? Is there any law, any regulation, that entitles me to an answer?" This is one example of why most companies will start adopting this: because they may face lawsuits. It happened with Uber at one point; there was news about this kind of problem, a lawsuit about it. Customers started to complain about the drivers, and they wanted to know why they were assigned one particular driver and not another. There was a big conversation there, and it happened without regulation. If you can prove it, if you can say, "Hey, look, my model is really fair," most people will start using your application. The way to show it is by showing that your data set is not biased, showing the data the model was trained on. Try to showcase to people: okay, my model respects those principles; it is not biased against some particular users.

[0:08:27] JMC: I did get the impression, reading between the lines of what you said, that these four principles were meant for the people building applications with gen AI, right? In a way, I find it difficult for them to be required to demonstrate the application of these principles, because nowadays, for example, you can go to AWS, and I can't remember the name of the product, but basically they've got a marketplace of foundational models. You can actually go there and pick Claude, or any other model they offer, and they've got plenty, and just start building a gen AI-powered application from there. I guess I was wrong in thinking that the principles apply to someone like the developer I just described, but rather more to the foundational models. Is this a clear differentiation? Are the four responsible AI principles you described more targeted at the Claudes, the GPTs, and so on? Is there a clear difference between foundational models and the developers building gen AI-powered applications? Am I making an appropriate differentiation, or does it apply across the board?

[0:09:40] EL: I mean, they apply to all of them. For foundational models, or for computer vision models, you have the same thing. For instance, if you have a model that is able to detect faces, and you train the model on some particular kinds of faces, the model will probably only be able to detect the faces you trained it on. In this particular case, it will be racist against the other faces, the faces you didn't include in the data set you used to train that model. What happened with generative AI is that it's probably easier to understand the concept of responsible AI there, but the principles are the same as always. When you are deploying a foundational model, it could be a loan model, it could be a computer vision solution, it could be something foundational, as you said. If you're respecting these four principles, it's exactly the same. You'll have to say which model you are using. You have to protect the privacy of the data you are handling. You will need to be fair, of course. It has to be responsible. Probably what makes it more understandable for generative AI is that in the future we may need to add a new principle. I don't know, but these principles were defined starting from training, with not having this bias. They go all the way from training to inference as well.
When you are deploying your model, someone has to be accountable for it, someone has to protect privacy and everything else. They are the same concepts, used the same way. I said generative AI because, for me, it's the best way to explain it, because it's where people really see a challenge. With computer vision, people probably don't see a challenge. Okay, it's just something that checks a face. Wait, it can also be racist, even in a very simple model, or a foundational model.

[0:11:35] JMC: I guess these AI principles will be – and I'm thinking especially of the case of the EU, which is usually the leading polity in terms of regulation for everything. I know you're not a legal expert, but have these principles become the basis for regulation? Has any country, any polity, any legal entity come up with regulation based on them? Or would you assume that these principles will articulate, or inspire, upcoming regulation?

[0:12:08] EL: Yeah. I think the regulations come from this: when you are a lawyer, or close to the legal side, the way to define those guardrails for AI applications is to ask something of the models. The way to ask something of them is explainable AI, or explanations; you need something that can give you the reasons why the model did what it did, right? Explainable AI is the way. This is why there is so much growing research in explainable AI: you have the principles, you have the showcases and everything, but the way for you to demonstrate that your model is fair, for instance, could be explainable AI. Even if a regulation is not directly attached to explainable AI, it will ask you, okay, explain it. It will just ask for an explanation. Which kinds of explanations regulators will require, or which kinds of explanations are available, is a completely different conversation about explainable AI, right?

[0:13:16] JMC: What is explainable AI then? You said, essentially, that responsible AI is a set of guiding principles that should structure the way you, the developer, or anyone building an AI model, should actually build it, so that you can justify that you followed the principles, to avoid biases, to avoid discrimination, to avoid etc., etc. Explainable AI is what the authorities will eventually request from these developers. Did I understand that correctly, or can you explain the concept of explainable AI a bit?

[0:13:50] EL: Absolutely. Responsible AI can help you build trustworthiness, so people can trust your model. Basically, you can say that it's fair, and so on. When it comes to explainable AI, and you need to demonstrate why the model took a decision, you need tools to explain it. In short, explainable AI is a way to explain why a model took the decision it took. It's not one tool. I mean, it could be a tool, of course. But it's more an outcome, and it probably won't be the same every time. You may need to show a customer, okay, why did you deny the loan to this person? And the model, or this explainable AI tool, can tell you: this customer is not a good candidate for a loan, based on, I don't know, their income, or their history, or whatever.

[0:14:47] JMC: I see. Is this something that – my question is probably too skewed towards generative AI, right? Because I'm thinking of generative AI models as human-like.
I'm thinking that the regulator, or whoever is in charge of checking that the principles were applied, would go directly to the model and ask: how did you make this decision? If the model has explainable AI properly implemented, it should be able to describe the "reasoning," the mechanism it followed to reach the outcome it provided. Did I get that wrong? Should the question be addressed to the developer who created the model, with that person being able to explain it, or both?

[0:15:29] EL: Yeah. The way it's usually done is that we can separate it into three main things when we talk about explainable AI. The first is that the explanations you provide could be for regulators, for developers, or for end users. How they are implemented differs, of course. Usually, you have something that you provide, and the developer, or another person, will provide the explanation of why the model made that decision. Of course, when it's implemented, you will probably keep track of all the explanations, just in case something happens. There are many different ways to do it, right? Sometimes people say, okay, this is what I need to meet the regulations. What I would do is keep track of every prediction the model makes, store that in my database, and so on. In case I have to justify something, I can provide that kind of explanation to the regulators.

[0:16:33] JMC: In my research on explainable AI, two acronyms have come up quite frequently. One would be LIME, L-I-M-E, which stands for Local Interpretable Model-agnostic Explanations, my goodness, what a mouthful. And SHAP is the other one, which stands for SHapley Additive exPlanations. Another difficult one. What are these two things? Am I missing anything in the same category?

[0:17:03] EL: No. These are mainly the two. Of course, there are a lot of explanation methods, but these two are good examples to showcase the two families of approaches we have to explain a model. When we talk about LIME, let's suppose that you have a model, it could be generative AI, it could be whatever, that is making a prediction. Sometimes this model can be represented by an explainable model. As you may know, or as most people know, we have decision trees, for instance, which are used in AI, random forests, and so on, where it's very easy to see what's going on inside. You have the leaves, and you can see which path the model took to make that decision, right? It's pretty easy to see. But sometimes, as developers, we are trying to find the best model to solve our problem, and the model we selected was probably a neural network, for instance. A neural network is not so easy to inspect, because you have weights, you have neurons, you have a lot of things in the middle that are not interpretable. It's not easy to see what's going on inside. If we need to explain this model, we have two families. One family is: you have a model that you trained with these neural networks, transformers, or whatever, and you try to build a new model, which is a decision tree, or some explainable model, that behaves in a similar way to the original model, but that you can explain. Instead of looking into your complex neural network, what LIME will do is build a parallel model that behaves in the same way. For the user it will be the same, and for you as a developer it will be the same.
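As a rough illustration of that "parallel model" idea, one can fit a global surrogate that mimics a black box. Note that LIME itself fits local surrogates around individual predictions, so the sketch below, on synthetic data with scikit-learn, only shows the general surrogate technique:

```python
# Sketch of the "surrogate model" family: approximate a black-box model
# with a small decision tree you can actually read. Synthetic data only;
# LIME proper fits *local* surrogates, one per prediction.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=2000, n_features=5, random_state=0)
black_box = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500,
                          random_state=0).fit(X, y)

# Train the surrogate on the black box's *predictions*, not the true labels.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# Fidelity: how often the readable model agrees with the black box.
fidelity = (surrogate.predict(X) == black_box.predict(X)).mean()
print(f"surrogate fidelity: {fidelity:.2%}")
print(export_text(surrogate, feature_names=[f"f{i}" for i in range(5)]))
```

The printed tree is the "explainable model" standing in for the neural network, and the fidelity score tells you how faithfully it mimics the original.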
But now you have an explainable model, using this decision tree. Then there's another family, because sometimes it's not easy to build a decision tree for everything. With generative models you have transformers, you have a very complicated model, and you cannot build a parallel explainable model. What this family does, the SHAP family, is that instead of looking inside the model, it treats the model as a black box, and it just compares the inputs, the features that you provide to the model, and the outputs. Based on how the outputs change as the inputs change, the method, which is an explanation model that has nothing to do with decision trees and so on, can interpret how the inputs affect the output. Let's suppose that you have a loan model: the inputs could be age, zip code, and whatever, and the output would be yes or no, right? You have a data set, you train a model, and so on. What SHAP will give you is: okay, with this model, for this particular local interpretation, which is one example, you said no. Based on what? Based on the fact that, for me, age was the most important feature. This method is able to detect how the input is affecting the output. These are the two main families: SHAP on one side, which treats the model as a black box and just looks at the input and the output, and LIME, which tries to create a new explainable model. These are just families, right? If you go into the papers and so on, there are tons of different ways to create a parallel model, and tons of variations of SHAP. But basically, the two main families are these two.

[0:20:45] JMC: So the application is that SHAP, or the SHAP family of explanation methods, is usually applied to black boxes, to those models whose inference is incredibly difficult to trace back – the reasoning, I'm calling it, the decision tree, or the path in the decision tree – whereas the LIME category of explanation models is applied to those AI models that do leave a trace and are easier to trace back. Is this the main difference in their application?

[0:21:21] EL: Yeah, and there's another concept we can talk about, which is local versus global explanation. When we talk about a local explanation, we are just explaining one example, as SHAP typically does. I have one example where the AI algorithm says that I'm not a good candidate for a loan. Okay, for this particular example, which features were most important? That's what is called a local explanation. Global is when you have a model like a decision tree, where you don't need to go example by example, because you can see the whole thing. You know which features were most important overall. What I particularly see is that models are getting more and more complex, and I think the way SHAP works could be the way forward, because it gives you an explanation while treating the model as a black box. Explaining the internals could well be impossible. If you talk to researchers, they don't even know exactly what is going on inside the attention layers, for instance, when we talk about LLMs. So what do we do? Do we try to explain the attention in detail, or do we treat the model as a black box, just look at the input and output, and try to make an inference based on that? I don't know, it's a very hot topic of research. How to explain generative AI is really not clear yet.
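Going back to the tabular loan-style case for a moment, a local SHAP explanation looks roughly like the sketch below. The feature names and data are invented, and shap.KernelExplainer is used here as the model-agnostic variant of the input/output comparison just described:

```python
# Sketch of a local, model-agnostic explanation with the shap library.
# Feature names and data are made up; KernelExplainer treats the model
# as a black box and compares inputs against outputs.
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

feature_names = ["age", "income", "zip_code_risk", "history_length"]
X, y = make_classification(n_samples=500, n_features=4, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Explain P(loan approved) for one applicant, using 50 rows as background.
predict_approved = lambda data: model.predict_proba(data)[:, 1]
explainer = shap.KernelExplainer(predict_approved, X[:50])
shap_values = explainer.shap_values(X[:1], nsamples=200)

for name, value in zip(feature_names, np.ravel(shap_values)):
    # Positive values push this prediction towards "approve", negative away.
    print(f"{name:>15}: {value:+.3f}")
```

Each signed value says how much that feature pushed this single prediction up or down, which is the "local explanation" idea in practice.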
You can produce explanations for text classification with transformers, and so on, but how to explain how they generate text, or images, is a very hot research topic right now.

[0:23:03] JMC: Yeah, I agree. Especially – I mean, I'm sure that for code completion, for structured languages such as programming languages, the output of even black boxes, or extremely complicated models based on transformers, is going to be relatively similar, even if the input, or the prompt, is a bit different, or exactly the same, right? Variance will happen, because they are stochastic and they generate new stuff by definition. But when they're trained on structured data, the output should be expected to stay quite close to the previous one. When we're thinking of natural language, when you ask these same models to generate creative English, or Spanish, or any other language, and you give them the same input, I can see a huge variance there. I'm not sure how the SHAP category of explainable AI methods is able to infer anything valuable. I'm not a researcher, of course, so take my skepticism with a grain of salt, because I've been exposed to this very little. I don't see how SHAP can be valuable in that case, because I can see an enormous variance in the output of these models for that use case. In any case, what are the trade-offs between one and the other? What are the considerations one should take into account between these two families? Are we missing any other type of explainable AI method?

[0:24:40] EL: I mean, if we tried to mention everything, it would be tons of methods.

[0:24:45] JMC: Oh, okay. I didn't know, okay.

[0:24:47] EL: Yeah. Mainly, we can talk about these two families: the ones that create a surrogate model, and the ones that try to infer why the model behaves the way it does based on inputs and outputs. This is what is usually called post-hoc explanation: once you have a model that is already trained, you need to explain how it works, either by creating a new model or by analyzing its behavior. But there are tons of methods, to be honest. Besides LIME, we can talk about rule-based methods, model distillation, perturbation methods, where you see how perturbing the inputs changes the model's output. It's a very interesting topic to go and explore. But of course, the most popular are LIME and SHAP, because they are the most usable and the easiest to use. The others are more on the research side. As you said about generative AI, I totally agree with you on the question of how SHAP can work there. Actually, SHAP works when you use it for text classification: if you have a paragraph or something, but it's a classification problem, which is comparatively easy, that's where SHAP works. You have a label and you have a text. Even if the text is huge, a long paragraph, SHAP is able to tell you which parts of the paragraph pushed towards positive and which parts pushed towards negative. That can give you some insights. When you are creating content, basically – as you know, generative AI, or these LLMs, seem creative and so on, but they create content word by word. They create a word, they look at the previous ones, they create a new one, they look at the previous ones, and so on, and so on.
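That word-by-word view can be made concrete. Here is a small sketch using Hugging Face transformers, with GPT-2 assumed purely as a convenient example of a causal language model, that prints the probabilities the model assigns to candidate next tokens:

```python
# Sketch: inspect the "function in the middle" that scores the next token.
# GPT-2 is used only as a small, freely available example model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The loan application was", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits          # scores for every vocabulary token

next_token_probs = logits[0, -1].softmax(dim=-1)
top = next_token_probs.topk(5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id))!r}: {prob.item():.3f}")
```

Looking at why those particular likelihoods come out the way they do is exactly the open research question described next.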
Probably – and this is just an assumption based on some papers I've seen – going deep into that generation process in the middle could help us understand why the model predicts the next word based on the previous ones. Actually, when you are building an LLM, there is a function in the middle that gives you the likelihood of the next word, right? Why that likelihood came out to be the number it did is another thing that's still under research. I think we will get there, basically, because we want to deploy these models. There are tons of models and we would like to use them in real-life use cases. Even for regulations: a regulation could say, okay, we need you to explain this, and if we are not able to explain it, we'll have to remove all the models we are using.

[0:27:27] JMC: By definition, open source is more transparent and more open than closed source, right? Although sometimes just having your repo, your source code, open to the world doesn't mean that it is actually explainable. But it's easier to understand. In any case, despite what I just said, have you noticed that open-source AI in general is better at meeting the requirements of responsible AI and explainable AI? They're not requirements, they're principles, but you get what I mean, right? Is the open-source AI community better at meeting these than the closed-source one, or is this something that is equally distributed, or so new to everyone that no one is actually caring as much as they should?

[0:28:19] EL: My answer could be a bit biased, to be honest, because my role is in open source, so I work with communities and so on. Apart from that, the growth we've had with LLMs and with AI in general has been important for the open-source community, because when you have new models, people want to create new use cases, and everything that is making AI grow is 100% related to open-source communities and the open ecosystem. It's very important how that plays out when we talk about software and AI. What I see in responsible AI, and particularly in explainable AI, is that there are a lot of open-source projects aimed at providing explanations for everything. I see a community growing around explainable AI. You have Google's What-If Tool, which is a toolkit for fairness; IBM's AI Fairness 360 is another one. These are the best known, but there are also Intel's explainable AI tools, and AWS has offerings as well. From what I can see, the only way this grows is through the community, and the community is very eager to propose these tools and to use them. Also, it's something you want to use, but I don't see a company creating a solution and selling it as an explainable AI product. Of course, there will be companies that try to sell it, but most of the time these kinds of parallel solutions, parallel explanations, or responsible AI tooling are not the focus. The focus is: okay, I trained my model, I would like to sell the model because I trained it, because it's mine, because I spent resources, time, and money training it, so I don't want to share it with everyone, which is one part of closed software, as you said.
Explainable AI itself is not at the center of the conversation, in my opinion. It's moving towards the center, of course, but I don't see companies trying to sell explainable AI tools. They will probably sell something around explainable AI, around training the model, and so on, but not explainable AI itself. All the companies that are using explainable AI are using the open-source toolkits. Of course, they are building something on top of them, and they are going in that direction. We are at the moment where we really need research, and the research is moving fast. Research happens when you have open-source communities.

[0:31:02] JMC: You yourself work for Open Intel, which is, I guess, Intel's division, business unit, whatever you may call it, that is devoted to open source – correct me if I'm wrong. That is actually my question. What is the charter of this part of the huge organization that is Intel, the part devoted to open source? Specifically, if you can tell us, are you doing anything with regards to AI? You just mentioned one project, I believe, but is there anything else?

[0:31:29] EL: Yeah. Basically, what our team does is that we are in charge of creating the strategy, and evangelizing what Intel is doing in the open-source environment. Intel is a hardware company, of course. We sell hardware. But 80%, or 70%, of our engineering resources are software engineers. For instance, for PyTorch, we are one of the top three contributors, right? We have a team dedicated to contributing optimizations and upstreaming those optimizations to PyTorch. Our hardware has accelerations and optimizations that can be used, and the way to use them is through very low-level libraries. What we do to make it available to most people, and to try to democratize the usage of AI and acceleration, is upstream those libraries to PyTorch, or TensorFlow, or the main AI frameworks. That's one side.

In terms of open-source projects, we have a few. We are not mainly creating software solutions; we always try to upstream what we do with optimizations and accelerations. But I can mention two main projects. One is related to explainable AI, which is the Intel Explainable AI toolkit. It uses SHAP, and we are also creating a model card tool: if you would like to visually see how your model is working, you can visually see the features. It's a mix of SHAP and LIME; it tries to put the two together to give you a visual explanation of fairness, though fairness is only one part of it. It's not attached to any particular hardware. It's a toolkit, it's open source, people can collaborate, and we are eager to get people collaborating on our project.

The other one I can mention is OpenFL, which is a federated learning framework. In some use cases, you need to train on your data in a distributed way, because of privacy, regulations, and so on, and you cannot move the data from one place to another. The data should reside in one country, and you can train the model with the data that is present in that country. What we created is OpenFL, which is a framework. It's completely hardware agnostic, of course, and you can use it to build this kind of solution. We are trying to make it easier; it's just an API that's very easy to use. It works with PyTorch, TensorFlow, and so on.
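OpenFL's own API is not shown here, but the underlying federated idea the guest describes, where data stays where it is and only model updates travel to an aggregator, can be sketched by hand. The logistic-regression setup below is purely illustrative:

```python
# Hand-rolled sketch of federated averaging (the idea behind frameworks
# like OpenFL): each site trains on its own data, only weights travel.
import numpy as np

rng = np.random.default_rng(0)

def local_step(weights, X, y, lr=0.1):
    """One gradient step of logistic regression on a site's private data."""
    preds = 1 / (1 + np.exp(-X @ weights))
    grad = X.T @ (preds - y) / len(y)
    return weights - lr * grad

# Two "countries" whose raw data never leaves the site.
sites = [
    (rng.normal(size=(100, 3)), rng.integers(0, 2, 100)),
    (rng.normal(size=(100, 3)), rng.integers(0, 2, 100)),
]

global_w = np.zeros(3)
for _ in range(20):                       # federation rounds
    local_ws = [local_step(global_w.copy(), X, y) for X, y in sites]
    global_w = np.mean(local_ws, axis=0)  # aggregator averages weights only

print("aggregated weights:", global_w)
```

A real framework adds secure aggregation, plan distribution, and framework integration on top of this basic loop; the sketch only shows why the raw data never has to move.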
The idea of what we do with open source, or with the open ecosystem, is to try to democratize the usage of AI and make it easier to get started. All of our products are based on open standards. We don't have a new language that you need to learn. Everything is based on C++, or Python, or SYCL. Basically, this is what Intel does in open source.

[0:34:27] JMC: I should mention that at Open Source Summit Europe 2023 in Bilbao, where you gave your talk, the UXL Foundation was announced, and the kernel of that, the seed of that, was the oneAPI project, which is also part of Intel, right? It was Intel's way of saying, "Hey, we're providing one single API for you to be able to program, mostly in C++, but also C, targeting any architecture out there." It was open source from the get-go, but now the governance of this new project, the UXL Foundation, which stands for the Unified Acceleration Foundation, is under the Linux Foundation, and many other big companies, alongside Intel, are fully involved. With that one API, you can target any single architecture – in this case, accelerator architectures, so GPUs, but not only GPUs; think of FPGAs and other complex and really powerful pieces of hardware out there – and you can do it without having to care about the target, right? They're providing SYCL and other pieces of software that actually help with that: DPC++, C++, and so on. Yeah, there are a lot of beautiful and great things that Open Intel does and contributes to the world, so I'm quite thankful for the ones you mentioned and the ones you're involved in.

Ezequiel, thanks so much. I will link the talk in which you elaborate much more than what we just skimmed over on the concepts of responsible AI. That one we covered fairly well. But on explainable AI, you get into the weeds and actually walk through a real-world example, a case study. I suggest that everyone interested in these two areas, which are going to be major in developing AI applications, AI software, or the models themselves in the future, go and watch it. It's on the Linux Foundation's YouTube channel, but I'll link it in the show notes. I would just like to thank you, and I look forward to learning more about these areas and having researchers, open-source advocates, and others like you involved in developing these concepts and applications as much as possible.

[0:36:47] EL: Thank you. Thank you so much for having me. It was a pleasure. I hope to see you at the next event. As you said, open source, open communities, the open ecosystem – it's a key part of this development and of trying to make it grow. Thank you. Thank you for having me.

[0:37:04] JMC: I agree. Take care.

[0:37:05] EL: Bye-bye.

[END]