EPISODE 1926 [INTRODUCTION] [0:00:01] ANNOUNCER: Artificial intelligence is transforming warfare faster than the legal and ethical frameworks designed to govern it. Militaries around the world are deploying AI-powered decision support systems to identify targets, assess proportionality, and direct weapons. The gap between what is technically possible and what international law can effectively regulate is widening by the day. Yuval Shany is a law professor at Hebrew University of Jerusalem and a research fellow at the Oxford Ethics and AI Institute. He also served on the UN Human Rights Committee, where he first encountered the legal and ethical challenges posed by autonomous weapon systems. His research focuses on the intersection of international humanitarian law, human rights, and emerging military technologies. In this episode, Yuval joins Matt Merrill for a wide-ranging conversation. They cover topics including how close we are to fully autonomous lethal weapons, the accountability gap that AI-mediated warfare creates, and what lessons software engineers can draw from these challenges when building consequential AI systems of any kind. Matt Merrill is a software engineering leader with over 20 years of experience building and scaling software teams across enterprise and product-focused organizations. His background is in back-end development, cloud architecture, and distributed systems design. He currently architects and delivers software products and leads a team of engineers at DEPT Agency. You can learn more about his work at code.theothermattm.com. [INTERVIEW] [0:01:49] MM: I am here with Yuval Shany, who is a legal scholar of autonomous weapon systems, among other things. We are here to talk about autonomous weapon systems. Before I go too much further, Yuval, could you introduce yourself? [0:02:03] YS: Sure. First, thanks for having me. I'm a law professor. I teach at Hebrew University of Jerusalem. I'm also a research fellow at the Oxford Ethics and AI Institute. Like you said, I've been working for the last 10, 15 years on a variety of law and technology topics, including autonomous weapon systems. On top of my academic role, I served for eight years on a UN body called the UN Human Rights Committee, which is a committee of experts that monitors the ways in which basic human rights standards are enforced throughout the world. It was actually during my time on that committee, from 2013 to 2020, that I increasingly encountered questions relating to human rights, humanitarian law, that is the laws of war, and new technologies, including the question of whether the development of lethal autonomous weapon systems is compatible with basic human rights standards, specifically with the right to life. This was the first time I had to, in a way, apply my mind professionally, so to speak, to the issue, and to realize that we do have some major gaps in international law to satisfactorily address this issue, which is a very fast-moving target that is changing quite dramatically the ways in which wars are being fought and the ways in which human rights are being upheld. [0:03:29] MM: Wow. Okay. My next question was going to be, what put you on a path to do this, and that is quite a story. You are in Israel, so you are also, in a way, personally affected by this now. Is that correct? Maybe indirectly, or directly.
[0:03:44] YS: Yeah, I mean, being in Israel means, of course, being in some respects on the frontier of these developments, because first, you are in a situation where there is a longstanding and difficult conflict. Also, in terms of technology, the technology is very much part of the story, because the Israeli military has been a very fast adopter, and among the first adopters, of the technology. Of course, whatever the Israeli military does generates a lot of attention. I think there has been a lot of spotlight directed towards this topic. I am in some respects on the front lines of these issues, in a metaphorical, but not only metaphorical, way. [0:04:25] MM: Yeah. Before I get into how I want to frame this chat, I do want to ask, you're a legal scholar primarily, but you're starting to get into technology. This is a podcast for software engineers. Just so we know, what's your level of comfort with this type of stuff? Have you gotten deep into this with your research? [0:04:44] YS: You mean the technology side? [0:04:45] MM: Yeah. [0:04:46] YS: Yeah. I mean, part of what I think drew me to working more on law and technology was the curiosity of simply understanding better this world around us. I've been working for almost the last decade together with computer scientists at Hebrew University, and we have a joint research center, and I'm writing also with technology people. I don't purport to actually have a full grasp of the technology, but I think over the years, working alongside people with a computer science background, I did pick up some important features. I think I'm able to understand, at least in part, the main contours of some of the technological aspects of the legal dilemmas I'm dealing with, but we'll let the audience judge on this in the end. [0:05:31] MM: Yeah, that's not a question meant to be judgmental. More, just so people know where you're coming from. Also, I would make the argument that even the best computer scientists right now, with the exception of maybe some folks working at some of the big AI companies, are also just starting to grasp what all this means and to understand it as well. It's a brave new world that is exciting and scary, as I think we're going to talk about. [0:05:55] YS: Yeah. Right. [0:05:56] MM: The way I'm going to ask this is actually very authentic, right? I've been a developer and an engineer for a long time. I understand technology. I won't purport that I understand all of this gen AI technology, but I know nothing about weapons. I know nothing about war, or anything. I know nothing about law. Maybe nothing is a stretch. But I'm going to come at it from that point of view with the hope that some of my engineering knowledge will resonate with this audience. Also, taking the software engineering piece out of this, this is just a fascinating topic that I got very intrigued by while I was researching it, and intrigued might not be the right word, because it's actually very scary. Let's jump right in. You have a paper that you wrote called Program to Obey. I started reading that paper and I love what you did in a legal, scholarly paper: you hooked me with a reference to robot killers, like The Terminator and the things in Black Mirror. It's such a great way to ground the audience, because it really does resonate. I'm curious, can you talk a little bit about how close you think we are to having that as a reality right now? [0:07:08] YS: Yeah. So, thanks, Matt. Yeah.
I think it makes maybe a good attention grabber when you write what could otherwise be regarded as a boring academic paper. I think this is actually a field where science fiction and films and popular culture have very much captured the imagination. Even when the lawyers and the technology people are sitting in their room, they're often thinking of Terminator-like situations, where you would see these robots, these killer machines, moving around and shooting at people. It's probably not accidental that one of the major public campaigns against these weapon systems has been called Stop Killer Robots, right? This has really been a very influential imagery. Where are we in reality? We're not really there, in the sense that I'm not aware of any military around the world that is developing the stormtrooper kind of robots that would move around battlefields. This is not the main effort that we are seeing. But we are seeing a lot of stuff which is going, at some level, in that direction. That is mostly around the creation, basically the upgrading, of existing weapons systems such as drones that have been, up until now, remotely controlled by human operators - basically, adding on to them autonomous functional capacities, so that they would be able to operate without the human who is remote controlling them. This is something that we already have. I mean, we already have drones that are being developed, sold and also put into use that have the capacity to, what is called, loiter over a set of potential targets, identify and lock in on a target, and engage that target without human control of the final and critical stages in the process. The human would deploy the system. For instance, the Israeli military has a system called Harpy, which is a drone system that is designed to neutralize air defense systems. They can loiter around a certain area. Once they pick up a radar signal, they can then engage the target and basically operate like a kamikaze drone and explode themselves over this radar system. This is something that we already have. [0:09:33] MM: Without human interaction, is that correct? [0:09:35] YS: Without human interaction. Exactly. Exactly. What we also have is the US military already working very seriously towards drones that would be able to operate without human control in what are called communication-denied environments, in situations where the signal needed for remote control is no longer available. These drones would have a backup system called CODE, which stands for Collaborative Operations in Denied Environment. This is already something which is quite close to becoming operational. We also have within the US military a program that is based on what is called JADC2, which is Joint All-Domain Command and Control. This is a system that, once fully operational - it's not fully operational, but once fully operational - would be able to identify threats automatically, then designate a response, a weapon system that would deliver a response. Then that weapon system could also operate without a human in the loop. This is sometimes called the Internet of military things, basically connecting both defensive and offensive capabilities, so that they are able to react very fast to challenges without having humans in the loop.
These are not killer robots, but they are systems that do have, in a way, the same feature of having a weapon system eventually pull the trigger without a human being taking part in the critical and final stages of the process. The other thing I will say, to close this very long answer: what we already have, and have very extensively in a number of militaries, are decision support systems, which would be algorithmic systems that do not actually pull triggers, but they issue recommendations. They would recommend to military officials what targets to engage, what weaponry to use, how to conduct what is called proportionality analysis to ensure that the harm, the collateral damage to civilians, is not excessive. These systems are being heavily used by militaries, including the US military, even now as we are speaking. [0:12:01] MM: As you were talking about that, I couldn't help but think of how much I rely on something like ChatGPT. As an example, I used it to repair a pipe in my house the other day. I needed a recommendation for how to repair my pipe, and I just took its recommendation. When you think about that scenario, I would imagine that military commanders are going to end up in the same situation. Yeah, we'll just trust it. Even that is concerning. [0:12:28] YS: Yeah. I mean, the US military in Iran is currently - I mean, it has been reported that it is using Claude, Anthropic's Claude, in order to identify targets. This is somewhere you're already seeing generative AI being put into the battlefield and making recommendations, which, of course, raises a lot of issues, which we can discuss later on, regarding both the legal and ethical dimensions of this. [0:12:51] MM: Before we move on from that, are you aware of this being used at all in the Ukraine conflict? [0:12:57] YS: In Ukraine, yes. I think the system that has generated the most interest so far in the context of Ukraine is a system called GIS Arta, which is an artillery directing system. Sometimes it's called Uber for artillery, where the algorithm is identifying a certain military threat and is channeling, basically, the responsibility to react to this or that artillery system that is deployed in the relevant battlefield. In terms of defensive operations, yes, the Ukrainians are using this. [0:13:33] MM: Wow. [0:13:34] YS: There is also a lot of use of drones, including kamikaze drones, on the other side, but these are typically still human-operated drones. [0:13:42] MM: That's fascinating. What did you call it? Uber for what? [0:13:45] YS: Artillery. [0:13:45] MM: Wow. Okay. All right. Before we dive much further, I think it's pretty easy to see the negative consequences of this, right? But there are some upsides to this. Can you talk a little bit about that, just to make sure that we're seeing the other side of the argument? [0:14:02] YS: Yeah. I mean, I think for the same reasons that you're using it for your pipes and that a lot of people are using AI for a variety of functions, the advantages of using AI in a military context are obvious. I mean, it provides you with the advantages of speed. It provides you with the advantages of scale. It provides you with the advantages of scope.
If you're keen on winning a battlefield situation, and you now have a system that can basically generate fire, still within a legal frame - and we can talk about this further - but in a much more rapid and much more effective way, this is something that militaries are actually quite keen to have. I mean, it is a significant force multiplier, in the same way that it is a force multiplier in non-military contexts; it does things faster and often better than human beings. In a military context, there is also a force protection dimension. Instead of sending, basically, a human pilot on a dangerous mission, if you can send an AI system, you are, of course, maybe saving lives on your own side. Sometimes one could even say, well, you could actually take greater risks with an AI, because you're not so much concerned about the downing of a drone. I mean, no significant harm. You will not, for instance, engage in high-altitude bombing in the same way that you sometimes would with a human pilot, so as to provide that human pilot with better protection. Even from a humanitarian point of view, actually, sometimes using an AI system could generate better outcomes. It's also, of course, much cheaper in many contexts. Instead of operating 1,000 intelligence officers to sift through huge mountains of data and provide you with target recommendations, maybe you could now get a generative AI system to do the same thing in a fraction of the time and cost. There are obvious advantages, and this is why, I mean, governments are rushing to adopt and integrate this technology. [0:16:09] MM: That makes sense. Also, what you said earlier about the decision support systems. They can more precisely identify targets and ideally reduce collateral damage as well. I mean, even when you think of something like a kamikaze drone compared with a missile that's just going to blow up a certain radius right there - I'm using air quotes, if you could see me - well, "that's an advantage," right? If you need to attack an enemy, at least you hopefully won't kill any civilians. [0:16:37] YS: Sure. I mean, in the same way that we are talking about personalization that is supported by AI systems. We often think about it in contexts which are more benign, such as personalized medicine, or personalized education. This also works at some level in a battlefield scenario, when you could actually, instead of just dropping a big bomb on a crowded area, pinpoint and basically go after a specific individual with a drone that can target that specific individual. Having said that, like with other AI systems, we do have to think also about mistakes and about mishaps and about hallucinations and about all sorts of assumptions that are driving these systems. I think the jury at this point in time is still out on the question of whether, when you scale this up, this generates on the whole better outcomes than human beings, or worse outcomes. I think the trajectory is almost inevitable that we would get, in the long run, better outcomes. But at this point in time, this is not yet obvious. [0:17:47] MM: It's really interesting, because I see a lot of parallels with the industry I'm in, which is we have a lot of C-level people saying, "We got to get on board this agentic coding workflow. It's going to change the game and this and that." I think that people, including myself, are very excited about the possibilities of it.
But we're a little bit skeptical of like, well, can we see the results first? Can we run some experiments? Everybody's charging full speed ahead on it. I think I'm hearing similar things here, which is we're being, dare I say, reckless with the technology, assuming that it will work out, with perhaps not the proper safeguards in place. One of those safeguards is international humanitarian law, which we're going to get into. One of the things in your research - it's in multiple papers that you've written - is the concept of human choice and restraint. This is a quote from one of your papers, just for the listeners. I thought it was very profound. "The preservation of choice works against reducing combat operations to a programmed application of military necessity, thereby reasserting human capacity to step outside the four corners of what is legally permitted, but not legally required." I thought that that was very profound. I'm wondering if you can talk about why this is important and what the loss of human choice and restraint might mean. [0:19:10] YS: Right. The laws of war - international humanitarian law, as they are normally called these days - are basically a legal framework that is applicable to situations of armed conflict, with the ideological goal of minimizing the harm and suffering that is associated with war. It's not a pacifist framework, because it does accept as a point of departure that wars continue to occur. But its mission, so to speak, is to try to minimize as much as possible harm and suffering, while maintaining the capacity of militaries to still conduct military operations and obtain the goals that they are trying to achieve through the application of military force. What I write in this paper - and I should say that some of these papers I co-wrote with a colleague called Yahli Shereshevsky, who is also a lawyer - we were trying to make the following argument, and that is that the laws of war should be understood conceptually as a normative floor, not as a normative ceiling. They basically introduce some very rudimentary and basic safeguards, such as that you do not deliberately target civilians. Or that, when you operate in mixed terrain where civilians and combatants are intermeshed, you try to apply distinction. When this is impossible, you apply this idea of proportionality, that when you target the militants, you try to minimize as much as possible collateral damage to civilians. The understanding of this field was always that this is, like I said, a floor and not a ceiling, and that in reality, parties to a conflict wouldn't actually use this license to kill that they have been given to the fullest extent. There are other constraining factors that would sometimes push a soldier not to pull the trigger, because he might feel sorry for the other guy on the other side. He might feel that this guy doesn't really threaten him, or it's a close call, and he or she is not 100% certain that this is the right thing to do morally. Or maybe you just don't want to escalate the conflict, so you don't go after certain military targets, although legally, you are able to do so. Sometimes you could be concerned also about reciprocity. I mean, in certain contexts, you don't want to harm, for instance, political higher-ups who oversee military operations. You don't want to go after the political echelon, although legally you may be able to do so, because you don't want the other party to do the same with regard to your guy.
The idea is that there are all these sets of constraints that go over and beyond the law. In actual conflict, these constraints are actually quite important in reducing the overall amount of carnage that parties to a conflict inflict on one another. The concern here is that once you remove this additional discretionary dimension, which is not legally codified, but is an important feature of this human experience that is called war, you basically remove a very important layer of protection that up until now operated as a form of restraint on the extent of harm that is inflicted by war. What you have done is, actually, you have collapsed the floor into the ceiling. Basically, the license to kill, your power to kill, could very easily, with AI systems, be translated into simply killing all the legally permissible targets within a certain zone of conflict. This would significantly increase the amount of killing that is associated with war, transforming - and this is what we claim - war from effectively a human endeavor, a very difficult and terrible human endeavor, into what is effectively an industrial-scale engagement, where machines are actually calling the shots and not human beings. This has been the main argument that we were making in policy terms. There are also more subtle, I would say, moral arguments concerning the importance of human agency in taking life and death decisions. There is something ethically, and also arguably legally, very troubling about delegating the power to decide who lives and who dies to an algorithm. In a deep sense, this is a dehumanizing decision, because it basically transforms the other party on the receiving end from a human being whose life is weighed by another human being - if the human judgment directs in a certain direction, he or she may be killed, but he also may be spared - to basically a set of data points which the algorithm analyzes according to a certain fixed formula, deciding whether this generator of data should continue to exist, or not exist. This is ethically incompatible with a notion that in law is very important, and that is the idea of human dignity, which underlies the whole field of human rights, which takes the view that the lives of humans are valuable, and we should treat other human beings in ways that are compatible with this understanding of the value of their lives. It doesn't mean that you cannot take lives in certain contexts, certainly in the battlefield scenario, but you have to weigh in the process the value of their lives. This is arguably something that an algorithm would struggle to weigh in the same manner. This is why, even if the algorithm would actually do a better job than human beings - both for reasons that have to do with the scale of killing, but also because of this question of the ability to consider the value of human lives - we think it's probably not a good idea. You have to be very certain that you're going to deliver a more effective and less harmful outcome in order to make the transition. Like I said before, we're not there yet. The jury is still out on this question. [0:25:30] MM: Yeah. Because based on what I know, it may even be questionable to call what these gen AI models are doing an algorithm, right? Because in computer science terms, a lot of what they're doing is not deterministic. You can make it deterministic with the use of regular computing.
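To make that point about non-determinism concrete, here is a minimal, purely illustrative sketch in Python - a toy next-token sampler, not any real model or military system, and every name and number in it is invented. Run with sampling, the same input can produce different recommendations each time; run greedily (temperature zero), the output becomes repeatable, though no more explainable.

```python
# Toy illustration only: how a model turns scores (logits) into a probability
# distribution and then samples from it. No real system is being described.
import math
import random

def softmax(logits):
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pick(options, logits, temperature=1.0):
    if temperature == 0:
        # Greedy decoding: always the highest-scoring option (repeatable).
        return max(zip(options, logits), key=lambda pair: pair[1])[0]
    probs = softmax([l / temperature for l in logits])
    return random.choices(options, weights=probs, k=1)[0]

options = ["engage", "hold", "request review"]
logits = [2.1, 1.9, 0.4]  # invented scores, for illustration only

print([pick(options, logits) for _ in range(5)])                 # varies run to run
print([pick(options, logits, temperature=0) for _ in range(5)])  # always "engage"
```

The point is not the few lines of math; it is that identical inputs need not yield identical outputs, which complicates both testing before deployment and review after the fact.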
That leads me to a question of, and I'm curious if you've had the same question of, well, can you prompt, or train, the model to have at least a semblance of human choice? Because these models tend to maximize reward signals, right? I think what you're describing is actually the absence of action. I don't know if these models are able to do that, but can you prompt it to say, "You need to weigh whether or not this is worth doing with these factors"? Have you guys explored that in your research, or even with your computer scientists that are working on this? [0:26:26] YS: Yes. I think you're right. I mean, the first intuition is that these are systems that typically prioritize effectiveness. If the main function of the system is to kill enemy soldiers, this is probably going to be the main vector of their operation. It's rather unlikely, and I think also complicated, to introduce values such as compassion, or restraint, or empathy as the values on whose basis effectiveness will be measured. But it is possible - and we are seeing this in the generative AI space in other contexts - for AI systems to mimic human behavior, including to mimic human feelings. We actually make the caveat in our paper that if it can be shown that, in functional terms, AI systems could actually introduce a parallel level of restraint that would be analogous to the level of restraint that human beings would apply in analogous situations, then our argument to some extent weakens, certainly the argument concerning the overall killing spree that we are going to let loose by relying heavily on these machines. There are still questions regarding the authenticity of any display of emotion and whether you could actually fully comprehend and internalize the lives of human beings. It's important for me to say - and I think this is a trap that many in the field are falling into - that I don't want to over-romanticize human beings, of course, and human decisions. Humans are often poor decision makers, especially under conditions of uncertainty, but also under the kind of emotional pressure that you see in a war zone. We're talking about compassion and empathy and restraint, but sometimes, of course, it's exactly the other way around. It's anger, it's hatred, it's revenge, it's fear. This is actually an area where, at least on a utilitarian level, maybe AI systems could do a better job by actually shutting down certain negative emotions. These are certainly emotions you don't want AI systems to replicate. It is a close call. I think in the end, we would need to have more data about how these systems operate and how effective they are in complying with the law, and also in embracing some of the positive constraints we want them to embrace. Only then, maybe, could we make this final judgment as to how much we want to introduce and rely on them. Because the system that I described before is, at this point in time, not a system that is directed against human beings. This is a threshold that we still haven't formally crossed, of having an autonomous system that is trained and tasked with killing individuals. What we currently have are those decision support systems that are still recommending to human beings to do that.
We think that, from a human dignity point of view, there is a big difference between a decision support system which recommends something - where a human being can still say, "Well, I mean, I'm not going to follow that recommendation. Although I accept that this is a lawful target, there are reasons why I don't want to engage this target." - and a fully autonomous system. We still haven't crossed that threshold, and we are warning in our papers against crossing that threshold without stronger data on how these systems actually function. [0:30:15] MM: Right. Also, it's very easy - for now and even in the future, assuming that you can mimic those human qualities - for models and drones and whatever else to be deployed without that, and to operate at the same scale without those constraints as well. That's a little bit scary to think about. [0:30:35] YS: Exactly. That goes a little bit to the question of, also, once you develop the system and you afford a decision maker the tool, whether that decision maker would be interested, like you say, in further developing a function such as compassion, or empathy, that may go against their political designs. Frankly, I haven't heard of any military program that is actually introducing this mimicry of restraining factors into these systems. This back and forth that we have recently seen in the US between the Department of War and Anthropic is really pushback by the Department of War against a technology company insisting on placing certain guardrails, which go against, maybe, the political designs of the department. The pushback, at least from the customer, has not been to introduce more safeguards, but rather, to remove safeguards. [0:31:35] MM: Wow. Yeah. The Anthropic thing, that has been in the news very recently. Anthropic has been known as the counterpoint to OpenAI, I guess. That's how you can describe it. But they are against this type of thing. I guess what I'm wondering is, do you think that that's enough to make a dent in this argument against using these right now? Or do you think that that's just too little, too late, or? [0:32:00] YS: I think, of course, as we know, OpenAI did take this contract. The fact that one actor insisted on placing limited trust in, in that case, the US Department of War's understanding of what operating under the law means in this context, and on introducing some hard stops - this didn't stop, of course, the process, because then another company that maybe was not as concerned with the same ethical issues stepped in. Although I should say, in OpenAI's defense, they argued that they did negotiate with the Department of War an interpretation of the existing law which is ethical. They did also join the Anthropic lawsuit as a third party. [0:32:47] MM: Oh, I didn't know that. Okay. [0:32:49] YS: Yeah. They didn't leave Anthropic completely out to dry. They're also concerned, I can imagine, about their reputation, which is also, I think, important in this context. There is, of course, a very big difference between not providing a system if you're not comfortable with the way the customer, the end user, might use it, and providing it and relying on some representation. I don't think that this is going to, in itself, stop the process in this direction. Currently, I should say, US law has some safeguards regarding the use of AI systems, in that they have to be operated with what is called appropriate human judgment.
Humans have to be involved, but it's not obvious in exactly what capacity. [0:33:37] MM: Yeah, what does appropriate mean? [0:33:39] YS: And at what stage of the process do they have to be involved? Even with regard to decision support systems, we've seen a lot of pushback against this idea of maintaining a human in the loop, or on the loop, as a notion that operates only in a nominal and performative way, without actually having the human exercise meaningful levels of control and oversight over the systems. This is a problem that is not going to go away. Even if Anthropic would have gotten its wish and would have gotten a system where you need an individual to pull the trigger - the metaphorical trigger in such cases - to give the go-ahead, it's less and less feasible that that human individual would be well situated to exercise meaningful review over the way the system would operate. I heard last year in Oxford this lecture by Geoffrey Hinton, who's sometimes referred to as the Godfather of Deep Learning, who said something like, there are very few examples in history where a less intelligent being was able to control a more intelligent being. I'm afraid that this is the trajectory that we find ourselves on here, especially with war becoming such a complex, bureaucratic engagement, where very few individuals actually have the big picture and can exercise meaningful control over very specific tactical decisions. [0:35:21] MM: Okay. That, I think, goes nicely into the concept of what you've called in your papers meaningful human control. That comes up a lot. Can you say what this is and why it's pretty subtle, but very important? I think it's directly related to what you just said. [0:35:38] YS: Right. There are negotiations that have been taking place for the last 11 years in Geneva between a number of states. This is how international humanitarian law is periodically updated. Whenever you have a new technology and a new weapons system, what countries sometimes do is get together and discuss whether we want to use this weapons system, and if so, what constraints we agree to impose on those systems. For instance, over the years, weapons such as landmines have been partially banned because of civilian casualties. Even when they were not banned, there were all sorts of limits, such as requirements to mark minefields, etc., so as to minimize collateral harm in this regard. We do have a parallel process now taking place with regard to lethal autonomous weapons systems. Many countries have opposed the idea that such weapons systems would be introduced without meaningful human control. This is the term of art that emerged during these negotiations to distinguish between completely autonomous weapon systems and weapon systems that still have a human in the loop, or on the loop, exercising a meaningful level of control. The reason why meaningful human control has been so successful - although I should say that the United States has not accepted this standard and, like I said, uses appropriate judgment as the alternative standard - I think meaningful human control has been relatively successful because it means very different things to very different audiences, right? Because in some contexts, you do imagine this situation where you have a recommendation and then the individual would weigh the recommendation and decide whether to accept it or not.
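As a software illustration of that distinction - a minimal sketch with entirely hypothetical names, not a description of any real system - the difference between nominal and meaningful involvement often comes down to where the default points: whether a pipeline halts unless a human explicitly approves, or proceeds unless a human objects in time.

```python
# Hypothetical sketch of a human-in-the-loop gate; all names are illustrative.
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional

@dataclass
class Recommendation:
    target_id: str
    rationale: str      # the system's stated basis for the recommendation
    confidence: float   # model-reported confidence, not ground truth

@dataclass
class Decision:
    approved: bool
    reviewer: Optional[str]   # None means no human actually decided
    reason: str
    decided_at: datetime

def review(rec: Recommendation,
           ask_human: Callable[[Recommendation], Optional[Decision]]) -> Decision:
    """No explicit, recorded human approval, no action."""
    decision = ask_human(rec)   # may return None if no reviewer responds in time
    if decision is None:
        # Fail closed. Auto-approving on timeout would reduce the human role
        # to the nominal, performative involvement described above.
        return Decision(approved=False, reviewer=None,
                        reason="no human review within the time limit",
                        decided_at=datetime.now(timezone.utc))
    return decision
```

Both designs can be advertised as having a human in the loop; only one of them forces a person to actually exercise judgment before anything happens.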
Then, of course, the AI is simply a tool to assist the human being. But then, there is a question of what you do when there is time pressure and you need to work very fast. Here, some would say, okay, in these conditions, meaningful human control means you cannot use an AI system to take human lives, because the individual has to exercise an informed level of judgment and control over the recommendations that are issued. Some would say, no. Meaningful human control would mean that the system had been test run before and has been adequately programmed. We know, more or less, what we expect to obtain from it. This is the idea of interpretability and maybe traceability. Therefore, this suffices as meaningful human control. In the aftermath of what happened, we will investigate, and if there have been some problems, we would then tweak the system and ensure that it does not repeat mistakes. I think meaningful human control basically insists on the continued role of human beings in the process, but what exactly the human beings would do remains contested. One aspect that did emerge in the negotiations that are still ongoing in Geneva, now for many years, is that you do have to look into this issue of accountability as one aspect of control, so that, if something goes wrong, you would have a human person that you could actually hold accountable for the malfunction of the system. That is probably the flip side of control: being in a position to be accountable for the flaws, or violations, that have been facilitated through the use of those systems. [0:39:19] MM: Something you said in there was fascinating to me, which is meaningful human control may mean that the software has been coded properly - and whatever properly means, right? I couldn't help but go to my software development life cycle knowledge and think, well, how do you test this? How can you possibly test that an autonomous weapon will be effective at the scale of war, when it's physically doing things to people? I would make the argument that you can't. That you're testing it in the battlefield in real time. [0:39:49] YS: In law, I mean, every weapon system has to be tested before it is actually put into operation. Whenever you have a new technology, you have to undertake what is called a legality review, which is often something you do in these contexts through simulations. You're right that once you're talking about, basically, sophisticated machines that also have a deep learning capacity - [0:40:14] MM: That are also non-deterministic, right? [0:40:17] YS: And don't rely only on training data, but also on post-training data. It's going to be extremely difficult to actually anticipate. We know from other AI systems that it is very hard to anticipate the outcomes - or the decisions - that are generated by them. I guess, what you have to do in those situations - I mean, those who would support this more narrow interpretation would say, you have to do the best you can. Then you have to, in a way, investigate and interrogate after use, do all the debriefing, and understand what had happened. If you do have these unexpected implications, then you might return it to the store and try to basically - I mean, part of the question here is, what are going to be the use cases, right? Even those who are quite skeptical of using AI systems - I mean, people from the International Committee of the Red Cross, who are, so to speak, the guardians of humanitarian law.
They do not advocate a ban on autonomous weapons systems. But they would advocate, for instance, significant limitations on the use of those weapons systems. You use them in a very narrowly defined set of situations. For instance, in specific geographical zones where you know that there are very few civilians around. Or when you basically use them only with regard to physical infrastructure, but you do not direct them to target human beings. These more restricted applications could be mandated if you see that, once you set them loose, I mean, they are generating terrible consequences. Like we said before, beyond questions of accuracy, there are still difficult ethical issues that would need to be cleared. Probably, at this point in time, the legally prudent thing would be not to introduce these systems until we know more about how they're going to operate. You're probably right that we're not going to know a whole lot more until they're actually tested in fire, so to speak. I would say that already, from the decision support systems, you can generate some alarming information about how they're being applied. For instance, we know that the US military is heavily relying on AI systems in order to identify militants in hybrid environments, or in asymmetric conflicts, where you have militants who typically do not wear uniforms, and they blend within a civilian population. I mean, Afghanistan was a case like that. Afghanistan, Pakistan. There was, and still is, an extensive reliance in this regard on what is called pattern of life analysis. You basically collect a lot of visual information, a lot of reconnaissance and surveillance, which could be 24/7, ubiquitous and constant. On that basis, you basically identify militants if they drive certain vehicles, if they enter or leave certain safe houses, if they associate with others whom you have reason to believe are militants. This methodology has been heavily challenged by human rights groups, who argue there are a lot of false positives here. There are a lot of assumptions being put into the data. It also facilitates a lot of surveillance, which is in itself a cost that has to be factored in. I mean, people in Afghanistan, some of them have literally complained about living under the shadow of drones, the constant hum of the drone that has actually adversely affected their mental health. The fact that you have 24/7 movement of drones over your head, humming, this is in itself a cost that needs to be internalized. Or in the Israel-Gaza war, there have been, for instance, concerns about the use of cellular reception signals as a factor in a proportionality analysis. If you want to know how many civilians are in a house in which you plan to target a militant, the Israeli military was heavily relying on systems that could track down and count the number of cellular signals. Of course, the pushback was that this is a very inaccurate proxy, especially in an environment where you have a war; electricity may not be running, phones may not be charged. There could be a lot of reasons why you got it wrong. The kinds of systems that we already have are already generating concerns about the methodology that is being used to reach the determinations that are being drawn from them. Now, if you remove humans completely from the loop, the outcomes could be much more serious and much more significant. [0:45:20] MM: This is fascinating.
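To show how sensitive that kind of proxy is to its hidden assumptions, here is a deliberately simplified, hypothetical calculation in Python - the numbers are invented for illustration and describe no real system.

```python
# Hypothetical illustration of a proxy-based estimate; every number is invented.
def estimated_occupants(detected_signals: int,
                        phones_per_person: float,
                        fraction_powered_on: float) -> float:
    """Estimate how many people are present from the phone signals detected."""
    if phones_per_person <= 0 or fraction_powered_on <= 0:
        raise ValueError("assumptions must be positive")
    return detected_signals / (phones_per_person * fraction_powered_on)

signals = 3  # phones detected near the structure (illustrative)

# Peacetime-style assumptions: most adults carry a charged phone.
print(estimated_occupants(signals, phones_per_person=0.9, fraction_powered_on=0.9))  # ~3.7

# Wartime assumptions: power is out, few phones are charged, families share phones.
print(estimated_occupants(signals, phones_per_person=0.5, fraction_powered_on=0.3))  # 20.0
```

The same three detected signals support estimates that differ by more than a factor of five, which is exactly the kind of buried assumption the human rights critique is pointing at.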
You mentioned this before, too, about what happens when you're in a war zone and cellular signals go down, right? Just the immense physical challenges of warfare and these systems being able to perform in that environment. I mean, the technical challenges around that are mind-boggling. Also, to your point earlier, how do you even put a human in the loop if you can't make connectivity to these things? How does that even work? That's more of a rhetorical question, but I am curious, are there advancements? It sounds like it is on the radar ethically and legally, but maybe you could talk more about that. [0:46:05] YS: Well, I mean, like we said before, some of these systems are being developed exactly for what I call the communication-denied environment, where you do not have the ability to control them. They are actually geared, sometimes, specifically towards these sorts of environments, where the fog of war is making it very difficult for human beings. Then you basically place trust in those systems and rely on all these proxies, which I mentioned, that could be highly problematic. I mean, we still don't have very good data, because part of the problem here is that military organizations are not the most transparent organizations in the world. Also, if a person has been targeted, you have a very strong circling-the-wagons kind of phenomenon, basically saying, well, that person was targeted because he was a terrorist, or because he was a militant. There is limited willingness and interest in actually opening up those assumptions. [0:47:02] MM: Well, yeah. Even the definition of terrorist, right? [0:47:05] YS: Exactly. If someone is a male, a young male who associates with other young males who carry weapons, I think there's going to be a very easy assumption made in this regard. There's also a question of access. How do you access a battlefield and actually get corroboration that you got it right, or even that you got it wrong? You do have here some very serious methodological constraints. Even when you try to be more accurate, this could backfire spectacularly. For instance, again, the Israeli military in Gaza: they tried to hunt down suspected terrorists whom they knew where they lived, so they had the address. The way they would operate is, once they could associate a terrorist with a certain address, when the person entered his or her home, this is when the strike would take place. The result was that the family who lived in the house became collateral damage in this context. You increased accuracy in terms of target identification, but you generated far more collateral damage than you had before. It's an extremely complex formula. I think this goes to the other point that we were discussing before, and that is the level of human control you can reasonably exercise under such conditions of uncertainty, especially when you have an operation that is very bifurcated, or multifurcated. I mean, you have different agencies who are responsible for different stages in the process, and the decisions are coming across your desk. If you're an intelligence officer - I think the Israeli military published, even before the recent Gaza war, the point that if in the past it took them about a year to generate 100 targets through the work of intelligence agencies, now with AI they do the same thing within a week. The speed has really picked up. During wars, now you have this basically real-time updating of target lists.
In the past, we were in a situation where militaries would run out of targets quite quickly. Now, it's exactly the opposite. The targets never stop, basically, showing up on your computer screen. The question, especially in intense combat, is how much effective oversight you as an intelligence officer could actually exercise. There were arguments made, again in the Gaza context, that Israeli intelligence officers in some cases had 20 seconds to basically go over a recommendation before they would push it to the next room, in which another team would decide which weapons to use, and another team would then go do the proportionality analysis. But at the critical link of target selection, allegedly, there were only 20 seconds for an up-or-down decision by a human being. Add to this all the considerations of uncertainty, and we are headed towards a reality where these systems are not going to be effectively monitored and controlled. [0:50:13] MM: I keep thinking about the parallels of AI. I'm not trying to compare war against these things, but when you step back and look at the problems, they're very similar. Using AI in healthcare, right? Are doctors just going to become completely reliant on these models to identify problems and lose touch of, like, no, that looks like an outlier, we should take a look at that? Or development workflows. Well, okay, we're going to have agents producing thousands and thousands of lines of code, but how do we make sure that those lines of code are doing what we want them to do, and that they're of quality and they're maintainable? Yeah. Again, it's not fair to compare those things, but when you squint, they're very similar things that I think the entire - I'll just call it - industry of AI, whatever we want to call that, is struggling with. I get the sense from our conversation that international humanitarian law is not set up to handle this. I think that my question as a layperson in the legal realm is, how far behind is it? Has it even caught up with technological advances from 10 years ago, 20 years ago? [0:51:18] YS: It's a system that hasn't really caught up. I mean, some of the rules are very old, from the mid-19th century. Some of the rules have been updated after World War II. I think generally, international law is a system that plays catch-up with reality. I think that's always a problem with law and technology. Law is often lagging behind technology. International law, because it's such a cumbersome system that works on the basis of broad consensus, I think is particularly bad in this regard. The short answer to your question is that it really didn't catch up. I mentioned this process of negotiations over lethal autonomous weapons systems. Maybe I will add that this process has been going on for 11 years. Many of the horses might have already left the proverbial stable. I mean, a lot of people - and I am one of them - think that this has been a bit of a red herring, because we focused on these killer robots, while the reality is that these decision support systems have in the interim been introduced, and they are already being basically battle tested, and we do not have a good legal framework to deal with them. Especially, I would say, on issues of accountability, because accountability is really a critical link in the chain of any legal system.
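On the engineering side, one partial answer to that accountability gap - sketched here with hypothetical names and no claim about any real deployment - is an append-only decision log: every recommendation that reaches a person is recorded together with the model version, a digest of the inputs it saw, and who approved or rejected it and why, so that a later investigation has something concrete to interrogate.

```python
# Hypothetical sketch of an append-only decision log; all names are illustrative.
import hashlib
import json
from datetime import datetime, timezone

def append_decision(log_path: str, record: dict) -> str:
    """Append one decision record and return its content hash for later audit."""
    record = dict(record, logged_at=datetime.now(timezone.utc).isoformat())
    canonical = json.dumps(record, sort_keys=True)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps({"sha256": digest, "record": record}) + "\n")
    return digest

append_decision("decisions.log", {
    "recommendation_id": "rec-001",           # illustrative identifiers only
    "model_version": "dss-2026.01",
    "inputs_digest": "sha256-of-input-data",  # hash of what the model actually saw
    "reviewer": "analyst-17",
    "outcome": "rejected",
    "reason": "source imagery appears out of date",
})
```

A log like this does not close the legal gap being described, but without something like it, even a willing investigator has nothing to reconstruct.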
Here, we have actually created a structure by which it's going to be almost impossible to hold someone accountable for violations of the laws of war which are committed through the mediation of AI systems, which paradoxically generates a huge incentive to use them more, and not less, in a military context. [0:53:00] MM: One of my questions was, as a software engineer, if you're directly or indirectly contributing to these efforts, could you be held accountable? Or could your company be held accountable? I think what I'm hearing you say is that's such a muddy line that the law probably says no. Is that correct? [0:53:19] YS: At this point in time, there's no clear legal prohibition against the use of military AI. What you have in this terrain is a combination of the problem of many hands. You have a product that has gone through a developer, and then it has been marketed, and then it has been purchased, and then it was introduced, and then it was deployed. You have this very long supply chain, or value chain, and it's very hard to make the case that the person at the beginning of the chain could actually anticipate the particular use and harm that had been created at the end of the chain. I think it's going to be extremely difficult. On the military side of things, you also have to add this black box context, where it's quite unlikely that the people who operate the system would have a full understanding of what the system can do and what it cannot do. Even if you say, well, they are negligent in using a system that they do not understand - what it can do, or what it cannot do - at least from a classic laws of war point of view, that is not good enough. In order to prosecute someone for a war crime, you need to show intent. You need to show that the person actually knew that what he or she was doing, or not doing, would at least highly likely generate a certain outcome. If you don't understand how an AI system works, then this is a threshold that is not likely to be met. In the last couple of weeks, there has been some speculation about - on the first day of the Iran war, there was a targeting of a school by the US military. There has been some speculation that an AI system was involved in the targeting decision. We don't know that, I should say. It's not obvious. One reason why people were making this speculation was that it appears that this school used to be part of a military base, and there were maps showing it within that military base, and 10 years ago, it was detached from the military base and became a school. One hypothesis is that the AI system was basically using out-of-date materials in order to identify this as part of an existing military structure. Again, we don't know this. The point is that if this hypothetically was the case, probably there would be no one you could actually blame from a legal point of view, because you've created this accountability gap. [0:55:53] MM: Wow. That's mind blowing. My mind went, again, to a different use case: I've worked in companies that have FDA-regulated software. Food and Drug Administration in the US. When you think about a bug that may have caused somebody to die on a surgical table, or something like that, can you blame the engineer who wrote that line of code? Not really, because they're part of such a big system. Then when you add AI to it, too - I keep coming back to this, but it's non-deterministic. Who can really understand how that thing is working? I don't know.
When I was preparing some of these questions, I saw this quote on Reddit from user DShark, and it got me thinking. It says, "We never agreed not to kill each other with nukes. We'll never agree to not use AI, especially when things get really desperate." Obviously, that's hyperbole. It's somebody on Reddit. I started thinking about all the comparisons to the development of the atomic bomb, right? Nuclear physics was born out of academic research. AI was born out of academic research. The Manhattan Project took that academic research and weaponized it, and that's exactly what's happening here. We also talked about testing, right? They tested nuclear weapons, but what they didn't find out until later is that the nuclear fallout was potentially worse than the initial destruction. I mean, I think they had some idea of that. It's alarming to think of the comparisons, and I'm wondering if you think that there are any lessons from the development of nuclear weapons that we might be able to learn from and apply here? [0:57:30] YS: Well, I think once the technology is out there, the temptation to use it, or at least to have it for an eventual scenario, even if it's an unlikely scenario, I think the temptation is simply too overwhelming. Once you have it, you're not going to give it up. I think in terms of weapon control, you do have this very strong attachment by strong states to having highly destructive weapons systems. It's certainly not going to go away. There's a very strong military market, which is, I think, also very influential regarding the industry. The industry is, like we see - I mean, the big players, Anthropic, OpenAI, Microsoft, Google, they're all in this military domain, because there are a lot of resources that are directed to this area. It's not going to go away. Maybe the one distinction which I would note is that with nuclear, it's still difficult to produce nuclear weapons. This is not something you do in your garage. I mean, it's something for which you need significant apparatus. Nuclear weapons you could still contain to maybe under 10 - I mean, eight, nine, ten - countries having them. You do have a legal regime called non-proliferation. I don't think that this is something you could also do here, because these systems are essentially dual-use systems. The generative AI system Claude is already being used for targeting. The idea that everyone wouldn't have it is simply unsustainable. To the extent that this is going to be weaponized, it's not going to generate, hopefully, the kind of harm a nuclear bomb generates, but it would be accessible to everyone. We're already seeing this taking place with regard to drones, where a drone is not an autonomous weapon system, but it is a very easily accessible technology. This is something you can build in your garage at some level, and this is something that is already changing battlefields dramatically, because you can inflict a huge amount of harm through these weapon systems. Now, put an AI system on a drone, and you could have your own lethal autonomous system relatively easily. I think it's going to be much harder to contain than nuclear weapons. Maybe that's not the optimistic take you were looking for. [0:59:54] MM: That was very much a - I did not expect much good news out of this, but that is an eye opener. I'm near Boston, and I think about the marathon bombing here, and they used a pressure cooker, right? These creative uses.
You think about these open-source models, like Qwen, that are performing extremely well, and you can retrain that model, take out the constraints, optimize it for whatever nefarious purpose you want, and then throw some drones at it, or, gosh, I don't know, Roomba vacuums - whatever it is. We touched on this a little bit during our earlier conversation, but I think that this is an eye-opening conversation, and especially the technical challenges of this are fascinating. As software engineers on this podcast, most of us might never have the chance to work on these autonomous weapon systems, but our lives could be very much affected by them. I'm curious how you think some of the concepts that we talked about today can apply to other technical areas that are being advanced by AI? [1:00:59] YS: That's a good point, because a lot of the issues that we've been talking about in connection with military AI actually have much broader application. Military AI is, you could say, a very specific use case of AI technology. But a lot of the issues are actually common to other systems that are being used and deployed in very different contexts. The issue of mistakes and the need to introduce safeguards - this is not only about military AI. I mean, the price of mistakes could be very high in this context, but we can think, of course, of many other contexts where the price of mistakes could be exceptionally high. We need to think of safeguards. One of these safeguards could be human oversight and human involvement. But here, again, you have to think of how humans are able to effectively monitor systems that are much more sophisticated and have a greater capacity to process data, certainly in short time spans, than they have. We need to think, for instance, of levels of transparency and explainability to make these systems actually capable of being monitored. There may be some areas where we do want to basically empower human decisions for ethical or legal reasons. Again, the medical analogy is a good analogy, because there, we may still want to restrict life-and-death decisions, these kinds of decisions, to human beings, who will perceive the patients, or the other interlocutors, as human beings with human rights and human dignity. The issue of accountability that we've discussed before is a chronic problem of AI systems. We need to think - and this is not only the technology side; the lawyers also have to think - of ways to reintroduce accountability, sometimes through mechanisms such as product liability, or insurance, or reform of criminal law or tort law, in ways that will reintroduce incentives to use these products, these systems, and services in ways that are safe and are human-centric, that put human beings at the center of their moral universe. I think that the big challenge is to actually understand that the use of these systems is not ethically and legally neutral. There are certain costs that need to be internalized and discussed. When we are developing these systems, marketing these systems, introducing these systems, we have to think of what the implications of doing that are for these ethical and legal choices. [1:03:39] MM: As an engineer making things that are more business-facing and consumer-facing, we do have some power as engineers to consider these things and bring them up. We may not have the ultimate power, but we can ask the questions and force some of these things.
I also couldn't help but think, when you were talking about transparency and decision-making, about an audit log, if you will, of how these things are operating, so that they can be analyzed later if there are mistakes. I don't know if that's in the works. I have to believe that people are looking at that type of thing. I wasn't there back when relational databases came on the scene, but I can imagine that when relational databases came out, they didn't have detailed audit log functionality. I feel like we're at the same point, where we have models making decisions and we don't really know why they came to that conclusion. It's fascinating. All right, so we've talked about a lot. We've got a lot of smart engineers listening to this podcast. Let's end it with, if you had one thing you wanted these engineers to take away from today's discussion, what would it be? [1:04:45] YS: There's a lot of talk in my areas - the academic areas I move around in - about AI safety and about AI responsibility and also about AI ethics. These are all very important framings of the challenges that we are confronting. But at least when we're moving in the direction of using AI in such consequential contexts, such as the military, I mean, you also have to factor in the law, and you have to factor in human rights, and you have to factor in humanitarian law. You have to basically think of these frameworks as relevant frameworks of reference. Like you said, we have to think of how we as engineers, or we as entrepreneurs, or we as marketers - how are we basically trying, to the best of our abilities, to work in ways that would not undercut these important normative projects that are part of the building blocks of our society and our long-term ability to continue to exist and flourish. Think not only about ethical AI, but also about legal AI. Not only about safe AI, but also about human rights friendly AI. That's one takeaway I can offer. [1:06:01] MM: No, that's really good. I can't help but think about a parallel to civil engineering. If you're going to build a bridge and you see something going wrong, I can't imagine a civil engineer not saying, "Hey, that thing is wrong." I am going to take that away for sure. Well, thank you so, so much. This was fascinating. I have a whole new appreciation for this and I really appreciate you being here. [1:06:25] YS: Thank you very much, Matt. [END]