EPISODE 1718

[INTRO]

[0:00:00] ANNOUNCER: In 79 AD, in the ancient Roman town of Herculaneum, 20 meters of hot mud and ash buried an enormous villa once owned by the father-in-law of Julius Caesar. Inside there was a vast library of papyrus scrolls. The scrolls were carbonized by the heat of the volcanic debris, but they were trapped underground, where they remained preserved. It wasn't until the 1750s that the scrolls were discovered, but they were fragile and resistant to being opened and read. Then in 2015, researchers used X-ray tomography and computer vision to virtually unwrap the scrolls. Last year, the Vesuvius Challenge was launched by Nat Friedman, Daniel Gross, and Brent Seales to crowdsource the process of reconstructing the text from the scrolls. Julian Schilliger and Youssef Nader are two members of the winning team. They joined the show to talk about the computational approaches they used to reconstruct the scroll text. This episode of Software Engineering Daily is hosted by Jordi Mon Companys. Check the show notes for more information on Jordi's work and where to find him.

[EPISODE]

[0:01:11] JMC: Hi, Julian. Hi, Youssef. Welcome to Software Engineering Daily.

[0:01:16] JS: Hi. Thanks for having me.

[0:01:17] YN: Yes. Hello.

[0:01:17] JMC: Before we jump into the topic that brings us here today, I'd like to know if you guys had, or still have, any particular interest in ancient history or ancient texts in particular.

[0:01:28] JS: So, I'm living in Switzerland, and we also have lots of Roman artifacts and ruins here. Since I was a little child, we would sometimes go visit such ruins. That was about the extent of it for some time. After that, in school, I also heard about ancient libraries and how they got lost over time, because they had scrolls in them and you couldn't really conserve scrolls, especially in the Mediterranean area, where scrolls just deteriorate and maybe over 100 or 200 years get lost. That's my starting point there.

[0:02:06] JMC: What about you, Youssef?

[0:02:08] YN: Yes, for me, as an Egyptian, I think you grow up learning a lot about ancient history.

[0:02:14] JMC: It's literally everywhere. You can't escape it. Right?

[0:02:16] YN: Exactly. It's so much. 7,000 years of civilization covers quite a big curriculum for 10 years or so. Yes, I think the word papyrus was the kind of thing that got me hooked, because we grew up learning that it was invented by the Ancient Egyptians to document history, and daily life, and that sort of thing. So, yes, for me, that was kind of the hook for the Vesuvius Challenge.

[0:02:47] JMC: The good thing about the difference between Egypt and Switzerland is that things are actually preserved much better there because of the dryness than in the rest of the Mediterranean. Okay. So, what about computer science, or just tinkering with software and technology, and the technologies that eventually allowed you guys to win the Vesuvius Challenge award last year? What is it that brought you, that hooked you into computer science? What are you doing right now, as a job or as a student, in computer science or tinkering with technology, both of you?

[0:03:22] JS: For me, I started this journey of studying computer science out of the idea that computers are probably the most powerful tool that we have today, and I want to master them, or I wanted to master them. I'm still in the process. I guess no one has really mastered them so far.
From that aspect, I wanted to help humanity with computers, take away some tedious tasks that no one would like to do. That also brought me to computer science, and later on to robotics, because you have lots of laborious tasks in the physical world as well that need some help, that humans wouldn't really like doing.

[0:04:07] JMC: How old are you and what are you studying currently?

[0:04:10] JS: So, I'm 29 right now, and I recently graduated from ETH. I did a Bachelor's in Computer Science, and my master's was in robotics, also at ETH.

[0:04:22] JMC: ETH is really renowned, one of the most renowned universities for computer science and many other engineering fields, not only in Switzerland, but in the world. What about you, Youssef?

[0:04:33] YN: For me, I did my Bachelor's in Communications and Computer Engineering. Yes, I grew up really loving math and I wanted to pick a heavily math-focused major. I wasn't initially going to pick communications and signal processing and analysis, but software engineering just grabbed my heart. In the end, I ended up doing lots of machine learning in my final years of the bachelor's. Then I came to Germany, did my master's at the university here, and now I'm pursuing my Ph.D., focusing on AI, explainability, and self-supervised learning. I mostly work with 2D images, and the Vesuvius Challenge sort of presented this more complex data format that I was just eager to play around with.

[0:05:20] JMC: The title of this episode is probably self-explanatory. We've mentioned it already during this interview. The Vesuvius Challenge. For those that don't know anything about Mount Vesuvius, or the volcano Vesuvius, and the challenge, do any of you want to take a stab at explaining what the goal of the first edition of this challenge was?

[0:05:41] YN: Yes. Maybe we can go back a little bit for some historical context. So, around the 18th century, a farmer was excavating in Italy, and he found something that looked like a piece of charcoal. It didn't look quite like the other rocks, and that was one of the first scrolls discovered from the collection, the Herculaneum collection. From then on, there was a series of attempts at trying to read these scrolls. A lot of these attempts were quite invasive. They were cracked open -

[0:06:12] JMC: To say the least.

[0:06:15] YN: - and you can imagine, these scrolls have been under so much pressure, so much heat, so much debris, for a very long time as well. So, they're very, very brittle. Touching them might make them crumble. So, these attempts destroyed a lot. Over the years, some progress was made, like the machine of Piaggio [0:06:36], where you break off pieces and you stitch them onto a membrane. That also was a very destructive process. But that was the best that we had. Then, I think around 2010, Professor Brent Seales at the University of Kentucky developed this method of virtual unwrapping. He did a lot of work on how to virtually open these scrolls, so that we can manipulate them and look at them on our computers, rather than doing this physically. This works well for scrolls where the ink contains some metal, because when you scan them in a CT scan, for example, and open them, there's quite a contrast that you can see. Then you find the ink and you can read them. The challenging thing about this collection was that the ink was carbon-based.
So, you look at it in a CT scan, which is a more sophisticated version of the X-ray you'd get at your doctor's, and you'd see that it's blank. We know the ink is there, it should be preserved, and there are pieces that fell off where we can see the ink. But when you look at it in a CT scan, it's empty. It's not completely empty, some contestants in the challenge have proved this, but you don't see anything. This is sort of the beginning of the Vesuvius Challenge.

[0:07:56] JMC: Yes. I need to stop you there, because you're getting into the nitty-gritty of it, which we'll cover in a minute. That's really super interesting. So yes, place yourself in ancient Rome, the first century after Christ, 79 AD. That's when Herculaneum, a small town close to Pompeii, the famous Pompeii, got covered with ashes mostly, and all sorts of debris from the eruption of Mount Vesuvius, a volcano nearby. That completely covered not only the whole town, but also Lucius Calpurnius Piso's house. He was a really powerful guy, in fact, the father-in-law of Julius Caesar, and the owner of a ton of papyri. This guy must have loved to read. And he had a ton of money to bring in people that wrote, copied, and translated texts for him in his really famous library, which is the one that Youssef was describing, the one discovered in the 18th century. Then, we went through all the iterations of processes trying to discover what's written in them. The first approaches were quite dramatic. They even sliced and cut the scrolls. But anyway, we've gotten better, to the point that now we're able to read, hopefully, what's inside. So, Julian, how did you come across the Vesuvius Challenge? How did you find out about it? What was the task that anyone joining the challenge had to attempt to solve?

[0:09:18] JS: Okay. So, this challenge was created by Brent Seales and Nat Friedman. They were the initiators, and the task was to open a still rolled-up scroll and read it. For the grand prize, you had to read 4 passages of 140 characters each, where 85% of the characters had to be readable. That was the starting point. You had the CT scan of the scroll, and there you go. You can download it from the Internet and start. When I came to it, that's all there was. There was also a competition on Kaggle to read some of the CT scans. You have to imagine, some of the pieces of this scroll flake off, fall off, and you can read some characters on those small pieces. Those were used as ground truth data in the beginning to develop machine learning models on. So, that one was running. But while we had a way to access the scroll, we had to unroll it first. After I developed some initial machine learning models that performed quite well on the fragments, to my knowledge, I started extracting some of the sheets. There was a tool called Volume Cartographer that had been developed over the past 10 years by Brent Seales' team. I tried using it, and it sadly didn't work. It was way too slow, too clunky, had lots of issues and errors, broke and crashed. You couldn't even extract one character with it.

[0:10:59] JMC: By the way, when you say unfold the scroll, you mean digitally, right?

[0:11:03] JS: Yes. Even unfolding probably wouldn't be the proper term. It's more of an unrolling, since it's not folded.

[0:11:11] JMC: So, Volume Cartographer did not work. That got you, I guess, thinking of other ways to do it, right?

[0:11:17] JS: Yes. So, I thought the framework was quite nice.
The ideas on how to structure this task were also quite in the right direction, but just not there yet. I developed some segmentation algorithms. Maybe we have to say what segmentation is in this regard, because it's something different than in computer vision generally. Segmentation here is actually the process of extracting these sheets, this surface of the papyrus. You do it in Volume Cartographer by drawing a line, a curve. If you imagine the scroll, it looks a little bit like a tree stump, like a spiral if you have it rolled up, and you're going to trace that spiral. That's the first step. Then, you give the spiral to the computer, and it will compute it through the volume. So, you have one slice through this scroll, and then the spiral gets computed upwards through the volume to actually extract the surface. I developed some algorithms that compute this curve upwards through the CT scan and stay on the surface. You have to imagine, this surface is really, really twisted and broken, has tears in it. Sheets are fusing, because this volcano was really destructive to those scrolls. That's how I started there. There were people using these tools. So, it's a manual approach. You have to do work to actually extract those sheets. I started developing Volume Cartographer further and got input from the people that were using it.

[0:12:55] JMC: Okay, what about you, Youssef? How did you come across this? What was your first approach? Did you also leverage previous work? Or did you start from scratch yourself? What were your first ideas to tackle this and become a champion?

[0:13:09] YN: Yes. I first heard about the challenge through Kaggle. I usually keep an eye out for interesting Kaggle challenges, and I saw the ink detection competition, which was running there. My first idea was to treat the ink detection problem as a regression problem. Try to predict the average intensity of ink or no ink, rather than a binary label, in order to tackle a few issues with different orientations of the image and different noise levels, since the CT scans were quite different from one another. And it worked. I'm actually not sure how well it works or fares today, but it was a stray idea that I looked into. At the time of the Kaggle challenge, I was also writing my master's thesis, so I was pretty busy.

[0:13:57] JS: Yes, me too.

[0:14:00] YN: Yes. Fun fact, we both finished our master's theses in the same month. That was my start with the Vesuvius Challenge. But it was quiet for a couple of months after. Then, I just kept in touch, and I saw that even the winners of the Kaggle competition couldn't really see ink on the scroll, which made me very intrigued. Then, I was starting to learn about self-supervised learning, where you can learn some stuff from unlabeled data, and then transfer that knowledge to the labeled data. It's usually very powerful when you have a lot of unlabeled data and just a small amount of labeled data. I played around with it. It sort of pushed me in the right direction. It wasn't a complete success, but it wasn't a complete failure either. That's when I first found the first couple of letters and built this sort of iterative process where your model becomes a teacher to teach new models what ink looks like, and these new models themselves become teachers once they learn what ink is.
This process repeats until you get really good models that really understand what the ink is.

[0:15:06] JMC: Two questions before we move on to a deeper dive into the topic. What were the first two letters?

[0:15:13] YN: It was Pi and O. I mean, I'm not really knowledgeable in Greek.

[0:15:21] JMC: No worries. No worries at all. What was it about the ink that you glanced at in the beginning? Was there something about the chemistry of the ink that made it particularly challenging?

[0:15:30] YN: Yes. I think it was a lot of factors. For the Vesuvius Challenge, there were a lot of factors. The technical part was definitely very interesting, because you get access to this 2,000-year-old data that you can download on your computer and play around with. Essentially, it was 3D data, or 3D images. Once you zoom in on the sheet, you can also see the thickness, right? It sounds like it should be straightforward. We sort of had a proof of concept of how it should work. But it was sort of obscure; for some reason it wasn't working. We had some very good models for the fragments. This was even tested before the challenge in prior work, in Stephen Parsons' Ph.D. thesis. So, we sort of knew what the flow should be, and this was actually also the eventual flow, but getting it to work on the scroll, the unopened scroll, was a sort of technical mystery, which for me was the hook.

[0:16:27] JMC: Before we actually double-click and get a bit more into the solutions, tell me, either of you, about the competing dynamics, the behavioral aspects of deciding whether to collaborate with this guy, Youssef. I'm talking to Julian now. Or Youssef? How did these mechanics, these behavioral patterns, work? How do you decide on collaborating or just literally competing on your own? Be completely honest.

[0:17:00] JS: Yes. I mean, I think it's game theory, right? Probably. But I think the design of the challenge was just really great. You had this big prize money for the grand prize that drew people in, but then there were actually also smaller prizes. If I say smaller prizes, I still mean thousands of dollars per prize. That kind of hooked the community. They were periodically given out, and you could share your information. You had to make your work open source to win those prizes. Some collaboration was actually fostered throughout the year, throughout the challenge, and people that shared often and a lot were recognized and had a good standing in the community. We had the Discord server where most of the action is, and we still have that server. So, come join us on Discord. There, we just exchange ideas and improve stuff, share things. That's how we got to know each other. I mean, Youssef, me, Luke, we were all active in the community. We all won prizes, those smaller prizes. Yes, every one of us had something to bring to the table, and we showed that throughout the year.

[0:18:12] JMC: So, you took Volume Cartographer to another level, right? Tell us about what you actually assembled, what you called it, why you called it that way, and what did it actually do? How did you get the prize you won through this technology?

[0:18:28] JS: So, I identified the problems with Volume Cartographer quite early on. The main one was that the segmentation algorithm was too slow and not accurate enough. If you could increase the accuracy, you wouldn't have to do so much human work.
You could do more with your time. I'm quite lazy.

[0:18:48] JMC: What made it slow?

[0:18:50] JS: Let's start with the slowness and the algorithmic choice. The algorithm that was used was really bad, and it was not implemented well: single-threaded implementations, and the data that it used was really big. So, there was also a need to handle the data appropriately. You have to load it in a certain way to be fast, prefetching data, for example. For the slowness and the accuracy, using a different segmentation algorithm to propagate the line throughout the volume was the key. There, I took inspiration from video: if you have a video, there's this problem of tracking pixels or objects, and that's called optical flow. If you look at this 3D data, you could also just treat it as a video: 2D images, where going through the volume, the third dimension, plays the role of the time dimension of the video. So, you kind of have this video that you could also apply optical flow to. That's how I could track my curve along those sheets. Track points from one frame to the next, and have this curve follow the sheet through optical flow. That's how I decided to test it out. I call it optical flow segmentation because of that, and the main workhorse underneath it is optical flow. From then on, I had lots of iteration and testing with people that actually used the segmentation tool. They gave me nice input on how to smoothen those lines. You have to see, the sheets show up bright in the CT scans, and the air around them is black. If you move off the sheet with your curve, you move into the black area, and so the computer can tell this by itself if you just set a threshold, and then move the points of the line back into the sheet, or at least try to. So, those were things that I used to get more speed and more accuracy.

[0:20:57] JMC: Is that what granted you the award? Is that the contribution?

[0:21:00] JS: Yes. So, multiple iterations of this algorithm. You have to see, if you just calculate how powerful the tool was, how much sheet area you could extract, there was something like a 10,000-fold improvement in this segmentation tool. Maybe each time I had a 10 to 100-fold improvement, and three of those together made something like a 10,000-fold improvement, and for each 100-fold improvement or so, I won a prize.

[0:21:32] JMC: So, what is ThaumatoAnakalyptor then?

[0:21:34] JS: So, yes, this Volume Cartographer program uses human labor, and it's quite expensive. Actually, for the grand prize banner, that's 2,000 letters that we read, it cost around $200,000 in human labor. My goal was to extract more sheet area, quicker and fully automatically, to just have an economical way to read the scrolls. That's how I went back to the drawing board and applied all the lessons that I learned from developing Volume Cartographer further. So, it's a pipeline. It splits this problem into multiple smaller sub-problems, starting with extracting surface points on these sheets. You have to imagine having a point cloud, lots of points on the surface of the sheets only. If you look at it with your own eyes, you can see those rings in 3D, those spirals, and the sheets. But yes, only point clouds. So, 3D point data and no connectivity information there. The next step would be getting back this connectivity, how the points connect. Getting back this connectivity means you now have connected point clouds.
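As a rough illustration of the optical flow segmentation idea described above, here is a minimal sketch of propagating a traced curve from one CT slice to the next by treating consecutive slices as video frames. It is a sketch under assumptions, not the actual Volume Cartographer or optical flow segmentation code: it assumes a NumPy volume of 8-bit slices, uses OpenCV's Farneback dense optical flow, and the function name, brightness threshold, and snap-back window are illustrative placeholders.

```python
import cv2
import numpy as np

def propagate_curve(volume, curve_xy, brightness_threshold=120):
    """Track a traced papyrus curve upward through a CT volume.

    volume   : (Z, H, W) uint8 array of CT slices.
    curve_xy : (N, 2) array of (x, y) points traced on slice 0.
    Returns one curve per slice, following the sheet surface.
    """
    curves = [np.asarray(curve_xy, dtype=np.float32)]
    prev_slice = volume[0]
    points = curves[0].copy()

    for z in range(1, volume.shape[0]):
        next_slice = volume[z]
        # Treat consecutive slices like consecutive video frames and
        # estimate dense optical flow between them (Farneback method).
        flow = cv2.calcOpticalFlowFarneback(
            prev_slice, next_slice, None,
            pyr_scale=0.5, levels=3, winsize=15,
            iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

        # Advect every curve point by the flow at its (rounded) location.
        xs = np.clip(points[:, 0].round().astype(int), 0, flow.shape[1] - 1)
        ys = np.clip(points[:, 1].round().astype(int), 0, flow.shape[0] - 1)
        points = points + flow[ys, xs]

        # Sheets show up bright, the air around them is black. If a point
        # drifts into dark voxels, nudge it back toward the brightest voxel
        # in a small neighborhood (the thresholding trick mentioned above).
        for i, (x, y) in enumerate(points):
            xi = int(np.clip(round(float(x)), 2, next_slice.shape[1] - 3))
            yi = int(np.clip(round(float(y)), 2, next_slice.shape[0] - 3))
            if next_slice[yi, xi] < brightness_threshold:
                patch = next_slice[yi - 2:yi + 3, xi - 2:xi + 3]
                dy, dx = np.unravel_index(np.argmax(patch), patch.shape)
                points[i] = (xi + dx - 2, yi + dy - 2)

        curves.append(points.copy())
        prev_slice = next_slice

    return curves
```

Stacking the per-slice curves gives the kind of sheet surface that the meshing and rendering steps described next can work on.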
Now, you have to kind of mesh it, because sometimes the points aren't quite accurate. So, you need to mesh it. After that, once you have it meshed, you render it, and you have your virtually unrolled surface in front of you, which you can then apply your ink detection to.

[0:23:03] JMC: What about you, Youssef? What technology got you the prize? What did you call it? How did you build it? What did it deliver?

[0:23:12] YN: So, I mainly focused on the ink detection problem. Once you have the output of the segmentation, you have 3D imagery on which to detect the ink. From the Kaggle challenge, there were some established findings that bigger models predict better on wider context. These images are usually huge, like thousands or hundreds of thousands of pixels by 100,000 pixels. Of course, there's no way of putting all of this into the model in one go. The idea that was running was that you run a sliding window over the different regions, and you do this on a pixel level, is there ink or no ink, and you treat it like a segmentation problem, or an instance segmentation, because the terms are getting confusing. Then, once you have the ink, the interpretation, the reading of the letters, is left for us humans; when you zoom out, you see what the writing is. For the Kaggle challenge, most of the window sizes were really large. So, my journey started from using self-supervised learning to pre-train the network on the scroll, the unopened scroll, and then fine-tuning with whatever fragment data we had. This sort of helped the model find the very first three, four letters. From there, I remembered this idea that I saw before on Kaggle, where you can take the predictions from the model. If you don't have certain labels, you can take the predictions from one model, turn them into labels, and train a completely new model. So, nothing is shared between the two models. The new model learns the signal from the first model. Lots of work went into preventing overfitting, which is where you run into the problem that your model just memorizes what you want. This is a very big issue, because you're pretty much showing the model what it needs to do. It might just converge to being the previous model. If your labels are wrong, then there's really nothing to learn. There's no actual signal behind your labels. So, it will just memorize the patterns -

[0:25:15] JMC: It's a monkey repeating, gesturing back. A mirror, right?

[0:25:17] YN: Exactly. Yes. Like, monkey see, monkey do. This is what you end up with. So, a lot of work went into how do you prevent overfitting? How do you allow the model to talk back, to sort of reject your labels if they're wrong? If you say that this area has no ink, can you train the model that it has no ink, even when the model sort of sees the signal you're looking for there? It should have some leeway, some way of not being too heavily penalized for my mistakes. A lot of regularization and testing. This endless pipeline of iterative labeling did quite an incredible job at finding letters. So, I went from five letters to many letters read very quickly. Then, the ink banner, this was the announcement that we have letters from the scroll, the ink banner was kind of about getting this approach really, really right. For the next two months, for the grand prize, I did a lot of these iterations and testing.
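To make the teacher-student loop described above a bit more concrete, here is a minimal sketch of iterative pseudo-labeling, assuming PyTorch, pixel-level ink labels on small windows cut from a rendered segment (with the depth layers stacked as channels), and placeholder names like make_model and the 0.9 confidence threshold. It illustrates the general technique, not the actual competition pipeline, which also involved heavy regularization and hand-edited labels.

```python
import torch

def pseudo_label(teacher, unlabeled_patches, confidence=0.9):
    """Turn a trained teacher's confident predictions into new training labels.

    Each patch is a (D, H, W) tensor: a window of the rendered segment with
    depth layers stacked as channels. The model maps a (B, D, H, W) input to
    per-pixel ink logits of shape (B, H, W).
    """
    teacher.eval()
    pseudo = []
    with torch.no_grad():
        for patch in unlabeled_patches:
            prob = torch.sigmoid(teacher(patch.unsqueeze(0)))[0]           # (H, W)
            trusted = ((prob > confidence) | (prob < 1 - confidence)).float()
            if trusted.mean() > 0.5:                                       # keep mostly-confident windows
                pseudo.append((patch, (prob > 0.5).float(), trusted))
    return pseudo

def train_model(make_model, samples, epochs=5, lr=1e-4):
    """Train a fresh model (nothing shared with the teacher) on (patch, label, mask) samples."""
    model = make_model()
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    bce = torch.nn.BCEWithLogitsLoss(reduction="none")
    model.train()
    for _ in range(epochs):
        for patch, label, mask in samples:
            opt.zero_grad()
            logits = model(patch.unsqueeze(0))[0]
            # Only trusted pixels contribute to the loss, so the student keeps
            # some leeway to disagree where the pseudo-labels are uncertain.
            loss = (bce(logits, label) * mask).sum() / mask.sum().clamp(min=1)
            loss.backward()
            opt.step()
    return model

def self_training(make_model, labeled_samples, unlabeled_patches, rounds=3):
    """Labeled samples carry an all-ones mask; each student becomes the next teacher."""
    teacher = train_model(make_model, labeled_samples)
    for _ in range(rounds):
        pseudo = pseudo_label(teacher, unlabeled_patches)
        teacher = train_model(make_model, labeled_samples + pseudo)
    return teacher
```

The masked loss is one simple way of letting the model "talk back": pixels the previous round was unsure about are simply not penalized, which limits how much label noise the next student memorizes.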
Okay, is it better to overestimate the labels or to underestimate? Is it better to be more conservative or less conservative? What sort of labeling techniques? In the end, I found that the best way of labeling and comparing the models was to do this in Photoshop, because you can just flip back and forth between the predictions and see if the predictions are improving. Then, how do you scale the data? I did a lot of iterations, and data kept coming. Another question was, how do you use this data efficiently? From the beginning of the competition, I had this idea that there's usually a lot of redundancy. You're scanning at a very, very high resolution, so from one cross-section to the next, there's not that much difference. You're that zoomed in. There's some redundancy. I thought, okay, it may make sense to just look at the spatial image, and then try to find a relationship across the depth. Find some information in the spatial dimensions, the X, Y, and then try to find some relationship along the Z separately. Say, "Okay, I see this pattern here. I also see this pattern repeating throughout the depth. So, maybe this is a letter." I tried to do this with convolutional and LSTM networks and some other attempts. It worked, but it wasn't really what I was looking for. Then I came across this paper from Meta, TimeSformer. One of the key aspects of transformers, and why we see them everywhere today, from natural language processing to images, is that they scale very, very well. So, they looked at the problem of, if you have some three-dimensional image data, what's the best way of doing the attention mechanism in this three-dimensional space? They looked at many different variations of the attention mechanism, full 3D, even some sort of hacky heuristic approaches. Their final finding was this kind of idea: you learn some stuff from the spatial dimension, then you only look at yourself, and the message passing only happens through the depth. You look at yourself and then you exchange information: okay, I see this pattern here, I also see this pattern here, so maybe this is ink. This was sort of a culmination of that idea. It took some work getting it to work on the Vesuvius Challenge. For some reason, it was very, very sensitive to any parameter change or any change in the setup. In the end, I got it working around the end of November, right before I teamed up with Luke. Yes, I was really surprised by the findings, because the predictions looked really good. The model was able to find ink that it wasn't able to find before. I sort of knew, "Yes, this is the model that's going to make the final cut." And from there, it was sort of getting both approaches to juice out the final, the last few letters, because the criteria for winning the grand prize were four paragraphs of text with 85% readability, and that 85% readability actually proved to be a very big hurdle. In the end, we were the only team to cross that threshold. It took a lot of these iterative steps to refine the model predictions and find those final two, three letters in every paragraph. This is my story with the challenge.

[0:29:39] JMC: Nice.
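For readers curious what that divided attention idea looks like in code, here is a compressed sketch of a TimeSformer-style block adapted to scroll data: tokens first attend within their own depth slice (spatial attention), then only to the same spatial position across depth. The class name, dimension layout, block order, and the omitted patch embedding and classification head are illustrative assumptions, not Youssef's actual grand prize model.

```python
import torch
import torch.nn as nn

class DividedSpaceDepthBlock(nn.Module):
    """Transformer block with divided attention over space and depth.

    Input x has shape (B, D, N, C): D depth slices of a subvolume, each
    split into N spatial patch tokens of width C.
    """
    def __init__(self, dim, heads=8):
        super().__init__()
        self.space_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.depth_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.norm3 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x):
        B, D, N, C = x.shape

        # 1) Spatial attention: tokens attend only to other tokens in the
        #    same depth slice ("learn some stuff from the spatial dimension").
        s = self.norm1(x).reshape(B * D, N, C)
        s_out, _ = self.space_attn(s, s, s)
        x = x + s_out.reshape(B, D, N, C)

        # 2) Depth attention: each spatial position attends only to itself
        #    across slices ("the message passing only happens through the
        #    depth": this pattern repeats through the depth, so maybe ink).
        t = self.norm2(x).permute(0, 2, 1, 3).reshape(B * N, D, C)
        t_out, _ = self.depth_attn(t, t, t)
        x = x + t_out.reshape(B, N, D, C).permute(0, 2, 1, 3)

        # 3) Standard feed-forward applied to every token.
        return x + self.mlp(self.norm3(x))
```

Stacking a few such blocks over patch embeddings of a subvolume, with a small per-patch classifier on top, gives the rough shape of a divided-attention ink detector; the appeal over full 3D attention is that each attention step stays cheap while information still flows across both space and depth.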
One day, 10 years from now, 20 years from now, we'll look back on the TimeSformer paper and realize how much value that paper has unlocked, because it's crazy how much value, in many senses, that paper has actually delivered to everyone. So, Luke wasn't able to join, Luke Farritor. Could you give us a sense, either Youssef or Julian, of what his contributions were?

[0:30:09] YN: Yes, sure. Luke, at the time I was working on the first letters, was also working on the first letters, with a slightly different approach. I think around June, July, a contestant named Casey Handmer, who was awarded the First Ink prize, found some patterns by just looking at the scrolls. He said, "Yes, well, my eyes are a neural network as well. So, I'll just use my eyes and try to find patterns that make sense for the ink." He actually found a pattern that corresponds with the -

[0:30:36] JMC: Can I stop you for a second? Did you just say, "My eyes are neural networks"?

[0:30:40] YN: Yes. I think he said that somewhere in his blog post.

[0:30:44] JMC: That's great.

[0:30:45] YN: I do remember either reading or hearing it. Correct me, Julian, if I'm wrong.

[0:30:50] JS: I think he said that. He's a cool guy, too.

[0:30:55] YN: Yes. So, he found these patterns, and these patterns actually do make up the ink, or at least some portion of the ink. He was successful in finding some letters, and Luke started working by looking at these patterns and then trying to train a model. So, supervised learning: these are the patterns that you have, and you try to find the ink, which is very similar to my approach, except you're relying on your eyes to create the label data. He was the first to win. So, he found the letters and he won first place, the First Letters prize for the first 10 letters. Then, he was pursuing a mixture of automatic and manual segmentation. There was one very big question for us, which was, where are the letters that we cannot see? If you look at the predictions that we had, there's a bunch of letters missing. There are some areas where the model predictions just go dark. It actually took a lot of effort trying to figure out what was going on there. Is this a problem in the segmentation itself that missed the sheet, so there's nothing for the model to predict? Or is the data there, but the model is unable to see it? So, I tried to focus on the latter, like, okay, what if it's a model problem? And Luke spent a lot of time investigating whether it was a segmentation problem. He even resegmented some of the portions by hand, some of the segments that we had, trying to find what the problem was, or whether there's some damage in the scroll, so the letter is not there to begin with and it's not possible to find. So, he focused on the manual segmentation, but also explored automatic segmentation.

[0:32:32] JMC: So, what did this group present then? What was the final submission, the contents of it, on many levels? What did you hand over? What did it say? By this group, I mean the two people that are joining me today, Youssef Nader and Julian Schilliger, but also Luke Farritor, whom we mentioned. What is it that you handed over? And what gave you the prize?

[0:32:53] JS: We handed over the code, actually, to produce the images that make up the text. So, what Youssef told us about before, those models predicting the ink, and the grand prize banner actually predicted with them.
I guess, we had to choose four columns of text that we thought hit the threshold. So, that was one part of the submission. Also, multiple different ink predictions were in there. We had two main models, the TimeSformer and another one as well. So, that was the big part of it. Then, as a surplus, because we weren't sure at that time how many letters we would see and whether we would hit the threshold or not, we also submitted this initial development version of ThaumatoAnakalyptor. So, it wasn't finished in the sense that you could read the complete scrolls with it. But you could extract some parts of it, unroll them, not read them, of course. So, this automatic segmentation approach was also part of it.

[0:34:01] JMC: I'm fascinated, because I don't know if I mentioned it at the beginning, but I'm an ancient history major. I'm fascinated by all of this. If I had to pick sides between Rome and its archenemy, well, the Punics, a strand, I guess, of Phoenicians, I side with the Phoenicians, I like them better. So, I'm actually very pleasantly surprised that the first word to be, not discovered, but read, I guess, was purple. Which is probably a reference to the purple dye that came famously from Tyre. Who knows? I mean, this is just absolutely fascinating. Again, 1,800 scrolls, if I'm not wrong, are still preserved, not damaged, and techniques like these and open-source technology, like the ones that you guys have developed, and the next ones, because the challenge continues, and we'll talk about that in a minute, are going to unveil so much information. In fact, before we move on to the next edition of this challenge, this year's 2024 edition of the Vesuvius Challenge, which is on again and in which you guys are involved at a different level, I just heard, I don't know if you guys are aware, that the University of Pisa has also conducted some research on papyri from, I believe, Herculaneum. Were you guys aware of it?

[0:35:15] JS: Yes. So, we have two different data sets, so to speak: the unrolled papyri and the still rolled-up papyri, and I believe they worked on the unrolled ones. There's still a lot of research to be done there to have them read properly.

[0:35:31] JMC: So, are you aware if they used your software?

[0:35:34] JS: Oh, no. They used something like spectral imaging, so no CT scanning technology was used there.

[0:35:42] JMC: I believe they unveiled a text about Plato, and Plato returning from Syracuse. That terrible journey that the Athenians, well, journey, actually battle, that they engaged in, in which thousands of Athenians lost their lives. I think they unveiled a piece of the life of Plato that was quite obscure. But anyway, regardless, still, a lot of information. We're still in the early stages of both techniques, both approaches, both data sources, and as a consumer of this, I'm extremely happy. So, tell us, what is your involvement in the Vesuvius Challenge 2024 edition? And, if you can give us a hint of where things are heading, are they taking your contributions and making them better? Extending them? Or have other people involved in this year's edition decided to take a different approach, start from scratch, from a different angle? What's going on this year?

[0:36:38] JS: So, the best thing about this challenge is, if you have an idea, you can just do it. No one's going to tell you, "No, you cannot do it." All the information from last year is open source. All our tools.
So, some people are trying to improve those tools, and some of that work has already improved aspects like better imaging and better labeling techniques. Yes, there's also new stuff, like full 3D ink detection, which just dropped a month ago. Instead of unrolling the scroll first, you first predict the ink in it, and then unroll it. Lots of different approaches. As for me, I changed sides. I'm not a contestant anymore. I'm now part of the organizer team. There, my job is to develop my unrolling technology further, to get to a state where we can unroll all the scrolls that are in existence and have them read efficiently and economically.

[0:37:38] JMC: What about you, Youssef?

[0:37:40] YN: Yes, so for me, I'm also sort of not competing, but I'm still very interested in the challenge and helping out in any way I can. I had a lot of ideas throughout the past year and early this year on how to read the rest of the scrolls, or more generally, not how to unroll all the scrolls, like Julian, but how to read all the other scrolls in existence once they're unrolled. I got reached out to by this other project here in Berlin, which is working on Egyptian papyri. I started trying out my model, and I was surprised to find that my model actually kind of works. It's not great, but it's still seeing something. From the very beginning of the challenge, I was fascinated by the idea of having a single approach to ink detection that can read every other scroll in existence. Yes, I'm still fascinated by that idea. I'm still trying to bring it to life. But there are also other projects going on.

[0:38:43] JMC: Amazing. So, I mentioned a while back that I think humanity will be quite grateful to the authors of the TimeSformer paper, most of them at Meta, I believe, those researchers, I could be wrong, because the implications of that paper down the line are going to be revolutionary. I think, again, it's going to bring so many good things for technology and humanity in general. I don't know if you guys' impact will be as big, because, I'm not sure, maybe I'm missing uses of your technology, of the technology that you've contributed to the world, because after all, it's open source and may be useful elsewhere. But to be honest, the amount of information that is going to be unlocked from the Villa of the Papyri in Herculaneum, and not just that, but other papyri, like Youssef was mentioning, there's plenty, is going to make us ancient history lovers, and in general those seeking to understand humanity and the history of humanity, so happy. Guys, I'm so in awe of the work you've done. I'm really grateful that you won this prize, that you were motivated to compete, collaborate, and win it eventually. I say this, and extend this also to the runners-up. There are plenty. If anyone wants to know about the Vesuvius Challenge, you can just go to scrollprize.org and you'll find all the information about these three guys, the winners, but also the runners-up, and this year's edition. But yes, I'm just incredibly thankful for your work and your open-source contributions. I look forward to next year's edition, well, to the winners, and to seeing what text they have interpreted, or at least provided enough information to read, hopefully. I'm happy that you guys are really involved. I really wish that this has a positive effect on your careers as Ph.D. students, or if you go to the private sector. But yes, those would be my closing comments.
Do you have anything else to say about the techniques you've contributed, or the Vesuvius Challenge in general?

[0:40:56] JS: I do. It's just a really cool project to be part of. Lots of nice, interesting, and intelligent people are part of this challenge, and just interacting with them is a joy every day. I'm really glad that I started doing it. For future approaches and applications, I guess there might be something. I mean, this CT scanning technology comes from medicine, so maybe some of those tools, the ink detection and the segmentation, could be used to segment parts of the human body that don't really show up brightly, or with a big contrast, in CT scans, and could help with medicine in general. Who knows? Maybe it's useful for something. Maybe it isn't, but for sure, it's a nice thing to do.

[0:41:48] JMC: Well, ThaumatoAnakalyptor is a miracle uncoverer, so who knows what miracles it will uncover.

[0:41:55] JS: Exactly.

[0:41:55] JMC: By the way, for anyone that wants to reach out to you guys, to find the repos and collaborate, any URLs, any Twitter handles, anything that you would provide to anyone that wants to collaborate?

[0:42:07] JS: So, GitHub is a nice place. My GitHub handle is schillij95, and otherwise, reach out on Discord. Join the Discord community there.

[0:42:19] JMC: The Discord is, by the way, linked on the scrollprize.org website that I mentioned. What about you, Youssef?

[0:42:26] YN: Yes. Mine also, GitHub, so you'll find all my projects, the pre-training, the first letters, and the grand prize. My handle is younader. If you head to scrollprize.org, you'll find links to our repos, and you'll find the solutions of the runners-up. You'll find so much information from previous tools and prizes as well. So, make sure to check it out.

[0:42:50] JMC: Thank you so much, guys. This is incredible, everyone. I'm so glad you did this. Thank you so much. Take care.

[0:42:57] JS: Thanks for having us. Bye.

[END]