Old books, new technologies
August 10, 2021 3:00pm Online
Can we really teach computers to read the handwriting of the past? What do pigment and DNA analysis tell us about how books were made in the medieval and early modern period?
This session will explore where new technologies meet old books. Using case studies from the Sir Charles Lyell Collection, the Papers of the Scottish Court of Session, and the medieval manuscript collection at the Centre for Research Collections at the University of Edinburgh, Green will tease out the boundaries of where technological exploration of cultural artefacts is taking us and where the new intersections in sciences and humanities might be found.
Please note transcriptions are auto-generated so may feature mistakes.
Professor Jeremy Smith FRSE: [00:00:00] Well, everybody, welcome to one of our great events in the Curious program put on by the Royal Society of Edinburgh. I’m very pleased to welcome you to today’s webinar. The RSE, as you know, is Scotland’s National Academy. My name’s Jeremy Smith and I have the honor of being one of the fellows of the RSE.
But I’m not going to be the speaker today. We’re going to have two fantastic speakers taking part in the Curious program. They are Daryl Green and Margarita Vasquez Ponte, and they are going to address us on issues to do with book history and the examination of ancient manuscripts and books from Edinburgh University.
I should say, before we kick off, a couple of words about Curious. It’s going to run from now, the 9th of August, all the way through to the 27th of August, online. And there’s a whole series of events, with insights from some of the world’s leading experts across a number of themes: health and wellbeing, innovation and invention, our planet, and living with COVID-19, very apropos at the moment.
So lots of things underway today. So I’m not going to faff around too much more. I’m going to turn round and invite Daryl and Margarita to do their presentation. There’ll be an opportunity for questions and answers. As this is a webinar, there’s a Q&A button at the bottom of your screen.
My job will be to monitor that Q&A button, and if any questions come up we’ll work our way through them. So there we go. But first of all, Daryl and Margarita are going to do their presentation. So now the floor is yours, Daryl.
Daryl Green: [00:01:31] Thanks very much, Jeremy. It’s a pleasure to be here today and thanks very much to the Royal Society of Edinburgh for inviting me here to talk this afternoon.
As you know, as you signed up for this event, the topic of today’s event is book history, and more specifically how we’ve used technology and how we’re using new technology to learn new things about old books. This talk emanates from my time as a librarian at a number of different institutions across this country.
But it also pulls in expertise from a number of academics, librarians, archivists and fellows of the society, who have all contributed to the field of book history in the past 25 years or so. This talk will cover a few things. We’re going to look at ways in which we’ve used and are implementing new technology to help unlock the secrets that are still in old books, early printed books and manuscripts.
It’ll feature items from collections from around the world, but more specifically a number of the collection items from the University of Edinburgh’s cultural heritage collection. And as Jeremy mentioned, we’re going to be supported by my colleague Marga, who’s in the CRC in our virtual teaching media lab and who will be assisting in live viewing a few collection items as part of this talk.
So many thanks, Marga. So, to start off, the focus of this talk is on books. And we should start with defining what we’re talking about with books. Writing surfaces have been around for three or four millennia, but what we’re talking about here is the codex, the folded book.
The codex probably came into existence in the second century AD; there’s early evidence of folded bits of papyrus. And certainly we have evidence of codices from the fourth century. So this technology, the technology of the book, of the codex, the bound volume, is something that’s been with us for going on 2000 years.
And because they’re so numerous, both manuscripts, handwritten books, as well as printed books, they are one of mankind’s most enduring technologies. If you think about going into your local bookshop today and pulling a book off the shelf, the technology that goes into making that book that you’re buying today isn’t that different from a book that was made 500 or 600 years ago: the printed page, the folded page, the bound volume. Although the processes might be a bit different,
the actual delivery of text to you hasn’t changed that much. So let’s talk a little bit about the technology of the book. And this is really the nexus of what we’re going to be talking about today: how can we explore history from the perspective of an individual volume? This is the pursuit of book historians, historians at large, librarians and archivists. A number of different fields use individual volumes or collections or larger corpora to help tell the story of our progress, of our history.
The book we have on the screen here, which we’ll be learning a little bit more about later in the talk, is University of Edinburgh MS 56. This is commonly called the Celtic Psalter. It’s an 11th-century psalter. It’s a manuscript: it’s written by hand, it’s colored by hand, it’s made by hand.
And if we think of the book as this kind of confluence of individual industries, individual technologies, that helps us, as we’ll start to talk about a little bit later. So, to make a book like this, you need writing surfaces. How do you make a writing surface: parchment, paper, papyrus? In this case,
a medieval manuscript produced probably in Scotland, probably on Iona, is written on parchment. That’s animal skin that’s been stretched and treated and prepared by a parchmenter. You have the industry of handwriting and ink making: a scribe writes the script on the page and is often the one mixing the ink or, in some cases, mixing the pigment.
So you have the industry of mixing colors and acquiring the materials you need to make these things. You have the process of making a book, i.e. taking the gatherings that you’ve written and sewing them into a bound volume. And then we should also think a little bit about how the book technology has progressed,
i.e. to go from a book in the 11th century, which doesn’t have a title page or chapter headings or indices, to the high Middle Ages and late Middle Ages, where you have the introduction of things like indexes, especially to navigate very lengthy legal tracts or liturgical texts. So all of these things are coming into this individual cultural heritage item that has moved through time and space and has been collected and moved by different people over time.
And we should also mention here the invention, or the onset, of the printing press in Western Europe. By the middle of the 15th century, the 1450s, the printing press had come on the scene, and this was the major leap in the technology of the book. We went from individual hand-copied books to something that could be reproduced over and over and over again.
So before the printing press, or after, if you were copying out a text from another text, the text might be the same, i.e. what you’re writing down, but it’s going to look very different. It might be decorated or undecorated, or just a rough copy. With the printing press, you can now make multiple copies of the same exact text in the same exact layout, or what we refer to as an edition.
This changed the way that we thought about making books and really pushed forward how books were designed and developed and made over time. There are a lot of things I’m glossing over here, but what I’d really like to focus in on is reproduction. A printed book could be reproduced multiple times and have a certain level of veracity.
That’s what I’d like to focus on here. This book that we’ve got on the slide here is a late 17th-century book, 1681, printed in Paris by Jean... At this time the technology for printing, printing illustrations especially, had gotten to the point where we could make an engraved illustration of a historical document and then print it, reproduce it.
And this was important for two different disciplines: historians, so that you could learn how to read old handwriting, and the legal profession, because verifying a document that might be presented in front of you against something else was quite complicated. Think about a land claim where the original ownership documents might be 300 or 400 miles away.
But you have a reproduction of a signal. So, for example, just to zoom in here, you can see some of the reproductions of notarial marks and signals that verified that a document was true and real. This is the book technology being used in a new way to reproduce what earlier books, earlier scripts, looked like.
Photography is something that’s very interesting. Photography and books have been intertwined since the invention of photography. And photography being used to reproduce books and reproduce items is something that we’re going to spin on its head here in a moment. So 1839 is kind of the coming-out year for photography.
A number of different inventors in France and in England invented photographic processes at about the same time. William Henry Fox Talbot, who’s co-credited with inventing the negative process, was playing around with the idea of making multiple copies from one negative before 1839.
And what he was doing it with was a book, or a printed piece of paper. So in his library at Lacock Abbey he had a little 15th-century printed broadside that he laid onto negative paper to expose it. He then took that negative and shone light through it onto photosensitive paper to create multiple copies of a 15th-century document using photographic processes.
And so before photography is commercially available or really publicly of note, these chemists, these early proto-photographers, are playing around with how to reproduce old books. And it’s not too far of a leap to then go from photographing a book, creating a 19th-century photograph of a book, to a facsimile, the kind of thing that you can see, which is hugely crucial for the progress of the humanities, to microfilm and microfiche and eventually digital books.
Microfilm, microfiche and facsimiles are very important because for the first time a scholar didn’t have to be in the place where a medieval book was kept. Before a facsimile or a microfilm was made available, a scholar would have to travel to a library, and this could take, if you’re from North America or Southern Europe, many, many weeks of travel to a repository to study books.
And so this unlocking of the humanities really happened in the 19th century, largely due to these new technologies that were coming into play. So, reproducing items. One of my favorite stories about photographic reproduction of books is from just at the onset of World War Two. Just before the Blitz broke out, the British Library did a huge microfilming project of its medieval books,
for fear that they would be lost in the Blitz. They made multiple copies of the microfilms and sent them to national repositories: to the national library in France, to the Library of Congress, and so on and so forth. And then they put the books in safekeeping and hoped that they survived. The books thankfully survived the Blitz, but the microfilms
did too. And it meant that for the first time, in the 1940s and 1950s, scholars of medieval texts that were kept in the British Library could go to their national repository instead of going to London and see some of these books. And for me, even in the late nineties, as an undergraduate, when I was working on the Canterbury Tales, I took a trip to Washington, DC, to the Library of Congress, to see the microfilms that were produced during World War Two of the Canterbury Tales manuscripts at the British Library,
so I could do the work, because there were no digital copies of the Canterbury Tales. So these things have huge effects on the field of humanities. And you can imagine that going from static or manual photography to digital photography, and the adoption of the internet and mass photography by libraries in the late nineties and early two thousands, is just another technological step on the same path.
How do we use technology to unlock the book? This here is really the field of bibliography, bibliographic inquiry, in both senses. So books that have survived for hundreds of years are kind of like immortals. They pick up all kinds of detail and information about the lives that they’ve led and the places they’ve gone.
And so this book is a 1489 collection of sermons printed in Venice. You can see on the title page there are stamps and names written in, and there’s a bookplate on the left-hand side. All of these marks tell us a bit about the journey this book has taken from printing press to owners and through space and time.
Books like this will cross borders, they’ll cross seas. They survive times of conflict and peace. They tell us a lot of interesting stories, but there are also things about these books that they keep secret quite closely. And it’s only in the last 20, 25 years that the field of conservation science has really unlocked a lot of information about medieval manuscripts and early printed books that has really advanced our understanding of the early lives of some of these books.
So the first question I’d like to talk about comes from a fellow of the Royal Society, the recipient of the Sir Walter Scott Medal in 2019, and a friend, Professor Kate Rudy. Kate, who’s based at the University of St Andrews, was pioneering in her adaptation of technology and in working in this field of conservation science, which came out of the mid 1990s
and which brought together preexisting scientific instrumentation developed for other fields into the realm of cultural heritage conservation and investigation. And Kate had a very specific research question in mind. Kate had spent a lot of time at the Getty in California, at the Koninklijke Bibliotheek in the Netherlands,
and of course at libraries here in the UK. Kate’s a medieval art historian, and through her work examining hundreds of thousands of medieval books, she began to want to quantify the dirt marks that we see in medieval books. Especially in parchment books, the grease and dirt off of people’s fingers
will, over time, leave marks on the page. And as Kate was working, especially through devotional texts, books of hours and Bibles and things, she would notice that there are certain pages that would have darker smudges than others. So Kate could vocalize this and describe it, but she really wanted to be able to quantify it and to have quantifiable data that can show us how well used or how well looked at these pages are, because unless a book has been written on, we don’t know how much somebody has read or looked at a page.
So Kate, who has a lot of exposure to the creative industry as well, knew of a tool called a densitometer. A densitometer is a non-invasive device that’s used to measure the density or the thickness of a film negative. So this is used in the film industry and photography industry, and Kate thought that perhaps by using a densitometer, and using
a blank page or a very light page as a sample, and then using that to measure the darkness on the pages, one could start to actually quantify how much use a page had had in a manuscript. And she was right. So this is just an example of a book that’s... And the graph here shows what could be visually described.
It’s quite clear that where you have the most common liturgies, the most common prayers, the illuminated pages, they’re going to get the most use, but she was able to quantify it and to do this over hundreds and now probably thousands of books, to help build a dataset that now tells us more about how these books were being used.
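Editorial illustration: the idea of measuring page darkness against a clean reference page can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not Professor Rudy’s actual procedure: the folio labels, the readings and the baseline value are all invented, and the function names are hypothetical.

```python
def relative_darkness(readings, baseline):
    """Darkness of each folio relative to a clean reference page.

    readings: dict mapping a folio label to a densitometer reading.
    baseline: reading taken from a blank or very light page.
    """
    return {folio: round(value - baseline, 3) for folio, value in readings.items()}


def most_used(readings, baseline, top=3):
    """Folios ranked from darkest (most handled) to lightest."""
    rel = relative_darkness(readings, baseline)
    return sorted(rel, key=rel.get, reverse=True)[:top]


# Invented readings for illustration only; not data from a real manuscript.
readings = {"f.1r": 0.12, "f.14v": 0.31, "f.15r": 0.29, "f.40r": 0.14, "f.62v": 0.22}
baseline = 0.10  # hypothetical reading from a clean page

print(most_used(readings, baseline))
```

Subtracting a baseline, rather than using raw readings, is what lets pages from different books, with different parchment tones, be compared on roughly the same footing.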
The next technology I’d like to talk about was developed by a team led by Sarah Fiddyment, who was a Marie Curie scholar at York and who’s now based at Cambridge at the McDonald Institute for Archaeological Research. Fiddyment’s team is called Beast to... Their question was: can we identify the species and subspecies of animals used to make parchment?
Parchment has been around as a writing surface for about 2000 years, and certainly as a material for other things for longer. But it’s quite hard to visually identify what animal a parchment could be made from, because the parchments have been scraped down and the follicles are removed.
And it’s quite hard to know these things. So Sarah, who is a conservator and a chemist, and her team developed a method of protein and peptide analysis using very, very tiny shavings of the parchment, eraser shavings of the parchment, to then determine what animal,
and indeed what subspecies of animal, it came from, and to then geolocate that species as well. And this has really revolutionized how we understand the making of books that are written or printed on parchment. And again, she’s creating this dataset. Now individuals can take samples of the manuscripts in their collection, send them off to her team in Cambridge, and get analysis and results to understand where the parchment might have been made.
In the case of books we don’t know that much about, that’s extremely helpful. And I’ll show that in a moment. Microscopy is also something that has been brought into the field of book studies, certainly within the last 30 or 40 years, and high-resolution digital microscopy is something that is really revealing.
This is Dr. Christina Duffy, who’s at the British Library. She’s an image technician, and this on the right-hand side is the Lindisfarne Gospels, probably one of Britain’s most well-known medieval books, but not a very splashy page, not a very flashy, pretty page, and that’s on purpose. Duffy and her team at the British Library have been using high-resolution microscopy to look at the materials used for decoration.
And what we’ll see here, as we’re zooming in on this bit of illustration on the right-hand side, the decoration around the text, is that when you get in really close, so this is at 500 microns, you can actually see the individual elements, the individual drops of pigment that have been put onto the page around or before the text.
In this case, this is toasted lead used to make thousands of little dots, this kind of crackle. It shows a pattern behind the text that you see there on the right-hand side. So being able to really get up close is telling us the materials, and, as we’ll see with the next slide as well, the materials open up a whole new line of inquiry.
Pigment analysis is also on the rise. So Professor Andrew Beeby, who’s at the University of Durham and has what’s known in the profession as Team Pigment, developed a new technique using a spectrometer to analyze the pigments being used in medieval books and manuscripts. The spectrometer uses a reflective technique that then produces a graph.
And that graph is compared to a database that he’s developed to identify what actual elements have gone into the making of a pigment that you see in the medieval book. Now for us, that’s very interesting, and I’ll show you why in just a moment. I’m going to hand over to Marga now, and we’re going to look at a medieval book from the University of Edinburgh’s collection.
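Editorial illustration: comparing a measured graph against a reference database can be sketched as a nearest-match search. The talk does not say how Team Pigment scores a match, so cosine similarity here is an assumption chosen for simplicity, and every number below is invented; these are not real reflectance spectra.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two equal-length spectra, ignoring overall brightness."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def best_match(measured, references):
    """Name of the reference spectrum most similar to the measured one."""
    return max(references, key=lambda name: cosine_similarity(measured, references[name]))


# Hypothetical reference curves sampled at five wavelengths (invented values).
references = {
    "woad": [0.10, 0.18, 0.30, 0.22, 0.15],
    "lapis lazuli": [0.08, 0.35, 0.40, 0.12, 0.30],
    "red lead": [0.05, 0.06, 0.10, 0.45, 0.60],
}
measured = [0.09, 0.33, 0.41, 0.13, 0.28]  # invented reading from a blue initial

print(best_match(measured, references))
```

A real comparison would use many more sample points and would likely report how close the runner-up was, since two pigments of the same color can have similar curves.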
So this is again MS 56, the Celtic Psalter. The book, as you can see, is kind of a hand-sized medieval book that was produced in the 11th century. We’re fairly certain it was produced on the island of Iona, in the monastery on Iona, on the west coast of Scotland, and it’s rumored, although there’s nothing to verify it, that it was commissioned by Malcolm III, King of Scotland, for his English wife, Margaret.
This book is interesting on a multitude of levels. First and foremost, if it indeed was made on Iona, it is then the earliest surviving book made in Scotland to still be in Scotland. There are earlier Scottish books that were produced on Iona and in other places that are now in Dublin and Cambridge and London.
But this wee book, we don’t think, has left Scotland. However, we don’t know much about it, and that’s largely due to the binding, which, as Marga showed there at the beginning, is a modern binding. It was applied in the late 19th or early 20th century. And so a lot of the contextual information that would be found with the book has been lost.
So we don’t know who owned the book. We know it came to the university in the early 17th century. But that means for 600, 650 years, this book lived a life that we just don’t know that much about. There are things that we’re starting to find out about it. And Marga’s going to settle in on this page.
You can see that she’s wearing gloves. Normally with a medieval manuscript you wouldn’t wear gloves, but because this book is so wee, and also because the pigments go right to the page edge, she’s wearing gloves for this book, and she’s using weights just to hold the page open so that she can, hands-free, use our zoom technology to zoom in on the book.
So as I was saying about Team Pigment and what we call Team Parchment, or Sarah Fiddyment’s group at Cambridge, this book has been a subject of study for both of those groups. Team Pigment, Andrew Beeby and his team, came up to Edinburgh in the summer of 2019 to look at this book and a few others and examined the colors that we see.
So the colors that we’re seeing here, the purples and the oranges and the yellows and the greens, have all been analyzed and are largely unsurprising pigments that were produced in the British Isles, especially in Scotland. These are largely vegetable or mineral dyes made with local elements.
However, Marga, I don’t know if you can slide the book around to a blue; there’s a blue initial on the right-hand side. I’ll let her catch up with us. What they found with the blue is this: blue pigment in most insular books, so these are books that were produced in the British Isles, is made from woad, which is a vegetable dye, a vegetable pigment, but the blue that’s been used for the Celtic Psalter is actually made from lapis lazuli.
Lapis is a precious stone, a precious mineral, that comes out of the central Mediterranean, Near East, Asia, and is quite common in oils, you know, very fine oil paintings, and used for jewelry and decoration, but uncommon, especially in insular manuscripts, for pigments. So this opens up a whole new line of inquiry: did somebody have lapis on Iona?
Where did that lapis come from, and how did it come to Iona? Or did somebody have the pigment pre-made somewhere else and bring it to Iona? And who thought to use it in this manuscript, or who asked for it to be used in this manuscript? So all of these elements start opening up conversations.
Thanks, Marga. I’ll also say this manuscript has been put to the study of the parchment group, and we haven’t had results back yet, as the analysis was sent off just before the pandemic. But we’re hoping to get some results on the parchment analysis here as well. The other side of things that I wanted to talk about today is using technology to help read and understand books.
And this is something that we’re working on quite closely at the University of Edinburgh. So book collections, and especially manuscript collections, are vast and extensive, and it’s quite hard for individuals, unless you’re on site and calling up lots and lots of books, to read through things.
And so there’s been a lot of work in the last 10 years or so to use technology to help unlock the information that’s within early printed books and manuscripts. So what you see on the screen here is a collection of notebooks from the geologist Sir Charles Lyell, who was a self-taught geologist and polymath and man at large in the middle of the 19th century.
And his notebooks were acquired by the university in 2019 in a major fundraising campaign, with the further accrual in 2020 of his correspondence, archive and ephemera. The notebooks here are really, really interesting, because here you have notebooks that span 50 years of his life as a geologist and traveler, all in his hand and all from his mind. This is kind of his mental laboratory of how he worked out his ideas, which eventually made their way into publications such as the Elements of Geology.
These notebooks equate to about 30,000 pages of handwritten notes, plus several thousand letters, both outgoing and incoming, and lecture notes and ephemera and things. So it’s a huge amount of material. The notebooks themselves are also full of his own illustrations and drawings, often interleaved with newspaper clippings and sketches and things.
So they’re often, as he was in London or traveling to Italy or later to North America, kind of his own folding collection of ideas as well. But we’re faced with an interesting problem here. We have this fantastically rich resource, but how do we actually unlock it to make it accessible to researchers as quickly as possible?
So the first few steps are conservation, for which we’re getting a conservation project off the ground at the moment, and archival description, to make sure that they’re properly described and in order. But then how do we actually get access to the content? Well, back to photography, and digitizing or digitally photographing these items.
So brief catalog descriptions are made by archivists. But we started to play around with new technology, which is handwriting recognition software. Here we have a corpus of material that’s all written by one person. It’s over 20, sorry, 27 to 30 years, 50 years including all the letters and correspondence.
And it’s also in quite a scruffy hand. It’s like my notebook: he’s writing quite quickly, so it’s quite hard to read. But there’s software that’s been developed in the last few years, and especially one company that we’ve been working with, called Transkribus, that actually has progressed the ability to recognize handwriting.
That’s done through a series of steps: uploading images of handwriting, transcribing them, and producing a model. And then you can ingest a huge amount of data into the model to produce transcriptions. And so it’s our plan to photograph all of the notebooks in the collection and to put them through transcription, so that not only can you read the archival descriptions, but also, once these have been transcribed, you can do free-text search, word search, keyword search.
You can start to create indices that are navigable. You can look at individual names; you saw Darwin pop up there. So it makes the unlocking of the information that’s in these notebooks that much more quickly accessible. And these notebooks are of interest not only to historians of science, but to folks that are interested in the history of travel.
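Editorial illustration: once pages have been transcribed, keyword search and a navigable index both rest on the same structure, an inverted index from words to pages. The sketch below is a generic example, not the software used at Edinburgh; the page identifiers and page texts are invented and are not real Lyell transcriptions.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(page_id)
    return index


def search(index, *words):
    """Page ids containing every one of the given words, in sorted order."""
    sets = [index.get(word.lower(), set()) for word in words]
    return sorted(set.intersection(*sets)) if sets else []


# Invented page texts for illustration only.
pages = {
    "notebook-01/p003": "Letter from Darwin on coral reefs",
    "notebook-01/p004": "Strata near Etna, sketch of lava",
    "notebook-02/p010": "Darwin and Hooker on Etna",
}
index = build_index(pages)

print(search(index, "Darwin"))
print(search(index, "darwin", "etna"))
```

Sorting the keys of the same index alphabetically gives the kind of navigable list of names and terms described in the talk.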
Lyell was in Italy and in North America several times, as well as other places. So there are so many different ways that these notebooks could help engender new research through technology. The last thing I’d like to talk about is using technology to help us describe it. Most people would think that libraries have all their books on an online catalog,
that if you go to a library, whether it’s a public or a university library, and you drop in a search and you don’t find a book, then that means they don’t have it. The fact is that, certainly in the libraries in Scotland and further afield, most of the historic collections are between 50 and 70% represented on a library catalogue.
And that’s due to resources and the need to catalog new books coming in because of modern demand, student demand. But some of that is also because some of our historic resources are just really difficult to deal with. One of the resources that we’re really interested in at the university and in wider Edinburgh is what I call the Scottish Court of Session papers, which are the printed materials that were submitted to Scotland’s supreme court as part of the litigation process
from about 1750 to the end of the 19th century. And these are printed documents that encompass the period from the Seven Years’ War and the French Revolution to the Great Reform Act and the university reform acts in Scotland and so on. So they’re a wealth of information. Before I start talking about them more, I’m going to hand back over to Marga.
Here we go. So a paper session paper is an individual printed documents, kind of eight, eight to 16 pages in length and very formulaic. And this is really important. You can see here on this page that you’ve got dates and individuals who are the litigators are the advocates and who’s presiding over the court the courts case and then th the details of the case continue on.
As we can see from this volume as Marga turns over the page, is that almost all of the court of session papers in the university’s collection are also heavily annotated and they’re annotated by the advocates that might’ve had these papers to hands they’re annotated by those who attended the, the various trials.
They’re annotated in some cases by judges looking at results. Is it really the, kind of the, the working papers, almost like the papers you might get, if you’re going to a committee meeting or something like that, that you scribble all over. These are they, but these are printed at a printing press maybe in a run of 10 to 15 copies, very few copies.
So the survival rate is really low and because they were designed for quick and dirty use they’re printed on relatively poor quality paper. They’re also really heavily used the three libraries and Edinburgh that house collections are the university the signet library and then have no surprise whatsoever.
The advocates’ library. The advocate’s library has over 3007 hundreds volumes of these session papers. So at one volume might have a hundred different tracks within it. The advocate’s library has over 1,300 volumes and the university of Edinburgh has about 420 binds to these papers. So it means across just those three institutions, we have about 250 thousands.
Papers of the court of session, individual documents, and these documents are so rich. They’re full of information about individuals, about people, about transactions and these also, in some cases, people that don’t have any other historical record it’s home For these three institutions and then now partner institutions and in America, it’s, it’s long been an ambition to, to unlock the information that’s within, within these papers, but it’s been very difficult and it’s only the onset of new technology.
That has really allowed us to start experimenting and playing around with how we might be able to do this. So several different trials or phases have been conducted in order to understand first what these, what these papers need. You can see they’re in all different sorts and states the conservation of these across multiple institutions.
There’s a lot. I mean, there’s a lot of photography that needs to happen in order to unlock this. And also, you know, once we get past the photography stage, well then how do we open up access beyond that? Because if you just have a pile of photographs, you still have to look at them to sort through them.
So the second phase of working on the session papers began in 2018 and 2019, which was both digitization and also working with our colleagues at the University of Virginia to look at how we could not only host images (it is really important to be able to host images and to see them very quickly) but then also start unlocking the data that’s within the session papers.
And that’s both OCR, optical character recognition, which has been around for a long time, though putting that on a mass scale with historic documents has its own pitfalls, and also developing a tool, a piece of software that could automatically create records without an individual having to go through and describe each individual case.
So the question is, you know, how do you go from a document to data to actually something that is findable, where you can actually search for individual terms or people or place names within this huge corpus of text? And that’s the work that’s ongoing currently. This last year, again with our colleagues at the University of Virginia, and developers at the University of Edinburgh, and archivists and librarians, we worked together to start developing this tool,
and to experiment at the limits of what the technology currently allows, to explore this huge corpus of work. And right now it’s working pretty well. So this is kind of behind the scenes of some new software that’s still in development, looking at how, once a corpus of texts has been digitized, we can use OCR and then underlay that underneath the digital images,
and how you can then search around for individuals, names, roles, place names, or topics such as slavery. And this is fairly revolutionary. It’s using a number of different technologies that have been around for a long time, so scanning books and OCR and so on, but allowing us to surf, to move across these historical documents in ways you would never be able to do with the physical item. You would have to sit and sift through hundreds of volumes to find what you’re looking for across multiple different institutions,
when instead our goal is to create one place where all of the session papers can be brought together and you can run a search for a person’s name. So for example, we’re seeing Fraser of Kelso. Or you can search a topic such as slavery or flax or whatever else you might be looking for and see where you might go.
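As a rough illustration of the kind of search described above, here is a minimal sketch in Python of an index built over OCR text. It is not the project’s software, which the talk does not describe in detail: the page identifiers and sample texts are invented placeholders, and the function names (`tokenize`, `build_index`, `search`) are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(pages):
    """Map each word to the set of page identifiers whose OCR text contains it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in tokenize(text):
            index[word].add(page_id)
    return index


def search(index, query):
    """Return the pages that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return hits


# Invented placeholder OCR text, standing in for digitized session papers.
pages = {
    "vol1/p3": "Petition of Fraser of Kelso concerning a cargo of flax",
    "vol2/p8": "Answers for the merchants of Leith",
    "vol7/p1": "Memorial relating to flax and linen at Kelso",
}

index = build_index(pages)
print(sorted(search(index, "Fraser Kelso")))
print(sorted(search(index, "flax")))
```

The point of the sketch is only that, once OCR text sits underneath the images, a single query can reach every volume at once instead of a reader sifting through each one by hand.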
This is super important also for folks that are interested in genealogy or local history, national history, international history. And this is a period when Scots were abroad in so many different places, actively part of communities in so many different places. And the court often is a witness to those activities,
through economic transactions or otherwise. So this is really kind of cutting edge as well, what we’re doing at the University of Edinburgh with technology and books. I think I’ll finish on that note and I’d just say thanks. Thanks very much to everyone. You can see my email and Twitter handle here.
If there’s anything you heard that sparked your curiosity, I’d love to know about it. And I’d love to hear questions that you might have as well. So thank you very much. And I’ll hand back over to Jeremy.
Professor Jeremy Smith FRSE: [00:34:26] Thanks so much, Daryl, for a fascinating presentation. There are already questions beginning to appear in the Q&A, which I think indicates how opening up this material, which is what you’ve been doing here, to a general audience is so provocative.
The first question is from Fiona McFarlane. She asks: what is the error rate for Transkribus? Does it ingest dissolvable? Every character begins. Well, how does it know the handwriting?
Daryl Green: [00:34:54] Yeah, absolutely. So it’s a very good question. So with Transkribus, let’s start with how it knows first, and then I’ll go to error rates.
So how it knows is, in order to create a model for an individual’s handwriting you need to ingest a certain number of images and then a certain number of transcriptions of those images, so it can learn what a character looks like spatially. And so once you have created that model, and that model could be for a 12th century hand, or a 21st century hand or anything in between, as long as it’s the same person, then it can start to recognize characters.
And what’s fascinating about it is, in Lyell’s case, not only does it recognize Lyell’s notebook hand, but if you put a notebook hand through the model, and then you put a letter, which is in a much cleaner, more presentation script, the model recognizes it as well.
So that’s how you create the model. So it’s still labor intensive. It’s not just automatic: an individual has to be able to read some of that handwriting. So you have to have some specialist knowledge to read some of that handwriting to create the model, but that is instead of having to transcribe all of the work that’s there.
So for example, the Darwin project has taken over 30 years of scholars transcribing every individual letter of Charles Darwin and putting that into both published form and databases, where instead the computer can do the majority of the grunt work once the model is developed. On accuracy rates, we didn’t know how things were going to come out with Lyell.
But we have been running tests and samples on about four notebooks, which just concluded about three months ago. And the accuracy rate was about 90%, which is astonishing. It was pretty remarkable. What we would like to do now, I mean, Lyell is still the focus because that’s one of our current live projects, but we’d like to experiment with earlier and later hands as well to see if there’s a trend. The community using Transkribus is quite broad now.
And there are groups all across European research libraries that are using Transkribus. So there are models out there for medieval hands and continental hands and things we’d like to see.
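The talk does not say how the accuracy figure of about 90% was measured. One common way to score handwritten text recognition is character accuracy: one minus the character error rate, where the error rate is the edit distance between a human transcription and the machine output, divided by the length of the human transcription. The sketch below assumes that measure; the function names and the sample strings are invented for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_accuracy(reference, hypothesis):
    """1 minus the character error rate, measured against the reference."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return 1.0 - edit_distance(reference, hypothesis) / len(reference)


# Invented example: a human transcription and a machine reading of it.
reference = "basalt dyke"
hypothesis = "basalt dyhe"
print(f"{character_accuracy(reference, hypothesis):.1%}")
```

Under this measure, an accuracy of about 90% would mean roughly one character in ten still needs a human correction.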
Professor Jeremy Smith FRSE: [00:37:00] That’s astonishing, and I imagine it can make a contribution to some of these next questions about whether handwriting is by the same person or not, different hands.
Yeah. Which of course raises a lot. I’ll keep going with that. The next question I’ve got on here is from Elizabeth. She asks: the yellow and orange, are they lead? The purple, woad? Are they mostly mineral or vegetable? She asked that question.
Daryl Green: [00:37:21] Yep. I knew I was going to get asked that question when I brought up the pigments.
So I know one or two of them. I’ve got some of the reports in front of me. So the orange is definitely lead. The yellow and the green are vegetable, although which vegetable, I can’t tell you exactly. I don’t have my notes in front of me. The purple is a type of woad, so again, that’s a vegetable.
And that’s pretty common in a lot of insular manuscripts.
Professor Jeremy Smith FRSE: [00:37:46] Yeah. Fascinating stuff. I was very struck when you talked about the lapis, which of course is a cultural thing, isn’t it, it’s sort of an accent. It tells us something about how people value the texts.
And that’s what I think is so illuminating about what you’re doing. And the next question I’ve got here is from Marina. She says the Scots archive is fabulous for historical linguists, the kind of people who love crunching large quantities of data and drawing conclusions from that.
Daryl Green: [00:38:15] I’ll say also there are two things there that, for me personally, are fascinating.
One is place names, because, Jeremy, you and I know that place names are really important when they change. But two, also, the Court of Session papers do record Scots language as well. So we have 18th and 19th century witnesses, direct witnesses, to Scots language being used, which is uncommon.
Professor Jeremy Smith FRSE: [00:38:35] Fantastic. It’ll make a major contribution to things like the Corpus of Modern Scottish Writing, which I would love to see attached to that. And it really opens things up in the same sort of way that the Old Bailey project did for voices of the Old Bailey, only this is a more set. I have a question now from my friend Graham, who says, can Transkribus deal with (let me just move this down a wee bit)
the many medieval and early modern medical manuscripts and things like that, full of relatively obscure abbreviations? Abbreviations. Yeah. That’s a good question.
Daryl Green: [00:39:08] Absolutely. So Transkribus can deal with these things. With Transkribus, it’s all based on how good a model you create to ingest things.
And so abbreviations, they’re the bane of any paleographer, any reader of early handwriting. But they’re rather formulaic, you know, once you get into a genre like medical manuscripts. The abbreviations are usually quite regular in their usage. And so if you can train a model in Transkribus to recognize those, then yeah,
it can recognize those things.
Professor Jeremy Smith FRSE: [00:39:38] I’m fascinated, especially with dealing with other languages. You know, I still have my handy copy of Cappelli, the dictionary of abbreviations, by my side. If you could train it. Okay, next question is from Alison Steenson: “My question concerns mixed or miscellaneous scripts. I’m thinking about the National Library of Scotland’s Drummond of Hawthornden collection. As I take it, the manuscript is gigantic.
It’s tons of different material and hands. How do we go about trying to create a model for that kind of text?”
Daryl Green: [00:40:08] Yeah, that’s where the technology is currently up against the glass ceiling, I think. So with Transkribus, really the strength of Transkribus is that it is based on a model of a hand
that it can recognize over and over again. When you have a composite manuscript, or a manuscript where you have lots of different people writing in it, that’s where it falls down. And indeed with correspondence as well: for example, the correspondence archive that we have with Charles Lyell has letters from hundreds of different people, but you can’t train a model for a hundred different people.
It has to be on one individual’s handwriting, because each individual writes in a slightly different way. And that’s the same for an anonymous scribe of the 15th century or a scribe of the 17th or 20th century. So I think there’s still room for technology to grow and to adapt, but also there’s still room for the traditional paleographers and historians using traditional skills that are still necessary.
Professor Jeremy Smith FRSE: [00:40:59] Yeah, that’s fantastic. It takes you into the whole realm of what we mean by interdisciplinarity, doesn’t it, for opening up this stuff? I was hugely impressed by the range of technology. Biocodicology was comparatively new to me. I’d seen this, but I hadn’t seen it displayed or discussed quite that way. Yeah.
Someone said to me that you can smell a manuscript. Someone talked about sniffing at the codex, “sniffing the books”. Their claim was that it smelled a bit goaty. I’m not quite sure where to go with that.
Daryl Green: [00:41:32] So there’s actually a group, they’re based in Oxford and Harvard, of olfactory chemists who are working on breaking down the actual chemicals that are in the air when you are in a library,
so a kind of historic college library, or when you’re faced with a medieval document. And so they’re working on breaking down the actual particles that are in that experience, and on recreating them as well. So you can imagine, you know, in a year’s time you might be able to have, I don’t know, a Corpus Christi, Cambridge candle,
so you can smell what it smells like inside of Corpus Christi, Cambridge.
Professor Jeremy Smith FRSE: [00:42:04] Yeah. Is it a goat or is it a sheep? I’m told the question is actually not that easy. In medieval times and classical antiquity it was a close call.
Daryl Green: [00:42:16] I think I’d prefer a sheep candle to a goat candle.
Professor Jeremy Smith FRSE: [00:42:19] Oh, my friend Wendy asks a question, which I think is, ah, here we go: “Thanks for outlining the various techniques. At present, do you know of any team that is trying to match scribal hands across manuscripts using visual computing methods?”
Daryl Green: [00:42:38] Yeah. So this is a hot potato.
So there are folks that are doing this both with English language manuscripts, as well as Latin manuscripts and others. And it’s difficult. There are manuscripts and then there are documents as well, right? So you have scribes that flit back and forth between copying out medieval books
and then are also employed as professional scribes to copy out legal or court documents. And so I think we’re getting to a point now where we could probably use the existing technology to start doing these things. It’s also about getting things digitized as well. And this is always the crux of the matter: in order to do this, we need the building blocks to build these tools.
There’s been a great deal of effort in large national institutions to digitize the codices. But given the vastness of places like the national archives, so the National Records of Scotland, digitizing the material that you need there for comparative analysis is still behind. So there’s work that is ongoing: there’s visual analysis of scribes, there’s visual analysis of woodblocks, you know, illustrators, that’s ongoing.
But it’s a new area of research.
Professor Jeremy Smith FRSE: [00:43:41] Yeah. Because one of the questions that sometimes comes up is that, because a scribe works across different kinds of document, different kinds of book and different kinds of production, maybe they adopt different kinds of script, or you have a slightly more formal or less formal script. So I guess the next question is: can Transkribus handle that sort of thing,
when one is writing casually as opposed to writing sort of a bit more in best?
Daryl Green: [00:44:08] Yeah. So it certainly seems to be able to handle something like that with a 19th century hand, flipping back and forth between a more rough notebook hand in Lyell versus a more presentation hand that you would see in a copy letter or letter. It handled that fine.
The question of whether it would be able to recognize a scribe, so we’re talking about medieval scribes that would be trained in multiple hands and would often be advertising their wares in multiple hands: there are issues there because letterforms change. Some of the tropes do bleed through; you know, when somebody’s switching a hand, they still use the same abbreviations or the same flourishes on initials.
But it’s much more difficult. So a rough answer is probably, but it would boil down to somebody being able to set up a model.
Professor Jeremy Smith FRSE: [00:44:53] Yeah, yeah, yeah. So the next question coming up is from someone who’s been working with printed books rather than manuscripts: “Not really a question regarding handwriting. Working with printed books from the 18th century,
I was wondering how to differentiate between long s, normal s, l’s and f’s in words.” I take it that’s a training thing.
Daryl Green: [00:45:09] Yeah, that’s the bane of anybody working with early printed books, the long s and f. So one example I didn’t show you in the current Court of Session papers was that it wasn’t picking up on “Kelso”.
But it will pick up on “Kelfo”, because the long s transliterates to an f in OCR. That still takes manual correction. It gets it about 60 to 70% of the time, but there are still some things that, as a researcher, you need to know: if a result isn’t coming back for something that’s very common, like Kelso, then try “Kelfo” instead.
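The workaround described here, searching for “Kelfo” when “Kelso” returns nothing, can be automated by expanding a query into its possible long-s misreadings. The sketch below is only an illustration, not a feature of the software shown in the talk. It assumes the usual typographic convention that the long s was not used at the end of a word, so a final “s” is left alone; the function name `long_s_variants` is hypothetical.

```python
from itertools import product


def long_s_variants(word):
    """Return every spelling of word in which a non-final 's' may have
    been read by OCR as 'f' (the usual misreading of the long s)."""
    word = word.lower()
    options = []
    for i, ch in enumerate(word):
        if ch == "s" and i < len(word) - 1:
            # A long s could stand here, so OCR may have produced an 'f'.
            options.append(("s", "f"))
        else:
            options.append((ch,))
    return {"".join(combo) for combo in product(*options)}


print(sorted(long_s_variants("Kelso")))
print(sorted(long_s_variants("session")))
```

A search tool could run every variant and merge the results, so the researcher does not have to remember to retype the query by hand. Note that the number of variants doubles with each non-final “s”, which is fine for names and short terms but worth capping for long words.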
Professor Jeremy Smith FRSE: [00:45:42] Interesting. It’s a question of when you transcribe and when you transliterate; that’s such an intriguing question. Yes. Here’s one. Oh, here’s an interesting one: “Have you heard about the idea of using software for analyzing genetics to build systematic trees of manuscripts?”
Daryl Green: [00:45:58] Wow. That’s a great question.
Professor Jeremy Smith FRSE: [00:45:59] I’ve come across that, but it’s an interesting question.
That’s one that crosses over. Yeah. Well, whatever it is that as, as us one there I just don’t see some of them seated dish of brown. Here we go. It’s going back to the top again. It seems to be one of those things that Zoom does with its Q&A. And here’s another one from Wendy: “You could use crowdsourcing to build a database; people could upload their own images with metadata.”
Just like in that front street, quick visual computing. And someone likes that. I don’t know who it is, but someone has liked it.
Daryl Green: [00:46:33] I mean, that’s certainly one of the ways. I think there’s a group in Spain that looked to build a model of medieval Spanish hands, monastic hands, using crowdsourcing.
Sorry, just going back to Alison’s question about the genetics and systematic trees. We’re kind of getting into the depths of codicology and book history, but yeah, absolutely. I mean, there are groups. I think it was Peter Robinson’s group that did a lot of work in the late nineties, early two thousands with the digital Chaucer project on how you could demonstrate the relationship between different copies of the Canterbury Tales digitally. Where that technology has gotten to, I don’t know.
Professor Jeremy Smith FRSE: [00:47:10] I’ve certainly seen that, maybe with classical texts. I see Alison has added an extra one saying, “I’ve heard about it at UCL during their palaeography summer school.” So the whole idea of getting at material in these complicated ways, I think, is fascinating. Yeah, the other one that struck me so forcibly was Kathryn Rudy’s work with the densitometer. I think it’s fascinating because it shows how people are actually going into the book and where they’re actually focusing.
Yeah. So you get a sense of how things were actually used.
Daryl Green: [00:47:41] I mean, for me, it puts tingles on your arms a little bit, because there you get the ghost of a reader. Normally you know only if somebody has marked a page. So if you’re looking at a medieval book or an early printed book, you often see lines or little pointy hands or whatever, showing that somebody has engaged with the text, but there’s really no other way to know if somebody actually spent time with the book that you’re looking at.
What Kate has shown is that you can know. I mean, you saw her thumb prints, you know, holding that little book of hours open, and you know that somebody 500 years ago was holding that book in the exact same way, looking at that page in the exact same way and spending time with it as well. Yeah, it’s fascinating.
Professor Jeremy Smith FRSE: [00:48:19] And you can watch even things like the finger moving across the page. Yeah, that raises questions about how people actually read this stuff. I think it’s fascinating because it adds on to the people who have worked on annotation, that used-book thinking, people like Bill.
Oh yeah. I’ve just had a little message to say that we’re coming up to time. This is the Royal Society of Edinburgh, the hidden people who are telling me to keep track of where we’re going and how the event is going. I’ve got to have a quick last check through to see whether there are any more questions that have come in.
And, oh, here we go. Alison says that the like was her, and definitely collaboration or something like that in these cases; she’s looking to collaborate. So I think we’re all into this mode now. I think one of the many things to come out of today has been the notion of collaborative activity between people working with different approaches, and how that offers new ways of getting at this complicated material.
It’s been fascinating. Absolutely fascinating. Thank you so much for the presentation. It’s been great to hear you talk about these things. I’ve heard you talk about them off and on, but it’s just lovely to see them all in that way. It’s hard, especially when we’re in this strange glass world of webinars, to all clap and congratulate.
So I’d rather take it on myself to sort of represent everybody here as a congratulatory voice. So thank you so much. And I hope you and your team can come again.
Daryl Green: [00:49:49] Thanks to Marga, and thanks to you for beaming into this as well.
Professor Jeremy Smith FRSE: [00:49:53] Absolutely. Well, Marga, thank you, and Daryl, and thank you everybody for asking some really interesting and provocative questions.
Oh, there you go, there are yet more coming in: “Many thanks. Great session.” Thank you. Thank you. And someone says, this is Maria José: “computer-assisted transcription tool for ancient documents, State.” Yeah, that’s true. Yeah. Great stuff. Well, look folks, thank you again. I look forward very much to seeing you all again, and thank you for coming.
Thank you. Cheers. Bye.