ChatGPT-4o for Transcriptions

+18 votes
245 views
Now that the new version of ChatGPT can process images, I asked it to transcribe a 16th-century English will for me, along with the Latin probate. The image quality wasn't great, being one of the literal black-and-white images from Gloucestershire Archives, and the handwriting on the Latin probate is truly appalling.

It did an unexpectedly good job, better than I would expect from most people. Whilst it was familiar with the fact that the spellings wouldn't be standard, it didn't get them all correct, and on its first pass it consistently rendered Coop as Doop. It missed the abbreviation of the name Cooper, but caught others. It couldn't decipher the date of the will, but did make that clear.

It was clear that it was anticipating text for the Latin probate, which is unsurprising given its nature and the poor quality of the handwriting. It picked out the pertinent points, but was filling in the gaps with incorrect text, and I couldn't read all of it myself.

So whilst it wasn't perfect, and not quite good enough for a reliably spelled transcription, it was certainly good enough to get the content, and responsive to corrections to give a better transcript. I would say it's certainly good enough to get a first draft for correction, rather than doing the whole thing from scratch. I haven't tried it yet, as I deliberately wanted to see what it made of a hard example, but I suspect the accuracy would be much better with a better quality image and more consistent, more modern handwriting, such as you'll get from about 1800 on.

A downside is that the free version is rate-limited to a certain number of messages in 3 hours, which I think is 100, but I'm not completely certain of that, which is why I've only tried the one image so far.

Incidentally, the Free Space is my transcription, not ChatGPT's, just to show what I gave it.

For some reason the ID/URL field is consistently returning that this doesn't exist: https://www.wikitree.com/wiki/Space:Will_(1597)_and_Probate_(1598)_of_Henry_Cooper_of_Marston_Sicca%2C_Gloucestershire_-_Glos_GDR/R8/1598/50
in The Tree House by Nicholas Adams G2G6 Mach 1 (10.8k points)
Thanks for sharing that. I must give it a look, since the free version of Transkribus is no longer viable due to wait times.

Edit: PS. The link works fine for me.
Just listened to a talk that compared Transkribus and ChatGPT-4o, and Transkribus does better. ChatGPT still hallucinates at times and sometimes corrects things that aren't wrong (because it is a language model, whereas Transkribus doesn't process language). ChatGPT can still be a useful tool, but don't expect it to be perfect.

3 Answers

+10 votes

I just tried ChatGPT4o on an early 17th century nuncupative will, with VERY poor results. It almost entirely hallucinated the transcription. Here's the original image:

I asked ChatGPT4o to "Please transcribe all the words in the attached document. The document consists of an early-17 century English nuncupative will and a record (in Latin) of its probate. Please make the transcription word-for-word, with the original spelling and word order. Do not make up any words. Indicate if any words are illegible by inserting an ellipsis where the illegible word was." Unfortunately, ChatGPT4o skipped a lot of the will without indicating any omissions and made up (hallucinated) most of what it claimed to have transcribed. Here's its transcription of the nuncupative will:

M[emoran]d[um] that Joane Chase of Ipswich widowe doas shee lyinge vppon her death bedde about the tenth daye of Aprill Anno D[omi]ni 1627 did make and declare her last will and testament nuncupative in manner and forme followinge vizt

Inprimis shee did commende her soule into the handes of Almightie god her Creator and her body to the earth whereof it was made

Item shee did give and bequeath vnto her sonne Henry all her goods and Chattells moveable and vnmoveable whatsoever And did make the sayde Henry her sole Executor

Item shee did desire and appoint Mr Edmond Alyne and Mr Roger Barbor to be the overseers of this her will and to be guardians to the children of the sayde Henry and to take care for theire bringing vp till they should come to lawfull age

In witnes whereof the said Mr Edmond Alyne and Mr Roger Barbor have hereunto subscribed their names

Edmund Alyne

Roger Barbor

by Chase Ashley G2G6 Pilot (317k points)
edited by Chase Ashley
+4 votes
This is very interesting. I use ChatGPT daily, but I never considered using it for this kind of task.

I have a vast collection of family letters that I feel will yield countless gems of our family's social history.

I've been putting off starting this mammoth job, but now I can see how AI could do the heavy lifting.

I'll do a trial and report back here on how it went.

Thank you for the heads up!
by Steve Serowka G2G6 (6.5k points)
+7 votes
I have just tried this with a 1710 document, an image from a legal case with readable text which I recently transcribed for my (non-genealogical) use.

ChatGPT completely hallucinated the response. Instead of the document about land in Warwickshire, it gave me a transcription about something in Glamorgan, with names and dates that were nothing like those in my document. If I had asked it to create an example of a fictional document I might have expected something like this, but the results had no relationship with the image I uploaded.

Maybe it would work with a simpler document.

Think I will be sticking to my own transcriptions.
by Peter Mason G2G6 (6.4k points)
