We’ve heard the horror stories about people using artificial intelligence (AI) with disastrous results. (Like the lawyer who used ChatGPT to write a brief, only to have it create fake legal citations.) So it might surprise you that you can use AI tools like ChatGPT for your genealogy research successfully and accurately. Let me show you how.
What is Artificial Intelligence?
Artificial intelligence (or AI) is the overarching field. As IBM defines it, “artificial intelligence is a field which combines computer science and robust datasets to enable problem solving.”
Chances are you have already used AI in your genealogy. For example, when you start typing a search into Google, the list of possible searches that you get is generated by a form of AI.
FamilySearch also uses AI. One example is when you’re looking at a profile in the FamilySearch Family Tree and it has something in the “Research Help” section. FamilySearch has analyzed the profiles in the family tree and determined that couples who were living in this time period in this location typically didn’t have children further apart than three years. So when it sees on this profile that there are children who are spaced further apart than three years, it suggests, hey, there might be another child in the middle.
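To make that child-spacing idea concrete, here is a minimal sketch of the kind of check described above. It is not FamilySearch’s actual code. The function name, the three-year default, and the sample birth years are all assumptions made up for illustration.

```python
def find_possible_gaps(birth_years, max_gap_years=3):
    """Return (earlier, later) pairs of consecutive births more than max_gap_years apart."""
    years = sorted(birth_years)
    gaps = []
    # Compare each child's birth year with the next child's birth year.
    for earlier, later in zip(years, years[1:]):
        if later - earlier > max_gap_years:
            gaps.append((earlier, later))
    return gaps


# Made-up example family, not real research data.
children_birth_years = [1850, 1852, 1858, 1860]
print(find_possible_gaps(children_birth_years))
```

A flagged pair is only a hint that a child might be missing between those years. It still has to be checked against records.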
Ancestry also uses AI in a variety of ways. You might think immediately of Ancestry’s hints, and you wouldn’t be wrong. But they also use AI for things like the Newspapers.com Obituary Index. Ancestry doesn’t have a team of people going through individual newspapers figuring out which articles are obituaries. Instead, Ancestry’s AI is looking at individual articles and looking at the language. If an article has a lot of words like died, buried, cemetery, survived by, chances are good that it’s an obituary.
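Here is a rough sketch of that “count the telltale words” idea. It is a deliberately simple stand-in, not how Ancestry actually does it. The term list comes from the examples above, while the threshold of 3 and the sample sentences are assumptions for illustration.

```python
# Terms taken from the examples in the text above.
OBITUARY_TERMS = ["died", "buried", "cemetery", "survived by"]


def obituary_score(text):
    """Count how many times the obituary terms appear in the text."""
    lowered = text.lower()
    # Plain substring counting, so "died" would also match inside "studied".
    return sum(lowered.count(term) for term in OBITUARY_TERMS)


def looks_like_obituary(text, threshold=3):
    """Guess that a text is an obituary when enough terms appear (threshold is an assumption)."""
    return obituary_score(text) >= threshold


# Made-up sample sentences, not real newspaper text.
sample_obituary = (
    "John Smith died Tuesday at his home. He is survived by his wife "
    "and was buried in Oak Hill Cemetery."
)
sample_ad = "Fresh apples for sale at the market on Main Street."

print(looks_like_obituary(sample_obituary))
print(looks_like_obituary(sample_ad))
```

A real system would weigh far more signals than four phrases, but the principle is the same: the decision is based on the language in the article, not on a person reading it.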
What is ChatGPT?
This analysis of language is a lot like the AI that has many people riled up right now: tools like ChatGPT. So what is ChatGPT? “Chat” refers to how you interact with it: You type a prompt, it types something back. It’s very much a text-based chat. GPT stands for Generative Pre-trained Transformer. Though that sounds really technical, it really describes what it is.
To use ChatGPT effectively for your genealogy research, you really have to understand what it is and what it is not. ChatGPT is not a search engine. It’s also not a fact checker. ChatGPT and other tools like it, like Bard or Bing Chat, are built on what’s called a large language model. Basically, a large language model takes a huge data set (ChatGPT used billions of publicly available web pages) and analyzes it to see what the patterns are of language within certain contexts.
ChatGPT takes what’s in the prompt and, based on the patterns it learned from its training set, gives a reply in the words that it thinks have the highest probability of fitting that context. The idea behind this technology is not new. If you’ve ever gone to a business’s website and asked in their online chat, “Are you going to be open next Monday?” and it immediately answered, “Here are our store hours,” that’s a simpler example of the same idea. It takes your prompt, analyzes it, and finds what most closely matches an expected response.
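To show what “highest probability of fitting the pattern” means, here is a toy sketch that only counts which word most often follows another word in a tiny made-up text. This is vastly simpler than ChatGPT and is not how it is built. The function names and the training sentences are assumptions for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it and how often."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def most_likely_next(following, word):
    """Return the word seen most often after `word`, or None if it was never seen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Tiny made-up "training set".
training_text = (
    "he was born in ohio . she was born in ohio . he was buried in indiana ."
)
model = train_bigrams(training_text)

print(most_likely_next(model, "in"))
print(most_likely_next(model, "was"))
```

Notice that the toy model answers with whatever word was most common after “in” in its training text. It has no idea whether that answer is true for the person you are researching. It is matching language, not checking facts.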
What’s revolutionary about ChatGPT, Bard, and other similar tools is that for the first time, people who aren’t programmers have access to this technology. You don’t have to program anything in ChatGPT; you work with it in natural language.
The Biggest Mistake in Using ChatGPT
By far the biggest mistake that I see people making with ChatGPT is treating it like Google. When you create an account on ChatGPT and log in, you’ll see a box where you can enter your prompt, and it looks a lot like a Google search bar. I suspect that that’s where that lawyer got into trouble. I suspect that he entered a prompt something like, “Write a legal brief about this particular topic,” and he expected ChatGPT to go scour the web, find all of the current facts, and synthesize them into a coherent and accurate legal brief that he could then turn in to the court.
But that isn’t how ChatGPT is designed to work. That lawyer asked for a legal brief, which set the context for ChatGPT. It looked at its training set and saw that legal briefs have these things called citations, which usually have name versus name, a bunch of numbers, and a year. So that’s what it gave him. That was the type of language that was expected. Again, we’re talking language, not fact checking.
How to Use ChatGPT for Genealogy Accurately
But genealogy is all about being accurate. So how can we use ChatGPT and similar tools and still be accurate with what we’re getting? As genealogists, we like facts. So sometimes when we test a new tool, we give it a name that we already know is in its data set or, in this case, ask it a question that we think it should know the answer to. But if I enter a question like, “When did Ohio birth records start,” ChatGPT is going to give me a response that, in terms of language, makes sense. In terms of fact, not quite. Below is ChatGPT’s response. I highlighted in yellow the text that is incorrect.
Here’s a vital thing to know about ChatGPT prompts:
If the prompt that you’re using is something that you would otherwise have typed into Google, it’s not a good prompt. Use Google for those kinds of things. That’s what Google is designed for. Just like you wouldn’t open up PowerPoint to send an email, don’t use ChatGPT for something that you otherwise would have used Google for.
So what does make a good ChatGPT prompt? It’s going to be things that are based on language, concepts, or transforming things. I really like ChatGPT for idea generation. I asked it recently to compile a list of 10 activities for a family reunion. I was intrigued by the second item on the list, “Family Olympics.” So I continued the chat and asked ChatGPT to give more specific examples for #2 Family Olympics from the previous list.
Getting Accuracy in ChatGPT Results
We have to address ChatGPT making up facts. One of my great-great-grandfathers was John Peter Kingery. He was a pretty average, obscure individual. There aren’t going to be massive amounts of references to John Peter Kingery in ChatGPT’s training set.
If I prompt ChatGPT, “write a biography of John Peter Kingery,” ChatGPT has no context. It doesn’t know when or where he lived or anything else about his life. All ChatGPT knows is that I asked it to write a biography of a person with the name John Peter Kingery. And that’s exactly what ChatGPT did:
There’s nothing in this biography that’s correct. But I wouldn’t expect it to be. I gave ChatGPT absolutely no context to work with. Even if my ancestor was somebody famous, somebody that ChatGPT would have somewhere in the training set, it doesn’t have a good way of differentiating between people who have the same name.
You’re setting up ChatGPT to fail when you give it a prompt with absolutely no context like this.
I didn’t want to leave it at that. So I put together a short little document that had the basics of John Peter Kingery’s life. And I also added a couple of extra facts, including that he served in the 173rd Ohio Infantry, and that he’s buried at Kingery Cemetery. Now look what happens when I give ChatGPT that prompt of “using these facts, write a biography of John Peter Kingery.” Honestly, it took me longer to type up the facts to put into ChatGPT than it took for the biography to be created. Below is part of the biography that ChatGPT wrote with my second prompt.
Is this a perfect biography? No. There’s some editorializing and a little bit of embellishment that I’m not quite comfortable with, but this makes a fantastic first draft. I can now take this biography out of ChatGPT and I can edit it myself. But I know that the facts that are in this biography are correct, because I told ChatGPT to include them. I didn’t leave it up to ChatGPT to just go make up stuff. When you want ChatGPT to create something like this, the more specific you can be and the more details you can give it, the better the response is going to be.
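If you write a lot of these, it can help to assemble the prompt the same way every time. Here is a minimal sketch that builds the prompt text from a list of facts. The function name and the exact wording of the instructions are my assumptions, not the exact prompt I typed, and the only facts listed are the two mentioned above. You would paste the result into ChatGPT yourself.

```python
def build_biography_prompt(name, facts):
    """Build a prompt that asks for a biography based on the supplied facts."""
    fact_lines = "\n".join(f"- {fact}" for fact in facts)
    return (
        f"Using only these facts, write a biography of {name}. "
        "Do not add details that are not listed.\n\n"
        f"Facts:\n{fact_lines}"
    )


# Only the two facts mentioned in this post; add your own researched facts.
facts = [
    "Served in the 173rd Ohio Infantry",
    "Buried at Kingery Cemetery",
]
prompt = build_biography_prompt("John Peter Kingery", facts)
print(prompt)
```

Keeping the facts in a list also gives you a checklist to compare against the draft when you edit it.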
Other Uses for ChatGPT in Genealogy
Writing biographies is not the only way that ChatGPT can help us in our genealogy. Remember that the T stands for Transformer. I recently found a newspaper article about a mine explosion on a website called Chronicling America. One of the things that you can do on that website is copy the OCR (the optical character recognition). So I copied the OCR from this article and opened up ChatGPT. The prompt I gave it was “take the following text and create a table with the person’s name if he was killed or injured occupation, family, relationships and residence,” and pasted in the text of the article.
And look what ChatGPT gave me. It gave me a table with the data extracted into those columns. Obviously, I would want to compare this with the actual article. But if you’re working with a lot of data like this, this is a huge time saver.
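One quick way to start that comparison is to check that every name in the table actually appears in the text you pasted in. Here is a minimal sketch of that check. The function name is hypothetical, and the article text and names are placeholders, not the real mine explosion article.

```python
def names_missing_from_source(extracted_names, source_text):
    """Return the extracted names that do not appear anywhere in the source text."""
    # Lowercase and collapse whitespace so line breaks in the OCR don't cause misses.
    normalized_source = " ".join(source_text.lower().split())
    missing = []
    for name in extracted_names:
        normalized_name = " ".join(name.lower().split())
        if normalized_name not in normalized_source:
            missing.append(name)
    return missing


# Placeholder text and names for illustration only.
article_text = "Among the injured were Thomas Example, a miner, and Henry Sample."
table_names = ["Thomas Example", "Henry Sample", "William Placeholder"]

print(names_missing_from_source(table_names, article_text))
```

Any name this flags deserves a closer look. OCR errors can also cause a real name to be flagged, so this is a first pass and not a substitute for reading the article.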
When it comes to tools like ChatGPT, Bard, or Bing Chat, we are just scratching the surface of how they can help us in our genealogy. How do you want to use these tools? Let me know in the comments.
I asked ChatGPT for a fact-based answer concerning Y-DNA testing. When it gave the wrong response, I gave it the correct answer. But when I asked again, it came up with a new wrong answer. So it could not give information about the original question, nor could it integrate information into its database.
Just telling it the correct answer and then asking the same question doesn’t immediately add that to the training set. If you have the correct information and you want ChatGPT to use it, you have to specifically include it in the prompt. That’s how I got the biography with the correct information — I included it in the prompt.
The New York Times has dedicated an entire section to Artificial Intelligence (https://www.nytimes.com/spotlight/artificial-intelligence), including a subscriber-only newsletter (On Tech: A.I.) with some how-to tutorials on using ChatGPT. Some of the newsletters have introduced uses that I’d never pondered: meal planning, research assistance, gift giving, life coach, personal shopper, travel planning, etc.
I use ChatGPT for drafting stories about my ancestors. I give it the facts that I have thoroughly researched and ask it to write in a conversational style (my style): show, don’t tell, and no flowery words. I’ve been impressed every time.
I had a hiccup recently when I gave it the OCR from a will and asked ChatGPT to transcribe it. It did, until the end, when it went off on a tangent. It also included commas when the whole document had none. I needed to use a better prompt with words like “accurately,” “transcribe as is,” and “do not format.” Then I asked it to give me a table to outline the content, beneficiaries, relationship to the person whose will it was, etc. That was spot on.
I was surprised by how useful ChatGPT and Bing Chat can be in genealogy. After using them for a bit and learning how they work, I was able to find some photocopies of records I had never seen before and probably never would have. They turned out to be in some odd places where I would never have thought to look, and here they were online the whole time, but they never came up in a general Google search or any other internet search. Bing Chat was even able to read me the information written on a death certificate I had been looking for. I asked where the record was found, and it came back with that information also. Now I’m in the process of obtaining a copy of the death certificate for my records. The only thing I don’t like about either one of them is that the number of questions you can ask is limited to 20. Sometimes that is not enough to convey the information needed to narrow down exactly what you need. If they need more information, they will ask, which also uses part of your 20 questions. After the limit of 20, you have to start fresh with a new question or topic, and you cannot continue where you left off. Sometimes you get so close and yet so far...
Thank you, Amy, for this informative article. It clarified questions I had about using ChatGPT and how best to apply it to genealogy research.
Amy! As always, you’re right there on the cutting edge, giving me info on things I’m currently seeing in the “news” when it comes to family history research. Thank you – and big hugs from a fellow Buckeye! – Cate
These are really helpful ChatGPT options. I’ve been struggling with knowing it can be a useful tool, but not quite trusting it yet. These suggestions are great, thank you!
The trick is finding the AI that does what you need, and learning how to use it properly to get the best results (there are tons of tutorials out there for all of the different AI options). Here are some better options for research (of any kind) than ChatGPT:
-OpenAI Playground (able to find and search current data) (I am told Google Bard does this too, but I have not tried it so I won’t comment on it)
-Perplexity (collecting information from various popular platforms like Wikipedia, LinkedIn, and Amazon)
-any of the translation AI (Elsa, Bloom, DeepL Write, Google Translate… most are pretty good these days; you just have to find the one that supports the language you need. For me, most don’t have the Latin and Gaeilge I need)
-YouChat (good if you want to have a random chat while researching. I used it for “talking my research out loud”; it gives me feedback, restates things, and even asks me to consider other theories. It can respond to general inquiries, translate, summarize text, suggest ideas, write code, and create emails. As it’s still in the development stage, it provides average answers. I am told LaMDA does this too, but I have not tried it so I won’t comment on it.)
-Elicit (literature reviews. When you type a query, the app produces an immediate summary from the highest-rated documents. It is primarily used by researchers and students to track down relevant papers to cite and get an idea of future research avenues. This one has been mostly a miss for genealogy-related queries, but for location, general DNA, and archaeology queries it wasn’t bad. I think it will get better in time.)
-Socratic (more geared towards homework help, but could be useful in learning research concepts and genealogy terminology)
-Search engine AI (Chatsonic, NeevaAI, DuckDuckGo… lots out there now, including a number of them specifically for genealogy research. Most are just as good as Google, but unlike Google they don’t give you results based on ranking, which is extremely helpful when doing research)
-language practice (there are a number of AI chatbots out there that will talk to you in other languages so you can practice any language you are researching in)
-Mem (good for organizing and searching through your own research. The chat AI can check spelling and grammar, tone of voice, summarize, and search within your own data set for answers. Note that it does not run searches outside of your own data set. It is intended to be a “second brain” specifically for you, so you have to enter all the information you want it to access. But I find it extremely helpful with my research and with classes I am taking).
Yes, there are loads of AI programs out there, each with their own strengths and weaknesses! My goal with this particular post and video was to show 1) not to be scared off because something is AI, and 2) that you have to learn how it operates, rather than just plugging in a random prompt.
Loved the article. However, the free version of ChatGPT cuts off at September of 2021, which is fine for most genealogy research. The paid version, which is $20 per month and can use plugins that access the entire internet, is much more powerful, with fewer hallucinations. I believe the person commenting about YDNA would have gotten a much better response.
This was a very useful video and I’ve bookmarked the whole site. I used it to tie facts about my great-grandfather’s Civil War service to regimental histories. I will check the sources to confirm, but at least the narrative is there. Next up – reading his pension documentation and adding it. Thanks!