Exploring Dovetail's AI-Generated Summary Feature: A Review

By Amy Deschenes

Our team recently transitioned to Dovetail for interview analysis and reporting. We've found its core features highly useful, particularly for managing exploratory research projects. Dovetail lets us manage interview recordings, tag transcripts, create video clips, and extract themes from user interviews with ease.

Since I've been using ChatGPT in other areas of my work, I decided to test Dovetail's AI summary tool. According to Dovetail's interface, the Automation > Summary option uses AI to generate a text summary, incorporating all the text in your note, including transcripts and PDFs. Before proceeding, I made sure I was comfortable with Dovetail's data policy concerning AI tools. For further details on Dovetail's data privacy, see the Dovetail AI Vision page and the FAQs in Data Security with Dovetail AI.

I tested the tool by uploading two user interview videos along with their transcripts. The AI tool uses text associated with your Notes, such as insights and transcripts, rather than the video content itself. Generating an interview summary was as simple as clicking a menu item.

However, the results were disappointing. The AI summary tool produced inconsistent and incorrect summaries for both interviews. While it correctly identified warm-up questions about the participant's research area, a specific task assigned to the participant, and the use of Google, it also included several inaccuracies.

The tool suffered from hallucinations, falsely claiming that the first participant had used EBSCO databases and searched for a journal on JSTOR. It also incorrectly stated that the participant regularly used Yale University Library for their work. In reality, the participant had only mentioned a task related to the Yale French Studies journal.

The AI summary also exhibited inconsistencies in its references to the participant and moderator, using varying terms such as "participant," "researcher," and "student" for the participant, and "author," "librarian," and "professor" for the moderator.

I tried the "Regenerate" option in the Summary box and experimented with the "Different" and "Paragraph summary" options, but the results remained largely unchanged. While the AI-generated summary had some correct elements, the number of errors made it unsuitable for our regular use. The summaries read like the work of a lazy research assistant taking a "best guess" at what was going on in the interview.

In conclusion, while the Dovetail AI summary tool was an interesting experiment, its current performance means I'll stick to writing my own human-generated summaries for the time being. I'm enthusiastic about the advancement of generative AI tools for UX research, but they'll need to improve their accuracy significantly before I can rely on them for data analysis.