Recently, Fusemachines and the TIME team joined forces to deliver some fun ChatGPT-generated news quizzes that quite literally turned back TIME by sifting through the magazine’s century-old content. If you haven’t read the article detailing how ChatGPT was creatively and continually challenged to deliver the desired output during the creation of the quiz, we highly recommend you do.
We think these nifty AI-generated quizzes are nothing less than ingenious. So we sat down with Chris Wilson, TIME’s Director of Data Journalism, to discuss what inspired this invention, what the experience was like interacting with – and simultaneously training – a large language model and other burning questions. Here’s the full Q&A.
Let’s start with what triggered this idea and what your expectations were going into this project.
I recall mentioning the idea of AI-generated trivia to Fusemachines about two years ago, when we were just spitballing about creative applications of natural language processing. At the time, we were working on a prototype of a chatbot that fielded questions about the pandemic based on content from TIME and other reputable sources, so we were already thinking about how language models process journalistic content and how they can rephrase that information. Trivia seemed like a lot more fun than fielding questions about masks and vaccines.
And in retrospect, those were the dark ages! Now that we have access to these new models, what once felt like a Herculean task was as simple as explaining the idea to the model in plain English. Which wasn’t actually that simple, but it was far easier than doing it all from scratch. So I was cautiously optimistic that we’d get something plausible enough to share with readers, even if it required a lot of caveats.
Could you share some bizarre or hilarious experiences from when you were working on training the LLM to produce the desired output?
My favorite foible came from our early attempts to generate multiple-choice trivia based on a 2014 cover story, “The Power of Taylor Swift.” While the questions the model supplied were generally sound, in more than half of them the correct answer was “Taylor Swift.” It’s a great example of how common sense can be elusive to an LLM at first blush, but also of how it can be taught to recognize and correct mistakes that we initially assumed it would intrinsically avoid.
How long did it take for you to reach the desired result and were there specific guardrails you put in place to avoid erroneous output?
The Fusemachines engineer we worked with was able to generate and refine the instructions to the model in just a few days, and then we spent at least as much time going over the output. TIME’s Managing Editor Lily Rothman, who knows the archives better than anyone, did a heroic job parsing every question, answer and explanation, looking for incorrect answers – and she did find a few – as well as statements that might be vague or misleading. This was tremendously helpful in understanding how we could refine the instructions to reduce the amount of oversight required.
Are there key takeaways from this project on how AI should or should not be applied in media/journalism?
I’ve often thought that journalism is a natural home for language models, since the content we have produced for well over a century is designed to be clear, sober and accurate. While all language drifts and morphs over time, the content of a magazine does so at a far slower pace than, say, Twitter.
And I think the world of venerable, established media brands is a fine place to establish the as-yet-undetermined standards for when and how to disclose the use of AI, because the industry already holds itself to a high and sometimes painful standard of accountability, as anyone who’s ever had to issue a correction can tell you.
What was your experience like working with Fusemachines’ seasoned engineer on this project?
I can’t say enough about how much I value my interactions with Fusemachines. Most unions of journalism and technology are a little awkward at first, but the Fuse engineers are deeply aware of both sides of the human-computer interface.