Can ChatGPT and Bard summarise older stories? Photo / Alex Cairns
OPINION
Prime Minister Chris Hipkins went to China and ended up in a selfie with the superpower’s ambassador to New Zealand, Wang Xiaolong, who mentioned the waharoa that was gifted as part of the 2010 Shanghai World Expo.
That’s a beautiful work of art, and it brought back memories of the World Expo, a gigantic display of exuberance. There I bumped into James Rickard, who was part of the team of carvers from Te Puia in Rotorua from whose hands the waharoa skilfully emerged on site in Shanghai.
I had not had the pleasure of meeting Rickard before and it was great to watch him work. As I stood there melting in the grey, muggy Shanghai weather, feeling jet-lagged and horrible, I awkwardly mumbled to Rickard that my grandfather and father loved working with wood too.
Rickard quickly established that I was from Tāmaki Makaurau and welcomed me to Shanghai and the crazy World Expo. We talked about Shanghai, the amazing food, being watched all the time, and how to get around China’s internet restrictions and censorship to stay in touch with Aotearoa.
Long story short, the World Expo was one of the best assignments ever. Completely overwhelming in size - some 5.2 square kilometres on the Huangpu - and volume, at a cost of several tens of billions of dollars, with stunning art and architecture for the sake of it. And completely out-there things too.
The task was to take pictures that were on par with those by the famous snappers who’d covered World Fairs and Expos in the 30s and 40s (yeah, thanks, Ed).
Thinking quantity might help the quality, I overshot like mad as there was no end of weird stuff to take pictures of.
For that, I was punished when filing as it turned out that getting images out of China was tricky. We didn’t have an encrypted VPN tunnel set up for access to servers in the United States.
Having discovered that Microsoft’s OneDrive, which was called SkyDrive at the time, was not blocked in Shanghai, I spent most of the night uploading just under a gigabyte of photos to a shared folder over glacially slow hotel Wi-Fi.
Now, seeing the waharoa on social media was a madeleine. I went looking for the story on Wired to check I hadn’t said anything embarrassing in a rare misstep for yours truly.
Except the story was nowhere to be found. Google couldn’t find it either, and as the subeditors had improved the headline, the published title differed from the one I had filed and I couldn’t remember it.
More searching, but while a bunch of other Wired pieces for the Threat Level section were there, my photojournalistic masterpiece seemed to have met with an unfortunate delete key accident.
Fair enough, I thought, as it’s ancient, from 2010… no wait a moment, that’s not even 13 years ago.
Normally, the first stop to locate missing material is the Wayback Machine at the Internet Archive. I didn’t have the original web link to the piece though, and text searches netted nothing.
More Googling, Binging and DuckDuckGoing and look: two happy internet users had pirated my story, pictures included. I could now locate the story in the Wayback Machine with the original URL.
As you’d expect from a good archiving site, the Wayback Machine had made a faithful copy of the Wired story. This included the ad blocker interstitial that triggered as you scrolled down and blocked the story. Argh. I ended up downloading the whole lot to my computer and manually disabling the ad-blocking code.
So, the internet usually does not forget, but it can make things hard to find and access. In that sense, content is more resilient nowadays; while our thoughts and efforts are committed to ephemeral bits that require electricity and devices to access, they tend to be copied to multiple locations around the globe.
In the past, burning down a big library was enough to wipe out chunks of knowledge and archived thoughts - good and bad - but now much of it will remain, heavily weighted towards digitised, internet-published content.
Whether it’s good that we’ll have ever-increasing amounts of intellectual material of all sorts available (and utter dross as well), or bad that there’s no reset and start again, I cannot say.
However, in 2023 there are large language model AIs to feed with more content to plagiarise and pass off as original work.
To see what would happen, I tried ChatGPT, which apologised that it could not summarise my story, blaming it on lack of access to Wired’s Gadget Lab archives. No amount of coaxing and prompting would make ChatGPT reveal how far back its training data goes currently (it cuts off in September 2021 in the free version).
Google’s Bard, however, happily came up with a precis, which had nothing at all to do with what I wrote. The “stupendously scaled-up autocomplete”, as UK-based journalist David Gerard calls LLMs, was hallucinating, in other words.
This despite claiming “the training data for Bard goes back to the early days of the internet” and you know, Google. You don’t get bigger and more data than Google.
What Bard summarised looked perfectly fine if you hadn’t read the original piece, which, as you can tell from the above, wouldn’t be easy.
Ah well. It possibly suggests that we need to qualify “resilience” for our digitised thoughts with “until AI came along to subtly warp it all”. A small price to pay to live in the future, isn’t it?