Te Taka Keegan is concerned by how ChatGPT may have got the data it needs to learn te reo Māori. Photo / RNZ
Te reo Māori and data sovereignty experts are raising another red flag about artificial intelligence - ChatGPT might be getting just a little too good at te reo.
On a crisp afternoon at Waikato University, associate professor of computer science Te Taka Keegan asked ChatGPT to write in te reo Māori.
The quality of Māori, he said, was good - scarily good.
“If they are producing a very good quality of Māori, the question that could be asked is - where did they get their data from?” Keegan said.
Keegan thought OpenAI would have scraped it from social media sites.
Because ChatGPT's reo was so good, Keegan said, sooner or later the language itself could shift from traditional reo to a ChatGPT version.
The consequence? It might mean Māori lose sovereignty over the language.
“We’ve lost a lot of control over our land, we’ve lost a lot of control over the education that our children get; our own data and our own stories are kind of our last control over ourselves. If we lose that, if we lose sovereignty over that, it doesn’t bode well for the uniqueness that is Māori,” Keegan said.
Ngapera Riley thinks a lot about the ethics of information, data sovereignty and te reo Māori.
Her company Figure.NZ works to democratise New Zealand data, and she said they worried about how it might be gathered and misused.
“Once we open it, it’s out there, right? But we’ve decided it is better to let people use the information and access it, than to hide it,” she said.
Riley reminded people that what ChatGPT produced should not be used as a primary source. It was a tool, and whatever it produced needed human auditing.
“That’s where it will get dangerous, if people start to get too lazy and just start using it like that [as a primary source],” she said.
Te reo champion Sonny Ngatai was optimistic the language could survive AI. He wanted to see te reo Māori everywhere - from cooking, to the back of chocolate bars, and even ChatGPT.
But he would like to see boundaries.
“Where I would put my flag up for data sovereignty is when it comes to our stories, or our narratives, or our tikanga, stuff like that,” he said.
For these, he said protecting Māori intellectual property rights was important.
Te reo was not just about stringing words together like a chatbot could, he said.
“It’s part of our identity, part of who we are as New Zealanders. There is just so much more to the language than an AI being able to translate what you want to say.”
Despite the challenges, Keegan was generally positive about AI.
He thought if it could be isolated, trained by Māori, and controlled at an iwi level, Māori could retain sovereignty and use it as a helpful tool.
“We need to make sure it is cut off from the mothership, that it wasn’t feeding everything back to the mothership, because everyone loses if that’s the case,” he said.
Riley also thought ChatGPT had a lot to offer, as long as Māori were actively involved.
“My hope is that tools like ChatGPT can help preserve and use [te reo], but we still need the human element to input into the language, and to check that we aren’t using incorrect sources,” Riley said.