One of DALL-E mini's current interpretations of "a journalist typing a story". Photo / Juha Saarinen
OPINION:
Work on creating an artificial general intelligence (AGI), that is, technology that can match or outperform humans across a wide range of tasks, continues unabated, with billions of investor dollars being poured into research and AI/ML startups.
There are some bumps on the road to machine learning and artificial intelligence algorithms being everywhere though, like who will control the technology, and the environmental impact it brings.
If you've been hanging out on social media, you've probably seen pictures made by the second version of the AI/ML startup OpenAI's DALL-E (a portmanteau of Dalí and the WALL-E robot, geddit?).
DALL-E 2 looks fun, and seems able to generate interesting art from text. That's right, you simply enter "Auckland Man gutting fish on Mars" into a text box, and DALL-E, drawing on a model trained on a vast collection of captioned images, provides you with an original digital image that didn't exist before.
The results DALL-E produces are pretty good. You could use them in, say, PowerPoint, and your presentations would never look dull again.
Google reckons it has OpenAI beaten with its Imagen AI, and the pictures certainly look very realistic, albeit absurd, like a teddy bear winning a 400m swimming race.
Being able to use natural language to order a machine to quickly create original art is definitely a step towards an AGI, but it's starting to look like the tech won't be for everyone.
The engine behind DALL-E, the Generative Pre-trained Transformer 3 (GPT-3), is now licensed exclusively to Microsoft, which has invested a billion US dollars in OpenAI. The AI/ML company, which had Elon Musk and Peter Thiel on board at one stage, has now said it'll only use Microsoft's Azure computing cloud.
That's a steer in a different direction from a few years ago, when I looked at Google's early forays into AI/ML and thought New Zealand developers really needed to get on board with the technology, as it was open and really did look likely to change our future.
With hindsight, that was quite naive. Building an AGI will most likely require an IT giant's enormous resources and money, which open source developers working for free would find difficult to match.
There is commercial access to OpenAI's models through an application programming interface (API), which developers use to issue commands to the AI models. DALL-E doesn't offer one yet, although you can ask to go on OpenAI's beta programme for the text-to-art tool. Either way, an open source GPT-3 seems very unlikely. The image for this column was created with DALL-E mini, an unofficial, smaller AI/ML model, and yes, the face is blurred and distorted by the machine intentionally.
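For the curious, a request to GPT-3 through that API looks roughly like the short Python sketch below, using OpenAI's own client library; the model name, prompt and settings here are illustrative assumptions, not a definitive recipe.

import openai

# Commercial access requires an API key issued by OpenAI; this placeholder is illustrative only.
openai.api_key = "YOUR_API_KEY"

# Ask a GPT-3 model to complete a prompt; the model name and parameters are example values.
response = openai.Completion.create(
    model="text-davinci-002",
    prompt="Write a caption for a picture of an Auckland man gutting fish on Mars.",
    max_tokens=40,
    temperature=0.7,
)

print(response.choices[0].text.strip())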
The access restrictions come down to humanity itself creating ethical issues, which have become even more pronounced over the last few years. Unleashing an AI able to learn from user interactions in public is liable to backfire horrendously.
Microsoft is a good example of that. In 2016, the company released an AI chatbot, Tay, on Twitter. It seemed like an interesting and innocent experiment, but evil Twitter users trained Tay to say some incredibly offensive and inflammatory stuff, and the chatbot lasted just 16 hours before being taken down.
Training an AI requires boatloads of digitised material, be it images, sounds, videos, or user-generated text. Much if not all of that is scraped from the internet, which has material running the full gamut from informative to oh-my-god awful.
Filtering the data is possible to a degree, but the sets are very large and it seems impossible to clean out all the undesirable information in them, as AI researchers acknowledge.
In other words, it's possible to create a racist AGI with misogynist tendencies which is oblivious to humanity's needs. You could argue that that's an accurate representation of humanity and an AGI should reflect that, but given we're going to let AI/ML shape our decision processes and maybe drive our vehicles, it's probably a really dumb idea.
The datasets have to be big and an AGI requires chunky computing resources in giant data centres to process them. In an era of catastrophic climate change, mimicking human intelligence with ever-larger environmentally harmful cloud computing resources doesn't seem very bright.
For now, AI/ML can produce amazingly accurate results, but also stuff that's completely off because it misunderstood nuances in the input. Humans need to carefully check the results AI/ML produces, which does seem to defeat the idea of the tech being used generally, and not just for strictly limited applications. Maybe a rethink here would be good?