Now, these powerful technical capabilities are moving closer to the actual users of technology. The biggest news from Microsoft this month was a new generation of AI-enabled PCs, to be launched this year under the brand Copilot+, which will be powerful enough to handle AI without needing to call on a remote data centre.
In the process, Microsoft threw down a challenge to Apple, claiming the new PCs will leapfrog Apple’s MacBooks. An AI arms race is now in full swing in the personal computing and smartphone worlds.
None of this, though, has done much to answer the overriding question for most consumers: “How - and when - will all this expensive new technology make things better for me?” So far, generative AI has brought a proliferation of text boxes online offering to answer questions (including in services such as Meta’s WhatsApp and Instagram); offers to help write emails or documents; and various services that summarise blocks of text, including the web digests Google has started to provide at the top of its search results. It is not yet clear how much people are actually using these features.
As this month’s events have underlined, the tech companies harbour a much bigger ambition than this. Their goal: personal digital assistants capable of anticipating a user’s needs and intermediating much of their online activity, as well as digital agents that can go a step further and take actions on behalf of a user. These ideas were a centrepiece of Google’s event two weeks ago and Microsoft’s last week, as well as the announcement of a new model from OpenAI called GPT-4o.
Yet if this is AI’s biggest promise, it is just that - a promise. Two fundamental problems remain unsolved. One involves enabling AI models trained on historical data to understand whatever new situation they are put in and respond appropriately. In the words of Demis Hassabis, head of Google’s AI research division, AI needs to be able to “understand and respond to our complex and dynamic world, just as we do”.
That is a tall order. The challenge isn’t just to avoid the “hallucinations”, or occasional glaring mistakes, AI systems are prone to. It also means having a full understanding of context, in order to consistently deliver truly helpful results.
Google claims to have made big strides in this department, building an extended “context window” into its latest Gemini models to enable the system to maintain an awareness of complex situations. But if the technology needs to match humans in its understanding of the world, there is a lot still to prove.
A second, related problem is making communication with AI as natural as talking to a person. Only at that point, according to the people building the systems, will the technology come into its own.
Microsoft chief executive Satya Nadella said this would involve learning “how to build computers that understand us, instead of us having to understand computers”. Despite his claim that this goal is tantalisingly close to being realised, others, including Hassabis, warn that producing “natural” interactions with a computer remains “a very high bar”.
OpenAI gave one glimpse of what might lie ahead with a demonstration of GPT-4o, an AI model designed to work in an informal, conversational style. Yet the gap between a staged demonstration and an effective, real-world product is still large.
It remains hard to predict when AI will make its big breakthrough into the consumer world.
Written by: Richard Waters
© Financial Times