Siri is still Siri

It has been ten years since we met. Ten years in which we have talked almost daily. And yet we hardly know each other; we always talk about the same things. I suppose that's partly my fault. I only ask her about the weather, how Barça did, or whether she could please tell me when eight minutes have passed and my eggs are cooked the way I like them.

I've tried to do my part, really. Sometimes I ask a question that could be answered in one word and she tells me to Google it. Or she asks me to repeat what I said because she didn't understand me. She also tends to apologize that she can't do this or that, but offers, very kindly, to open the appropriate app for me if I wish. She is a bit passive-aggressive.

After ten years with me, I can say openly that she has learned absolutely nothing from me. She only helps me set timers and read the weather forecast. Which I appreciate, but calling herself an “intelligent assistant” seems a bit of a stretch.

Not even Apple knows what it wants from Siri

I'm talking about Siri, Apple's virtual assistant, which launched ten years ago to great excitement among consumers and the press. After revolutionizing the industry with the iPhone's multi-touch screen, the company promised we would interact with technology through voice. Promises that, in hindsight, rang hollow.

“The problem with voice-controlled interfaces is that the command syntax has to be very simple,” said Apple executive Philip Schiller at Siri's presentation. “What we want is to speak to technology in a natural way and for it to understand us. Thanks to Siri, your phone will understand you and help you get what you want.” A decade later, the relationship users have with Siri is still closer to the problem Apple wanted to solve than to what it promised to offer.

And it's not because the reception was bad. The first version of Siri was surprisingly good, and it made us feel it would be a breakthrough in our relationship with technology: the “invisible” technology we seek. But advances have arrived in dribs and drabs, and sometimes they aren't even discovered, because users have already assumed that Siri is only faster and more efficient than their own fingers at creating reminders, setting alarms and asking whether it will rain tomorrow.

Part of Siri died with Jobs

Apple has never been clear about what Siri is or what it wants it to be. It has not become an increasingly capable assistant thanks to machine learning algorithms, nor does it offer an easier way to do complicated or tedious things with the phone using a simple voice command. If we think of a human personal assistant, this is what we would hire them for: to really help us, saving us headaches or time.

That the vision of what Siri is and what it would become was lost along the way is confirmed by one of its co-founders, Dag Kittlaus, who stated that “Steve, Scott Forstall and the founders of Siri had a plan that sadly died along the way.” Voice recognition and the naturalness with which Siri responds have gradually, and substantially, improved over the years, but not its capabilities. This is the main flaw: Siri understands, but it does not execute satisfactorily. It knows the theory, but tends to fail in practice.

In hindsight, it is clear that Apple was unable to take advantage of its pioneer status or put its huge active user base to proper use. The repertoire of basic tasks that Siri can perform, and that fascinated the press and consumers (alarms, notes, reminders…), has never been substantially expanded. These should have been the first step in developing a technology capable not only of listening to and understanding what the user says, but also of helping them carry out complex tasks through a simple sentence.

Its rivals took advantage

The competition took advantage of Siri's slow progress, and soon conceived alternatives that surpassed the capabilities of Apple's assistant, with a better understanding of language, more functions, and deeper integration with third-party services and applications.

Google Assistant, designed from the beginning to be as human as possible, is the one that best understands what the user requests and the most capable of giving concrete answers to questions that, until not long ago, only other human beings could understand, given the interpretation of context and the abstractions of our language they require. Assistant is a link between the human world and the world of massive data processing with artificial intelligence. This union allows complex orders to be issued easily. You could say it is the mouse and keyboard for increasingly abstract, powerful and complex computing tools.

Amazon, for its part, has neither the most intelligent assistant nor the largest user base, but it does have the largest ecosystem inside the home. In addition to its line of Echo speakers, the company has made a great effort to make thousands of third-party devices compatible with Alexa, or to have its assistant integrated into them, and to expand its skills with as many applications and services as possible. That is Amazon's trump card: cheap, universal and connected hardware.

Siri remains in no man's land. A surprising fact, considering the advances Apple has achieved in recent years in artificial intelligence algorithms and in chip design for running neural networks. Perhaps it was born prematurely. It was left in limbo between what it could have been a decade ago and what it could, or should, be today.

Virtual assistants are still crawling

It is unfair, however, to single out Siri when, in reality, all virtual assistants are still very green. Understanding simple sentences works a high percentage of the time, but it is very difficult to thread a conversation with them. It's like talking to someone with the memory of a goldfish: they are not aware of what you were just talking about. Progress has been made in this regard, but it is still insufficient. Natural language, with all its ins and outs, is one of the most difficult skills for machines to replicate.

The next issue, beyond understanding, is the assistants' own abilities. That some tasks can be performed and others cannot generates frustration and discourages the consumer. Moreover, the consumer does not know when Alexa, for example, adds a new skill to its repertoire.

For technology companies, there are two clear challenges regarding the use and discovery of assistants' skills:

Availability: the number of skills grows so fast in Alexa and Assistant that it is impossible to communicate to the consumer all the skills that exist and all those added every week.

Discovery: the consumer is not sure what they can or cannot do, which leads them to overestimate or underestimate the assistant's power. User expectations and the assistant's actual abilities rarely meet at the same point.

These two situations generate a loop in which the user sticks to the two or three commands they know and know will always work, as shown in the figures below, taken from a Telefónica study on virtual assistants.

Even among the users most accustomed to virtual voice assistants, usage is typically limited to a maximum of three tasks. / Telefónica

The different challenges to tackle

The consumer wants things done fast and well. If you get in the car, you don't feel like rephrasing your order several times. Nor if you are loaded to the top with shopping bags. When you use your voice to interact with technology, it is usually because you need it to be that way. Which brings us to the third problem with virtual assistants: they do not offer privacy.

Despite what companies believe, voice-first technology will always be reserved for situations where other interfaces, such as a touch screen or a remote control, are impossible or less convenient to use. However much we live with technology, and however many people are capable of dancing ridiculously in front of millions on TikTok, we cannot ignore how uncomfortable it is to speak to a phone or a watch in public.

Part of the success of the personal computer first, and the smartphone now, is that they are personal and intimate devices. No one knows us better than our phone. Whether that is beneficial for ourselves and for society is a debate beyond the scope of this article, but we have to accept this reality and understand why. Facebook and Google may know almost everything about us, but no one directly judges us or intrudes on what we do or what we search for. That would happen if we used our phones only with our voice. It's awkward, which is why natural language recognition should also be available in writing; for example, directly inside iOS's Spotlight, to execute complex tasks that require multiple actions.

Accessibility for those who need it most

Interestingly, voice assistants are not developed effectively for the people who could benefit from them the most: older people, who are much more used to asking for things by speaking than to interacting through a screen. The problem is that their diction is sometimes slow and choppy, and the assistants, unable to interpret the context and the person speaking to them properly, act before these users finish their order. Here too, notable progress is still lacking in speech recognition for the elderly and for those with speech difficulties.

In short, virtual assistants still need to improve in these areas:

Understanding the consumer's context and profile.
Performing tasks without resorting to web searches, and teaching the user what they can do.
Easier, more private ways to interact with them.
Recognizing the diction of older users and of users with speech difficulties.

[Photo: Apple CEO Tim Cook showcases the camera system of the iPhone 13 Pro at Apple Park, Cupertino, September 14, 2021. Brooks Kraft / Apple Inc.]

So what should Siri and the rest of the assistants be?

For Siri to be truly useful and a real protagonist in our relationship with the iPhone, it needs to start genuinely listening. It has to listen to its boss, the user, to learn from them and meet their needs better and better. Today it is only capable of hearing and interpreting isolated sentences, translating them into simple commands and trying to execute them. If it can't work out how, it often wrongly assumes the user wants a web search. It assumes, but never checks against the user's response, so it cannot learn to perform the action correctly the next time.

The result? That despite ten years with me, Siri still doesn't understand me or do what I ask of her.

Siri needs to be told, like a person, whether it has done something right or wrong. When in doubt, it should also offer the options it can handle. For example: “Shall I open your Twitter mentions, or search Twitter for mentions?” This would cause less frustration for the user, and Siri would learn.
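The idea can be sketched in a few lines of code. This is a purely hypothetical toy, not Apple's implementation or API: every name here is invented. It shows the loop described above: when a phrase is ambiguous, ask instead of guessing, then remember the user's answer so the question is never needed again.

```python
class ClarifyingAssistant:
    """Toy assistant that asks a clarifying question on ambiguity and learns."""

    def __init__(self):
        # Phrase -> action the user confirmed on a previous occasion.
        self.learned = {}
        # Phrase -> candidate actions; several candidates means ambiguity.
        self.intents = {
            "twitter mentions": ["open_mentions", "search_mentions"],
        }

    def handle(self, phrase, choose):
        """`choose` stands in for the user picking among offered options."""
        if phrase in self.learned:
            return self.learned[phrase]       # learned: no question needed
        candidates = self.intents.get(phrase, [])
        if not candidates:
            return "web_search"               # today's blunt fallback
        if len(candidates) == 1:
            return candidates[0]
        picked = choose(candidates)           # ask instead of guessing
        self.learned[phrase] = picked         # learn from the answer
        return picked
```

With this sketch, the first "twitter mentions" request triggers a question; the second one doesn't, because the assistant recorded the user's choice.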

Obviously, this could not have been done a decade ago. But it could begin to be implemented now, thanks to the dedicated neural chips Apple has built to run artificial intelligence algorithms locally and privately on the iPhone. Why not use them, little by little, more ambitiously to make Siri more and more human? Let Siri learn like a child: like a person who wants to help us, who doesn't quite know how yet, but who is willing to work and learn.

Communion between software and hardware

Both Apple and Google, which control software, hardware (the Pixel 6 will carry a chip manufactured by the company itself) and services, could begin to apply this “parallel” learning with the user's help.

Assistants also have to be proactive and able to make suggestions to the consumer, as someone serving us in a store or restaurant does. Not just with informational cards, but by extending the conversation once they have been asked to do something. It would be a way to gather more information about us in order to learn, and to increase our satisfaction. Of course, it cannot go beyond the limit where it would make us uncomfortable. Siri and its kind also have to learn social skills, something very complicated for machines, which are more skilled with numbers than with words. Here timing is everything, but it would be very convenient for assistants to suggest complementary or alternative actions, and to take note of our decisions to serve us better on future occasions.

Understanding is only the first step; acting is the second

The last and most important point is integration with the operating system and applications. It is useless for an assistant to converse like a human if it only skims the surface of the capabilities smartphones offer us. The ideal scenario would be for Siri to let us perform multiple actions, complicated for the user, with a simple voice command explaining the result we want to obtain. Apple already hints at this with its Shortcuts app, but its use is still complex for the average consumer and limited for the advanced one.
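What "one sentence, many actions" could look like can be sketched as follows. This is a hypothetical illustration, not how Shortcuts or Siri actually work: the action names, the canned plan, and the hard-coded phrase-to-plan table are all invented, and in a real assistant that table would be produced by a language model rather than written by hand.

```python
# Each action is a small step that transforms the state of a (toy) photo.
ACTIONS = {
    "resize":    lambda photo: {**photo, "size": "1080px"},
    "grayscale": lambda photo: {**photo, "color": "grayscale"},
    "share":     lambda photo: {**photo, "shared": True},
}

# One spoken sentence maps to a whole chain of actions, Shortcuts-style.
PLANS = {
    "send me a small black-and-white copy": ["resize", "grayscale", "share"],
}

def run_command(sentence, photo):
    """Expand a sentence into its plan and run each step in order."""
    for step in PLANS.get(sentence, []):
        photo = ACTIONS[step](photo)
    return photo
```

The point of the sketch is the shape of the pipeline: the user describes the result, and the assistant, not the user, decomposes it into the individual steps.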

Why not use your voice while editing a photo? It would be fantastic to guide Siri and watch the photograph change as we try different settings, as if we were talking to a professional photographer. The same goes for extracting information from a spreadsheet or operating on several files at once. This should be the path for assistants that work through natural language. We cannot expect them to keep advancing by following the pattern set in 2011.

Assistants in the next decade

If the next decade belongs to the metaverse and the mass adoption of virtual and augmented reality glasses, virtual assistants will have to substantially improve their understanding of conversational context and their interaction with the user. Otherwise, they will generate frustration and negatively affect consumers' judgment of these new technologies.

The market's various artificial intelligences understand what we want better and better, but in reality they still do not fully understand a sentence, much less a conversation. They merely translate words into simplified commands they can handle. This is the great challenge facing all the big technology companies. But once they manage to imitate our verbal language, the thing that defines us as human beings, what will stop them from going from assistants to babysitters, teachers or even friends?

It is curious that the most complicated task for a machine is to understand us. Although, to be fair to them, do we even manage to understand ourselves?
