Why Digital Assistants Suck and How We Can Do Better

Current digital assistants suck. They are impersonal and barely useful, and it is socially unacceptable to talk to them. I say this as someone who owns three Amazon Echo devices and a Google Home device; I use these to control all of the lights in my apartment, set timers when I cook, and occasionally ask simple questions like “what’s the weather” or “what time is it”. I use Siri every night to set my alarm. I helped my grandparents set up an Amazon Echo to use for music. I use these assistants more than the average person, and am more familiar with the “right” way to talk to them; and yet, I find myself frustrated with them quite often. I am not a “power user” of these assistants by any means; my use of services like IFTTT and other automated scripts is limited. Perhaps if I were, I would have less of an issue. However, the convenience of digital assistants shouldn’t be reserved for those who invest hours of time into setting up intricate systems. Not everyone is willing to do this, nor is everyone capable of it.

A common promise of systems that use cloud-based AI is that they will improve over time. However, in my experience, Amazon Alexa has gotten worse in the past year. For a few weeks, Alexa failed to even give me the time of day; I would ask and see the indicator light, only for it to go away without a sound. To give credit where it’s due, this issue has been resolved, and her voice sounds more natural than it did even a few weeks ago. Improvements are obviously being made, but in the wrong areas. I had no complaints as to how natural my digital assistant sounded; but when she can no longer execute basic functions, that is a huge issue. Where Alexa used to respond accurately to my requests to turn all the lights in a room on and off (“Alexa, turn on the living room lights”), the same request now acts on only a single, seemingly random, light in the room. I came to learn that she now expects you to say the much less natural “Alexa, turn on the living room” to perform the desired action. The living room is not a device to be turned on or off; it is a location. This is a nitpick, but it speaks to the semantic issues of most digital assistants. In my limited experience with Google Assistant, it does not suffer from many of the issues Alexa has, but it has its own shortcomings, just as every digital assistant does. There is no single assistant that does everything well.

One solution to this problem may be an open source interface that many digital assistants can plug into. If Google is better at answering factual queries, and Alexa better at managing the smart home, both should connect to a centralized API that answers to the same name and delegates tasks to specialized assistants. Say I’ve named my assistant Jarvis (I’ll discuss the further benefits of personalization below), and ask “Jarvis, what time is my appointment in Seattle and what will the weather be like when I arrive?” Jarvis, which is just an alias for this central API, might break the query into its two obvious components: “What time is my appointment at a certain location?” and “What is the weather in Seattle at a certain time?” For the sake of this example, let’s say my calendar data is stored on my iPhone, which Siri has the most expertise in dealing with, and let’s say Google Assistant is the best at finding out weather conditions at a given time. Jarvis would ask Siri for the appointment time, and once it got the response it would use it in a query to Google Assistant. This delegation concept is not new; Siri does something similar by asking Wolfram Alpha for answers to certain questions. All of the major assistant platforms have proprietary “app stores” that can interface with the assistants, but again, there is a lot of fragmentation and none do a particularly good job at automatic delegation. There is additional opportunity for improvement here by allowing a user to set preferences for which assistant handles what behind the scenes.
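The delegation idea above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the names `Jarvis`, `siri_calendar`, and `google_weather` are hypothetical, the two backends return canned data rather than calling any real Siri or Google Assistant API, and the step of parsing a spoken question into sub-tasks is skipped entirely. What it does show is the user-set preference table and the way the first answer feeds the second query.

```python
from typing import Callable, Dict

# A backend takes a dict of parameters and returns a dict of results.
Backend = Callable[[Dict[str, str]], Dict[str, str]]


def siri_calendar(params: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical stand-in for Siri: look up an appointment by city (canned data)."""
    appointments = {"Seattle": "2:30 PM"}
    return {"city": params["city"], "time": appointments[params["city"]]}


def google_weather(params: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical stand-in for Google Assistant: return a canned forecast."""
    return {"forecast": f"light rain in {params['city']} at {params['time']}"}


class Jarvis:
    """Central alias that delegates each task to the user's preferred assistant."""

    def __init__(self, preferences: Dict[str, Backend]) -> None:
        # The user decides which assistant handles which kind of task.
        self.preferences = preferences

    def delegate(self, task: str, params: Dict[str, str]) -> Dict[str, str]:
        backend = self.preferences.get(task)
        if backend is None:
            raise LookupError(f"no assistant configured for task: {task}")
        return backend(params)

    def appointment_with_weather(self, city: str) -> str:
        # Step 1: ask the calendar specialist when the appointment is.
        appointment = self.delegate("calendar", {"city": city})
        # Step 2: reuse that answer in the query to the weather specialist.
        weather = self.delegate(
            "weather", {"city": appointment["city"], "time": appointment["time"]}
        )
        return (
            f"Your appointment in {appointment['city']} is at "
            f"{appointment['time']}. Expect {weather['forecast']}."
        )


if __name__ == "__main__":
    jarvis = Jarvis({"calendar": siri_calendar, "weather": google_weather})
    print(jarvis.appointment_with_weather("Seattle"))
```

Swapping which assistant handles weather would mean changing one entry in the preferences table, which is the kind of behind-the-scenes choice a user could be allowed to make.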

As alluded to above, digital assistants are all severely lacking in customizability and personalization. For all the data that the purveyors of these assistants have about me, and for all the (usually accurate) targeted ads I receive from said companies, their digital assistants know jack shit about me. Personally, I am willing to sacrifice some privacy for convenience. This is a trade-off everyone makes to varying degrees, and many people are unaware that they are making it; but in using any digital device, we sacrifice some of our privacy and security. If I am going to do this, I expect to get something out of it, and in most cases I do. My phone is immensely useful to me, as is my computer. Cars are becoming increasingly connected, and the reward we get in exchange for offering up 24/7 location data and video footage while driving is theft deterrence and the promise of self-driving cars. In my eyes, and in the eyes of many others, this is worth it. But what am I giving when I put a digital assistant in my home? A company like Amazon or Google can listen in on me at any time, and in return I get to turn my lights on and off by asking. Cool, sure, but gimmicky, and to many people less dorky than I, not worth the trade-off at all. In addition to the small convenience features I get from my smart assistants, I also may receive more targeted ads, which add negative value to my life. Add in the frustration that comes with constant bugs, and I don’t see how smart assistants are smart, or assistants. They may be everywhere now (the number of speakers Amazon and Google have shipped is astounding, and a smart assistant ships on every new smartphone), but they are not used by everyone, and not socially acceptable. Smart speakers are given away as an add-on to dozens of devices at every sales event, and many people who receive them would not have bought them separately. It appears as though Amazon, Google, and co. do not want to help you with their assistants, but rather use them to sell you and your data to the highest bidder.

I wish for my digital assistant to have an intimate knowledge of my schedules, preferences, and day-to-day life without me having to give it much input. To do this, there’s no need for it to send any more data to any company. Google, Facebook, Amazon, et al. already have it. It should live on my device and draw on both public data available online and private data that I’ve authorized, and send that data nowhere else. My digital assistant should act as a content curator for me; in a world of content overload, I don’t want to have to sift through dozens of uninteresting or factually questionable articles and videos before I find one that’s useful to me. I don’t want to find myself wasting an hour on social media only to find one meme that’s actually funny. Some of this comes down to self-control (there is no reason I can’t just stop using social media), but I imagine a digital assistant that acts as a friend. When a friend sends me something on Instagram, I will always open it, because I trust that they know me and have a good sense of humor or common interests. I often don’t open Instagram otherwise. Same goes for YouTube. I skim the news, but if a friend sends me an article I inevitably read it.

I want my digital assistant to read my emails and texts (but not to hand them over to the company that made it; better yet, maybe a company should not own my assistant), but I also want the ability to hide things from it. The vast majority of the information I know, the information I need to live my everyday life, comes in a digital form. My school schedule is digital. My practice schedule is digital. When I can’t remember what time our workout meetup is, why should I have to dig through the hundreds of emails I get every day to find it? I should be able to ask my assistant when practice is, or even better, be able to trust that he or she will remind me to go the morning and hour of, without me having to explicitly enter a calendar event. Maybe this speaks to my laziness, but I don’t use my calendar because the work required to manually or semi-manually enter my schedule is not worth the benefits. Additionally, sometimes I miss an email. My assistant should catch it.

Many social media platforms now sort content by how much you might be interested in it, but this still feels impersonal, and sponsored content is often higher up than content I actually want to see. None of the first posts are ones that my friends would have sent me. My phone alerts me to news articles that supposedly I would be interested in; however, these come across not as recommendations but as indiscriminate ads for content. Again, these are articles my friends never would have sent me. And to some degree, this is a good thing, because if my friends would have sent them to me then why do I need my digital assistant to send them too? That’s redundant. But I’m referring not to specific content, but to the quality of the content that is being sent. The format of delivery is extremely important as well. A random push notification from my news app that says “BREAKING” when it isn’t in fact breaking news seems stupid. It’s impersonal. But a message from my own personal assistant that says “Hey this is a really interesting piece I think you might like: *insert link*” comes across very differently. It shows that the machine has a personality and opinions. It shows that the machine was thinking of me and knows me. It is statements like these that start to blur the line between machine and thinking entity. In any interaction with a real, living, breathing person, flow in their speech is what distinguishes a pompous asshole from someone you’d want to be friends with. Alexa and Google Assistant are polite enough, but I would never want to be friends with them. Give me an assistant I want to be friends with. (What does that say about me if I want to be friends with a computer…)

Another area in which digital assistants fall short is the actual act of communicating with them. It is just plain awkward to talk to something that isn’t animate when you are in a social environment. Even amongst my closest friends and family, I feel stupid talking to Alexa. Again, I am a dork with no shame, and my friends and family know this, so the fact that even I feel uncomfortable interacting by voice with these assistants likely means that most people will never attempt it. What good is an assistant that you don’t communicate with? I also want my assistant to be able to talk to me, not just respond to me. With a smart speaker, it can’t exactly blurt out random information whenever it wants. The timing might be inappropriate, or there may be people around that you don’t want hearing whatever it tells you. If the digital assistant could actually listen, that would be a different story; I’ll discuss that below. But for now, an assistant that talks without being prompted is just not practical. However, an assistant that sends you a message is very much a possibility. Google Assistant and Siri already send notifications that try to predict what information you want or need, and some of these are useful. However, some are wildly inaccurate (when I’m on vacation I don’t need to see that it would take 10 hours to get to my apartment with current traffic; didn’t Siri read the confirmation email from my hotel? Take me to the hotel instead). Again, these notifications are also incredibly impersonal.

This brings me to listening. It is something we are told will help us be better people. I believe it is also the key to building better digital assistants. Most information I receive that isn’t digital is auditory. One of my friends tells me what time the party is. A professor says something in class that isn’t in the lecture slides. I almost never look at any useful information that isn’t digital anymore (except books, but I want to read those myself); but I hear a lot of it. My smart speakers are listening all the time; they’re just listening for the wrong things: maybe a buzzword to target me for an ad, and certainly their own names. That’s the only thing they reliably listen for 24/7. But if, and this is a big if, I could trust my assistant not to share what it hears beyond its internal memory (or at least encrypted cloud memory only accessible by me and my assistant), I would love it if it could listen to and understand everything around me. If one of my friends told me a joke that I later forgot, I could ask my assistant to remind me. If I was busy in conversation with a friend and it sensed a lull, it could say “Hey Remington, don’t forget you have practice pretty soon.” Perhaps in the more far-fetched realm of things, my assistant would also know that I’m still in my jeans and need to change into shorts, and remind me a few minutes earlier. If I’m studying for a class and can’t remember what the professor said, I could ask my assistant, “Hey Jarvis, do you remember what Professor Jones said about RAID storage a couple weeks ago?” and be reminded of the key facts on that topic and maybe even sent a useful YouTube video or article. When no one is in a room in my apartment, my assistant should know that and turn off the lights.

It is time for Artificial Intelligence to become Artificial Personality. R2D2 doesn’t even speak English, and yet he has infinitely more personality than any digital assistant on the market today. As Ian Bogost argues, AI has become a meaningless catch-all phrase that is nothing but glorified statistics. Many applications of current AI are used to make predictions from vast amounts of input data, and while this is useful in many cases, in some sense it is the pursuit of something that has eluded us for millennia and will continue to do so forever: predicting the future. By all means, this work should continue, but it is mislabeled. AI has plenty of knowledge but no understanding, and we can do much better.

Not a single computer in my life is aware of who I am, or even aware of its own surroundings beyond an extremely basic level. Awareness and personality are the keys to making digital assistants not only potentially useful, but something you want to use. The most powerful computer in the world is useless without a keyboard. It is time to make not just a more capable assistant, but a more friendly one too. The most innovative technologies are always at the intersection of art and science. They evoke emotion. It is well within our technological grasp to build an artificially intelligent assistant that does these things; now it is time to do just that.