I recently showed some videos of Soli in the HCI class I teach. Students immediately hit upon the two major issues I wanted to discuss (I was pretty proud!).
The first is learnability. A big problem with gestures is that there is no clear affordance as to what kinds of gestures you can do, or any clear feedback. For feedback, one could couple Soli's input with a visual display, but at that point, it's not clear if there is a big advantage over a touchscreen, unless the display is really small.
The second is what's known as the Midas touch problem. How can the system differentiate between intentional gestures meant as input and incidental gestures? The example I used was the new Mercedes cars that have gesture recognition. While I was doing a test drive, the salesperson started waving his hands as part of his normal speech, and that accidentally raised the volume. Odds are very high Soli will have the same problem. One possibility is to activate Soli via a button, but that would defeat a lot of the purpose of gestures. Another is to use speech to activate, which might work out. Yet another possibility is that you have to do a special gesture "hotword", sort of like how Alexa is activated by saying its name.
At any rate, these problems are not insurmountable, but they definitely add to the learning curve and cut into the reliability and overall utility of these gesture-based interfaces.
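The gesture "hotword" idea above can be sketched as a simple gate: ignore all gestures until a designated wake gesture arrives, then accept commands for a short window. This is purely illustrative; the gesture names, the timeout, and the `GestureGate` class are my assumptions, not anything from Soli's actual API.

```python
import time

WAKE_GESTURE = "double_tap"   # hypothetical designated "hotword" gesture
ACTIVE_WINDOW_S = 3.0         # how long to keep listening after waking (assumed)

class GestureGate:
    """Drops gestures unless a wake gesture recently occurred."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.active_until = 0.0

    def on_gesture(self, name):
        """Return the command to execute, or None if the gesture is ignored."""
        now = self.clock()
        if name == WAKE_GESTURE:
            self.active_until = now + ACTIVE_WINDOW_S
            return None                               # waking is not itself a command
        if now < self.active_until:
            self.active_until = now + ACTIVE_WINDOW_S  # extend window on activity
            return name                                # treated as intentional
        return None                                    # treated as incidental
```

So a stray swipe while talking with your hands does nothing, but `double_tap` followed by `swipe` goes through. The trade-off is exactly the learnability problem from the first point: users now have one more invisible convention to discover.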
"A loud clatter of gunk music flooded through the Heart of Gold cabin as Zaphod searched the sub-etha radio wavebands for news of himself. The machine was rather difficult to operate. For years radios had been operated by means of pressing buttons and turning dials; then as the technology became more sophisticated the controls were made touch-sensitive - you merely had to brush the panels with your fingers; now all you had to do was wave your hand in the general direction of the components and hope. It saved a lot of muscular expenditure of course, but meant that you had to sit infuriatingly still if you wanted to keep listening to the same programme."
Perhaps the computer is smart enough to determine intent. To paraphrase Marvin, "Here I am with a brain the size of a planet and they ask me to determine whether you were gesturing at me on purpose."
Sirius Cybernetics clearly had some ideas along those lines, but the results were lacking:
"He had found a Nutri-Matic machine which had provided him with a plastic cup filled with a liquid that was almost, but not quite, entirely unlike tea. The way it functioned was very interesting. When the Drink button was pressed it made an instant but highly detailed examination of the subject's taste buds, a spectroscopic examination of the subject's metabolism and then sent tiny experimental signals down the neural pathways to the taste centers of the subject's brain to see what was likely to go down well. However, no one knew quite why it did this because it invariably delivered a cupful of liquid that was almost, but not quite, entirely unlike tea."
The second one seems more of a technical problem and could be solved if Soli can reliably recognize user attention, which would effectively serve as a "hotword" for gestures. That's hard, and I'm not sure it's even feasible with this tech, but given all the excitement in this thread about potential privacy issues, I guess it's doable :D
The first one seems more troublesome. This is less intuitive than a touchscreen-based interface. The only way I see to fight this is to standardize a set of generic gestures, map them onto existing equivalent touch/voice actions, and push them to the Android ecosystem. But I'm not sure how many third-party manufacturers would join that parade. Does this technology work well under the screen? The industry is now obsessed with getting rid of the notch, and if Soli blocks that path, it will be a pretty hopeless fight.
Snapping my fingers would be a nice trigger, like "ok Google" or "Alexa". Synchronising the sound with the gesture would cut down on the false positive rate, and it's something I'm unlikely to do unless I want to interact with my phone.
If it could penetrate my pants pocket, being able to snap my fingers next to my pocket, and then perform simple interactions without having to pick up my phone would be nice. Pick up, hang up, volume etc
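Requiring the mic and the radar to agree is a classic coincidence-detection trick: a snap only counts when an audible snap and a snapping hand motion land within a small time window of each other. A minimal sketch of that fusion logic, where the 150 ms window and the event-timestamp representation are my assumptions for illustration:

```python
COINCIDENCE_WINDOW_S = 0.15  # assumed max offset between sound and motion

def is_intentional_snap(audio_snap_ts, radar_snap_ts):
    """True only when the sound and the motion line up in time (seconds)."""
    return abs(audio_snap_ts - radar_snap_ts) <= COINCIDENCE_WINDOW_S

def match_snaps(audio_events, radar_events, window=COINCIDENCE_WINDOW_S):
    """Pair audio and radar detections that coincide; unmatched events on
    either side are treated as false positives and dropped."""
    triggers = []
    radar = sorted(radar_events)
    for a in sorted(audio_events):
        for r in radar:
            if abs(a - r) <= window:
                triggers.append((a, r))
                radar.remove(r)  # each radar detection can confirm one snap
                break
    return triggers
```

A snap heard on a podcast (audio only) or a hand twitch in a pocket (radar only) would both be rejected, which is what cuts the false-positive rate.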
I would say that snapping is definitely an incidental gesture for some people, and it's also highly inaccessible (while many gesture controls aren't perfectly accessible, audibly snapping is difficult for many more people than waving is).
Not to mention, half the utility of the gestures is the ability to interact with messy/wet hands. Snapping my fingers near my phone in that situation isn't attractive.
Maybe teaching a gesture to your phone is the most accessible option, respecting culture and disability the best.
It's a shame though, I did like the intentionality that the sound of snapping fingers afforded.
> A big problem with gestures is that there is no clear affordance as to what kinds of gestures you can do, or any clear feedback. For feedback, one could couple Soli's input with a visual display, but at that point, it's not clear if there is a big advantage over a touchscreen, unless the display is really small.
For the Google Pixel 4 that they are using in the video you already have a big display. It can instruct you how to gesture so that you learn it and later it can let you gesture without instructions.
> The second is what's known as the Midas touch problem. How can the system differentiate if you are intentionally gesturing as input vs incidentally gesturing?
Either an activation word like you said, or it could use the front-side camera to see whether or not you are looking at it.
Or, depending how smart it is, and its range, it might detect your head attitude and use that as a proxy for attention. The website claims that it can detect a turn toward, a lean, or a look.
> A big problem with gestures is that there is no clear affordance as to what kinds of gestures you can do, or any clear feedback
> ...learnability
Do you have any examples of well structured learnable systems? I have struggled to find much of anything in this space, yet every technology release I see wants for it.
Here are my two examples; I have no others off the top of my head. I am more impressed with the vim example.
1.
`vim-pandoc-syntax` has a set of documents demonstrating the feature set of markdown. These documents are themselves the system they document. Here is one file in a directory of 10 such documents.
I have yet to hear a good response to this question.
I have a Pixel 3, and I want a manual for the device; it appears one does not exist, nor does documentation. The headphones that came in the box don't resume the most recent media player when I tap the middle button. I called support, and over the course of an hour they found they have the same issue. Before my call, the people I spoke to said I was wrong and that this issue didn't exist; afterwards they had no advice for me other than to give up. My issue persists.
>>The first is learnability. A big problem with gestures is that there is no clear affordance as to what kinds of gestures you can do, or any clear feedback. For feedback, one could couple Soli's input with a visual display, but at that point, it's not clear if there is a big advantage over a touchscreen, unless the display is really small.
That's the same reason why I think voice controls are literally the worst way to interact with a computer ever (although I think this might actually top it).
> How can the system differentiate if you are intentionally gesturing as input vs incidentally gesturing?
This is why I had to change my Amazon Echo Dot's wake word back from "Computer". Turns out one says "computer" a lot during the course of the day, and Alexa was CONSTANTLY going off when it shouldn't have. It was so disappointing that I gave the Echo Dot away.
Off topic, but does it seem like this link deliberately doesn't load any of the YouTube UI details, just leaving grey hints? I thought the rest hadn't loaded, but it's actually kind of a nice experience.
Once we figure out (non-invasive) BCI and EEG-type brain-activity signatures for when our brains form the intent to take an action, the system could trigger that action before the brain even sends the electrical impulses to our motor system.
How hard would it be to teach ourselves to inhibit the electrical impulses to our motor system when BCI can identify intent?
When would this level of BCI be possible if you had to make an educated guess?
Thanks for sharing, as a fellow HCI/Cog Sci graduate!
That would be great, but can radar really tell what you are looking at? I suppose you could combine it with a camera, but that sounds less than ideal in terms of energy use.
Didn't students ask about health effects? If not, consider me a student and ask what effects prolonged exposure at close proximity, in a shirt or pant pocket, has on hand health.
Spot on. I feel this is an abuse of technology. They want to take touch to the next level with gestures, but it is doomed to fail unless they solve the other issues you pointed out (just my opinion). Gestures might be good for gaming (e.g. Kinect). I worked on hover touch at one of the big smartphone companies. We achieved good results at different heights, but eventually it didn't take off. After all, humans need a sense of touch to interact.